Quote:
Originally Posted by Capsinurass
Not really what I was asking. I was asking why your peak number (2,906,312) from the data you have is different to the peak data that was displayed on the stats page (2,907,967) and also can I add,
It's not hard to assume that measuring a simple quantity, like users online at a given time, would be repeatable and indisputable. Fact is, though, it's not.
We show users the graph you linked to give them a feel for how busy the system is. It also shows our customers and partners how Steam might be growing, over time. It's a quick check to see the health of the system, as well--once you get used to the patterns. If we're near 1130 and the number of users online is substantially less than it was at 1130 yesterday, it's likely something has gone wrong.
The numbers I have are used operationally. I record what's going on with Steam once per minute, and save it in a database. The number of logged in users is just one of the statistics I record. We constantly measure about 3500 statistics, in fact -- though I only record about 15 right now. Those numbers tell me how many servers are up and how many I expect to be up; how many users are logged in, how many game servers are logged in, and so on.
Over time, I can go backwards through that recording to see how we're doing. If we didn't record data for a given minute, we must have been down, LOL! If the number of servers online doesn't match the number expected, the system is at least partially down. If the number of users seems odd, or is trending one way or another, it can tell me that there might be an external problem. For example, if our user count is a certain percentage less, but all the servers are online for that same moment, we might surmise that something is wrong with the network outside of Steam. There are regional problems in the Internet all the time, throughout the day.
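To make those rules of thumb concrete, here is a minimal sketch in Python. It is not the code we actually run; the `Sample` fields, the `diagnose` function, and the 10% drop threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    """One per-minute recording (hypothetical field names)."""
    servers_up: int
    servers_expected: int
    users_online: int


def diagnose(now: Optional[Sample], same_time_yesterday: Sample,
             drop_threshold: float = 0.10) -> str:
    """Apply the rules of thumb in order; drop_threshold is an assumed value."""
    if now is None:
        # Nothing was recorded for this minute, so we were probably down.
        return "down: no data recorded for this minute"
    if now.servers_up != now.servers_expected:
        return "partially down: server count does not match expected"
    baseline = same_time_yesterday.users_online
    if baseline > 0 and now.users_online < baseline * (1 - drop_threshold):
        # All servers are up but users are missing: suspect the network.
        return "possible external problem: servers fine, users down"
    return "ok"
```

For example, `diagnose(Sample(12, 12, 800), Sample(12, 12, 1000))` reports a possible external problem, because every server is up but the user count is 20% below the same minute yesterday.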
The numbers also trend over time. The peaks slowly rise during the winter and taper off during the summer. But the summer peaks this year were larger than last year, and by looking at the numbers and thinking about the features we want to implement, I can surmise what hardware I might need to buy in order to plan for upcoming load. This practice is called "capacity planning".
These recordings, then, are pretty valuable information for us. Since the people who consume those numbers are engineers, like me, we understand that they have variance and noise. We know how they're recorded (since I wrote the code to do the recording), and we know what they mean over time.
The graph you see is, in a way, watered down. I'm sure there are people who will use the "statistics are lies!" cliche to jump to the conclusion that we're fudging the numbers, but the raw numbers are not very useful to anyone outside of Valve. One reason is something I already mentioned: noise.
The number of users online changes every second. In fact, it changes faster than it can be measured. There are more than a dozen servers which hold on to users' state when they're logged in. These servers each carry a fraction of the load, so we need to have each server report its individual total and sum those totals to know how many users are online.
The clocks on those servers aren't precisely in sync, and they're doing other work. If we wanted the report to happen once a minute, it would -- but not every server would report at exactly the same time. And by the time each server reported, the number would have changed because users constantly log in and log out. That means the actual count, when graphed as a line on a chart, is extremely bumpy and jagged -- it's not the smoothly rising and falling line represented there.
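A toy illustration of that skew, in Python. Everything here is made up for the example: two servers instead of a dozen, and counts that change linearly with time.

```python
from typing import Callable, Sequence

# A server's user count as a function of time in seconds (hypothetical model).
CountAt = Callable[[float], int]


def summed_report(servers: Sequence[CountAt],
                  report_times: Sequence[float]) -> int:
    """Sum each server's count, sampled at that server's own report time."""
    return sum(count_at(t) for count_at, t in zip(servers, report_times))


def true_total(servers: Sequence[CountAt], t: float) -> int:
    """Total users online if every server could be read at the same instant."""
    return sum(count_at(t) for count_at in servers)


def server_a(t: float) -> int:
    return 1000 + 2 * int(t)   # users logging in


def server_b(t: float) -> int:
    return 1000 - 3 * int(t)   # users logging out


servers = [server_a, server_b]

# server_a reports at t=0, server_b reports five seconds late.
reported = summed_report(servers, [0, 5])   # 1000 + 985 = 1985
```

The true total is 2000 at t=0 and 1995 at t=5, yet the summed report says 1985. It matches neither instant, and that kind of error shows up as noise from one minute to the next.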
The smoothing that happens isn't a fabrication; it's necessary to make the data readable. The noise can be so substantial that it makes the graph unusable, even for the casual customer observing the number. Because of noise, the variance in the reported number might overwhelm any short-term trend. If one minute we report there are 1000 users online, and the next minute there are 980 users online, and the next minute there are 1010 users online, is the number of users online going up, or going down, or staying the same?
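As an illustration only, here is one common way to smooth a series like that: a trailing moving average. This is not a description of the method the chart actually uses; the function name and window size are assumptions.

```python
from collections import deque
from typing import Iterable, List


def moving_average(samples: Iterable[float], window: int = 3) -> List[float]:
    """Trailing moving average; early points average whatever is available."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)   # keeps only the last `window` samples
    smoothed = []
    for sample in samples:
        recent.append(sample)
        smoothed.append(sum(recent) / len(recent))
    return smoothed


# The three readings from the example above.
smoothed = moving_average([1000, 980, 1010], window=3)
```

With these three readings the smoothed values are 1000, 990, and roughly 996.7, so the 20-user dip and 30-user jump shrink to a much gentler wobble. A wider window flattens the noise further, at the cost of reacting more slowly to real changes.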
The core answer to your question, then, is that the data looks different because it is different. For the number of users online, I record a maximum within the last minute, then write it to the database. For the chart, we use a slightly different smoothing method to help eliminate noise and make trends more obvious. The difference you're observing is less than 0.06%, by the way; it's really not substantial. In the physical world, very few measurements are this accurate.
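For anyone who wants to check the arithmetic on the two peaks quoted in the question, here is a quick sketch (the function name is just for illustration):

```python
recorded_peak = 2_906_312   # peak from the per-minute recordings
chart_peak = 2_907_967      # peak shown on the stats page


def relative_difference_percent(a: int, b: int) -> float:
    """Absolute difference as a percentage of the larger value."""
    return abs(a - b) / max(a, b) * 100


gap = abs(chart_peak - recorded_peak)   # 1655 users
diff_percent = relative_difference_percent(recorded_peak, chart_peak)
```

The gap is 1,655 users out of roughly 2.9 million, which works out to about 0.057%.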
Furthermore, the chart you see holds a little less than two days of history. Since we were asking for peaks that happened deeper in history than that, we have to turn to the recordings I made. Presumably, when load increases again, people will again ask when the previous records were set. The only persistent reference for that data is my recording table (though, I guess, maybe there might be some other spot in Steam where we record a similar number...) so that's really the only way to answer the question.
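To give a feel for what looking up an old peak involves, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout, statistic name, and sample rows are all hypothetical; they are not the real schema or real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stats ("
    " minute TEXT NOT NULL,"      # e.g. '2010-01-03 19:41'
    " name TEXT NOT NULL,"        # which statistic this row holds
    " value INTEGER NOT NULL,"
    " PRIMARY KEY (minute, name))"
)


def record(conn: sqlite3.Connection, minute: str, name: str, value: int) -> None:
    """Store one statistic for one minute."""
    conn.execute(
        "INSERT INTO stats (minute, name, value) VALUES (?, ?, ?)",
        (minute, name, value),
    )


def peak(conn: sqlite3.Connection, name: str):
    """Return (minute, value) for the highest recorded value, or None."""
    return conn.execute(
        "SELECT minute, value FROM stats WHERE name = ?"
        " ORDER BY value DESC, minute ASC LIMIT 1",
        (name,),
    ).fetchone()


# Made-up sample rows, one per minute.
record(conn, "2010-01-03 19:40", "users_online", 1000)
record(conn, "2010-01-03 19:41", "users_online", 1010)
record(conn, "2010-01-03 19:42", "users_online", 980)

best = peak(conn, "users_online")   # ('2010-01-03 19:41', 1010)
```

Because every minute is kept, a lookup like this works no matter how far back the record was set, which a chart holding two days of history can't do.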
I hope that helps; let me know if you have additional questions.