A second is a long time

_delirium · on July 10, 2011

I agree with the general point that you might be over-smoothing if you graph data with large window sizes, but you don't have to change the units on the graph as well, and probably shouldn't, since "Mbit per half second" is a confusingly nonstandard unit. The units of measurement and the granularity of measurement don't have to be tied. Mbit/s is a rate, and you can just as well graph the rate in that unit for data points taken over a half-second window.

It's the same in physics: lots of things are expressed as rates "per second", but this does not imply that the measurement window was one second; it might be vastly larger or smaller than that. Or, with cars, you can measure the speed in mph or km/h much more frequently than once per hour. =]
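
To make the point concrete, here is a minimal sketch that takes the article's example (900 Mbit in one half second, 100 Mbit in the next) and reports each half-second window in Mbit/s. The function name and the sample list are hypothetical, chosen only for illustration.

```python
def rates_mbit_per_s(mbit_per_window, window_s):
    """Convert per-window transfer volumes (Mbit) into rates in Mbit/s."""
    return [mbit / window_s for mbit in mbit_per_window]


# Hypothetical samples: Mbit transferred in each half-second window.
samples = [900, 100]
window_s = 0.5

# Rate within each window, still expressed per second.
per_window = rates_mbit_per_s(samples, window_s)

# Average rate over the whole second covered by both windows.
average = sum(samples) / (len(samples) * window_s)

print(per_window)  # [1800.0, 200.0]
print(average)     # 1000.0
```

The window size sets how often a point is plotted; the unit on the axis stays Mbit/s either way.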

Jabbles · on July 10, 2011

I think he's made some unfortunate mistakes with his language and use of units:

"we could transfer 900 Mbit for a half of a second and another 100 Mbit for the other half of that second. How much data was transferred during that second? The answer is 1 Gbit per second."

"for" implies multiplication, i.e. 2Mbit/s for 3s transfers 6Mbit in total, 2Mbit/s * 3s = 6Mbit. The phrase "900Mbit for half a second" should probably read "900Mbit in half a second", giving 1.8Gbit/s (and thus supporting the point of the article).

"The answer is [not] 1Gbit per second", it is 1Gbit. It is a measure of data, not transfer rate.

He makes a good point though: averaging may mask important details.

Confusion · on July 11, 2011

It's ambiguous and either interpretation holds. '900 Mbit for half a second' is meaningless and requires correction. Either for -> in or Mbit -> Mbit/s.

tlrobinson · on July 11, 2011

The resolution of data and units the data is expressed in are orthogonal. You can still use X/second units but report 10 samples per second. Changing the units is confusing. I don't have an immediate frame of reference for how fast Mb/half-second is without doing a conversion in my head.

minimax · on July 10, 2011

> For example, we could transfer 900 Mbit for a half of a second and another 100 Mbit for the other half of that second. How much data was transferred during that second? The answer is 1 Gbit per second.

The dimensional analysis doesn't work here. He asks for a quantity of data and answers with a transfer rate. Also, it shouldn't be surprising that the instantaneous transfer rate at a given time is different from the average transfer rate over a period of time.

ssapkota · on July 10, 2011

It's a fabulous troubleshooting effort made by the admins - "we saw we would frequently burst the 1 Mbit per millisecond rate of the so called 1 Gbit/s interfaces"

carbonica · on July 10, 2011

> This effect is made even worse by most monitoring tools because most take samples every 5 minutes.

Who exactly is using monitoring tools that sample every 5 minutes? Goodness. Maybe I've been spoiled, but I couldn't imagine using such a blunt tool.

The_Fox · on July 10, 2011

Munin, the package that charts performance and usage metrics, does so at 5 min intervals by default. But its purpose is to show historical trends (it tracks every single metric for the past year), and its users are expected to understand that it doesn't help for the kind of thing this article is about.

xtacy · on July 11, 2011

I have been looking for rrd/cacti like graphing tools that work with millisecond sampling. Any ideas?

Joakal · on July 11, 2011

I monitor as often as possible so that I'm alerted to sudden high traffic. To avoid adding to the load, it checks every 30 seconds.

lmz · on July 11, 2011

Cacti uses 5 minute intervals by default.

moe · on July 10, 2011

What tool do you use?

JoachimSchipper · on July 11, 2011

rrdtool[1] can be configured to "merge" measurements in various ways: by default, the pixel at (say) "4min" is an average of the measurements at (e.g.) 3:30, 3:40, ..., 4:20 [2], but it can also be configured to be the maximum (or minimum, or median) of these measurements.

[1] the database and graph software which underlies most (Unix?) server monitoring tools, including e.g. Munin and Cacti.

[2] Or the measurements at 4:00, ..., 4:50, or those at 3:10, ..., 4:00; I forgot, and it's not important.
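
The difference between averaging and taking the maximum can be sketched in plain Python. This is not rrdtool's API, only an illustration of the idea; the `consolidate` helper and the readings are made up.

```python
from statistics import mean


def consolidate(samples, bucket_size, func):
    """Collapse each consecutive group of `bucket_size` samples into one value."""
    return [
        func(samples[i:i + bucket_size])
        for i in range(0, len(samples), bucket_size)
    ]


# Hypothetical Mbit/s readings taken every 10 s; six readings per plotted point.
readings = [100, 120, 950, 110, 90, 130, 105, 95, 100, 110, 115, 105]

avg_points = consolidate(readings, 6, mean)  # the burst is smoothed away
max_points = consolidate(readings, 6, max)   # the burst stays visible

print(avg_points)  # [250, 105]
print(max_points)  # [950, 115]
```

With averaging, the 950 Mbit/s burst shows up as a plotted point of 250; with the maximum, it survives consolidation.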

7952 · on July 11, 2011

What you need is standard deviation and other stats of all the samples.
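
As a rough sketch of why such statistics help, the two made-up series below have the same mean but very different spread. The `summarize` helper is hypothetical and uses only Python's standard `statistics` module.

```python
import statistics


def summarize(samples):
    """Return basic summary statistics for one monitoring interval."""
    return {
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples),  # sample standard deviation
        "min": min(samples),
        "max": max(samples),
    }


# Hypothetical Mbit/s samples within a single reporting interval.
steady = [250, 250, 250, 250]
bursty = [0, 0, 0, 1000]

print(summarize(steady))
print(summarize(bursty))
```

Both intervals average 250 Mbit/s, but only the bursty one has a large standard deviation and a maximum far above its mean, which is exactly the detail a plain average hides.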