Wednesday, June 21, 2017

Sorry for the low blogging throughput...

Dear Blogspot,



While working the other day, I was watching the throughput meter in a downloader/patcher as it moved about a gig of data from the internet to my hard drive.  The meter was showing me how many bytes per second I was downloading.  There was also a wonderful little widget for limiting the throughput.  I could choose options from 64 kilobytes per second all the way up to 4 megabytes and Unlimited.  It is this widget and how it works that I'd like to tell you about.

I'm sure the algorithm used to produce the above graph is more clever, but the most straightforward network rate limiter basically works by:

1. Observing a maximum throughput (how many bytes have come in lately?)

2. Extrapolating future throughput over a set time period (at this rate, how many bytes will I download by the time 1 minute has passed?).

3. Pausing the incoming traffic, by simply refusing to read it, until a new extrapolation of the future throughput matches the desired throughput.

4. Repeating from step 1.
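To make the loop above concrete, here is a minimal Python sketch.  It is not what any real downloader does, just one simple variant: instead of extrapolating forward, it asks how long the bytes received so far should have taken at the desired rate, and refuses to read until the clock catches up.  The names ReadThrottle and record are made up for illustration, and the clock and sleep functions are passed in as parameters (my assumption, so the sketch can be exercised without actually waiting).

```python
import time


class ReadThrottle:
    """Hypothetical helper: pauses the reader so that the average
    throughput stays at or below target_bytes_per_sec."""

    def __init__(self, target_bytes_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.target = float(target_bytes_per_sec)
        self.clock = clock      # returns the current time in seconds
        self.sleep = sleep      # blocks for the given number of seconds
        self.start = clock()
        self.total = 0          # bytes seen since start

    def record(self, nbytes):
        """Call after each read.  Returns how many seconds it paused."""
        # Step 1: observe how many bytes have come in.
        self.total += nbytes
        elapsed = self.clock() - self.start
        # Step 2: at the desired rate, how long should this many bytes have taken?
        due = self.total / self.target
        # Step 3: if we are ahead of schedule, refuse to read until we are not.
        pause = max(0.0, due - elapsed)
        if pause > 0:
            self.sleep(pause)
        # Step 4: the caller goes back to reading and calls record() again.
        return pause
```

In a real download loop you would call record(len(chunk)) right after each read from the socket.  Note that the chunk itself is never made smaller; the only control is how long to wait before the next read.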

For example, suppose I cannot help but eat one peanut per second, for an expected maximum peanut throughput of 60 peanuts per minute, but I only want to eat 20 peanuts per minute.  The way to accomplish this would be to tape my mouth shut and bind my hands for 2 full seconds out of every 3, so that only once every 3 seconds am I actually able to eat a peanut.  Since my normal eating rate is one peanut per second, by preventing me from getting at the peanuts 2/3 of the time, I have successfully reduced my peanut consumption throughput to 20/minute.
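The peanut arithmetic can be checked with a few lines of Python.  This is only a sketch: the function name peanut_schedule is invented here, and it assumes the wanted rate divides the natural rate evenly (60 / 20 = 3), which happens to be true for the numbers above.

```python
def peanut_schedule(seconds, natural_per_sec=1, wanted_per_min=20):
    """Return peanuts eaten during each second (hypothetical helper).

    Assumes the wanted rate divides the natural rate evenly.
    """
    natural_per_min = natural_per_sec * 60
    # Eat during one second out of every `period` seconds (3 for 60 vs 20).
    period = natural_per_min // wanted_per_min
    return [natural_per_sec if t % period == 0 else 0 for t in range(seconds)]


eaten = peanut_schedule(60)
print(eaten[:6])   # [1, 0, 0, 1, 0, 0]
print(sum(eaten))  # 20
```

Every second is either a whole peanut or nothing at all, and the 20 per minute only shows up once you add the seconds together.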

If you were to graph my throughput-limited peanut-eating, it would look like this:


Peanuts are a good analogy because, as with network downloads, the receiver has little to no control over the size of each incoming package of data.  This limits the receiver (or peanut eater) to either consuming or not consuming.  Partial consuming is not really an option.  This leads to very jagged graphs, as you see above, because consumption is always followed by deprivation in order to control the overall throughput.  And even if one could somehow eat 1/3 of a peanut per second (try it, it's not easy), that would definitely not work for download data, since there is no way to ask a transmitter to only send a partial byte per packet.

So, what has surprised me in the past is how alarming a graph like the above can be to folks.  Why can't an accurate graph of limited throughput be smooth?  When I want to limit my rate of walking, I don't walk normally and then pause every second before continuing to walk for another second.  I just walk slower, in one fluid motion.  Why can't the graph be like that?

Well, the graph can certainly LOOK like that, if one, for example, graphs the average throughput over a period instead of the actual bytes or peanuts consumed during each atomic unit of time.  But that wouldn't quite be the truth about what is going on.  It would be covering up the mechanics of reality with statistical lies.  It would be pretending that something happened during an atomic unit of time that really didn't happen at ALL.  I once wasted an hour trying to explain this to some folks, and that didn't go well.  They just wanted the data to be slower, and not pause all jaggedy like that.

The size of each package of incoming peanut or data, and the limited ability of the receiver in how they can react to incoming peanuts and data, dictate both the algorithm one must use to limit data, and the way an accurate graph of data throughput over time would look.

All that said, my advice is this: screw accuracy; don't try to explain or show any of this to anyone.  Just use a smooth-averaging algorithm to hide the jagged edges of reality when making your throughput graphs.  It's much easier to do this than to try to explain why those toothy fangs are more accurate.  Besides, the graph is meant to convey a general rate of data, not provide a window into actual data movement along your I/O bus.
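For what it's worth, one possible smoothing is a plain moving average over the last few samples.  The sketch below is just that, one option among several; the function name moving_average and the window of 3 samples are my own choices for illustration, picked to match the peanut example.

```python
from collections import deque


def moving_average(samples, window):
    """Average each sample with up to window - 1 samples before it."""
    recent = deque(maxlen=window)  # old samples fall off the left automatically
    smoothed = []
    for s in samples:
        recent.append(s)
        smoothed.append(sum(recent) / len(recent))
    return smoothed


jagged = [1, 0, 0] * 4                 # one peanut every third second
smooth = moving_average(jagged, window=3)
print(smooth[2:])                      # every value here is one third
```

Once the window has filled up, the fangs flatten into a steady one third of a peanut per second, which is exactly the kind of thing that never actually happened during any single second.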

