Wednesday, February 27, 2008

What Is "Load Average" In Top's Output Anyway?


When I started using Unix in 1989, I learned about the "top" command to see how hard the computer was working. There are many numbers in top's output. Some (such as time, uptime, number of users, CPU states, etc.) are easy to understand. However, I never really understood what "load average" means.

Of course, if the computer is doing nothing, the load average will be low, and if the computer is busy, the load average will be high. However, the number can go much higher than 1, so it is not the fraction of the computer's capacity that is being used. I was never curious enough to find out exactly what that number means, and just settled for a general feeling of typical numbers that would be "safe" for each computer.

Now, thanks to the magic of the Internet, the exact meaning can easily be found here and here. (Yes, it's easily found, but to really understand it, you need to read till the end.)

To summarize, in the words of those two articles:

So, what have we learned? Those three innocuous looking numbers in the LA triplet have a surprising amount of depth behind them.

The triplet is intended to provide you with some kind of information about how much work has been done on the system in the recent past (1 minute), the past (5 minutes) and the distant past (15 minutes).

and

Those three little numbers tucked away innocently in certain UNIX commands are not so trivial after all. The first point is that load in this context refers to run-queue length (i.e., the sum of the number of processes waiting in the run-queue plus the number currently executing). Therefore, the number is absolute (not relative) and thus it can be unbounded; unlike utilization (AKA "load" in queueing theory parlance).
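
Since the load is a run-queue length rather than a fraction, one way to get a feel for it is to compare each number in the triplet with the number of CPUs. Here is a minimal Python sketch of that idea. The function name describe_load and its two labels are my own invention for illustration, not anything top prints, and comparing against the CPU count is only a rule of thumb. os.getloadavg() is available on Unix-like systems only, and on Linux the figure also counts processes in uninterruptible sleep, so treat the labels as a rough guide.

```python
import os


def describe_load(load, cpus):
    """Give a rough, hypothetical label for one load-average value.

    load is an average run-queue length, so it is compared with the
    number of CPUs instead of being treated as a percentage.
    """
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    if load <= cpus:
        # On average there was a CPU available for every runnable process.
        return "no backlog"
    # More runnable processes than CPUs: some of them had to wait.
    return "backlog"


def main():
    # os.cpu_count() can return None, so fall back to 1.
    cpus = os.cpu_count() or 1
    # os.getloadavg() returns the 1-, 5- and 15-minute averages (Unix only).
    triplet = os.getloadavg()
    for minutes, load in zip((1, 5, 15), triplet):
        label = describe_load(load, cpus)
        print(f"{minutes:>2} min: {load:.2f} on {cpus} CPU(s): {label}")


if __name__ == "__main__":
    main()
```

This also shows why a load of 3 can be alarming on a single-CPU machine yet unremarkable on one with four CPUs: the number is absolute, so what counts as "safe" depends on the computer.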
