For simplicity, let’s say you have a single-CPU system that supports “dynamic frequency scaling”, a feature that allows software to instruct the CPU to run at a lower speed, commonly known as “CPU throttling”. Assume for this scenario that the CPU has been throttled to half speed for whatever reason: it could be thermal, it could be energy efficiency, it could be the workload. Finally, let’s say there’s a CPU-intensive program running, calculating the Mandelbrot set or something.
The question is: What percentage CPU usage should performance monitoring tools report?
Should it report 100%, or 50%? This is like asking what side of the bed is the front, and which side is the back – you can make valid arguments either way, and nobody is wrong or right.
Another question is: how much CPU processing power is required to achieve a task? The question that naturally follows is why Windows and all the bloated applications are so slow when created by lazy programmers with a bazillion CPU cycles at hand. Why not have a graph showing effective performance compared to a P200 or, say, a modern two-core 64-bit processor as a baseline? Why not other graphs asking different questions, such as the likelihood of your configuration being hacked compared to a “gold standard” configuration disconnected from the internet?
As some people suggested in the comments, the answer depends on what question you ask.
The questions asked may depend on what they want you to know.
The real question is: if I need to know all of the information mentioned in the article, then why am I trying to squeeze it into a single number? Obviously what I need is not a useless lecture on why everyone else is wrong, but rather a different tool to fit my purpose.
People who use top and similar simplistic system monitoring tools probably want to see “100%” CPU usage when the app is consuming all cycles available on the system. More complicated scenarios require more complicated system analysis tools.
edit: Oh, only now I noticed that it’s a Microsoft blog post. I guess it explains everything then.
I have to 200% agree with you. From the first moments I thought you need at least two numbers, i.e. what percentage of the CPU is being used and what percentage of the CPU’s top performance it is running at. Trying to cram that and still other factors into a single number is a total waste.
I use https://www.alcpu.com/CoreTemp/ to display various things in the tray (CPU temp, CPU hz, …) and https://docs.microsoft.com/en-us/sysinternals/downloads/process-explorer to do the same (CPU load, MEM commit, MEM physical, …)
With just these two, I get the info: CPU Hz gives the cap, CPU load gives the level.
Maybe a third option, as you say, would be good: CPU ops (to simplify).
QED
Raymond Chen touches many interesting topics in the Windows system, and this one is also a good brain exercise.
If, for example, you are running Blue Iris for your cameras as an NVR, you may want to monitor CPU usage to see how much its functions (like motion detection) are taxing the system.
Let’s say the software is using 25% of the theoretical CPU limit, but the CPU throttles down to save power. Would you like to see 100% in the stats and freak out? Or see 25% and not worry about the power-saving details?
(If you really worry about power saving, there should be another metric to monitor that separately).
So, in this particular aspect, I agree with the decision taken by their system team.
If my system is lagging and stuttering, and I see the most resource-intensive task isn’t using all the CPU resources available, I’ll assume there’s either a bottleneck somewhere else in my system, or my CPU is faulty and needs RMAing.
The123king,
CPUs rarely need RMA’ing, but I get you. If the CPU governor is not able to sync up properly with demand, it will cause stuttering, but in practice, this is a solved problem.
There are two more reasons for lowering CPU speed: thermal throttling, and power throttling.
If the system is too hot, it will slow down until cooling keeps up. Many cheap laptops have this issue, where the tiny whiny fan cannot pull out enough heat. Placing it on a carpet or never cleaning vents does not help either.
And if you are low on battery (think 1%) and using a low-powered adapter, the CPU will throttle to avoid random power-offs. Again, cheap laptops can have low-powered adapters.
sukru,
I can attest to this issue as well, but it’s not unique to “cheap laptops”. Thermal throttling is notorious on many high-end laptops too, particularly under multi-threaded loads where all cores are busy. I’d even say this has gotten worse as engineers have focused on lightweight, sleek, and quiet laptop designs at the expense of thermal considerations, resulting in computers that cannot perform to spec for long before throttling.
Perhaps the future will be better, though, if laptop designs stop compromising on thermals while CPUs continue to improve.
Alfman,
You are right, “cheaping out” is possible even for higher-end laptops. My XPS is notorious for this, with a high-end spec and a very tiny fan.
I disabled turbo, and as you hinted, if all cores no longer run at full speed it actually stays cool and does not even turn the fan on for long periods. But then, it is like running an i7 almost as an i5.
Sukru,
I was really talking metaphorically from an end-user standpoint. The average user will understand that percentages should add up to 100%. If they don’t, then something is wrong with either the software or the hardware. If it says I’m only using 25% of the CPU but the computer has ground to a halt, obviously the hardware is faulty, as it shouldn’t grind until it’s at 100%.
The123king,
Sure, it might happen. A low-powered laptop with a tiny or clogged fan will start whining at low CPU usage. But even though Task Manager could show 25% usage, the end user will (1) see in the same UI that it is throttled, and (2) feel the heat on the chassis.
And why shouldn’t the user complain about real hardware issues in that scenario?
IMHO the answer is that asking such a question is trying to fit a square peg into a round hole.
Whether percent utilisation should represent percent of current capacity or percent of total possible capacity is a false dichotomy; both are probably useful. But I would opt for the first, as it hedges against the total relative performance of your hardware, whereas the latter pins what was a relative measure to an absolute: your processor’s floating TDP.
I instead propose that a new measure should be added which resolves this issue: alongside percent of total possible utilisation (relative), one should see current usage (total and per process) in absolute terms, i.e. watts or joules.
This should be possible.
Percentage CPU load should be relative to the total CPU resources available at the time.
The123king,
I agree. If the CPU is throttled to an arbitrary amount, say 20%, and at the same time the process monitors show 100%, I suspect that nearly all users will be misled. Even if you know what’s going on, usage measurements become significantly less useful if the measuring stick is constantly changing. It’s better to have constant measurement units.
This closely mirrors the relationship between the tachometer and the speedometer in a car. For the vast majority of drivers it’s the speedometer that matters, since it’s a proportional measurement of output. The tachometer is more of a means to an end, and there are probably many drivers who don’t understand it very well. Make a tachometer available to anyone who wants it for diagnostics, but the output speed is typically more meaningful while driving.
I don’t see how this can work in a meaningful way.
Using this method, as a task approaches the load ceiling of the current state, the CPU should detect increased demand and transition to a less energy-efficient but faster state, thereby lifting the CPU load ceiling. The result is that a process using lots of CPU will go DOWN in relative CPU utilisation as it increases demand for processing. It will see-saw until the CPU is at its full load capacity.
This whole mess gets even further complicated when you talk about transitioning between cores on an asymmetric setup like big.LITTLE.
This adds even more weight to the idea of absolute watts as an independent measure per task, while keeping load % the same.
You can then scale watts by the total watts the CPU can burn in its current state, and also by the maximum watts available at its highest state, to get percentages.
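Roughly, the two percentages being proposed could be computed like this (a minimal sketch; the function name and wattage figures are made up for illustration):

# Hypothetical sketch: express a per-task power reading against the
# current power-state ceiling and against the absolute ceiling.
def power_percentages(task_watts, current_state_max_watts, highest_state_max_watts):
    pct_of_current_state = 100.0 * task_watts / current_state_max_watts
    pct_of_max_state = 100.0 * task_watts / highest_state_max_watts
    return pct_of_current_state, pct_of_max_state

# e.g. a task drawing 10 W while the current state can burn 25 W and the top state 65 W:
print(power_percentages(10.0, 25.0, 65.0))   # (40.0, ~15.4)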
jmorgannz,
I guess The123king’s post was unclear and a bit ambiguous. I assumed “total” meant the maximum the CPU can output as configured at the time, whereas you read it as meaning the total it can output at a given CPU power state. I don’t know which he meant.
Yes, that’s much more complicated. Most Linux tools will add the % from each CPU together, so 8 cores would yield up to 800%. But in a big.LITTLE configuration the percentage from one CPU doesn’t represent the same amount of work as another. Say you’ve got 4 big cores and 4 little cores: you could make the case that the “big” and “little” CPUs should be summed separately, so (400%, 400%), or (0%, 400%) would mean only the little cores.
On the other hand, if you just want to know whether the system is under full load regardless, then maybe it makes sense to add the big and little cores together, i.e. 800%. But values like 600% would be completely ambiguous because you don’t know which type of cores are utilized or underutilized. Using tools that show the % for each core helps, but many tools don’t label which type each core is, so I guess you just have to know.
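As a rough illustration of summing the clusters separately on Linux (which core numbers belong to the big vs. little cluster is an assumption here; a real tool would query the core capacities from sysfs):

# Sketch: per-core busy % from two /proc/stat samples, summed per cluster.
import time

def cpu_times():
    times = {}
    with open("/proc/stat") as f:
        for line in f:
            if line.startswith("cpu") and line[3].isdigit():   # per-core lines only
                fields = line.split()
                vals = list(map(int, fields[1:]))
                idle = vals[3] + vals[4]                        # idle + iowait columns
                times[fields[0]] = (sum(vals), idle)
    return times

a = cpu_times(); time.sleep(1); b = cpu_times()
busy = {c: 100.0 * (1 - (b[c][1] - a[c][1]) / max(1, b[c][0] - a[c][0])) for c in a}

big    = ["cpu4", "cpu5", "cpu6", "cpu7"]   # assumption: cores 4-7 are the big cluster
little = ["cpu0", "cpu1", "cpu2", "cpu3"]
print(f"big: {sum(busy[c] for c in big):.0f}%  little: {sum(busy[c] for c in little):.0f}%")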
It may be interesting to account for power utilization by task; it could help developers optimize software more, something our industry has gotten extremely lax about. The numbers would naturally be somewhat different from one system to the next, but it could open people’s eyes as to which software to look out for. If the data showed that postfix uses four times as much energy as exim, that could be actionable information for both users and developers, although the comparisons might not always be apples to apples.
I’m not sure whether we have enough CPU telemetry to get a precise energy reading per task. We could try to estimate it, but then we would probably need some complex calibration process. Also, would you account for memory and disk energy usage as well? In principle this might be possible; I don’t know if any operating systems support this level of accounting though.
I don’t know how the heck you quote in this thing, but I am replying to your last paragraph.
I don’t think it’s practical to make precise statements about per-task energy usage. It would always be an indicator.
It should be possible to know a core’s total power consumption per timeslice, the way power is accounted for in modern CPUs, even if the CPU firmware just states maximum watts per energy state.
You then proportion that out to the tasks that ran in that timeslice, percentage-wise, the same way the current percent calculation works, with a fake idle process eating the remainder (and I guess not being shown as using any energy?)
When you DO add that together across multiple heterogeneous cores, you then get meaningful numbers, as watts are absolute, not relative like %.
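The proportioning step might look something like this (a minimal sketch; the task names and numbers are invented):

# Split a measured per-interval energy figure across tasks by their share
# of CPU time in that interval.
def apportion_energy(interval_joules, cpu_seconds_by_task):
    total = sum(cpu_seconds_by_task.values()) or 1.0
    return {task: interval_joules * secs / total
            for task, secs in cpu_seconds_by_task.items()}

# 12 J burned by the package over the last second, split by CPU time used:
print(apportion_energy(12.0, {"mandelbrot": 0.70, "browser": 0.20, "idle": 0.10}))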
jmorgannz,
In wordpress you can use HTML blockquote tags.
In Windows I recall a motherboard-specific utility that reported the power of the entire CPU, but not per core. I’m not aware of a utility in Linux that reports this information per CPU or per core. If you know how to do it, please let me know 🙂
Even if there is telemetry data for power consumption, the overhead of accounting for it accurately on every single task switch could be an issue, but it really depends how it works. If the information is available via contention-free CPU registers, then it could be read very quickly. But if it has to be accessed through a shared PCI device, or even bit-banged over I2C, which a lot of motherboards use for sensors, performance could suffer.
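For what it’s worth, on Linux boxes with the intel_rapl powercap driver there is a package-level (not per-core) energy counter exposed in sysfs, which at least gives whole-CPU power without a motherboard-specific utility; a rough sketch:

# Read the RAPL package energy counter twice and derive average power.
# Requires the intel_rapl powercap driver; the node name can vary and
# reading it may need root on some systems. The counter is per package.
import time

PATH = "/sys/class/powercap/intel-rapl:0/energy_uj"   # typically package 0

def read_uj():
    with open(PATH) as f:
        return int(f.read())

e0 = read_uj(); time.sleep(1.0); e1 = read_uj()
print(f"package power: {(e1 - e0) / 1e6:.1f} W")       # microjoules over ~1 second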
Raymond Chen is a waste of brain cycles for even posing this fortune cookie question.
Oh, well, maybe you should ask him for a refund.
The correct answer involves recognizing that dynamic frequency scaling is dynamic. In other words, the OS could happily increase the CPU speed if it felt like allowing the process to use more than 50% of the CPU’s potential, and “using 100% of 50% of a CPU’s potential” very much should be reported as “50% CPU load”. Of course an OS should be smart enough to figure out dynamic frequency scaling as a compromise between energy efficiency and task priority (e.g. if high-priority tasks need CPU time the CPU should be running as fast as it’s able, but if only low-priority tasks need CPU time then power consumption is more important than performance, and so on).
Further, I’d say “100% CPU” should be equivalent to the maximum amount of work a CPU could sustain indefinitely, and things like TurboBoost should allow a task to use 120% of CPU temporarily.
Sadly, for multi-core and hyper-threading this gets messy – you’d have to benchmark/calibrate (maybe with all CPUs going flat out with fans at max speed for an hour) to determine what “100% CPU” actually is; then have detailed monitoring of things like TurboBoost and thermal throttling (“forced by over-temperature conditions and not voluntary at the operating system’s request”) to determine what the current CPU speed actually is; plus some kind of generic rules for figuring out how hyper-threading should influence the reported CPU load (“both logical processors running = 50% of core per logical processor, one logical processor running and the other idle = 75% of core for the logical processor that’s running”? You’d need more benchmarking there).
Of course when doing it right is hard, it’s no surprise that operating systems do it the easy/wrong way.
My thought here basically boils down to the following:
100% CPU == 100% load at the nominal clock speed (or, if less than nominal, the maximum possible current clock speed)
So, if the CPU is at 100% load at 50% of the nominal clock speed for energy saving, but it could get to 100% of nominal right now, report 50% load.
Turbo boost and similar technologies should be over 100% at least on a per-core basis. All-core turbo should be over 100% on a CPU basis. (Note that boosts that allow a core to exceed the all-core turbo speed increase the per-core percentage, but not the total CPU percentage.)
SMT gets really weird in this model, though, due to how dynamic its effects on core load are.
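A minimal sketch of that model (the nominal and turbo frequencies here are invented):

NOMINAL_GHZ = 3.5   # assumption: the chip's nominal (base) clock

def core_load_pct(busy_fraction, current_ghz):
    # 100% == flat out at the nominal clock; turbo can push a core past 100%
    return 100.0 * busy_fraction * current_ghz / NOMINAL_GHZ

print(core_load_pct(1.0, 1.75))   # 50.0  -- fully busy at half the nominal clock
print(core_load_pct(1.0, 4.2))    # 120.0 -- single-core turbo exceeds 100%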
CPU utilization is independent of CPU frequency. So CPU utilization should be a snapshot of current processor resources being busy, averaged over whatever sampling period.
Current Windows Task Manager does it right: % Utilization in the graph + CPU speed on the side to contextualize the graph.
javiercero1,
It’s debatable, which is the whole point of the article. The vast majority of users will expect the reported value to reflect a percentage of max capacity rather than of current speed. When someone sees 99%-100% CPU, they think the CPU (or core) has no more capacity, which is extremely misleading if the CPU is in fact only running at a reduced clock frequency.
Even for those of us who understand what’s going on, I don’t want my CPU tools reporting a 100% load when it’s operating at 50% speed. Of course the information should be available for those who want it, but the more important metric for the vast majority of users is the percentage of total available capacity.
In principle you might even have a clockless CPU that doesn’t tick unless there’s work to be done. Such a CPU would always be at 100% utilization by definition, because every single tick is doing work, no matter how much faster the CPU could go. You could be at 1% load and be showing “100% utilization”, but it wouldn’t be very useful.
If I’m not misunderstanding you, you are suggesting this is how it works:
plot_percentage = cpu_utilization
When I open the Windows Resource Monitor (albeit on Windows 7, since I don’t have 10 readily available) and limit the clock speed using power options (confirmed by Resource Monitor), the plotted graph never reaches 100%, even when I run a task that uses all available CPU.
A little bit of testing seems to show that windows is using something like this instead:
plot_percentage = cpu_utilization * current_freq / max_frequency
This is all in line with what I personally expected it to do, but if you have any information that contradicts this, please link it, because I’d like to know how it works.
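To make the hypothesis concrete, here is roughly the same formula expressed against Linux’s cpufreq sysfs values (the Windows internals may well differ; cpu_utilization is just a placeholder argument here):

# Sketch of the scaling hypothesis above, using cpu0's cpufreq files for brevity.
def read_khz(name):
    with open("/sys/devices/system/cpu/cpu0/cpufreq/" + name) as f:
        return int(f.read())

def plot_percentage(cpu_utilization):
    cur = read_khz("scaling_cur_freq")
    max_ = read_khz("cpuinfo_max_freq")
    return cpu_utilization * cur / max_

print(plot_percentage(100.0))   # 100% busy at half speed would plot as ~50%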
In architecture, CPU utilization is a metric that measures the % of processor resources being used based on the processor state at the time of sampling (which can be simplified as the number of active processes vs. the average time processes spend waiting on I/O).
Whatever speed the processor is operating at at the time of sampling is the speed for that state. It doesn’t matter what speed your processor is capable of; for the utilization at the time of sampling, only the actual speed at that time matters.
If the CPU reports 99% utilization, that is in fact the max capacity for the CPU at that time. The frequency governor has limited the max frequency for a specific reason, so that is as much work as the CPU can get done.
CPU utilization is a snapshot of the current capacity of the system, not of the future or the past.
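In other words, something like this toy calculation, with no frequency term anywhere (numbers invented):

sample_period_s = 1.0
busy_s          = 0.99        # time the CPU spent not idle during the sample
utilization     = 100.0 * busy_s / sample_period_s
print(utilization)            # 99.0 -- regardless of whether the clock was 2 or 4 GHz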
javiercero1,
For most people, though, it makes more sense for 100% to equal the core’s max performance rather than 100% of some arbitrary throttling factor, which is pretty meaningless to most people.
Assume we have a CPU capable of 4GHz and a load that slowly rises from 0% to 100% of the 4GHz capacity. As the process goes from 0% load to 100% load, the system at 4GHz will display CPU usage from 0% to 100% (all of this is as expected).
Now let’s introduce a CPU energy-saver mode that can scale back to 2GHz. As the process goes from 0% to 50% load, the CPU utilization would go from 0% to 100% at 2GHz. But this “100%” is not a meaningful value for most users, who expect to see “50%”.
As the process continues from 50% to 100%, the CPU must transition from 2GHz energy-saver mode to 4GHz full-power mode, but at this point the CPU utilization will suddenly drop from 100% at 2GHz to 50% at 4GHz, which is the same amount of work. Users would see a drop in CPU utilization even as the workload keeps increasing! Most users wouldn’t be able to make heads or tails of this. Moreover, the CPU performance graph would be mostly useless, as you’d have samples taken at 2GHz and 4GHz plotted alongside each other with a scale that isn’t proportional to the workload.
Just to make it clear: 50% at 2GHz is half the workload of 50% at 4GHz. Obviously these two 50% values are NOT equal, yet without a scaling factor there would be no change in the CPU load graph, which isn’t what users expect.
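The arithmetic behind that claim, in case it helps (cycles actually executed per second):

def busy_cycles_per_sec(duty_cycle_pct, freq_ghz):
    return duty_cycle_pct / 100.0 * freq_ghz * 1e9

print(busy_cycles_per_sec(50, 2.0))   # 1e9 cycles/s
print(busy_cycles_per_sec(50, 4.0))   # 2e9 cycles/s -- twice the work at the same "50%"
print(busy_cycles_per_sec(100, 2.0))  # 2e9 cycles/s -- same work as 50% at 4 GHz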
That’s fine if that’s your opinion, but most normal users will expect to see the CPU load in terms of full capacity, which is why the Windows, Linux (and I suspect macOS) CPU monitoring tools scale the CPU load values. This way 0%-100% values are comparable regardless of the frequency the CPU is running at.
The issue is that you’re mixing up three different metrics/concepts: usage, load, and frequency.
javiercero1,
Except that I was right: the CPU usage displayed by Windows and other operating systems is adjusted for frequency to make the percentages line up consistently regardless of frequency changes. If you have a problem with something specific, then please provide a robust counterargument. I think that’s fair.
I don’t think you’re right. You’re treating system utilization as directly proportional to frequency, which is not correct.
A 4GHz processor can be at 100% utilization when it’s throttled down to 2GHz, and also be at 100% utilization when it throttles up to 4GHz.
The CPU usage displayed by Windows basically uses NtQuerySystemInformation to create a graph from the performance counter structures. And I don’t believe it adjusts for frequency.
javiercero1,
Yes, that is what I was talking about. 100% at 4GHz and 100% at 2GHz are doing completely different amounts of work, and plotting them together on a graph without scaling would be misleading to a typical user. The graph would contain discontinuous jumps as the frequency changed.
Well, if I had the source code, I could take a look and see where the code does what. Alas, I don’t have the source code. Still, Windows Resource Monitor does appear to scale for frequency. Perhaps we can test it more thoroughly if you think that would help.
Again, I don’t think you understand what utilization represents. It is independent of frequency.
Utilization is not scaled for frequency; it’s based on averaging the performance counters over fixed sampling periods.
The performance graph in Windows is not scaled for frequency.
javiercero1,
I do understand what you are saying, but that’s not what the CPU graphs are showing, and moreover it wouldn’t make much sense for the graphs to work the way you’re suggesting, because it would be misleading and not useful for most purposes.
I don’t think what I said came across, so here’s a visual example that should help clarify…
https://ibb.co/HGLBXqX
This CPU can run at 1GHz, 2GHz, 3GHz, or 4GHz. 1GHz is the most efficient, 4GHz the fastest.
For simplicity, assume 1 unit of work requires 1% of the CPU at 4GHz.
As the data show, 5 units of work can be run at a 5% duty cycle at 4GHz, a 7% duty cycle at 3GHz, a 10% duty cycle at 2GHz, etc. I’ve also added the duty cycle at the most efficient frequency that can sustain the load (the purple line). For the sake of simplification, let’s assume the CPU always chooses the most efficient frequency for the workload, without delay or interference from other system processes.
Now, your opinion is that the CPU monitor should always show the duty cycle without regard to the current frequency. Well, this corresponds to the purple line, which always displays the duty cycle without regard to CPU frequency. My point is that the purple line is too erratic for normal users to make sense of. Even though the amount of work steadily increases from 0 to 100, the optimal duty cycle is quite chaotic.
Say the Windows Resource Monitor graph worked this way and a system administrator were using it to analyze a server. He would clearly see the areas where the graph goes to 98+%, but looking at this he wouldn’t know whether it was a lightweight, moderate, heavy, or crushing workload, since all those scenarios can produce the same “100%” reading at different frequencies.
This is clearly not what users expect a CPU/core monitor to show. The solution is to multiply the duty cycle by the CPU frequency ratio to get a reading that is consistent across frequencies. If we plot it this way, we get a nice graph that is continuous even as CPU frequencies change. This way the CPU usage can be represented the way most users expect (i.e. 100% at 1GHz is really 25% of total). From my observations, this appears to be what Windows Resource Monitor is doing.
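A toy version of the chart’s logic, assuming the governor always picks the lowest frequency that can sustain the load (all numbers illustrative):

# 100 units of work == a 4 GHz core fully busy, as in the example above.
FREQS_GHZ = (1.0, 2.0, 3.0, 4.0)

def readings(work_units):
    freq = next(f for f in FREQS_GHZ if work_units <= 100.0 * f / 4.0)
    duty = 100.0 * work_units / (100.0 * freq / 4.0)   # raw duty cycle (the purple line)
    return duty, duty * freq / 4.0                     # (raw, frequency-scaled)

for w in (5, 20, 30, 55, 80, 100):
    raw, scaled = readings(w)
    print(f"{w:3d} units: raw {raw:5.1f}%  scaled {scaled:5.1f}%")
# The raw column jumps around as the frequency steps up; the scaled column tracks the workload 1:1.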
No need for that..
The sampling is done in fixed periods. There’s no need to adjust for frequency in the final composite, i.e. what is reported in the graph, because it is the average of the performance counters for that fixed sampling period.
There are multiple clock domains within a core and across cores, and the frequency may vary within a single sampling period.
In the end, the number reported in the graph is just the average for that sample. And since the frequency may have varied within that sample, it makes no sense to adjust the average with respect to the max frequency.
In the old days, when there was one single clock domain for the entire CPU, there was a clear correlation between utilization and processor performance. Right now utilization and performance are slightly decoupled because the frequency is variable, but the sampling period is still fixed.
Which is why the assumption that a CPU at 2GHz with a utilization of 100% is equivalent to the same CPU at 4GHz with a utilization of 50% is wrong.
javiercero1,
It doesn’t matter when the sampling is done. It could be done every second, every five seconds, every millisecond, whatever. The problem is that the 0-100% range from one sample doesn’t align with the 0-100% range of another sample, and yet they get plotted side by side. The fact is that CPU duty cycles will become misaligned in the plot if we don’t account for CPU frequency changes. Ignoring this problem results in a distorted, non-continuous graph. The example chart I’ve provided shows this very clearly, so please pay close attention to it.
That’s not true; please take a closer look at what Windows and other operating systems are doing. I’ve tested it already and it behaves as I’ve described. If you disagree, then please test it yourself and gather proof before telling me that frequencies are not used.
Saying that it’s wrong suggests there’s something you’re still not understanding, and it’s crucial that you understand this to be able to follow my point and what CPU monitoring tools are doing. Say you’ve got a socket daemon that requires a million clock cycles to handle a client request. At 4GHz the CPU can handle up to ~1000 requests per second. At 2GHz the CPU can handle ~500 requests per second. At a 4GHz clock, the CPU can also handle ~500 requests per second at a 50% duty cycle. In other words, the amount of work done at 4GHz with a 50% duty cycle is roughly the same as the amount of work done at 2GHz with a 100% duty cycle, which is why we line the scales up accordingly.
There could be some minor variation due to external factors like RAM speed, interrupts, and so on, but using simplified examples helps us stay focused on the big picture for now.
Correction:
In my head I did 1GHz / 1 million clock cycles = 1000 requests per second instead of 4GHz / 1 million clock cycles = 4000 requests per second. Sorry about that, it’s late. I hope you can still follow what I meant.
I can explain it to you; I cannot understand it for you.
You’re still not understanding what utilization/load means. You’re thinking those graphs are telling you “work done”, which is not what’s going on. The graph is telling you the percentage of processor resources that were busy, on average, between fixed sampling periods.
Utilization and frequency are orthogonal. Perhaps you are thinking of frequency as another resource that should be added to the composite. The issue is that it cannot be treated as a multiplicative factor. Again, 100% utilization at 2GHz does not mean that the same processor at 4GHz would be at 50% utilization.
If I understand you correctly, you are thinking that the graphs represent how much of the “ideal” processor state (i.e. at max CPU frequency) the current processor state would map to. But that is not what those graphs are saying.
The sampling period is fixed, ergo the 0-100% scale is with respect to that. The CPU frequency is variable during that sampling period, thus it makes no sense to scale the scale further.
javiercero1,
You’ve got to quit doing this, javiercero1; my disagreeing with you is not due to a lack of understanding, and that’s insulting. It’s the lack of data; there’s a big difference. It would be more helpful for you to respond with data when I ask for it.
But I have good news and have found the source of our disagreement. The data points I measured before using Windows Resource Monitor did not exhibit any discontinuities because the default power mode on the Windows computer was not scaling the frequency (it’s extremely aggressive about going into “turbo” speed), which is why I observed linear scaling.
However, when I set the policy to “powersave”, dynamic frequencies become much more prevalent and the resulting distortions to the CPU usage graph are much more noticeable (i.e. the usage graph can drop even as the load is increasing, exactly as I was explaining above).
I maintain, as the author does, that it is confusing and misleading to show users percentages that are not adjusted for frequency, but like me, most people probably never noticed this on Windows because the default power policy is quick to enter full-power mode, where the duty cycle and load are directly proportional to one another.
Just as it is insulting when I keep explaining basic concepts from microarchitecture to you, and you keep referring to them as my “opinion.”
We go down these rabbit holes where it’s clear that you don’t know WTF you’re talking about, yet you keep insisting on being correct regardless, rather than expanding your understanding.
The graphs are doing exactly what they are supposed to do: present USAGE of resources. I keep telling you that normalizing them to frequency makes no sense, since the number of cycles is already taken into account in the average.
You and the author of the article should understand that already. JFC
javiercero1,
You really want to bring that up? We all have different opinions, and frankly yours are no better than mine. Here we are all peers. I think you’re a smart guy; however, IMHO you rely too heavily on argument from authority. It would be much more convincing if you cited real data and real-world benchmarks to make your points. Data is much harder to dismiss than an opinionated argument. Also, don’t fall for the temptation of ad hominem attacks. That’s my advice, for whatever it’s worth. Back to the topic at hand…
On the contrary, I understood everything you wrote. The problem is you hadn’t provided any data or evidence. In fact, it wasn’t until I had the idea of switching my CPU’s power mode that I actually saw clear evidence of the graphing discontinuities, which prove that Windows is not scaling the graph as I thought it would.
Again, I understand that’s your opinion, but I share the opinion of the author: I think most people actually want to be looking at CPU load. The CPU’s frequency and duty cycle are under-the-hood details, in much the same way that a car’s gear ratio and RPM are under-the-hood details.
Very few people expect the graph to drop at some points where the load goes up and rise at some points where the load goes down. They are extremely likely to draw erroneous conclusions from that. That said, this problem is largely masked thanks to the default power profile making the results as linear as they are.
No, we’re not peers. Not even remotely.
For the last time: it makes no sense to normalize processor load/utilization with respect to frequency, because frequency is a dynamic variable, just as we don’t normalize for the number of processes either. Frequency is not linearly correlated with load/utilization.
The processor load/utilization is a simple composite average of a bunch of performance counters over a set sampling period. The number of cycles in the period is one of the performance counters, and that accounts for the frequency effects.
Just because a processor is running at 4GHz does not mean it can be more utilized, or get more work done, than when it is at 2GHz.
Modern out-of-order CPUs with dynamic power/frequency domains can be very counterintuitive to people with no formal background in microarchitecture. And those graphs are actually telling the correct story.
javiercero1,
You may not like it, but we are peers here.
There are many solutions; the CPU provides the necessary telemetry. It’s just data points and a bit of calculus. I believe the only reason you are making it so complicated is that you’ve mentally committed to digging your heels in.
I’m well aware of the math; I already gave you the mathematical relationship earlier. You’re the one making it sound like it’s impossible to calculate and/or graph the amount of work a CPU is doing, which is silly.
Obviously there may be other bottlenecks in the system that inhibit the CPU from reaching max performance, but in terms of the CPU’s capabilities, yes, 4GHz can do twice the work of 2GHz. This is not complicated.
I never claimed the CPU duty cycle graphs are incorrect, only that CPU duty cycles are non-continuous over frequency changes, which is a fact. This makes them counterintuitive to normal users, who expect them to be proportional to load. It’s perfectly valid to hold the opinion that “work getting done” is the more useful metric.
I know that you know I have a valid point, but the real question is whether you’ll ever admit it. You’ve turned this into an argument over ego. I could say, “yeah javiercero1, it makes no sense to show users how much work their CPU is doing”, but it would be dishonest of me. So how do you want to end this? Will a simple “let’s agree to disagree” do?
Basically, you should have all of this information available at your fingertips to understand what is happening:
% of theoretical maximum power with no throttling, per core and per system
% of available CPU processing, per core and per system
% efficiency of use of the available resources*, per core and per system
If you can’t introduce additional variables, well, then you have an unsolvable problem, and we should work on making better tools that show how the system is performing (a sketch of the first two numbers follows below).
*Efficiency is kind of a pipe dream, but maybe it would be possible in a limited capacity: when branch prediction has been wrong, lots of cache misses, lots of bottlenecking on cache lines, etc. Or really, those might make sense as a breakdown of this crazy overall number.
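A rough sketch of how the first two numbers might be derived from a per-core busy fraction plus current and maximum frequency (the names and figures are invented for illustration):

def core_metrics(busy_fraction, cur_ghz, max_ghz):
    pct_of_theoretical_max = 100.0 * busy_fraction * cur_ghz / max_ghz   # vs. unthrottled max
    pct_of_available       = 100.0 * busy_fraction                       # vs. current capacity
    return pct_of_theoretical_max, pct_of_available

# A core 90% busy while throttled to 2 of 4 GHz:
print(core_metrics(0.90, 2.0, 4.0))   # (45.0, 90.0)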