It is not easy for a power-constrained system to save power and, at the same time, always be responsive. In fact, the most efficient way to save power is to keep components off as much as possible. For CPUs, this means keeping frequency as low as possible, and idle states as deep as possible. This power-saving policy leads to high latencies if CPU workloads happen to increase suddenly (then latency gets back under control as the system performance gets raised to a level compatible with the new load).
With audio workloads, this issue can cause perceivable glitches and silence gaps. We have reproduced these problems deterministically with the SynthMark audio emulator, on a Qualcomm 845c Dragonboard, with a 5.10 kernel and the Android Open Source Project (AOSP) master.
More importantly, we have also devised a solution to this problem. It proved remarkably effective with SynthMark workloads. Details follow.
Spotting the problem
by Biagio Ferri and Davide Zini
SynthMark is a benchmarking emulator that generates audio tracks and collects relevant parameters, including audio latency. We focused on a specific test: switch mode, a case where the workload switches repeatedly from low to high. This is the most stressful and unfriendly pattern for audio latency. Using the board’s default CPUfreq Governor, schedutil, we got a latency around 12ms. On the opposite end, with the performance CPUfreq Governor (all CPUs at maximum frequency) and deepest idleState disabled, latency drops to 2ms (until thermal mitigation does not come into play). The high latency in default mode is evidently caused by the performance level being too low.
Adding utilClamp controller
by Biagio Ferri
Starting from SynthMark, after several consultations on possible hypotheses, it was decided to deal with the problem by implementing a jump-to-max and slowly-decrease policy, to choose the right value of frequency for handling a load increase as quickly as possible. This mechanism was implemented by modifying the utilClamp parameter, a parameter that allows a single process not to fall below a certain frequency threshold. In more detail, when an underrun for audio buffers is detected, utilClamp is pushed immediately to its maximum possible value. Then, as a load decrease is detected, utilClamp is decreased linearly in small steps, until the CPU frequency complies with the current load.
With this solution in place, latency falls from 12ms to 6ms, which essentially eliminates glitches and silence gaps. And the system remains in the default power-saving mode.