How Process Priority Inversion Can Burn CPU via “Waiting” Processes
For the past few weeks, we have been wrestling with an interesting bug in Oracle 11g at Netflix.
We are seeing high CPU usage that coincides with a high number of waits on the following events:
- cursor: mutex S
- latch: shared pool
I was perplexed as to why waits would result in high CPU usage, so I turned to Google. I didn’t find an answer for 11g, but I did find a similar problem that was reported in 10.2 and said to be fixed in 11g.
In Oracle 10.2, on operating systems that support process priority decay (e.g. AIX and HP-UX, where a policy such as fair round robin applies; it is the default on AIX 5), priority inversion can cause “blocked” threads to burn CPU. The post linked below discusses “cursor: pin S” waits rather than “cursor: mutex S” waits, but the pattern is the same.
http://blog.tanelpoder.com/2010/04/21/cursor-pin-s-waits-sporadic-cpu-spikes-and-systematic-troubleshooting/
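In Oracle 10.2, latch contention causes waits, while mutex contention causes CPU spins, which is why a session that is nominally “waiting” on a mutex can still consume CPU.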
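To make the mechanism concrete, here is a toy single-CPU simulation in Python. It is only a sketch and does not model Oracle's or AIX's actual scheduler: the function name `simulate`, the "penalty" decay rule, and all of the numbers are assumptions chosen for illustration. The mutex holder starts with a decayed (low) priority because it has already used a lot of CPU, and the waiters either spin (stay runnable) or block (leave the run queue).

```python
def simulate(waiters_spin, n_waiters=3, holder_penalty=10, holder_work=5):
    """Toy single-CPU scheduler with priority decay (illustrative only).

    Each process has a penalty; a higher penalty means a lower priority.
    Running for one tick raises the penalty by 1, which is the "decay".
    Returns (total_ticks, ticks_wasted_by_waiters).
    """
    # The holder has already burned CPU, so its priority has decayed.
    penalty = {"holder": holder_penalty}
    for i in range(n_waiters):
        penalty[f"waiter{i}"] = 0  # fresh processes, best priority

    remaining = holder_work  # ticks the holder needs before releasing the mutex
    ticks = 0
    wasted = 0

    while remaining > 0:
        # Blocked waiters are off the run queue; spinning waiters stay on it.
        runnable = ["holder"]
        if waiters_spin:
            runnable += [p for p in penalty if p != "holder"]

        # Pick the runnable process with the lowest penalty (ties by name).
        chosen = min(runnable, key=lambda p: (penalty[p], p))
        penalty[chosen] += 1
        ticks += 1

        if chosen == "holder":
            remaining -= 1  # real progress toward releasing the mutex
        else:
            wasted += 1     # a waiter spun and found the mutex still held

    return ticks, wasted


if __name__ == "__main__":
    print("blocking waiters:", simulate(waiters_spin=False))
    print("spinning waiters:", simulate(waiters_spin=True))
```

With these made-up defaults, the blocking variant finishes in 5 ticks with nothing wasted, while the spinning variant takes 47 ticks, 42 of which are burned by waiters that keep outranking the holder until their own priority decays to its level.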