We decided to try using a Kubernetes CronJob to measure the Speed Index score of a system. It seemed like a great idea: a job polls a server and measures how quickly the home page loads.
Why Speed Index and not something like BrowserTimings?
Speed Index measures how long it takes for the visible (above-the-fold) parts of a web page to appear to the user. It became part of WebPageTest in April 2012. The Lighthouse-powered Audits panel in Chrome DevTools also measures it.
Speed Index can be used to compare performance against competitors or previous versions of the same website. It is best utilised alongside other metrics like load time and start-render to get a better understanding of a site's quality.
That's pretty awesome. You get a good metric that reflects the user experience. BrowserTimings usually do not factor in when the user can interact with the page. Even if a web page takes 10 seconds to load, if it is usable within 4 seconds, we are good as gold.
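To make the metric concrete, here is a minimal sketch of how a Speed Index value can be derived from visual-progress samples: it sums, over each interval, the fraction of the viewport that is still incomplete. The function name and the sample data are made up for illustration, and real tools such as WebPageTest and Lighthouse derive visual progress from video frames, so treat this as an approximation of the idea rather than their implementation.

```python
def speed_index(samples):
    """Approximate Speed Index in milliseconds.

    `samples` is a list of (time_ms, percent_visually_complete) pairs,
    sorted by time. Each interval contributes its duration multiplied by
    the fraction of the page that was still missing when it started.
    """
    total = 0.0
    for (t0, done0), (t1, _) in zip(samples, samples[1:]):
        total += (t1 - t0) * (1.0 - done0 / 100.0)
    return total


# Hypothetical pages: both are visually complete at 4 seconds,
# but the first one paints most of its content much earlier.
fast_first_paint = [(0, 0), (1000, 80), (4000, 100)]
slow_first_paint = [(0, 0), (3000, 20), (4000, 100)]

print(speed_index(fast_first_paint))  # lower score: content appeared early
print(speed_index(slow_first_paint))  # higher score: content appeared late
```

Both pages finish at the same time, yet the page that shows most of its content early gets a much lower (better) score, which is why the metric tracks perceived speed better than total load time.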
Measuring Speed Index runs a real browser, which needs CPU and memory, both of which can be throttled by Kubernetes.
We noticed that on a normal machine the Speed Index was good, around 5 seconds. In Kubernetes it was around 12 seconds!
We had set some resource limits, but never thought the job would actually get throttled.
resources:
  limits:
    cpu: 1500m
    memory: 1500M
  requests:
    cpu: "1"
    memory: 1G
However, it was getting throttled!
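One way to confirm throttling from inside a container is to read the CFS counters the kernel exposes in the cgroup's cpu.stat file. The sketch below is not the tooling we used; it assumes the usual mount points for cgroup v1 and v2, which may differ on your nodes, and relies on the nr_periods and nr_throttled fields that both versions report.

```python
from pathlib import Path

# Assumed locations inside the container: cgroup v1 first, then cgroup v2.
CPU_STAT_PATHS = ("/sys/fs/cgroup/cpu/cpu.stat", "/sys/fs/cgroup/cpu.stat")


def parse_cpu_stat(text):
    """Turn 'key value' lines from cpu.stat into a dict of integers."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_ratio(stats):
    """Fraction of CFS periods in which the cgroup was throttled."""
    periods = stats.get("nr_periods", 0)
    return stats.get("nr_throttled", 0) / periods if periods else 0.0


def read_container_throttling():
    """Return the throttled ratio, or None if no cpu.stat file is found."""
    for path in CPU_STAT_PATHS:
        stat_file = Path(path)
        if stat_file.exists():
            return throttled_ratio(parse_cpu_stat(stat_file.read_text()))
    return None


if __name__ == "__main__":
    ratio = read_container_throttling()
    if ratio is None:
        print("cpu.stat not found")
    else:
        print(f"{ratio:.1%} of periods throttled")
```

Running this before and after a measurement and comparing the two ratios gives a rough idea of whether the browser was throttled while the page was loading.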
We decided to remove the resource limits.
Now there is no throttling, and the measured Speed Index is similar to what we see on our home device.
The graph above shows that disabling throttling improved the metric.
So for us, it is back to the drawing board to pick a measurement point as close to the client as possible.
Avoid using Kubernetes as a measuring device for metrics that depend on CPU or memory.
There seems to be a bug in Kubernetes with some Linux kernels regarding CFS throttling.
What still puzzles me is that I have a physical node that is barely under load, and a process using 0.4 CPU with a limit of 2.5 CPUs still gets throttled!
This is on Azure Kubernetes 1.16. However, I can reproduce the issue on Google GKE as well. So something in how Kubernetes applies limits throttles more than you would expect!
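One possible explanation, separate from any kernel bug, is that CPU limits are enforced per CFS period rather than on the average. With the default 100 ms period, a limit of 2.5 CPUs becomes a quota of 250 ms of CPU time per period, shared by all threads in the container. The sketch below works through that arithmetic with a hypothetical burst of 10 threads; the thread count and burst size are assumptions chosen to match an average of 0.4 CPU, not measurements from our cluster.

```python
CFS_PERIOD_MS = 100  # default CFS period; assumed unchanged on the node


def first_period_throttle_ms(cpu_limit, threads, cpu_ms_per_thread):
    """Wall-clock ms spent throttled in the first period of a burst.

    Assumes all threads start at the beginning of a period, each runs on
    its own core, and each needs at most one period's worth of CPU time.
    """
    quota_ms = cpu_limit * CFS_PERIOD_MS
    demand_ms = threads * cpu_ms_per_thread
    if demand_ms <= quota_ms:
        return 0.0
    # All threads burn quota in parallel, so it runs out this early:
    exhausted_at_ms = quota_ms / threads
    return max(0.0, CFS_PERIOD_MS - exhausted_at_ms)


# Hypothetical workload: once per second, 10 threads each need 40 ms of CPU.
threads, cpu_ms_per_thread = 10, 40
average_cpus = threads * cpu_ms_per_thread / 1000  # averaged over one second
throttled = first_period_throttle_ms(2.5, threads, cpu_ms_per_thread)

print(f"average usage: {average_cpus} CPUs")
print(f"throttled for {throttled} ms of the first {CFS_PERIOD_MS} ms period")
```

Under these assumptions the process averages well below its limit, yet it exhausts the quota a quarter of the way into the period and sits idle for the rest of it. A browser rendering a page is bursty in a similar way, which may be why the Speed Index suffered so much.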