Using Kubernetes to measure homepage speed index

Hi,

We decided to try out using a Kubernetes cronjob to measure the SPEED INDEX score of a system. We thought this would be a great idea. We can have a job that polls a server and measures the performance of the HomePage loading.

We use Cypress and a lighthouse plugin for synthetics.
https://github.com/mfrachet/cypress-audit

Why Speed Index and not something like BrowserTimings?

Speed index measures the time it takes for the visible (above-the-fold) parts of a webpage to appear to the users. It became part of the WebPageTest in April 2012. The Lighthouse-powered Audits panel in the Chrome DevTools also measures it.

Speed index can be used to compare performance against competitors or previous versions of the same website. It best utilised alongside other metrics like load time and start-render to get a better understanding of a site’s quality.

Thats pretty awesome. You get a good metric that shows you the user experience. BrowserTimings usually do not factor in when the user can interact with the page. Even if a web page takes 10 seconds to load, if it is usable within 4 seconds, we good as gold.

The Problem

SpeedIndex uses your browser and it requires CPU/Memory, both of which can be throttled by kubernetes.

We noticed that when we used a normal machine the speed index was good – around 5 seconds. We noticed in kubernetes it would be around 12 seconds!

We had set some throttling limits, but never though it would get throttled.

resources:
              limits:
                cpu: 1500m
                memory: 1500M
              requests:
                cpu: "1"
                memory: 1G

However, it was getting throttled!

We decided to remove the resource limits

resources:
 {}

Now, there is no throttling and the speed index measured similar to our home device.

The above graph shows that throttling disabled improved the metric.

Conclusion

Even those Speed Index is a great way to measure performance, we cannot use it on Kubernetes as a measuring device, because even with throttling off, you just never know if another container/pod is stealing resources. The best way forward when measuring metrics (NOT GATHERING), is to measure as close as possible to the source e.g. Client Side Javascript etc.

So for us, it is back to the drawing board and pick a measurement as close to the client as possible.

Possible solutions:

https://github.com/WPO-Foundation/RUM-SpeedIndex/blob/master/src/rum-speedindex.js

Avoid using kubernetes to be a measuring device that measures metrics related to CPU/Memory

Questions

There seems to be a bug in Kubernetes with some linux kernels regarding CFS/Throttling

https://github.com/kubernetes/kubernetes/issues/67577

What still puzzles me, is I have a physical node that is not using power. I have a process using 0.4 cpu with a limit of 2.5 cpus and it gets throttled!

This is on Azure Kubernetes 1.16. However I can reproduce the issue on Google GKE as well. So it is a kubernetes algorithm of some sort that limits more than you would expect!

Uncategorized

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s