Mobile web performance: the importance of the device

Let’s look at our web performance data from an angle we haven’t explored before: mobile device type.

By Gilles Dubuc, Senior Software Engineer, Wikimedia Performance Team

Most mobile devices expose their make and model in the User Agent string, which allows us to look at data for a particular type of device. As per our data retention guidelines, we only keep user agent information for 90 days, but that’s already plenty of data to draw conclusions from.
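For illustration, here is a minimal TypeScript sketch of pulling the model out of an Android User Agent string; the regex and the sample string are simplified examples, not the parsing we actually use in production:

```typescript
// Extract the device model from an Android User Agent string.
// Android Chrome UAs typically contain "(Linux; Android 5.1; SM-J200G)".
// Illustrative only; real-world UAs have many more variations.
function deviceModel(userAgent: string): string | null {
  const match = userAgent.match(/Android [\d.]+; ([^;)]+)[;)]/);
  return match ? match[1].trim() : null;
}

const ua =
  "Mozilla/5.0 (Linux; Android 5.1; SM-J200G) AppleWebKit/537.36 " +
  "(KHTML, like Gecko) Chrome/71.0.3578.99 Mobile Safari/537.36";
console.log(deviceModel(ua)); // "SM-J200G"
```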

I looked at the top 10 mobile devices accessing our mobile sites, per country, for the past week. One country in particular, India, had an interesting top 10 that included two models from different hardware generations: the Samsung SM-J200G, commercially known as the Samsung Galaxy J2, which was the 5th most common mobile device accessing our mobile sites, and the Samsung SM-G610F, also known as the Samsung Galaxy J7 Prime, which was the 2nd most common. The hardware of the more recent handset is considerably more powerful, with 3 times the RAM, a 23% faster CPU clock, and twice as many CPU cores as the older model.

Being in the top 10 for that country, both devices generate a lot of traffic in India, which means we have a lot of Real User Monitoring (RUM) performance data, collected from real clients, to work with.

With the J7 Prime currently retailing in India for double the price of the J2, one might wonder whether users of the cheaper phone also use a cheaper, slower internet provider.

Thanks to the Network Information API, which we recently added to the performance data we collect, we are able to tell.
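In the browser, this boils down to reading navigator.connection.effectiveType and shipping it along with the rest of the beacon. Below is a minimal TypeScript sketch; the property names follow Chrome’s implementation, and the beacon endpoint is a placeholder, not our actual collection URL:

```typescript
// Network Information API is not yet in the standard DOM typings,
// so we declare the part of it we need.
type EffectiveType = "slow-2g" | "2g" | "3g" | "4g";

interface NetworkInformation {
  effectiveType?: EffectiveType;
}

const connection = (navigator as Navigator & {
  connection?: NetworkInformation;
}).connection;

// Falls back to "unknown" on browsers that don't expose the API.
const effectiveType = connection?.effectiveType ?? "unknown";

// Attach the value to the rest of the performance payload before sending.
navigator.sendBeacon(
  "/beacon/performance", // placeholder endpoint
  JSON.stringify({ effectiveType })
);
```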

Looking at Chrome Mobile only, for the sake of having a consistent definition of the effectiveType buckets, we get:

| effectiveType | J2    | J7 Prime |
| ------------- | ----- | -------- |
| slow-2g       | 0.5%  | 1.1%     |
| 2g            | 0.8%  | 0.7%     |
| 3g            | 27%   | 28%      |
| 4g            | 71.5% | 70.2%    |

These breakdowns are extremely similar, which strongly suggests that users of these two phone models in India actually experience the same internet connectivity quality. This is very interesting, because it gives us the ability to compare the performance of these two devices from different hardware generations, in the real world, with overall connectivity quality that looks almost identical, and similar latency, since they’re connecting to our data centers from the same country.

What does firstPaint look like for these users, then?

| Device     | Sample size | Median (ms) | p90 (ms) | p95 (ms) | p99 (ms) |
| ---------- | ----------- | ----------- | -------- | -------- | -------- |
| J2         | 1226        | 1842        | 4769     | 7704     | 15957    |
| J7 Prime   | 1798        | 1082        | 2811     | 5076     | 12136    |
| difference |             | -41.3%      | -41.1%   | -34.2%   | -24%     |
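For context, firstPaint can be read in the browser through the Paint Timing API; the snippet below is a minimal sketch of that, not necessarily the exact collection code behind these numbers:

```typescript
// Read the "first-paint" entry from the Paint Timing API, in milliseconds
// since navigation start. Undefined if the browser doesn't expose it.
const paintEntries = performance.getEntriesByType(
  "paint"
) as PerformancePaintTiming[];

const firstPaintEntry = paintEntries.find(
  (entry) => entry.name === "first-paint"
);

const firstPaint = firstPaintEntry
  ? Math.round(firstPaintEntry.startTime)
  : undefined;

console.log("firstPaint (ms):", firstPaint);
```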

And what about loadEventEnd?

| Device     | Sample size | Median (ms) | p90 (ms) | p95 (ms) | p99 (ms) |
| ---------- | ----------- | ----------- | -------- | -------- | -------- |
| J2         | 1226        | 3078        | 9813     | 14072    | 29240    |
| J7 Prime   | 1798        | 1821        | 5635     | 9847     | 28949    |
| difference |             | -40.9%      | -42.6%   | -30.1%   | -1.1%    |
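Similarly, loadEventEnd comes from the Navigation Timing API, measured relative to the start of the navigation. A minimal sketch, again illustrative rather than our exact collection code:

```typescript
// Read loadEventEnd from the Navigation Timing Level 2 entry.
window.addEventListener("load", () => {
  // Wait one tick so loadEventEnd has been populated.
  setTimeout(() => {
    const [nav] = performance.getEntriesByType(
      "navigation"
    ) as PerformanceNavigationTiming[];
    if (nav) {
      // The navigation entry's startTime is 0, so loadEventEnd is already
      // the elapsed time in milliseconds since navigation start.
      console.log("loadEventEnd (ms):", Math.round(nav.loadEventEnd));
    }
  }, 0);
});
```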

Across the board, the difference is huge, even for metrics like loadEventEnd, where one might expect download speed to be an equalizer, particularly since we serve some heavy pages when articles are long. OS version might play a part in addition to hardware, but in practice we see that older Android devices tend to stick to the OS version they shipped with, which means those two factors are tied together. For example, worldwide over the past week, 100% of J2 phones were running the Android version they shipped with (5.1).

These results show that device generation has a huge impact on the real performance experienced by users. Across the globe, users are upgrading their devices over time. This means that the performance metrics we measure directly on sampled users with RUM should improve over time, simply by virtue of people getting more powerful devices on average. This is an important factor to keep in mind when measuring the effect of our own performance optimizations. And when the medians of our RUM metrics stay stable over a long period of time, it might be that our performance is actually worsening, and that the degradation is being masked by device and network improvements across the board.

Given the eye-opening results of this small study, getting a better grasp of the pace at which the environment improves (device generations, networks) looks necessary if we want to understand and validate our own impact on the evolution of RUM metrics.

About this post

This post was originally published on the Wikimedia Performance Team Phame blog.

Featured image credit: Cellphone (Unsplash), Rodion Kutsaev, CC0 1.0