In the modern digital world, systems and services have become an intertwined and sensitive network where small changes or hiccups in one corner can have dramatic effects on the system as a whole. Critical parts of the system need to be monitored for inconsistencies, and performance and capacity must evolve constantly to support continuous growth.
Mobile cellular networks are a critical component of modern society. Mobile operators proactively monitor the performance of their precious investment with various tools and services and react when things break. As mobile networks are built with simulations and planning tools but the end users are real, there is often a wide gap between the plans and reality. Leading companies have invested in crowdsourcing to gather data from their real users, to support their processes and enhance their networks.
In the context of mobile network measurements, and in particular end-point measurements, we see two main types of measurements used in crowdsourcing: active and passive. Active measurements typically send a synthetic payload to load the network and measure the top speed of that transfer, or send latency probes to measure the round-trip delay. In passive measurements, the end point simply monitors the wireless interface and takes speed samples from the ongoing traffic flow.
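As a concrete illustration of the active side, the sketch below times a plain TCP handshake as a crude latency probe. The host name is a placeholder; real crowdsourcing clients use dedicated measurement servers and purpose-built probe protocols.

```python
import socket
import time

def tcp_connect_rtt(host: str, port: int = 443, timeout: float = 2.0) -> float:
    """Time a TCP handshake in milliseconds, a rough round-trip proxy."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # we only wanted the handshake time; close immediately
    return (time.perf_counter() - start) * 1000.0

print(f"RTT ~ {tcp_connect_rtt('probe.example.com'):.1f} ms")  # placeholder host
```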
Active measurements can be triggered in a multitude of ways, e.g.,
• By a consumer wishing to test the current top speed of their network,
• Remotely by a service provider to debug a potential network speed issue, or
• Periodically to gather intelligence on the network top speeds.
Regardless of the trigger, active speed measurements consume data, some more, some less. The point is to see what maximum speed the network could allow at that specific point in time. The payload can be a synthetic data stream, the download of a certain set of web pages, or, e.g., streaming video at varying bit rates. The type of payload affects the end result directly: if the payload cannot saturate the link, we never learn the top speed of the network at all. In some tests, it might be enough to simply check whether the payload arrives within a certain delay bound, e.g., that a web page loads all its components in under 10 seconds.
Synthetic speed tests face many challenges if one wants reliable results. The first is deciding how to saturate the link: with one data stream or with multiple streams at the same time? Note that most consumer apps that do high-speed transfers (buffering a video, updating an app, downloading fresh content, etc.) use only one data stream.
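To make the single-stream versus multi-stream question concrete, the sketch below downloads the same test file with one and then with four parallel streams and compares the aggregate throughput; the URL is a placeholder, not a real measurement server.

```python
import concurrent.futures
import time
import urllib.request

TEST_URL = "https://speedtest.example.com/100MB.bin"  # placeholder URL

def download(url: str) -> int:
    """Fetch url and return the number of bytes received."""
    total = 0
    with urllib.request.urlopen(url) as resp:
        while chunk := resp.read(64 * 1024):
            total += len(chunk)
    return total

def measure(streams: int) -> float:
    """Run `streams` parallel downloads, return aggregate throughput in Mbps."""
    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=streams) as pool:
        total = sum(pool.map(download, [TEST_URL] * streams))
    return total * 8 / (time.perf_counter() - start) / 1e6

print(f"1 stream : {measure(1):.1f} Mbps")
print(f"4 streams: {measure(4):.1f} Mbps")
```

On high-bandwidth, high-RTT paths the multi-stream figure is often noticeably higher, which is exactly why this design choice matters.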
The next question is: how do you know that the link was saturated and that your results actually reflect the network's maximum speed? Speed test apps typically download and upload data for 10 seconds and assume this is enough to measure the link's top performance. The downside is that the amount of data transferred in a 10-second test can be huge: over a 4G network, it could be anything from about 1 MB (a user at the cell edge) to over 600 MB (a user on an LTE-A network). This has a large impact on device power consumption and can eat a sizeable fraction of the data plan, if one is being used. Naturally, running these tests periodically in the background makes no sense, as you never know beforehand how much data will be consumed.
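A back-of-the-envelope calculation shows why: with a fixed 10-second duration, the data cost of a saturating test scales linearly with the link rate.

```python
# Data transferred by a saturating 10-second test at various link rates.
for mbps in (1, 20, 100, 500):
    mb = mbps / 8 * 10  # megabytes moved in 10 seconds
    print(f"{mbps:>4} Mbps link -> ~{mb:,.0f} MB per test")
# 1 Mbps (cell edge) -> ~1 MB;  500 Mbps (LTE-A class) -> ~625 MB
```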
To counter the issue of potentially huge data consumption, many vendors instead download, e.g., 5 MB or 10 MB files periodically in the background. The problem is that with such small data volumes, the recorded peak speed may say nothing about the network's capability. On a slow network the file should saturate the link, but on a fast network it will not – and how would you know that from the speed metric alone?
Moreover, the potential top speed of a mobile network can be anything from kilobits per second to a gigabit per second – a difference of several orders of magnitude. How do you dimension your measurement servers and their network connections so that they do not become the bottleneck themselves? If you receive too much measurement traffic, you end up being limited by your measurement system, not by the network service being measured.
Active methodologies are typically used in drive-by testing, where data and battery consumption are largely irrelevant. In consumer-oriented crowdsourcing, active methodologies suffer from potentially huge data and energy consumption, which limits how often they can be run. The result is a very limited amount of measurement data, making the data less valuable. At best, active measurements can support operator comparison benchmarks on a national level (finer granularity is typically not possible with so little data) or reactive debugging following customer complaints. They cannot be used to proactively monitor network quality at a finer level and improve it before customer complaints emerge.
Furthermore, an active measurement taken now at a given location does not tell what the performance was an hour ago or yesterday, or what it will be tomorrow. There is potentially some relationship, but it is not a reliable one; you would need a lot of active measurements at the same location to increase your confidence.
Passive methodologies can provide a considerable amount of measurement data, as the purpose is to monitor ongoing consumer data transfers. The downside is that the monitored speeds carry no information about why a certain peak speed was recorded: was it the network maximum, the server maximum, or something else? So having a lot of speed records does not help much in the end.
Latency measurements are also actively used to learn how quickly the network can forward IP packets. Yet, as latency measurements are typically run outside actual data transfers, they record a sort of ideal network performance and say nothing about the real latency experienced by consumers' data flows when those flows saturate the network.
Thus, in the mobile network context, the current purely active or purely passive measurement methodologies have inherent downsides that limit their wider use.
Active and passive measurements have been the norm, and mobile operators are familiar with them. But are they the holy grail of end-point-driven, crowdsourced network analytics, or could it be done differently?
Hybrid measurements
The IETF RFC 7799 states:
“Hybrid Methods are Methods of Measurement that use a combination of Active Methods and Passive Methods, to assess Active Metrics, Passive Metrics, or new metrics derived from the a priori knowledge and observations of the stream of interest.”
Thus, in essence, we can combine active and passive methodologies to form something new. The Netradar hybrid is such an invention.
The Netradar hybrid measurements combine passive and active techniques in a novel way. We use passive network monitoring to calculate momentary bit rates, and we augment these speed samples with an optimized stream of latency measurement probes. We do not create a synthetic payload; we only measure latency.
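As an illustration of the passive half, the minimal Linux sketch below derives a momentary downlink bit rate from the kernel's interface byte counters, generating no traffic of its own. The interface name is an assumption; a real client would also sample the uplink counter and attach context to each sample.

```python
# Minimal sketch of passive speed sampling on Linux: read the interface
# byte counters twice and derive the momentary bit rate. No traffic is
# generated; we only observe what the user's apps are already transferring.
import time

IFACE = "wlan0"  # assumed interface name

def rx_bytes(iface: str) -> int:
    with open(f"/sys/class/net/{iface}/statistics/rx_bytes") as f:
        return int(f.read())

def momentary_bitrate(iface: str, interval: float = 1.0) -> float:
    """Downlink bit rate in Mbps over one sampling interval."""
    before = rx_bytes(iface)
    time.sleep(interval)
    return (rx_bytes(iface) - before) * 8 / interval / 1e6

print(f"{momentary_bitrate(IFACE):.2f} Mbps")
```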
Whereas existing crowd-sourced solutions measure speed and latency separately, we merge the two and run them at the same time. If the network is working well, the measured latency is stable or shows only small jitter (horizontal and vertical handovers do create high latencies and jitter, but those are easy to spot and filter out). But once the network starts to receive too much load (downlink or uplink), it has to start buffering IP packets. When this happens, the latency grows and the jitter increases. Netradar's proprietary, patent-pending algorithms detect this behavior and flag the passive measurement as having abnormal latency.
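The actual Netradar algorithms are proprietary, so the toy detector below only illustrates the underlying signal: estimate a latency baseline from the quietest samples in a window and flag the window when latency or jitter rises well above it, the classic signature of a filling buffer. All thresholds here are invented for illustration.

```python
from statistics import median

def abnormal_latency(rtts_ms: list[float],
                     factor: float = 2.0,
                     jitter_limit_ms: float = 30.0) -> bool:
    """Flag a window of RTT samples as showing congestion-like behaviour."""
    # Baseline: median of the quietest quartile of samples in the window.
    baseline = median(sorted(rtts_ms)[: max(1, len(rtts_ms) // 4)])
    # Jitter: median absolute difference between consecutive samples.
    jitter = median(abs(a - b) for a, b in zip(rtts_ms, rtts_ms[1:]))
    return median(rtts_ms) > factor * baseline or jitter > jitter_limit_ms

# RTTs climb as the link saturates and packets queue up -> flagged as abnormal.
print(abnormal_latency([22, 25, 24, 90, 140, 180, 160, 30]))  # True
```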
This methodology has the benefit that we acquire a huge amount of speed and contextual data on consumers' real network speeds, and we know when the network had trouble providing reliable and stable connectivity. Yet we do not inject a large synthetic measurement payload, only small ping packets. Our overhead on the end-user device is roughly 0.1% of additional data consumption per month.
To illustrate the power of hybrid measurements, let's look at two maps. The data is from a single user, to show how much data we can gather.
First, we have the speeds recorded in a suburb of Helsinki during February 2021. Note how many of the speeds are in the red, i.e., less than 5 Mbps. This is the data you would get from passive measurements alone. Based on it, you would conclude that this particular Elisa 4G network is pretty bad. Not quite so.
Figure 1: All speed samples from passive data collection, without any notion of why speeds are often under 5 Mbps (red is a peak speed of 5 Mbps or lower).
When we filter out all the speeds where the network latencies were stable (by our definition and algorithms, considering a 4G cellular network), we are left with the locations that had irregular latencies while consumers had data flowing to their favourite apps.
Figure 2: Data speeds that were limited by the mobile network. Some are green, indicating a high speed and little (if any) impact on the user; some are much lower, clearly impacting the user.
If I were in the network quality department of this Finnish mobile operator, I would concentrate on the red dots in this latter map, look at the contextual data related to all the dots, and check what my network-internal analytics have to say. Are these single incidents, such as a user at the cell edge? Is there a load balancing or coverage issue where traffic is not balanced adequately, a misaligned antenna pointing in a bad direction, a hardware fault, or something else?
To further illustrate the power of our technology, consider the notion of latency. Network latency measurements are traditionally run outside data transfers; Netradar runs them inside the data transfers. When analyzing the latency probes and samples, we can calculate the minimum, average and maximum latency. And since we know whether the jitter and latency behaved abnormally, we can also calculate the latency in a loaded, congested network, that is, the latencies consumers and their apps experience in real life when the network runs out of capacity.
The figure below shows an example with real data: the latency in a given area in February 2021. We have calculated altogether nine different latency metrics. The typical metric is the average latency (1). Yet, as the load in the network also affects latency, we can distinguish between the latency in an uncongested (lightly loaded) network (2) and in a congested (highly loaded) network (3). These tell how the network really handles end-user traffic: what the latency is when there is no shortage of capacity, and what it is when capacity has run out and end-user traffic is constrained. For each of these cases, we can also calculate the minimum and maximum latencies, adding a further 3+3 metrics (best and worst cases).
Figure 3
These latencies are much more valuable input to network planning and performance management. They describe the real latencies occurring in the network, not the optimal case that looks good in reports to management.
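For concreteness, here is a minimal sketch of how such a nine-metric breakdown can be computed once each latency sample has been classified as congested or uncongested; the sample values and the classification flag are invented for illustration.

```python
from statistics import mean

# (rtt_ms, taken while the network was congested?) -- invented sample data
samples = [(28, False), (31, False), (25, False),
           (95, True), (140, True), (110, True)]

def stats(rtts):
    return {"min": min(rtts), "avg": round(mean(rtts), 1), "max": max(rtts)}

groups = {
    "overall": [r for r, _ in samples],
    "uncongested": [r for r, congested in samples if not congested],
    "congested": [r for r, congested in samples if congested],
}
for label, rtts in groups.items():  # 3 groups x 3 stats = 9 latency metrics
    print(label, stats(rtts))
```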
To summarize, the power of the Netradar hybrid technology lies in the ability to analyze latency and latency variations during data transfers and draw conclusions about the performance of the mobile or Wi-Fi network. The benefits over existing legacy methodologies are huge:
1. Extremely small data and energy consumption, as we do not inject synthetic payloads; we only monitor the consumer's data connection and measure latency.
2. A detailed view of consumers' daily mobile experience, as our methodology is not based on random background speed tests but on continuous monitoring of the quality of the mobile network.
3. Data for proactively looking for network misconfigurations and problems before complaints occur.
4. No need for speed test servers, so deployment is simple: we only need some very small ping servers and a place to push the collected metrics (your own database, Google Cloud, Amazon AWS and Microsoft Azure are all supported already).
5. As the data collection is so extensive, even a small deployment, say 5% of your customer base, will generate a huge amount of data.