Use Cases from Our Customers

Netradar is an extremely versatile platform, offering data and insights that solve numerous problems and enable proactive network enhancements. In this post I walk through some of the use cases our customers are solving with the help of the Netradar mobile analytics solution.

The cases discussed in this post include:

• Replacing legacy drive and walk testing
• Indoor coverage
• Overall network coverage and capacity
• Cell performance and faults
• Latencies

Replacing Drive Testing and Legacy Crowdsourcing

Conducting drive and walk testing has been the common way to measure mobile networks since the 1990s. These tests still have their place, but more and more mobile operators are looking for solutions that offer continuous monitoring of their service and wider coverage in terms of both geography and time – after all, a drive or walk test only analyses a limited path at a specific time.

Many mobile operators have also been somewhat disappointed with legacy crowd-sourced data because it does not help in developing the radio access network (RAN). The RAN is the most expensive part of the network, and making changes to it requires a statistically valid amount of data and detailed KPIs. Legacy solutions based on speed tests can never offer enough data to fulfil these criteria. If one tried to solve the lack of data by running speed tests with synthetic loads very actively, the problem becomes network load and energy consumption. A single speed test can push anything from a few MB to a GB of data through the mobile network, and if many tests run per day, the combined load is quickly measured in terabytes or even petabytes.
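
A quick back-of-envelope calculation illustrates how fast synthetic test traffic adds up across a subscriber base. All figures below are illustrative assumptions, not Netradar measurements:

```python
# Estimate the aggregate network load created by active speed testing.
# Device count, test frequency and per-test volume are invented numbers.

def daily_test_load_tb(devices: int, tests_per_day: int, mb_per_test: float) -> float:
    """Total synthetic traffic injected per day, in terabytes."""
    return devices * tests_per_day * mb_per_test / 1_000_000  # MB -> TB

# One million devices, four tests a day, ~100 MB of synthetic data per test:
load = daily_test_load_tb(1_000_000, 4, 100)
print(f"{load:.0f} TB of artificial traffic per day")  # prints "400 TB ..."
```

Even these modest assumptions already yield hundreds of terabytes of artificial traffic per day, which is exactly the load-and-energy problem described above.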

Netradar’s unique hybrid measurement technology only consumes around 3 MB per device per month. With this little data, roughly the equivalent of loading a single web page, we provide continuous 24/7 analysis of network quality. And since Netradar does not use speed test servers, deploying the solution is extremely lightweight and easy. All the system components can be virtualized and run on an existing cloud platform, offering easy scalability and cost-efficiency.

In summary, Netradar has very extensive features and our customers use our system in numerous ways. The beauty is that a single solution lets our customers solve a number of daily issues and get data for enhancing their network. When Netradar is deployed with a customer, it also collects data from other mobile providers in the same market and automatically generates the same analysis regardless of the network provider. This makes it possible to compare one’s mobile service to others and stay ahead of the competition.

As a final note, a small secret: we are working on real-time delivery of network analytics data. Already now our customers can configure how quickly data appears in their database, e.g. a few times per day or once per hour. In the near future, our customers will get data on critical network issues in a matter of minutes or even milliseconds, providing a truly real-time network monitoring solution alongside all the features described above. But more on that in a future blog post.

Indoor Coverage

Wireless signals propagate well in open space but when there are various kinds of obstructions on their path, the result can be anything between a perfect connection and fully missing service. Cellular networks are initially deployed using outdoor towers and over time problematic indoor locations receive their own base stations.

Indoor coverage is typically measured by going directly into a suspected building and manually measuring radio parameters and speeds. This is naturally rather unsophisticated, as it requires a lot of manual work. Legacy crowd-sourced data does include a location (latitude, longitude) with some accuracy, but fails to identify whether a measurement was taken indoors or outdoors.

Netradar includes a novel methodology that tells, in addition to the GPS coordinates, whether the measured KPIs were collected indoors or outdoors. Moreover, we can tell how deep indoors a measurement was taken, or whether the user was, for example, technically indoors but next to a window with good visibility to the outside world. This helps tremendously in automating indoor coverage measurements without going on site.
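
Netradar’s exact methodology is proprietary, but a deliberately naive heuristic illustrates the kind of signals an indoor/outdoor classifier could draw on; every feature and threshold below is a made-up assumption, not the actual algorithm:

```python
# Toy indoor/outdoor heuristic based on GNSS quality. Real classifiers
# would combine many more features; these thresholds are invented.

def classify_environment(gps_accuracy_m: float, satellites_seen: int) -> str:
    """Rough guess at the measurement environment from GNSS quality."""
    if gps_accuracy_m < 10 and satellites_seen >= 8:
        return "outdoor"
    if gps_accuracy_m < 30 and satellites_seen >= 4:
        return "indoor-near-window"   # degraded but usable GNSS
    return "deep-indoor"              # GNSS mostly blocked

print(classify_environment(5, 12))   # -> outdoor
print(classify_environment(60, 1))   # -> deep-indoor
```

The point of the sketch is only that device-side context beyond raw coordinates can separate indoor from outdoor samples without anyone visiting the site.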

Overall Network Coverage and Capacity

Cellular networks are designed using planning tools. These take the geography and buildings as input and build models of how a base station at a given location and height would perform. The resulting plan is a good approximation, but only a real deployment tells how mobile subscribers experience the service, indoors and outdoors.

Netradar’s unique technology gathers a huge amount of data even with a small deployment across a mobile provider’s customer base, e.g. 10% of users running the Netradar solution. The data builds up very quickly: in a matter of days one can see the true coverage and capacity of the network, and over the following weeks the picture becomes even clearer and more exact. Network evolution is analyzed as the service changes and new base stations and radio technologies (like 5G) are deployed. We show how subscribers receive 2G, 3G, 4G and 5G NSA/SA signals and what speeds they get at specific locations. We also analyze load balancing and show how distinct locations are served by a mix of cells and frequencies. An interesting metric is also the availability of 4G: it is still the dominant radio technology but is not available everywhere, so knowing where there is a lot of data usage but people only get 3G helps in enhancing the user experience.
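
As a sketch of how crowd measurements turn into a per-location coverage picture, the snippet below bins invented sample points into a coarse grid and counts the radio technologies observed in each cell; the grid size and data are assumptions for illustration only:

```python
# Bin crowd-sourced samples into a ~1 km grid and tally radio
# technologies per grid cell. Coordinates and techs are invented.

from collections import defaultdict

def grid_key(lat: float, lon: float, step: float = 0.01) -> tuple:
    """Snap a coordinate to a coarse grid cell."""
    return (round(lat / step) * step, round(lon / step) * step)

samples = [
    (60.170, 24.941, "5G"), (60.171, 24.942, "4G"),
    (60.170, 24.940, "4G"), (60.250, 24.850, "3G"),
]

coverage = defaultdict(lambda: defaultdict(int))
for lat, lon, tech in samples:
    coverage[grid_key(lat, lon)][tech] += 1

for cell, techs in coverage.items():
    print(cell, dict(techs))
```

With enough samples per cell, the same aggregation reveals where subscribers only see 3G despite heavy data demand.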

A critical but hidden network problem is a total lack of coverage. If people try to use data and there is no signal whatsoever, the mobile operator will not know. Customers might complain but cannot tell where they were at the time. These black spots do not show up in any network statistics either.

Netradar has developed a unique technology to catch the moments and locations where a user tried to use data services but could not get any connectivity from their provider. This analysis pinpoints locations where no service was available and is therefore extremely valuable for today’s mobile users.

Cell performance and faults

All modern cellular technology vendors, such as Ericsson and Nokia, offer a huge amount of data about a base station site and its performance. Our customers often tell us that these metrics have two major problems. First, the metrics give averages of a cell’s performance, load and radio values. These averages are often rather homogeneous and do not reveal potential issues. This leads to the second issue: cell-internal problems. A cell (sector) can serve very complex locations, with planned coverage ranging from a hundred meters to several kilometers. Overall KPIs cannot reveal how distinct locations within that sector receive the signal and bit rate.

Netradar shows the real effective coverage and capacity of a cell, and any issues in service, even at 1 m accuracy. This makes it possible to analyze cells and tie their true performance to geographic locations. Our cell performance analysis reveals issues in, e.g., cell interference, tower height, and the direction and tilt of antennas.

Netradar cannot know a cellular provider’s initial plan, as it comes from a planning tool. Thus, our analytics needs to be combined with the planning tool’s data to compare the plan with the real result in the field. This comparison can even be automated if the planning tool can export the direction of each cell sector and its planned range. We can also automate finding locations with cross-feed issues, minimizing the need to visit sites to check whether cables are properly connected.
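
One way such cross-feed detection could be automated is to compare a sector’s planned azimuth with the mean compass bearing from the site to the measurements it actually serves. This is a hedged sketch with invented azimuths and a made-up threshold, not Netradar’s implementation:

```python
# Flag a possible cross-feed (swapped sector cables) when the observed
# direction of a sector's traffic disagrees strongly with its plan.

import math

def mean_bearing(bearings_deg: list) -> float:
    """Circular mean of compass bearings, in degrees [0, 360)."""
    x = sum(math.cos(math.radians(b)) for b in bearings_deg)
    y = sum(math.sin(math.radians(b)) for b in bearings_deg)
    return math.degrees(math.atan2(y, x)) % 360

def angular_error(planned_deg: float, observed_deg: float) -> float:
    """Smallest angle between planned and observed azimuth."""
    diff = abs(planned_deg - observed_deg) % 360
    return min(diff, 360 - diff)

# A sector planned to point north (0 deg) whose users cluster around 120 deg:
observed = mean_bearing([115, 120, 125, 118])
if angular_error(0, observed) > 60:       # threshold is an assumption
    print("possible cross-feed: check cabling on site")
```

The circular mean matters here: a naive average of bearings like 350° and 10° would wrongly point south instead of north.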

Moreover, when radio engineers make changes at cell sites, surely trying their best to optimize the service and make proactive enhancements, the end results are never known in advance. With Netradar, our customers can analyze performance before and after and see the effect of a change at a cell site.

Latencies

With new 5G networks, latency has become a critical topic. The industry has advocated 1 ms latency, but this should be seen as the potential of a service running inside the base station itself. All network links between the user and the service increase this latency, as does the cloud platform. If subscribers are not getting the performance they expect and the user experience suffers, where is the problem? What is the latency of the radio network itself, the core network, and the peering networks where the data might reside?

We see increasing awareness among our customers of the meaning and impact of real latency. Our analysis shows that a loaded network can create hundreds of milliseconds of latency in 4G, and we have seen latencies of up to 1-2 seconds in 5G. We also hear that the base station hardware vendor and the configuration used have a significant impact on latency. As our customers know their network, the vendors, and the exact hardware and configuration of each base station, we can help them optimize each base station to provide the best possible latency.

Netradar AI 2.0

Netradar enables our customers to collect huge amounts of detailed information about their networks and how their own end users experience the wireless service. This has proven to be extremely valuable for our customers.

Some of our customers use the raw data, performing their own analysis on it and potentially merging it with other data sources. Others use our visual dashboard to see various performance metrics and maps of their network, comparing it to other providers. And some do both.

The fundamental question with a huge amount of detailed data is what information it actually gives us. Data is useless unless it reveals critical insights and helps in the daily operations of a company.

For a mobile network provider, the basic question is: how is my service performing in the eyes of the subscribers? Are they getting (even close to) what they pay for? Locations with great performance are important to know, but even more important are the places and situations where subscribers have quality issues. The Netradar data crunching engine has now been upgraded with numerous AI-powered features and clearly points out locations with suboptimal performance, based on different metrics.

I’ll go over some of the new features that will be deployed during Q3/2022. Our capabilities will increase again later this year as we continuously develop cool new ideas.

Missing 4G and 5G

Today 4G/LTE is the dominant mobile technology and will be for many years to come. Yet 5G is being deployed and will eventually take over. Coverage is planned with simulation tools that try to account for signal propagation, geography, and buildings. The fundamental questions are: are mobile subscribers really being served by 4G/5G, or are they dropping to 3G or even 2G? How many people are affected, and how badly? How well do the planning tools map to the real world? The Netradar Coverage AI shows where problems occur and how many people are affected, and helps in configuring the network to fill the gaps. Some of these gaps are intentional, a side effect of incrementally deploying a new technology, while others are unintentional and not visible from base station metrics alone.

Low signal quality

While a 4G signal, for example, can be available in an area, the question is how good or how strong it is. Having a signal available is a prerequisite for offering a wireless service, but the signal characteristics then dictate the best possible performance achievable in each location. The Netradar Signal AI highlights these places and prioritizes the locations based on the number of affected subscribers. The dashboard distinguishes between RSRP, RSRQ and RSSNR, enabling the study of signal power, quality and interference.
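
As a rough illustration of how signal power maps to service quality, commonly cited rule-of-thumb LTE RSRP bands can be encoded as follows; the thresholds are general industry assumptions, not Netradar’s actual classification:

```python
# Rule-of-thumb LTE RSRP (dBm) quality bands. Exact thresholds vary by
# operator and are assumptions here, used only to illustrate the idea.

def rsrp_band(rsrp_dbm: float) -> str:
    if rsrp_dbm >= -80:
        return "excellent"
    if rsrp_dbm >= -90:
        return "good"
    if rsrp_dbm >= -100:
        return "fair"
    return "poor"   # cell edge, performance problems likely

print(rsrp_band(-75))    # -> excellent
print(rsrp_band(-105))   # -> poor
```

RSRP alone tells only signal power; RSRQ and RSSNR would be banded in the same style to capture quality and interference.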

Low Top Download Speed

Top download speed seems to be the most familiar metric to many wireless professionals. It does have its place, among other KPIs. The Netradar Speed AI can analyze both top speeds and capacities. We can show specific locations where top speeds remain low even on a 4G or 5G network and therefore negatively impact subscribers and their apps.

Low Capacity

While top speed is a familiar concept, in the end it does not tell the real capacity available to subscribers. A location might at times deliver a very high top speed, but over the course of a day or week it might have serious capacity issues that leave subscribers with very low speeds. This in turn affects the experience subscribers get from their wireless provider. The Netradar Capacity AI uses our patented hybrid measurement technology to analyze the speeds subscribers are getting and flags when speeds drop below acceptable levels. We can show the locations with the highest impact on subscribers and thereby the best potential to increase user satisfaction.

High latency

Current 4G networks can already offer low latencies, but in 5G latency is increasingly important. Mobile networks are built gradually, with older and newer hardware, different types of connectivity from the base stations to fixed networks, varying signal propagation environments, and even capacity constraints. All of these affect the real latency experienced by end users. The Netradar Latency AI pinpoints these locations, sorting them by impact on subscribers. With different thresholds, we can paint a very detailed picture of the latency subscribers experience in the best and worst cases. This helps tremendously when developing new kinds of services that require low latency.

Indoor vs Outdoor

A further unique feature of the Netradar solution is our ability to distinguish between outdoor and indoor usage. We can tell whether an issue occurs outdoors or is tied to indoor coverage. For example, a shopping mall or office building might have great coverage and performance outdoors, but when people enter the building their service drops to an unacceptable level. Indoor base stations might cover the building, but mobile devices might simply hang on to the macro base station and suffer low service quality. All the individual AI features above are implemented for both indoor and outdoor use cases. Thus, we can show, e.g., bad capacity in outdoor areas or missing 4G in some buildings.

Final words

Our AI focuses on the different KPIs and on how many people are affected by poor wireless network performance. This allows focusing the work first on the locations with the highest impact on subscribers. Yet some places might not have a great number of subscribers but are otherwise important for non-technical reasons. These locations can easily be analyzed with our performance and coverage tools.

Further down the road we have interesting features coming up, e.g. fully missing coverage of any radio technology (a.k.a. black holes) and low uplink speeds and capacity. We also have capabilities to identify explicit interference with satellite signals, which results in bad location accuracy or even fully missing location-based services. Stay tuned!

The Sustainable Way to Measure Network Quality

Global warming is being taken extremely seriously around the world, and various countries and industry sectors are working to lower their impact on the environment. The ICT sector constantly brings new services and solutions that help other sectors realize their goals in resource efficiency and sustainability.

Yet we must also look at how the ICT sector itself, through these various services, uses resources and affects our goals towards a sustainable society. ICT-based services need hardware to process and deliver the data, and software to create the services. The hardware uses electricity, manufacturing equipment consumes materials and energy, and shipping hardware around the world is not free either.

In this article, we discuss the sustainability of the Netradar solution, and why there is no better solution in terms of sustainability.

No Hardware

Netradar is a purely software-based solution consisting of three logical components. First, we need an agent to measure network quality. We currently focus on Android phones and offer our customers an SDK or a separate app. We can support other end points when the need arises.

Secondly, our methodology is based on a deep understanding of latency and its behavior, and we therefore need latency measurement servers. Our servers are virtual and highly optimized. They can and should be deployed in virtual environments, making use of existing data center platforms. There is no real benefit in deploying our latency measurements on bare-metal servers.

Thirdly, as in any data collection and processing system, there is a need for a backend database. Netradar can be deployed in, e.g., Google Cloud, Amazon AWS, or a private cloud. We can even push our data into an existing data lake, from where it can feed current analytics processes. Thus there is no need to install expensive new hardware to take full advantage of Netradar analytics.

Many legacy solutions use various speed tests to measure network performance. The challenge with all speed tests is that the test servers must be built so that they never become a bottleneck in the performance results. Any solution must make sure that the speed test always reflects the maximum performance of the network, with no impact from the servers themselves. In a world of high-speed networks, like fiber and 5G, setting up this speed test infrastructure is very costly and difficult. The fundamental challenge is that you never know beforehand how fast the network is, and therefore need to over-allocate capacity so that the performance result is trustworthy. Think of 5G, for instance: a user could get a top speed of anything between 10 Mbps and 1 Gbps, a 100-fold difference.

This inherent problem of speed measurements forces the infrastructure to be built with excess capacity just in case, and typically bare-metal servers are needed to make the performance results at least somewhat trustworthy. Netradar does not have this serious shortcoming in terms of resource efficiency and the cost of running the infrastructure.

Minimal Energy Usage

An important business decision when investing in any new product or service is the CAPEX vs. OPEX analysis. A system might be reasonably priced to buy, but operating it adds to the lifetime cost. In ICT, OPEX is mostly about energy, and with rising energy prices, the OPEX part of the calculation grows ever larger.

Netradar is a highly optimized and sustainable solution. Let’s look at the three components of the system from the energy usage perspective:

End point measurement agent

Netradar collects data primarily from customers’ Android phones. iOS could technically be supported, but since its APIs are very limited, an Apple device would not help much in network performance analysis (you can contact me for further details on why this is so). As we integrate with various apps and run on the customer’s device (everything anonymized; GDPR has been our guideline for years), we simply must be extremely careful about how much we affect the consumer’s device and its energy consumption. Netradar analyses the customer’s own data traffic, and is active only when the device is in use, i.e. its display is on and the radio is transmitting data. The added energy consumption of Netradar is then extremely small, well below 1%. When the consumer is not using the device, Netradar sleeps and does not consume energy.

Another important point is the amount of data transferred to measure network quality. Netradar uses small latency measurements to perform the analysis. Data is gathered on the end point and pushed to the backend from time to time. The combined data usage of the latency measurements and the data upload is on average just 3 MB/month, less than a single web page view. With speed tests, we can estimate that the data usage is roughly the measured bit rate expressed in bytes. For example, a test showing a 200 Mbps download and a 30 Mbps upload consumes roughly 200+30 MB of data, and a 1 Gbps download result in a 5G network uses about 1 GB. This data usage not only affects the customer’s device in terms of energy consumption; the mobile network also needs to transfer all that artificial test data. All this consumes a lot of energy, and naturally takes capacity from the other users of the network.
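
The comparison can be put into numbers with a quick back-of-envelope calculation; the test frequency and speeds below are illustrative assumptions, and the per-test volume follows the rule of thumb above (a test consumes roughly its measured bit rate in bytes):

```python
# Monthly data budget of regular speed tests versus a passive approach.
# Speeds and test frequency are invented example values.

def speedtest_month_mb(down_mbps: float, up_mbps: float, tests_per_day: int) -> float:
    per_test_mb = down_mbps + up_mbps   # rule of thumb from the text
    return per_test_mb * tests_per_day * 30

monthly = speedtest_month_mb(200, 30, 2)   # two tests a day
print(f"speed tests: ~{monthly:.0f} MB/month vs Netradar: ~3 MB/month")
```

Under these assumptions the active approach moves over 13 GB of synthetic data per device per month, four orders of magnitude more than the quoted 3 MB.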

Latency measurements

The data usage of the latency measurements is extremely small. We have calculated that an average user consumes about 1.5 MB/month of data to measure latency. This means the impact is very small on the end device, the wireless network, and the server infrastructure. Moreover, as our server software is a highly optimized, multi-threaded Linux binary written in C++, it runs very well even on a simple virtual server, and can naturally be run as a container. Thus, the resources needed to run our measurement infrastructure are minimal, almost non-existent.

Back-end data storage

Our system is flexible and can accommodate various deployment architectures and data storage environments. The simplest deployment uses the cloud services of Google or Amazon and benefits from their hard work on green data centers and distributed computing. We can also deploy the backend on a private cloud or even push the raw data to an existing data lake environment. The computing and storage requirements scale with the amount of data collected. Old data can be aggregated into trends and removed, while newer data can be used in deeper studies. Our customers are free to define data retention periods and thereby the backend resources needed.

In summary, the Netradar system is the world’s most sustainable mobile measurement system, offering 24/7 analysis of end users’ connectivity. The amount of data collected by Netradar is huge, while the whole system consumes very little energy. No devices need to be deployed anywhere, as the system is fully virtual and software-based. If sustainability is on your agenda but you still want to understand network quality down to the smallest detail, there is only one solution worth considering.

Another important impact of Netradar is what it can offer to an operator’s overall sustainability goals. Firstly, there is no more need to run drive tests around a country. Secondly, a wide and deep understanding of your network is critical in optimizing the hardware deployment. Every base station and network node consumes energy, and the less hardware there is, the lower the energy consumption. Yet the amount of hardware deployed must match the needs and expectations of the customers. With the Netradar solution, wireless operators can invest where it matters most.

Understanding Latency in Modern Mobile Networks

Network latency as a KPI has gained more attention in recent years with the introduction of 5G networks. It is commonly understood as the time it takes for an IP packet to reach its destination, measured in milliseconds (ms). The lower the latency, the faster data gets to the end point to create a given digital service. The bandwidth (bit rate, capacity) available in the network further determines the number of IP packets that can be transmitted at the same time.

With 4G, latency dropped significantly compared to good old 3G, the first real mobile data service. 5G non-standalone and 5G standalone networks promise to enhance latency, along with high bit rates, even further down to 1 ms. But what is the network latency consumers really experience today in commercial mobile networks? How should you measure and understand latency? Can you interpret latency metrics in a novel way and understand better the performance of your network?

Where does latency come from?

While the transfer bit rate reflects the lowest-performing link between the data source and the destination, e.g. the bottleneck speed between a server and a mobile user, latency builds up along the whole network path. The longer the path, the higher the latency typically is. Naturally, physical and link layer technologies matter: all nodes on the path add some latency, and the available network capacity ultimately dictates how quickly IP packets get through a router, switch, or cellular base station.

In modern cellular networks, latency is primarily a sum of three components:

  1. Distance to the vantage point,
  2. Performance of the end points themselves, and
  3. Capacity of the wireless link.

The vantage point affects the absolute latency. If we measure latency from the mobile device to the base station hardware, in favorable conditions we can get as low as 1 ms. Yet, the further from the mobile device we measure, the more the latency grows, due to the physical distance and all the nodes in the path that need to process the IP packets. Still, it is worth noting that the distance dictates the lower bound of the latency; it does not directly affect the fluctuation of the latency (often referred to as jitter) or the upper bound.

Core and access networks are primarily built with optical links, and they are extremely fast. The resulting latency comes in most cases from the physical distance to the vantage point. Still, if a part of the network becomes congested, latency increases and packet loss may follow.
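
The contribution of distance can be estimated from first principles: light in optical fiber travels at roughly two thirds of its vacuum speed, about 200 km per millisecond. A minimal sketch, with a rough great-circle distance as the assumption:

```python
# Best-case round-trip time dictated by physical distance over fiber.
# The real route is longer than the great-circle distance, which the
# optional route_factor can account for.

C_FIBER_KM_PER_MS = 200.0    # ~2/3 of the speed of light in vacuum

def rtt_floor_ms(distance_km: float, route_factor: float = 1.0) -> float:
    """Lower bound on round-trip time over fiber, in milliseconds."""
    return 2 * distance_km * route_factor / C_FIBER_KM_PER_MS

# Helsinki to Frankfurt is roughly 1530 km as the crow flies:
print(f"RTT floor: {rtt_floor_ms(1530):.1f} ms")   # ~15 ms
```

This floor is consistent with the roughly 14 ms distance offset between Finland and Germany discussed later in this article.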

The end points themselves can also affect the latency, depending on how they, e.g. the server and the mobile device, handle the data. In today’s crowd-sourced measurement, smartphones are rather powerful and should not add any significant latency through data processing. Servers, however, can become a bottleneck if overloaded, and can thereby introduce very significant latency into the measurements.

In a cellular network, provisioning the radio connectivity is the most complex and difficult part. As end users are mobile, they can be virtually anywhere within reach of the antenna signal, which forces the radio link to behave differently depending on the circumstances. There can also be any number of end users that need to be served with data; they move around, creating handovers, and their apps and data transfer needs differ. In many situations, modern cellular technologies handle this complexity and serve the end users well. But at times, the sheer load of these users and their apps can congest the radio and force the base station to buffer incoming data before it can be transmitted on the downlink. The uplink behaves similarly, except that the data buffering happens on the mobile devices before they transmit over the radio link.

As modern cellular networks are built to offer reliable transfer of IP packets, there has to be adequate buffer space to hold user data before it can be transmitted on the radio link. If capacity becomes an issue at a given base station, it will buffer users’ data and thereby increase the latency. This added latency can reach even one second in 5G networks, a thousandfold more than the advocated 1 ms.
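
The buffering effect can be illustrated with simple arithmetic: the queueing delay is the buffered data divided by the rate at which the radio link can drain it. The buffer size and link rate below are illustrative assumptions:

```python
# Queueing delay added by base-station buffering: queued data must drain
# over the radio link before newly arriving packets get through.

def queueing_delay_ms(buffered_mb: float, link_mbps: float) -> float:
    """Time to drain the buffer at the current link rate."""
    return buffered_mb * 8 / link_mbps * 1000   # MB -> Mbit, s -> ms

# 12.5 MB queued at a congested cell delivering 100 Mbps:
print(f"{queueing_delay_ms(12.5, 100):.0f} ms")   # 1000 ms, i.e. 1 s
```

A modest-looking buffer can thus turn a nominally fast link into a one-second latency experience, which is exactly the bufferbloat behavior described above.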

How to measure the latency that matters?

With 5G, the mobile community has been advocating 1 ms latency for services. What is typically left unsaid is that such low latency refers to the connection between the mobile device and a base station running some form of edge service; in essence, this is the radio link latency. With Wi-Fi, we get a similar latency from the radio link.

Many legacy network measurement platforms separate data transfers and latency measurements from each other: they measure latency on an idle radio link and then test data transfer speeds. This mode of operation produces an optimistic latency, a theoretical lower bound that a customer could experience only with no data transfer ongoing. People seldom use apps on their smartphones without any data transfer. Moreover, these platforms place the measurement vantage point physically as close to the consumer as possible, further lowering the measured latency.

With Netradar, our customers do not seek to simply measure this best-case latency. They want to understand the real latency their end users experience throughout their daily network usage. Netradar calculates over ten different latency-related metrics and can store individual latency samples; downlink and uplink are studied separately. All this is coupled with extensive contextual information from the radio network, enabling extremely accurate and detailed analysis of cellular network quality.

Moreover, as highlighted earlier, the capacity and congestion of base stations increase the latency. Netradar’s proprietary algorithms (a form of AI) use the momentary bit rate of the app traffic, coupled with latency and contextual information, to indicate network capacity issues. The system is highly optimized: for a full month of detailed latency and capacity analysis, we use merely 2-3 MB of data per user.

Real-world Examples

Netradar develops the core technology and AI to analyze the quality of cellular and Wi-Fi networks. Our customers understand the difference between buying legacy third-party crowd-sourced data and collecting detailed private data from one’s own network. When the real performance and development of the cellular network is critical, the data has to be reliable and extensive.

To support our technology development, we do our own data collection. We use a network of distributed measurement points around the world and a load balancer to measure latency to various vantage points.

Let’s take Finland and Germany as examples. I selected data measured to the same vantage point in Frankfurt and filtered it to only consider 4G or 5G connections, across all mobile operators. The analysis shows that:

  • The lowest latency in Finland was around 26ms while in Germany it was 12ms. The difference is natural because of the physical distance from Finland to Germany.
  • The average latency experienced by consumers was 89 ms in Finland and 81 ms in Germany. Considering the distance to the vantage point, Finns actually experience a slightly lower latency than German consumers.
  • The highest latencies in both countries go way beyond one second.
  • Looking at average latencies that are over 100 ms for Germans and over 114ms for Finns (100ms+14ms for the physical distance), Finns encounter them 15% of the time and Germans 17% of the time.
  • When comparing 4G and 5G NSA, we see surprising numbers. The lowest latencies are slightly higher for 5G than for 4G, while it should be the opposite. There is no significant difference in average or worst-case latencies. As the data is analyzed across all national operators, there are differences, and one low-performing provider will drag the national results down. For example, in the German data we see one provider with systematically higher latencies in both 4G and 5G compared to the competition.
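
A comparison of this kind can be reproduced from raw latency samples with a few lines of analysis; the sample list below is invented for illustration:

```python
# Summarize a set of latency samples: minimum, average, and the share
# of samples exceeding a threshold (e.g. 114 ms = 100 ms + distance).

def latency_stats(samples_ms: list, threshold_ms: float) -> dict:
    over = sum(1 for s in samples_ms if s > threshold_ms)
    return {
        "min": min(samples_ms),
        "avg": sum(samples_ms) / len(samples_ms),
        "share_over_threshold": over / len(samples_ms),
    }

samples = [26, 30, 45, 60, 89, 120, 300, 1200]   # invented samples
stats = latency_stats(samples, threshold_ms=114)
print(stats)
```

In production this runs over millions of samples per operator, with the distance-dependent threshold adjusted per country as in the comparison above.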

In summary, the mobile industry has not yet truly understood how to measure the real latency of a cellular network as experienced by end users and their smartphones. Moreover, Netradar uses its deep understanding of latency and its behavior to analyze network capacity shortages. Hopefully this article triggers some new thinking on how cellular network performance should be measured and understood. You can always email me at jukka.manner@netradar.com if you have any thoughts on the topic, and join our forthcoming webinar to learn more. Register here: www.netradar.com/webinar

Analyzing 5G coverage and service quality

The current 5G service is built on the existing 4G core and 4G signaling, the so-called Non-Standalone Mode (NSA). The signaling and data transmission are asymmetric in that they are run on different frequencies. The 5G data bearer uses in most cases the 3.5GHz frequency band while the 4G signaling channel uses lower frequencies.
In theory everything works well and 5G subscribers always get 5G service when they are in the planned coverage area. In most cases this probably holds, but there are many situations where the planned and the effective 5G service differ. In those situations, a customer with a high-end 5G phone and an expensive 5G data plan will not be happy.
The network operations center has a huge amount of information on the performance of the network, from very low level details from the cell sites to high level KPIs. Yet, they do not see what the customer experiences, where and when.
In the following, we show some examples of how Netradar data can be used to build the best possible 5G service.

Signal coverage of 5G

Your network planning tools give one estimate of where the 5G bearer should be strong enough to carry data. Yet, as the most common 5G frequency is 3.5 GHz, it propagates differently: it has a shorter effective range than previously used frequencies and does not penetrate buildings as well.
In particular, indoor coverage can be very challenging to measure, making it hard to locate the places that need better planning.

• Coverage of 4G signaling in 5G NSA​

In a 5G NSA service, the 4G network offers the control signaling to the devices. The availability of 5G is advertised inside the 4G channel parameters. In the optimal case, 5G is advertised in the same area where the new radio is actually available. Often the same base station offers both the 4G signaling and the 5G bearer.

With the Netradar analytics, we can see where the 5G service is being advertised by the 4G signal.​

• Difference between signaling and data bearer​

As the frequencies are different and the propagation of the 4G and 5G signals is estimated based on models, we have situations where the network advertises a service that is not available in real life.

The Netradar analytics sees where 5G is being advertised by the 4G network and where people with 5G devices and subscriptions actually get onto 5G. Digging further into the data, we can help understand why a customer does not get the new service and is instead left on 4G.

• Handovers between 4G and 5G​

When both 4G and 5G are available, the network controls the service the customer gets; the decision on which radio service to use is taken by the network. In most cases this works great. But with Netradar analytics, we can see situations where the customer’s device jumps back and forth between the two radio technologies. This increases delays in data transmission and can even cause a full loss of service before the new bearer is configured to carry the customer’s data.

Performance​

The new 5G service offers a higher peak bit rate and lower latency compared to 4G – at least in most cases. Yet, as the 5G service is run along 4G and the data bearer is running on a different frequency, there can be serious performance issues that affect the customer. ​

In general, 5G offers a lower latency. The industry talks about 1ms latency but this usually means the delay of the radio link, not an end-to-end delay to the content in the cloud. The 1ms delay could be possible if content is hosted in the base station itself, so-called edge computing.​

In reality, we see a similar end-to-end latency with current 4G and 5G networks. 5G latencies tend to be a bit smaller but since the path to the content can be long, the benefits of a lower latency radio link diminish.​

Yet, what we do see in our data, is that 5G users often have very high latency peaks in their data transfers. With 4G, the worst case latencies can be several hundred milliseconds in a congested network but with 5G we have seen latencies as high as 1.5 seconds in some networks. These indicate that something in those 5G networks is not working right.​

We see similar surprises with 5G download speeds. The industry advocates 1 Gbit/s peak speeds, and sometimes consumers do get very high speeds. But we also see very low speeds in 5G, even lower than 4G offers in the same location. In these cases, the device was using a very bad 5G bearer even though a better 4G signal would have been available. We also see network congestion events in 5G that lead to download speeds of less than 10 Mbit/s – not a tremendous service for a premium subscription.

The above are only some examples of the analytics and views we can offer to a 5G provider. In a future post, we will look at these cases in more detail with real data from mobile networks. Stay tuned!

How to measure real latency, as experienced by your customers?

Latency has become a hot topic with the introduction of 5G. When advocating 5G services, the industry talks about gigabit download speeds and about 1 ms latency in 5G (but forgets to mention that this is the radio-link latency, not the latency to the content – more about that later).

A low latency is truly beneficial for some daily mobile applications, like VoIP calls and real-time multiplayer gaming. If the vision of self-driving cars running over 5G becomes reality, a low latency will surely be important there, too.

So, what is the latency that your customers are truly experiencing in their daily usage of the mobile network?​

It is fundamental to note that “latency” can describe different metrics: either the one-way latency between two points, or the two-way latency, also called the Round Trip Time (RTT).

The one-way latency is challenging to calculate accurately as it requires the clocks to be synchronized between the measuring nodes. One could also simply measure the RTT and divide it by two to get a rough estimate of the one-way latency. Yet, in asymmetric networks, which mobile networks are, this would be inaccurate. The RTT (or what people often refer to as “ping”) is much easier to measure, as the sending and receiving timestamps are taken from the same physical clock.​
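As a minimal sketch of why RTT is the easier metric: both timestamps below come from the same local monotonic clock, so no clock synchronization with the remote end is needed. This is an illustration, not Netradar code; the UDP echo server address and port are placeholders:

```python
# Minimal RTT ("ping") measurement over UDP. Both timestamps are taken from
# the same physical clock on the sender, so the measurement is accurate even
# when the remote echo server's clock is completely unsynchronized.
import socket
import time

def measure_rtt(host: str, port: int, timeout: float = 2.0) -> float:
    """Return the round-trip time in milliseconds to a UDP echo server."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t_send = time.monotonic()          # timestamp 1, local clock
        sock.sendto(b"ping", (host, port))
        sock.recv(64)                      # wait for the echo
        t_recv = time.monotonic()          # timestamp 2, same local clock
    return (t_recv - t_send) * 1000.0
```

Measuring one-way latency the same way would require both ends to agree on the time, which is exactly the synchronization problem described above.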

The latency that affects consumers’ apps is the sum of many components:

1. Radio link

2. Access network​

3. Core network

4. Content server​

The radio link (1) and access network (2) are the key components that eventually define most of the latency. The radio link, in particular, is the part that often becomes congested, increasing the latency of the mobile applications data flow.​

Core networks (3) do not easily get congested, unless hardware or cables break. The propagation delay in fiber is roughly 5 ms per 1000 km each way, i.e. about 10 ms of RTT per 1000 km, and the hardware along the path increases this further.

Content servers (4), residing in a data center, can be a bottleneck if the service runs on inadequate hardware that is not up to the task of serving the current user base. Latency can be introduced by the server nodes themselves, the internal data center network, or the external internet connection.
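To make the component breakdown concrete, here is a back-of-the-envelope sketch (not Netradar code) that sums the four contributions. It assumes the commonly cited propagation speed of light in fiber, roughly 5 microseconds per km one way; all the other figures are illustrative:

```python
# End-to-end RTT as a sum of the four components: radio link, access network,
# core network (propagation over fiber), and content server processing.

FIBER_US_PER_KM_ONE_WAY = 5.0  # ~speed of light in glass, microseconds per km

def core_rtt_ms(distance_km: float) -> float:
    """Propagation-only RTT contribution of the fiber core, in milliseconds."""
    return 2 * distance_km * FIBER_US_PER_KM_ONE_WAY / 1000.0

def end_to_end_rtt_ms(radio_ms, access_ms, core_distance_km, server_ms):
    return radio_ms + access_ms + core_rtt_ms(core_distance_km) + server_ms

# Example: 20 ms radio, 5 ms access, 1000 km of fiber, 3 ms at the server.
# core_rtt_ms(1000) contributes 10 ms, so the total is 38 ms.
total = end_to_end_rtt_ms(20, 5, 1000, 3)
```

The sketch shows why the radio link usually dominates: even 1000 km of fiber adds only about 10 ms of RTT, while a congested radio link can add hundreds.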

How should you measure latency? ​

If we focus on crowdsourced measurement methodologies, we notice that most players have a look-up system that seeks to allocate the nearest measurement node for the test. The more nodes you have, the better the chances are that the end point is physically close to the measuring device, and the lower the reported latency.

Alongside latency, one can measure jitter, a notion of how stable the latency is and a way to show how much it fluctuates. The closer the content is to the end user, the lower the jitter, too.

Yet, as most players measure latency before a synthetic speed measurement, the reported latency and jitter present optimal results in an unloaded network. ​

Our daily apps do not operate in empty environments. Our apps send and receive data all the time, some more, and some less, and depending on the other users in the network, the capacity available changes. As the available capacity of the radio network drops, the user data starts to get buffered before transmission and latency grows.​

Netradar measures latency inside the ongoing users’ data transfers. We use sophisticated AI (artificial intelligence) algorithms to decide when and how to measure the latency. ​

Our analytics can therefore present the real latencies experienced by the end users, while using their daily mobile apps.​

Furthermore, at the heart of Netradar are our algorithms that can identify network congestion and show which user data transfers were limited by the mobile network. By combining our latency measurements with the congestion detection algorithms and the detailed contextual data, Netradar can provide a very detailed picture of the performance of a mobile network, from country to city level, down to individual routing areas, base stations and even antenna sectors.

What does Netradar analytics show?

Our analytics, in relation to latency, show the familiar metrics that most crowdsourced systems or dedicated measurement hardware can show and a lot more. As we know when the network is running perfectly, and when there is a shortage of capacity, we can analyze the configurations of different parts of the network and help optimize the performance and the behavior.​

The Netradar analytics show, in terms of latency:

1. Average latency as consumers and their apps experience the mobile network in their daily journey, regardless of the network capacity; ​

2. Minimum latency, the optimal case, when everything works perfectly (similar to the typical latencies reported by speed tests);​

3. Maximum latency, the worst case, when the network is seriously overloaded;​

4. Latency when there is ample capacity to serve the users;

5. Latency in a congested network; ​

6. Latencies caused by handovers between base stations or radio technologies;​

7. All of the above broken down per radio technology – 3G, 4G and 5G (SA and NSA) – as well as

8. All of the above measured against any number of reference points, enabling a very extensive view of data connectivity, even international network peering.
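As a rough illustration of how such a multi-dimensional view can be derived (a simplified sketch, not Netradar's implementation), assume each latency sample already carries a congestion flag; the per-state averages, minima and maxima then fall out directly:

```python
# Deriving a multi-dimensional latency view from flagged samples. The
# congestion flags are input data here; producing them is the hard part,
# done in practice by the congestion detection algorithms.

def latency_metrics(samples):
    """samples: list of (latency_ms, congested_bool). Returns nested metrics."""
    def stats(values):
        return {"avg": sum(values) / len(values),
                "min": min(values),
                "max": max(values)}
    all_ms = [ms for ms, _ in samples]
    free = [ms for ms, congested in samples if not congested]
    loaded = [ms for ms, congested in samples if congested]
    return {
        "overall": stats(all_ms),
        "uncongested": stats(free) if free else None,
        "congested": stats(loaded) if loaded else None,
    }
```

Three statistics across three network states (overall, uncongested, congested) yield the nine-metric breakdown described in the list above.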

In Netradar, latency is not a simple single value but a multi-dimensional metric that can be used to study in detail your own network and those of your competitors.

Deploying Netradar

Netradar is a solution for collecting private network performance analytics. As such, our customers have the ability to deploy measurement points anywhere on the planet. We can rotate the location to test latency between different national hot-spots, data centers or even access network hubs. We can also measure around the world, to see latency to different countries, or to major cloud providers like Amazon, Google and Microsoft. ​

Our latency measurement servers are designed to be extremely lightweight, yet powerful. A single server running in a virtual machine can serve tens of thousands of customers. This is yet another benefit from our architecture and methodology: you do not need to deploy heavy servers for testing top speeds, only lightweight “ping” servers. ​

An example from Germany ​

To illustrate what Netradar shows, we pulled data from the past couple of months from the three major players in Germany. Here are some examples of what we can notice immediately:​

• One of the providers has a lower minimum latency than the other two: a very significant difference. If I were a gamer, the choice of provider would be clear to me.

• One provider shows far more latencies from congested data transfers, which indicates a much higher load in the network and not enough capacity to serve the customers. I would potentially avoid this operator.

• One provider has a higher average latency, which indicates a potentially suboptimal configuration or even network issues.

• One provider has a very high average of the minimum latencies, calculated per data transfer – not a perfect partner for running delay-sensitive applications.

• The difference between the optimal latency and the latency in a congested network can reach 200 ms for one provider, which will substantially impact mobile apps.

• Looking at individual regions, Brandenburg has very high latencies from two network providers. The same can be noticed in Saxony-Anhalt, where two providers have high latencies without significant network load. The worst regional latency is found in Mecklenburg-Vorpommern, where one provider offers up to 100% higher latencies than the other two – a very significant impact on customer experience.

• Looking at 5G, there is no difference in latency compared to 4G. This is due to the Non-Standalone (NSA) mode of deploying 5G, where the access network is the same for both 4G and 5G. In some cases, the 5G latency is even higher than 4G, as people make use of the full bit rates and data transfer capabilities of the technology and load the network with traffic. Hopefully Standalone (SA) mode will change this, for both unloaded and congested networks.

Conclusions

In summary, with the wide range of apps and services, and the emerging 5G networks, latency must be seen as an important metric alongside bit rate. Measuring the real latency experienced by customers is critical, as it sheds light on how the network is configured and how it performs with daily apps and data transfers. Finding misconfigurations and network segments with limited capacity makes the difference between an average network service and a great one.

Only Netradar can provide a full picture of network performance as experienced by the end users. Book an introduction session and a solution demo by contacting tomi.paatsila@netradar.com

The Netradar Hybrid Measurement Technology

In the modern digital world, systems and services have become an intertwined and sensitive network, where small changes or hiccups in one corner can have dramatic effects on the system as a whole. Critical parts of the system need to be monitored for inconsistencies, and the performance and capacity need to evolve constantly to support continuous growth.

Mobile cellular networks are one critical component of modern society. Mobile operators proactively follow the performance of their precious investment with various tools and services and react when things break. As mobile networks are built with simulations and planning tools, but the end users are real, there is often a wide gap between the plans and reality. Leading companies have invested in crowdsourcing to gather data from their real users, to support their processes and enhance their networks.

In the context of mobile network measurements and in particular end-point measurements, we see two main types of measurements being used in crowdsourcing: active and passive. Active measurements are typically based on sending a synthetic payload to load the network and measuring the top speed of that transfer or sending latency measurement probes to calculate the latency. In passive measurements, the end point simply monitors the wireless interface and takes speed samples from the traffic flow.​

Active measurements can be triggered in a multitude of ways, e.g.,​

• By a consumer wishing to test the current network top speed,

• Remotely by a service provider to debug a potential network speed issue, or ​

• Periodically to gather intelligence on the network top speeds. ​

Regardless of the reason for triggering an active speed measurement, it consumes data, some more, some less. The point is to see what maximum speed the network could allow at that specific point in time. The payload can be a synthetic data stream, the download of a certain set of web pages, or e.g. streaming video with varying bit rates. The type of payload affects the end result directly: if the payload cannot saturate the link, we do not learn the top speed of the network at all. In some tests, it might be enough to simply measure whether the payload arrives within a certain delay bound, e.g., a web page loads all components in under 10 seconds.

With synthetic speed tests, there are many challenges if one wants to have reliable results. The first is to decide how to saturate the link, using one or multiple data streams at the same time? Note that most consumer apps that do high speed transfers (buffer a video, update an app, download fresh content, etc.) use only one data stream.​

The next question is: how will you know that the link has been saturated and your results actually reflect the network’s maximum speed? Typically, speed test apps download and upload data for 10 seconds and expect that this time is enough to measure the link’s top performance. The downside is that the amount of data transmitted in a 10-second test can be huge: over a 4G network, it could be anything from 1 MB (a user at the cell edge) to over 600 MB (a user on an LTE-A network). This has a huge impact on device power consumption and can use a huge fraction of the data plan, if one is being used. Naturally, running these types of tests periodically in the background does not make sense, as you never know beforehand how much data will be used.

To counter the issue of potentially huge data consumption, many vendors use e.g. 5 MB or 10 MB files that are downloaded periodically in the background. The problem here is that with small data volumes, the speed recorded might not reflect the network’s capability. On a slow network the file should saturate the link, but on a fast network it would not – and how do you know which case occurred from the speed metric alone?

Moreover, the potential top speed of a mobile network can be anything from kilobits per second to even a gigabit per second – a difference of 1000X. How do you dimension your measurement servers and their network connection in such a way that they do not become the bottleneck themselves? If you get too much measurement traffic, you end up being limited by your measurement system, not by the network service being measured.​

Active methodologies are typically used in drive testing, where the amount of data or battery consumed is somewhat irrelevant. In consumer-oriented crowdsourcing, active methodologies suffer from the potentially huge data and energy consumption, which limits how often they can be used. This results in very limited amounts of measurement data, making the data less valuable. Mostly, active measurements can be used for operator comparison benchmarks on a national level (going to a finer granularity is typically not possible with the small amount of data) or for some reactive debugging following customer complaints. Active measurements cannot be used to proactively monitor network quality on a finer level and enhance the quality before customer complaints emerge.

Furthermore, making an active measurement now at a given location does not tell what the performance was an hour ago or yesterday, or what it will be tomorrow. There is potentially some relationship, but it is not a reliable metric; you would need a lot of active measurements of the same location to increase your confidence.​

Passive methodologies can provide a considerable amount of measurement data as the purpose is to monitor ongoing consumer data transfers. The downside is that the monitored speeds do not include information about the reasons why a certain peak speed was recorded: was it the network maximum, was it the server maximum, or something else? So having a lot of speed records does not help much in the end.​

Latency measurements are also actively used to get information about how quickly the network can forward IP packets. Yet, as latency measurements are typically done outside actual data transfers, they record a sort of ideal network performance and do not reflect the real latency experienced by consumers’ data flows when those flows saturate the network.

Thus, in the mobile network context, the current purely active or passive measurement methodologies have some inherent downsides that make them rather limited in wider use. ​

Active and passive measurements have been the norm, and mobile operators are familiar with them. But are these the holy grail of end-point driven, crowd-sourced, network analytics, or could it be done differently?​

Hybrid measurements​

The IETF RFC 7799 states:​

“Hybrid Methods are Methods of Measurement that use a combination of Active Methods and Passive Methods, to assess Active Metrics, Passive Metrics, or new metrics derived from the a priori knowledge and observations of the stream of interest.”

Thus, in essence, we can think about combining active and passive methodologies to form something new. The Netradar hybrid is such an invention.​

The Netradar hybrid measurements combine passive and active techniques in a novel way. We use passive network monitoring to calculate the momentary bit rates, and augment these speed samples with an optimized stream of latency measurement probes. We do not create a synthetic payload; we only measure latency.

While legacy crowd-sourced solutions measure speed and latency separately, we merge the two and run them at the same time. If the network is working well, the measured latency is stable and has only a small jitter (horizontal and vertical handovers do create high latencies and jitter, but those are easy to spot and filter out). But once the network starts to receive too much load (downlink or uplink), it has to start buffering the IP packets. When this happens, the latency starts to grow and jitter increases. The Netradar proprietary patent-pending algorithms detect this abnormal behavior and flag a passive measurement as having abnormal latency.
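To illustrate the general idea (a deliberately simplified toy, not Netradar's patent-pending algorithm), one could estimate a latency baseline from the quietest samples of a transfer and flag the transfer when latency or jitter drifts far above that baseline; the thresholds here are arbitrary:

```python
# Toy congestion flagging from an RTT series measured inside a data transfer.
# Baseline = mean of the lowest samples (an "ideal latency" estimate); flag
# when the overall mean or the jitter rises far above it. Thresholds are
# arbitrary demo values, not tuned parameters.
from statistics import mean, pstdev

def flag_congestion(rtt_ms, baseline_fraction=0.2, rise_factor=2.0, jitter_ms=20.0):
    """Return True if the RTT series looks like a loaded, buffering network."""
    ordered = sorted(rtt_ms)
    k = max(1, int(len(ordered) * baseline_fraction))
    baseline = mean(ordered[:k])              # estimate of the unloaded latency
    return mean(rtt_ms) > rise_factor * baseline or pstdev(rtt_ms) > jitter_ms

stable = [30, 32, 31, 29, 33, 30]             # healthy network: flat RTTs
loaded = [30, 60, 120, 240, 300, 280]         # buffers filling up: RTTs climb
```

The real algorithms must additionally separate handover spikes from genuine congestion and adapt the thresholds to the radio technology, which is where the hard work lies.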

This methodology has the benefit that we can acquire a huge amount of speed and contextual data on consumers’ real network speeds, and we know when the network had issues providing reliable and stable connectivity. Yet, we do not inject a huge synthetic measurement payload, only small ping packets. Our overhead on the end user device is roughly 0.1% of increased data consumption per month.
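A quick sanity check of that overhead figure: roughly 3 MB of measurement traffic per month, set against a hypothetical 3 GB of monthly user data (an assumed typical usage, not a Netradar figure), is about 0.1%:

```python
# Measurement overhead as a percentage of monthly data usage.
# The 3 GB monthly usage is a hypothetical example value.

def overhead_percent(measurement_mb: float, monthly_usage_gb: float) -> float:
    return measurement_mb / (monthly_usage_gb * 1024) * 100

ratio = overhead_percent(3, 3)   # just under 0.1%
```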

To illustrate the power of hybrid measurements, let’s look at two maps. The data is from only one user to show how much data we can gather.​

First, we have speeds recorded in a suburb of Helsinki during February 2021. Note how many of the speeds are in the red, i.e. less than 5 Mbps. This is the data you would get from passive measurements. Based on this data, you would expect that this particular Elisa 4G network is pretty bad. Not quite so.

Figure 1: All speed samples from passive data collection, without any notion of why speeds are often under 5 Mbps (red is a peak speed of 5 Mbps or lower).

When we filter out all the speeds where the network latencies were stable (by our definition and algorithms, considering a 4G cellular network), we are left with the locations that had irregular latencies while the consumers had data flowing to their favourite apps.​
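The filtering step can be sketched as follows; the sample records and the 5 Mbps threshold are illustrative, mirroring the red/green coloring of the maps:

```python
# Keep only the speed samples whose transfer showed irregular latency
# (i.e. was limited by the network), then pick out the low-speed "red dots"
# that deserve attention from the network quality team.

def network_limited(samples):
    """samples: list of (speed_mbps, latency_unstable_bool)."""
    return [(speed, unstable) for speed, unstable in samples if unstable]

def red_dots(samples, threshold_mbps=5.0):
    """Low-speed samples that were actually constrained by the network."""
    return [speed for speed, _ in network_limited(samples)
            if speed < threshold_mbps]

# Illustrative records: a slow-but-fine sample (2.0, False) is dropped,
# while slow network-limited samples become the red dots.
data = [(2.0, False), (3.5, True), (80.0, True), (1.2, True), (45.0, False)]
```

Note how the 2.0 Mbps sample is excluded: it was slow, but the stable latency shows the network was not the limiting factor.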

Figure 2: Data speeds that were limited by the mobile network. Some are green, indicating a high speed and little impact to the user (if any), some are much lower impacting the user.​

If I were in the network quality department of this Finnish mobile operator, I would concentrate on the red dots in this latter map, look at the contextual data related to all the dots, and check what my network-internal analytics have to say. Are these single incidents, like a user at the cell edge? Is there a load balancing or coverage issue where traffic is not balanced adequately, a misaligned antenna pointing in a bad direction, a hardware issue, or something else?

To illustrate further the power of our technology, consider the notion of latency. Network latency measurements are traditionally run outside data transfers. Netradar runs them inside the data transfers. While analyzing the latency probes and samples, we can calculate the minimum, average and maximum. And since we know if the jitter and latency behaved abnormally, we can also calculate the latency in a loaded and congested network, so the latencies consumers and their apps experience in real life if the network runs out of capacity.​

The figure below shows an example with real data: the latency in a given area in February 2021. We have calculated altogether nine different latencies. The typical metric is the average latency (1). Yet, as the load in the network also affects the latency, we can distinguish between the latency in an uncongested (lightly loaded) network (2) and in a congested (highly loaded) network (3). These tell how the network really handles end user traffic: what the latency is when there is no shortage of capacity, and what it is when capacity has run out and end user traffic is constrained. In these cases, we can also calculate the minimum and maximum latencies, adding a further 3+3 metrics (best and worst cases).

Figure 3​

Now these latencies are much more valuable input to network planning and performance management. They describe the real latencies occurring in the network, not just the optimal case that looks good in reports to management.

To summarize, the power of the Netradar hybrid technology is the ability to analyze latency and latency variations during data transfers and draw conclusions on the performance of the mobile, or Wi-Fi, network. The benefits over existing legacy methodologies are huge:

1. Extremely small data and energy consumption as we do not inject synthetic payloads, only monitor the consumer’s data connection and measure latency.​

2. Detailed view of consumers’ daily mobile experience as our methodology is not based on random background speed tests but rather continuously monitoring the quality of the mobile network. ​

3. Data for proactively looking for network misconfigurations and problems before complaints occur. ​

4. No need for speed test servers, so deployment is rather simple: we only need some very small ping servers and a place to push the collected metrics (your own database, Google Cloud, Amazon AWS or Microsoft Azure are all supported already).

5. As the data collection is so extensive, even a small deployment, say, 5% of your customer base, will generate a huge amount of data. ​