Measuring Internet Performance
The goal of this article is to describe the different ways to measure Internet performance, including the popular speed tests, and their limitations. We will also cover how these measurements are typically performed.
While we will avoid jargon as much as possible, we may sometimes need to use terms that, to keep this article simple, are defined in the article “Vocabulary of Internet performance”.
Measures of Internet Performance
In the article “The topology of the Internet and its impact on performance” we discussed how the structure of the Internet may impact a user’s experience. In this article, we will discuss Internet performance from source to destination without delving into the details of the topology that underlies the path or paths taken.
There are several measures of Internet performance that were defined in the article “Vocabulary of Internet performance”. They can be grouped as follows to combine concepts that are frequently used synonymously, but are sometimes subtly different:
- Bandwidth/Throughput/Speed
- Latency/Delay/Lag/Ping Time
- Jitter/Delay variation
- Packet Loss
Methodology
While there are limits to it, we often use the road traffic analogy to discuss performance for the sake of simplicity. Each performance metric can be measured through a technique called sampling, which is similar to how traffic is measured on roads. Not every car is tracked; a small sample of tracked cars can give a pretty good measure of how traffic is moving. Instead of cars on roads, “packets” are used in Internet performance measurements. However, unlike the road analogy, the sample packets are actually generated and tagged so that they can be tracked.
The Route
Having established the methodology, the next step is to choose the source and the destination of the sample traffic.
As we discussed in our previous article “The topology of the Internet and its impact on performance”, there is no guarantee that individual packets will follow the same route from source to destination. This adds a layer of complexity to thinking about Internet performance. However, what matters is that all the packets sent reach their destination.
Internet measurements typically start from a source, like a browser on a PC. The measurements are then made to a destination chosen by the company providing the measurement. The choice of the destination therefore has an impact on the measurement. Ideally, multiple destinations are chosen to give a holistic view.
To make the measurement, sample packets are generated and sent from the source to the destination and back. Statistical techniques are then used to compute an estimate of performance.
The Measurements
Bandwidth/Throughput/Speed
These measure how quickly raw information can be transferred from source to destination. Upload speed refers to the rate at which information can be sent (say, a large email attachment) and download speed refers to the rate at which information can be received (say, a movie). Both are typically measured in megabits per second (Mbps) or gigabits per second (Gbps).
Internet service providers typically publish these numbers to describe the capacity of the connection to the home. In practice it is rare, if it happens at all, to actually get the published speed, because of conditions on the network. It is expected, however, that a user will see performance “close” to those published numbers under ideal conditions.
Caution: Before running tests and potentially getting upset at the internet service provider, it is important to ensure that the tests are being run in a “clean” environment. Ideally, the machine where the test is being run is connected directly (not via WiFi) to the router closest to where the connection is delivered, and the test is run after removing other sources of delay (such as other applications running at the same time on the network or on the machine itself).
The tests to measure Bandwidth/Throughput/Speed are popular and are referred to as “speed tests”. They are often provided by the internet service provider or can be hosted by third parties such as Ookla.
These tests identify a server, typically not far from the source, and send a large sample of data to it and then back from it. The data is large enough to saturate the capacity of the connection, so the time it takes to move the information can be measured. It is similar to trying to see how many cars can move on a highway by sending a lot of cars onto it until all lanes are full. Clearly, not something that should be done too often, because it saturates your network.
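To make this concrete, here is a minimal sketch of the download direction of such a test, written in Python. It assumes a hypothetical test URL that serves a file of known size; real speed tests typically use several parallel connections and discard the warm-up period, so treat this only as an illustration of the principle.

```python
# Minimal download-speed sketch. The URL below is a placeholder for a
# hypothetical speed-test file; real tests use several parallel streams.
import time
import urllib.request

TEST_URL = "https://speedtest.example.com/100MB.bin"  # hypothetical test file

def measure_download_mbps(url: str) -> float:
    start = time.monotonic()
    with urllib.request.urlopen(url) as response:
        payload = response.read()            # pull the whole payload
    elapsed = time.monotonic() - start       # seconds spent transferring
    return len(payload) * 8 / elapsed / 1e6  # bytes -> bits -> megabits per second

if __name__ == "__main__":
    print(f"Download speed: {measure_download_mbps(TEST_URL):.1f} Mbps")
```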
Most applications (other than large file uploads and downloads) do not push these limits, so speed tests, while useful, are not the only relevant measure of internet performance. Most video applications adjust the rate of transmission of packets to a comfortable speed to give the best user experience. As long as there are no other issues, most modern connections have more than sufficient bandwidth (in spite of the attempts by some internet service providers to convince subscribers otherwise).
Latency/Delay/Lag/Ping Time
This measures how long it takes for information to go from the source to the destination and back. Obviously, faster is better (the theoretical limit being the speed of light). For applications such as gaming, this is of particular significance. The longer the route the packet takes, the longer the latency or delay. Hence gamers will want to pick servers in the cloud that are closest to their location and connect directly to the first router within the home.
Typically this involves sending a few sample packets and seeing how long each of them takes to return. Since each packet can take a different amount of time, the reported measure is the average time across the sample sent. Ping is a commonly used tool/technique to measure this.
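As an illustration, here is a minimal latency sketch in Python. It uses the time to set up a TCP connection as a stand-in for a real ping (which uses ICMP echo packets and normally requires elevated privileges); the destination below is a placeholder.

```python
# Minimal latency sketch: average the round-trip time of a few probes.
# TCP connection setup is used here as a stand-in for an ICMP ping.
import socket
import statistics
import time

HOST, PORT = "example.com", 443   # placeholder destination
SAMPLES = 5

def tcp_rtt_ms(host: str, port: int) -> float:
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=5):
        pass                                   # handshake done; socket closed on exit
    return (time.monotonic() - start) * 1000   # milliseconds

rtts = [tcp_rtt_ms(HOST, PORT) for _ in range(SAMPLES)]
print(f"Average latency: {statistics.mean(rtts):.1f} ms over {SAMPLES} samples")
```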
Jitter/Delay variation
While latency/delay is the average time packets take to traverse a path from source to destination, it does not reveal the underlying quality of the path. In other words, two paths may appear to have the same latency, but if one of them had no underlying issues it should actually have been faster. There are numerous underlying issues that can cause delay, including congestion and other routing problems.
One measure of this quality is delay variation, or jitter. If the path is clean, all packets sent should arrive at the same pace they were sent; the arrival pattern should look like the transmission pattern. In reality there are subtle delays along the path and the packets can end up spaced very unevenly, indicating a problem. The larger the variation, the bigger the quality issue.
You may imagine a highway where all the cars are moving smoothly at the same pace versus the case where there are a lot of changes in the spacing between cars (i.e., a lot of accelerating and braking). Since the measurement is done at the time the cars (packets) arrive, you can see how the delay variation may be an indication of an underlying problem.
Since the test generates an evenly spaced set of sample packets, it is expected that they arrive evenly spaced. Delay variation is computed by averaging the variations in the arrival delays.
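For illustration, here is a minimal sketch of that calculation in Python, using made-up delay samples. One common convention, sketched here, is to average the absolute change between consecutive delays.

```python
# Minimal jitter sketch: average the variation between consecutive delays.
# The round-trip times below are made up for illustration.
import statistics

rtt_ms = [21.0, 22.5, 20.8, 35.2, 21.3, 22.1]   # hypothetical measured delays (ms)

variations = [abs(b - a) for a, b in zip(rtt_ms, rtt_ms[1:])]
jitter = statistics.mean(variations)

print(f"Mean delay: {statistics.mean(rtt_ms):.1f} ms, jitter: {jitter:.1f} ms")
```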
Packet Loss
Packet loss is another measure of path quality.
When packets don’t arrive at the destination, due to corruption or congestion along the path, or arrive too late (and are ignored), it is the responsibility of the application to notice and request a resend. Such a resend (called a retransmission) obviously causes delays. Therefore, it is likely that a shorter path with a lot of packet loss will not perform as well as a longer path without loss.
This test generates a sample set of packets and checks whether any were lost in either direction. Sparse and infrequent packet loss can be normal and have numerous valid underlying reasons, in which case it should be ignored. Experienced engineers can detect what is abnormal and seek to rectify the underlying issue.
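As a rough illustration, here is a minimal loss-estimate sketch in Python. It reuses a TCP connection attempt as the probe (real loss measurements usually send ICMP or UDP probes); the destination is a placeholder and a timed-out probe is simply counted as lost.

```python
# Minimal packet-loss sketch: send a batch of probes, count the failures.
# A TCP connection attempt stands in for a real per-packet probe.
import socket

HOST, PORT = "example.com", 443   # placeholder destination
SENT = 20

received = 0
for _ in range(SENT):
    try:
        with socket.create_connection((HOST, PORT), timeout=2):
            received += 1          # probe answered in time
    except OSError:
        pass                       # timeout or error counted as a lost probe

loss_pct = (SENT - received) / SENT * 100
print(f"Sent {SENT} probes, received {received}, loss {loss_pct:.0f}%")
```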
Conclusion
- There are four groups of tests which can provide a fairly realistic view of internet performance.
- Speed tests cannot and should not be performed often. These tests, while useful, are limited in what they do as they are a “here and now” measurement. There is no guarantee that a few minutes later the measurements will be similar. Because the tests choose the servers closest to them, they also provide an ideal speed measure which may not map to the actual user experience.
- The remaining tests can be performed frequently as they barely use any of the available bandwidth. If performed continuously, they give a better view of the quality of the underlying network as well as trends in performance. For example, one may see that every evening when everyone returns from work there is an increase in the delay numbers, possibly because people are home and streaming video. Over a long period of time a regular pattern can be detected.
- The location and resources used to perform these measurements are important. Ideally, these measurements are performed on a standalone machine with nothing else running on it, connected to the first router in the house via a direct (non-WiFi) connection.
- The source and destinations for these tests impact the results. A diverse set of destinations may give a more holistic view of the performance a user can expect.
A plug for monitor-io
Monitor-io was designed to take into consideration the above conclusions. The monitor-io device is a standalone machine that does the measurements and nothing else. It is designed to be plugged into the router and connects via a wired connection. The destinations for tests are selectable and can be chosen by the user based on their location. Since the destinations are co-located with popular sites on the internet, the routes are likely to represent the typical path user data will travel, and therefore be a good approximation of the user experience.