
Proving Quality of Outcome captures complex network requirements: A Call of Duty case study

September 10, 2024
Bjørn Ivar Teigen Monclair
Head of Research

Did you know that a mere 0.2 seconds of extra delay noticeably degrades user experience in fast-paced online games? Over the last few years, we’ve been busy testing how games and video conferencing apps respond under poor network conditions to better understand their network requirements. Our goal was to pinpoint the conditions under which users notice performance issues and determine the thresholds at which these apps stop functioning completely. The insights gained from this research enable us to produce Quality of Outcome (QoO) scores for these applications, providing a more accurate measure of network performance.

The testbed

We built a testbed and some custom tools for the job. The testbed uses an nfqueue-based queueing mechanism that allows greater flexibility over network conditions than standard Linux tools. With tools such as Netem, adding “white noise” (Gaussian) latency, or latency sampled from a uniform distribution, is straightforward. But we needed to add latency to only a few packets every now and then, because that more accurately simulates a congested network. We also needed to make sure the application packets were not re-ordered.

Our tool injects latency in the per-packet processing step at the head of a FIFO queue. This means that a queue will form if more packets arrive while the head-of-line packet is waiting to be processed. Therefore, the latency observed by the application traffic can be higher than the latency we add to any single packet. An analogy: the network emulator is like a grocery store checkout line where the cashier takes the occasional smoke break. We can configure how often, and for how long, the cashier is out. When the cashier is out, a queue is likely to form, and some people might end up waiting in line even longer than the full duration of the smoke break.
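As a minimal sketch of this head-of-line behavior (not the actual testbed code, and with made-up numbers), the emulator can be modeled as a single-server FIFO queue where one packet is held for an injected delay and the packets behind it queue up:

```python
def simulate(arrival_times, service_time, injected_delay, delayed_index):
    """Return per-packet observed latency (seconds) for a FIFO queue.

    Each packet takes `service_time` to process; the packet at
    `delayed_index` is additionally held for `injected_delay`
    (the "smoke break"). Packets are never re-ordered.
    """
    latencies = []
    free_at = 0.0  # time at which the "cashier" is next free
    for i, t in enumerate(arrival_times):
        start = max(t, free_at)  # wait behind the head-of-line packet
        hold = service_time + (injected_delay if i == delayed_index else 0.0)
        done = start + hold
        latencies.append(done - t)  # latency observed by this packet
        free_at = done
    return latencies

# One packet per millisecond, 5 ms service time, 50 ms delay on packet 2:
arrivals = [i / 1000 for i in range(5)]
lat = simulate(arrivals, 0.005, 0.050, 2)
```

With these illustrative numbers, packets 3 and 4 observe more than the 50 ms injected on packet 2 alone, because they queue behind it and then pay their own service time on top.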

Call of Duty

Call of Duty on PlayStation 4 is our game of choice. We play a few matches while tweaking the network conditions and note whether the player notices any lag or has other issues with the game. The data shows that small and rare latency spikes are enough for the player to notice and get annoyed.

The following plot shows the thresholds where Call of Duty stops working (blue line) and degrades from perfect to noticeably laggy (red line). The x-axis represents the magnitude of delays, and the y-axis shows the percentiles. For example, at the 99th percentile, 1 in every 100 packets is artificially delayed by our network emulator. This simulates a network with sudden and randomly timed latency spikes, a problem that is common in wireless networks, notably WiFi and 5G, and networks with bufferbloat.

As we can see, Call of Duty is very sensitive to latency spikes. If 1 out of every 100 packets receives an additional 128 ms of latency, the player will likely notice that the gaming experience is not good.
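The percentile framing above maps directly to a simple sampling rule: at the 99th percentile, each packet independently receives the spike with probability 1/100. A small sketch (illustrative only, not the emulator's actual sampling code):

```python
import random

def spike_delay(percentile, spike_ms, rng):
    """Sample the extra delay for one packet: with probability
    (100 - percentile)/100 the packet gets spike_ms of added latency,
    otherwise none. At percentile=99, roughly 1 in 100 packets spikes."""
    return spike_ms if rng.random() >= percentile / 100 else 0.0

rng = random.Random(42)  # fixed seed for reproducibility
delays = [spike_delay(99, 128.0, rng) for _ in range(100_000)]
frac_spiked = sum(d > 0 for d in delays) / len(delays)  # close to 0.01
```

Sweeping `percentile` and `spike_ms` over a grid is what produces threshold curves like the ones in the plot.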

In our full report, Performance Measurement of Web Applications, we have also published results for real-time video streaming and video conferencing applications.

Network requirement APIs

With the new Camara APIs for connectivity insights, it becomes feasible for the network to communicate how likely it is to meet specific application requirements. We believe the results presented here show how Quality of Outcome can serve as a guide for specifying such network requirements for specific applications.
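To make the idea concrete, a per-application requirement could be expressed as latency bounds at a set of percentiles and checked against measured values. This is a hypothetical sketch; the thresholds and the shape of the requirement are illustrative, not Domos's published numbers or the Camara API schema:

```python
def meets_requirement(measured_ms, requirement_ms):
    """Both arguments map percentile -> latency in milliseconds.
    The network meets the requirement if, at every percentile the
    application cares about, the measured latency is within the bound.
    A percentile the network has not measured counts as a failure."""
    return all(
        measured_ms.get(p, float("inf")) <= bound
        for p, bound in requirement_ms.items()
    )

# Illustrative "noticeably laggy" thresholds for a fast-paced game:
game_req = {99: 128.0, 99.9: 200.0}

steady_net = {99: 40.0, 99.9: 90.0}   # rare, small spikes
spiky_net = {99: 150.0, 99.9: 400.0}  # frequent, large spikes

meets_requirement(steady_net, game_req)  # True
meets_requirement(spiky_net, game_req)   # False
```

A QoO-style score would go further and report how likely the requirement is to hold, rather than a binary pass/fail, but the percentile-threshold structure is the same.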

Conclusion

Given that even brief network disruptions—impacting as few as one in every 100 packets—can significantly deteriorate user experience, it’s clear that traditional metrics like average latency and jitter fall short. They simply do not capture the full spectrum of what users endure. To address this critical gap, we are pioneering Quality of Outcome (QoO), a comprehensive network quality metric that accounts for these common yet often overlooked events. By focusing on real-world performance issues, QoO aims to ensure that every aspect of network performance is measured and improved, providing a more accurate reflection of what users truly experience.

© 2024 Domos. All rights reserved.