Communications Network Research Institute
VoIP over Wireless
Mirosław (Mirek) Narbutt
As VoIP goes wireless, this will present a challenge not only to fixed line operators, but to mobile operators.
Richard Webb, Infonetics
QoS requirements for VoIP
As VoIP spreads from the wireline to the wireless world, performance issues arise because the characteristics of wireline and wireless networks differ. Delay, jitter and packet loss, the key factors that affect packet voice quality in the fixed Internet, are further magnified in a WLAN environment. To achieve PSTN-like voice quality, the impact of these transmission impairments must be minimized. The stringent quality of service (QoS) requirements for VoIP are:
- low (<3%) or zero packet loss in order to achieve good voice quality
- low delay (<150ms preferred, <400ms limit) for interactive communication
- low or zero delay variation for continuous playout

Combating transmission impairments
Real-time voice transmission over packet networks imposes stringent requirements on one-way end-to-end delays and packet loss. The responsibility of meeting these requirements is shared between end-points and the underlying network.
Actions at the end-points
Much of the responsibility for packet loss, delay and jitter is delegated to the end-points. As long as transmission impairments remain below a certain level, actions at end terminals can greatly mitigate their effects. For example:
- Encoding and packetization schemes can be optimized to give smaller algorithmic delay and better bandwidth utilization;
- Degradation in speech quality due to packet loss can be minimized by means of packet loss concealment (PLC);
- De-jitter buffering can be implemented at the end terminals to compensate for delay variation;
- Echo cancellation techniques can improve speech intelligibility.
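As an illustration of the de-jitter buffering mentioned above, the sketch below estimates a playout delay from observed per-packet network delays using exponentially weighted averages of the delay and of its variation, in the style of classic autoregressive playout estimators. The function name, the smoothing factor `alpha` and the safety factor of 4 are illustrative assumptions, not the settings used in our experiments.

```python
from typing import Iterable, List


def playout_delays(network_delays_ms: Iterable[float],
                   alpha: float = 0.9,
                   safety_factor: float = 4.0) -> List[float]:
    """Return a playout delay estimate (ms) after each observed packet delay."""
    estimates: List[float] = []
    mean = None       # smoothed network delay
    variation = 0.0   # smoothed absolute deviation from the smoothed delay
    for delay in network_delays_ms:
        if mean is None:
            mean = delay  # initialise with the first sample
        else:
            mean = alpha * mean + (1.0 - alpha) * delay
            variation = alpha * variation + (1.0 - alpha) * abs(mean - delay)
        # Playout point = typical delay plus a margin proportional to jitter.
        estimates.append(mean + safety_factor * variation)
    return estimates


if __name__ == "__main__":
    trace = [20.0, 22.0, 21.0, 60.0, 23.0, 20.0]  # made-up delays in ms
    for delay, playout in zip(trace, playout_delays(trace)):
        print(f"network={delay:5.1f} ms  playout estimate={playout:6.2f} ms")
```

A larger safety factor reduces late-packet loss at the cost of a longer buffering delay, which is the trade-off that any de-jitter buffer has to manage.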

Actions at the underlying network
On the network side, there is considerable development activity in designing new architectures and protocols. Integrated Services (Int-Serv) mechanisms can provide QoS guarantees by adding circuit-like functionality (using the RSVP protocol). Differentiated Services (Diff-Serv) mechanisms enable service differentiation and prioritization of traffic classes. For wireless LANs, the 802.11e MAC enhancement distinguishes between different traffic categories contending for the same bandwidth. The Wi-Fi Multimedia (WMM) subset of the standard defines four such traffic categories: voice, video, best effort, and background. These categories are mapped to four output queues called access categories (ACs). A WMM-enabled end-user device or access point (AP) first classifies each traffic flow into one AC and directs it into the corresponding forwarding queue (see Fig. 3).

Each AC is characterized by its own set of channel access parameters:
- Arbitration Inter-frame Space (AIFS) specifies how long the medium must be idle before a station may start, or resume, its backoff countdown.
- Contention Window (CW) size specifies the range from which the random backoff interval is drawn; it grows from CWmin to CWmax after unsuccessful transmission attempts.
- Transmit Opportunity (TXOP) limit specifies the maximum duration for which a device may transmit once it has gained access to the medium for a given access category.
Different levels of service are provided by giving each AC different channel access parameters. For example, voice, as higher-priority traffic, receives a shorter AIFS and/or a smaller CW range, which gives it a high probability of accessing the wireless medium before lower-priority traffic.
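To make these parameters concrete, the sketch below computes the AIFS duration and the mean initial wait for each access category. The 802.11b timing constants (10 µs SIFS, 20 µs slot) are standard values; the per-AC numbers are the commonly cited EDCA defaults for an 802.11b PHY and are stated here as an assumption, since the actual values are advertised by the AP and may differ. All names in the code are illustrative.

```python
from dataclasses import dataclass

SIFS_US = 10       # 802.11b short inter-frame space, microseconds
SLOT_TIME_US = 20  # 802.11b slot time, microseconds


@dataclass(frozen=True)
class AccessCategory:
    name: str
    aifsn: int   # number of slots added to SIFS to form AIFS
    cw_min: int  # initial contention window (backoff drawn from 0..cw_min)
    cw_max: int  # upper bound the window can grow to


# Assumed default EDCA parameters for an 802.11b PHY (aCWmin = 31, aCWmax = 1023).
DEFAULT_ACS = [
    AccessCategory("AC_VO", aifsn=2, cw_min=7, cw_max=15),
    AccessCategory("AC_VI", aifsn=2, cw_min=15, cw_max=31),
    AccessCategory("AC_BE", aifsn=3, cw_min=31, cw_max=1023),
    AccessCategory("AC_BK", aifsn=7, cw_min=31, cw_max=1023),
]


def aifs_us(ac: AccessCategory) -> int:
    """AIFS = SIFS + AIFSN * slot time."""
    return SIFS_US + ac.aifsn * SLOT_TIME_US


def mean_initial_wait_us(ac: AccessCategory) -> float:
    """AIFS plus the mean of a backoff drawn uniformly from 0..cw_min slots."""
    return aifs_us(ac) + (ac.cw_min / 2) * SLOT_TIME_US


if __name__ == "__main__":
    for ac in DEFAULT_ACS:
        print(f"{ac.name}: AIFS={aifs_us(ac)} us, "
              f"mean initial wait={mean_initial_wait_us(ac):.0f} us")
```

Under these assumed defaults a voice frame waits far less on average than a background frame before its first transmission attempt, which is the source of the prioritization.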
Cross-layer tuning
Quality enhancement mechanisms at various communication layers are often complex and difficult to configure. As an example, Figure 4 shows how proper tuning of de-jitter buffering at the end terminal can improve the quality of a VoIP call.

Tuning one parameter can often bring a local performance improvement while having a disastrous effect on overall call quality. For example, increasing the number of permitted retransmissions decreases packet loss, which can improve speech quality, but it also introduces variable retransmission delays. Variable delays must be smoothed out before decoding, so de-jitter buffering is required, which adds further delay. As a result, decreasing packet loss improves speech quality (i.e. listening-only quality) but at the same time reduces the conversation's interactivity and can thus reduce the overall call quality. If part of the VoIP transmission path is being tuned, the impact of local tuning actions on both interactivity and speech quality has to be taken into account. Predicting the impact of local tuning actions on speech transmission quality is indispensable for proper QoS provisioning.
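The retransmission trade-off can be illustrated with a simple calculation. The sketch below assumes independent frame errors with a fixed per-attempt error probability, which is a simplification (real WLAN losses are often bursty); the function names and the example error rate are illustrative.

```python
def residual_loss(frame_error_prob: float, retry_limit: int) -> float:
    """Probability that a packet is still lost after all allowed attempts."""
    return frame_error_prob ** (retry_limit + 1)


def mean_attempts(frame_error_prob: float, retry_limit: int) -> float:
    """Expected number of transmission attempts per packet."""
    attempts = retry_limit + 1
    # Attempt k+1 takes place only if the first k attempts all failed.
    return sum(frame_error_prob ** k for k in range(attempts))


if __name__ == "__main__":
    p = 0.2  # assumed per-attempt frame error probability
    for retries in (0, 1, 3, 7):
        print(f"retries={retries}  loss={residual_loss(p, retries):.5f}  "
              f"mean attempts={mean_attempts(p, retries):.3f}  "
              f"worst-case attempts={retries + 1}")
```

Loss falls geometrically with the retry limit, while the worst-case number of attempts, and therefore the spread of delays that the de-jitter buffer must absorb, grows linearly.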

Quality contours for predicting speech transmission quality
We proposed a method for predicting speech transmission quality from time-varying transmission impairments. The method reduces the ITU-T E-model to transport-level, time-varying metrics (packet loss and packet delay). Using the ITU-T E-model formula and the ITU-T categories of user satisfaction, we derived contours of quality as a function of mouth-to-ear delay and packet-loss ratio (for an assumed encoding scheme). Figure 6 shows the quality contours for the G.711 encoding scheme (assuming random packet loss) derived from the ITU-T E-model formulas.
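For orientation, a reduction of this kind can be written as follows. This is a sketch under stated assumptions rather than the exact expressions behind Figure 6: all E-model parameters other than delay and loss are assumed to take their ITU-T G.107 default values (giving a base rating of about 93.2), the delay impairment Id is left as a function of the mouth-to-ear delay d, and the loss impairment uses the G.107 effective equipment impairment form for random loss (BurstR = 1), with codec-specific constants Ie and Bpl and the packet-loss ratio Ppl in percent.

```latex
\begin{align}
R(d, P_{pl}) &\approx 93.2 - I_d(d) - I_{e,\mathrm{eff}}(P_{pl}) \\
I_{e,\mathrm{eff}}(P_{pl}) &= I_e + (95 - I_e)\,\frac{P_{pl}}{P_{pl} + B_{pl}}
\end{align}
```

A quality contour is then the set of points (d, Ppl) at which R equals one of the boundaries between the ITU-T G.109 user satisfaction categories: R = 90, 80, 70, 60 and 50.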
Quality contours are a crucial part of predicting speech transmission quality affected by adaptive playout. The procedure is as follows:
- A trace file containing one-way mouth-to-ear delays, sequence numbers and marker bits for each arriving VoIP packet is recorded at the VoIP terminal, after the de-jitter buffer.
- A specific playout buffer algorithm takes the trace file as input and calculates a playout deadline for each incoming packet; packets that arrive after their playout deadline are dropped and counted as lost (late packets).
- Average mouth-to-ear playout delay and average packet loss (including late packets and packets lost in the network) are calculated over a defined time interval (for example 10 s, or the duration of a talk spurt).
- Quality contours are chosen for a specific encoding scheme and echo cancellation level.
- The calculated average playout delays and average packet losses are mapped onto a loss-delay plane on which the quality contours are drawn.
- Overall user satisfaction, in the form of a pie chart, is derived from the distribution of loss-delay points over the quality contours.
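The procedure above can be sketched in a few lines. Everything below is illustrative: the trace is a plain list of (send time, network delay) pairs with `None` marking packets lost in the network, a fixed playout delay stands in for the playout buffer algorithm, the delay impairment uses a widely quoted piecewise-linear approximation rather than the full ITU-T G.107 expression, and the loss constants (`IE = 0`, `BPL = 25.1`) are assumed values for G.711 with packet loss concealment under random loss.

```python
from collections import Counter

R0 = 93.2             # assumed E-model rating with other parameters at defaults
IE, BPL = 0.0, 25.1   # assumed G.711 + PLC constants, random loss

CATEGORIES = [        # (lower R bound, user satisfaction category)
    (90, "very satisfied"),
    (80, "satisfied"),
    (70, "some users dissatisfied"),
    (60, "many users dissatisfied"),
    (50, "nearly all users dissatisfied"),
]


def r_factor(delay_ms, loss_pct):
    """Approximate R rating from playout delay (ms) and packet loss (%)."""
    # Approximate delay impairment: gentle slope, steeper beyond about 177 ms.
    i_d = 0.024 * delay_ms + 0.11 * max(0.0, delay_ms - 177.3)
    i_e_eff = IE + (95.0 - IE) * loss_pct / (loss_pct + BPL)
    return R0 - i_d - i_e_eff


def category(r):
    """Map an R rating onto a user satisfaction category."""
    for bound, label in CATEGORIES:
        if r >= bound:
            return label
    return "not recommended"


def satisfaction(trace, playout_delay_ms, interval_s=10.0):
    """Return the percentage of intervals falling into each category.

    trace: list of (send_time_s, network_delay_ms or None if lost).
    """
    buckets = {}
    for send_time, delay in trace:
        buckets.setdefault(int(send_time // interval_s), []).append(delay)
    counts = Counter()
    for delays in buckets.values():
        # A packet is lost if it never arrived or missed the playout deadline.
        lost = sum(1 for d in delays if d is None or d > playout_delay_ms)
        loss_pct = 100.0 * lost / len(delays)
        counts[category(r_factor(playout_delay_ms, loss_pct))] += 1
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}
```

The returned dictionary holds the slice sizes of the pie chart: the share of time intervals whose loss-delay point falls into each satisfaction category.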

With quality contours, the impact of delay and packet loss on conversational speech quality can be studied in two ways: either as the combined effect of loss and delay on overall quality, or as the individual contributions of packet loss to speech degradation and of playout delay to interactivity degradation. This is especially useful in parameter tuning, when a trade-off exists between packet delay and loss and the aim is to find the operating point where conversational quality is maximized. An additional advantage of this method is that it shows percentages of user satisfaction instead of a single quality score. The proposed method of predicting user satisfaction from time-varying transmission impairments has already proven to be particularly effective in evaluating various playout buffer algorithms and in assessing VoIP performance in WLAN systems.
Related publications:
Draft Appendix I to ITU-T Recommendation G.109 (COM12-C32-E) The E-model based quality contours for predicting speech transmission quality and user satisfaction from time varying transmission impairments Mirosław Narbutt, Joachim Pomy, SG12 ITU-T Meeting, Geneva, January 2007
ITU-T Recommendation G.109 Appendix I (01/2007) The E-model based quality contours for predicting speech transmission quality and user satisfaction from time-varying transmission impairments
Experiments
Experimental VoWLAN Testbed
The VoWLAN testbed consists of 15 desktop PCs acting as wireless VoIP terminals, one desktop PC acting as a background traffic generator, and one desktop PC acting as an access point (AP). All machines in the testbed use 802.11b PCMCIA wireless cards based on Atheros chipsets, controlled by MadWiFi wireless drivers under Linux (kernel 2.6.9). All of the nodes are also equipped with 100 Mbps wired Ethernet. The PC that acts as an access point routes traffic between the wired network and the wireless clients, and vice versa (each PC has two interfaces: one on the wireless and one on the wired subnet). During experiments, each VoIP terminal runs one VoIP session and all sessions are bi-directional. In this way each terminal acts as the source of an uplink flow and the sink of a downlink flow for a VoIP session. The wired interface is used to generate background traffic, which is routed via the AP to the wireless interface of the same PC. All generated traffic involves one wired and one wireless interface, so no traffic is generated directly between wireless interfaces. The testbed is illustrated in Figure 8.

VoIP traffic is generated using RTP tools, whereas background traffic, in the form of a Poisson-distributed UDP packet flow, is generated using the MGEN traffic generator. During experiments, all experimental data (packet arrival times, timestamps, sequence numbers, and marker bits) are collected at the receiving terminal for later off-line processing.
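For readers unfamiliar with Poisson traffic, the sketch below generates the send times of such a flow by drawing exponentially distributed inter-packet gaps. It does not use or reproduce MGEN; the function name, the 1000-byte payload and the fixed seed are illustrative assumptions.

```python
import random


def poisson_send_times(rate_bps, payload_bytes, duration_s, seed=1):
    """Return send times (s) of a Poisson packet flow with the given mean rate."""
    rng = random.Random(seed)
    packets_per_second = rate_bps / (8.0 * payload_bytes)
    times = []
    # Exponential gaps between packets yield a Poisson arrival process.
    t = rng.expovariate(packets_per_second)
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(packets_per_second)
    return times


if __name__ == "__main__":
    times = poisson_send_times(rate_bps=2_000_000, payload_bytes=1000,
                               duration_s=10.0)
    offered_bps = len(times) * 1000 * 8 / 10.0
    print(f"{len(times)} packets, offered load about {offered_bps / 1e6:.2f} Mbps")
```

With these assumed settings the mean rate is 250 packets per second, so a 10 s run produces roughly 2500 packets, with the burstiness characteristic of Poisson arrivals.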
Choosing de-jitter buffering scheme
Traditionally, the choice of a de-jitter buffer algorithm was based purely on the trade-off between buffering delay and the resulting late-packet loss. Given that the purpose of playout buffering is to improve conversational speech quality, a more informed choice of algorithm can be made by considering its effect on user satisfaction. Figure 9 compares the performance of six different playout buffer algorithms in terms of percentages of user satisfaction.
Related publications:
Assessing the Quality of VoIP Transmission Affected by Playout Buffer Scheme, Mirosław Narbutt, Mark Davis, Proc. of the ETSI/IEE Measurement of Speech and Audio Quality in Networks Conference 2005 (MESAQIN 2005), Prague, June 2005.
Adaptive VoIP Playout scheduling: Assessing User Satisfaction, Mirosław Narbutt, Andrew Kelly, Liam Murphy, Philip Perry, IEEE Internet Computing Magazine, vol. 9, no. 4, July/August 2005.
Evaluating the relationship between free bandwidth and VoIP call quality
The main objective of the experiments was to evaluate wireless LAN utilization as the number of VoIP calls increases and how it influences VoIP call quality.
Figure 10 shows how the overall capacity of the wireless medium was shared between the three basic MAC bandwidth components (free, access, and load) when 11 simultaneous VoIP calls and 2 Mbps of background traffic were carried in the network, and how this influenced transmission impairments (delay, loss, and jitter) and thus call quality and overall user satisfaction.

11 VoIP calls + background traffic.
With 2.51% of the 11 Mbps capacity still free, playout (i.e. mouth-to-ear) delays remained below 25 ms and packet loss below 3%. In this case an average user would be satisfied 99% of the time.
Figure 11 shows the same information (the three basic MAC bandwidth components, transmission impairments, call quality and overall user satisfaction) when 14 simultaneous VoIP calls and 2 Mbps of background traffic were carried in the network.

14 VoIP calls + background traffic.
With no free bandwidth available, playout delays occasionally increased to 400ms and packet loss increased up to 20%.
The figures below show how the call quality decreases as the number of VoIP calls increases, how the amount of free bandwidth decreases as the number of VoIP calls increases and how the call quality depends on the availability of free bandwidth.

We found a close relationship between wireless bandwidth utilization and call quality. When the amount of free bandwidth dropped below 1%, call quality became unacceptable for all ongoing calls. We claim that the amount of free bandwidth is a good indicator for predicting VoIP call quality. Information on the MAC bandwidth components of this kind may be required for QoS provisioning and call admission schemes.
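As one possible use of this observation, the sketch below shows a hypothetical admission check based on the free bandwidth component. The function name and the per-call cost are illustrative assumptions that would need to be measured for a real deployment; only the 1% floor comes from the result above, and free bandwidth is taken to be whatever remains after the load and access components.

```python
def admit_call(load_pct, access_pct, per_call_cost_pct, min_free_pct=1.0):
    """Admit a new call only if enough free bandwidth would remain afterwards.

    All arguments are percentages of the total medium capacity.
    """
    free_pct = 100.0 - load_pct - access_pct
    return free_pct - per_call_cost_pct >= min_free_pct


if __name__ == "__main__":
    # Made-up figures: 10% free, each call assumed to consume 5%.
    print(admit_call(load_pct=60.0, access_pct=30.0, per_call_cost_pct=5.0))
    # Only 2.51% free: admitting a 2% call would leave less than 1%.
    print(admit_call(load_pct=65.0, access_pct=32.49, per_call_cost_pct=2.0))
```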
Related publications:
The Experimental investigation on VoIP performance and the resource utilization in 802.11b WLANs, Mirosław Narbutt, Mark Davis, IEE Conference Publication: IEEE Conference on Local Computer Networks (LCN'06), Tampa, November 2006
Effect of Free Bandwidth on VoIP Performance in 802.11b WLANs, Mirosław Narbutt, Mark Davis, IEE Irish Signals and Systems Conference 2006 (ISSC 2006), Dublin, June 2006
Gauging VoIP Call Quality from 802.11b Resource Usage, Mirosław Narbutt, Mark Davis, IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM06), Buffalo-NY, June 2006
Voice traffic prioritization in a WLAN
With hardware that supports a subset of the 802.11e functionality, it is possible to prioritize VoIP packets over other traffic types. In our experiments we prioritized voice traffic over background traffic by increasing the AIFSN or CWmin parameter of the background AC (i.e. AIFSN [AC_BK] or CWmin [AC_BK]). At the same time, all the parameters related to the voice AC were kept fixed. Figure 13 shows time-varying network delays, calculated playout delays, loss/delay distributions on the quality contours, and quality pie charts for four CWmin values: 3, 5, 7, and 9 respectively.
In a similar way, we experimentally investigated the impact of the AIFSN parameter on mixed voice/data wireless transmission. Figure 14 shows how the average voice transmission quality increases as AIFSN [AC_BK] increases (for 2 Mbps background traffic) and how this affects the effective background data throughput (i.e. the goodput).

As can be seen, voice transmission can be effectively protected from the influence of background traffic by using the 802.11e EDCA mechanism. Increasing AIFSN [AC_BK] essentially promotes the AC_VO queue at the expense of the AC_BK queue in terms of channel access probability. The bigger the difference in AIFSN values, the easier it is for the AC_VO queue to win transmission opportunities, which reduces delay and jitter and hence improves QoS.
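The effect can be illustrated with a deliberately simplified model: one voice and one background station, both with a frame queued when the medium becomes idle, each waiting AIFSN slots plus a uniformly drawn backoff. The sketch below enumerates all backoff pairs to obtain the probability that the voice station transmits first. It ignores retransmissions, frozen backoff counters and TXOP, counts ties as not won, and uses assumed contention window values, so it is not a model of our testbed.

```python
from fractions import Fraction


def voice_wins_probability(aifsn_vo, cw_vo, aifsn_bk, cw_bk):
    """P(voice starts strictly before background) in one contention round."""
    wins = 0
    for b_vo in range(cw_vo + 1):        # voice backoff, 0..cw_vo slots
        for b_bk in range(cw_bk + 1):    # background backoff, 0..cw_bk slots
            if aifsn_vo + b_vo < aifsn_bk + b_bk:
                wins += 1
    return Fraction(wins, (cw_vo + 1) * (cw_bk + 1))


if __name__ == "__main__":
    for aifsn_bk in (2, 4, 7, 10, 15):
        p = voice_wins_probability(aifsn_vo=2, cw_vo=7,
                                   aifsn_bk=aifsn_bk, cw_bk=31)
        print(f"AIFSN[AC_BK]={aifsn_bk:2d}  P(voice first)={float(p):.3f}")
```

In this toy model, once the AIFSN difference exceeds the voice contention window the background station can never pre-empt a waiting voice frame, which matches the trend described above.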