Communications Network Research Institute

VoIP over Wireless

Mirosław (Mirek) Narbutt

As VoIP goes wireless, this will present a challenge not only to fixed line operators, but to mobile operators.

Richard Webb, Infonetics

QoS requirements for VoIP

As VoIP spreads from the wireline to the wireless world, performance issues arise because the characteristics of wireline and wireless networks differ. Delay, jitter and packet loss, the key factors that affect packet voice quality in the fixed Internet, are magnified in a WLAN environment. To achieve PSTN-like voice quality, the impact of these transmission impairments must be minimized. The stringent quality of service (QoS) requirements for VoIP are shown in Fig. 1.

Fig. 1: Quality of service requirements for various multimedia applications (source: ITU-T Rec. G.1010)

Combating transmission impairments

Real-time voice transmission over packet networks imposes stringent requirements on one-way end-to-end delays and packet loss. The responsibility of meeting these requirements is shared between end-points and the underlying network.

Actions at the end-points

Much of the responsibility for compensating for packet loss, delay and jitter is delegated to the end-points. As long as transmission impairments remain below a certain level, actions at the end terminals can greatly mitigate their effects. Examples are given in Fig. 2.

Fig. 2: Network states, resulting transmission impairments and application-related mechanisms used to compensate for the effects in the network (ETSI TR/STQ-037)

Actions at the underlying network

On the network side there is considerable development activity in designing new architectures and protocols. Integrated Services (Int-Serv) mechanisms can provide QoS guarantees by adding circuit-like functionality (using RSVP). Differentiated Services (Diff-Serv) mechanisms enable service differentiation and prioritization of various traffic classes. For wireless LANs, the 802.11e MAC enhancement makes it possible to distinguish between different traffic categories contending for the same bandwidth. Wi-Fi Multimedia (WMM), which is based on a subset of 802.11e, defines four such categories: voice, video, best effort, and background. They are mapped to four output queues called access categories (ACs). A WMM-enabled end-user device or access point (AP) first classifies each traffic flow into one AC and directs it into the appropriate forwarding queue (see Fig. 3).

Fig. 3: Traffic prioritization for WMM access categories
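
The classification step can be sketched in a few lines. The mapping below from 802.1D user priority (0 to 7) to access category is the commonly documented WMM mapping and is assumed here. The names `classify`, `enqueue` and `queues` are hypothetical and are not taken from any driver.

```python
from collections import deque

# Assumed mapping of 802.1D user priority (0-7) to WMM access category.
UP_TO_AC = {
    1: "AC_BK", 2: "AC_BK",   # background
    0: "AC_BE", 3: "AC_BE",   # best effort
    4: "AC_VI", 5: "AC_VI",   # video
    6: "AC_VO", 7: "AC_VO",   # voice
}

# One forwarding queue per access category.
queues = {ac: deque() for ac in ("AC_VO", "AC_VI", "AC_BE", "AC_BK")}


def classify(user_priority: int) -> str:
    """Return the access category for a given user priority."""
    if user_priority not in UP_TO_AC:
        raise ValueError(f"user priority must be 0..7, got {user_priority}")
    return UP_TO_AC[user_priority]


def enqueue(frame, user_priority: int) -> str:
    """Classify a frame and append it to the queue of its access category."""
    ac = classify(user_priority)
    queues[ac].append(frame)
    return ac
```

In this sketch a voice frame marked with user priority 6 lands in the AC_VO queue, while unmarked traffic (priority 0) is treated as best effort.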

Each AC is characterized by its own set of channel access parameters, such as the arbitration inter-frame space (AIFS) and the contention window (CW) range.

Different levels of service for each AC are provided by differentiating the channel access parameters. For example, voice, as higher-priority traffic, receives a shorter AIFS value and/or a smaller CW range, so that it has a high probability of accessing the wireless medium before other (lower-priority) traffic types.
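
As a rough illustration of how these parameters separate the ACs, the sketch below computes the AIFS duration and the mean waiting time before a first transmission attempt on an idle channel. The 802.11b timing values (SIFS of 10 µs, slot time of 20 µs) and the per-AC values in `EDCA_DEFAULTS` are assumed typical WMM station defaults for 802.11b and are not the settings used in the experiments described on this page.

```python
SIFS_US = 10   # assumed 802.11b SIFS, microseconds
SLOT_US = 20   # assumed 802.11b slot time, microseconds

# Assumed WMM station defaults for 802.11b: (AIFSN, CWmin, CWmax).
EDCA_DEFAULTS = {
    "AC_VO": (2, 7, 15),
    "AC_VI": (2, 15, 31),
    "AC_BE": (3, 31, 1023),
    "AC_BK": (7, 31, 1023),
}


def aifs_us(aifsn: int) -> int:
    """AIFS = SIFS + AIFSN * slot time."""
    return SIFS_US + aifsn * SLOT_US


def mean_first_access_us(ac: str) -> float:
    """Mean idle time before the first attempt: AIFS plus the mean of a
    backoff drawn uniformly from 0..CWmin slots."""
    aifsn, cwmin, _cwmax = EDCA_DEFAULTS[ac]
    return aifs_us(aifsn) + (cwmin / 2) * SLOT_US


for ac in EDCA_DEFAULTS:
    print(ac, aifs_us(EDCA_DEFAULTS[ac][0]), mean_first_access_us(ac))
```

Under these assumed values the voice queue waits on average far less than the background queue before its first attempt, which is the differentiation described above.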

Cross-layer tuning

Quality enhancement mechanisms at various communications layers are often complex and difficult to configure. As an example, Figure 4 shows how proper tuning of de-jitter buffering at the end terminal can improve the quality of a VoIP call.

Fig. 4: Proper tuning of adaptive de-jitter buffering improves speech quality.

Tuning one parameter can often lead to a local performance improvement while having a disastrous effect on the overall call quality. For example, increasing the number of allowed retransmissions decreases packet loss, which can improve speech quality, but it also introduces variable retransmission delays. Variable delays need to be smoothed out before decoding, so de-jitter buffering is required, which introduces additional delay. As a result, decreasing packet loss improves speech quality (i.e. listening-only quality) but at the same time reduces the interactivity of the conversation and thus the overall call quality. If a part of the VoIP transmission path is being tuned, the impact of local tuning actions on both interactivity and speech quality has to be taken into account. Predicting the impact of local tuning actions on speech transmission quality is indispensable for proper QoS provisioning.

Fig. 5: Assessing the impact of local tuning actions on

Quality contours for predicting speech transmission quality

We proposed a method for predicting speech transmission quality from time-varying transmission impairments. The method is based on a reduction of the ITU-T E-model to transport-level time-varying metrics (packet loss and packet delay). Using the ITU-T E-model formula and the ITU-T categories of user satisfaction, we derived contours of quality as a function of mouth-to-ear delay and packet-loss ratio (for an assumed encoding scheme). Figure 6 shows the quality contours for the G.711 encoding scheme (assuming random packet loss) derived from the ITU-T E-model formulas.

Fig. 6: Quality contours for assessing conversational speech quality.
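
To make the idea of a quality contour concrete, here is a minimal sketch of a reduced E-model. It is not the exact derivation behind the figure: the delay impairment uses the well-known Cole and Rosenbluth piecewise-linear approximation, and the constants (default rating 93.2, Ie = 0 and Bpl = 25.1 for G.711 with packet loss concealment under random loss) are assumptions taken from commonly quoted ITU-T G.107/G.113 values. The function names are hypothetical.

```python
R_DEFAULT = 93.2  # assumed E-model rating with all other parameters at defaults


def delay_impairment(delay_ms: float) -> float:
    """Id, using the Cole-Rosenbluth approximation (an assumption here)."""
    i_d = 0.024 * delay_ms
    if delay_ms > 177.3:
        i_d += 0.11 * (delay_ms - 177.3)
    return i_d


def loss_impairment(loss_pct: float, ie: float = 0.0, bpl: float = 25.1) -> float:
    """Ie-eff for random loss: Ie + (95 - Ie) * Ppl / (Ppl + Bpl)."""
    return ie + (95.0 - ie) * loss_pct / (loss_pct + bpl)


def r_factor(delay_ms: float, loss_pct: float) -> float:
    """Rating R as a function of mouth-to-ear delay and packet loss only."""
    return R_DEFAULT - delay_impairment(delay_ms) - loss_impairment(loss_pct)


def contour_loss_pct(delay_ms: float, r_target: float,
                     ie: float = 0.0, bpl: float = 25.1):
    """Loss ratio (%) at which R equals r_target for the given delay,
    i.e. one point on a quality contour; None if delay alone is too high."""
    budget = R_DEFAULT - delay_impairment(delay_ms) - r_target
    if budget < ie:
        return None
    if budget >= 95.0:
        return 100.0
    return bpl * (budget - ie) / (95.0 - budget)


print(contour_loss_pct(100.0, 80.0))  # tolerable loss at 100 ms for R = 80
```

Sweeping `delay_ms` for a fixed `r_target` traces one contour in the loss/delay plane; repeating this for several rating thresholds gives a family of contours.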

Quality contours are a crucial part of predicting speech transmission quality affected by adaptive playout. The procedure is as follows:

Fig. 7: Predicting user satisfaction from time-varying transmission impairments.

With quality contours, the impact of delay and packet loss on conversational speech quality can be studied in two ways: either as the combined effect of loss and delay on overall quality, or as the individual contributions of packet loss to speech degradation and of playout delay to interactivity degradation. This is especially useful during parameter tuning, when a trade-off exists between packet delay and loss and the aim is to find the operating point where conversational quality is maximized. An additional advantage of the method is that it reports percentages of user satisfaction instead of a single quality score. The proposed method of predicting user satisfaction from time-varying transmission impairments has already been shown to be particularly effective in evaluating various playout buffer algorithms and in assessing VoIP performance in WLAN systems.
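
The step from a time series of ratings to percentages of user satisfaction can be sketched as follows. The band limits are the user-satisfaction categories commonly quoted from ITU-T G.109. The input is assumed to be one rating R per measurement interval, and the function names are hypothetical.

```python
from collections import Counter

# Lower bound of each band and its label (assumed G.109 categories).
BANDS = [
    (90.0, "very satisfied"),
    (80.0, "satisfied"),
    (70.0, "some users dissatisfied"),
    (60.0, "many users dissatisfied"),
    (50.0, "nearly all users dissatisfied"),
]
LOWEST = "not recommended"


def band(r: float) -> str:
    """Map a rating R to its user-satisfaction category."""
    for lower, label in BANDS:
        if r >= lower:
            return label
    return LOWEST


def satisfaction_shares(r_values):
    """Percentage of intervals spent in each category."""
    if not r_values:
        raise ValueError("need at least one rating")
    counts = Counter(band(r) for r in r_values)
    labels = [label for _, label in BANDS] + [LOWEST]
    return {label: 100.0 * counts[label] / len(r_values) for label in labels}


print(satisfaction_shares([92.0, 85.0, 85.0, 40.0]))
```

The resulting dictionary corresponds to the slices of a quality pie chart: each entry is the share of time the call spent in one category.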

Related publications:

Draft Appendix I to ITU-T Recommendation G.109 (COM12-C32-E) The E-model based quality contours for predicting speech transmission quality and user satisfaction from time varying transmission impairments Mirosław Narbutt, Joachim Pomy, SG12 ITU-T Meeting, Geneva, January 2007

ITU-T Recommendation G.109 Appendix I (01/2007) The E-model based quality contours for predicting speech transmission quality and user satisfaction from time-varying transmission impairments


Experimental VoWLAN Testbed

The VoWLAN testbed consists of 15 desktop PCs acting as wireless VoIP terminals, one desktop PC acting as a background traffic generator, and one desktop PC acting as an access point (AP). All machines in the testbed use 802.11b PCMCIA wireless cards based on Atheros chipsets, controlled by MadWiFi wireless drivers under Linux (kernel 2.6.9). All of the nodes are also equipped with a 100 Mbps wired Ethernet interface, so each PC has two interfaces: one on the wireless subnet and one on the wired subnet. The PC that acts as an access point routes traffic between the wired network and the wireless clients, and vice versa. During experiments each VoIP terminal runs one VoIP session and all sessions are bi-directional, so each terminal acts as the source of an uplink flow and the sink of a downlink flow for a VoIP session. The wired interface is used to generate background traffic, which is routed via the AP to the wireless interface of the same PC. All generated traffic involves a wired and a wireless interface, so no traffic is generated directly between wireless interfaces. The testbed is illustrated in Figure 8.

Fig. 8: Experimental testbed

VoIP traffic is generated using RTP tools, whereas background traffic, in the form of a Poisson-distributed UDP packet flow, is generated using the MGEN traffic generator. During experiments all experimental data (packet arrival times, timestamps, sequence numbers, and marker bits) are collected at the receiving terminal and processed later (off-line).
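
For readers unfamiliar with Poisson traffic, the sketch below generates send times with exponentially distributed inter-arrival gaps for a given average bit rate. It only illustrates the traffic model and is not how MGEN is configured. The packet size of 1000 bytes is a hypothetical value, since the payload size used in the experiments is not stated here.

```python
import random


def poisson_send_times(rate_bps: float, packet_bytes: int,
                       duration_s: float, seed=None):
    """Send times (seconds) of a Poisson packet flow with the given mean rate."""
    rng = random.Random(seed)
    packets_per_s = rate_bps / (packet_bytes * 8)
    t = 0.0
    times = []
    while True:
        # Exponential gaps between packets give a Poisson arrival process.
        t += rng.expovariate(packets_per_s)
        if t >= duration_s:
            break
        times.append(t)
    return times


# 2 Mbps of background traffic with hypothetical 1000-byte packets for 20 s.
times = poisson_send_times(2_000_000, 1000, 20.0, seed=1)
print(len(times) / 20.0)  # average packets per second, close to 250
```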

Choosing de-jitter buffering scheme

Traditionally, the choice of a de-jitter buffer algorithm was based purely on the trade-off between buffering delay and the resulting late-packet loss. Given that the purpose of playout buffering is to improve conversational speech quality, a more informed choice of algorithm can be made by considering its effect on user satisfaction. Figure 9 compares the performance of six playout buffer algorithms in terms of percentages of user satisfaction.
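
As background, the exponential-average estimator usually attributed to Ramjee et al. can be sketched as follows: the delay estimate and its variation are smoothed with a weighting factor α, and the playout delay is set to the estimate plus four times the variation. This is a simplified per-packet version for illustration. In the published algorithm the playout delay is only adjusted at the start of a talkspurt, and the code below is not the implementation evaluated in the figure.

```python
def ramjee_playout_delays(network_delays_ms, alpha=0.99):
    """Per-packet playout delay estimates: d_hat + 4 * v_hat."""
    if not network_delays_ms:
        return []
    d_hat = float(network_delays_ms[0])  # smoothed delay estimate
    v_hat = 0.0                          # smoothed delay variation
    playout = []
    for n in network_delays_ms:
        d_hat = alpha * d_hat + (1.0 - alpha) * n
        v_hat = alpha * v_hat + (1.0 - alpha) * abs(d_hat - n)
        playout.append(d_hat + 4.0 * v_hat)
    return playout


def late_loss_pct(network_delays_ms, playout_ms):
    """Share of packets arriving after their scheduled playout time."""
    late = sum(1 for n, p in zip(network_delays_ms, playout_ms) if n > p)
    return 100.0 * late / len(network_delays_ms)


delays = [50.0] * 5 + [150.0] + [50.0] * 5
for a in (0.99, 0.9):
    p = ramjee_playout_delays(delays, alpha=a)
    print(a, round(max(p), 1), late_loss_pct(delays, p))
```

A value of α close to 1 reacts slowly to delay spikes (lower buffering delay, more late loss), while a smaller α tracks spikes faster at the cost of larger playout delays, which is the trade-off discussed above.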

Fig. 9: Performance of various playout algorithms in terms of user satisfaction: (a) Ramjee's alg. with α=0.99, (b) Ramjee's alg. with α=0.9, (c) "Concord" alg., (d) Moon's alg., (e) Bolot's alg., (f) the "dynamic α" alg.

Related publications:

Assessing the Quality of VoIP Transmission Affected by Playout Buffer Scheme, Mirosław Narbutt, Mark Davis, Proc. of the ETSI/IEE Measurement of Speech and Audio Quality in Networks Conference 2005 (MESAQIN 2005), Prague, June 2005.

Adaptive VoIP Playout scheduling: Assessing User Satisfaction, Mirosław Narbutt, Andrew Kelly, Liam Murphy, Philip Perry, IEEE Internet Computing Magazine, vol. 9, no. 4, July/August 2005.

Evaluating the relationship between free bandwidth and VoIP call quality

The main objective of the experiments was to evaluate how wireless LAN utilization changes as the number of VoIP calls increases, and how this influences VoIP call quality.

Figure 10 shows how the overall capacity of the wireless medium was shared between the three basic MAC bandwidth components (free, access, and load) when 11 simultaneous VoIP calls and 2 Mbps of background traffic were carried in the network. It also shows how this sharing influenced the transmission impairments (delay, loss, and jitter) and thus call quality and overall user satisfaction.

Fig. 10: Influence of free bandwidth on VoIP quality (11 VoIP calls + background traffic).

With 2.51% of the 11 Mbps capacity available as free bandwidth, playout delays (i.e. mouth-to-ear delays) are below 25 ms and packet loss is below 3%. In this case an average user would be satisfied 99% of the time.

Figure 11 shows the three basic MAC bandwidth components when 14 simultaneous VoIP calls and 2 Mbps of background traffic were carried in the network, together with the resulting transmission impairments (delay, loss, and jitter), call quality and overall user satisfaction.

Fig. 11: Influence of free bandwidth on VoIP quality (14 VoIP calls + background traffic).

With no free bandwidth available, playout delays occasionally increased to 400ms and packet loss increased up to 20%.

The figures below show how the call quality decreases as the number of VoIP calls increases, how the amount of free bandwidth decreases as the number of VoIP calls increases and how the call quality depends on the availability of free bandwidth.

Fig. 12: Influence of free bandwidth on VoIP.

We found a close relationship between wireless bandwidth utilization and call quality. When the amount of free bandwidth dropped below 1%, call quality became unacceptable for all ongoing calls. We claim that the amount of free bandwidth is a good indicator for predicting VoIP call quality. This kind of information on MAC bandwidth components may be required for potential QoS provisioning and call admission schemes.
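
A call admission rule built on this observation could look like the sketch below. It is purely illustrative: the 1% floor comes from the observation above, but `est_call_cost_pct` (the share of capacity a new call is expected to consume) is a hypothetical input that would have to be measured or estimated for a given network, and no such scheme is described on this page.

```python
FREE_BW_FLOOR_PCT = 1.0  # below this, call quality was found unacceptable


def admit_call(free_bw_pct: float, est_call_cost_pct: float) -> bool:
    """Admit a new call only if the free bandwidth expected to remain
    afterwards stays above the floor (hypothetical admission rule)."""
    if free_bw_pct < 0 or est_call_cost_pct < 0:
        raise ValueError("percentages must be non-negative")
    return free_bw_pct - est_call_cost_pct > FREE_BW_FLOOR_PCT


print(admit_call(10.0, 3.0))   # enough headroom remains
print(admit_call(2.51, 3.0))   # would exhaust the free bandwidth
```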

Related publications:

The Experimental investigation on VoIP performance and the resource utilization in 802.11b WLANs, Mirosław Narbutt, Mark Davis, IEE Conference Publication: IEEE Conference on Local Computer Networks (LCN'06), Tampa, November 2006

Effect of Free Bandwidth on VoIP Performance in 802.11b WLANs, Mirosław Narbutt, Mark Davis, IEE Irish Signals and Systems Conference 2006 (ISSC 2006), Dublin, June 2006

Gauging VoIP Call Quality from 802.11b Resource Usage, Mirosław Narbutt, Mark Davis, IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM06), Buffalo-NY, June 2006

Voice traffic prioritization in a WLAN

With hardware that supports a subset of the 802.11e functionality, it is possible to prioritize VoIP packets over other traffic types. In our experiments we prioritized voice traffic over background traffic by increasing the AIFSN or CWmin parameters of the background AC (i.e. AIFSN [AC_BK] or CWmin [AC_BK]). At the same time, all the parameters of the voice AC were kept fixed. Figure 13 shows time-varying network delays, calculated playout delays, loss/delay distributions on the quality contours, and quality pie charts for four CWmin [AC_BK] values: 3, 5, 7, and 9.

Fig. 13: Packet delays, playout delays, loss/delay distribution and user satisfaction when (a) CWmin [AC_BK] = 3, (b) CWmin [AC_BK] = 5, (c) CWmin [AC_BK] = 7, (d) CWmin [AC_BK] = 9.

In a similar way we experimentally investigated the impact of the AIFSN parameter on mixed voice/data wireless transmission. Figure 14 shows how the average voice transmission quality increases as AIFSN [AC_BK] increases (for 2 Mbps background traffic) and how this affects the effective background data throughput (i.e. the goodput).

Fig. 14: Average voice transmission quality for 15 VoIP stations as a function of AIFSN [AC_BK] with 2 Mbps background traffic, and effective throughput (goodput) of the 2 Mbps background traffic vs. AIFSN [AC_BK].

As can be seen, voice transmission can be effectively protected from the influence of background traffic by using the 802.11e EDCA mechanism. Increasing AIFSN [AC_BK] essentially promotes the AC_VO queue at the expense of the AC_BK queue in terms of channel access probability. The bigger the difference in AIFSN values, the easier it is for the AC_VO queue to win transmission opportunities, which reduces delay and jitter and hence improves QoS.
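
The effect of the AIFSN difference can be illustrated with a deliberately idealized model: one AC_VO frame and one AC_BK frame start contending on an idle channel, each waits its AIFSN plus a backoff drawn uniformly from its contention window (all in slot units), and the shorter wait wins. Collisions, backoff freezing and retries are ignored, and the default window sizes are assumptions, so the numbers indicate the trend only and are not the measured results.

```python
def vo_wins_first_access(aifsn_bk: int, aifsn_vo: int = 2,
                         cw_vo: int = 7, cw_bk: int = 15) -> float:
    """Probability that AC_VO accesses the channel strictly before AC_BK,
    computed by enumerating all equally likely backoff pairs."""
    wins = 0
    total = (cw_vo + 1) * (cw_bk + 1)
    for b_vo in range(cw_vo + 1):
        for b_bk in range(cw_bk + 1):
            if aifsn_vo + b_vo < aifsn_bk + b_bk:
                wins += 1
    return wins / total


for aifsn_bk in range(2, 11):
    print(aifsn_bk, round(vo_wins_first_access(aifsn_bk), 3))
```

In this model the probability grows with AIFSN [AC_BK] and reaches 1 once the background queue's AIFSN exceeds the longest possible wait of the voice queue.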