
A Rohde & Schwarz Company

Reference Speech Signals for SQuad Measurements
Manual
June 2012

SwissQual License AG
Allmendweg 8 CH-4528 Zuchwil Switzerland
t +41 32 686 65 65 f +41 32 686 65 66 e info@swissqual.com
www.swissqual.com
Part Number: 16-070-200349/3 Rev 1.3

SwissQual has made every effort to ensure that any instructions contained in the document are adequate and free
of errors and omissions. SwissQual will, if necessary, explain issues which may not be covered by the documents.
SwissQual's liability for any errors in the documents is limited to the correction of errors and the aforementioned advisory
services.
Copyright 2000 - 2012 SwissQual AG. All rights reserved.
No part of this publication may be copied, distributed, transmitted, transcribed, stored in a retrieval system, or translated
into any human or computer language without the prior written permission of SwissQual AG.
Confidential materials.
All information in this document is regarded as commercially valuable, protected and privileged intellectual property, and is
provided under the terms of existing Non-Disclosure Agreements or as commercial-in-confidence material.
When you refer to a SwissQual technology or product, you must acknowledge the respective text or logo trademark
somewhere in your text.
SwissQual, Seven.Five, SQuad, QualiPoc, NetQual, VQuad, Diversity as well as the following logos are
registered trademarks of SwissQual AG.

Diversity Explorer, Diversity Ranger, Diversity Unattended, NiNA+, NiNA, NQAgent, NQComm, NQDI,
NQTM, NQView, NQWeb, QPControl, QPView, QualiPoc Freerider, QualiPoc iQ, QualiPoc Mobile,
QualiPoc Static, QualiWatch-M, QualiWatch-S, SystemInspector, TestManager, VMon, VQuad-HD are
trademarks of SwissQual AG.
SwissQual acknowledges the following trademarks for company names and products:
Adobe, Adobe Acrobat, and Adobe Postscript are trademarks of Adobe Systems Incorporated.
Apple is a trademark of Apple Computer, Inc.
DIMENSION, LATITUDE, and OPTIPLEX are registered trademarks of Dell Inc.
ELEKTROBIT is a registered trademark of Elektrobit Group Plc.
Google is a registered trademark of Google Inc.
Intel, Intel Itanium, Intel Pentium, and Intel Xeon are trademarks or registered trademarks of Intel Corporation.
INTERNET EXPLORER, SMARTPHONE, TABLET are registered trademarks of Microsoft Corporation.
Java is a U.S. trademark of Sun Microsystems, Inc.
Linux is a registered trademark of Linus Torvalds.
Microsoft, Microsoft Windows, Microsoft Windows NT, and Windows Vista are either registered trademarks or
trademarks of Microsoft Corporation in the United States and/or other countries.
NOKIA is a registered trademark of Nokia Corporation.
Oracle is a registered US trademark of Oracle Corporation, Redwood City, California.
SAMSUNG is a registered trademark of Samsung Corporation.
SIERRA WIRELESS is a registered trademark of Sierra Wireless, Inc.
TRIMBLE is a registered trademark of Trimble Navigation Limited.
U-BLOX is a registered trademark of u-blox Holding AG.
UNIX is a registered trademark of The Open Group.

Reference Speech Signals for SQuad Measurements Manual


2000 - 2012 SwissQual AG

Contents

1 Pre-Filtering of Reference Speech Material
    Narrowband (Telephony) Applications
    Wideband (Telephony) Applications

2 SQuad-LQ Speech Quality Measurements
    Basics
    SQuad-LQ Speech - Design of Samples
    SwissQual Speech Material - Narrowband
    SwissQual Speech Material - Wideband

3 SQuad-NS Noise Suppression Measurement
    Basics
    SQuad-NS Speech Material
    SwissQual Speech Material

4 SQuad-AEC (Passive) Passive Echo Disturbance Measurement
    Basics
    SQuad-AEC (Passive) Speech Material
    SwissQual Speech Material

5 SQuad-AEC (Active) Active Echo Disturbance Measurement
    Basics
    SQuad-AEC (Active) Speech Material
    SwissQual Speech Material

6 SQuad-RTT Round Trip Time Measurement
    Basics
    Speech-Like Sequences
    SwissQual Speech Material

Tables

Table 2-1 Description of the settings for an SQuad-LQ measurement
Table 2-2 Description of the IRS pre-filtered reference speech samples for narrowband samples
Table 2-3 Description of the non-IRS pre-filtered reference speech samples for wideband scenarios
Table 2-4 Description of the WB-IRS pre-filtered reference speech samples for wideband scenarios
Table 3-1 Description of the settings for an SQuad-NS measurement
Table 3-2 Description of the prefix for a reference speech sample
Table 4-1 Description of the settings for an SQuad-AEC (Passive) measurement
Table 5-1 Description of the settings for an SQuad-AEC (Active) measurement
Table 5-2 Description of the settings for a double-talk SQuad-AEC (Active) measurement
Table 5-3 Description of the double-talk reference speech samples
Table 6-1 Description of the characteristics for a SQuad-RTT measurement

Pre-Filtering of Reference Speech Material

The Quality of Service (QoS) measurements that you perform with SwissQual equipment are designed to
reflect the quality that a subscriber experiences. For best results, SwissQual recommends
that you use human speech references for most of your measurements and speech-like test signals for
special measurements. Only human speech samples allow for an in-band transmission of the reference
signal and ensure that the transmission component reacts correctly.
For best results, use the following guidelines when you create a speech sample:

• Record in a low noise environment with high quality equipment
• Avoid long reverberation times in the complete frequency range of the speaker's environment
• Include male and female voices as well as human utterances that are typical for telephone conversations
• Ensure that the text is well-balanced from a phonological point of view

SwissQual equipment is designed to be connected to the electrical interface of the sending-side.
Accordingly, the acoustical behaviour of a sending device has to be modelled by the measurement
equipment in use.

Narrowband (Telephony) Applications


Conventionally shaped handsets tend to show a weak high-pass characteristic, or pre-emphasis, in the
sending direction, which means that the terminal filters the real spoken voice at the microphone before the
signal is transmitted. If the sending interface of the measurement equipment is the network termination point
(two-wire analog or ISDN), the filtering that is normally done by the handset must be modelled by the
measurement equipment. To create this model, the ITU-T recommends an IRS (Intermediate Reference
System) characteristic within Recommendations P.48 and P.830.
The IRS (send) filter is defined for traditional narrowband applications (up to 3.4 kHz) and for traditional
wideband applications (up to 7 kHz).
In narrowband scenarios (traditional telephony band), the usual behaviour of a handset is similar to the IRS
(send). To realize a normative input signal, SwissQual recommends that you use the IRS pre-filtered signals
as the input signal to the headset connector. Along with a built-in filter in a Diversity MCM (Mobile Connect
Module), this connector can be considered as flat. When you use an IRS (send) pre-filtered input signal,
exactly one IRS handset is emulated.
For this reason, SwissQual provides pre-filtered reference files. These files can be sent directly from the
electrical interface of the connection to emulate a microphone.
These pre-filtered speech signals should also be used as a high quality reference for the SQuad-LQ
measurements in narrowband scenarios. In this way, the differences between the optimal speech signal in a
telephone connection (IRS pre-filtered, but completely undistorted) and the transmitted one are taken into
account. An MCM (Radio-Interface-Manager) provides an interface that is similar to a 4-wire network
termination point and can also be used with IRS pre-filtered speech signals.

Wideband (Telephony) Applications


The ITU-T also defines an IRS (send) filter for traditional wideband scenarios. However, typical test cases for
wideband telephony services tend to prefer unfiltered flat signals.
To serve wideband and super-wideband (up to 14 kHz) test cases, SwissQual provides all reference speech
material for the speech-Wideband test type. This material has a sampling frequency of 32 kHz and an
effective audio bandwidth of 50 Hz to 14000 Hz. This signal can be used to directly feed the electrical headset
connector from a Diversity MCM.
For special wideband applications, the use of IRS (send) filtered speech material might be required. For
such scenarios, SwissQual provides wideband IRS (send) signals.
Disclaimer:
You can only use SwissQual speech material with SwissQual products such as Diversity and QualiPoc. Use
of this material with non-SwissQual products as well as further distribution or deployment is not permitted.


SQuad-LQ Speech Quality Measurements

Basics
The measurement of listening quality is based on a comparison between a high quality, undegraded speech
sample, which is used as the input signal, and the transmitted and possibly distorted signal that is recorded
at the output of the connection. A psychoacoustic model is applied to both signals, after which all
perceptible differences are measured. The result of these measurements forms the overall listening quality
score. Because linear distortions (frequency response) also influence the score, the selection of the input
signal can also depend on the sending interface that is used.

SQuad-LQ Speech - Design of Samples


The SQuad-LQ algorithm calculates the listening quality for any arbitrary speech signal, where the
characteristics of the speech material that you use can influence the end result.
To obtain representative and reproducible measurements, the speech sample should reflect typical human
utterances in a telephone conversation.
For auditory tests that are in accordance with ITU-T P.800, short sentence pairs are used. Both sentences
of a pair are spoken by the same speaker. The average of the scores for several of these sentence pairs forms the
mean opinion score.
For network measuring purposes, the transmission of separate files over a longer period is not an acceptable
solution. For this reason, SwissQual recommends speech clips that contain at least two sentences from a
male and a female native speaker.
Note: The sentences are selected to avoid QoS dependencies on the text.
Table 2-1 Description of the settings for an SQuad-LQ measurement

Setting               Description
Length                6.0 s
Speech Activity       approximately 70 %
Structure             Two sentences, pause between sentences > 0.5 s
Speaker               Male and female native speakers
Sampling frequency    16 kHz (for narrowband telephony)
                      32 kHz (for wideband telephony)
File Format           WAVE, 16 bit, INTEL
Level                 -26.0 dB OVL
Pre-Filtering         ITU-T Rec. P.830, mod. IRS (send)

If you want to use your own speech material, SwissQual strongly recommends a minimum sample length of
5 seconds of which at least 50% contains speech activity.
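
Before you deploy your own speech material, a quick sanity check of the file properties can help. The following Python sketch is not part of any SwissQual tool; the function names are hypothetical. It reads a mono 16-bit WAVE file, reports sampling frequency and length, and checks them against the minimum length and the sampling frequencies stated above. The level calculation assumes that "dB OVL" can be approximated by the overall RMS relative to digital overload; it is not an active speech level measurement, and speech activity is not assessed.

```python
import math
import struct
import wave


def describe_wav(path):
    """Read a mono 16-bit PCM WAVE file and return its basic properties."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono, 16-bit PCM")
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    count = len(raw) // 2
    # WAVE sample data is little-endian ("INTEL" byte order).
    samples = struct.unpack("<%dh" % count, raw[: 2 * count])
    # Assumption: level = overall RMS relative to digital overload (32768).
    mean_square = sum(s * s for s in samples) / count if count else 0.0
    if mean_square > 0:
        level_db = 10.0 * math.log10(mean_square / 32768.0 ** 2)
    else:
        level_db = float("-inf")
    return {"rate_hz": rate, "length_s": count / rate, "level_db": level_db}


def check_custom_sample(info, min_length_s=5.0, rates_hz=(16000, 32000)):
    """Return a list of problems; an empty list means the basic checks pass."""
    problems = []
    if info["rate_hz"] not in rates_hz:
        problems.append("unexpected sampling frequency: %d Hz" % info["rate_hz"])
    if info["length_s"] < min_length_s:
        problems.append("sample shorter than %.1f s" % min_length_s)
    return problems
```

A call such as check_custom_sample(describe_wav("my_sample.wav")) returns an empty list when the file passes these basic checks; the file name is only an example.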

SwissQual Speech Material - Narrowband


For illustration purposes, SwissQual provides speech material in different languages in accordance with the ITU-T P.800 and ITU-T P.862.3 recommendations. For consistent results, SwissQual recommends the IRS pre-filtered speech samples in Table 2-2 for narrowband telephony scenarios.


Table 2-2 Description of the IRS pre-filtered reference speech samples for narrowband samples

Reference Sample    Description
am_fm_IRS.wav       American English, male+female
ar_fm_IRS.wav       Arabic, male+female
ch_fm_IRS.wav       German, Swiss pronunciation, male+female
cn_fm_IRS.wav       Chinese Mandarin, male+female
en2_fm_IRS.wav      British English, male+female
                    Note: The en2_fm_IRS.wav file replaces the en_fm_IRS.wav file, which is
                    still included for existing deployments. For new deployments, use the
                    en2_fm_IRS.wav reference sample. (1)
fr_fm_IRS.wav       French, male+female
                    Note: This sample systematically yields slightly lower MOS values than the
                    other language reference samples in comparable situations. This discrepancy
                    might be the result of the generation and recording process of the French
                    source material.
ge_fm_IRS.wav       German, male+female
gr_fm_IRS.wav       Greek, male+female
hu_fm_IRS.wav       Hungarian, male+female
it_fm_IRS.wav       Italian, male+female
jp_fm_IRS.wav       Japanese, male+female
pl_fm_IRS.wav       Polish, male+female
pt_fm_IRS.wav       Portuguese, male+female
ru_fm_IRS.wav       Russian, male+female
sp_fm_IRS.wav       Spanish, male+female
tk_fm_IRS.wav       Turkish, male+female

On request, SwissQual can provide all speech material for narrowband as unfiltered (flat) source material. In
addition to the 6 s samples, an 11 s sample in American English (AM_CallQual_IRS.wav) is also provided,
which you can use for Call Quality measurements.

SwissQual Speech Material - Wideband


For illustration purposes, SwissQual provides speech material in different languages in accordance with the ITU-T P.800 and ITU-T P.862.3 recommendations. SwissQual recommends the non-IRS pre-filtered speech
samples in Table 2-3 for wideband telephony scenarios.
Table 2-3 Description of the non-IRS pre-filtered reference speech samples for wideband scenarios

Reference Sample    Description
am_fm_wide.wav      American English, male+female
du_fm_wide.wav      Dutch, male+female
ch_fm_wide.wav      German, Swiss pronunciation, male+female
en_fm_wide.wav      British English, male+female
ge_fm_wide.wav      German, male+female
it_fm_wide.wav      Italian, male+female

(1) SwissQual would like to thank Psytechnics Ltd, UK, for their kind permission to use their British English source material
to generate the new speech sample.

For special purposes, all speech material in wideband is also available with WB-IRS (send) pre-filtering.
Note: The use of wideband material with IRS pre-filtering can lead to a recognizable limitation in audio
bandwidth. In speech wideband test cases, this limitation is scored as a degradation.
Table 2-4 Description of the WB-IRS pre-filtered reference speech samples for wideband scenarios

Reference Sample        Description
am_fm_IRS_wide.wav      American English, male+female
du_fm_IRS_wide.wav      Dutch, male+female
ch_fm_IRS_wide.wav      German, Swiss pronunciation, male+female
en_fm_IRS_wide.wav      British English, male+female
ge_fm_IRS_wide.wav      German, male+female
it_fm_IRS_wide.wav      Italian, male+female


SQuad-NS Noise Suppression Measurement

Basics
The main parameter of the SQuad Noise Suppression (NS) measurement assesses the improvement or
degradation of the noisy speech sample during transmission by comparing the input of the noisy speech
sample to the output sample. Other parameters assess the noise suppression and level deviations. The
SQuad-NS measurement requires the following reference signals:

• Noise-free reference speech signal
• Same noise-free signal but mixed with an additive noise signal

SQuad-NS Speech Material


You can use an arbitrary speech signal in combination with a noise signal to run SQuad-NS. However, a proper
SQuad-NS measurement requires both the noise-free (clean) reference sample and the sample with
the background noise (noisy sample). Furthermore, the speech signal must conform to
the signals that are described for SQuad-LQ. The initial pause of the signal must be a
minimum of 2.0 s. To account for the so-called Lombard effect, the speech level must be 3 dB above the
recommended value for SQuad-LQ.
Table 3-1 Description of the settings for an SQuad-NS measurement

Setting               Description
Length                8.0 s
Speech Activity       approx. 50 %
Structure             Two sentences, initial pause > 2.0 s, pause between the sentences > 0.5 s
Speaker               Male and/or female native speakers
Sampling frequency    16 kHz
File Format           WAVE, 16 bit, INTEL
Speech level          -23.0 dB OVL
Noise level           -23.0 to -50.0 dB OVL (recommended)
Pre-Filtering         ITU-T Rec. P.830, mod. IRS (send)

SwissQual Speech Material


SwissQual provides IRS pre-filtered speech material in different languages. This material is based on the
reference samples that are recommended for SQuad-LQ. The reference speech samples are available with
the following types of background noise:

• In-car (stationary)
• Street noise (non-stationary)

Each of these noise types is mixed with each of the speech samples at four different levels:

• -26 dB OVL noise level (SNR = 3 dB)
• -32 dB OVL noise level (SNR = 9 dB)
• -38 dB OVL noise level (SNR = 15 dB)
• -44 dB OVL noise level (SNR = 21 dB)
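
The SNR values in this list follow from the SQuad-NS speech level of -23.0 dB OVL minus the respective noise level. The short Python sketch below reproduces this arithmetic and also shows the linear gain that would shift a noise signal from one level to another. The function names are illustrative only and do not refer to any SwissQual software.

```python
SPEECH_LEVEL_DB_OVL = -23.0  # SQuad-NS speech level from Table 3-1


def snr_db(noise_level_db_ovl, speech_level_db_ovl=SPEECH_LEVEL_DB_OVL):
    """Signal-to-noise ratio in dB as the difference of the two levels."""
    return speech_level_db_ovl - noise_level_db_ovl


def gain_for_level_shift(current_level_db, target_level_db):
    """Linear amplitude factor that moves a signal to the target level."""
    return 10.0 ** ((target_level_db - current_level_db) / 20.0)


for noise_level in (-26.0, -32.0, -38.0, -44.0):
    print("noise %.0f dB OVL -> SNR %.0f dB" % (noise_level, snr_db(noise_level)))
```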


The filenames of the reference speech samples provide a description of the file content. For example, the
Am_fm_IRS_16k_car_32.wav file is the American English speech sample that has been mixed with car
noise at a noise level of -32 dB OVL. The Am_fm_IRS_16k_car_32_clean.wav file is the corresponding speech sample without
the background noise.
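
If you manage many of these files in a script, the naming scheme can be decoded automatically. The following Python sketch is based only on the example file names in this manual; the regular expression, the function name, and the interpretation of the individual fields are assumptions and not an official SwissQual specification.

```python
import re

# Assumed naming scheme, inferred from the example file names in this manual.
_NS_NAME = re.compile(
    r"^(?P<prefix>[A-Za-z0-9]+)_fm_IRS_16k_"
    r"(?P<noise>car|str)_(?P<level>\d+)(?P<clean>_clean)?\.wav$"
)

_NOISE_TYPES = {"car": "in-car", "str": "street"}


def parse_ns_filename(name):
    """Split an SQuad-NS sample name into its parts, or return None."""
    match = _NS_NAME.match(name)
    if match is None:
        return None
    return {
        "prefix": match.group("prefix"),
        "noise_type": _NOISE_TYPES[match.group("noise")],
        # For a clean file, this is the noise level of the paired noisy file.
        "noise_level_db_ovl": -int(match.group("level")),
        "clean": match.group("clean") is not None,
    }


print(parse_ns_filename("Am_fm_IRS_16k_car_32.wav"))
```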
For measurements in English, the following reference samples are recommended:

• Am_fm_IRS_16k_car_26.wav / Am_fm_IRS_16k_car_26_clean.wav
• Am_fm_IRS_16k_car_44.wav / Am_fm_IRS_16k_car_44_clean.wav

The same speech signals are also available mixed with street noise:

• Am_fm_IRS_16k_str_26.wav / Am_fm_IRS_16k_str_26_clean.wav
• Am_fm_IRS_16k_str_44.wav / Am_fm_IRS_16k_str_44_clean.wav

The following table lists the other languages for which SwissQual provides similar reference samples.
Table 3-2 Description of the prefix for a reference speech sample

Reference Sample    Language
En_*.wav            English
Ge_*.wav            German
Gr_*.wav            Greek
It_*.wav            Italian
Jp_*.wav            Japanese
Ru_*.wav            Russian
Sp_*.wav            Spanish

For best results, use the SwissQual reference sample files.


Note: You can use custom reference material if the material fulfills the defined requirements.


SQuad-AEC (Passive) Passive Echo Disturbance Measurement

Basics
The SQuad-AEC (passive) algorithm searches for reflections (echoes) of a sent speech signal. If a reflection is present,
the algorithm calculates the delay of the reflection with respect to the sent signal and, if applicable, the echo return loss
of the reflected signal. Side-tones, that is, reflections with a delay of less than 20 ms, are ignored by both
algorithms (passive and active), but are still indicated.
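
To illustrate the two quantities that are named here, the following Python sketch estimates the delay and the echo return loss of a single reflection with a plain cross-correlation. This is a generic textbook approach and not the SQuad-AEC algorithm, whose internals are not described in this manual. All names are hypothetical; only the 20 ms side-tone limit is taken from the text above.

```python
import math

SIDE_TONE_LIMIT_S = 0.020  # reflections faster than 20 ms count as side-tone


def estimate_echo(sent, received, rate_hz, max_delay_s):
    """Return (delay_s, erl_db) for the strongest reflection, or None."""
    energy = sum(x * x for x in sent)
    if energy == 0:
        return None
    max_lag = min(int(round(max_delay_s * rate_hz)), len(received) - 1)
    best_lag, best_corr = 0, 0.0
    for lag in range(max_lag + 1):
        span = min(len(sent), len(received) - lag)
        # Correlate the sent signal with the received signal shifted by lag.
        corr = sum(sent[i] * received[i + lag] for i in range(span))
        if abs(corr) > abs(best_corr):
            best_lag, best_corr = lag, corr
    if best_corr == 0:
        return None
    # Least-squares amplitude of the echo relative to the sent signal.
    gain = abs(best_corr) / energy
    return best_lag / rate_hz, -20.0 * math.log10(gain)


def is_side_tone(delay_s):
    """True if a reflection is fast enough to be treated as side-tone."""
    return delay_s < SIDE_TONE_LIMIT_S
```

The brute-force loop is only practical for short signals; a real implementation would typically use an FFT-based correlation.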

SQuad-AEC (Passive) Speech Material


You can use an arbitrary speech signal to measure the passive echo disturbance with the SQuad-AEC
(passive) algorithm.
Table 4-1 Description of the settings for an SQuad-AEC (Passive) measurement

Setting               Description
Length                > 12.0 s
Speech Activity       > 90 %
Structure             Continuous speech
Speaker               Male and/or female native speakers
Sampling frequency    16 kHz
File Format           WAVE, 16 bit, INTEL
Speech level          -23.0 to -29.0 dB OVL
Pre-Filtering         ITU-T Rec. P.830, mod. IRS (send)

SwissQual Speech Material


squad_aec.wav


SQuad-AEC (Active) Active Echo Disturbance Measurement

Basics
The SQuad-AEC (active) algorithm searches for reflections (echoes) of a sent speech signal that is
generated actively by the far-end side and calculates the echo delay as well as the echo return loss of the
residual echo. Side-tones (reflections with a delay of less than 20 ms) are ignored by both algorithms
(passive and active), but are still indicated.

SQuad-AEC (Active) Speech Material


Due to the complexity of the measurement, the file length for the active measurement is shorter than for the
passive measurement. In principle, you can use an arbitrary speech signal with an exact length of 6 s to
measure the echo disturbance with SQuad-AEC (active).
Table 5-1 Description of the settings for an SQuad-AEC (Active) measurement

Setting               Description
Length                6.0 s
Speech Activity       > 90 %
Structure             Continuous speech
Speaker               Male and/or female native speakers
Sampling frequency    16 kHz
File Format           WAVE, 16 bit, INTEL
Speech level          -26.0 dB OVL
Pre-Filtering         ITU-T Rec. P.830, mod. IRS (send)

SwissQual also provides 6-second speech clips to generate double talk at the far-end side. The speech
activity of these clips is lower than that of the default speech clips and is concentrated in speech bursts. Although you can use the
default clips (length = 6 s), some echoes might then be detected incorrectly.
Table 5-2 Description of the settings for a double-talk SQuad-AEC (Active) measurement

Setting               Description
Length                6.0 s
Speech Activity       10 to 50 %
Structure             Isolated utterances
Speaker               Male or female native speakers
Sampling frequency    8 kHz (!)
File Format           WAVE, 16 bit, INTEL
Speech level          -26.0 dB OVL
Pre-Filtering         ITU-T Rec. P.830, mod. IRS (send)

SwissQual Speech Material


SQuadAECact.wav

SwissQual strongly recommends that you use the default reference sample files as they are optimally
adjusted to avoid interactions between the files.
Table 5-3 Description of the double-talk reference speech samples

Reference Sample    Description
dt_10_8kHz.wav      10 % speech activity, female Croatian
dt_25_8kHz.wav      25 % speech activity, female Croatian
dt_50_8kHz.wav      50 % speech activity, female Croatian


SQuad-RTT Round Trip Time Measurement

Basics
The measurement of the round trip time is based on an in-band transmission of short voice-like sequences.
During the measurement, one sequence is sent repeatedly from the A-side to B-side and after the signal is
received, a different sequence is sent back from the B-side to A-side. SwissQual strongly recommends that
you use the default reference speech samples RTTvoice_A.wav and RTTvoice_B.wav.

Speech-Like Sequences
For the in-band RTT measurement, two different sequences are necessary, where each sequence must fulfil
the technical characteristics in Table 6-1.
Table 6-1 Description of the characteristics for a SQuad-RTT measurement

Setting               Description
Length                0.5 s to 0.6 s
Speech Activity       > 80 %
Sampling frequency    16 kHz
File Format           WAVE, 16 bit, INTEL
Level                 -27.0 to -23.0 dB OVL
Pre-Filtering         ITU-T Rec. P.830, mod. IRS (send)

SwissQual Speech Material

RTTvoice_A.wav

RTTvoice_B.wav
