Master's Thesis
in the degree programme "Applied Computer Science" (Angewandte Informatik)
John-Patrick Wowra
Bachelor's and Master's Theses of the Zentrum für Informatik at the Georg-August-Universität Göttingen, 17 September 2007
Georg-August-Universität Göttingen, Zentrum für Informatik, Lotzestraße 16-18, 37083 Göttingen, Germany
Tel. +49 (5 51) 39-1 44 14
Fax +49 (5 51) 39-1 44 15
Email office@informatik.uni-goettingen.de
WWW www.informatik.uni-goettingen.de
I hereby declare that I have written this thesis independently and have used no sources or aids other than those indicated. Göttingen, 17 September 2007
Acknowledgement
I would like to acknowledge my advisor Prof. Dr. Xiaoming Fu for excellent guidance, motivation and encouragement, my parents and Kateřina for their support, and Christian Dickmann for his patience and helpfulness.
Abstract
The popularity of Internet Telephony has been rising continuously in recent years. With a rising number of users, the number of malicious users inevitably rises as well. Hence security is a major concern for Internet Telephony. RTP is commonly used in Internet Telephony for the transmission and reception of audio and video data. Traditionally RTP runs over UDP, and RTP traffic is in most cases transmitted without any protection. Datagram TLS (DTLS) is a modified version of TLS that functions properly over datagram transport. This thesis studies an RTP extension based on DTLS. It includes a prototype implementation and a further analysis of the design with a view to securing RTP and thus Internet Telephony.
Contents
1 Introduction
  1.1 Motivation
  1.2 Goals
  1.3 Thesis Organisation
2 Background
  2.1 Voice over IP
  2.2 Real Time Transport Protocol
  2.3 SSL/TLS and DTLS
  2.4 Session Initiation Protocol SIP
3 Related Work
  3.1 Security in VoIP
    3.1.1 Internet Protocol Security, IPsec
    3.1.2 Comparison between IPsec and DTLS
  3.2 Secure Real Time Transport Protocol
4
  4.1 Introduction
    4.1.1 Confidentiality in VoIP
    4.1.2 Availability in VoIP
  4.2 Threats and Attacks
5
  5.1 Introduction to RTP over DTLS
    5.1.1 SRTP Compatibility Mode
    5.1.2 Packet Size Comparison
    5.1.3 Security Considerations
6 Implementation Design
  6.1
  6.2
    6.2.2 RTP
    6.2.3 SIP Softphone
  6.3 RTP over DTLS
  6.4 Choice of Libraries
    6.4.1 OpenSSL
    6.4.2 CCRTP
    6.4.3 Twinkle Softphone
7 Design Details
  7.1 Design Components: RTP - ccRTP, DTLS - OpenSSL and SIP - Twinkle
    7.1.1 OpenSSL
    7.1.2 Socket Initialisation
    7.1.3 Session Initialisation with ccRTP
    7.1.4 Sending Data
    7.1.5 Receiving Data
    7.1.6 Closing Sessions
    7.1.7 Types of Sessions
  SIP Session Initiation with Twinkle
  Implementation Process
  Class Structure
  Problems and Discussion
8 Testing
  Testing Methodology
  Testbed Setup
  Measurement Methods and Tools
  Results
  Standard RTP Packet Delay
  RTP over DTLS Packet Delay
  CPU Usage
  Test Summary
9
  9.1
  9.2
Bibliography
List of Figures
2.1 Structure of an RTP packet
2.2 Schematic representation of the SSL handshake protocol with two-way authentication with certificates [1]
2.3 DTLS in the TCP/IP stack
2.4 DTLS packet structure
2.5 DTLS state machine [2]
2.6 Initialisation of a SIP session
3.1 IPsec in the TCP/IP stack
3.2 Structure of an IPsec packet with AH
3.3 Structure of an IPsec packet with ESP
5.1 Structure of an RTP packet sent over DTLS
7.1 Implementation status after phase 1
7.2 Implementation status after phase 2
7.3 Implementation status after phase 3
7.4 RTP over DTLS class structure
8.1 Testbed for RTP over DTLS tests
8.2 Delay for normal RTP packets
8.3 Delay for RTP over DTLS packets
1 Introduction
1.1 Motivation
Today enterprises have to maintain two networks in order to use both Internet and telephone services. Traditional landline phones, however, are gradually being replaced by Internet phones because of the advantages these offer. Internet Telephony is the routing of voice information over the Internet (or other IP-based networks). The telephone calls are handled by protocols that are commonly referred to as Voice over Internet Protocol (VoIP).

VoIP technology provides a wide range of services to users; video calls are one example of an additional feature. VoIP calls are also cheaper than traditional phone calls, and calls between two VoIP participants are even free. Enterprises with branches in different cities that are connected by a VPN can use VoIP for internal communication between the branches and thereby reduce costs significantly. Besides reducing call costs, VoIP makes the infrastructure more flexible because it provides open platforms, in contrast to traditional Telephony. In traditional Telephony networks the standards were known only to a small circle of developers at the network provider. With VoIP, the protocols, software and tools can be improved and adjusted to the needs of the users, and not only by their original developers.

The total number of VoIP users has been rising continuously over the past years. With a rising number of users, the number of malicious users inevitably rises as well. Hence security is a major concern for Internet Telephony. Since VoIP is based on IP [3], it is vulnerable to all of the attacks that can plague traditional IP networks, such as packet snooping, unauthorised access, spoofing and especially denial of service attacks. A conversation over a traditional phone is usually established over the communications provider's network. All companies involved in the connection are known and have to be trusted. With VoIP, data is transmitted through many networks whose providers are not all known. Anyone with access to a machine along the path of communication could access the transmitted data. Therefore VoIP calls are more vulnerable to eavesdropping than landline calls.

This is, however, a known problem from other applications that transmit confidential data over an insecure network such as the Internet. Cryptographic protocols can be used to protect data from being eavesdropped on or altered. A well-known security protocol that is considered reliable is Secure Sockets Layer/Transport Layer Security (SSL/TLS) [4]. SSL/TLS resides above the Transport Layer and commonly uses the Transmission Control Protocol (TCP) [5] or a similar protocol. TCP is a reliable, connection-oriented protocol with mechanisms for buffering and retransmission, which assure that the received data is exactly the same as the data transmitted. This is, however, not the primary requirement in VoIP, and the buffering and retransmission mechanisms are precisely the problem. The data is sent with the unreliable IP protocol, so packets might not arrive in the order in which they were transmitted, or they might get lost on the way. TCP puts the packets back into the right order and waits for lost packets to be retransmitted in order to reassemble the data as it was sent.
This is very useful for services like e-mail, where the received data must be identical to the data transmitted. In VoIP, however, a stream of data is played continuously to the receiver, and a delay caused by retransmission results in a pause of the media stream. A delay in VoIP is defined as the time the voice takes on its way from the mouth of the speaker to the ear of the listener. It is the sum of the time needed to digitise the voice into audio data, to split the stream of audio data into data packets and to transmit the data to the destination. Delays are well known from traditional telephony. For instance, long distance phone calls used to have quite long delays before the spoken word was received on the other side. Such delays make a fluent conversation, as in a face-to-face conversation, impossible.

Therefore the highest priority in VoIP traffic is not the exact transmission of the data; instead the data needs to reach the receiver as fast as possible in order to reduce the delays caused by the transmission over the Internet. Thus VoIP protocols such as the Real Time Transport Protocol (RTP) [6] rely on connectionless transmission using the User Datagram Protocol (UDP) [7]. UDP has no mechanism for the retransmission of lost data packets. Hence in RTP lost, damaged or late packets are discarded and the media stream is played continuously. If a packet is lost, the next received packet is played immediately, and as long as the number of lost packets does not exceed a certain amount, the receiver does not even notice that packets were lost.

With the goal of securing real time media such as VoIP, TLS was enhanced to work with UDP datagrams. This advancement of TLS is called Datagram Transport Layer Security (DTLS) [8]; it was standardised in spring 2006 by the Internet Engineering Task Force (IETF; footnote 1). At the same time the IETF published an Internet Draft on RTP over DTLS [9]. The core of this thesis is the design, implementation and test of a prototype of RTP over DTLS.
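The receiver behaviour described in this section, where a late packet is discarded instead of being waited for, can be illustrated with a minimal sketch. The frame duration, the playout delay and the names `Packet` and `select_playable` are assumptions made for illustration only; they are not taken from the prototype.

```python
from dataclasses import dataclass

FRAME_MS = 20          # assumed audio duration carried by one packet
PLAYOUT_DELAY_MS = 60  # assumed buffering delay before playback starts


@dataclass
class Packet:
    seq: int         # position of the audio frame in the stream (0, 1, 2, ...)
    arrival_ms: int  # arrival time, measured from the arrival of the first packet


def select_playable(packets):
    """Split packets into played and discarded sequence numbers.

    Frame `seq` is due for playback at PLAYOUT_DELAY_MS + seq * FRAME_MS.
    A packet that arrives after its due time is dropped; the receiver
    never waits for it, unlike a TCP receiver waiting for a retransmission.
    """
    played, discarded = [], []
    for packet in packets:
        deadline = PLAYOUT_DELAY_MS + packet.seq * FRAME_MS
        if packet.arrival_ms <= deadline:
            played.append(packet.seq)
        else:
            discarded.append(packet.seq)
    return sorted(played), sorted(discarded)


# Packet 2 is delayed in the network and misses its playback slot.
arrivals = [Packet(0, 0), Packet(1, 25), Packet(3, 70), Packet(2, 150), Packet(4, 95)]
played, discarded = select_playable(arrivals)
print("played:", played, "discarded:", discarded)
```

In this example the stream continues with frames 3 and 4 even though frame 2 arrived too late, which is the trade-off between completeness and delay that motivates the use of UDP.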
1.2 Goals
The goal of this thesis is to contribute to the design of a unified media security framework for Internet Telephony using RTP and DTLS. The focus lies on the interaction of the RTP and DTLS components of the framework. A critical aspect for the efficiency of the implementation is packet loss. In media streaming, packet loss occurs when data packets do not arrive within the time limit in which they can still be inserted into the data stream.
1 http://www.ietf.org/
The critical factor here is the delay, the total time it takes to transmit voice data from caller to callee. The recommended threshold for delays in Telephony is 150 milliseconds according to the Telecommunication Standardisation Sector of the International Telecommunication Union (ITU-T; footnote 2). According to the ITU-T, a packet loss rate of up to 5% is still acceptable for Telephony. Therefore the implementation of RTP over DTLS shall provide a packet loss rate lower than 5%.

A prototype of a VoIP application using RTP over DTLS was implemented in order to determine whether RTP traffic can be transmitted effectively over DTLS without compromising the quality of the call. This prototype was developed on the basis of existing implementations of RTP and DTLS. Technical premises and detailed requirements for this implementation had to be analysed in order to arrive at an adequate approach. The prototype was tested for the critical aspects in order to determine the usability of the approach and to pave the way for further development of the unified media security framework.
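The two acceptance criteria named above (a one-way delay of at most 150 ms and a packet loss rate of at most 5%) can be expressed as a small check. This is only a sketch: the function names and the example figures are hypothetical and do not reproduce the measurements of this thesis.

```python
MAX_DELAY_MS = 150.0  # ITU-T threshold for the one-way delay cited above
MAX_LOSS_RATE = 0.05  # packet loss rate that is still acceptable, as cited above


def one_way_delay_ms(digitise_ms, packetise_ms, network_ms):
    """Delay as the sum of digitising, packetising and transmission time."""
    return digitise_ms + packetise_ms + network_ms


def loss_rate(packets_sent, packets_played):
    """Fraction of sent packets that could not be inserted into the stream."""
    if packets_sent <= 0:
        raise ValueError("packets_sent must be positive")
    return (packets_sent - packets_played) / packets_sent


def call_is_acceptable(delay_ms, loss):
    """True if both the delay and the loss rate stay within the thresholds."""
    return delay_ms <= MAX_DELAY_MS and loss <= MAX_LOSS_RATE


# Illustrative numbers only (assumed, not measured).
delay = one_way_delay_ms(digitise_ms=10, packetise_ms=20, network_ms=80)
loss = loss_rate(packets_sent=1000, packets_played=970)
print(delay, loss, call_is_acceptable(delay, loss))
```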
2 Background
This chapter describes the basic concepts that are necessary to understand this thesis. An introduction to Internet Telephony is given along with a description of the main protocols used in this thesis. Due to space limitations the level of detail is kept moderate; interested readers are referred to the references of this thesis.
In order to establish a connection, the caller needs to know whether and where the callee is available. Subscribers of a SIP provider have a so-called SIP Uniform Resource Identifier (SIP-URI). These addresses are similar to e-mail addresses and use the URI format (e.g. sip:username@example.com). Before a user can call another user or receive a call, the terminal device must register with the central server of the SIP provider and thereby inform the provider that the user is online and ready to receive calls. The server now has information about the location of the logged-in user, so the user is reachable by other SIP users through the server. To initiate a connection, the caller sends an invite message to the server, which forwards it to the callee, whose terminal device then rings. When the call is accepted, an accept message is sent back to the caller along with the current IP address of the callee. Once the session is initiated, the servers are no longer needed for it. The media channel is then established directly between the participants with RTP. A detailed description of a VoIP call initiation using SIP is provided in the upcoming SIP section.

It is generally possible (e.g. with SIP) to establish a connection directly between caller and callee without servers, but then the caller must know the IP address of the callee. This is impractical, as we know from telephone numbers and from the Internet: nobody remembers a website by its IP address, but by its domain name. A name is much easier to associate with a person or company and much easier to remember. Furthermore, IP addresses depend on the user's location (e.g. at work and at home).

The transport of the audio data is achieved with the Real Time Transport Protocol (RTP) [6], which is presented in detail in an upcoming section.
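The role of the SIP-URI and of the registration step can be sketched as follows. The parser only handles the simple form sip:user@host[:port] shown above, not the full URI grammar, and `parse_sip_uri` and `Registrar` are hypothetical names introduced for illustration; the IP address is taken from a documentation address range.

```python
def parse_sip_uri(uri):
    """Split a SIP-URI of the simple form sip:user@host[:port] into its parts."""
    scheme, colon, rest = uri.partition(":")
    if not colon or scheme.lower() not in ("sip", "sips"):
        raise ValueError(f"not a SIP-URI: {uri!r}")
    user, at, hostport = rest.partition("@")
    if not at or not user or not hostport:
        raise ValueError(f"expected user@host in {uri!r}")
    host, _, port = hostport.partition(":")
    return {
        "scheme": scheme.lower(),
        "user": user,
        "host": host,
        "port": int(port) if port else None,
    }


class Registrar:
    """Toy stand-in for the provider's server: maps SIP-URIs to current IP addresses."""

    def __init__(self):
        self._locations = {}

    def register(self, uri, ip_address):
        parse_sip_uri(uri)  # reject malformed addresses
        self._locations[uri] = ip_address

    def lookup(self, uri):
        """Return the registered IP address, or None if the user is offline."""
        return self._locations.get(uri)


registrar = Registrar()
registrar.register("sip:username@example.com", "192.0.2.10")
print(parse_sip_uri("sip:username@example.com"))
print(registrar.lookup("sip:username@example.com"))
```

The lookup step is what frees the caller from having to know the callee's current IP address.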
RTP divides the audio data stream into small packets, which are then transmitted via IP, usually directly from speaker to listener. There an audio stream is generated from the received data packets and played to the listener.

In enterprises VoIP is used more and more to reduce infrastructure costs, since only one network infrastructure is needed instead of two (one for IP and one for Telephony). For enterprises and private users a great benefit is the saving on call costs. Calls from VoIP to VoIP are normally free. Enterprises therefore tend to use VoIP for internal communication and traditional Telephony for outbound calls. Connections to landline phones are possible through gateway services (provided e.g. by SIP providers), but these connections are usually charged. So that their customers can be reached from a traditional phone through such a gateway, providers offer them a landline phone number in addition to their address. As with e-mail, users are reachable worldwide through the same address or telephone number, regardless of their current location, as long as they are connected to the Internet. A large variety of network-capable devices can be used as terminal devices (IP phones, cellphones, PCs, PDAs, analogue phones with special adapters, ...).

Another benefit of VoIP is the flexibility provided by the open standards, which allow new services to be added easily. With reduced costs, increased reachability, flexibility and additional services like video calls, VoIP will play a significant role in the future of Telephony.
VoIP applications using RTP require at least two participants who communicate by transmitting multimedia (voice and/or video) data to each other. An association among a set of participants communicating with RTP is called an RTP session or conference. A participant may be involved in multiple RTP sessions at the same time.

The data transport of RTP is augmented by the RTP Control Protocol (RTCP) [6], which allows monitoring of data delivery in a manner scalable to large multicast networks and provides minimal control and identification functionality. RTCP is based on the periodic transmission of control packets to all participants in the session. Its primary function is to provide feedback on the quality of the data distribution. Its second function is to carry a persistent transport-layer identifier for an RTP source, called the canonical name or CNAME. While other identifiers, such as the SSRC explained later, may change during a session, the CNAME remains the same. It is used to identify a participant during a session. Because each participant sends its control packets to all other participants of a session, each can independently observe the number of participants. This number is used to calculate the rate at which the packets are sent: the more users there are in a session, the less frequently each participant transmits RTCP packets. This is necessary because otherwise the RTCP traffic could take bandwidth from the connection and CPU cycles that are needed by the RTP data traffic.

To establish an RTP session a pair of ports is reserved, one for audio data and the other for control (RTCP) packets. The audio conferencing application (the so-called VoIP phone) is used by each RTP session participant and sends audio data in small chunks of approximately 20 ms duration. Each chunk of audio is preceded by an RTP header that indicates which audio encoding (e.g. PCM, ADPCM or LPC) is contained in the packet, so that senders can change the encoding during a conference. To cope with lost packets and delays, the RTP header contains timing information and a sequence number that allow the receivers to reconstruct the timing produced by the source. Hence the audio stream can be played out continuously. Conferences with both audio and video are realised by transmitting each medium in a separate RTP session.

If one of the participants of an RTP session has a lower-bandwidth connection to the network than the other participants, an RTP proxy (a so-called mixer) can be used. A mixer is placed in the low-bandwidth area; it resynchronises incoming audio packets from multiple sources and combines them into a single audio stream. The audio data can then be compressed further with a different codec so that the user with the low-bandwidth connection can receive packets from multiple sources. Mixers can also be used to compose a single video stream from multiple sources, forming a group scene of the participating users.

The source of a stream of RTP packets is identified by a numeric value in the header of the RTP packets. This 32-bit value is called the Synchronisation Source (SSRC) identifier. It is therefore independent of the network address. Since all packets from an SSRC form part of the same timing and sequence number space, a receiver can group packets by SSRC for playback. The outgoing RTP packets of a mixer are identified by the mixer's SSRC value.

The structure of an RTP packet is shown in figure 2.1. The RTP payload consists of the media being transmitted. The RTP header contains information related to the payload, e.g. the source and the encoding. The RTP packet is wrapped in a UDP packet, which is encapsulated in an IP packet to be transferred over an IP-based network.
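The header fields mentioned above (payload type, sequence number, timestamp and SSRC) can be made concrete with a short sketch that packs and parses the 12-byte fixed RTP header defined in the RTP specification [6]. The sketch ignores header extensions and padding, and the function names are hypothetical. The example assumes PCM audio with static payload type 0 (PCMU) sampled at 8000 Hz, so that a 20 ms chunk holds 160 samples; these values are assumptions for illustration.

```python
import struct

RTP_VERSION = 2
# Fixed header: flags byte, marker/payload-type byte, 16-bit sequence number,
# 32-bit timestamp, 32-bit SSRC, all in network byte order.
FIXED_HEADER = struct.Struct("!BBHII")


def pack_rtp(payload_type, seq, timestamp, ssrc, payload, marker=False):
    """Build an RTP packet without padding, extension or CSRC entries."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = FIXED_HEADER.pack(byte0, byte1, seq & 0xFFFF,
                               timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def parse_rtp(packet):
    """Parse the fixed header; CSRC entries are skipped, extensions are not handled."""
    if len(packet) < FIXED_HEADER.size:
        raise ValueError("packet shorter than the fixed RTP header")
    byte0, byte1, seq, timestamp, ssrc = FIXED_HEADER.unpack_from(packet)
    if byte0 >> 6 != RTP_VERSION:
        raise ValueError("unsupported RTP version")
    csrc_count = byte0 & 0x0F
    offset = FIXED_HEADER.size + 4 * csrc_count
    return {
        "payload_type": byte1 & 0x7F,
        "marker": bool(byte1 >> 7),
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[offset:],
    }


SAMPLES_PER_CHUNK = 160  # 20 ms at an assumed sampling rate of 8000 Hz
chunk = bytes(SAMPLES_PER_CHUNK)  # placeholder audio data
first = pack_rtp(0, seq=1, timestamp=0, ssrc=0x1234ABCD, payload=chunk)
second = pack_rtp(0, seq=2, timestamp=SAMPLES_PER_CHUNK, ssrc=0x1234ABCD, payload=chunk)
print(parse_rtp(second)["seq"], parse_rtp(second)["timestamp"])
```

A receiver can group packets by the parsed `ssrc` value and order them by `seq`, while the `timestamp` difference between two packets tells it how far apart the chunks were produced.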
Figure 2.2: Schematic representation of the SSL handshake protocol with two-way authentication with certificates [1].
used, a 32-byte random number upon which the pre-master secret will be generated, the Session Identifier (Session ID) and the cipher suite to use. Phases two and three are optional. In phase two the server identifies itself to the client with a certificate. The client then identifies itself to the server if a certificate is available. Additionally the client verifies the server certificate, which contains the public key of the server. If the certificate cannot be verified, the connection is closed. The handshake is finished in phase four with the generation of the Master Secret, a single-use symmetric key that is used during the connection for the encryption and decryption of messages. From then on all messages are transmitted encrypted.

With the rising popularity of VoIP and other multimedia services it became necessary to use TLS with the faster UDP protocol as well. TLS itself could not be used directly, because after a packet loss the following data packets can no longer be decrypted. Datagram Transport Layer Security (DTLS) [8], which was standardised in April 2006, is a datagram-capable version of TLS and is therefore extremely similar to it. The DTLS protocol allows client/server applications to communicate in a way that is designed to prevent eavesdropping, tampering and message forgery. DTLS reuses almost all the protocol elements of TLS, with minor but important modifications that allow it to work properly with unreliable transport protocols. Figure 2.3 shows DTLS in the five-layer TCP/IP protocol stack. DTLS packets have the structure shown in figure 2.4.

In contrast to TLS, the DTLS handshake protocol uses a stateless cookie exchange to prevent denial of service. Additionally, message fragmentation and reassembly were added. DTLS handshake messages may be lost, since transmission takes place over datagram transport; therefore DTLS needs a mechanism for retransmission during the handshake. This is achieved with a single timer at each endpoint: each endpoint keeps retransmitting its last message until a reply is received.

Furthermore, DTLS, unlike TLS, is vulnerable to two types of denial of service attacks. The first is a standard resource consumption attack. The second is an amplification attack, in which the attacker sends a client_hello message apparently sourced by the victim. To avoid these attacks, DTLS uses the cookie exchange technique that has been used in protocols such as Photuris [17]. Before the handshake proper begins, the client must replay a cookie provided by the server in order to demonstrate that it is capable of receiving packets at its claimed IP address. The DTLS client_hello message contains a cookie field, which is empty if there is no cached cookie from a prior exchange. The message also contains the DTLS version and a list of the algorithms and compression methods that the client will accept. The server responds with three messages. The server_hello contains the server's choice of version and algorithms. The certificate message contains the server's certificate chain. The server_hello_done message informs the client that the server has finished sending its hello messages.

A state machine implements the retransmission timer and the resulting retransmissions. Figure 2.5 shows this state machine. Once in the Read Message Fragment state, transitions are triggered by the arrival of data fragments or the expiration of the retransmission timer. If a data fragment is the expected next handshake message, the fragment is returned to the higher layers and the timer is cancelled. Otherwise, the fragment is buffered or discarded as appropriate and the timer is allowed to continue ticking. When the retransmission timer expires, the implementation retransmits the last messages that it transmitted.

DTLS is well suited for use with VoIP because it combines the security of TLS with the fast delivery of UDP, filling a gap left by the existing protocols.
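The stateless cookie exchange described above can be sketched in a few lines. The DTLS specification [8] suggests computing the cookie as an HMAC over the client's address and hello parameters under a secret known only to the server, so that the server stores nothing per client. The sketch below follows that idea; the choice of SHA-256, the function names and the message labels returned as strings are assumptions made for illustration and are not the OpenSSL API.

```python
import hashlib
import hmac
import os

SERVER_SECRET = os.urandom(32)  # known only to the server


def make_cookie(client_ip, client_port, hello_params):
    """Derive the cookie from the claimed source address and the hello parameters."""
    message = f"{client_ip}|{client_port}|".encode() + hello_params
    return hmac.new(SERVER_SECRET, message, hashlib.sha256).digest()


def handle_client_hello(client_ip, client_port, hello_params, cookie=b""):
    """Return the server's reaction to a client_hello.

    Without a valid cookie the server only answers with the cookie and keeps
    no state. Only a client that really receives packets at the claimed
    address can replay the cookie and proceed with the handshake.
    """
    expected = make_cookie(client_ip, client_port, hello_params)
    if cookie and hmac.compare_digest(cookie, expected):
        return "server_hello", None
    return "hello_verify_request", expected


params = b"version+ciphers"  # placeholder for the real hello contents
reply, cookie = handle_client_hello("192.0.2.10", 40000, params)
print(reply)  # first round trip: the server only hands out a cookie
reply, _ = handle_client_hello("192.0.2.10", 40000, params, cookie)
print(reply)  # second round trip: the cookie is replayed, the handshake proceeds
```

An attacker who forges the victim's source address never sees the cookie, so the server neither allocates resources nor sends the larger handshake messages towards the victim.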
phone is activated, it sends out a registration to the SIP server announcing availability to the communications network.

User availability: User availability is a method of determining whether a user is willing to answer a request to communicate. A user can have several locations registered but might accept incoming communications on only one device. If a call is not answered there, it is transferred to another device or to an application such as voice mail.

User capability: There are many methods and standards of multimedia communication. This method checks the user's capabilities, for example whether a camera for video calls is available or which encryption/decryption methods the user supports.

Session setup: SIP establishes the session parameters for both ends of the communication and performs the actual session establishment when one user calls and another user answers.

Session management: This method manages, for example, the transfer of a call from one device to another (e.g. from a laptop to a mobile phone and vice versa) without a noticeable impact on the communication partner. Another example is the invitation of a third user to a VoIP session and thereby the establishment of a conference call (multi-user session).

SIP is not a vertically integrated communications system. It is rather a component that can be used with other IETF standards, such as RTP, to build a complete multimedia architecture. An important feature of SIP is that it does not define the type of session that is being established, only how it should be managed. This flexibility means that SIP can be used for a huge number of applications and services. To date, the 3G Community (http://www.3gpp.org/) has selected SIP as the session control mechanism for the next generation of cellular networks. Microsoft has chosen SIP for its real-time communications strategy and has deployed it in various products.

There are four major components in the SIP architecture:
SIP User Agents
SIP Registrar Server
SIP Proxy Servers
SIP Redirect Servers

These components deliver messages with an embedded Session Description Protocol (SDP) [18] body, which defines their content and characteristics, in order to complete a SIP session.

The terminal devices of SIP are called SIP User Agents (UAs). A User Agent can be any kind of device capable of transmitting voice or other media over a network (e.g. cellphones, PCs, PDAs, ...). These devices are used to create and manage a SIP session. Every User Agent needs a unique identifier, which is called a SIP-URI. Like e-mail addresses, SIP addresses use the URI format: sip:user@example.com. Another addressing scheme is the URL for telephone calls (tel URI), described in [19], with which a traditional phone number can be mapped to a SIP address. This is used by the gateway servers that many SIP providers maintain in order to enable traditional phone users to call VoIP users. Basically a connection is established when a User Agent Client (caller) sends an invitation message and the User Agent Server (callee) responds to it. This initiation can be achieved directly (peer-to-peer) if the current IP address of the User Agent Server is known. For the user it is more comfortable to initiate the session through the SIP provider using a SIP-URI.

The SIP Registrar Servers are databases that contain the location of all User Agents within a domain. These servers retrieve and send participants' messages and other information to the SIP Proxy Server.

SIP Proxy Servers accept session requests made by a SIP User Agent and query the SIP Registrar Server to obtain the addressing information of the recipient's User Agent. The SIP Proxy Server then forwards the session invitation directly to the recipient User Agent if it is located in the same domain, or to a Proxy Server if it is located in another domain.

The SIP Redirect Servers allow SIP Proxy Servers to redirect SIP session invitations to external domains. The SIP Redirect Server, the SIP Registrar Server and the SIP Proxy Server may reside on the same hardware.

Figure 2.6 illustrates the establishment of a SIP session between two Internet Service Providers (ISPs). Before any session can be established, both users must power on their devices and, if the connection is established through a SIP provider, register their availability and their IP addresses with the SIP Proxy Server in their ISP's network. User A initiates the call by sending the Proxy Server in domain A.com a request to communicate with User B.

1. Upon reception of the request from User A, the SIP Proxy Server in Domain A recognises that User B is outside its domain.
2. SIP Proxy Server A then sends a query for User B's IP address to the SIP Redirect Server, which can be located in Domain A or B. Note that the lookup at the Redirect Server is not a SIP query; it is, for instance, a DNS lookup.
3. The SIP Redirect Server returns the address of User B's Proxy Server.
4. The SIP Proxy Server in Domain A forwards the session initiation request to the SIP Proxy Server in Domain B.
5. The SIP Proxy requests the current IP address of User B from the Registrar Server in Domain B.
6. The Registrar Server returns User B's SIP address.
7. The SIP Proxy relays User A's invitation to communicate with User B to User B. This request includes information about the media (audio and/or video), for which SDP is used.
8. User B informs the SIP Proxy that User A's invitation is accepted and that he is ready to receive the message.
9. The response from User B is forwarded to User A. The return path is available because all servers left their address in a specific field of the invitation.
10. The response from User B is forwarded to User A.
11. User A and User B create a point-to-point RTP connection enabling them to interact.
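The invitation of step 7 and the return path of step 9 can be illustrated with a minimal sketch that builds a SIP INVITE request with an SDP body. In SIP the field in which each server leaves its address is the Via header; every proxy prepends its own entry while forwarding. The header set is reduced to a minimum, and the host names, IP address, port, Call-ID and branch values are assumptions for illustration; the functions are hypothetical and not part of Twinkle.

```python
def build_sdp(ip, rtp_port):
    """Minimal SDP body offering one audio stream (payload type 0) over RTP."""
    lines = [
        "v=0",
        f"o=- 0 0 IN IP4 {ip}",
        "s=call",
        f"c=IN IP4 {ip}",
        "t=0 0",
        f"m=audio {rtp_port} RTP/AVP 0",
    ]
    return "\r\n".join(lines) + "\r\n"


def build_invite(caller, callee, caller_ip, rtp_port, call_id, branch):
    """Build the INVITE request as sent by the caller's User Agent."""
    body = build_sdp(caller_ip, rtp_port)
    headers = [
        f"INVITE {callee} SIP/2.0",
        f"Via: SIP/2.0/UDP {caller_ip}:5060;branch={branch}",
        "Max-Forwards: 70",
        f"From: <{caller}>;tag=1",
        f"To: <{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller_ip}:5060>",
        "Content-Type: application/sdp",
        f"Content-Length: {len(body.encode())}",
    ]
    return "\r\n".join(headers) + "\r\n\r\n" + body


def add_via(message, proxy_host, branch):
    """A forwarding proxy prepends its own Via header below the request line."""
    request_line, rest = message.split("\r\n", 1)
    return f"{request_line}\r\nVia: SIP/2.0/UDP {proxy_host};branch={branch}\r\n{rest}"


def return_path(message):
    """List the Via hosts in the order in which the response travels back."""
    head = message.split("\r\n\r\n", 1)[0]
    hops = []
    for line in head.split("\r\n")[1:]:
        if line.startswith("Via: "):
            hops.append(line.split(" ", 2)[2].split(";", 1)[0])
    return hops


invite = build_invite("sip:userA@A.com", "sip:userB@B.com", "192.0.2.10",
                      49170, "call-1@192.0.2.10", "z9hG4bK-ua")
forwarded = add_via(invite, "proxy.A.com", "z9hG4bK-pa")
forwarded = add_via(forwarded, "proxy.B.com", "z9hG4bK-pb")
print(return_path(forwarded))
```

The list printed at the end starts with the proxy closest to User B and ends with User A's own address, which is the route the response takes in steps 9 and 10.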
3 Related Work
This chapter presents related work, namely alternative approaches to securing VoIP traffic. The IPsec protocol is presented and compared with the chosen DTLS protocol, along with the reasons for this choice.
ing secure standard SRTP [21] and the new ZRTP [22] protocol are available on Analogue Telephone Adapters (ATAs) as well as on various softphones. Although some devices support SRTP and thus enable encrypted VoIP calls, the problem is that in the standard configuration the keying material is transmitted in clear text over the network. Eavesdroppers can thereby obtain the keying material, which makes the encryption (almost) useless. Furthermore, users need to study the manual to find out how to enable secured key sharing [23]. It is possible to use IPsec to secure peer-to-peer VoIP by using opportunistic encryption, which is presented in the coming section. Skype, a proprietary peer-to-peer Internet Telephony network with over 200 million users worldwide, is closed source, which means that its source code is not published. Skype does not use SRTP, but uses encryption which is transparent to the Skype provider. The user cannot turn encryption on or off and has to rely on the software and the provider. The Voice VPN solution provides secure voice for enterprise VoIP networks by applying Internet Protocol Security (IPsec) [24] encryption to the digitised voice stream [10]. IPsec is explained in the upcoming section as an alternative approach to securing VoIP traffic.
IPsec operates on the network layer; it is therefore capable of securing TCP- and UDP-based protocols, which reside on the transport layer, as illustrated in Figure 3.1 on page 30. IPsec operates in two different modes: transport mode and tunnel mode. In transport mode, only the payload of the data packet is encrypted and/or authenticated. Routing remains intact, since the IP header is neither modified nor encrypted. Transport mode is used for peer-to-peer communication. In tunnel mode, the entire packet is encrypted and/or authenticated; it must therefore be encapsulated in a new IP packet for routing to work. Tunnel mode is used for peer-to-peer communication as well as for network-to-network and host-to-network connections. The first thing that needs to be done upon connection initiation is the exchange of the keying material. This is the task of IKE, possibly the most complex component of IPsec. IKE uses the Diffie-Hellman Key Agreement Method [27] to exchange keys over an insecure network and is based on the Internet Security Association and Key Management Protocol (ISAKMP) [28], the IPsec Domain of Interpretation (DOI) [29], the Oakley Key Determination Protocol [30] and SKEME [31]. Both sides of the connection need to authenticate themselves to each other and agree on a keying algorithm. The Authentication Header (AH) guarantees connectionless integrity and data origin authentication of IP datagrams. It can optionally protect against replay attacks by using a sliding window technique and discarding old packets. AH protects the IP payload and all header fields of an IP datagram except for those that might be changed during transmission. Figure 3.2 on page 31 shows a TCP packet before and after the AH is inserted.
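The sliding window technique mentioned for replay protection can be illustrated with a short sketch. This is an assumption-laden illustration, not the IPsec implementation: the class name `ReplayWindow`, the bitmap representation, the default window size of 64 and the convention that sequence numbers start at 1 are all choices made for this example.

```python
class ReplayWindow:
    """Illustrative anti-replay window over packet sequence numbers."""

    def __init__(self, size: int = 64):
        self.size = size      # assumed window size, chosen for the example
        self.highest = 0      # highest sequence number accepted so far
        self.bitmap = 0       # bit i set => (highest - i) was already seen

    def accept(self, seq: int) -> bool:
        """Return True if the packet is new, False if it is old or a replay."""
        if seq <= 0:
            return False
        if seq > self.highest:
            # The window slides forward; remember the new highest number.
            shift = seq - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False      # fell out of the window: discarded as too old
        if self.bitmap & (1 << offset):
            return False      # already seen: replayed packet
        self.bitmap |= 1 << offset
        return True


window = ReplayWindow(size=4)
results = [window.accept(s) for s in (1, 3, 2, 2, 10, 6, 7)]
print(results)
```

In this run the second packet with sequence number 2 is rejected as a replay, and packet 6 is rejected because it lies outside the four-packet window once packet 10 has been accepted, while the late packet 7 is still inside the window and is accepted.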
The ESP protocol provides origin authenticity, integrity and confidentiality of a packet. Unlike AH, ESP does not protect the IP packet header. Figure 3.3 on page 31 shows a TCP packet before and after ESP is applied in tunnel mode. IPsec support is usually implemented in the kernel, while key management is carried out from user space. However, as there is a standard interface for key management, it is possible to control one kernel IPsec stack using key management tools from a different implementation. IPsec is part of IPv6. It was intended to provide either transport mode or tunnel mode, where packets can be provided to several machines; furthermore, it can be used to create Virtual Private Networks [32]. In comparison to TLS, IPsec is a peer-to-peer protocol, designed as a generic security mechanism for Internet protocols. There are a number of problems with using IPsec to secure datagram traffic generated by client-server applications, which will be discussed in the comparison of IPsec and DTLS in the next section.
the sub network is known. Another problem is the lack of standardisation among IPsec APIs, which results in portability problems when an application wishes to control the keying policy. With DTLS, portability can be achieved even though DTLS APIs are not standardised either, since an application can be shipped along with the DTLS toolkit. For IPsec this is not so easily achievable because it resides in kernel space, in contrast to DTLS, which resides in application space. In order to simplify key negotiation, a reliable TCP connection can be used to secure a separate datagram channel. This design is smart but has some problems. First, the application now has to manage two different sockets and synchronise them, and synchronisation is a significant programming problem. If the TCP connection is left open after key negotiation, system resources are wasted. If, on the other hand, the TCP connection is closed after key negotiation, any renegotiation must be done over UDP, which requires another implementation of key negotiation over UDP and would make key negotiation over TCP obsolete. It is therefore more useful to have key negotiation and data transfer on the same channel. DTLS is more suitable for securing RTP traffic: since RTP runs over UDP, any unnecessary connection (e.g. TCP for key negotiation) is a waste of system resources. VoIP is time sensitive; the added security overhead should therefore cost as few system resources as possible while still providing enough security to be reliable. Furthermore, since IPsec resides in the kernel, using it on a system that does not support IPsec requires changes to the TCP/IP stack. To secure an application with DTLS, only another application needs to be used.
and replay protection to the RTP data in unicast and multicast applications. Note that SRTP must not be confused with RTP over DTLS. SRTP was published as RFC 3711 [21] in March 2004. This tightly coupled encryption mode for RTP provides a number of benefits. The RTP header is left unencrypted, which enables header compression (see [35], [36], and [37]) and easy debugging. The packets appear to be RTP packets, which is a benefit for firewall compatibility. There is zero header overhead. SRTP relies on an external key management protocol to set up the initial master key. Two protocols specifically designed to be used with SRTP are ZRTP [22] and MIKEY [38]. There are also other methods to negotiate the SRTP keys; several vendors offer products that use the SDES key exchange method. For encryption and decryption of the data flow, SRTP standardises the use of only a single cipher, the Advanced Encryption Standard (AES) [39]. AES can be used in two cipher modes, which turn the original AES block cipher into a stream cipher. Since SRTP does not provide a keying mechanism and has to rely on other protocols, it cannot by itself be regarded as a solution for securing VoIP traffic. In combination with ZRTP, VoIP traffic can be secured. However, SRTP is not widely used, since users cite reduced audio quality as a reason to turn ZRTP protection off. Furthermore, ZRTP is not a widely known security architecture like TLS and is therefore not as trustworthy as RTP over DTLS can be.
4.1 Introduction
In order to classify the threats to VoIP properly, the security goals must first be formulated. VoIP is IP traffic, and thus the same attacks can be used against it. This is why VoIP calls are vulnerable to a variety of threats that traditional telephone calls are not. Any data being transmitted is at some risk of being eavesdropped, and data packets can be eavesdropped on anywhere along the transmission path. Alternatively, the intercepted data could be changed and then transmitted to the receiver, who would not notice that the data was altered; this is called a man-in-the-middle attack. By transmitting the same message many times, e.g. an invitation to a VoIP phone call, the receiving machine could be kept so busy that no real calls can come through. This is called a denial-of-service attack. There are three classical primary security goals in modern communication systems: confidentiality, integrity and availability.
Confidentiality has been defined by the International Organisation for Standardisation (ISO) as "ensuring that information is accessible only to those authorised to have access" [40]. Integrity is the protection against unauthorised alteration of the transmitted data. Message integrity, like confidentiality, is a part of DTLS. It assures the user that the received data has not been changed without his notice. Availability means that the transmitted data will reach its destination and will thereby be available to the receiver. The integrity of the voice data is an important issue here. Certainly it is easier to recognise whether someone on the phone is the person he or she claims to be than to recognise whether an e-mail was really written by the declared sender. This argument, however, applies mostly to private communication and to communication among people who know each other well. Voice messages can be recorded, edited and replayed, so that the receiver does not notice that the caller is not the person he or she claims to be. Besides the voice data, the signalling data also needs to remain intact and unaltered. The identity of the caller and the callee needs to be protected. If an attacker manages to manipulate his own identity, the callee may be shown a different caller ID upon reception of a call. This can be used to reach persons who usually do not take calls from just anybody (e.g. the chief executive of a company). By acquiring a fake identity, the billing of the VoIP provider can be bypassed, and calls will be charged to the original owner of the account.
over separate networks, while in VoIP the voice data is transmitted over the Internet, where all connected machines can potentially be accessed through security holes. Many protocols in traditional telephony are barely published; analysing and attacking traditional phone calls therefore requires special hardware and software. This reduces the number of people who are capable of eavesdropping on phone calls, but it does not make eavesdropping impossible.
various requests to a network or host in order to acquire information needed for further steps, such as the operating system or the installed services. A so-called spoofing attack uses messages or data packets with faked information. For example, the IP address or MAC address of the sender can be changed so that the receiving machine assumes that the packet was sent from a trustworthy source. Another example of spoofing is DNS spoofing, in which DNS answers are changed so that the requesting machine communicates with a machine the attacker has prepared. Denial-of-service attacks replay request messages to servers in such high volumes that the servers' service is no longer available to regular users, thus targeting the availability of a system. VoIP might also become the target of new attacks that are enabled by VoIP itself. Spam is a commonly known problem these days. Spamming is the abuse of e-mail to indiscriminately send unsolicited bulk messages; e-mail spam involves sending nearly identical messages to numerous recipients. As already mentioned, SIP uses an address format similar to that of e-mail, so the problem of e-mail spam might become a problem for VoIP in the future. VoIP spam is not yet an actual problem; nonetheless it receives a great deal of attention from marketers and the trade press. VoIP spam is also referred to as SPIT (Spam over Internet Telephony). Malicious users in this case could be telemarketers or prank callers. Currently there are rules for e-mail systems that block unwanted e-mail; such systems could (and probably will) also be applied to VoIP systems. SIP as a technology has been designed to support presence natively, so callers know the availability of the callee before even attempting to initiate a call. The three security services are realised through DTLS and implemented in the OpenSSL library, which makes it a reasonable choice for securing VoIP traffic.
Unfortunately, no encryption can prevent the biggest threat: a virus or trojan on the endpoint that gives an attacker access to the machine and thereby to the decrypted data.
contain confidential data, this is not mandatory. RTP over DTLS is a trustworthy approach to achieving secured VoIP calls. DTLS is practically designed to be used in a VoIP scenario and, because of its well-known predecessor, is likely to gain the trust of users as well.
is not part of the implementation conducted in this thesis but is worth noting for future development of the profile.
6 Implementation Design
This chapter provides an analysis of the requirements along with a description of the implementations chosen for this thesis, including the chosen libraries. The system idea is presented in more detail, along with the functionality and interaction of the individual components used for the prototype implementation.
6.2.1 DTLS
The DTLS protocol is designed to secure data between communicating applications. It is designed to run in application space, without requiring any kernel modifications. DTLS normally uses one UDP socket per connection and endpoint. Therefore, upon connection initiation, a socket is created at each endpoint before the DTLS handshake can be initiated. After successful completion of the handshake, the sockets are ready to transmit and receive secured data. Upon termination of the connection, both sockets are closed.
6.2.2 RTP
RTP itself has no means of initiating a connection between two hosts. SIP is therefore used in addition to initiate a session between two computers. Upon connection initiation RTP initialises two sessions on each host, one for data and one for RTCP traffic. Each of these sessions normally consists of two sockets, one for reception and one for transmission. Next, the RTP stack is started, and packet transmission and reception run on each session until the RTP stack execution is stopped. Besides unicast conferences, RTP is also capable of multicast conferences. This feature cannot be mapped to a DTLS-secured session, since the key exchange protocol of DTLS is designed only for host-to-host communication, and the DTLS key exchange is one of the cornerstones of the benefits DTLS brings to the implementation. RTP data (and control) packets are usually transmitted via UDP; RTP therefore comes with an underlying transport layer similar to the transport layer DTLS uses. Reuse of these functions is reviewed in the upcoming design section in order to keep changes slim and simple.
that RTP over DTLS should be used if it is available to both communication partners. The SIP component needs to support the mechanisms necessary to cope with essentially four cases. In the first case, the connection can be established without errors, because both communication partners have a properly running system that supports RTP over DTLS. In the second case, there is an error on the caller side, which might occur when certificates cannot be accessed. The caller should be notified of this as early as when the settings are adjusted to use RTP over DTLS for calls in the setup. In the third case, the RTP over DTLS feature is not supported by the callee; either the connection is established without any protection, or the next security system supported by both sides is used. Here, of course, the caller must be notified that the connection is not secured in the intended way. Finally, there is the chance that security certificates cannot be verified or that the DTLS connection cannot be initialised properly for other reasons, so that a secure connection cannot be guaranteed. In this case the user needs to be informed immediately about the situation and be advised what this means and what to do. When the call is accepted by the callee and both parties have RTP over DTLS available, this component is started to initialise the DTLS sockets. The RTP session needs to be divided into a server (passive) part and a client (active) part, where the client initiates the DTLS connection to the server and the server accepts the client's connection request. When the connection is established, the RTP data transfer can start. At the end of the session the DTLS connection needs to be shut down properly. DTLS negotiates the ciphers during the handshake (see the Background section) and exchanges certificates and keys. These keys must be generated as well, and certificates must be provided. This task is done by the SIP application in connection with functions provided by OpenSSL.
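The four cases described above can be summarised as a small decision function. The sketch below is only an illustration of the described behaviour under stated assumptions: the names `Outcome` and `decide`, the boolean inputs and the rule that every outcome except a successfully secured call triggers a user notification are introduced for this example and are not taken from Twinkle or from the prototype.

```python
from enum import Enum
from typing import Sequence, Tuple


class Outcome(Enum):
    SECURED = "RTP over DTLS established"
    CALLER_ERROR = "caller cannot use RTP over DTLS"
    FALLBACK = "another security mechanism shared by both sides is used"
    UNPROTECTED = "call established without protection"
    HANDSHAKE_FAILED = "secure connection cannot be guaranteed"


def decide(caller_ready: bool,
           callee_supports_dtls: bool,
           handshake_ok: bool,
           shared_alternatives: Sequence[str] = ()) -> Tuple[Outcome, bool]:
    """Return the outcome and whether the user has to be notified."""
    if not caller_ready:                 # case 2: e.g. certificates not accessible
        return Outcome.CALLER_ERROR, True
    if not callee_supports_dtls:         # case 3: callee lacks RTP over DTLS
        if shared_alternatives:
            return Outcome.FALLBACK, True
        return Outcome.UNPROTECTED, True
    if not handshake_ok:                 # case 4: verification or handshake failed
        return Outcome.HANDSHAKE_FAILED, True
    return Outcome.SECURED, False        # case 1: everything worked


print(decide(True, True, True))
print(decide(True, False, True, shared_alternatives=["SRTP"]))
```

Keeping the notification flag next to the outcome reflects the requirement in the text that the user is always told when the call is not protected in the intended way.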
6.4.1 OpenSSL
OpenSSL (http://www.openssl.org/) [46] is the de facto standard open source TLS/SSL implementation [2]. It has proven to be stable and is used by numerous production-quality servers such as the Apache Web Server. OpenSSL implements SSLv2, SSLv3, TLSv1 and DTLSv1. Each of these protocols is implemented by sharing as much code as possible, with virtual functions handling protocol differences. The library is implemented in C, and from the library's standpoint DTLS appears to be just another version of the TLS protocol.
6.4.2 CCRTP
GNU ccRTP (http://www.gnu.org/software/ccrtp/) is an implementation of RTP, the Real-time Transport Protocol from the IETF (RFC 3550, RFC 3551, and RFC 3555). The library is implemented in C++ and based on GNU Common C++ (http://www.gnu.org/software/commoncpp/). It provides a high-performance, flexible and extensible standards-compliant RTP stack with full RTCP support. It is defined as an application-layer framework rather than as a typical Internet transport protocol such as TCP or UDP. The design of ccRTP considers support for audio and video data. Unicast, multi-unicast and multicast transport models are supported, as well as multiple active synchronisation sources, multiple RTP sessions (SSRC spaces), and multiple RTP applications (CNAME spaces). This allows its use for building all forms of Internet-standards-based audio and video conferencing systems [47]. ccRTP uses packet queue lists for the reception and transmission of data packets. The synchronisation of both outgoing and incoming media is handled automatically within the packet queues. There is support for RTCP and other standard and extended features needed for both compatible and advanced streaming applications. The implementation uses templates to isolate threading- and socket-related dependencies, so that it can be used to implement real-time streaming with different threading models and underlying transport protocols, which is an essential feature for this work. At its highest level, ccRTP provides classes for the real-time transport of data through RTP sessions, as well as the control functions of RTCP. The main concept in the ccRTP implementation of RTP sessions is the use of packet queues to handle the transmission and reception of RTP data packets (application data units). In ccRTP, a data block is transmitted by putting it into the transmission (outgoing packets) queue, and received by getting it from the reception (incoming packets) queue.
4 http://www.twinklephone.com/
7 Design Details
This chapter describes the implementation process, its milestones and the problems that were handled along the way. First the protocol operations are presented, followed by a description of how the components in the prototype implementation of the unified media security framework work together. The previous chapter provides an analysis that supplies all the information necessary to design a solution. In this chapter the architecture and interfaces of the component to be developed are designed, and its adaptation to the existing structure and interfaces is planned.
7.1 Design Components: RTP - ccRTP, DTLS - OpenSSL and SIP - Twinkle
This section describes the interaction of the components used to design the unified media security framework. Each library used is described together with its interaction with the other libraries.
7.1.1 OpenSSL
The OpenSSL website provides online documentation of the application programming interface (API) to ease the implementation of a secure socket. However, although DTLS has been supported by OpenSSL for more than a year, it is not mentioned at all in the documentation. Only TLS is mentioned as an optional protocol version.
In order to transmit, the method addDestination is called with the Internet address and port of the destination host.
at http://linux.softpedia.com/get/Security/DTLS-Client-Server-Example-19026.shtml
format introduced by Sun Microsystems (http://www.sun.com). Further information can be found in [49]. When the RTP connection between the two hosts is set up, the DTLS connection is established during initialisation of the transport channel, at the point where the UDP sockets were previously initiated. In the last stage, all parts of the preceding steps have to work together perfectly in order to function as a secured VoIP call. Figure 7.3 on page 54 illustrates the progress at this stage. While stage 3 marks the goal of this thesis, it is not the end of the process. Further implementation work is needed to provide a usable application. These steps are presented in the future work section at the end of this thesis.
8 Testing
The prototype implementation of RTP over DTLS is tested in order to confirm the usability of the approach. There is a wide range of possible tests; however, due to the space and time restrictions of this thesis, not all aspects of RTP over DTLS have been analysed so far.
program was used to calculate the delay of a data packet as the time difference between the transmission and reception timestamps. Plots and summaries of the tests were generated from the report files with Gnuplot (http://gnuplot.info) [50].
8.4 Results
This section presents the results of the experiments. The aim of the performance test is to determine the delay caused by DTLS encryption of RTP traffic. Tests were performed with modified versions of the ccRTP demonstration programs audiorx and audiotx. These applications initiate RTP sessions and transmit audio data from audiotx to audiorx, where audiorx plays the audio data over the system's audio interface. In the original version these applications use the loopback address to simulate RTP traffic on a single machine. By changing the IP addresses used, the programs are capable of transmitting data from one host to another. Audiorx uses a 50 ms jitter buffer to assure a continuous media stream during reception. Jitter is the variation of the packet interarrival time. While the sender is expected to transmit a packet every 20 ms, these packets can be delayed in the network and may not arrive at the same regular interval at the receiver side. The difference between the time a packet is expected and the time it is actually received is the jitter. The jitter buffer conceals the interarrival packet delay variation. Data packets arriving with a delay greater than 50 ms are not played; instead the next packet that arrived is played. In VoIP applications the jitter buffer is flexible in order to adapt to the delay in the current call. In order to analyse RTP over DTLS performance, the RTP over DTLS server and client session objects were initialised in these applications instead of a regular RTP session. In order to obtain comparable results, a 62.5 KB audio file with a play time of 7 seconds was used for transmission to simulate the voice data of a call. Thereby 399 data packets of audio data were transmitted. To account for possible measurement inaccuracy and errors due to the experimental environment, all tests were repeated to verify the results.
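The delay measurement and the effect of the 50 ms jitter buffer can be sketched as follows. This is a simplified illustration and not the measurement program used in the tests: the function names and the timestamps are made up for the example, the clocks of sender and receiver are assumed to be synchronised, and a real jitter buffer works with playout deadlines rather than with a plain per-packet threshold.

```python
from typing import List

JITTER_BUFFER_US = 50_000   # 50 ms jitter buffer, as used by audiorx
SEND_INTERVAL_US = 20_000   # the sender emits one packet every 20 ms


def packet_delays(sent_us: List[int], received_us: List[int]) -> List[int]:
    """Delay per packet: reception timestamp minus transmission timestamp."""
    return [rx - tx for tx, rx in zip(sent_us, received_us)]


def played(delays_us: List[int], limit_us: int = JITTER_BUFFER_US) -> List[bool]:
    """A packet is played only if its delay does not exceed the buffer size."""
    return [delay <= limit_us for delay in delays_us]


# Hypothetical timestamps in microseconds, not measured values.
sent = [i * SEND_INTERVAL_US for i in range(4)]
received = [9_000, 31_000, 95_000, 72_000]

delays = packet_delays(sent, received)
flags = played(delays)
print(delays)
print(flags)
```

In this made-up trace the third packet arrives 55 ms after it was sent and is therefore skipped, while the fourth packet, which arrived before it, is played in its place.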
[Figure: packet delay plot. Y-axis: delay in microseconds.]
The sound file was played continuously without any disturbance, as clearly as it would be played locally.
[Figure: Transmission of Encrypted Audio Data. RTP over DTLS packet delay. Y-axis: delay in microseconds.]
sound file was played continuously without any disturbance, as clearly as it would be played locally.
with a maximum of 92 ms, a minimum of 9 ms and a standard deviation of 7.7 ms. The important values in the results of these experiments are the average delay, the packet loss rate and the standard deviation. The average delay increases by approximately 20 ms when DTLS encryption is used. According to the ITU-T, a delay of 125 ms is noticeable by humans; it therefore recommends that delays should not exceed 150 ms. A delay of 200 to 280 ms still satisfies most users, while delays higher than 300 ms dissatisfy some users, and a delay higher than 400 ms is unacceptable because most users are dissatisfied [51]. Most of the delay in real scenarios is caused by the network infrastructure. For a distance of less than 5000 km, VoIP connections are likely to experience a delay smaller than 150 ms. For intercontinental connections, delays in the mid-200 ms range can be expected, which is not a problem according to the ITU-T because users expect differences from regional calls. Compared directly, the average RTP over DTLS delay is more than twice as long as the regular RTP delay, but the delays should be set in relation to the ITU-T recommendations. An average delay increase of 20 ms thus amounts to about 13% of the recommended 150 ms limit. The small increase (1.6 ms) in the standard deviation is a good result as well: it means that the jitter buffer does not need to be enlarged significantly. RTP over DTLS is therefore well suited for the encryption of live media such as VoIP.
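The summary values used in this discussion, and the relation of the 20 ms increase to the 150 ms recommendation, can be reproduced with a short calculation. The sketch below is illustrative only: the helper names and the sample delays are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which variant was computed.

```python
import statistics
from typing import Dict, List

ITU_T_RECOMMENDED_MS = 150.0   # recommended upper limit quoted in the text


def summarise(delays_ms: List[float]) -> Dict[str, float]:
    """Average, minimum, maximum and standard deviation of packet delays."""
    return {
        "mean": statistics.mean(delays_ms),
        "min": min(delays_ms),
        "max": max(delays_ms),
        "stdev": statistics.pstdev(delays_ms),  # population variant: an assumption
    }


def share_of_recommendation(increase_ms: float,
                            limit_ms: float = ITU_T_RECOMMENDED_MS) -> float:
    """Express a delay increase as a percentage of the recommended limit."""
    return 100.0 * increase_ms / limit_ms


sample = [10.0, 20.0, 30.0]    # hypothetical delays, not measured data
summary = summarise(sample)
share = share_of_recommendation(20.0)
print(summary)
print(round(share, 1))
```

For the 20 ms increase reported above, the calculation gives 20 / 150, which is roughly 13.3% and matches the figure of about 13% quoted in the text.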
Unfortunately, DTLS cannot solve all issues in securing Internet Telephony. Denial-of-service attacks against the SIP infrastructure cannot be prevented by DTLS, since RTP over DTLS is initiated only after the SIP interaction that sets up the session has taken place. For the same reason the approach does not address the issue of SPIT. Authentication can help to solve the issue, since SPIT calls could be traced back to the users, but this would only be possible if all users used DTLS authentication. This is not possible yet, however, since reachability through traditional phones is still desired. In this thesis a prototype of RTP over DTLS was implemented and tested in order to prove the usability of the approach. The upcoming sections summarise and evaluate the test results of the prototype. Furthermore, an outlook is given on future work which will be necessary to reach the goal of developing a unified media security framework for VoIP. The datagram-capable version of TLS was designed to secure media streaming without compromising the quality of the streamed media or the widely accepted security features of TLS. The test results show the good performance of the prototype implementation of RTP over DTLS in comparison to unencrypted RTP. The increase in delay of approximately 20 ms is within an acceptable range and allows secure communication without impact on the quality of the VoIP call. These results allow the planning of the future steps towards a unified media security framework for VoIP, which are presented in the upcoming section.
ment, since this part was developed with almost no documentation from the developers of DTLS in OpenSSL. The next suggested step in further development is improvement at the DTLS level. H. Tschofenig and E. Rescorla introduced the SRTP compatibility mode [9]. With the enhancements to RTP over DTLS presented there, performance can be increased, since the overhead is reduced to a value comparable to that of ZRTP. Subsequently, the performance of the SRTP compatibility mode of RTP over DTLS can be compared experimentally with ZRTP in order to evaluate the approach. The integration into the Twinkle softphone is not completely finished either. This thesis focuses on contributing to the development of a unified security framework that covers all components in the system. Due to time restrictions, the focus lies on the interaction of the RTP and DTLS components in order to provide a basis for further development. A concept for user interaction with the encryption scheme needs to be designed and integrated into the softphone and SIP. The challenge here lies in combining an understanding of what is happening with ease of use in order to achieve acceptance among users. The management of certificates needs to be integrated into the softphone, along with user notification about the security state of the connection and proper error handling upon a possible DTLS handshake failure. Furthermore, on the SIP (and SDP) side of the framework, RTP over DTLS needs to be integrated into the session invitation, so that the caller is able to inform the callee about the wish to establish an RTP over DTLS session when connections are initiated over the SIP network.
Bibliography
[1] Christian Friedrich. Schematic representation of the ssl handshake protocol with two way authentication with certicates Wikipedia, the free encyclopedia, 2007. [Online; accessed August 2007]. [2] N. Modadugu and E. Rescorla. The Design and Implementation of Datagram TLS, 2004. [3] J. Postel. Internet Protocol. RFC 791 (Standard), 1981. Updated by RFC 1349. [4] T. Dierks and E. Rescorla. The Transport Layer Security (TLS) Protocol Version 1.1. RFC 4346 (Proposed Standard), 2006. Updated by RFCs 4366, 4680, 4681. [5] J. Postel. Transmission Control Protocol. RFC 793 (Standard), 1981. Updated by RFC 3168. [6] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson. RTP: A Transport Protocol for Real-Time Applications. RFC 3550 (Standard), 2003. [7] J. Postel. User Datagram Protocol. RFC 768 (Standard), 1980. [8] E. Rescorla and N. Modadugu. Datagram Transport Layer Security. RFC 4347 (Proposed Standard), 2006. [9] E. Rescorla H. Tschofenig. Real Time Transport Protocol (RTP) over Datagram Transport Layer Security. Internet Draft, February 2006.
69
Bibliography
[10] Wikipedia. Voice over ip Wikipedia, the free encyclopedia, 2007. [Online; accessed 22-April-2007]. [11] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, and E. Schooler. SIP: Session Initiation Protocol. RFC 3261 (Proposed Standard), 2002. Updated by RFCs 3265, 3853, 4320, 4916. [12] H. Schulzrinne and C. Agboh. Session Initiation Protocol (SIP)-H.323 Interworking Requirements. RFC 4123 (Informational), 2005. [13] T. Berson. Skype security evaluation, October 2005. [14] R. Hancock, G. Karagiannis, J. Loughney, and S. Van den Bosch. Next Steps in Signaling (NSIS): Framework. RFC 4080 (Informational), 2005. [15] E. Rescorla. HTTP Over TLS. RFC 2818 (Informational), May 2000. [16] Shamir A. Rivest, R. and L.M. Adleman. Cryptographic communications system and method. US Patent 4405829, 1977. [17] P. Karn and W. Simpson. Photuris: Session-Key Management Protocol. RFC 2522 (Experimental), 1999. [18] D. Brezinski and T. Killalea. Guidelines for Evidence Collection and Archiving. RFC 3227 (Best Current Practice), 2002. [19] H. Schulzrinne. The tel URI for Telephone Numbers. RFC 3966 (Proposed Standard), 2004. [20] Voice over miscongured internet telephones - (vomit). http://vomit.xtdnet.nl/. [21] M. Baugher, D. McGrew, M. Naslund, E. Carrara, and K. Norrman. The Secure Realtime Transport Protocol (SRTP). RFC 3711 (Proposed Standard), 2004.
[22] P. Zimmermann, A. Johnston (Ed.), and J. Callas. ZRTP: Media Path Key Agreement for Secure RTP. Internet Draft, 2007.
[23] Jörg Schwenk, André Adelsbach, and Mark Manulis. VoIPSEC Studie. Technical report, Bundesamt für Sicherheit in der Informationstechnik.
[24] S. Kent and K. Seo. Security Architecture for the Internet Protocol. RFC 4301 (Proposed Standard), 2005.
[25] S. Kent. IP Encapsulating Security Payload (ESP). RFC 4303 (Proposed Standard), 2005.
[26] C. Kaufman. Internet Key Exchange (IKEv2) Protocol. RFC 4306 (Proposed Standard), 2005.
[27] E. Rescorla. Diffie-Hellman Key Agreement Method. RFC 2631 (Proposed Standard), 1999.
[28] D. Maughan, M. Schertler, M. Schneider, and J. Turner. Internet Security Association and Key Management Protocol (ISAKMP). RFC 2408 (Proposed Standard), 1998. Obsoleted by RFC 4306.
[29] D. Piper. The Internet IP Security Domain of Interpretation for ISAKMP. RFC 2407 (Proposed Standard), 1998. Obsoleted by RFC 4306.
[30] H. Orman. The OAKLEY Key Determination Protocol. RFC 2412 (Informational), 1998.
[31] H. Krawczyk. SKEME: A versatile secure key exchange mechanism for Internet. In Proceedings of the 1996 Symposium on Network and Distributed System Security (SNDSS 96), 1996.
[32] Wikipedia. IPsec - Wikipedia, the free encyclopedia, 2007. [Online; accessed June 2007].
[33] S. Kent. IP Authentication Header. RFC 4302 (Proposed Standard), 2005.
[34] Whitfield Diffie, Paul C. van Oorschot, and Michael J. Wiener. Authentication and authenticated key exchanges. Designs, Codes and Cryptography, 2(2):102-125, 1992.
[35] S. Casner and V. Jacobson. Compressing IP/UDP/RTP Headers for Low-Speed Serial Links. RFC 2508 (Proposed Standard), 1999.
[36] C. Bormann, C. Burmeister, M. Degermark, H. Fukushima, H. Hannu, L-E. Jonsson, R. Hakenberg, T. Koren, K. Le, Z. Liu, A. Martensson, A. Miyazaki, K. Svanbro, T. Wiebke, T. Yoshimura, and H. Zheng. RObust Header Compression (ROHC): Framework and four profiles: RTP, UDP, ESP, and uncompressed. RFC 3095 (Proposed Standard), 2001. Updated by RFCs 3759, 4815.
[37] T. Koren, S. Casner, J. Geevarghese, B. Thompson, and P. Ruddy. Enhanced Compressed RTP (CRTP) for Links with High Delay, Packet Loss and Reordering. RFC 3545 (Proposed Standard), 2003.
[38] D. Ignjatic, L. Dondeti, F. Audet, and P. Lin. MIKEY-RSA-R: An Additional Mode of Key Distribution in Multimedia Internet KEYing (MIKEY). RFC 4738 (Proposed Standard), November 2006.
[39] Joan Daemen and Vincent Rijmen. The Design of Rijndael: AES - The Advanced Encryption Standard. Springer-Verlag, 2002.
[40] ISO/IEC. Information technology - Security techniques - Code of practice for information security management, June 2005.
[41] E. Rescorla and N. Modadugu. Extensions for DTLS in low bandwidth environments. draft-rescorla-tls-partial-00, October 2005.
[42] E. Rescorla. TLS partial encryption mode. draft-rescorla-tls-partial-00, October 2005.
[43] A. Kapoor, R. Tschalar, and T. Kause. Internet X.509 public key infrastructure: transport protocols for CMP. Internet Draft, February 2004. http://tools.ietf.org/id/draft-ietf-pkix-cmp-transport-protocols-05.txt.
[44] H. Tschofenig and J. Fischl. Session Initiation Protocol (SIP) for media over Transport Layer Security (TLS), February 2006.
[45] H. Tschofenig and J. Fischl. Session Description Protocol (SDP) indicators for Datagram Transport Layer Security (DTLS). draft-fischl-mmusic-sdp-dtls-00, February 2006.
[46] The OpenSSL project. http://www.openssl.org.
[47] The GNU ccRTP library. http://www.gnu.org/software/ccrtp/.
[48] The Twinkle softphone project. http://www.twinklephone.com/.
[49] Header file for the au-file format. http://www.opengroup.org/public/pubs/external/auformat.html.
[50] Gnuplot. http://www.gnuplot.info/.
[51] International Telecommunication Union. Recommendation G.114 - One-way Transmission Time. Series G: Transmission Systems and Media, Digital Systems and Networks, May 2003.