Network Working Group                                         E. Carrara
INTERNET-DRAFT                                               F. Lindholm
Expires: December 2001                                        M. Naslund
                                                              K. Norrman
                                                                J. Arkko
                                                                Ericsson
                                                              July, 2001


                 Key Management for Multimedia Sessions


Status of this memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/lid-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Abstract

   Some work on securing real-time applications has started to appear,
   and it has brought with it the need for a key management
   infrastructure to support the security protocol. Such key management
   has to fulfil requirements suitable for conversational multimedia in
   heterogeneous environments.

   This document describes a key management scheme that can be used for
   real-time applications, and shows some examples of how such a scheme
   can be implemented.

Carrara, et al.                                                 [Page 1]

INTERNET-DRAFT                mm-kmgt-sol                      July 2001

TABLE OF CONTENTS

   1. Introduction
   1.1. Existing solutions
   1.2. Outline
   1.3. Notational Conventions
   2. Scenarios
   3. Approaches
   4. Basic Key Management Schemes
   4.1. Pre-shared key
   4.2. Public-key encryption
   4.3. Diffie-Hellman key exchange
   4.4. Parameter Negotiation
   4.5. Session Key Calculation and Key Refresh
   4.5.1. Assumptions
   4.5.2. Notation
   4.5.3. Description
   4.6. Re-keying
   4.7. Implementation issues
   4.8. Reliability
   5. Headers to support key management
   5.1. Common Attributes for key management
   5.2. Format specification
   5.2.1. Identities
   5.2.2. Timestamps
   5.3. Key management schemes integrated in SDP
   5.3.1. Using the key field to transport the key
   5.3.2. SDP Attribute fields to support the key management
   5.4. Error handling
   5.5. SDP Examples
   6. SDP-based key management
   6.1. Initiator's and responder's behavior
   6.2. Key management with SIP
   6.2.1. Integration
   6.2.2. Re-keying
   6.3. Key management with RTSP
   6.3.1. Integration
   6.3.2. Re-keying
   6.3.3. Examples
   6.4. Groups
   7. Security Considerations
   7.1. General
   7.2. Key refresh
   7.3. Re-keying
   8. Conclusions
   9. Acknowledgments
   10. Author's Addresses
   11. References
   Appendix A

1. Introduction

   There has recently been work to define a security framework for the
   protection of real-time applications running over RTP [SRTP].
   However, a security protocol needs a key management infrastructure
   in the background to, e.g., exchange keys and security parameters
   and to manage and refresh keys.

   Such a key management scheme has to fulfil some fundamental
   properties, due to the kind of real-time applications (streaming,
   unicast, groups, multicast, etc.) and to the heterogeneous nature of
   the scenarios we are dealing with. [REQS] lists in detail the
   requirements for key management to work for conversational
   multimedia in heterogeneous environments.

   Following the requirements derived in [REQS], we discuss here some
   scenarios and a key management solution for the media session. That
   is, the focus is on how to set up key management for secure
   multimedia sessions such that the requirements of heterogeneous
   environments are fulfilled.

1.1. Existing solutions

   Work is ongoing in the IETF to develop key management schemes.
   For example, IKE [IKE] is a widely accepted scheme for unicast, as
   key management for IPsec, and the MSEC WG is developing schemes
   aimed at group communication [GKM]. For reasons discussed in
   Section 3 and in [REQS], there is however a need for a scheme
   optimized for demanding cases such as real-time data over
   heterogeneous networks.

1.2. Outline

   Section 4 defines a framework for three key exchange methods for the
   key management scheme. The key management scheme provides the
   parties involved with a master key and all the necessary information
   to secure their communication. It also specifies how to derive
   session keys from the agreed-upon master key, and how to perform
   so-called key refresh.

   Section 5 specifies the key management information to be exchanged
   in order to secure media traffic end-to-end. In particular,
   appropriate SDP attributes are defined to carry such information.

   The Session Description Protocol (SDP) [SDP] is often used to
   describe real-time applications, and may be carried by control
   protocols like SIP [SIP] and RTSP [RTSP]. Therefore, Section 6 shows
   further proposals for integrating the key management scheme inside
   the SDP part carried by the control protocol.

1.3. Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119.

2. Scenarios

   In the following we identify some typical scenarios that involve the
   applications we are dealing with.

   a) unicast, e.g. a SIP-based [SIP] call between two users

   b) one-to-many, e.g. pay-per-view, mainly characterized by a server
      streaming to many receivers

   c) many-to-many, without a centralized controller unit, e.g. for
      small groups

   d) many-to-many, with a centralized controller unit, e.g.
      for larger groups

   The key management solutions may be different in the above
   scenarios; in particular, scalability is an issue for multicast
   applications and big groups, and we refer to the MSEC WG for related
   work. In the following we concentrate on unicast, one-to-many, and
   small-size groups.

3. Approaches

   [REQS] lists requirements to be fulfilled by a key management scheme
   for the media session in a heterogeneous environment.

   Moreover, we recognize two basic approaches for including a key
   management scheme in the call set-up phase: use of a standalone key
   management protocol (e.g., IKE), or use of some scheme piggy-backed
   on the set-up and control protocols that are already in use (e.g.,
   SIP/SDP).

   The main advantage of the first approach is the use of standard
   protocols whose security is well tested, and the fact that it is
   generally built as a complete framework, with all the necessary
   functions. Moreover, the key management is completely separate from
   the call control protocol, making the key management solution work
   in many different scenarios and for different applications. However,
   many of the current standalone protocols are designed for fixed
   networks, where time consumption and complexity are not too
   critical.

   The identified requirements may suggest parallelizing the key
   management with other protocols whenever possible, and moving to the
   background whatever is feasible to move (i.e., re-keying). Of
   course, it is worth remembering that several other optimizations are
   possible; e.g. caching (of keys and security parameters which have
   been exchanged during a previous contact) allows the key management
   phase to be skipped almost completely or reduced considerably.

   A tailor-made scheme could reduce complexity and time consumption. A
   piggy-backed approach can be designed specifically to provide the
   necessary set of functions without imposing any extra set-up time.
   Therefore, a piggy-backed solution for the proposed key management
   scheme is described in later Sections.

4. Basic Key Management Schemes

   We start with some terminology.

   Session: a unidirectional data flow (note that this definition may
   override the one for specific protocols, e.g. RTP).

   Session keys: the set of keys that enter the encryption and/or
   integrity protection algorithms during a session.

   Master key: a bit-string agreed upon by two or more parties,
   associated with a session or predefined collection of sessions (e.g.
   a multimedia session) between said parties. From the master key,
   actual session keys are subsequently generated by a predetermined,
   agreed-upon strategy, in a synchronized way, during the lifetime of
   the master key, without need for further communication. The master
   key will typically, though not always, be fixed for the duration of
   the session(s) it applies to. It is this master key (and not the
   corresponding session keys) that is actually exchanged by the
   key-exchange mechanism.

   Cryptographic context: a data structure containing the master key
   and other security parameters (e.g. selected algorithms), associated
   with a session or set of sessions. In the latter case, when several
   sessions share the context, the underlying security protocol MUST
   ensure avoidance of security compromising effects such as 'two-time
   pads'. (This is typically done by letting session-specific data
   enter the security algorithms, for instance in the form of
   initialization vectors, IVs.)

   Key-refresh: the process of generating new session key(s), where the
   only non-public information is taken from the master key, and/or
   possibly previous session key(s).

   Re-keying: the process of re-negotiating the master key (and
   consequently future session key(s)) during an on-going session.

   In the following, the terms 'initiator' and 'responder' refer to the
   key exchange itself.
   [] denotes an optional piece of information, H() denotes a
   cryptographic hash function, E(k,m) (D(k,m)) denotes encryption
   (decryption) of m with the key k, and || denotes concatenation.
   Sign(k,m) is the signature of message m with key k. Further, by PK_x
   we mean x's public key, while SK_x is x's corresponding secret key.
   Note that the keys used for encryption and for signing in general
   MUST be different, though for simplicity we use the same notation in
   both cases. For a fixed, agreed-upon multiplicative group (G,*), for
   g in G and a natural number x, we let g^x denote g*g*..*g (x times).
   Choices for the functions are given in Section 5.

   The following sections propose three different ways to exchange a
   master key: with the use of a pre-shared key, public-key encryption,
   and Diffie-Hellman (DH) key exchange. They also describe methods for
   session key calculation, key-refresh, and re-keying. In the
   following we assume unicast communication.

4.1. Pre-shared key

   If a pre-shared key (s = auth_key || encr_key) exists between the
   two parties, the key exchange is done according to Figure 4.1. The
   master key (k_m) is randomly chosen by the initiator and then sent
   to the responder, encrypted with the pre-shared key. T is a
   timestamp added to prevent replay attacks. The encrypted part
   (master key and timestamp) MUST be integrity protected (AuthTag).
   Hence, it is assumed that the pre-shared secret, s, consists of key
   material for both the encryption (encr_key) and the integrity
   protection (auth_key). The identity IDa MAY be sent so that the
   correct pre-shared key can be selected.

              A                                      B

   [IDa], U=E(encr_key,k_m || T),  |
   AuthTag(auth_key,U)             |---------->
                                            D(encr_key,U) = k_m || T
                                   <-------|
                                            [H(IDa || IDb || k_m)][IDb]

   Figure 4.1. Pre-shared key based exchange, where k_m is randomly
   chosen, and s is the pre-shared secret.

   The peers will be indirectly authenticated to each other by the fact
   that they will be able to derive the same key.

   The responder MAY return a verification message showing that it
   knows the proposed master key. This is done by applying a hash
   function H (e.g. MD5, SHA-1 [MD5, SHA1]) to the master key and the
   peers' identities (IDx), and then including the hash value in the
   reply.

   Note that the pre-shared case is, by far, the most efficient way to
   handle the key exchange, because it uses symmetric cryptography
   only. This approach also has the advantage that only a small amount
   of data has to be exchanged. The drawback, of course, is
   scalability.

4.2. Public-key encryption

   Public-key cryptography can be used to create a scalable system. A
   disadvantage of this approach is that it consumes more resources
   than the pre-shared key approach. Another disadvantage is that a PKI
   (Public Key Infrastructure) is needed (in most cases) to handle the
   distribution of public keys.

              A                                      B

   U=E(PK_b,k_m || T [|| IDa]),  |
   [H(certB)],                   |-------->
   [F=Sign(SK_a,H(U))],          |
   [[E(PK_b,]certA[)]]           |
                                       D(SK_b,U) = k_m || T [|| IDa]
                                 <----------|
                                       [H(IDa || IDb || k_m)]

   Figure 4.2. Key exchange with public keys.

   The key exchange is done according to Figure 4.2. The initiator
   encrypts a randomly chosen value k_m, to be used as the master key,
   with the responder's public key (which the initiator already has)
   and sends the result to the responder. The encrypted data also
   contains a timestamp T and optionally the ID of the initiator. Note
   that sending the ID encrypted provides identity protection.

   The initiator MAY include a hash of the certificate of the public
   key used to encrypt k_m. This means that A already possesses at
   least one of B's certificates. Note that, if the hash of B's
   certificate is included, identity protection for B is not
   guaranteed.
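
   The optional verification value that appears in Figures 4.1 and 4.2,
   H(IDa || IDb || k_m), could be computed and checked along the lines
   of the minimal Python sketch below. It is an illustration only: the
   function names are hypothetical, the identities are assumed to be
   already available as byte strings, the concatenation is assumed to
   be a plain byte concatenation without length fields, and SHA-1 is
   picked as H merely because it is one of the hash functions named in
   this document.

```python
import hashlib
import hmac


def verification_hash(id_a: bytes, id_b: bytes, k_m: bytes) -> bytes:
    """Responder side: compute H(IDa || IDb || k_m), here with H = SHA-1.

    Assumption: '||' is plain byte concatenation of already-encoded values.
    """
    return hashlib.sha1(id_a + id_b + k_m).digest()


def check_verification(id_a: bytes, id_b: bytes, k_m: bytes,
                       received: bytes) -> bool:
    """Initiator side: recompute the hash and compare in constant time."""
    expected = verification_hash(id_a, id_b, k_m)
    return hmac.compare_digest(expected, received)
```

   A matching value tells the initiator that the responder holds the
   same k_m; it does not replace the integrity protection or signature
   on the initiator's own message.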
   The responder decrypts the received value to obtain the master key
   k_m. The authentication is indirect, but not in both directions. The
   initiator can be sure that the only one that can obtain the master
   key is the responder, but the responder cannot be sure that it was
   the initiator who sent the key. Therefore, the initiator SHOULD also
   sign the hash of the message to the responder. The initiator MAY
   send its own certificate so that the responder can verify the
   signature, and the certificate MAY be encrypted to provide identity
   protection.

   The responder MAY send a verification message (as in the pre-shared
   case) to the initiator, which proves that the responder has received
   the master key correctly.

   Certificate handling is in general complex; the scheme shown here is
   not the only one possible. For example, it is possible for B to
   fetch A's certificate via other means. Verification of certificates
   against Revocation Lists is not treated here, but may add extra
   delay.

   Certificate handling may also involve a number of additional tasks
   not shown here, which affect the inclusion of certain parts of the
   message. The following observations can, however, be made:

   - the party A typically has to find the certificate of B in order to
     send the first message. If A doesn't have B's certificate already,
     this may involve one or more roundtrips to a central directory
     agent.

   - it will be possible for A to omit its own certificate and rely on
     B getting this certificate using other means. We recommend doing
     this, however, only when B can be reasonably expected to have
     cached the certificate from a previous connection. Otherwise
     accessing the certificate would mean additional roundtrips for B
     as well.

   - verification of the certificates using Certificate Revocation
     Lists (CRLs) or an on-line verification protocol may mean
     additional roundtrips for both parties.
     If a small number of roundtrips is required for acceptable
     performance, it may be necessary to omit some of these checks.

4.3. Diffie-Hellman key exchange

   The possibility of using a Diffie-Hellman (DH) key exchange method
   is also offered. This approach, however, in general has a higher
   resource consumption (both computationally and in bandwidth) than
   the previous ones.

              A                                      B

   [IDa], g^x, T,           |
   Sign (SK_a,g^x || T),    |
   [[E(PK_b,]CertA[)]]      |-------->
                            <---------|  g^y, Sign (SK_b,g^y || T),
                                      |  [[E(PK_a,]CertB[)]]

   k_m=g^(xy)                            k_m=g^(xy)

   Figure 4.3. Diffie-Hellman key based exchange, where x and y are
   randomly chosen by A and B, respectively.

   The key exchange is done according to Figure 4.3. The initiator
   chooses a random value x, and sends the signed g^x to the responder
   (optionally including its certificate or encrypted certificate, or
   an identity to retrieve it). The group parameters (e.g., g) are a
   set of pre-agreed parameters. The responder chooses a random value
   y, and sends the signed g^y to the initiator (optionally also
   providing a certificate). Both parties then calculate the master key
   g^(xy).

   T is a timestamp included in the signature to prevent replay
   attacks. The responder inserts the received timestamp in its
   response.

   The authentication comes from the signing of the DH key, and is
   necessary to avoid man-in-the-middle attacks.

   This is the most expensive approach: each side has to compute one
   signature, one signature verification, and two exponentiations.

4.4. Parameter Negotiation

   Apart from the exchange and management of keys, a complete security
   solution also needs negotiation of security parameters. Which
   parameters are to be negotiated depends on the selected security
   protocol. A concrete example of mandatory security parameters in the
   case of the SRTP protocol is given in Appendix A.

4.5. Session Key Calculation and Key Refresh

   In the following we define a method to derive session keys from the
   master key, and to perform key-refresh. The method works for master
   keys of sizes up to 256 bits.

   By a packet index, we mean a one-to-one correspondence between the
   packets of a data flow belonging to a security association between
   the communicating parties and the set of s-bit integers for some
   number s. Typically, the first packet is assigned index j for a
   random j, and the i-th consecutively transmitted packet will have
   index (j + i) mod 2^s. The index, or parts thereof, is typically
   contained in the packet itself.

   At most 2^s packets can be communicated using a fixed cryptographic
   context. Thereafter a new master key MUST be selected.

4.5.1. Assumptions

   We assume that the following parameters are in place (to be
   exchanged as security parameters, in connection with the actual key
   exchange):

   k_m:   the master key, which MUST be random and kept secret. If the
          master key size is not in the set {128, 192, 256}, it MUST be
          zero-appended to fit the nearest size above.

   The following parameters MAY be sent in the clear.

   r:     a 48-bit refresh rate value (r = 0 means "no refresh")

   e_len: desired session encryption key-length for the security
          protocol

   a_len: desired session authentication key-length for the security
          protocol

   We assume that for each set of sessions sharing the same
   cryptographic context, some unique id for the individual session,
   s_id, of size (at most) 72 bits is available. (If there exists an
   "IV-formation" for the underlying encryption schemes that guarantees
   that all sessions sharing the same cryptographic context have unique
   IVs, then s_id MAY be the same for all those sessions, as discussed
   above in the definition of cryptographic context; see the discussion
   in Section 7.)
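
   Two of the rules stated so far in this section, the zero-appending
   of the master key and the wrap-around of the packet index, can be
   illustrated with the following minimal Python sketch. The function
   names are hypothetical and not part of the scheme, and the sketch
   assumes that the master key length is a whole number of bytes.

```python
def pad_master_key(k_m: bytes) -> bytes:
    """Zero-append k_m to the nearest size above in {128, 192, 256} bits.

    Assumption: the key length is a whole number of bytes.
    """
    for size in (16, 24, 32):  # 128, 192 and 256 bits, in bytes
        if len(k_m) <= size:
            return k_m + bytes(size - len(k_m))
    raise ValueError("master keys larger than 256 bits are not covered")


def packet_index(j: int, i: int, s: int) -> int:
    """Index of the i-th consecutive packet when the first index is j."""
    return (j + i) % (1 << s)
```

   For example, an 80-bit master key would be extended with 48 zero
   bits to 128 bits, and with s = 48 the packet following index
   2^48 - 1 has index 0. Since at most 2^s packets may be sent under
   one cryptographic context, a new master key has to be in place
   before an index value would be reused.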
   We also assume that each packet to be encrypted has an (at least)
   48-bit index which is incremented by one for each sent packet.

   We define two constants, encr, having the integer value "0"
   (represented as an 8-bit binary string), and auth, having the
   integer value "1" (also 8 bits). Other values MAY be specified by
   future applications (at most 256 different values are possible).

4.5.2. Notation

   x div y   x divided by y, rounded down (we define x div 0 = 0 for
             all x).

   Let AES(k,x) be AES [AES] in encrypt mode applied to key k and
   message block x, and set

      AES^0(k, x) = x
      AES^i(k, x) = AES(k, AES^(i-1)(k,x))   for i >= 1,

   e.g. AES^3(k,x) = AES(k, AES(k, AES(k, x))). (Note: this is AES in
   OFB mode.)

      R(k,x,t) = AES^1(k,x) || AES^2(k,x) || ... || AES^t(k,x).

   Note that R will produce 128*t bits as output.

4.5.3. Description

   To generate an encryption/decryption key for the packet having index
   n, belonging to the session having session id s_id, the following
   steps are performed:

   1. let a_n = n div r (the "key-sequence number") and t = e_len/128
      (rounded up to the closest integer)

   2. form the value v_n = a_n || encr || s_id (possibly padded by
      zeros to fill one 128-bit block)

   3. the key is derived as k_n = R(k_m,v_n,t), truncated to its e_len
      most significant bits

   Efficiency: For typical parameter/key size values, this can be
   performed by a single AES encryption. Note that the above steps need
   only be performed for every r-th packet of the session(s).

   Note that even if no refresh is desired (r = 0), the above is still
   performed once to obtain the first (and only) key. The master key
   should not be used directly, due to the possibility that it may not
   be truly random or pseudo-random (e.g. the DH-key may not always be
   viewed as such).

   Key refresh for message authentication is the same, replacing the
   constant encr by the constant auth, and e_len by a_len.

4.6. Re-keying

   A re-keying mechanism is necessary, e.g. when the key is
   compromised, when access control is desired, or simply when the key
   expires. Therefore, re-keying MUST be supported.

   Re-keying is performed by executing the key exchange protocol again
   before the master key expires. The parameters needed to support the
   re-keying procedure are the new master key and (when applicable) a
   signature (with the corresponding parameters as defined in Sections
   4.2 and 4.3). It may be necessary to specify the key lifetime range,
   e.g. to trigger a new re-keying procedure during the on-going
   session.

   The parameters for re-keying are the following:

   k_m is the new master key.

   n_start and n_end define the lifetime range of the master key k_m.
   The lifetime range may be expressed in terms of time, packet index,
   etc. The employed security protocol MUST specify which unit is used.

   By default, the 'n_start' value SHOULD mean that the key is valid
   from the first observed packet. For the 'n_end' value, the default
   is that the key is valid 'until further notice'.

   Certificates and other information provided in the first exchange
   MAY be omitted, i.e. all other parameters are optional. See also
   Section 6.2 and Section 6.3 for how this could be done with SIP and
   RTSP.

   Note that the initial key-exchange during a session is treated as a
   "re-keying", though no "previous" keying has yet taken place.

4.7. Implementation issues

   The only key exchange method mandatory to implement is the
   public-key based one.

   Default values must exist for all parameters wherever possible.

4.8. Reliability

   The basic processing applied to ensure protocol reliability is the
   following. The transmitting entity (initiator or responder) MUST:

   1. Set a timer and initialize a retry counter

   2. If the timer expires, resend the message and decrease the retry
      counter

   3. If the retry counter reaches zero (0), the event MAY be logged in
      the appropriate system audit file

   4. The protocol machine clears all states and returns to IDLE

   When the key management protocol is integrated with a transporting
   protocol, the reliability scheme of that protocol is used.

5. Headers to support key management

   We address key management schemes for end-to-end (e2e) security of
   the media traffic.

5.1. Common Attributes for key management

   This subsection describes common attributes that should be included
   when the key exchange protocol is applied. The attributes are
   defined in a text-based format so that they can easily be integrated
   in text-based protocols (such as SDP and RTSP). Due to the
   text-based nature, all non-text values (such as keys) MUST be base64
   encoded. If a value or attribute field is not provided, the default
   value MUST be used.

   To make it possible to detect the applied key exchange mechanism,
   the initiator MUST specify the key exchange type used:

      kxg-keytype:t=

   The t= value defines the type of key exchange that is used. There
   are three defined types: "PS" = pre-shared key, "PK" = public key,
   and "DH" = Diffie-Hellman. If the attribute is not specified, the
   default value is "PK".

   To transport the encrypted master key or the public DH-value, the
   following attribute is defined:

      kxg-key:k= t=

   The k= value is the encrypted key, base64 encoded. If the protocol
   transporting the attributes already has a mechanism for transporting
   the key, that mechanism MAY be used instead (e.g. the "key=" field
   in SDP). The t= value is used in the pre-shared case for integrity
   protection, and is the base64 encoded authentication tag of the
   encrypted data.

   Policies for the key usage and general parameters should be set
   according to:

      kxg-gen-par:r= s= d= e= a= h= b=

   where the r= value (expressed as a number of packets, cf. the r
   parameter in Section 4.5) is a decimal value.
   If the value is set to 0, this is equal to no key-refresh (except
   the initial key-derivation, which always takes place through the
   refresh mechanism). The default value is 0.

   The s= and d= values specify the lifetime range of the master key
   k_m; the range and its default values are defined in Section 4.6.

   The e= and a= values are described in Section 4.5.1 (e_len and
   a_len). Both are decimal values, and the default value of each is
   128.

   It might be necessary to use a hash function (e.g. for the signing).
   Therefore, the hash function used is specified by the h= value. The
   defined values are "SHA1", "SHA256", "SHA384", "SHA512", and "MD5".
   The default value is "SHA1".

   In the pre-shared case and the public-key case, the initiator MUST
   indicate whether it wants a response from the responder, by setting
   the b= value to "reply" or "no-reply". The default value is "reply".

      kxg-userid:i=

   The i= value specifies the identity of the party. If the identity is
   not specified, it is assumed that the other party will get the
   identity in some other way.

   In the pre-shared case and the public-key case, the responder MUST
   include the hash-value in the response when requested by the
   initiator:

      kxg-response-hash:h=

   Special attribute field for the pre-shared key case:

      kxg-encrkey-alg:e= [] a= []

   The kxg-encrkey-alg attribute specifies, in the e= value, the
   symmetric algorithm (and related mode) used to encrypt the master
   key; the optional field that follows it carries additional
   parameters for the encryption algorithm (e.g. IV and key size). The
   a= value specifies the integrity protection mechanism. Other
   integrity protection specific parameters that may be necessary are
   given in the optional field that follows it. The default encryption
   algorithm is "CM-AES", and the default authentication algorithm is
   "UMAC-2/16" (see [UMAC] for more information).

   For "CM-AES" we define the following encryption parameters:

      i= s=

   The key size parameter refers to the length of the pre-shared key
   (encr_key) used to encrypt the master key and timestamp. The default
   value is 128.
   No additional parameters are defined for UMAC.

   Additional attribute fields for the public key and the DH case:

      kxg-cert:t= c=

   The kxg-cert attribute specifies the type of the certificate used
   for signing (the t= value), and the certificate itself (the c=
   value). The defined t= value is "X.509", which is also the default.
   The certificate MUST be base64-encoded. If no certificate is
   provided, it is assumed that the receiver already has it or can get
   it by other means. If more than one certificate is to be provided
   (e.g. one for the signature key and one for the encryption key),
   each certificate MUST be specified in a separate attribute field.

      kxg-signature:s=
      kxg-cert-used:h=
      kxg-time:t=

   The signature SHOULD be placed in the kxg-signature attribute field
   and base64 encoded. Because the receiver might have several
   certificates, it can be useful to include, in the kxg-cert-used
   attribute field, a hash of the receiver's certificate that was used
   to encrypt the key. The kxg-time attribute carries the timestamp.

   The only public key encryption and signing method that MUST be
   supported is RSA/PKCS#1; others are optional.

      kxg-DH:g=

   The kxg-DH attribute is used to specify the group for
   Diffie-Hellman. The valid pre-defined groups are the first, second
   and fifth OAKLEY groups defined in [OAKLEY]. They are identified by
   the g= values "OAKLEY1", "OAKLEY2" and "OAKLEY5", respectively. The
   fifth OAKLEY group is the default value.

   Negotiation of crypto suites has to be provided. The main
   description shall appear as follows:

      kxg-sec:s= []

   where the s= value indicates the selected security protocol, and the
   optional field that follows lists the necessary security parameters
   for the selected security protocol. The security protocol has to
   specify its related security parameters in a profile definition, as
   done e.g. for the SRTP profile (see Appendix A).

   Table 1 specifies which attributes can be expected in the different
   messages.
   The attributes SHOULD be sent in the code-related order listed in
   the table, in order to avoid later interoperability issues between
   initiator and responder.

                        Pre-shared   Public Key   Diffie-Hellman

   kxg-keytype          I,{R}        I,{R}        I,{R}
   kxg-encrkey-alg      I,{R}        -            -
   kxg-DH               -            -            I,{R}
   kxg-key              I            I            I,R
   kxg-gen-par          I,R          I,R          I,R
   kxg-userid           I,R          -            I
   kxg-response-hash    R            R            -
   kxg-time             -            -            I
   kxg-cert-used        -            I            I,R
   kxg-cert             -            I            I,R
   kxg-signature        -            I            I,R
   kxg-sec              I,{R}        I,{R}        I,{R}

   Table 1. Specification of the attributes used by the initiator (I)
   and the responder (R), respectively, in the different key exchange
   methods. {} indicates that the attribute may be used by that peer
   when the initiator has guessed wrong, in order to negotiate
   (Section 6.1).

5.2. Format specification

5.2.1. Identities

   The id is a uniquely-defined identifier. The defined id types are
   the Network Access Identifier (NAI) [NAI] and the Uniform Resource
   Identifier (URI) [URI]. Other ids MAY be supported.

5.2.2. Timestamps

   The timestamp SHOULD be as defined in NTP [NTP], i.e. a 64-bit
   number in seconds relative to 0h on 1 January 1900. An
   implementation must be aware of (and take into account) the fact
   that the counter will overflow approximately every 136 years.

5.3. Key management schemes integrated in SDP

   An efficient way of performing key management is to integrate the
   scheme from Section 4 into SDP, when the latter is used to carry the
   description of the media sessions. This approach can reduce the
   number of roundtrips compared to a standalone key management scheme
   (and in some cases it could also reduce the total bandwidth
   consumption). Keys and crypto suites are therefore transported
   within SDP, as described in this section.

5.3.1. Using the key field to transport the key

   In SDP, a key field ("key=") is defined to transport keys.
   Therefore, this field SHOULD be reused to transport the encrypted
   master key (or the public DH value).
The "key=" field defined in SDP [SDP] is currently defined as:

   key=<method>:<encryption key>

<method> defines how the encryption key is obtained. Four different methods to obtain the key are defined: "clear" (the key is sent untransformed in the key field), "base64" (the key is sent base64-encoded), "uri" and "prompt". <encryption key> is the symmetric key to be used to protect the media. For the "uri" method, a URI is included to indicate where to fetch the key, and for the "prompt" method no information about the key is included.

The encrypted master key (or the public DH value) SHOULD be carried base64-encoded in the key field, i.e.:

   key=base64: []

5.3.2. SDP Attribute fields to support the key management

For the integration with SDP, the fields in Section 5.1 are placed in attribute fields with the additional "x-" prefix (which indicates that the SDP attribute is experimental). This results in the following attributes:

   a=x-kxg-keytype:t=
   a=x-kxg-encrkey-alg:e= [] a= []
   a=x-kxg-DH:g=
   a=x-kxg-key:k= t=
   a=x-kxg-gen-par:r= s= d= e= a= h= b=
   a=x-kxg-userid:i=
   a=x-kxg-response-hash:h=
   a=x-kxg-time:t=
   a=x-kxg-cert:t= c=
   a=x-kxg-signature:s=
   a=x-kxg-cert-used:h=
   a=x-kxg-sec:s= []

Note that the kxg-key attribute should only be used if SDP's "key=" field is not used.

It should be noted that the SDP packet might itself be authenticated (signed), in which case it is not necessary to include the signature attribute.

5.4. Error handling

All errors due to the key exchange protocol SHOULD be handled by the transport mechanism (e.g. the call set-up protocol). If the responder does not support the set of parameters offered by the initiator, the error message SHOULD include the parameters that the responder does support (see Section 6.1).

5.5. SDP Examples

This section shows some examples of SDP descriptions from the initiator.
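As a complement to the static examples in this section, the following is a minimal Python sketch of how an initiator might assemble the "key=" line of Section 5.3.1 and the "x-"-prefixed attribute lines of Section 5.3.2. The function names sdp_key_line and sdp_kxg_attribute are hypothetical and not part of this specification; the sketch only formats text and performs no encryption.

```python
import base64


def sdp_key_line(encrypted_key: bytes) -> str:
    """Format a key= line carrying the (already encrypted) master key
    or public DH value base64-encoded, as in Section 5.3.1.
    Hypothetical helper, not defined by the draft."""
    return "key=base64:" + base64.b64encode(encrypted_key).decode("ascii")


def sdp_kxg_attribute(name: str, params: dict) -> str:
    """Format a Section 5.1 attribute as an experimental SDP attribute.

    name   -- attribute name without prefix, e.g. "kxg-keytype"
    params -- parameter letter -> value, emitted in insertion order
    """
    body = " ".join(f"{letter}={value}" for letter, value in params.items())
    return f"a=x-{name}:{body}"


if __name__ == "__main__":
    print(sdp_kxg_attribute("kxg-keytype", {"t": "PS"}))
    print(sdp_key_line(b"\x00\x01\x02"))
    print(sdp_kxg_attribute("kxg-gen-par", {"r": 100, "d": 50000}))
```

The parameter dictionary is emitted in insertion order, so the caller controls the order in which parameters appear on the attribute line.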
Example 1 (integrated with PS):

   a=x-kxg-keytype:t=PS
   key=base64:kaDhDGE67FGD4JKkioSuiweiTYWEgDhywh== ghjs5a452Hjidmc09vI=
   a=x-kxg-gen-par:r=100 d=50000
   a=x-kxg-userid:i=void@dev.null
   a=x-kxg-sec:s=SRTP
   m=audio 49000 RTP/SAVP 98
   a=rtpmap:98 AMR/8000
   m=video 2232 RTP/SAVP 31

Comments: The initiator indicates that key refresh should be carried out every 100th packet, while re-keying is not needed.

Example 2 (integrated with PK):

   key=base64:kaDhDGE67FGD4JKkioSuiweiTYWEgDhywh==
   a=x-kxg-signature:s=uiSDF9sdhs727ghsd/dhsoKkdOokdo7eWsnDSJD=
   a=x-kxg-sec:s=SRTP e=f8_AES a=null
   m=audio 49000 RTP/SAVP 98
   a=rtpmap:98 AMR/8000
   m=video 2232 RTP/SAVP 31

Comments: This example uses as many default values as possible (which is why many fields and values are not included). It is also assumed that the responder already knows the initiator's certificate (which is why the certificate has been left out, to reduce the size).

Example 3 (integrated with DH):

   a=x-kxg-keytype:t=DH
   a=x-kxg-DH:g=OAKLEY1
   key=base64:kaDqhDGwE67eFGD4cJKkioSuiweiTYWEgDhywh==
   a=x-kxg-gen-par:s=10000 d=20000 h=MD5
   a=x-kxg-signature:s=uiSDF9sdhs727ghsd5dhsoKkdOokdo7eWsnDSJD=
   a=x-kxg-sec:s=SRTP e=f8_AES
   m=audio 49000 RTP/SAVP 98
   a=rtpmap:98 AMR/8000
   m=video 2232 RTP/SAVP 31

6. SDP-based key management

SDP descriptions can be carried by several protocols, such as SIP and RTSP. Even though the SDP attributes for the key exchange protocols (Section 4) were stated in Section 5, it also has to be stated how the actual message exchange SHOULD be done. This, however, depends on the protocol transporting SDP. This section gives two examples: the SIP case and the RTSP case.

6.1. Initiator's and Responder's Behavior

The initiator tries to guess the responder's capabilities in terms of security algorithms etc. If the guess is wrong, then the responder may send back its own capabilities (negotiation) to let the initiator choose a common set of parameters.
Multiple kxg-sec attributes may be provided in sequence. This is done to reduce the number of round trips as much as possible.

If one party indicates that it wants a secure media session, the other party SHOULD NOT be able to refuse and then set up the communication without security. If the responder is not willing or able to provide security, or the parties simply cannot agree, it is up to the parties' policies how to behave, i.e. whether to accept an insecure communication or reject it.

6.2. Key management with SIP

In a basic SIP call between two parties (see Figure 6.2.), SIP (Session Initiation Protocol, [SIP]) is used as a call control protocol between a caller A and a callee B.

            |A's SIP| <.......> |B's SIP|
            |Server |    SIP    |Server |
            ---------           ---------
                ^                   ^
                .                   .
   ++++    SIP  .                   .  SIP    ++++
   |  | <.............         ..............> |  |
   |  |                                        |  |
   ++++ <------------------------------------> ++++
                       SRTP

Fig 6.2.: SIP-based call example. The two parties use SIP to set up an SRTP stream from A to B.

6.2.1. Integration

SIP may carry SDP descriptions, since the participants negotiate the media encoding etc. Therefore, the SDP attributes previously described may be integrated inside the INVITE and the answer to it. If needed, subsequent SIP messages may also be used, e.g. when parameter negotiation is required. Note that the initiator will also be the one who sends the INVITE, i.e. the caller.

It can be assumed that the caller knows whom to contact (i.e. the identity of the callee), but unless the initiator's identity (needed to retrieve the crypto context) can be derived from SIP itself, the initiator (caller) MUST provide the callee with its identity.

6.2.2. Re-keying

A re-keying mechanism is necessary, e.g. when the key is compromised or simply because it has expired. When SIP is used as the call control protocol, a re-INVITE can be issued carrying an SDP part modified, for example, in the "key=" field.
Note that it may not be necessary to send all information, such as the certificate, since the call is already established.

6.3. Key management with RTSP

The Real Time Streaming Protocol (RTSP) [RTSP] is used to control media streaming from a server. The session description is typically obtained as an SDP description, received in response to a DESCRIBE message or by other means (e.g., HTTP).

6.3.1. Integration

The server should include all the necessary security parameters, key included, in the SDP part of the response to the initial DESCRIBE message. If a response from the client is required, it should be included in the PLAY message.

6.3.2. Re-keying

There are mainly two alternatives for re-keying when RTSP is used. If the server specifies a certain expiration time for the master key, the client MUST send a GET_PARAMETER message to the server before the key expires. The server should respond with the new key. If the server does not specify an expiration time for the key, the server can still force re-keying by pushing a new key to the client with a SET_PARAMETER message.

The attributes defined in Section 5.1 should be used and included in the GET_PARAMETER/SET_PARAMETER message with content type "text/parameters". In order to be able to request re-keying, the following parameter is defined:

   kxg-rk-get

The client sends a kxg-rk-get parameter, included in a GET_PARAMETER message, to the server in order to obtain a new key.

6.3.3. Examples

Example 1 (Without re-keying):

   C->S: DESCRIBE rtsp://server.example.com/fizzle/foo RTSP/1.0
         CSeq: 312
         Accept: application/sdp

   S->C: RTSP/1.0 200 OK
         CSeq: 312
         Date: 23 Jan 1997 15:35:06 GMT
         Content-Type: application/sdp
         Content-Length: 376

         key=base64:kaDhDGE67FGD4JKkioSuiweiTYWEgDhywh==
         a=x-kxg-signature:s=uiSDF9sdhs727ghsd/dhsoKkdOokdo7eWsnDSJD=
         a=x-kxg-sec:s=SRTP e=f8_AES a=null
         m=audio 49000 RTP/SAVP 98
         a=rtpmap:98 AMR/8000
         m=video 2232 RTP/SAVP 31

Example 2 (Server initiated re-keying):

   C<->S Session Setup including key exchange

   S->C: SET_PARAMETER rtsp://void.com/dev/null/1.0
         CSeq: 430
         Content-Type: text/parameters
         Session: 12345678
         Content-Length: 46

         kxg-key:k=kaDhDGE67FGD4JKkioSuiweiTYWEgDhywh==
         kxg-gen-par:s=10000 d=20000
         kxg-signature:s=uiSDF9sdhs727ghsd/dhsoKkdOokdo7eWsnDSJD=

   C->S: RTSP/1.0 200 OK
         CSeq: 430

Example 3 (Client initiated re-keying):

   C<->S Session Setup including key exchange

   C->S: GET_PARAMETER rtsp://void.com/dev/null/1.0
         CSeq: 431
         Content-Type: text/parameters
         Session: 12345678
         Content-Length: 11

         kxg-rk-get

   S->C: RTSP/1.0 200 OK
         CSeq: 431
         Content-Length: 46
         Content-Type: text/parameters

         kxg-key:k=kaDhDGE67FGD4JKkioSuiweiTYWEgDhywh==
         kxg-gen-par:s=10000 d=20000
         kxg-signature:s=uiSDF9sdhs727ghsd/dhsoKkdOokdo7eWsnDSJD=

6.4. Groups

What has been discussed up to now can be extended from the unicast case to groups. However, key management is more complex in the case of groups.

The pre-shared case can be extended to groups if the pre-shared key is a group key (with scalability issues). This allows the distribution of a group session key. However, backward and forward security are not guaranteed, because of the pre-shared key.

The public-key case is peer-to-peer as far as passing the key and parameters is concerned, but a group session key can be distributed.

The DH case in the described proposal is only peer-to-peer. Note that we do not consider multi-party DH.

Hence, small groups can be managed with the described schemes, but larger groups and multicast should be managed by other, more scalable schemes, e.g. from the MSEC WG [GKM].

7. Security Considerations

7.1. General

No chain is stronger than its weakest link. The cryptographic functions protecting the keys during exchange and transport SHOULD offer a security level at least corresponding to that of the (symmetric) keys they protect. For instance, with the current state of the art [LV], protecting a 128-bit AES key with a 512-bit RSA [RSA] key offers an overall security well below 64 bits. On the other hand, protecting a 64-bit symmetric key with a 2048-bit RSA key appears to be overkill, leading to unnecessary time delays. Therefore, the key size for the key-exchange mechanism SHOULD be weighed against the size of the exchanged key.

Similarly, in the case of a pre-shared secret, it is the size of the pre-shared secret that determines the security level. Notice that in this case we have a chain: the pre-shared secret protects the master key, from which session keys are derived. The total security level is therefore equivalent to the minimum of the three key sizes involved.

Moreover, if the session keys are not random, a brute force search may be facilitated, again lowering the effective key size. Therefore, care MUST be taken when designing the (pseudo) random generators for master key generation.

The same applies to re-keying and key-refresh mechanisms. Policies for key-refresh rates MUST be set so as to avoid key stream re-use (for stream ciphers) and to avoid "non-random" behavior (e.g. for block ciphers in feedback-type modes).

For the selection of the hash function, SHA1 with 160-bit output is the default. However, to get an effective overall strength corresponding to 128-bit keys, SHA-256 has to be used, see [SHA256]. In general, hash sizes should be twice the "security level". However, due to the real-time aspects of the scenarios we are treating, shorter sizes MAY also be acceptable, as the normal "existential" collision probabilities could be of secondary importance.
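The "weakest link" rule and the hash-size rule of thumb above can be illustrated with a small Python sketch. The function names are hypothetical, and the strengths passed in are assumed to be estimates in bits supplied by the caller; the sketch deliberately does not try to convert an RSA modulus size into an equivalent symmetric strength, since that mapping depends on the analysis in [LV].

```python
def chain_security_bits(*strengths_bits: int) -> int:
    """Effective security of a chain of keys: the minimum of the
    estimated strengths (in bits) of every key in the chain, e.g.
    pre-shared secret -> master key -> session key."""
    if not strengths_bits:
        raise ValueError("at least one key strength is required")
    return min(strengths_bits)


def hash_security_level_bits(output_bits: int) -> int:
    """Rule of thumb from the text: the hash size should be twice the
    security level, so the level is half the output size."""
    return output_bits // 2


if __name__ == "__main__":
    # A 64-bit pre-shared secret caps the whole chain at 64 bits,
    # even if master and session keys are 128 bits long.
    print(chain_security_bits(64, 128, 128))
    # SHA1 (160-bit output) versus SHA-256 (256-bit output).
    print(hash_security_level_bits(160))
    print(hash_security_level_bits(256))
```

With these rules, a 160-bit hash corresponds to an 80-bit level, while a 256-bit hash is needed to match 128-bit keys, which is the point made in the paragraph on hash selection.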
In a multimedia session, it would be convenient to let the different individual sessions (audio, video etc.) share the master key, as discussed earlier. From a security point of view, the criterion to be satisfied is that the encryption of the individual sessions is performed "independently". This can be accomplished in two ways. First, with unique session identifiers as discussed in Section 4.5.1, our key-derivation method assures this by assigning distinct session keys to distinct sessions, regardless of the security protocol used. Secondly, some protocols MAY in some cases provide this independence internally (by "initialization vectors" etc.), in which case the session identifiers can even be "merged" into a single multimedia session identifier, greatly simplifying session key management. However, care MUST be taken to fully understand the impact of this, through detailed knowledge of the underlying security protocol.

Note that in the public-key scheme, the initiator's identity is carried by the certificate, or concatenated with the session key. In both cases, identity protection is guaranteed by encrypting the identity. However, this may open additional denial-of-service vulnerabilities. Ideally, the responder's first step would be to check the signature, before decrypting. If identity protection is used, however, the responder first has to decrypt to obtain A's identity before it can verify the signature, unless the identity is already known by other means.

This protocol is resistant to denial-of-service attacks in the sense that a responder does not construct any state (at the key management protocol level) before it has authenticated the initiator. However, the protocol is open to attacks that use spoofed IP addresses to create a large number of fake requests.

In the pre-shared and public-key schemes, the master key is generated by a single party (the initiator). This might be viewed as a security weakness, e.g. in case the initiator uses a bad random number generator.
It should also be noted that neither the pre-shared nor the public-key scheme provides perfect forward secrecy. If mutual contribution or perfect forward secrecy is desired, the Diffie-Hellman scheme MUST be used.

The use of timestamps instead of challenge-response (using a nonce) may add problems. Due to time synchronization problems, the protocol may suffer from replay attacks if they are performed within a time window corresponding to the synchronization accuracy. The current timestamp-based solution has been selected because it requires only two messages while still giving reasonable replay protection. A (secure) nonce-based version would require at least three messages.

7.2. Key refresh

To allow session key material of arbitrary size, the key refresh mechanism is designed to implement a simple pseudo-random generator (PRG). The key refresh mechanism has been designed to maximize speed.

Forwards/backwards security: if the master key, k_m, is exposed, all keys generated from it during a session are compromised. However, under the assumption that the key refresh behaves as a pseudo-random generator, disclosure of an individual derived session key does not compromise other keys derived from the same master key, k_m.

Key uniqueness: session keys derived for distinct sessions sharing the same cryptographic context (i.e. with distinct v_n-values) are guaranteed to be distinct, to avoid 'two-time pads'.

As key refresh is always performed at least once to derive session keys, possible inherent non-randomness in the master key is (heuristically) removed.

7.3. Re-keying

Even if the lifetime of a master key is not specified, it MUST be taken into account that the key stream of the underlying security protocol can recycle or, in some other way, degenerate after a certain amount of encrypted data. Each security protocol MUST define such an amount and trigger re-keying before the 'exhaustion' of the key.

8. Conclusions

A complete security solution for real-time applications needs a key management infrastructure. Such key management has to fulfil certain requirements, especially when the Internet meets wireless networks. In particular, service behavior sets strict time limits, so the number of round trips and the processing time are prime factors to be minimized.

Integrating the key management scheme with the call set-up protocol can be done efficiently in most of the scenarios. We show how this can be done, and we give examples using SDP integrated in SIP and RTSP.

9. Acknowledgments

The authors would like to thank Rolf Blom, Magnus Westerlund, and Pasi Ahonen for their reviews and comments.

10. Authors' Addresses

   Elisabetta Carrara
   Ericsson Research
   SE-16480 Stockholm     Phone: +46 8 50877040
   Sweden                 EMail: elisabetta.carrara@era.ericsson.se

   Fredrik Lindholm
   Ericsson Research
   SE-16480 Stockholm     Phone: +46 8 58531705
   Sweden                 EMail: fredrik.lindholm@era.ericsson.se

   Mats Naslund
   Ericsson Research
   SE-16480 Stockholm     Phone: +46 8 58533739
   Sweden                 EMail: mats.naslund@era.ericsson.se

   Karl Norrman
   Ericsson Research
   SE-16480 Stockholm     Phone: +46 8 4044502
   Sweden                 EMail: karl.norrman@era.ericsson.se

   Jari Arkko
   Ericsson
   02420 Jorvas           Phone: +358 40 5079256
   Finland                EMail: jari.arkko@ericsson.com

11. References

[AES] Advanced Encryption Standard, www.nist.gov/aes

[GKM] Baugher, M., Hardjono, T., Harney, H., and Weis, B., "The Group Domain of Interpretation", Internet Draft, February 2001, and Harney, H., Colegrove, A., Harder, E., Meth, U., and Fleischer, R., "Group Secure Association Key Management Protocol", Internet Draft, March 2001.

[IKE] Harkins, D. and Carrel, D., "The Internet Key Exchange (IKE)", RFC 2409, November 1998.

[LV] Lenstra, A. K. and Verheul, E. R., "Suggesting Key Sizes for Cryptosystems", http://www.cryptosavvy.com/suggestions.htm

[MD5] Rivest, R., "MD5 Digest Algorithm", RFC 1321, April 1992.

[NAI] Aboba, B. and Beadles, M., "The Network Access Identifier", RFC 2486, January 1999.

[NTP] Mills, D., "Network Time Protocol (Version 3) specification, implementation and analysis", RFC 1305, March 1992.

[OAKLEY] Orman, H., "The Oakley Key Determination Protocol", RFC 2412, November 1998.

[REQS] Carrara, E., Lindholm, F., Blom, R., and Arkko, J., "Design Criteria for Multimedia Session Key Management in Heterogeneous Networks", Internet Draft, July 2001.

[RTSP] Schulzrinne, H., Rao, A., and Lanphier, R., "Real Time Streaming Protocol (RTSP)", RFC 2326, April 1998.

[RSA] Rivest, R., Shamir, A., and Adleman, L., "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems", Communications of the ACM, Vol. 21, No. 2, pp. 120-126, 1978.

[SDP] Handley, M. and Jacobson, V., "Session Description Protocol (SDP)", RFC 2327, April 1998.

[SHA1] NIST, FIPS PUB 180-1: Secure Hash Standard, April 1995, http://csrc.nist.gov/fips/fip180-1.ps

[SHA256] NIST, "Description of SHA-256, SHA-384, and SHA-512", http://csrc.nist.gov/encryption/shs/sha256-384-512.pdf

[SIP] Handley, M., Schulzrinne, H., Schooler, E., and Rosenberg, J., "SIP: Session Initiation Protocol", RFC 2543, March 1999.

[SRTP] Blom, R., Carrara, E., McGrew, D., Naslund, M., Norrman, K., and Oran, D., "The Secure Real Time Transport Protocol", Internet Draft, http://search.ietf.org/internet-drafts/draft-ietf-avt-srtp-00.txt

[UMAC] Krovetz, T., Black, J., Halevi, S., Hevia, A., Krawczyk, H., and Rogaway, P., "UMAC: Message Authentication Code using Universal Hashing", Internet Draft, October 2000.

[URI] Berners-Lee, T., Fielding, R., and Masinter, L., "Uniform Resource Identifiers (URI): Generic Syntax", RFC 2396, August 1998.

Appendix A. Security parameters for the SRTP profile

Apart from the common framework provided above, each security protocol may have particular parameters to exchange.
We give here an example for SRTP [SRTP], defining some of the necessary parameters. The use of SRTP is indicated by the following attribute:

   kxg-sec:s=SRTP e= a= s=

where

   e = "null" | "CM_AES" | "f8_AES" | ..
   a = "null" | "SRTP_UMAC" | ..

The e parameter is an identifier used to select an encryption scheme. A set of standard encryption schemes MUST be defined and assigned a number each. Defined values are "null", "CM_AES" and "f8_AES"; the default is "CM_AES".

The a parameter is an identifier used to select an authentication scheme. Defined values are "null" and "SRTP_UMAC". SRTP_UMAC is defined as UMAC-2/4/128/16/BIG/SIGNED, see also [SRTP]. The default value is "SRTP_UMAC".

The trailing s parameter carries the base64-encoded salting key. This key may be sent in clear text. If it needs to be protected, it is recommended that the master key be extended so that the salting key can be derived from the extra bits.

When SDP is used, SRTP is announced in SDP's "m=" line as the profile, see [SRTP], e.g.:

   m=audio 5004 RTP/SAVP 9
   a=x-kxg-sec:s=SRTP e=CM_AES

Moreover, in the case of dynamic groups, where members may join or leave, it is necessary to pass the rollover counter.

Using the IV formation suggested in [SRTP], the same encryption key is used for securing RTP and the related RTCP streams. The same authentication key MAY be used for RTP and the related RTCP streams.

This Internet-Draft expires in December 2001.
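To illustrate how a receiver might interpret the kxg-sec attribute of Appendix A, the following is a minimal Python sketch that parses an "a=x-kxg-sec:" line and applies the SRTP defaults stated there (e defaults to "CM_AES", a defaults to "SRTP_UMAC"). The function name parse_kxg_sec and the returned dictionary layout are hypothetical; the sketch assumes space-separated name=value parameters, with the first s parameter naming the security protocol and any later s parameter being the salting key.

```python
# Defaults for the SRTP profile as stated in Appendix A.
SRTP_DEFAULTS = {"e": "CM_AES", "a": "SRTP_UMAC"}


def parse_kxg_sec(line: str) -> dict:
    """Parse an 'a=x-kxg-sec:' SDP attribute line (hypothetical helper).

    Returns {"protocol": <name>, "params": {letter: value, ...}}.
    """
    prefix = "a=x-kxg-sec:"
    if not line.startswith(prefix):
        raise ValueError("not an x-kxg-sec attribute")
    tokens = line[len(prefix):].split()
    # The first parameter must name the security protocol (s=SRTP).
    if not tokens or not tokens[0].startswith("s="):
        raise ValueError("missing security protocol parameter")
    protocol = tokens[0][2:]
    params = {}
    for token in tokens[1:]:
        letter, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"malformed parameter: {token!r}")
        params[letter] = value
    if protocol == "SRTP":
        # Explicit values override the profile defaults.
        params = {**SRTP_DEFAULTS, **params}
    return {"protocol": protocol, "params": params}


if __name__ == "__main__":
    print(parse_kxg_sec("a=x-kxg-sec:s=SRTP e=f8_AES a=null"))
    print(parse_kxg_sec("a=x-kxg-sec:s=SRTP"))
```

Treating the first s parameter separately avoids confusing the protocol name with the salting key, which uses the same parameter letter later on the line.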