Network Working Group          Carsten Bormann (ed.), TZI/Uni Bremen
INTERNET-DRAFT                 Carsten Burmeister, Matsushita
Expires: August 2001           Mikael Degermark, U of Arizona
                               Hideaki Fukushima, Matsushita
                               Hans Hannu, Ericsson
                               Lars-Erik Jonsson, Ericsson
                               Rolf Hakenberg, Matsushita
                               Tmima Koren, Cisco
                               Khiem Le, Nokia
                               Zhigang Liu, Nokia
                               Anton Martensson, Ericsson
                               Akihiro Miyazaki, Matsushita
                               Krister Svanbro, Ericsson
                               Thomas Wiebke, Matsushita
                               Takeshi Yoshimura, NTT DoCoMo
                               Haihong Zheng, Nokia
                               February 26, 2001


                  RObust Header Compression (ROHC):
     Framework and four profiles: RTP, UDP, ESP, and uncompressed

Status of this memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This document is a product of the IETF ROHC WG. Comments should be
   directed to its mailing list, rohc@cdt.luth.se.

Bormann (ed.)                                                   [Page 1]

INTERNET-DRAFT          Robust Header Compression           Feb 26, 2001

Abstract

   Existing header compression schemes do not work well when used over
   links with significant error rates and long round-trip times. For
   many bandwidth limited links where header compression is essential,
   such characteristics are common.

   A highly robust and efficient header compression scheme for
   RTP/UDP/IP, UDP/IP and ESP/IP headers is specified in this document.
   This is done in a framework designed to be extensible.
   For example, a scheme for compressing TCP/IP headers will be simple
   to add, and is in development. Headers specific to Mobile IPv4 are
   not subject to special treatment, but are expected to be compressed
   sufficiently well by the provided methods for compression of
   sequences of extension headers and tunneling headers. For the most
   part, the same will apply to work in progress on Mobile IPv6, but
   future work might be required to handle some extension headers, when
   a standards track Mobile IPv6 has been completed.

Revision history (remove before publishing)

   -09: Incorporate minor editorial comments from IETF last-call
   -08: Incorporate changes from second, abbreviated WG last call;
        ready for submission to an IETF last-call.
   -07: Incorporate changes from WG last call
   -06: Final changes for WG last call
   -05: Editorial changes
   -04: Complete 5.5 and 5.8. Remove section 7 (no longer needed) and
        renumber. Restructure section 5 to clearly separate the
        profiles.
   -03: Redo packet formats to allow new ROHC framework. Fill in gaps.
   -02: Major changes after 48th IETF
   -01: Minor editorial changes for 48th IETF
   -00: Document created from ROHC submissions

Table of contents

   Status of this memo
   Abstract
   Revision history (remove before publishing)
   Table of contents
   1. Introduction
   2. Terminology
   2.1. Acronyms
   3. Background
   3.1. Header compression fundamentals
   3.2. Existing header compression schemes
   3.3. Requirements on a new header compression scheme
   3.4. Classification of header fields
   4. Header compression framework
   4.1. Operating assumptions
   4.2. Dynamicity
   4.3. Compression and decompression states
   4.3.1. Compressor states
   4.3.1.1. Initialization and Refresh (IR) State
   4.3.1.2. First Order (FO) State
   4.3.1.3. Second Order (SO) State
   4.3.2. Decompressor states
   4.4. Modes of operation
   4.4.1. Unidirectional mode -- U-mode
   4.4.2. Bidirectional Optimistic mode -- O-mode
   4.4.3. Bidirectional Reliable mode -- R-mode
   4.5. Encoding methods
   4.5.1. Least Significant Bits (LSB) encoding
   4.5.2. Window-based LSB encoding (W-LSB encoding)
   4.5.3. Scaled RTP Timestamp encoding
   4.5.4. Timer-based compression of RTP Timestamp
   4.5.5. Offset IP-ID encoding
   4.5.6. Self-describing variable-length values
   4.5.7. Encoded values across several fields in compressed headers
   4.6. Errors caused by residual errors
   4.7. Impairment considerations
   5. The protocol
   5.1. Data structures
   5.1.1. Per-channel parameters
   5.1.2. Per-context parameters, profiles
   5.1.3. Contexts and context identifiers
   5.2. ROHC packets and packet types
   5.2.1. ROHC feedback
   5.2.2. ROHC feedback format
   5.2.3. ROHC IR packet type
   5.2.4. ROHC IR-DYN packet type
   5.2.5. ROHC segmentation
   5.2.5.1. Segmentation usage considerations
   5.2.5.2. Segmentation protocol
   5.2.6. ROHC initial decompressor processing
   5.2.7. ROHC RTP packet formats from compressor to decompressor
   5.2.8. Parameters needed for mode transition in ROHC RTP
   5.3. Operation in Unidirectional mode
   5.3.1. Compressor states and logic (U-mode)
   5.3.1.1. State transition logic (U-mode)
   5.3.1.1.1. Optimistic approach, upwards transition
   5.3.1.1.2. Timeouts, downward transition
   5.3.1.1.3. Need for updates, downward transition
   5.3.1.2. Compression logic and packets used (U-mode)
   5.3.1.3. Feedback in Unidirectional mode
   5.3.2. Decompressor states and logic (U-mode)
   5.3.2.1. State transition logic (U-mode)
   5.3.2.2. Decompression logic (U-mode)
   5.3.2.2.1. Decide whether decompression is allowed
   5.3.2.2.2. Reconstruct and verify the header
   5.3.2.2.3. Actions upon CRC failure
   5.3.2.2.4. Correction of SN LSB wraparound
   5.3.2.2.5. Repair of incorrect SN updates
   5.3.2.3. Feedback in Unidirectional mode
   5.4. Operation in Bidirectional Optimistic mode
   5.4.1. Compressor states and logic (O-mode)
   5.4.1.1. State transition logic
   5.4.1.1.1. Negative acknowledgments (NACKs), downward transition
   5.4.1.1.2. Optional acknowledgments, upwards transition
   5.4.1.2. Compression logic and packets used
   5.4.2. Decompressor states and logic (O-mode)
   5.4.2.1. Decompression logic, timer-based timestamp decompression
   5.4.2.2. Feedback logic (O-mode)
   5.5. Operation in Bidirectional Reliable mode
   5.5.1. Compressor states and logic (R-mode)
   5.5.1.1. State transition logic (R-mode)
   5.5.1.1.1. Upwards transition
   5.5.1.1.2. Downward transition
   5.5.1.2. Compression logic and packets used (R-mode)
   5.5.2. Decompressor states and logic (R-mode)
   5.5.2.1. Decompression logic (R-mode)
   5.5.2.2. Feedback logic (R-mode)
   5.6. Mode transitions
   5.6.1. Compression and decompression during mode transitions
   5.6.2. Transition from Unidirectional to Optimistic mode
   5.6.3. From Optimistic to Reliable mode
   5.6.4. From Unidirectional to Reliable mode
   5.6.5. From Reliable to Optimistic mode
   5.6.6. Transition to Unidirectional mode
   5.7. Packet formats
   5.7.1. Packet type 0: UO-0, R-0, R-0-CRC
   5.7.2. Packet type 1 (R-mode): R-1, R-1-TS, R-1-ID
   5.7.3. Packet type 1 (U/O-mode): UO-1, UO-1-ID, UO-1-TS
   5.7.4. Packet type 2: UOR-2
   5.7.5. Extension formats
   5.7.5.1. RND flags and packet types
   5.7.5.2. Flags/Fields in context
   5.7.6. Feedback packets and formats
   5.7.6.1. Feedback formats for ROHC RTP
   5.7.6.2. ROHC RTP Feedback options
   5.7.6.3. The CRC option
   5.7.6.4. The REJECT option
   5.7.6.5. The SN-NOT-VALID option
   5.7.6.6. The SN option
   5.7.6.7. The CLOCK option
   5.7.6.8. The JITTER option
   5.7.6.9. The LOSS option
   5.7.6.10. Unknown option types
   5.7.6.11. RTP feedback example
   5.7.7. RTP IR and IR-DYN packets
   5.7.7.1. Basic structure of the IR packet
   5.7.7.2. Basic structure of the IR-DYN packet
   5.7.7.3. Initialization of IPv6 Header [IPv6]
   5.7.7.4. Initialization of IPv4 Header [IPv4, section 3.1]
   5.7.7.5. Initialization of UDP Header [RFC-768]
   5.7.7.6. Initialization of RTP Header [RTP]
   5.7.7.7. Initialization of ESP Header [ESP, section 2]
   5.7.7.8. Initialization of Other Headers
   5.8. List compression
   5.8.1. Table-based item compression
   5.8.1.1. Translation table in R-mode
   5.8.1.2. Translation table in U/O-modes
   5.8.2. Reference list determination
   5.8.2.1. Reference list in R-mode and U/O-mode
   5.8.3. Encoding schemes for the compressed list
   5.8.4. Special handling of IP extension headers
   5.8.4.1. Next Header field
   5.8.4.2. Authentication Header (AH)
   5.8.4.3. Encapsulating Security Payload Header (ESP)
   5.8.4.4. GRE Header [RFC 2784, RFC 2890]
   5.8.5. Format of compressed lists in Extension 3
   5.8.5.1. Format of IP Extension Header(s) field
   5.8.5.2. Format of Compressed CSRC List
   5.8.6. Compressed list formats
   5.8.6.1. Encoding Type 0 (generic scheme)
   5.8.6.2. Encoding Type 1 (insertion only scheme)
   5.8.6.3. Encoding Type 2 (removal only scheme)
   5.8.6.4. Encoding Type 3 (remove then insert scheme)
   5.8.7. CRC coverage for extension headers
   5.9. Header compression CRCs, coverage and polynomials
   5.9.1. IR and IR-DYN packet CRCs
   5.9.2. CRCs in compressed headers
   5.10. ROHC UNCOMPRESSED -- no compression (Profile 0x0000)
   5.10.1. IR packet
   5.10.2. Normal packet
   5.10.3. States and modes
   5.10.4. Feedback
   5.11. ROHC UDP -- non-RTP UDP/IP compression (Profile 0x0002)
   5.11.1. Initialization
   5.11.2. States and modes
   5.11.3. Packet types
   5.11.4. Extensions
   5.11.5. IP-ID
   5.11.6. Feedback
   5.12. ROHC ESP -- ESP/IP compression (Profile 0x0003)
   5.12.1. Initialization
   5.12.2. Packet types
   6. Implementation issues
   6.1. Reverse decompression
   6.2. RTCP
   6.3. Implementation parameters and signals
   6.3.1. ROHC implementation parameters at compressor
   6.3.2. ROHC implementation parameters at decompressor
   6.4. Handling of resource limitations at the decompressor
   6.5. Implementation structures
   6.5.1. Compressor context
   6.5.2. Decompressor context
   6.5.3. List compression: Sliding windows in R-mode and U/O-mode
   7. Security considerations
   8. IANA considerations
   9. Acknowledgments
   10. Intellectual property right claim considerations
   11. References
   11.1. Normative References
   11.2. Informative References
   12. Authors' addresses
   Appendix A. Detailed classification of header fields
   A.1. General classification
   A.1.1. IPv6 header fields
   A.1.2. IPv4 header fields
   A.1.3. UDP header fields
   A.1.4. RTP header fields
   A.1.5. Summary for IP/UDP/RTP
   A.2. Analysis of change patterns of header fields
   A.2.1. IPv4 Identification
   A.2.2. IP Traffic-Class / Type-Of-Service
   A.2.3. IP Hop-Limit / Time-To-Live
   A.2.4. UDP Checksum
   A.2.5. RTP CSRC Counter
   A.2.6. RTP Marker
   A.2.7. RTP Payload Type
   A.2.8. RTP Sequence Number
   A.2.9. RTP Timestamp
   A.2.10. RTP Contributing Sources (CSRC)
   A.3. Header compression strategies
   A.3.1. Do not send at all
   A.3.2. Transmit only initially
   A.3.3. Transmit initially, but be prepared to update
   A.3.4. Be prepared to update or send as-is frequently
   A.3.5. Guarantee continuous robustness
   A.3.6. Transmit as-is in all packets
   A.3.7. Establish and be prepared to update delta

1. Introduction

   During the last five years, two communication technologies in
   particular have become commonly used by the general public: cellular
   telephony and the Internet. Cellular telephony has provided its
   users with the revolutionary possibility of always being reachable
   with reasonable service quality no matter where they are. The main
   service provided by the dedicated terminals has been speech.

   The Internet, on the other hand, has from the beginning been
   designed for multiple services and its flexibility for all kinds of
   usage has been one of its strengths. Internet terminals have usually
   been general-purpose and have been attached over fixed connections.
   The experienced quality of some services (such as Internet
   telephony) has sometimes been low.

   Today, IP telephony is gaining momentum thanks to improved technical
   solutions. It seems reasonable to believe that in the years to come,
   IP will become a commonly used way to carry telephony. Some future
   cellular telephony links might also be based on IP and IP telephony.
   Cellular phones may have become more general-purpose, and may have
   IP stacks supporting not only audio and video, but also web
   browsing, email, gaming, etc.

   One of the scenarios we are envisioning might then be the one in
   Figure 1.1, where two mobile terminals are communicating with each
   other. Both are connected to base stations over cellular links, and
   the base stations are connected to each other through a wired (or
   possibly wireless) network. Instead of two mobile terminals, there
   could of course be one mobile and one wired terminal, but the case
   with two cellular links is technically more demanding.

   Mobile            Base                    Base            Mobile
   Terminal          Station                 Station         Terminal

      |  ~   ~   ~  \ /                       \ /  ~   ~   ~  ~  |
      |              |                         |                 |
   +--+              |                         |              +--+
   |  |              |                         |              |  |
   |  |              |                         |              |  |
   +--+              |                         |              +--+
                     |                         |
                     |=========================|

          Cellular              Wired               Cellular
          Link                  Network             Link

        Figure 1.1 : Scenario for IP telephony over cellular links

   It is obvious that the wired network can be IP-based. With the
   cellular links, the situation is less clear. IP could be terminated
   in the fixed network, and special solutions implemented for each
   supported service over the cellular link. However, this would limit
   the flexibility of the services supported. If technically and
   economically feasible, a solution with pure IP all the way from
   terminal to terminal would have certain advantages. However, to make
   this a viable alternative, a number of problems have to be
   addressed, in particular problems regarding bandwidth efficiency.

   For cellular phone systems, it is of vital importance to use the
   scarce radio resources in an efficient way. A sufficient number of
   users per cell is crucial, otherwise deployment costs will be
   prohibitive. The quality of the voice service should also be as good
   as in today's cellular systems. It is likely that even with support
   for new services, lower quality of the voice service is acceptable
   only if costs are significantly reduced.

   A problem with IP over cellular links when used for interactive
   voice conversations is the large header overhead. Speech data for IP
   telephony will most likely be carried by RTP [RTP]. A packet will
   then, in addition to link layer framing, have an IP [IPv4] header
   (20 octets), a UDP [UDP] header (8 octets), and an RTP header (12
   octets) for a total of 40 octets. With IPv6 [IPv6], the IP header is
   40 octets for a total of 60 octets. The size of the payload depends
   on the speech coding and frame sizes being used and may be as low as
   15-20 octets.
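
   As an illustration only (not part of the specification), the header
   sizes quoted above can be turned into the share of each packet that
   is spent on headers. The sketch below is a minimal calculation in
   Python; the function name header_share is hypothetical, and link
   layer framing is ignored as an assumption.

```python
# Header sizes in octets, as quoted in the text above.
IPV4, IPV6, UDP, RTP = 20, 40, 8, 12


def header_share(ip_header: int, payload: int) -> float:
    """Fraction of an IP/UDP/RTP packet occupied by headers.

    Link layer framing is deliberately left out (assumption).
    """
    headers = ip_header + UDP + RTP
    return headers / (headers + payload)


if __name__ == "__main__":
    for name, ip in (("IPv4", IPV4), ("IPv6", IPV6)):
        for payload in (15, 20):
            share = header_share(ip, payload)
            print(f"{name}, {payload}-octet payload: {share:.0%} headers")
```

   With a 20-octet payload, two thirds of an IPv4 packet and three
   quarters of an IPv6 packet consist of headers under these
   assumptions.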

   From these numbers, the need for reducing header sizes for
   efficiency reasons is obvious. However, cellular links have
   characteristics that make header compression as defined in
   [IPHC,CRTP] perform less than well. The most important
   characteristic is the lossy behavior of cellular links, where a bit
   error rate (BER) as high as 1e-3 must be accepted to keep the radio
   resources efficiently utilized. In severe operating situations, the
   BER can be as high as 1e-2. The other problematic characteristic is
   the long round-trip time (RTT) of the cellular link, which can be as
   high as 100-200 milliseconds. An additional problem is that the
   residual BER is nontrivial, i.e., lower layers can sometimes deliver
   frames containing undetected errors. A viable header compression
   scheme for cellular links must be able to handle loss on the link
   between the compression and decompression point as well as loss
   before the compression point.

   Bandwidth is the most costly resource in cellular links. Processing
   power is very cheap in comparison. Implementation or computational
   simplicity of a header compression scheme is therefore of less
   importance than its compression ratio and robustness.

2. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

   BER

      Bit Error Rate. Cellular radio links can have a fairly high BER.
      In this document BER is usually given as a probability, but one
      also needs to consider the error distribution as bit errors are
      not independent.

   Cellular links

      Wireless links between mobile terminals and base stations.

   Compression efficiency

      The performance of a header compression scheme can be described
      with three parameters: compression efficiency, robustness and
      compression transparency.
      The compression efficiency is determined by how much the header
      sizes are reduced by the compression scheme.

   Compression transparency

      The performance of a header compression scheme can be described
      with three parameters: compression efficiency, robustness, and
      compression transparency. The compression transparency is a
      measure of the extent to which the scheme ensures that the
      decompressed headers are semantically identical to the original
      headers. If all decompressed headers are semantically identical
      to the corresponding original headers, the transparency is 100
      percent. Compression transparency is high when damage propagation
      is low.

   Context

      The context of the compressor is the state it uses to compress a
      header. The context of the decompressor is the state it uses to
      decompress a header. Either of these or the two in combination
      are usually referred to as "context", when it is clear which is
      intended.

      The context contains relevant information from previous headers
      in the packet stream, such as static fields and possible
      reference values for compression and decompression. Moreover,
      additional information describing the packet stream is also part
      of the context, for example information about how the IP
      Identifier field changes and the typical inter-packet increase in
      sequence numbers or timestamps.

   Context damage

      When the context of the decompressor is not consistent with the
      context of the compressor, decompression may fail to reproduce
      the original header. This situation can occur when the context of
      the decompressor has not been initialized properly or when
      packets have been lost or damaged between compressor and
      decompressor. Packets which cannot be decompressed due to
      inconsistent contexts are said to be lost due to context damage.
      Packets whose headers are decompressed incorrectly due to
      inconsistent contexts are said to be damaged due to context
      damage.

   Context repair mechanism

      Context repair mechanisms are mechanisms that bring the contexts
      back in sync when they are not. This is needed to avoid excessive
      loss due to context damage. Examples are the context request
      mechanism of CRTP, the NACK mechanisms of O- and R-mode, and the
      periodic refreshes of U-mode.

      Note that there are also mechanisms that prevent (some) context
      inconsistencies from occurring, for example the ACK-based updates
      of the context in R-mode, the repetitions after change in U- and
      O-mode, and the CRCs which protect context updating information.

   CRC-DYNAMIC

      Opposite of CRC-STATIC: the octets whose values are expected to
      change between packets. See CRC-STATIC.

   CRC-STATIC

      A CRC over the original header is the primary mechanism used by
      ROHC to detect incorrect decompression. In order to decrease
      computational complexity, the fields of the header are
      conceptually rearranged when the CRC is computed, so that it is
      first computed over octets which are static (called CRC-STATIC in
      this document) and then over octets whose values are expected to
      change between packets (CRC-DYNAMIC). In this manner, the
      intermediate result of the CRC computation, after it has covered
      the CRC-STATIC fields, can be reused for several packets. The
      restarted CRC computation only covers the CRC-DYNAMIC octets. See
      section 5.9.

   Damage propagation

      Delivery of incorrect decompressed headers, due to errors in
      (i.e., loss of or damage to) previous header(s) or feedback.

   Loss propagation

      Loss of headers, due to errors in (i.e., loss of or damage to)
      previous header(s) or feedback.

   Error detection

      Detection of errors. If error detection is not perfect, there
      will be residual errors.

   Error propagation

      Damage propagation or loss propagation.

   Header compression profile

      A header compression profile is a specification of how to
      compress the headers of a certain kind of packet stream over a
      certain kind of link.
      Compression profiles provide the details of the header
      compression framework introduced in this document. The profile
      concept makes use of profile identifiers to separate different
      profiles which are used when setting up the compression scheme.
      All variations and parameters of the header compression scheme
      that are not part of the context state are handled by different
      profile identifiers.

   Packet

      Generally, a unit of transmission and reception (protocol data
      unit). Specifically, when contrasted with "frame", the packet
      compressed and then decompressed by ROHC. Also called
      "uncompressed packet".

   Packet Stream

      A sequence of packets where the field values and change patterns
      of field values are such that the headers can be compressed using
      the same context.

   Pre-HC links

      The Pre-HC links are all links that a packet has traversed before
      the header compression point. If we consider a path with cellular
      links as first and last hops, the Pre-HC links for the compressor
      at the last link are the first cellular link plus the wired links
      in between.

   Residual error

      Error introduced during transmission and not detected by
      lower-layer error detection schemes.

   Robustness

      The performance of a header compression scheme can be described
      with three parameters: compression efficiency, robustness, and
      compression transparency. A robust scheme tolerates loss and
      residual errors on the link over which header compression takes
      place without losing additional packets or introducing additional
      errors in decompressed headers.

   RTT

      The RTT (round-trip time) is the time elapsing from the moment
      the compressor sends a packet until it receives feedback related
      to that packet (when such feedback is sent).

   Spectrum efficiency

      Radio resources are limited and expensive. Therefore they must be
      used efficiently to make the system economically feasible.
      In cellular systems this is achieved by maximizing the number of
      users served within each cell, while the quality of the provided
      services is kept at an acceptable level. A consequence of
      efficient spectrum use is a high rate of errors (frame loss and
      residual bit errors), even after channel coding with error
      correction.

   String

      A sequence of headers in which the values of all fields being
      compressed change according to a pattern which is fixed with
      respect to a sequence number. Each header in a string can be
      compressed by representing it with a ROHC header which
      essentially only carries an encoded sequence number. Fields not
      being compressed (e.g., random IP-ID, UDP Checksum) are
      irrelevant to this definition.

   Timestamp stride

      The timestamp stride (TS_STRIDE) is the expected increase in the
      timestamp value between two RTP packets with consecutive sequence
      numbers.

2.1. Acronyms

   This section lists most acronyms used for reference.

   AH       Authentication Header.
   CID      Context Identifier.
   CRC      Cyclic Redundancy Check. Error detection mechanism.
   CRTP     Compressed RTP. RFC 2508.
   CTCP     Compressed TCP. Also called VJ header compression. RFC 1144.
   ESP      Encapsulating Security Payload.
   FC       Full Context state (decompressor).
   FO       First Order state (compressor).
   GRE      Generic Routing Encapsulation. RFC 2784, RFC 2890.
   HC       Header Compression.
   IPHC     IP Header Compression. RFC 2507.
   IPX      Flag in Extension 2.
   IR       Initialization and Refresh state (compressor). Also IR
            packet.
   IR-DYN   IR-DYN packet.
   LSB      Least Significant Bits.
   MRRU     Maximum Reconstructed Reception Unit.
   MTU      Maximum Transmission Unit.
   MSB      Most Significant Bits.
   NBO      Flag indicating whether the IP-ID is in Network Byte Order.
   NC       No Context state (decompressor).
   O-mode   Bidirectional Optimistic mode.
   PPP      Point-to-Point Protocol.
   R-mode   Bidirectional Reliable mode.
   RND      Flag indicating whether the IP-ID behaves randomly.
   ROHC     RObust Header Compression.
RTCP Real-Time Control Protocol. See RTP. RTP Real-Time Protocol. RFC 1889. RTT Round Trip Time (see section 2). SC Static Context state (decompressor). SN (compressed) Sequence Number. Usually RTP Sequence Number. SO Second Order state (compressor). SPI Security Parameters Index. SSRC Sending source. Field in RTP header. CSRC Contributing source. Optional list of CSRCs in RTP header. TC Traffic Class. Octet in IPv6 header. See also TOS. TOS Type Of Service. Octet in IPv4 header. See also TC. TS (compressed) RTP Timestamp. U-mode Unidirectional mode. W-LSB Window based LSB encoding. See section 4.5.2. Bormann (ed.) [Page 13] INTERNET-DRAFT Robust Header Compression Feb 26, 2001 3. Background This chapter provides a background to the subject of header compression. The fundamental ideas are described together with existing header compression schemes. Their drawbacks and requirements are then discussed, providing motivation for new header compression solutions. 3.1. Header compression fundamentals The main reason why header compression can be done at all is the fact that there is significant redundancy between header fields, both within the same packet header but in particular between consecutive packets belonging to the same packet stream. By sending static field information only initially and utilizing dependencies and predictability for other fields, the header size can be significantly reduced for most packets. Relevant information from past packets is maintained in a context. The context information is used to compress (decompress) subsequent packets. The compressor and decompressor update their contexts upon certain events. Impairment events may lead to inconsistencies between the contexts of the compressor and decompressor, which in turn may cause incorrect decompression. A robust header compression scheme needs mechanisms for avoiding context inconsistencies and also needs mechanisms for making the contexts consistent when they were not. 3.2. 
Existing header compression schemes

The original header compression scheme, CTCP [VJHC], was invented by Van Jacobson. CTCP compresses the 40 octet IP+TCP header to 4 octets. The CTCP compressor detects transport-level retransmissions and sends a header that updates the context completely when they occur. This repair mechanism does not require any explicit signaling between compressor and decompressor.

A general IP header compression scheme, IP header compression [IPHC], improves somewhat on CTCP and can compress arbitrary IP, TCP, and UDP headers. When compressing non-TCP headers, IPHC does not use delta encoding and is robust. When compressing TCP, the repair mechanism of CTCP is augmented with a link-level nacking scheme which speeds up the repair. IPHC does not compress RTP headers.

CRTP [CRTP, IPHC] by Casner and Jacobson is a header compression scheme that compresses 40 octets of IPv4/UDP/RTP headers to a minimum of 2 octets when the UDP Checksum is not enabled. If the UDP Checksum is enabled, the minimum CRTP header is 4 octets. CRTP cannot use the same repair mechanism as CTCP since UDP/RTP does not retransmit. Instead, CRTP uses explicit signaling messages from decompressor to compressor, called CONTEXT_STATE messages, to indicate that the context is out of sync. The link round-trip time will thus limit the speed of this context repair mechanism.

On lossy links with long round-trip times, such as most cellular links, CRTP does not perform well. Each lost packet over the link causes several subsequent packets to be lost since the context is out of sync during at least one link round-trip time. This behavior is documented in [CRTPC]. For voice conversations such long loss events will degrade the voice quality. Moreover, bandwidth is wasted by the large headers sent by CRTP when updating the context. [CRTPC] found that CRTP did not perform well enough for a lossy cellular link.
It is clear that CRTP alone is not a viable header compression scheme for IP telephony over cellular links.

To avoid losing packets due to the context being out of sync, CRTP decompressors can attempt to repair the context locally by using a mechanism known as TWICE. Each CRTP packet contains a counter which is incremented by one for each packet sent out by the CRTP compressor. If the counter increases by more than one, at least one packet was lost over the link. The decompressor then attempts to repair the context by guessing how the lost packet(s) would have updated it. The guess is then verified by decompressing the packet and checking the UDP Checksum -- if it succeeds, the repair is deemed successful and the packet can be forwarded or delivered. TWICE derives its name from the observation that when the compressed packet stream is regular, the correct guess is to apply the update in the current packet twice.

[CRTPC] found that even with TWICE, CRTP doubled the number of lost packets.

TWICE improves CRTP performance significantly. However, there are several problems with using TWICE:

1) It becomes mandatory to use the UDP Checksum:

   - the minimal compressed header size increases by 100% to 4 octets.

   - most speech codecs developed for cellular links tolerate errors in the encoded data. Such codecs will not want to enable the UDP Checksum, since they do want damaged packets to be delivered.

   - errors in the payload will make the UDP Checksum fail when the guess is correct (and might make it succeed when the guess is wrong).

2) Loss in an RTP stream that occurs before the compression point will make updates in CRTP headers less regular. Simple-minded versions of TWICE will then perform badly. More sophisticated versions would need more repair attempts to succeed.

3.3.
Requirements on a new header compression scheme

The major problem with CRTP is that it is not sufficiently robust against packets being damaged between compressor and decompressor. A viable header compression scheme must be less fragile. This increased robustness must be obtained without increasing the compressed header size; a larger header would make IP telephony over cellular links economically unattractive.

A major cause of the bad performance of CRTP over cellular links is the long link round-trip time, during which many packets are lost when the context is out of sync. This problem can be attacked directly by finding ways to reduce the link round-trip time. Future generations of cellular technologies may indeed achieve lower link round-trip times. However, these will probably always be fairly high. The benefits in terms of lower loss and smaller bandwidth demands if the context can be repaired locally will be present even if the link round-trip time is decreased. A reliable way to detect a successful context repair is then needed.

One might argue that a better way to solve the problem is to improve the cellular link so that packet loss is less likely to occur. Such modifications do not appear to come for free, however. If links were made (almost) error free, the system might not be able to support a sufficiently large number of users per cell and might thus be economically infeasible.

One might also argue that the speech codecs should be able to deal with the kind of packet loss induced by CRTP, in particular since the speech codecs probably must be able to deal with packet loss anyway if the RTP stream crosses the Internet. While the latter is true, the kind of loss induced by CRTP is difficult to deal with. It is usually not possible to completely hide a loss event where well over 100 ms worth of sound is completely lost. If such loss occurs frequently at both ends of the end-to-end path, the speech quality will suffer.
A detailed description of the requirements specified for ROHC may be found in [REQ].

3.4. Classification of header fields

As mentioned earlier, header compression is possible due to the fact that there is much redundancy between header field values within packets, but especially between consecutive packets. To utilize these properties for header compression, it is important to understand the change patterns of the various header fields.

All header fields have been classified in detail in Appendix A. The fields are first classified at a high level and then some of them are studied in more detail. Finally, the appendix concludes with recommendations on how the various fields should be handled by header compression algorithms. The main conclusion that can be drawn is that most of the header fields can easily be compressed away since they never or seldom change. Only 5 fields, with a combined size of about 10 octets, need more sophisticated mechanisms. These fields are:

   - IPv4 Identification (16 bits) - IP-ID
   - UDP Checksum (16 bits)
   - RTP Marker (1 bit)            - M-bit
   - RTP Sequence Number (16 bits) - SN
   - RTP Timestamp (32 bits)       - TS

The analysis in Appendix A reveals that the values of the TS and IP-ID fields can usually be predicted from the RTP Sequence Number, which increments by one for each packet emitted by an RTP source. The M-bit is also usually the same, but needs to be communicated explicitly occasionally. The UDP Checksum should not be predicted and is sent as-is when enabled.

The way ROHC RTP compression operates, then, is to first establish functions from SN to the other fields, and then reliably communicate the SN. Whenever a function from SN to another field changes, i.e., the existing function gives a result which is different from the field in the header to be compressed, additional information is sent to update the parameters of that function.
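The notion of a function from SN to another field can be illustrated with a minimal sketch. It assumes, purely for illustration, that the RTP Timestamp advances by a fixed stride for each SN step; the names TsFunction, predict, and needs_update are hypothetical and are not defined by ROHC.

```python
from dataclasses import dataclass


@dataclass
class TsFunction:
    """Hypothetical linear function from SN to TS (illustration only)."""
    sn_ref: int     # reference RTP Sequence Number (16-bit field)
    ts_ref: int     # RTP Timestamp observed at sn_ref (32-bit field)
    ts_stride: int  # assumed TS increase per SN step

    def predict(self, sn: int) -> int:
        # SN is a 16-bit counter and TS a 32-bit counter; both wrap around.
        delta_sn = (sn - self.sn_ref) & 0xFFFF
        return (self.ts_ref + delta_sn * self.ts_stride) & 0xFFFFFFFF


def needs_update(fn: TsFunction, sn: int, ts: int) -> bool:
    """True when the established function gives a result that differs from
    the TS in the header to be compressed, so its parameters must be resent."""
    return fn.predict(sn) != ts


fn = TsFunction(sn_ref=100, ts_ref=16000, ts_stride=160)
print(needs_update(fn, 101, 16160))  # False: the header follows the pattern
print(needs_update(fn, 102, 16800))  # True: the pattern has changed
```

In this sketch only the SN has to be communicated while needs_update returns False; a True result corresponds to the case where additional information is sent.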
Headers specific to Mobile IP (for IPv4 or IPv6) do not receive any special treatment in this document. They are compressible, however, and it is expected that the compression efficiency for Mobile IP headers will be good enough due to the handling of extension header lists and tunneling headers. It would be relatively painless to introduce a new ROHC profile with special treatment for Mobile IPv6 specific headers should the completed work on the Mobile IPv6 protocols (work in progress in the IETF) make that necessary.

4. Header compression framework

4.1. Operating assumptions

Cellular links, which are a primary target for ROHC, have a number of characteristics that are described briefly here. ROHC requires functionality from lower layers that is outlined here and more thoroughly described in the lower layer guidelines document [LLG].

Channels

   ROHC header-compressed packets flow on channels. Unlike many fixed links, some cellular radio links can have several channels connecting the same pair of nodes. Each channel can have different characteristics in terms of error rate, bandwidth, etc.

Context identifiers

   On some channels, the ability to transport multiple packet streams is required. It can also be feasible to have channels dedicated to individual packet streams. Therefore, ROHC uses a distinct context identifier space per channel and can eliminate context identifiers completely for one of the streams when few streams share a channel.

Packet type indication

   Packet type indication is done in the header compression scheme itself. Unless the link already has a way of indicating packet types which can be used, such as PPP, this provides smaller compressed headers overall. It may also be less difficult to allocate a single packet type, rather than many, in order to run ROHC over links such as PPP.
Reordering

   The channel between compressor and decompressor is required to maintain packet ordering, i.e., the decompressor must receive packets in the same order as the compressor sent them. (Reordering before the compression point, however, is dealt with, i.e., there is no assumption that the compressor will only receive packets in sequence.)

Duplication

   The channel between compressor and decompressor is required to not duplicate packets. (Duplication before the compression point, however, is dealt with, i.e., there is no assumption that the compressor will receive only one copy of each packet.)

Packet length

   ROHC is designed under the assumption that lower layers indicate the length of a compressed packet. ROHC packets do not contain length information for the payload.

Framing

   The link layer must provide framing that makes it possible to distinguish frame boundaries and individual frames.

Error detection/protection

   The ROHC scheme has been designed to cope with residual errors in the headers delivered to the decompressor. CRCs and sanity checks are used to prevent or reduce damage propagation. However, it is RECOMMENDED that lower layers deploy error detection for ROHC headers and do not deliver ROHC headers with high residual error rates.

   Without giving a hard limit on the residual error rate acceptable to ROHC, it is noted that for a residual bit error rate of at most 1E-5, the ROHC scheme has been designed not to increase the number of damaged headers, i.e., the number of damaged headers due to damage propagation is designed to be less than the number of damaged headers caught by the ROHC error detection scheme.

Negotiation

   In addition to the packet handling mechanisms above, the link layer MUST provide a way to negotiate header compression parameters, see also section 5.1.1. (For unidirectional links, this negotiation may be performed out-of-band or even a priori.)

4.2.
Dynamicity

The ROHC protocol achieves its compression gain by establishing state information at both ends of the link, i.e., at the compressor and at the decompressor. Different parts of the state are established at different times and with different frequency; hence, it can be said that some of the state information is more dynamic than the rest.

Some state information is established at the time a channel is established; ROHC assumes the existence of an out-of-band negotiation protocol (such as PPP), or predefined channel state (most useful for unidirectional links). In both cases, we speak of "negotiated channel state". ROHC does not assume that this state can change dynamically during the channel lifetime (and does not explicitly support such changes, although some changes may be innocuous from a protocol point of view). An example of negotiated channel state is the highest context ID number to be used by the compressor (MAX_CID).

Other state information is associated with the individual packet streams in the channel; this state is said to be part of the context. Using context identifiers (CIDs), multiple packet streams with different contexts can share a channel. The negotiated channel state indicates the highest context identifier to be used, as well as the selection of one of two ways to indicate the CID in the compressed header.

It is up to the compressor to decide which packets to associate with a context (or, equivalently, which packets constitute a single stream); however, ROHC is efficient only when all packets of a stream share certain properties, such as having the same values for fields that are described as "static" in this document (e.g., the IP addresses, port numbers, and RTP parameters such as the payload type). The efficiency of ROHC RTP also depends on the compressor seeing most RTP Sequence Numbers.
Streams need not share all characteristics important for compression. ROHC has a notion of compression profiles: a compression profile denotes a predefined set of such characteristics. To provide extensibility, the negotiated channel state includes the set of profiles acceptable to the decompressor. The context state includes the profile currently in use for the context.

Other elements of the context state may include the current values of all header fields (from these one can deduce whether an IPv4 header is present in the header chain, and whether UDP Checksums are enabled), as well as additional compression context that is not part of an uncompressed header, e.g., TS_STRIDE, IP-ID characteristics (incrementing as a 16-bit value in network byte order? random?), a number of old reference headers, and the compressor/decompressor state machines (see next section).

This document actually defines four ROHC profiles: One uncompressed profile, the main ROHC RTP compression profile, and two variants of this profile for compression of packets with header chains that end in UDP and ESP, respectively, but where RTP compression is not applicable. The descriptive text in the rest of this section is referring to the main ROHC RTP compression profile.

4.3. Compression and decompression states

Header compression with ROHC can be characterized as an interaction between two state machines, one compressor machine and one decompressor machine, each instantiated once per context. The compressor and the decompressor have three states each, which in many ways are related to each other even if the meanings of the states are slightly different for the two parties. Both machines start in the lowest compression state and transit gradually to higher states.

Transitions need not be synchronized between the two machines. In normal operation it is only the compressor that temporarily transits back to lower states.
The decompressor will transit back only when context damage is detected.

Subsequent sections present an overview of the state machines and their corresponding states, respectively, starting with the compressor.

4.3.1. Compressor states

For ROHC compression, the three compressor states are the Initialization and Refresh (IR), First Order (FO), and Second Order (SO) states. The compressor starts in the lowest compression state (IR) and transits gradually to higher compression states. The compressor will always operate in the highest possible compression state, under the constraint that the compressor is sufficiently confident that the decompressor has the information necessary to decompress a header compressed according to that state.

   +----------+                +----------+                +----------+
   | IR State |   <-------->   | FO State |   <-------->   | SO State |
   +----------+                +----------+                +----------+

Decisions about transitions between the various compression states are taken by the compressor on the basis of:

   - variations in packet headers
   - positive feedback from decompressor (Acknowledgments -- ACKs)
   - negative feedback from decompressor (Negative ACKs -- NACKs)
   - periodic timeouts (when operating in unidirectional mode, i.e., over simplex channels or when feedback is not enabled)

How transitions are performed is explained in detail in chapter 5 for each mode of operation.

4.3.1.1. Initialization and Refresh (IR) State

The purpose of the IR state is to initialize the static parts of the context at the decompressor or to recover after failure. In this state, the compressor sends complete header information. This includes all static and nonstatic fields in uncompressed form plus some additional information.

The compressor stays in the IR state until it is fairly confident that the decompressor has received the static information correctly.

4.3.1.2.
First Order (FO) State

The purpose of the FO state is to efficiently communicate irregularities in the packet stream. When operating in this state, the compressor rarely sends information about all dynamic fields, and the information sent is usually compressed at least partially. Only a few static fields can be updated. The difference between IR and FO should therefore be clear.

The compressor enters this state from the IR state, and from the SO state whenever the headers of the packet stream do not conform to their previous pattern. It stays in the FO state until it is confident that the decompressor has acquired all the parameters of the new pattern. Changes in fields that are always irregular are communicated in all packets and are therefore part of what is a uniform pattern.

Some or all packets sent in the FO state carry context updating information. It is very important to detect corruption of such packets to avoid erroneous updates and context inconsistencies.

4.3.1.3. Second Order (SO) State

This is the state where compression is optimal. The compressor enters the SO state when the header to be compressed is completely predictable given the SN (RTP Sequence Number) and the compressor is sufficiently confident that the decompressor has acquired all parameters of the functions from SN to other fields. Correct decompression of packets sent in the SO state only hinges on correct decompression of the SN. However, successful decompression also requires that the information sent in the preceding FO state packets has been successfully received by the decompressor.

The compressor leaves this state and goes back to the FO state when the header no longer conforms to the uniform pattern and cannot be independently compressed on the basis of previous context information.

4.3.2. Decompressor states

The decompressor starts in its lowest compression state, "No Context", and gradually transits to higher states.
The decompressor state machine normally never leaves the "Full Context" state once it has entered this state.

   +--------------+         +----------------+         +--------------+
   |  No Context  |  <--->  | Static Context |  <--->  | Full Context |
   +--------------+         +----------------+         +--------------+

Initially, while working in the "No Context" state, the decompressor has not yet successfully decompressed a packet. Once a packet has been decompressed correctly (for example, upon reception of an initialization packet with static and dynamic information), the decompressor can transit all the way to the "Full Context" state, and only upon repeated failures will it transit back to lower states. However, when that happens it first transits back to the "Static Context" state. There, reception of any packet sent in the FO state is normally sufficient to enable transition to the "Full Context" state again. Only when decompression of several packets sent in the FO state fails in the "Static Context" state will the decompressor go all the way back to the "No Context" state.

When state transitions are performed is explained in detail in chapter 5.

4.4. Modes of operation

The ROHC scheme has three modes of operation, called Unidirectional, Bidirectional Optimistic, and Bidirectional Reliable mode.

It is important to understand the difference between states, as described in the previous chapter, and modes. These abstractions are orthogonal to each other. The state abstraction is the same for all modes of operation, while the mode controls the logic of state transitions and what actions to perform in each state.
                      +----------------------+
                      |  Unidirectional Mode |
                      |   +--+  +--+  +--+   |
                      |   |IR|  |FO|  |SO|   |
                      |   +--+  +--+  +--+   |
                      +----------------------+
                        ^                  ^
                       /                    \
                      /                      \
                     v                        v
 +----------------------+                +----------------------+
 |   Optimistic Mode    |                |    Reliable Mode     |
 |   +--+  +--+  +--+   |                |   +--+  +--+  +--+   |
 |   |IR|  |FO|  |SO|   | <------------> |   |IR|  |FO|  |SO|   |
 |   +--+  +--+  +--+   |                |   +--+  +--+  +--+   |
 +----------------------+                +----------------------+

The optimal mode to operate in depends on the characteristics of the environment of the compression protocol, such as feedback abilities, error probabilities and distributions, effects of header size variation, etc. All ROHC implementations MUST implement and support all three modes of operation. The three modes are briefly described in the following subsections.

Detailed descriptions of the three modes of operation regarding compression and decompression logic are given in chapter 5. The mode transition mechanisms, too, are described in chapter 5.

4.4.1. Unidirectional mode -- U-mode

When in the Unidirectional mode of operation, packets are sent in one direction only: from compressor to decompressor. This mode therefore makes ROHC usable over links where a return path from decompressor to compressor is unavailable or undesirable.

In U-mode, transitions between compressor states are performed only on account of periodic timeouts and irregularities in the header field change patterns in the compressed packet stream. Due to the periodic refreshes and the lack of feedback for initiation of error recovery, compression in the Unidirectional mode will be less efficient and have a slightly higher probability of loss propagation compared to any of the Bidirectional modes.

Compression with ROHC MUST start in the Unidirectional mode.
Transition to any of the Bidirectional modes can be performed as soon as a packet has reached the decompressor and it has replied with a feedback packet indicating that a mode transition is desired (see chapter 5).

4.4.2. Bidirectional Optimistic mode -- O-mode

The Bidirectional Optimistic mode is similar to the Unidirectional mode. The difference is that a feedback channel is used to send error recovery requests and (optionally) acknowledgments of significant context updates from decompressor to compressor (not, however, for pure sequence number updates). Periodic refreshes are not used in the Bidirectional Optimistic mode.

O-mode aims to maximize compression efficiency and sparse usage of the feedback channel. It reduces the number of damaged headers delivered to the upper layers due to residual errors or context invalidation. The frequency of context invalidation may be higher than for R-mode, in particular when long loss/error bursts occur. Refer to section 4.7 for more details.

4.4.3. Bidirectional Reliable mode -- R-mode

The Bidirectional Reliable mode differs in many ways from the previous two. The most important differences are a more intensive usage of the feedback channel and a stricter logic at both the compressor and the decompressor that prevents loss of context synchronization between compressor and decompressor except for very high residual bit error rates. Feedback is sent to acknowledge all context updates, including updates of the sequence number field. However, not every packet updates the context in Reliable mode.

R-mode aims to maximize robustness against loss propagation and damage propagation, i.e., minimize the probability of context invalidation, even under header loss/error burst conditions. It may have a lower probability of context invalidation than O-mode, but a larger number of damaged headers may be delivered when the context actually is invalidated.
Refer to section 4.7 for more details.

4.5. Encoding methods

This chapter describes the encoding methods used for header fields. How the methods are applied to each field (e.g., values of associated parameters) is specified in section 5.7.

4.5.1. Least Significant Bits (LSB) encoding

Least Significant Bits (LSB) encoding is used for header fields whose values are usually subject to small changes. With LSB encoding, the k least significant bits of the field value are transmitted instead of the original field value, where k is a positive integer. After receiving k bits, the decompressor derives the original value using a previously received value as reference (v_ref).

The scheme is guaranteed to be correct if the compressor and the decompressor each use interpretation intervals

   1) in which the original value resides, and

   2) in which the original value is the only value that has the exact same k least significant bits as those transmitted.

The interpretation interval can be described as a function f(v_ref, k). Let

   f(v_ref, k) = [v_ref - p, v_ref + (2^k - 1) - p]

where p is an integer.

      <------- interpretation interval (size is 2^k) ------->
      |-------------+---------------------------------------|
   v_ref - p      v_ref                        v_ref + (2^k-1) - p

The function f has the following property: for any value k, the k least significant bits will uniquely identify a value in f(v_ref, k).

The parameter p is introduced so that the interpretation interval can be shifted with respect to v_ref. Choosing a good value for p will yield a more efficient encoding for fields with certain characteristics. Below are some examples:

a) For field values that are expected always to increase, p can be set to -1. The interpretation interval becomes [v_ref + 1, v_ref + 2^k].

b) For field values that stay the same or increase, p can be set to 0. The interpretation interval becomes [v_ref, v_ref + 2^k - 1].
c) For field values that are expected to deviate only slightly from a constant value, p can be set to 2^(k-1) - 1. The interpretation interval becomes [v_ref - 2^(k-1) + 1, v_ref + 2^(k-1)].

d) For field values that are expected to undergo small negative changes and larger positive changes, such as the RTP TS for video, or RTP SN when there is misordering, p can be set to 2^(k-2) - 1. The interval becomes [v_ref - 2^(k-2) + 1, v_ref + 3 * 2^(k-2)], i.e., 3/4 of the interval is used for positive changes.

The following is a simplified procedure for LSB compression and decompression; it is modified for robustness and damage propagation protection in the next subsection:

1) The compressor (decompressor) always uses v_ref_c (v_ref_d), the last value that has been compressed (decompressed), as v_ref;

2) When compressing a value v, the compressor finds the minimum value of k such that v falls into the interval f(v_ref_c, k). Call this function k = g(v_ref_c, v). When only a few distinct values of k are possible, for example due to limitations imposed by packet formats (see section 5.7), the compressor will instead pick the smallest of the available values of k that puts v in the interval f(v_ref_c, k).

3) When receiving m LSBs, the decompressor uses the interpretation interval f(v_ref_d, m), called interval_d. It picks as the decompressed value the one in interval_d whose LSBs match the received m bits.

Note that the values to be encoded have a finite range; for example, the RTP SN ranges from 0 to 0xFFFF. When the SN value is close to 0 or 0xFFFF, the interpretation interval can straddle the wraparound boundary between 0 and 0xFFFF.

The scheme is complicated by two factors: packet loss between the compressor and decompressor, and transmission errors undetected by the lower layer. In the former case, the compressor and decompressor will lose the synchronization of v_ref, and thus also of the interpretation interval.
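As a rough illustration of the simplified procedure in steps 1) to 3), ignoring for the moment the loss and error cases just mentioned, the following sketch computes the interpretation interval, the function g, and the decompressed value. It assumes a 16-bit field and a shift p that does not depend on k (as in examples a and b); the helper names interval, lsb_encode, and lsb_decode are illustrative and not part of ROHC.

```python
def interval(v_ref: int, k: int, p: int) -> tuple[int, int]:
    """f(v_ref, k) = [v_ref - p, v_ref + (2^k - 1) - p]."""
    return v_ref - p, v_ref + (1 << k) - 1 - p


def g(v_ref: int, v: int, p: int = 0, width: int = 16) -> int:
    """Smallest k (k >= 1) such that v lies in f(v_ref, k)."""
    modulus = 1 << width
    for k in range(1, width):
        low, _ = interval(v_ref, k, p)
        # Distance of v above the lower bound, taken modulo the field size
        # so that intervals straddling the wraparound boundary work.
        if (v - low) % modulus < (1 << k):
            return k
    return width  # all bits of the field are needed


def lsb_encode(v: int, k: int) -> int:
    """Keep only the k least significant bits of v."""
    return v & ((1 << k) - 1)


def lsb_decode(lsbs: int, k: int, v_ref: int, p: int = 0, width: int = 16) -> int:
    """Pick the value in f(v_ref, k) whose k LSBs match the received bits."""
    low, _ = interval(v_ref, k, p)
    offset = (lsbs - low) % (1 << k)
    return (low + offset) % (1 << width)


# Step 1: both sides use the last value compressed (decompressed) as v_ref.
v_ref_c = v_ref_d = 0xFFFE
for v in (0xFFFF, 0x0001, 0x0005):           # SN stream crossing the wraparound
    k = g(v_ref_c, v)                         # step 2
    decoded = lsb_decode(lsb_encode(v, k), k, v_ref_d)  # step 3
    assert decoded == v
    v_ref_c = v_ref_d = v
```

The sketch only holds while v_ref_c and v_ref_d stay identical; it does not model packet loss or undetected transmission errors.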
If v is still covered by the intersection(interval_c, interval_d), the decompression will be correct. Otherwise, incorrect decompression will result. The next section will address this issue further.

In the case of undetected transmission errors, the corrupted LSBs will give an incorrectly decompressed value that will later be used as v_ref_d, which in turn is likely to lead to damage propagation. This problem is addressed by using a secure reference, i.e., a reference value whose correctness is verified by a protecting CRC. Consequently, the procedure 1) above is modified as follows:

1) a) the compressor always uses as v_ref_c the last value that has been compressed and sent with a protecting CRC.

   b) the decompressor always uses as v_ref_d the last correct value, as verified by a successful CRC.

Note that in U/O-mode, 1) b) is modified so that if decompression of the SN fails using the last verified SN reference, another decompression attempt is made using the last but one verified SN reference. This procedure mitigates damage propagation when a small CRC fails to detect a damaged value. See section 5.3.2.2.3 for further details.

4.5.2. Window-based LSB encoding (W-LSB encoding)

This section describes how to modify the simplified algorithm in 4.5.1 to achieve robustness.

The compressor may not be able to determine the exact value of v_ref_d that will be used by the decompressor for a particular value v, since some candidates for v_ref_d may have been lost or damaged. However, by using feedback or by making reasonable assumptions, the compressor can limit the candidate set. The compressor then calculates k such that no matter which v_ref_d in the candidate set the decompressor uses, v is covered by the resulting interval_d.

Since the decompressor always uses as the reference the last received value where the CRC succeeded, the compressor maintains a sliding window containing the candidates for v_ref_d. The sliding window is initially empty.
The following operations are performed on the sliding window by the compressor:

1) After sending a value v (compressed or uncompressed) protected by a CRC, the compressor adds v to the sliding window.

2) For each value v being compressed, the compressor chooses k = max(g(v_min, v), g(v_max, v)), where v_min and v_max are the minimum and maximum values in the sliding window, and g is the function defined in the previous section.

3) When the compressor is sufficiently confident that a certain value v will not be used as a reference by the decompressor, the window is advanced by removing v and all values older than v.

The confidence may be obtained by various means. In R-mode, an ACK from the decompressor implies that values older than the ACKed one can be removed from the sliding window. In U/O-mode there is always a CRC to verify correct decompression, and a sliding window with a limited maximum width is used. The window width is an implementation dependent optimization parameter.

Note that the decompressor follows the procedure described in the previous section, except that in R-mode it MUST ACK each header received with a succeeding CRC (see also section 5.5).

4.5.3. Scaled RTP Timestamp encoding

The RTP Timestamp (TS) will usually not increase by an arbitrary number from packet to packet. Instead, the increase is normally an integral multiple of some unit (TS_STRIDE). For example, in the case of audio, the sample rate is normally 8 kHz and one voice frame may cover 20 ms. Furthermore, each voice frame is often carried in one RTP packet. In this case, the RTP TS increment is always n * 160 (= 8000 * 0.02), for some integer n. Note that silence periods have no impact on this, as the sample clock at the source normally keeps running without changing either frame rate or frame boundaries.

In the case of video, there is usually a TS_STRIDE as well when the video frame level is considered.
The sample rate for most video codecs is 90 kHz. If the video frame rate is fixed, say, to 30 frames/second, the TS will increase by n * 3000 (= n * 90000 / 30) between video frames. Note that a video frame is often divided into several RTP packets to increase robustness against packet loss. In this case several RTP packets will carry the same TS.

When using scaled RTP Timestamp encoding, the TS is downscaled by a factor of TS_STRIDE before compression. This saves

   floor(log2(TS_STRIDE))

bits for each compressed TS. TS and TS_SCALED satisfy the following equality:

   TS = TS_SCALED * TS_STRIDE + TS_OFFSET

TS_STRIDE is explicitly, and TS_OFFSET implicitly, communicated to the decompressor. The following algorithm is used:

1. Initialization: The compressor sends to the decompressor the value of TS_STRIDE and the absolute value of one or several TS fields. The latter are used by the decompressor to initialize TS_OFFSET to (absolute value) modulo TS_STRIDE. Note that TS_OFFSET is the same regardless of which absolute value is used, as long as the unscaled TS value does not wrap around; see item 4 below.

2. Compression: After initialization, the compressor no longer compresses the original TS values. Instead, it compresses the downscaled values: TS_SCALED = TS / TS_STRIDE. The compression method could be either W-LSB encoding or the timer-based encoding described in the next section.

3. Decompression: When receiving the compressed value of TS_SCALED, the decompressor first derives the value of the original TS_SCALED. The original RTP TS is then calculated as TS = TS_SCALED * TS_STRIDE + TS_OFFSET.

4. Offset at wraparound: Wraparound of the unscaled 32-bit TS will invalidate the current value of TS_OFFSET used in the equation above. For example, let us assume TS_STRIDE = 160 = 0xA0 and the current TS = 0xFFFFFFF0. TS_OFFSET is then 0x50 = 80.
Then if the next RTP TS = 0x00000130 (i.e., the increment is 160 * 2 = 320), the new TS_OFFSET should be 0x00000130 modulo 0xA0 = 0x90 = 144. The compressor is not required to re-initialize TS_OFFSET at wraparound. Instead, the decompressor MUST detect wraparound of the unscaled TS (which is trivial) and update TS_OFFSET to

   TS_OFFSET = (Wrapped around unscaled TS) modulo TS_STRIDE

5. Interpretation interval at wraparound: Special rules are needed for the interpretation interval of the scaled TS at wraparound, since the maximum scaled TS, TSS_MAX, (0xFFFFFFFF / TS_STRIDE) may not have the form 2^m - 1. For example, when TS_STRIDE is 160, the scaled TS is at most 26843545, which has LSBs 10011001. The wraparound boundary between TSS_MAX and zero may thus not correspond to a natural boundary between LSBs.

            interpretation interval
       |<------------------------------>|

                       unused                       scaled TS
          ------------|--------------|---------------------->
                   TSS_MAX         zero

When TSS_MAX is part of the interpretation interval, a number of unused values are inserted into it after TSS_MAX such that their LSBs follow naturally upon each other. For example, for TS_STRIDE = 160 and k = 4, values corresponding to the LSBs 1010 through 1111 are inserted. The number of inserted values depends on k and the LSBs of the maximum scaled TS. The number of valid values in the interpretation interval should be high enough to maintain robustness. This can be ensured by the following rule:

   Let a be the number of LSBs needed if there was no wraparound, and let b be the number of LSBs needed to disambiguate between TSS_MAX and zero. The number of LSB bits to send while TSS_MAX or zero is part of the interpretation interval is max(a + 1, b).

This scaling method can be applied to many frame-based codecs. However, the value of TS_STRIDE might change during a session, for example as a result of adaptation strategies.
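The scaling arithmetic and the wraparound example above can be checked with a minimal sketch. The function names are hypothetical and chosen only for this illustration; the sketch covers the arithmetic alone, not the W-LSB or timer-based compression of TS_SCALED.

```python
TS_MODULUS = 1 << 32  # the RTP TS is an unsigned 32-bit field


def ts_offset(ts: int, ts_stride: int) -> int:
    """TS_OFFSET = (absolute TS value) modulo TS_STRIDE."""
    return ts % ts_stride


def scale(ts: int, ts_stride: int) -> int:
    """Compressor side: TS_SCALED = TS / TS_STRIDE (integer division)."""
    return ts // ts_stride


def unscale(ts_scaled: int, ts_stride: int, offset: int) -> int:
    """Decompressor side: TS = TS_SCALED * TS_STRIDE + TS_OFFSET."""
    return (ts_scaled * ts_stride + offset) % TS_MODULUS


stride = 160                               # 0xA0, as in the example above
ts = 0xFFFFFFF0
offset_before = ts_offset(ts, stride)      # offset before wraparound
next_ts = (ts + 2 * stride) % TS_MODULUS   # increment of 320 wraps the TS
offset_after = ts_offset(next_ts, stride)  # offset the decompressor must adopt
tss_max = 0xFFFFFFFF // stride             # maximum scaled TS for this stride
```

With these inputs the offsets come out as 0x50 before and 0x90 after the wraparound, matching item 4. The sketch assumes a fixed TS_STRIDE, whereas, as noted above, the value of TS_STRIDE might change during a session.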
If that happens, the unscaled TS is compressed until re-initialization of the new TS_STRIDE and TS_OFFSET is completed. 4.5.4. Timer-based compression of RTP Timestamp The RTP Timestamp [RFC 1889] is defined to identify the number of the first sample used to generate the payload. When 1) RTP packets carry payloads corresponding to a fixed sampling interval, 2) the sampling is done at a constant rate, and 3) packets are generated in lock-step with sampling, then the timestamp value will closely approximate a linear function of the time of day. This is the case for conversational media, such as interactive speech. The linear ratio is determined by the source sample rate. The linear pattern can be complicated by packetization (e.g., in the case of video where a video frame usually corresponds to several RTP packets) or frame rearrangement (e.g., B-frames are sent out-of-order by some video codecs). With a fixed sample rate of 8 kHz, 20 ms in the time domain is equivalent to an increment of 160 in the unscaled TS domain, and to an increment of 1 in the scaled TS domain with TS_STRIDE = 160. As a consequence, the (scaled) TS of headers arriving at the decompressor will be a linear function of time of day, with some deviation due to the delay jitter (and the clock inaccuracies) between the source and the decompressor. In normal operation, i.e., no crashes or failures, the delay jitter will be bounded to meet the requirements of conversational real-time traffic. Hence, by using a local clock the decompressor can obtain an approximation of the (scaled) TS in the header to be decompressed by considering its arrival time. The approximation can then be refined with the k LSBs of the (scaled) TS carried in the header. The value of k required to ensure correct decompression is a function of the jitter between the source and the decompressor. 
If the compressor knows the potential jitter introduced between compressor and decompressor, it can determine k by using a local clock to estimate jitter in packet arrival times, or alternatively it can use a fixed k and discard packets arriving too much out of time.

The advantages of this scheme include:

a) The size of the compressed TS is constant and small. In particular, it does NOT depend on the length of silence intervals. This is in contrast to other TS compression techniques, which at the beginning of a talkspurt require sending a number of bits dependent on the duration of the preceding silence interval.

b) No synchronization is required between the clock local to the compressor and the clock local to the decompressor.

Note that although this scheme can be made to work using both scaled and unscaled TS, in practice it is always combined with scaled TS encoding because of the less demanding requirement on the clock resolution, e.g., 20 ms instead of 1/8 ms. Therefore, the algorithm described below assumes that the clock-based encoding scheme operates on the scaled TS. The case of unscaled TS would be similar, with changes to scale factors.

The major task of the compressor is to determine the value of k. Its sliding window now contains not only potential reference values for the TS but also their times of arrival at the compressor.

1) The compressor maintains a sliding window

      {(T_j, a_j), for each header j that can be used as a reference},

   where T_j is the scaled TS for header j, and a_j is the arrival time of header j. The sliding window serves the same purpose as the W-LSB sliding window of section 4.5.2.

2) When a new header n arrives with T_n as the scaled TS, the compressor notes the arrival time a_n.
It then calculates

      Max_Jitter_BC =

         max {|(T_n - T_j) - ((a_n - a_j) / TIME_STRIDE)|,
            for all headers j in the sliding window},

   where TIME_STRIDE is the time interval equivalent to one TS_STRIDE, e.g., 20 ms. Max_Jitter_BC is the maximum observed jitter before the compressor, in units of TS_STRIDE, for the headers in the sliding window.

3) k is calculated as

      k = ceiling(log2(2 * J + 1)),

   where J = Max_Jitter_BC + Max_Jitter_CD + 2.

   Max_Jitter_CD is the upper bound of jitter expected on the communication channel between compressor and decompressor (CD-CC). It depends only on the characteristics of CD-CC.

   The constant 2 accounts for the quantization error introduced by the clocks at the compressor and decompressor, which can be +/-1.

   Note that the calculation of k follows the compression algorithm described in section 4.5.1, with p = 2^(k-1) - 1.

4) The sliding window is subject to the same window operations as in section 4.5.2, 1) and 3), except that the values added and removed are paired with their arrival times.

Decompressor:

1) The decompressor uses as its reference header the last correctly (as verified by CRC) decompressed header. It maintains the pair (T_ref, a_ref), where T_ref is the scaled TS of the reference header, and a_ref is the arrival time of the reference header.

2) When receiving a compressed header n at time a_n, the approximation of the original scaled TS is calculated as:

      T_approx = T_ref + (a_n - a_ref) / TIME_STRIDE.

3) The approximation is then refined by the k least significant bits carried in header n, following the decompression algorithm of section 4.5.1, with p = 2^(k-1) - 1.

Note: The algorithm does not assume any particular pattern in the packets arriving at the compressor, i.e., it tolerates reordering before the compressor and nonincreasing RTP Timestamp behavior.

Note: Integer arithmetic is used in all equations above.
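The compressor and decompressor steps above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the function names are hypothetical, times are integer milliseconds, arrival times in the window are not later than a_n, and the refinement step assumes the interpretation interval [v_ref - p, v_ref + (2^k - 1) - p] of section 4.5.1. The numbers in the example are invented for illustration.

```python
def max_jitter_bc(window, t_n, a_n, time_stride):
    """Compressor step 2: jitter before the compressor, in units of TS_STRIDE.

    window is a list of (T_j, a_j) pairs for all candidate reference headers.
    """
    return max(abs((t_n - t_j) - (a_n - a_j) // time_stride)
               for t_j, a_j in window)


def bits_needed(jitter_bc, jitter_cd):
    """Compressor step 3: k = ceiling(log2(2 * J + 1)) in integer arithmetic."""
    j = jitter_bc + jitter_cd + 2
    # ceiling(log2(n)) equals (n - 1).bit_length() for any integer n >= 1.
    return (2 * j).bit_length()


def decompress_scaled_ts(t_ref, a_ref, a_n, lsbs, k, time_stride):
    """Decompressor steps 2 and 3: approximate from the clock, refine by LSBs."""
    t_approx = t_ref + (a_n - a_ref) // time_stride
    p = 2 ** (k - 1) - 1
    low = t_approx - p  # interval is [low, low + 2^k - 1]
    return low + ((lsbs - low) % (1 << k))


# Compressor: TIME_STRIDE = 20 ms, two candidate references in the window.
window = [(100, 0), (101, 20)]
jitter = max_jitter_bc(window, t_n=110, a_n=230, time_stride=20)
k = bits_needed(jitter, jitter_cd=1)
lsbs = 110 & ((1 << k) - 1)

# Decompressor: its own clock, not synchronized with the compressor's.
t_n = decompress_scaled_ts(t_ref=101, a_ref=25, a_n=250,
                           lsbs=lsbs, k=k, time_stride=20)
```

In this example the observed jitter is 1, so J = 4 and k = 4, and the decompressor recovers the scaled TS 110 from its clock-based approximation 112 and the four transmitted LSBs.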
If TIME_STRIDE is not equal to an integral number of clock ticks, time must be normalized such that TIME_STRIDE is an integral number of clock ticks. For example, if a clock tick is 20 ms and TIME_STRIDE is 30 ms, (a_n - a_ref) in 2) can be multiplied by 3 and TIME_STRIDE can have the value 2.

Note: The clock resolution of the compressor or decompressor can be worse than TIME_STRIDE, in which case the difference, i.e., actual resolution - TIME_STRIDE, is treated as additional jitter in the calculation of k.

Note: The clock resolution of the decompressor may be communicated to the compressor using the CLOCK feedback option.

Note: The decompressor may observe the jitter and report this to the compressor using the JITTER feedback option. The compressor may use this information to refine its estimate of Max_Jitter_CD.

4.5.5. Offset IP-ID encoding

As all IPv4 packets have an IP Identifier to allow for fragmentation, ROHC provides for transparent compression of this ID. There is no explicit support in ROHC for the IPv6 fragmentation header, so there is never a need to discuss IP IDs outside the context of IPv4.

This section assumes (initially) that the IPv4 stack at the source host assigns IP-ID according to the value of a 2-byte counter which is increased by one after each assignment to an outgoing packet. Therefore, the IP-ID field of a particular IPv4 packet flow will increment by 1 from packet to packet except when the source has emitted intermediate packets not belonging to that flow.

For such IPv4 stacks, the RTP SN will increase by 1 for each packet emitted and the IP-ID will increase by at least the same amount. Thus, it is more efficient to compress the offset, i.e., (IP-ID - RTP SN), instead of IP-ID itself.

The remainder of section 4.5.5 describes how to compress/decompress the sequence of offsets using W-LSB encoding/decoding, with p = 0 (see section 4.5.1).
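Why the offset is cheaper to compress than the IP-ID itself can be seen in a small sketch. The helper names and the packet values are hypothetical; the sketch assumes that IP-ID, RTP SN and offset are all handled as unsigned 16-bit quantities, and it shows only the offset arithmetic, not the W-LSB coding of the offsets.

```python
MOD16 = 1 << 16  # IP-ID and RTP SN are both 16-bit fields


def ip_id_offset(ip_id: int, sn: int) -> int:
    """Offset = IP-ID - RTP SN, as an unsigned 16-bit quantity."""
    return (ip_id - sn) % MOD16


def regenerate_ip_id(sn: int, offset: int) -> int:
    """Inverse operation: IP-ID = SN + Offset, modulo 2^16."""
    return (sn + offset) % MOD16


# (IP-ID, RTP SN) pairs of one flow. Between the third and fourth packet
# the source emitted two packets of another flow, and the IP-ID counter
# wrapped around.
headers = [(65533, 10), (65534, 11), (65535, 12), (2, 13), (3, 14)]
offsets = [ip_id_offset(ip_id, sn) for ip_id, sn in headers]
```

The offset stays constant as long as the source emits only packets of this flow and changes by 2 when two foreign packets are interleaved, whereas the IP-ID itself changes with every packet.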
All IP-ID arithmetic is done using unsigned 16-bit quantities, i.e., modulo 2^16.

Compressor:

The compressor uses W-LSB encoding (section 4.5.2) to compress a sequence of offsets

   Offset_i = ID_i - SN_i,

where ID_i and SN_i are the values of the IP-ID and RTP SN of header i. The sliding window contains such offsets and not the values of header fields, but the rules for adding and deleting offsets from the window otherwise follow section 4.5.2.

Decompressor:

The reference header is the last correctly (as verified by CRC) decompressed header.

When receiving a compressed packet m, the decompressor calculates Offset_ref = ID_ref - SN_ref, where ID_ref and SN_ref are the values of IP-ID and RTP SN in the reference header, respectively.

Then W-LSB decoding is used to decompress Offset_m, using the received LSBs in packet m and Offset_ref. Note that m may contain zero LSBs for Offset_m, in which case Offset_m = Offset_ref.

Finally, the IP-ID for packet m is regenerated as

   IP-ID for m = decompressed SN of packet m + Offset_m

Network byte order:

Some IPv4 stacks do use a counter to generate IP ID values as described, but do not transmit the contents of this counter in network byte order, but instead send the two octets reversed. In this case, the compressor can compress the IP-ID field after swapping the bytes. Consequently, the decompressor also swaps the bytes of the IP-ID after decompression to regenerate the original IP-ID. This requires that the compressor and the decompressor synchronize on the byte order of the IP-ID field using the NBO or NBO2 flag (see section 5.7).

Random IP Identifier:

Some IPv4 stacks generate the IP Identifier values using a pseudo-random number generator. While this may provide some security benefits, it makes it pointless to attempt compressing the field. Therefore, the compressor should detect such random behavior of the field.
After detection and synchronization with the decompressor using the RND or RND2 flag, the field is sent as-is in its entirety as additional octets after the compressed header.

4.5.6. Self-describing variable-length values

The values of TS_STRIDE and a few other compression parameters can vary widely. TS_STRIDE can be 160 for voice and 90 000 for 1 f/s video. To optimize the transfer of such values, a variable number of octets is used to encode them. The number of octets used is determined by the first few bits of the first octet:

First bit is 0: 1 octet.
         7 bits transferred.
         Up to 127 decimal.
         Encoded octets in hexadecimal: 00 to 7F

First bits are 10: 2 octets.
         14 bits transferred.
         Up to 16 383 decimal.
         Encoded octets in hexadecimal: 80 00 to BF FF

First bits are 110: 3 octets.
         21 bits transferred.
         Up to 2 097 151 decimal.
         Encoded octets in hexadecimal: C0 00 00 to DF FF FF

First bits are 111: 4 octets.
         29 bits transferred.
         Up to 536 870 911 decimal.
         Encoded octets in hexadecimal: E0 00 00 00 to FF FF FF FF

4.5.7. Encoded values across several fields in compressed headers

When a compressed header has an extension, pieces of an encoded value can be present in more than one field. When an encoded value is split over several fields in this manner, the more significant bits of the value are closer to the beginning of the header. If the number of bits available in compressed header fields exceeds the number of bits in the value, the most significant field is padded with zeroes in its most significant bits.

For example, an unscaled TS value can be transferred using an UOR-2 header (see section 5.7) with an extension of type 3. The Tsc bit of the extension is then unset (zero) and the variable length TS field of the extension is 4 octets, with 29 bits available for the TS (see section 4.5.6).
The UOR-2 TS field will contain the three most significant bits of the unscaled TS, and the 4-octet TS field in the extension will contain the remaining 29 bits.

4.6. Errors caused by residual errors

ROHC is designed under the assumption that packets can be damaged between the compressor and decompressor, and that such damaged packets can be delivered to the decompressor ("residual errors").

Residual errors may damage the SN in compressed headers. Such damage will cause generation of a header which upper layers may not be able to distinguish from a correct header. When the compressed header contains a CRC, the CRC will catch the bad header with a probability dependent on the size of the CRC. When ROHC does not detect the bad header, it will be delivered to upper layers.

Damage is not confined to the SN:

a) Damage to packet type indication bits can cause a header to be interpreted as having a different packet type.

b) Damage to CID information may cause a packet to be interpreted according to another context and possibly also according to another profile. Damage to CIDs will be more harmful when a large part of the CID space is being used, so that it is likely that the damaged CID corresponds to an active context.

c) Feedback information can also be subject to residual errors, both when feedback is piggybacked and when it is sent in separate ROHC packets. ROHC uses sanity checks and adds CRCs to vital feedback information to allow detection of some damaged feedback.

Note that context damage can also result in generation of incorrect headers; section 4.7 elaborates further on this.

4.7. Impairment considerations

Impairments to headers can be classified into the following types:

(1) the lower layer was not able to decode the packet and did not deliver it to ROHC,

(2) the lower layer was able to decode the packet, but discarded it because of a detected error,

(3) ROHC detected an error in the generated header and discarded the packet, or

(4) ROHC did not detect that the regenerated header was damaged and delivered it to upper layers.

Impairments cause loss or damage of individual headers. Some impairment scenarios also cause context invalidation, which in turn results in loss propagation and damage propagation. Damage propagation and undetected residual errors both contribute to the number of damaged headers delivered to upper layers. Loss propagation and impairments resulting in loss or discarding of single packets both contribute to the packet loss seen by upper layers.

Examples of context invalidating scenarios are:

(a) Impairment of type (4) on the forward channel, causing the decompressor to update its context with incorrect information;

(b) Loss/error burst of pattern update headers: Impairments of types (1), (2) and (3) on consecutive pattern update headers; a pattern update header is a header carrying new pattern information, e.g., at the beginning of a new talk spurt; this causes the decompressor to lose the pattern update information;

(c) Loss/error burst of headers: Impairments of types (1), (2) and (3) on a number of consecutive headers that is large enough to cause the decompressor to lose the SN synchronization;

(d) Impairment of type (4) on the feedback channel which mimics a valid ACK and makes the compressor update its context;

(e) a burst of damaged headers (3) erroneously triggers the "k-out-of-n" rule for detecting context invalidation, which results in a NACK/update sequence during which headers are discarded.

Scenario (a) is mitigated by the CRC carried in all context updating headers.
The larger the CRC, the lower the chance of context invalidation caused by (a). In R-mode, the CRC of context updating headers is always 7 bits or more. In U/O-mode, it is usually 3 bits and sometimes 7 or 8 bits.

Scenario (b) is almost completely eliminated when the compressor ensures through ACKs that no context updating headers are lost, as in R-mode.

Scenario (c) is almost completely eliminated when the compressor ensures through ACKs that the decompressor will always detect the SN wraparound, as in R-mode. It is also mitigated by the SN repair mechanisms in U/O-mode.

Scenario (d) happens only when the compressor receives a damaged header that mimics an ACK of some header present in the W-LSB window, say ACK of header 2, while in reality header 2 was never received or accepted by the decompressor, i.e., header 2 was subject to impairment (1), (2) or (3). The damaged header must mimic the feedback packet type, the ACK feedback type, and the SN LSBs of some header in the W-LSB window.

Scenario (e) happens when a burst of residual errors causes the CRC check to fail in k out of the last n headers carrying CRCs. Large k and n reduce the probability of scenario (e), but also increase the number of headers lost or damaged as a consequence of any context invalidation.

ROHC detects damaged headers using CRCs over the original headers. The smallest headers in this document either include a 3-bit CRC (U/O-mode) or do not include a CRC (R-mode). For the smallest headers, damage is thus detected with a probability of roughly 7/8 for U/O-mode. For R-mode, damage to the smallest headers is not detected.

All other things (coding scheme at lower layers, etc.) being equal, the rate of headers damaged by residual errors will be lower when headers are compressed compared to when they are not, since fewer bits are transmitted.
Consequently, for a given ROHC CRC setup the rate of incorrect headers delivered to applications will also be reduced.

The above analysis suggests that U/O-mode may be more prone than R-mode to context invalidation. On the other hand, the CRC present in all U/O-mode headers continuously screens out residual errors coming from lower layers, reduces the number of damaged headers delivered to upper layers when context is invalidated, and permits quick detection of context invalidation.

R-mode always uses a stronger CRC on context updating headers, but no CRC in other headers. A residual error on a header which carries no CRC will result in a damaged header being delivered to upper layers (4). The number of damaged headers delivered to the upper layers depends on the ratio of headers with CRC vs. headers without CRC, which is a compressor parameter.

5. The protocol

5.1. Data structures

The ROHC protocol is based on a number of parameters that form part of the negotiated channel state and the per-context state. This section describes some of this state information in an abstract way. Implementations can use a different structure for and representation of this state. In particular, negotiation protocols that set up the per-channel state need to establish the information that constitutes the negotiated channel state, but it is not necessary to exchange it in the form described here.

5.1.1. Per-channel parameters

MAX_CID: Nonnegative integer; highest context ID number to be used by the compressor (note that this parameter is not coupled to, but in effect further constrained by, LARGE_CIDS).

LARGE_CIDS: Boolean; if false, the short CID representation (0 bytes or 1 prefix byte, covering CID 0 to 15) is used; if true, the embedded CID representation (1 or 2 embedded CID bytes covering CID 0 to 16383) is used.
PROFILES: Set of nonnegative integers, each integer indicating a profile supported by the decompressor. The compressor MUST NOT compress using a profile not in PROFILES.

FEEDBACK_FOR: Optional reference to a channel in the reverse direction. If provided, this parameter indicates which channel any feedback sent on this channel refers to (see 5.7.6.1).

MRRU: Maximum reconstructed reception unit. This is the size of the largest reconstructed unit in octets that the decompressor is expected to reassemble from segments (see 5.2.5). Note that this size includes the CRC. If MRRU is negotiated to be 0, no segment headers are allowed on the channel.

5.1.2. Per-context parameters, profiles

Per-context parameters are established with IR headers (see section 5.2.3). An IR header contains a profile identifier, which determines how the rest of the header is to be interpreted. Note that the profile parameter determines the syntax and semantics of the packet type identifiers and packet types used in conjunction with a specific context. This document describes profiles 0x0000, 0x0001, 0x0002, and 0x0003; further profiles may be defined when ROHC is extended in the future.

Profile 0x0000 is for sending uncompressed IP packets. See section 5.10.

Profile 0x0001 is for RTP/UDP/IP compression, see sections 5.3 through 5.9.

Profile 0x0002 is for UDP/IP compression, i.e., compression of the first 12 octets of the UDP payload is not attempted. See section 5.11.

Profile 0x0003 is for ESP/IP compression, i.e., compression of the header chain up to and including the first ESP header, but not subsequent subheaders. See section 5.12.

Initially, all contexts are in no context state, i.e., all packets referencing this context except IR packets are discarded.
If defined by a "ROHC over X" document, per-channel negotiation can be used to pre-establish state information for a context (e.g., negotiating profile 0x0000 for CID 15). Such state information can also be marked read-only in the negotiation, which would cause the decompressor to discard any IR packet attempting to modify it.

5.1.3. Contexts and context identifiers

Associated with each compressed flow is a context, which is the state compressor and decompressor maintain in order to correctly compress or decompress the headers of the packet stream. Contexts are identified by a context identifier, CID, which is sent along with compressed headers and feedback information.

The CID space is distinct for each channel, i.e., CID 3 over channel A and CID 3 over channel B do not refer to the same context, even if the endpoints of A and B are the same nodes. In particular, CIDs for any pairs of forward and reverse channels are not related (forward and reverse channels need not even have CID spaces of the same size).

Context information is conceptually kept in a table. The context table is indexed using the CID which is sent along with compressed headers and feedback information. The CID space can be negotiated to be either small, which means that CIDs can take the values 0 through 15, or large, which means that CIDs take values between 0 and 2^14 - 1 = 16383. Whether the CID space is large or small is negotiated no later than when a channel is established.

A small CID with the value 0 is represented using zero bits. A small CID with a value from 1 to 15 is represented by a four-bit field in place of a packet type field (Add-CID) plus four more bits. A large CID is represented using the encoding scheme of section 4.5.6, limited to two octets.

5.2. ROHC packets and packet types

The packet type indication scheme for ROHC has been designed under the following constraints:

a) it must be possible to use only a limited number of packet sizes;

b) it must be possible to send feedback information in separate ROHC packets as well as piggybacked on forward packets;

c) it is desirable to allow elimination of the CID for one packet stream when few packet streams share a channel;

d) it is anticipated that some packets with large headers may be larger than the MTU of very constrained lower layers.

These constraints have led to a design which includes

- optional padding,
- a feedback packet type,
- an optional Add-CID octet which provides 4 bits of CID, and
- a simple segmentation and reassembly mechanism.

A ROHC packet has the following general format (in the diagram, colons ":" indicate that the part is optional):

    --- --- --- --- --- --- --- ---
   :           Padding             :  variable length
    --- --- --- --- --- --- --- ---
   :           Feedback            :  0 or more feedback elements
    --- --- --- --- --- --- --- ---
   :            Header             :  variable, with CID information
    --- --- --- --- --- --- --- ---
   :           Payload             :
    --- --- --- --- --- --- --- ---

Padding is any number (zero or more) of padding octets. Either of Feedback or Header must be present.

Feedback elements always start with a packet type indication. Feedback elements carry internal CID information. Feedback is described in section 5.2.2.

Header is either a profile-specific header or an IR or IR-DYN header (see sections 5.2.3 and 5.2.4). Header either

1) does not carry any CID information (indicating CID zero), or

2) includes one Add-CID Octet (see below), or

3) contains embedded CID information of length one or two octets.

Alternatives 1) and 2) apply only to compressed headers in channels where the CID space is small. Alternative 3) applies only to compressed headers in channels where the CID space is large.
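The three CID alternatives can be sketched from the compressor's point of view. This is an illustration under stated assumptions: encode_cid is a hypothetical helper, the Add-CID bit layout (1110 followed by four CID bits) is taken from the Add-CID Octet diagram in this section, and the large CID uses the one- and two-octet forms of section 4.5.6. The sketch produces only the CID octets; it does not place them within a header.

```python
def encode_cid(cid: int, large_cids: bool) -> bytes:
    """Return the octets that carry the CID for a compressed header."""
    if not large_cids:
        if cid == 0:
            return b""                    # alternative 1: no CID information
        if 1 <= cid <= 15:
            return bytes([0xE0 | cid])    # alternative 2: Add-CID octet
        raise ValueError("small CID space covers CID 0 to 15")
    # Alternative 3: embedded CID, section 4.5.6 encoding limited to 2 octets.
    if 0 <= cid <= 127:
        return bytes([cid])                        # first bit 0: 1 octet
    if cid <= 16383:
        return bytes([0x80 | (cid >> 8), cid & 0xFF])  # first bits 10: 2 octets
    raise ValueError("large CID space covers CID 0 to 16383")


small_zero = encode_cid(0, large_cids=False)
small_five = encode_cid(5, large_cids=False)
large_short = encode_cid(100, large_cids=True)
large_long = encode_cid(16383, large_cids=True)
```

Note that CID 0 in a small CID space costs no octets at all, which is what allows elimination of the CID for one packet stream as required by constraint c).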
Padding Octet

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 1   1   1   0   0   0   0   0 |
   +---+---+---+---+---+---+---+---+

Add-CID Octet

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 1   1   1   0 |      CID      |
   +---+---+---+---+---+---+---+---+

   CID: 0x1 through 0xF indicates CIDs 1 through 15.

Note: The Padding Octet looks like an Add-CID octet for CID 0.

Header either starts with a packet type indication or has a packet type indication immediately following an Add-CID Octet. All Header packet types have the following general format (in the diagram, slashes "/" indicate variable length):

     0              x-1  x       7
    --- --- --- --- --- --- --- ---
   :         Add-CID octet         :  if (CID 1-15) and (small CIDs)
   +---+--- --- --- ---+--- --- ---+
   | type indication   |   body    |  1 octet (8-x bits of body)
   +---+--- ---+---+---+--- --- ---+
   :                               :
   /    0, 1, or 2 octets of CID   /  1 or 2 octets if (large CIDs)
   :                               :
   +---+---+---+---+---+---+---+---+
   /             body              /  variable length
   +---+---+---+---+---+---+---+---+

The large CID, if present, is encoded according to section 4.5.6.

5.2.1. ROHC feedback

Feedback carries information from decompressor to compressor. The following principal kinds of feedback are supported. In addition to the kind of feedback, other information may be included in profile-specific feedback information.

ACK         : Acknowledges successful decompression of a packet, which means that the context is up-to-date with a high probability.

NACK        : Indicates that the dynamic context of the decompressor is out of sync. Generated when several successive packets have failed to be decompressed correctly.

STATIC-NACK : Indicates that the static context of the decompressor is not valid or has not been established.

It is anticipated that feedback to the compressor can be realized in many ways, depending on the properties of the particular lower layer.
The exact details of how feedback is realized are to be specified in a "ROHC over X" document, for each lower layer X in question. For example, feedback might be realized using

1) lower-layer specific mechanisms

2) a dedicated feedback-only channel, realized for example by the lower layer providing a way to indicate that a packet is a feedback packet

3) a dedicated feedback-only channel, where the timing of the feedback provides information about which compressed packet caused the feedback

4) interspersing of feedback packets among normal compressed packets going in the same direction as the feedback (lower layers do not indicate feedback)

5) piggybacking of feedback information in compressed packets going in the same direction as the feedback (this technique may reduce the per-feedback overhead)

6) interspersing and piggybacking on the same channel, i.e., both 4) and 5).

Alternatives 1-3 do not place any particular requirements on the ROHC packet type scheme. Alternatives 4-6 do, however. The ROHC packet type scheme has been designed to allow alternatives 4-6 (these may be used for example over PPP):

A) The ROHC scheme provides a feedback packet type. The packet type is able to carry variable-length feedback information.

B) The feedback information sent on a particular channel is passed to, and interpreted by, the compressor associated with feedback on that channel. Thus, the feedback information must contain CID information if the associated compressor can use more than one context. The ROHC feedback scheme requires that a channel carries feedback to at most one compressor. How a compressor is associated with feedback on a particular channel needs to be defined in a "ROHC over X" document.

C) The ROHC feedback information format is octet-aligned, i.e., starts at an octet boundary, to allow using the format over a dedicated feedback channel, 2).
D) To allow piggybacking, 5), it is possible to deduce the length of
   feedback information by examining the first few octets of the
   feedback.  This allows the decompressor to pass piggybacked
   feedback information to the associated same-side compressor without
   understanding its format.  The length information decouples the
   decompressor from the compressor in the sense that the decompressor
   can process the compressed header immediately without waiting for
   the compressor to hand it back after parsing the feedback
   information.

5.2.2.  ROHC feedback format

Feedback sent on a ROHC channel consists of one or more concatenated
feedback elements, where each feedback element has the following
format:

  0   1   2   3   4   5   6   7
+---+---+---+---+---+---+---+---+
| 1   1   1   1   0 |   Code    |  feedback type octet
+---+---+---+---+---+---+---+---+
:             Size              :  if Code = 0
+---+---+---+---+---+---+---+---+
/         feedback data         /  variable length
+---+---+---+---+---+---+---+---+

Code: 0 indicates that a Size octet is present.
      1-7 indicates the size of the feedback data field in octets.

Size: Optional octet indicating the size of the feedback data field in
      octets.

feedback data: Profile-specific feedback information.  Includes CID
      information.

The total size of the feedback data field is determinable upon
reception by the decompressor, by inspection of the Code field and
possibly the Size field.  This explicit length information allows
piggybacking and also sending more than one feedback element in a
packet.

When the decompressor has determined the size of the feedback data
field, it removes the feedback type octet and the Size field (if
present) and hands the rest to the same-side associated compressor
together with an indication of the size.
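As an illustration only, the following Python sketch shows how a
decompressor might use the Code and Size fields to split off one or
more concatenated feedback elements from the start of a packet.  The
function name split_feedback_elements and its return convention are
assumptions made for this example and are not defined by ROHC; the
handling of interleaved padding and Add-CID octets (see 5.2.6) is
deliberately left out.

```python
from typing import List, Tuple

FEEDBACK_PREFIX = 0b11110  # top five bits of a feedback type octet


def split_feedback_elements(data: bytes) -> Tuple[List[bytes], bytes]:
    """Hypothetical helper: strip leading feedback elements.

    Returns (list of feedback data fields, remaining octets).
    """
    elements: List[bytes] = []
    pos = 0
    while pos < len(data) and data[pos] >> 3 == FEEDBACK_PREFIX:
        code = data[pos] & 0x07      # low three bits of the type octet
        pos += 1                     # remove the feedback type octet
        if code == 0:
            # Code = 0: an explicit Size octet follows.
            if pos >= len(data):
                raise ValueError("truncated feedback: missing Size octet")
            size = data[pos]
            pos += 1                 # remove the Size octet
        else:
            # Code = 1-7: the Code itself is the size in octets.
            size = code
        if pos + size > len(data):
            raise ValueError("truncated feedback data")
        elements.append(data[pos:pos + size])
        pos += size
    return elements, data[pos:]
```

Each returned feedback data field would then be handed, together with
its length, to the same-side associated compressor.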
The feedback data received by the compressor has the following
structure (feedback sent on a dedicated feedback channel MAY also use
this format):

  0   1   2   3   4   5   6   7
+---+---+---+---+---+---+---+---+
:         Add-CID octet         :  if for small CIDs and (CID != 0)
+---+---+---+---+---+---+---+---+
:                               :
/  large CID (4.5.6 encoding)   /  1-2 octets if for large CIDs
:                               :
+---+---+---+---+---+---+---+---+
/           feedback            /
+---+---+---+---+---+---+---+---+

The large CID, if present, is encoded according to section 4.5.6.  CID
information in feedback data indicates the CID of the packet stream
for which feedback is sent.  Note that the LARGE_CIDS parameter that
controls whether a large CID is present is taken from the channel
state of the receiving compressor's channel, NOT from that of the
channel carrying the feedback.

It is REQUIRED that the feedback field have either of the following
two formats:

FEEDBACK-1

  0   1   2   3   4   5   6   7
+---+---+---+---+---+---+---+---+
| profile specific information  |  1 octet
+---+---+---+---+---+---+---+---+

FEEDBACK-2

  0   1   2   3   4   5   6   7
+---+---+---+---+---+---+---+---+
|Acktype|                       |
+---+---+   profile specific    /  at least 2 octets
/             information       |
+---+---+---+---+---+---+---+---+

Acktype:  0 = ACK
          1 = NACK
          2 = STATIC-NACK
          3 is reserved (MUST NOT be used.  Otherwise unparseable.)

The compressor can use the following logic to parse the feedback
field.

1) If for large CIDs, the feedback will always start with a CID
   encoded according to section 4.5.6.  If the first bit is 0, the CID
   uses one octet.  If the first bit is 1, the CID uses two octets.

2) If for small CIDs, and the size is one octet, the feedback is a
   FEEDBACK-1.

3) If for small CIDs, and the size is larger than one octet, and the
   feedback starts with the two bits 11, the feedback starts with an
   Add-CID octet.  If the size is 2, it is followed by FEEDBACK-1.  If
   the size is larger than 2, the Add-CID is followed by FEEDBACK-2.
4) Otherwise, there is no Add-CID octet, and the feedback starts with
   a FEEDBACK-2.

5.2.3.  ROHC IR packet type

The IR header associates a CID with a profile, and typically also
initializes the context.  It can typically also refresh (parts of) the
context.  It has the following general format.

  0   1   2   3   4   5   6   7
 --- --- --- --- --- --- --- ---
:         Add-CID octet         :  if for small CIDs and (CID != 0)
+---+---+---+---+---+---+---+---+
| 1   1   1   1   1   1   0 | x |  IR type octet
+---+---+---+---+---+---+---+---+
:                               :
/      0-2 octets of CID        /  1-2 octets if for large CIDs
:                               :
+---+---+---+---+---+---+---+---+
|            Profile            |  1 octet
+---+---+---+---+---+---+---+---+
|              CRC              |  1 octet
+---+---+---+---+---+---+---+---+
|                               |
/ profile specific information  /  variable length
|                               |
+---+---+---+---+---+---+---+---+

x: Profile specific information.  Interpreted according to the profile
   indicated in the Profile field.

Profile: The profile to be associated with the CID.  In the IR packet,
   the profile identifier is abbreviated to the 8 least significant
   bits.  It selects the highest-number profile in the channel state
   parameter PROFILES that matches the 8 LSBs given.

CRC: 8-bit CRC computed using the polynomial of section 5.9.1.  Its
   coverage is profile-dependent, but it MUST cover at least the
   initial part of the packet ending with the Profile field.  Any
   information which initializes the context of the decompressor
   should be protected by the CRC.

Profile specific information: The contents of this part of the IR
   packet are defined by the individual profiles.  Interpreted
   according to the profile indicated in the Profile field.

5.2.4.  ROHC IR-DYN packet type

In contrast to the IR header, the IR-DYN header can never initialize
an uninitialized context.  However, it can redefine what profile is
associated with a context, see for example 5.11 (ROHC UDP) and 5.12
(ROHC ESP).  Thus the type needs to be reserved at the framework
level.
The IR-DYN header typically also initializes or refreshes parts of a
context, typically the dynamic part.  It has the following general
format:

  0   1   2   3   4   5   6   7
 --- --- --- --- --- --- --- ---
:         Add-CID octet         :  if for small CIDs and (CID != 0)
+---+---+---+---+---+---+---+---+
| 1   1   1   1   1   0   0   0 |  IR-DYN type octet
+---+---+---+---+---+---+---+---+
:                               :
/      0-2 octets of CID        /  1-2 octets if for large CIDs
:                               :
+---+---+---+---+---+---+---+---+
|            Profile            |  1 octet
+---+---+---+---+---+---+---+---+
|              CRC              |  1 octet
+---+---+---+---+---+---+---+---+
|                               |
/ profile specific information  /  variable length
|                               |
+---+---+---+---+---+---+---+---+

Profile: The profile to be associated with the CID.  This is
   abbreviated in the same way as with IR packets.

CRC: 8-bit CRC computed using the polynomial of section 5.9.1.  Its
   coverage is profile-dependent, but it MUST cover at least the
   initial part of the packet ending with the Profile field.  Any
   information which initializes the context of the decompressor
   should be protected by the CRC.

Profile specific information: This part of the IR-DYN packet is
   defined by individual profiles.  It is interpreted according to the
   profile indicated in the Profile field.

5.2.5.  ROHC segmentation

Some link layers may provide a much more efficient service if the set
of different packet sizes to be transported is kept small.  For such
link layers, these sizes will normally be chosen to transport
frequently occurring packets efficiently, with less frequently
occurring packets possibly adapted to the next larger size by the
addition of padding.  The link layer may, however, be limited in the
size of packets it can offer in this efficient mode, or it may be
desirable to request only a limited largest size.  To accommodate the
occasional packet that is larger than that largest size negotiated,
ROHC defines a simple segmentation protocol.

5.2.5.1.  Segmentation usage considerations

The segmentation protocol defined in ROHC is not particularly
efficient.  It is not intended to replace link layer segmentation
functions; these SHOULD be used whenever available and efficient for
the task at hand.

ROHC segmentation should only be used for occasional packets with
sizes larger than what is efficient to accommodate, e.g., due to
exceptionally large ROHC headers.  The segmentation scheme was
designed to reduce packet size variations that may occur due to
outliers in the header size distribution.  In other cases,
segmentation should be done at lower layers.  The segmentation scheme
should only be used for packet sizes that are larger than the maximum
size in the allowed set of sizes from the lower layers.

In summary, ROHC segmentation should be used with a relatively low
frequency in the packet flow.  If this cannot be ensured, segmentation
should be performed at lower layers.

5.2.5.2.  Segmentation protocol

Segment Packet

  0   1   2   3   4   5   6   7
+---+---+---+---+---+---+---+---+
| 1   1   1   1   1   1   1 | F |
+---+---+---+---+---+---+---+---+
/           Segment             /  variable length
+---+---+---+---+---+---+---+---+

F: Final bit.  If set, it indicates that this is the last segment of a
   reconstructed unit.

The segment header may be preceded by padding octets and/or feedback.
It never carries a CID.

All segment header packets for one reconstructed unit have to be sent
consecutively on a channel, i.e., any non-segment-header packet
following a nonfinal segment header aborts the reassembly of the
current reconstructed unit and causes the decompressor to discard the
nonfinal segments received on this channel so far.  When a final
segment header is received, the decompressor reassembles the segment
carried in this packet and any nonfinal segments that immediately
preceded it into a single reconstructed unit, in the order they were
received.
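The reassembly rule above can be sketched in Python as a small
per-channel state holder.  This is a minimal sketch under stated
assumptions: the class name SegmentReassembler and its methods feed()
and abort() are hypothetical, padding and feedback are assumed to have
been stripped already, and the unit returned by feed() still carries
its trailing CRC, which is not checked here.

```python
from typing import List, Optional

SEGMENT_PREFIX = 0b1111111  # top seven bits of a segment type octet


class SegmentReassembler:
    """Hypothetical per-channel reassembly state (illustration only)."""

    def __init__(self) -> None:
        self._pending: List[bytes] = []

    def abort(self) -> None:
        """Call when a non-segment packet arrives on the channel:
        the nonfinal segments received so far are discarded."""
        self._pending.clear()

    def feed(self, packet: bytes) -> Optional[bytes]:
        """Feed one segment packet.

        Returns the reconstructed unit when the Final bit is set,
        otherwise None.
        """
        if not packet or packet[0] >> 1 != SEGMENT_PREFIX:
            raise ValueError("not a segment packet")
        final = bool(packet[0] & 0x01)      # F bit
        self._pending.append(packet[1:])    # segment payload, in order
        if not final:
            return None
        unit = b"".join(self._pending)
        self._pending.clear()
        return unit
```

A caller would invoke abort() for every non-segment packet seen on the
channel, so that an interrupted sequence of nonfinal segments never
leaks into a later reconstructed unit.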
The reconstructed unit has the format:

Reconstructed Unit

  0   1   2   3   4   5   6   7
+---+---+---+---+---+---+---+---+
|                               |
/   Reconstructed ROHC packet   /  variable length
|                               |
+---+---+---+---+---+---+---+---+
/              CRC              /  4 octets
+---+---+---+---+---+---+---+---+

The CRC is used by the decompressor to validate the reconstructed
unit.  It uses the FCS-32 algorithm with the following generator
polynomial: x^0 + x^1 + x^2 + x^4 + x^5 + x^7 + x^8 + x^10 + x^11 +
x^12 + x^16 + x^22 + x^23 + x^26 + x^32 [HDLC].  If the reconstructed
unit is 4 octets or less, or if the CRC fails, or if it is larger than
the channel parameter MRRU (see 5.1.1), the reconstructed unit MUST be
discarded by the decompressor.

If the CRC succeeds, the reconstructed ROHC packet is interpreted as a
ROHC Header, optionally followed by a payload.  Note that this means
that there can be no padding and no feedback in the reconstructed
unit, and that the CID is derived from the initial octets of the
reconstructed unit.

(It should be noted that the ROHC segmentation protocol was inspired
by SEAL by Steve Deering et al., which later became ATM AAL5.  The
same arguments for not having sequence numbers in the segments but
instead providing a strong CRC in the reconstructed unit apply here as
well.  Note that, as a result of this protocol, there is no way in
ROHC to make any use of a segment that has residual bit errors.)

5.2.6.  ROHC initial decompressor processing

The following packet types are reserved at the framework level in the
ROHC scheme:

1110:     Padding or Add-CID octet
11110:    Feedback
11111000: IR-DYN packet
1111110:  IR packet
1111111:  Segment

Other packet types can be used at will by individual profiles.

The following steps outline initial decompressor processing, which
upon reception of a ROHC packet can determine its contents.
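Before the steps themselves, the reserved type indications listed
above can be illustrated as a prefix test on a single octet.  This is
a rough Python sketch only: the function name classify_first_octet and
the returned labels are assumptions made for this example, and the
sketch looks at one octet in isolation rather than performing the full
processing.

```python
def classify_first_octet(octet: int) -> str:
    """Hypothetical helper: map one octet to a framework-level type."""
    if octet == 0b11100000:
        return "padding"            # 1110 with CID 0
    if octet >> 4 == 0b1110:
        return "add-cid"            # 1110 followed by CID 1-15
    if octet >> 3 == 0b11110:
        return "feedback"           # 11110 followed by Code
    if octet == 0b11111000:
        return "ir-dyn"
    if octet >> 1 == 0b1111110:
        return "ir"                 # low bit is profile specific
    if octet >> 1 == 0b1111111:
        return "segment"            # low bit is the Final bit
    return "profile-specific"       # left to the individual profile
```

The order of the tests matters only for the Padding Octet, which would
otherwise match the Add-CID prefix; the remaining prefixes are
mutually exclusive.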
1) If the first octet is a Padding Octet (11100000), strip away all
   initial Padding Octets and go to the next step.

2) If the first remaining octet starts with 1110, it is an Add-CID
   octet:
      remember the Add-CID octet; remove the octet.

3) If the first remaining octet starts with 11110, and an Add-CID
   octet was found in step 2), an error has occurred; the header MUST
   be discarded without further action.

4) If the first remaining octet starts with 11110, and an Add-CID
   octet was not found in step 2), this is feedback:
      find the size of the feedback data, call it s;
      remove the feedback type octet;
      remove the Size octet if Code is 0;
      send feedback data of length s to the same-side associated
      compressor;
      if packet exhausted, stop; otherwise go to 2).

5) If the first remaining octet starts with 1111111, this is a
   segment:
      attempt reconstruction using the segmentation protocol (5.2.5).
      If a reconstructed packet is not produced, this finishes the
      processing of the original packet.  If a reconstructed packet is
      produced, it is fed into step 1) above.  Padding, segments, and
      feedback are not allowed in reconstructed packets, so when
      processing them, steps 1), 4), and 5) are modified so that the
      packet is discarded without further action when their conditions
      match.

6) Here, it is known that the rest is forward information (unless the
   header is damaged).

7) If the forward traffic uses small CIDs, there is no large CID in
   the packet.  If an Add-CID immediately preceded the packet type
   (step 2), it has the CID of the Add-CID; otherwise it has CID 0.

8) If the forward traffic uses large CIDs, the CID starts with the
   second remaining octet.  If the first bit(s) of that octet are not
   0 or 10, the packet MUST be discarded without further action.  If
   an Add-CID octet immediately preceded the packet type (step 2), the
   packet MUST be discarded without further action.

9) Use the CID to find the context.
10) If the packet type is IR, the profile indicated in the IR packet
    determines how it is to be processed.  If the CRC fails to verify
    the packet, it MUST be discarded.  If a profile is indicated in
    the context, the logic of that profile determines what, if any,
    feedback is to be sent.  If no profile is noted in the context, no
    further action is taken.

11) If the packet type is IR-DYN, the profile indicated in the IR-DYN
    packet determines how it is to be processed.

    a) If the CRC fails to verify the packet, it MUST be discarded.
       If a profile is indicated in the context, the logic of that
       profile determines what, if any, feedback is to be sent.  If no
       profile is noted in the context, no further action is taken.

    b) If the context has not been initialized by an IR packet, the
       packet MUST be discarded.  The logic of the profile indicated
       in the IR-DYN header (if verified by the CRC) determines what,
       if any, feedback is to be sent.

12) Otherwise, the profile noted in the context determines how the
    rest of the packet is to be processed.  If the context has not
    been initialized by an IR packet, the packet MUST be discarded
    without further action.

The procedure for finding the size of the feedback data is as follows:

Examine the three bits which immediately follow the feedback packet
type.  When these bits are 1-7, the size of the feedback data is given
by the bits; when they are 0, a Size octet, which explicitly gives the
size of the feedback data, is present after the feedback type octet.

5.2.7.  ROHC RTP packet formats from compressor to decompressor

ROHC RTP uses three packet types to identify compressed headers, and
two for initialization/refresh.  The format of a compressed packet can
depend on the mode.  Therefore a naming scheme of the form

   <modes format is used in>-<packet type number>-<some property>

is used to uniquely identify the format when necessary, e.g., UOR-2,
R-1.  For exact formats of the packet types, see section 5.7.

Packet type zero: R-0, R-0-CRC, UO-0.
This, the minimal, packet type is used when parameters of all
SN-functions are known by the decompressor, and the header to be
compressed adheres to these functions.  Thus, only the W-LSB encoded
RTP SN needs to be communicated.

R-mode: Only if a CRC is present (packet type R-0-CRC) may the header
be used as a reference for subsequent decompression.

U-mode and O-mode: A small CRC is present in the UO-0 packet.

Packet type 1: R-1, R-1-ID, R-1-TS, UO-1, UO-1-ID, UO-1-TS.

This packet type is used when the number of bits needed for the SN
exceeds those available in packet type zero, or when the parameters of
the SN-functions for RTP TS or IP-ID change.

R-mode: R-1-* packets are not used as references for subsequent
decompression.  Values for other fields than the RTP TS or IP-ID can
be communicated using an extension, but they do not update