rfc9627v4.txt   rfc9627.txt 
Internet Engineering Task Force (IETF) J. Lennox Internet Engineering Task Force (IETF) J. Lennox
Request for Comments: 9627 8x8 / Jitsi Request for Comments: 9627 8x8 / Jitsi
Category: Standards Track D. Hong Category: Standards Track D. Hong
ISSN: 2070-1721 Vidyo ISSN: 2070-1721 Vidyo
J. Uberti J. Uberti
OpenAI
S. Holmer S. Holmer
M. Flodman M. Flodman
Google Google
February 2025 February 2025
The Layer Refresh Request (LRR) RTCP Feedback Message The Layer Refresh Request (LRR) RTCP Feedback Message
Abstract Abstract
This memo describes the RTCP Payload-Specific Feedback Message Layer This memo describes the RTCP Payload-Specific Feedback Message Layer
skipping to change at line 60 skipping to change at line 61
Table of Contents Table of Contents
1. Introduction 1. Introduction
2. Conventions and Terminology 2. Conventions and Terminology
2.1. Terminology 2.1. Terminology
3. Layer Refresh Request 3. Layer Refresh Request
3.1. Message Format 3.1. Message Format
3.2. Semantics 3.2. Semantics
4. Usage with Specific Codecs 4. Usage with Specific Codecs
4.1. H264 SVC 4.1. H.264 SVC
4.2. VP8 4.2. VP8
4.3. H265 4.3. H.265
5. Usage with Different Scalability Transmission Mechanisms 5. Usage with Different Scalability Transmission Mechanisms
6. SDP Definitions 6. SDP Definitions
7. Security Considerations 7. Security Considerations
8. IANA Considerations 8. IANA Considerations
9. References 9. References
9.1. Normative References 9.1. Normative References
9.2. Informative References 9.2. Informative References
Authors' Addresses Authors' Addresses
1. Introduction 1. Introduction
skipping to change at line 113 skipping to change at line 114
depend both on earlier pictures of that spatial layer and also on depend both on earlier pictures of that spatial layer and also on
lower-layer pictures of the current picture. However, a layer lower-layer pictures of the current picture. However, a layer
refresh typically requires that a spatial-layer picture be encoded in refresh typically requires that a spatial-layer picture be encoded in
a way that references only the lower-layer subpictures of the current a way that references only the lower-layer subpictures of the current
picture, not any earlier pictures of that spatial layer. picture, not any earlier pictures of that spatial layer.
Additionally, the encoder must promise that no earlier pictures of Additionally, the encoder must promise that no earlier pictures of
that spatial layer will be used as reference in the future. that spatial layer will be used as reference in the future.
However, even in a layer refresh, layers other than the ones being However, even in a layer refresh, layers other than the ones being
refreshed may still maintain dependency on earlier content of the refreshed may still maintain dependency on earlier content of the
stream. This is the difference between a layer refresh and a FIR stream. This is the difference between a layer refresh and an FIR
[RFC5104]. This minimizes the coding overhead of refresh to only [RFC5104]. This minimizes the coding overhead of refresh to only
those parts of the stream that actually need to be refreshed at any those parts of the stream that actually need to be refreshed at any
given time. given time.
The spatial-layer refresh of an enhancement layer is shown below. The spatial-layer refresh of an enhancement layer is shown below.
The "<--" indicates a coding dependency. The "<--" indicates a coding dependency.
... <-- S1 <-- S1 S1 <-- S1 <-- ... ... <-- S1 <-- S1 S1 <-- S1 <-- ...
| | | | | | | |
\/ \/ \/ \/ \/ \/ \/ \/
... <-- S0 <-- S0 <-- S0 <-- S0 <-- ... ... <-- S0 <-- S0 <-- S0 <-- S0 <-- ...
1 2 3 4 1 2 3 4
Figure 1: Refresh of a Spatial Enhancement Layer Figure 1: Refresh of a Spatial Enhancement Layer
In Figure 1, frame 3 is a layer refresh point for spatial-layer S1; a In Figure 1, frame 3 is a layer refresh point for spatial layer S1; a
decoder that had previously only been decoding spatial-layer S0 would decoder that had previously only been decoding spatial layer S0 would
be able to decode layer S1 starting at frame 3. be able to decode layer S1 starting at frame 3.
The spatial-layer refresh of a base layer is shown below. The "<--" The spatial-layer refresh of a base layer is shown below. The "<--"
indicates a coding dependency. indicates a coding dependency.
... <-- S1 <-- S1 <-- S1 <-- S1 <-- ... ... <-- S1 <-- S1 <-- S1 <-- S1 <-- ...
| | | | | | | |
\/ \/ \/ \/ \/ \/ \/ \/
... <-- S0 <-- S0 S0 <-- S0 <-- ... ... <-- S0 <-- S0 S0 <-- S0 <-- ...
1 2 3 4 1 2 3 4
Figure 2: Refresh of a Spatial Base Layer Figure 2: Refresh of a Spatial Base Layer
In Figure 2, frame 3 is a layer refresh point for spatial-layer S0; a In Figure 2, frame 3 is a layer refresh point for spatial layer S0; a
decoder that had previously not been decoding the stream at all could decoder that had previously not been decoding the stream at all could
decode layer S0 starting at frame 3. decode layer S0 starting at frame 3.
For temporal layers, while normal encoding allows frames to depend on For temporal layers, while normal encoding allows frames to depend on
earlier frames of the same temporal layer, layer refresh requires earlier frames of the same temporal layer, layer refresh requires
that the layer be "temporally nested", i.e., use as reference only that the layer be "temporally nested", i.e., use as reference only
earlier frames of a lower temporal layer, not any earlier frames of earlier frames of a lower temporal layer, not any earlier frames of
this temporal layer and promise that no future frames of this this temporal layer and promise that no future frames of this
temporal layer will reference frames of this temporal layer before temporal layer will reference frames of this temporal layer before
the refresh point. In many cases, the temporal structure of the the refresh point. In many cases, the temporal structure of the
skipping to change at line 186 skipping to change at line 187
An inherently temporally nested stream is shown below. The "<--" An inherently temporally nested stream is shown below. The "<--"
indicates a coding dependency. indicates a coding dependency.
T1 T1 T1 T1 T1 T1
/ / / / / /
|_ |_ |_ |_ |_ |_
... <-- T0 <------ T0 <------ T0 <------ T0 <--- ... ... <-- T0 <------ T0 <------ T0 <------ T0 <--- ...
1 2 3 4 5 6 7 1 2 3 4 5 6 7
Figure 4: An Inherently Temporally-Nested Stream Figure 4: An Inherently Temporally Nested Stream
In Figure 4, the stream is temporally nested in its ordinary In Figure 4, the stream is temporally nested in its ordinary
structure; a decoder receiving layer T0 can begin decoding layer T1 structure; a decoder receiving layer T0 can begin decoding layer T1
at any point. at any point.
A "layer index" is a numeric label for a specific spatial and A "layer index" is a numeric label for a specific spatial and
temporal layer of a scalable stream. It consists of both a temporal layer of a scalable stream. It consists of both a
"temporal-layer ID" identifying the temporal layer and a "layer ID" "temporal-layer ID" identifying the temporal layer and a "layer ID"
identifying the spatial or quality layer. The details of how layers identifying the spatial or quality layer. The details of how layers
of a scalable stream are labeled are codec specific. Details for of a scalable stream are labeled are codec specific. Details for
skipping to change at line 213 skipping to change at line 214
message [RFC4585] asking the encoder to encode a frame that makes it message [RFC4585] asking the encoder to encode a frame that makes it
possible to upgrade to a higher layer. The LRR contains one or two possible to upgrade to a higher layer. The LRR contains one or two
tuples, indicating the temporal and spatial layer the decoder wants tuples, indicating the temporal and spatial layer the decoder wants
to upgrade to and (optionally) the currently highest temporal and to upgrade to and (optionally) the currently highest temporal and
spatial layer the decoder can decode. spatial layer the decoder can decode.
The specific format of the tuples, and the mechanism by which a The specific format of the tuples, and the mechanism by which a
receiver recognizes a refresh frame, is codec dependent. Usage for receiver recognizes a refresh frame, is codec dependent. Usage for
several codecs is discussed in Section 4. several codecs is discussed in Section 4.
An LRR follows the FIR model (Section 3.5.1 of [RFC5104]) for its The design of LRR follows the FIR model (Section 3.5.1 of [RFC5104])
retransmission, reliability, and use in multipoint conferences. for its retransmission, reliability, and use in multipoint
conferences.
The LRR message is identified by RTCP packet type value PT=PSFB and The LRR message is identified by RTCP packet type value PT=PSFB and
FMT=10. The Feedback Control Information (FCI) field MUST contain FMT=10. The Feedback Control Information (FCI) field MUST contain
one or more LRR entries. Each entry applies to a different media one or more LRR entries. Each entry applies to a different media
sender, identified by its Synchronization Source (SSRC). sender, identified by its Synchronization Source (SSRC).
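The identification rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the specification: the function name `is_lrr_packet` is hypothetical, the numeric value 206 for PT=PSFB comes from [RFC4585], and the 12-octet minimum assumes the common feedback header (RTCP header, packet sender SSRC, media source SSRC) defined there.

```python
import struct

RTCP_PT_PSFB = 206  # payload-specific feedback packet type, per RFC 4585
LRR_FMT = 10        # FMT value this document assigns to LRR


def is_lrr_packet(packet: bytes) -> bool:
    """Return True if the RTCP header marks the packet as an LRR message."""
    # 4 octets of RTCP header + packet sender SSRC + media source SSRC.
    if len(packet) < 12:
        return False
    first, packet_type, _length = struct.unpack("!BBH", packet[:4])
    version = first >> 6   # top two bits
    fmt = first & 0x1F     # low five bits carry FMT
    return version == 2 and packet_type == RTCP_PT_PSFB and fmt == LRR_FMT
```

A receiver would typically run a check like this before attempting to walk the FCI entries; any other FMT value under PT=PSFB belongs to a different feedback message.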
3.1. Message Format 3.1. Message Format
The FCI for the Layer Refresh Request consists of one or more FCI The FCI for the Layer Refresh Request consists of one or more FCI
entries, the content of which is depicted in Figure 5. The length of entries, the content of which is depicted in Figure 5. The length of
skipping to change at line 343 skipping to change at line 345
In order for an LRR to be used with a scalable codec, the format of In order for an LRR to be used with a scalable codec, the format of
the temporal and layer ID fields (for both the target and current the temporal and layer ID fields (for both the target and current
layer indices) needs to be specified for that codec's RTP layer indices) needs to be specified for that codec's RTP
packetization. New RTP packetization specifications for scalable packetization. New RTP packetization specifications for scalable
codecs SHOULD define how this is done. (The VP9 payload [RFC9628], codecs SHOULD define how this is done. (The VP9 payload [RFC9628],
for instance, has done so.) If the payload also specifies how it is for instance, has done so.) If the payload also specifies how it is
used with the Video Frame Marking RTP Header Extension described in used with the Video Frame Marking RTP Header Extension described in
[RFC9626], the syntax MUST be defined in the same manner as the TID [RFC9626], the syntax MUST be defined in the same manner as the TID
and LID fields in that header. and LID fields in that header.
4.1. H264 SVC 4.1. H.264 SVC
H.264 SVC [RFC6190] defines temporal, dependency (spatial), and H.264 SVC [RFC6190] defines temporal, dependency (spatial), and
quality scalability modes. quality scalability modes.
+---------------+---------------+ +---------------+---------------+
|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7| |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RES | TID |R| DID | QID | | RES | TID |R| DID | QID |
+---------------+---------------+ +---------------+---------------+
skipping to change at line 422 skipping to change at line 424
+---------------+---------------+ +---------------+---------------+
Figure 7: VP8 Layer Index Field Format Figure 7: VP8 Layer Index Field Format
Figure 7 shows the format of the layer index field for VP8 streams. Figure 7 shows the format of the layer index field for VP8 streams.
The "RES" fields MUST be set to zero on transmission and be ignored The "RES" fields MUST be set to zero on transmission and be ignored
on reception. See Section 4.2 of [RFC7741] for details on the TID on reception. See Section 4.2 of [RFC7741] for details on the TID
field. field.
A VP8 layer refresh point can be identified by the presence of the Y A VP8 layer refresh point can be identified by the presence of the Y
bit in the VP8 payload header. When this bit is set, this and all bit (see [RFC7741]) in the VP8 payload header. When this bit is set,
subsequent frames depend only on the current base temporal layer. On this and all subsequent frames depend only on the current base
receipt of an LRR for a VP8 stream, a sender that supports LRRs MUST temporal layer. On receipt of an LRR for a VP8 stream, a sender that
encode the stream so it can set the Y bit in a packet whose temporal supports LRRs MUST encode the stream so it can set the Y bit in a
layer is at or below the target layer index. packet whose temporal layer is at or below the target layer index.
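As a rough sketch of how a receiver might look for the Y bit, the following Python reads the TID and Y fields from a VP8 payload descriptor. The descriptor layout (X, I, L, T, K bits and the TID/Y/KEYIDX octet) is taken from [RFC7741] rather than from this text, and the function names `vp8_tid_and_y` and `satisfies_lrr` are hypothetical.

```python
from typing import Optional, Tuple


def vp8_tid_and_y(payload: bytes) -> Optional[Tuple[int, bool]]:
    """Return (TID, Y) from a VP8 payload descriptor, or None if absent."""
    # X bit (0x80) in the first octet signals the extension octet.
    if len(payload) < 2 or not (payload[0] & 0x80):
        return None
    ext = payload[1]
    has_picture_id = bool(ext & 0x80)  # I bit
    has_tl0picidx = bool(ext & 0x40)   # L bit
    has_tid = bool(ext & 0x20)         # T bit
    if not has_tid:
        return None                    # TID and Y are not meaningful
    pos = 2
    if has_picture_id:
        if pos >= len(payload):
            return None
        # M bit set means a 15-bit PictureID spread over two octets.
        pos += 2 if payload[pos] & 0x80 else 1
    if has_tl0picidx:
        pos += 1
    if pos >= len(payload):
        return None
    octet = payload[pos]               # TID (2 bits) | Y (1 bit) | KEYIDX (5 bits)
    return octet >> 6, bool(octet & 0x20)


def satisfies_lrr(tid: int, y_bit: bool, target_tid: int) -> bool:
    """True if a packet with this TID and Y bit answers an LRR for target_tid."""
    return y_bit and tid <= target_tid
```

The second helper mirrors the sender-side rule in the paragraph above: the Y bit has to appear in a packet whose temporal layer is at or below the target.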
Note that in VP8, not every layer switch point can be identified by Note that in VP8, not every layer switch point can be identified by
the Y bit since the Y bit implies layer switch of all layers, not the Y bit since the Y bit implies layer switch of all layers, not
just the layer in which it is sent. Thus, the use of an LRR with VP8 just the layer in which it is sent. Thus, the use of an LRR with VP8
can result in some inefficiency in transmission. However, this is can result in some inefficiency in transmission. However, this is
not expected to be a major issue for temporal structures in normal not expected to be a major issue for temporal structures in normal
use. use.
4.3. H265 4.3. H.265
The initial version of the H.265 payload format [RFC7798] defines The initial version of the H.265 payload format [RFC7798] defines
temporal scalability, with protocol elements reserved for spatial or temporal scalability, with protocol elements reserved for spatial or
other scalability modes (which are expected to be defined in a future other scalability modes (which are expected to be defined in a future
version of the specification). version of the specification).
+---------------+---------------+ +---------------+---------------+
|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7| |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RES | TID |RES| layer ID | | RES | TID |RES| layer ID |
skipping to change at line 470 skipping to change at line 472
every frame is implicitly a temporal layer refresh point. every frame is implicitly a temporal layer refresh point.
If a stream's sps_temporal_id_nesting_flag is not set, the NAL unit If a stream's sps_temporal_id_nesting_flag is not set, the NAL unit
types 2 to 5 inclusively identify temporal layer switching points. A types 2 to 5 inclusively identify temporal layer switching points. A
layer refresh to any higher target temporal layer is satisfied when a layer refresh to any higher target temporal layer is satisfied when a
NAL unit type of 4 or 5 with TID equal to 1 more than current TID is NAL unit type of 4 or 5 with TID equal to 1 more than current TID is
seen. Alternatively, layer refresh to a target temporal layer can be seen. Alternatively, layer refresh to a target temporal layer can be
incrementally satisfied with a NAL unit type of 2 or 3. In this incrementally satisfied with a NAL unit type of 2 or 3. In this
case, given current TID = TO and target TID = TN, layer refresh to TN case, given current TID = TO and target TID = TN, layer refresh to TN
is satisfied when a NAL unit type of 2 or 3 is seen for TID = T1, is satisfied when a NAL unit type of 2 or 3 is seen for TID = T1,
then TID = T2, all the way up to TID = TN. During this incremental then TID = T2, all the way up to TID = TN (note that TN and TO refer
to nonce variables in this instance). During this incremental
process, layer refresh to TN can be completely satisfied as soon as a process, layer refresh to TN can be completely satisfied as soon as a
NAL unit type of 2 or 3 is seen. NAL unit type of 2 or 3 is seen.
Of course, temporal layer refresh can also be satisfied whenever any Of course, temporal layer refresh can also be satisfied whenever any
Intra-Random Access Point (IRAP) NAL unit type (with values 16-23, Intra-Random Access Point (IRAP) NAL unit type (with values 16-23,
inclusively) is seen. An IRAP picture is similar to an IDR picture inclusively) is seen. An IRAP picture is similar to an IDR picture
in H.264 (NAL unit type of 5 in H.264) where decoding of the picture in H.264 (NAL unit type of 5 in H.264) where decoding of the picture
can start without any older pictures. can start without any older pictures.
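The NAL unit type ranges named above can be captured in a small classifier. This is only a sketch: the function names are hypothetical, and the two-octet NAL unit header layout (a 1-bit F field followed by a 6-bit type) is assumed from [RFC7798], since it is not reproduced in this text.

```python
TEMPORAL_SWITCH_TYPES = range(2, 6)  # NAL unit types 2 to 5, inclusive
IRAP_TYPES = range(16, 24)           # NAL unit types 16 to 23, inclusive


def h265_nal_type(nal_header: bytes) -> int:
    """Extract the 6-bit NAL unit type from the first header octet."""
    return (nal_header[0] >> 1) & 0x3F


def classify_h265_nal(nal_type: int) -> str:
    """Label a NAL unit type by its role in temporal layer refresh."""
    if nal_type in IRAP_TYPES:
        return "irap"
    if nal_type in TEMPORAL_SWITCH_TYPES:
        return "temporal-switch"
    return "other"
```

A receiver tracking an outstanding refresh request would still need to compare the TID of each switching point against its current and target layers, as the preceding paragraphs describe; the classifier only narrows down which NAL units are candidates.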
In the (future) H.265 payloads that support spatial scalability, a In the (future) H.265 payloads that support spatial scalability, a
skipping to change at line 535 skipping to change at line 538
All the security considerations of FIR feedback packets [RFC5104] All the security considerations of FIR feedback packets [RFC5104]
apply to LRR feedback packets as well. Additionally, media senders apply to LRR feedback packets as well. Additionally, media senders
receiving LRR feedback packets MUST validate that the payload types receiving LRR feedback packets MUST validate that the payload types
and layer indices they are receiving are valid for the stream they and layer indices they are receiving are valid for the stream they
are currently sending, and discard the requests if not. are currently sending, and discard the requests if not.
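One way a sender might implement this validation is sketched below. Everything here is illustrative: `StreamConfig`, its fields, and `accept_lrr` are hypothetical names, and the assumption that valid layer indices form a contiguous range from zero up to a per-stream maximum may not hold for every codec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamConfig:
    """Hypothetical description of the stream currently being sent."""
    payload_type: int
    max_tid: int  # highest temporal-layer ID in the stream
    max_lid: int  # highest spatial/quality layer ID in the stream


def accept_lrr(cfg: StreamConfig, payload_type: int,
               target_tid: int, target_lid: int) -> bool:
    """Return False for requests that must be discarded as invalid."""
    if payload_type != cfg.payload_type:
        return False
    return 0 <= target_tid <= cfg.max_tid and 0 <= target_lid <= cfg.max_lid
```

For example, a sender with `StreamConfig(payload_type=96, max_tid=2, max_lid=1)` would discard a request whose target temporal-layer ID is 3.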
8. IANA Considerations 8. IANA Considerations
This document defines a new entry to the "Codec Control Messages" This document defines a new entry to the "Codec Control Messages"
subregistry of the "Session Description Protocol (SDP) Parameters" registry of the "Session Description Protocol (SDP) Parameters"
registry, according to the following data: registry group, according to the following data:
Value Name: lrr Value Name: lrr
Long Name: Layer Refresh Request Command Long Name: Layer Refresh Request Command
Usable with: ccm Usable with: ccm
Mux: IDENTICAL-PER-PT Mux: IDENTICAL-PER-PT
Reference: RFC 9627 Reference: RFC 9627
This document also defines a new entry to the "FMT Values for PSFB This document also defines a new entry to the "FMT Values for PSFB
Payload Types" subregistry of the "Real-Time Transport Protocol (RTP) Payload Types" registry of the "Real-Time Transport Protocol (RTP)
Parameters" registry, according to the following data: Parameters" registry group, according to the following data:
Name: LRR Name: LRR
Long Name: Layer Refresh Request Command Long Name: Layer Refresh Request Command
Value: 10 Value: 10
Reference: RFC 9627 Reference: RFC 9627
9. References 9. References
9.1. Normative References 9.1. Normative References
skipping to change at line 603 skipping to change at line 606
[RFC7798] Wang, Y.-K., Sanchez, Y., Schierl, T., Wenger, S., and M. [RFC7798] Wang, Y.-K., Sanchez, Y., Schierl, T., Wenger, S., and M.
M. Hannuksela, "RTP Payload Format for High Efficiency M. Hannuksela, "RTP Payload Format for High Efficiency
Video Coding (HEVC)", RFC 7798, DOI 10.17487/RFC7798, Video Coding (HEVC)", RFC 7798, DOI 10.17487/RFC7798,
March 2016, <https://www.rfc-editor.org/info/rfc7798>. March 2016, <https://www.rfc-editor.org/info/rfc7798>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>. May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC9626] Zanaty, M., Berger, E., and S. Nandakumar, "Video Frame [RFC9626] Zanaty, M., Berger, E., and S. Nandakumar, "Video Frame
Marking RTP Header Extension", RFC 9621, Marking RTP Header Extension", RFC 9626,
DOI 10.17487/RFC9621, February 2025, DOI 10.17487/RFC9626, February 2025,
<https://www.rfc-editor.org/info/rfc9626>. <https://www.rfc-editor.org/info/rfc9626>.
9.2. Informative References 9.2. Informative References
[RFC7656] Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and [RFC7656] Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and
B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms
for Real-Time Transport Protocol (RTP) Sources", RFC 7656, for Real-Time Transport Protocol (RTP) Sources", RFC 7656,
DOI 10.17487/RFC7656, November 2015, DOI 10.17487/RFC7656, November 2015,
<https://www.rfc-editor.org/info/rfc7656>. <https://www.rfc-editor.org/info/rfc7656>.
skipping to change at line 643 skipping to change at line 646
Danny Hong Danny Hong
Vidyo, Inc. Vidyo, Inc.
433 Hackensack Avenue 433 Hackensack Avenue
Seventh Floor Seventh Floor
Hackensack, NJ 07601 Hackensack, NJ 07601
United States of America United States of America
Email: danny@vidyo.com Email: danny@vidyo.com
Justin Uberti Justin Uberti
Google, Inc. OpenAI
747 6th Street South 747 6th Street South
Kirkland, WA 98033 Kirkland, WA 98033
United States of America United States of America
Email: justin@uberti.name Email: justin@uberti.name
Stefan Holmer Stefan Holmer
Google, Inc. Google, Inc.
Kungsbron 2 Kungsbron 2
SE-111 22 Stockholm SE-111 22 Stockholm
Sweden Sweden
 End of changes. 16 change blocks. 
24 lines changed or deleted 27 lines changed or added
