Diff: rfc9627v4.txt - rfc9627.txt

	rfc9627v4.txt	rfc9627.txt

	Internet Engineering Task Force (IETF) J. Lennox	Internet Engineering Task Force (IETF) J. Lennox
	Request for Comments: 9627 8x8 / Jitsi	Request for Comments: 9627 8x8 / Jitsi
	Category: Standards Track D. Hong	Category: Standards Track D. Hong
	ISSN: 2070-1721 Vidyo	ISSN: 2070-1721 Vidyo
	J. Uberti	J. Uberti

		OpenAI
	S. Holmer	S. Holmer
	M. Flodman	M. Flodman
	Google	Google
	February 2025	February 2025

	The Layer Refresh Request (LRR) RTCP Feedback Message	The Layer Refresh Request (LRR) RTCP Feedback Message

	Abstract	Abstract

	This memo describes the RTCP Payload-Specific Feedback Message Layer	This memo describes the RTCP Payload-Specific Feedback Message Layer

	skipping to change at line 60 ¶	skipping to change at line 61 ¶

	Table of Contents	Table of Contents

	1. Introduction	1. Introduction
	2. Conventions and Terminology	2. Conventions and Terminology
	2.1. Terminology	2.1. Terminology
	3. Layer Refresh Request	3. Layer Refresh Request
	3.1. Message Format	3.1. Message Format
	3.2. Semantics	3.2. Semantics
	4. Usage with Specific Codecs	4. Usage with Specific Codecs

	4.1. H264 SVC	4.1. H.264 SVC
	4.2. VP8	4.2. VP8

	4.3. H265	4.3. H.265
	5. Usage with Different Scalability Transmission Mechanisms	5. Usage with Different Scalability Transmission Mechanisms
	6. SDP Definitions	6. SDP Definitions
	7. Security Considerations	7. Security Considerations
	8. IANA Considerations	8. IANA Considerations
	9. References	9. References
	9.1. Normative References	9.1. Normative References
	9.2. Informative References	9.2. Informative References
	Authors' Addresses	Authors' Addresses

	1. Introduction	1. Introduction

	skipping to change at line 113 ¶	skipping to change at line 114 ¶
	depend both on earlier pictures of that spatial layer and also on	depend both on earlier pictures of that spatial layer and also on
	lower-layer pictures of the current picture. However, a layer	lower-layer pictures of the current picture. However, a layer
	refresh typically requires that a spatial-layer picture be encoded in	refresh typically requires that a spatial-layer picture be encoded in
	a way that references only the lower-layer subpictures of the current	a way that references only the lower-layer subpictures of the current
	picture, not any earlier pictures of that spatial layer.	picture, not any earlier pictures of that spatial layer.
	Additionally, the encoder must promise that no earlier pictures of	Additionally, the encoder must promise that no earlier pictures of
	that spatial layer will be used as reference in the future.	that spatial layer will be used as reference in the future.

	However, even in a layer refresh, layers other than the ones being	However, even in a layer refresh, layers other than the ones being
	refreshed may still maintain dependency on earlier content of the	refreshed may still maintain dependency on earlier content of the

	stream. This is the difference between a layer refresh and a FIR	stream. This is the difference between a layer refresh and an FIR
	[RFC5104]. This minimizes the coding overhead of refresh to only	[RFC5104]. This minimizes the coding overhead of refresh to only
	those parts of the stream that actually need to be refreshed at any	those parts of the stream that actually need to be refreshed at any
	given time.	given time.

	The spatial-layer refresh of an enhancement layer is shown below.	The spatial-layer refresh of an enhancement layer is shown below.
	The "<--" indicates a coding dependency.	The "<--" indicates a coding dependency.

	... <-- S1 <-- S1 S1 <-- S1 <-- ...	... <-- S1 <-- S1 S1 <-- S1 <-- ...
	| | | |	| | | |
	\/ \/ \/ \/	\/ \/ \/ \/
	... <-- S0 <-- S0 <-- S0 <-- S0 <-- ...	... <-- S0 <-- S0 <-- S0 <-- S0 <-- ...

	1 2 3 4	1 2 3 4

	Figure 1: Refresh of a Spatial Enhancement Layer	Figure 1: Refresh of a Spatial Enhancement Layer


	In Figure 1, frame 3 is a layer refresh point for spatial-layer S1; a	In Figure 1, frame 3 is a layer refresh point for spatial layer S1; a
	decoder that had previously only been decoding spatial-layer S0 would	decoder that had previously only been decoding spatial layer S0 would
	be able to decode layer S1 starting at frame 3.	be able to decode layer S1 starting at frame 3.

	The spatial-layer refresh of a base layer is shown below. The "<--"	The spatial-layer refresh of a base layer is shown below. The "<--"
	indicates a coding dependency.	indicates a coding dependency.

	... <-- S1 <-- S1 <-- S1 <-- S1 <-- ...	... <-- S1 <-- S1 <-- S1 <-- S1 <-- ...
	| | | |	| | | |
	\/ \/ \/ \/	\/ \/ \/ \/
	... <-- S0 <-- S0 S0 <-- S0 <-- ...	... <-- S0 <-- S0 S0 <-- S0 <-- ...

	1 2 3 4	1 2 3 4

	Figure 2: Refresh of a Spatial Base Layer	Figure 2: Refresh of a Spatial Base Layer


	In Figure 2, frame 3 is a layer refresh point for spatial-layer S0; a	In Figure 2, frame 3 is a layer refresh point for spatial layer S0; a
	decoder that had previously not been decoding the stream at all could	decoder that had previously not been decoding the stream at all could
	decode layer S0 starting at frame 3.	decode layer S0 starting at frame 3.

	For temporal layers, while normal encoding allows frames to depend on	For temporal layers, while normal encoding allows frames to depend on
	earlier frames of the same temporal layer, layer refresh requires	earlier frames of the same temporal layer, layer refresh requires
	that the layer be "temporally nested", i.e., use as reference only	that the layer be "temporally nested", i.e., use as reference only
	earlier frames of a lower temporal layer, not any earlier frames of	earlier frames of a lower temporal layer, not any earlier frames of
	this temporal layer and promise that no future frames of this	this temporal layer and promise that no future frames of this
	temporal layer will reference frames of this temporal layer before	temporal layer will reference frames of this temporal layer before
	the refresh point. In many cases, the temporal structure of the	the refresh point. In many cases, the temporal structure of the

	skipping to change at line 186 ¶	skipping to change at line 187 ¶
	An inherently temporally nested stream is shown below. The "<--"	An inherently temporally nested stream is shown below. The "<--"
	indicates a coding dependency.	indicates a coding dependency.

	T1 T1 T1	T1 T1 T1
	/ / /	/ / /
	|_ |_ |_	|_ |_ |_
	... <-- T0 <------ T0 <------ T0 <------ T0 <--- ...	... <-- T0 <------ T0 <------ T0 <------ T0 <--- ...

	1 2 3 4 5 6 7	1 2 3 4 5 6 7


	Figure 4: An Inherently Temporally-Nested Stream	Figure 4: An Inherently Temporally Nested Stream

	In Figure 4, the stream is temporally nested in its ordinary	In Figure 4, the stream is temporally nested in its ordinary
	structure; a decoder receiving layer T0 can begin decoding layer T1	structure; a decoder receiving layer T0 can begin decoding layer T1
	at any point.	at any point.

	A "layer index" is a numeric label for a specific spatial and	A "layer index" is a numeric label for a specific spatial and
	temporal layer of a scalable stream. It consists of both a	temporal layer of a scalable stream. It consists of both a
	"temporal-layer ID" identifying the temporal layer and a "layer ID"	"temporal-layer ID" identifying the temporal layer and a "layer ID"
	identifying the spatial or quality layer. The details of how layers	identifying the spatial or quality layer. The details of how layers
	of a scalable stream are labeled are codec specific. Details for	of a scalable stream are labeled are codec specific. Details for

	skipping to change at line 213 ¶	skipping to change at line 214 ¶
	message [RFC4585] asking the encoder to encode a frame that makes it	message [RFC4585] asking the encoder to encode a frame that makes it
	possible to upgrade to a higher layer. The LRR contains one or two	possible to upgrade to a higher layer. The LRR contains one or two
	tuples, indicating the temporal and spatial layer the decoder wants	tuples, indicating the temporal and spatial layer the decoder wants
	to upgrade to and (optionally) the currently highest temporal and	to upgrade to and (optionally) the currently highest temporal and
	spatial layer the decoder can decode.	spatial layer the decoder can decode.

	The specific format of the tuples, and the mechanism by which a	The specific format of the tuples, and the mechanism by which a
	receiver recognizes a refresh frame, is codec dependent. Usage for	receiver recognizes a refresh frame, is codec dependent. Usage for
	several codecs is discussed in Section 4.	several codecs is discussed in Section 4.


	An LRR follows the FIR model (Section 3.5.1 of [RFC5104]) for its	The design of LRR follows the FIR model (Section 3.5.1 of [RFC5104])
	retransmission, reliability, and use in multipoint conferences.	for its retransmission, reliability, and use in multipoint
		conferences.

	The LRR message is identified by RTCP packet type value PT=PSFB and	The LRR message is identified by RTCP packet type value PT=PSFB and
	FMT=10. The Feedback Control Information (FCI) field MUST contain	FMT=10. The Feedback Control Information (FCI) field MUST contain
	one or more LRR entries. Each entry applies to a different media	one or more LRR entries. Each entry applies to a different media
	sender, identified by its Synchronization Source (SSRC).	sender, identified by its Synchronization Source (SSRC).
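As a rough illustration of the identification rule above, the following Python sketch checks whether an RTCP packet header carries PT=PSFB and FMT=10. It assumes the common feedback header layout of RFC 3550 and RFC 4585 (version in the top two bits and FMT in the low five bits of the first octet, packet type in the second octet) and the PSFB packet type value 206. The function and constant names are hypothetical and not part of the specification.

```python
PSFB_PACKET_TYPE = 206  # PSFB packet type value (assumed from RFC 4585)
LRR_FMT = 10            # FMT value for LRR, as stated in the text above


def is_lrr_packet(packet: bytes) -> bool:
    """Return True if the RTCP header indicates an LRR feedback message."""
    if len(packet) < 12:
        # Common feedback header: 4 header octets + sender SSRC + media SSRC.
        return False
    version = packet[0] >> 6
    fmt = packet[0] & 0x1F
    packet_type = packet[1]
    return version == 2 and packet_type == PSFB_PACKET_TYPE and fmt == LRR_FMT
```

This only classifies the packet; it does not parse or validate the FCI entries that follow the header.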

	3.1. Message Format	3.1. Message Format

	The FCI for the Layer Refresh Request consists of one or more FCI	The FCI for the Layer Refresh Request consists of one or more FCI
	entries, the content of which is depicted in Figure 5. The length of	entries, the content of which is depicted in Figure 5. The length of

	skipping to change at line 343 ¶	skipping to change at line 345 ¶
	In order for an LRR to be used with a scalable codec, the format of	In order for an LRR to be used with a scalable codec, the format of
	the temporal and layer ID fields (for both the target and current	the temporal and layer ID fields (for both the target and current
	layer indices) needs to be specified for that codec's RTP	layer indices) needs to be specified for that codec's RTP
	packetization. New RTP packetization specifications for scalable	packetization. New RTP packetization specifications for scalable
	codecs SHOULD define how this is done. (The VP9 payload [RFC9628],	codecs SHOULD define how this is done. (The VP9 payload [RFC9628],
	for instance, has done so.) If the payload also specifies how it is	for instance, has done so.) If the payload also specifies how it is
	used with the Video Frame Marking RTP Header Extension described in	used with the Video Frame Marking RTP Header Extension described in
	[RFC9626], the syntax MUST be defined in the same manner as the TID	[RFC9626], the syntax MUST be defined in the same manner as the TID
	and LID fields in that header.	and LID fields in that header.


	4.1. H264 SVC	4.1. H.264 SVC

	H.264 SVC [RFC6190] defines temporal, dependency (spatial), and	H.264 SVC [RFC6190] defines temporal, dependency (spatial), and
	quality scalability modes.	quality scalability modes.

	+---------------+---------------+	+---------------+---------------+
	|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|	|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
	| RES | TID |R| DID | QID |	| RES | TID |R| DID | QID |
	+---------------+---------------+	+---------------+---------------+


	skipping to change at line 422 ¶	skipping to change at line 424 ¶
	+---------------+---------------+	+---------------+---------------+

	Figure 7: VP8 Layer Index Field Format	Figure 7: VP8 Layer Index Field Format

	Figure 7 shows the format of the layer index field for VP8 streams.	Figure 7 shows the format of the layer index field for VP8 streams.
	The "RES" fields MUST be set to zero on transmission and be ignored	The "RES" fields MUST be set to zero on transmission and be ignored
	on reception. See Section 4.2 of [RFC7741] for details on the TID	on reception. See Section 4.2 of [RFC7741] for details on the TID
	field.	field.

	A VP8 layer refresh point can be identified by the presence of the Y	A VP8 layer refresh point can be identified by the presence of the Y

	bit in the VP8 payload header. When this bit is set, this and all	bit (see [RFC7741]) in the VP8 payload header. When this bit is set,
	subsequent frames depend only on the current base temporal layer. On	this and all subsequent frames depend only on the current base
	receipt of an LRR for a VP8 stream, a sender that supports LRRs MUST	temporal layer. On receipt of an LRR for a VP8 stream, a sender that
	encode the stream so it can set the Y bit in a packet whose temporal	supports LRRs MUST encode the stream so it can set the Y bit in a
	layer is at or below the target layer index.	packet whose temporal layer is at or below the target layer index.

	Note that in VP8, not every layer switch point can be identified by	Note that in VP8, not every layer switch point can be identified by
	the Y bit since the Y bit implies layer switch of all layers, not	the Y bit since the Y bit implies layer switch of all layers, not
	just the layer in which it is sent. Thus, the use of an LRR with VP8	just the layer in which it is sent. Thus, the use of an LRR with VP8
	can result in some inefficiency in transmission. However, this is	can result in some inefficiency in transmission. However, this is
	not expected to be a major issue for temporal structures in normal	not expected to be a major issue for temporal structures in normal
	use.	use.
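The Y-bit check described above can be sketched in Python as follows. This is a minimal sketch, assuming the VP8 payload descriptor layout of RFC 7741 (an X bit in the first octet, then an optional extension octet with I, L, T, and K flags, followed by the optional PictureID, TL0PICIDX, and TID/Y/KEYIDX octets). As a simplification it reports nothing when the T bit is clear. Both function names are hypothetical.

```python
from typing import Optional, Tuple


def vp8_tid_and_y(payload: bytes) -> Optional[Tuple[int, bool]]:
    """Return (TID, Y) from a VP8 payload descriptor, or None if unavailable."""
    if len(payload) < 2 or not payload[0] & 0x80:
        return None                     # X bit clear: no extension octet
    ext = payload[1]
    pos = 2
    if ext & 0x80:                      # I: PictureID present
        if pos >= len(payload):
            return None
        # M bit set means the PictureID spans two octets instead of one.
        pos += 2 if payload[pos] & 0x80 else 1
    if ext & 0x40:                      # L: TL0PICIDX present
        pos += 1
    if not ext & 0x20:                  # T clear: simplification, give up
        return None
    if pos >= len(payload):
        return None
    octet = payload[pos]                # TID (2 bits) | Y (1 bit) | KEYIDX (5 bits)
    return octet >> 6, bool(octet & 0x20)


def is_vp8_refresh_point(payload: bytes, target_tid: int) -> bool:
    """True if the packet has Y set and a TID at or below the target TID."""
    parsed = vp8_tid_and_y(payload)
    if parsed is None:
        return False
    tid, y_bit = parsed
    return y_bit and tid <= target_tid
```

A receiver waiting on an LRR might call `is_vp8_refresh_point` on each incoming packet and begin forwarding or decoding the higher temporal layer once it returns true.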


	4.3. H265	4.3. H.265

	The initial version of the H.265 payload format [RFC7798] defines	The initial version of the H.265 payload format [RFC7798] defines
	temporal scalability, with protocol elements reserved for spatial or	temporal scalability, with protocol elements reserved for spatial or
	other scalability modes (which are expected to be defined in a future	other scalability modes (which are expected to be defined in a future
	version of the specification).	version of the specification).

	+---------------+---------------+	+---------------+---------------+
	|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|	|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
	| RES | TID |RES| layer ID |	| RES | TID |RES| layer ID |

	skipping to change at line 470 ¶	skipping to change at line 472 ¶
	every frame is implicitly a temporal layer refresh point.	every frame is implicitly a temporal layer refresh point.

	If a stream's sps_temporal_id_nesting_flag is not set, the NAL unit	If a stream's sps_temporal_id_nesting_flag is not set, the NAL unit
	types 2 to 5 inclusively identify temporal layer switching points. A	types 2 to 5 inclusively identify temporal layer switching points. A
	layer refresh to any higher target temporal layer is satisfied when a	layer refresh to any higher target temporal layer is satisfied when a
	NAL unit type of 4 or 5 with TID equal to 1 more than current TID is	NAL unit type of 4 or 5 with TID equal to 1 more than current TID is
	seen. Alternatively, layer refresh to a target temporal layer can be	seen. Alternatively, layer refresh to a target temporal layer can be
	incrementally satisfied with a NAL unit type of 2 or 3. In this	incrementally satisfied with a NAL unit type of 2 or 3. In this
	case, given current TID = TO and target TID = TN, layer refresh to TN	case, given current TID = TO and target TID = TN, layer refresh to TN
	is satisfied when a NAL unit type of 2 or 3 is seen for TID = T1,	is satisfied when a NAL unit type of 2 or 3 is seen for TID = T1,

	then TID = T2, all the way up to TID = TN. During this incremental	then TID = T2, all the way up to TID = TN (note that TN and TO refer
		to nonce variables in this instance). During this incremental
	process, layer refresh to TN can be completely satisfied as soon as a	process, layer refresh to TN can be completely satisfied as soon as a
	NAL unit type of 2 or 3 is seen.	NAL unit type of 2 or 3 is seen.

	Of course, temporal layer refresh can also be satisfied whenever any	Of course, temporal layer refresh can also be satisfied whenever any
	Intra-Random Access Point (IRAP) NAL unit type (with values 16-23,	Intra-Random Access Point (IRAP) NAL unit type (with values 16-23,
	inclusively) is seen. An IRAP picture is similar to an IDR picture	inclusively) is seen. An IRAP picture is similar to an IDR picture
	in H.264 (NAL unit type of 5 in H.264) where decoding of the picture	in H.264 (NAL unit type of 5 in H.264) where decoding of the picture
	can start without any older pictures.	can start without any older pictures.
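The NAL unit type ranges named above can be checked with a short Python sketch. It assumes the two-octet H.265 NAL unit header layout (1-bit F, 6-bit type, 6-bit layer ID, 3-bit TID, where the TID field carries the temporal ID plus 1). The function names and the returned labels are hypothetical, and the sketch only classifies a NAL unit; it does not implement the incremental TID bookkeeping described in the text.

```python
from typing import Optional, Tuple


def h265_nal_header(nal: bytes) -> Tuple[int, int, int]:
    """Return (nal_unit_type, layer_id, tid_field) from an H.265 NAL header."""
    if len(nal) < 2:
        raise ValueError("H.265 NAL unit header is two octets")
    nal_type = (nal[0] >> 1) & 0x3F
    layer_id = ((nal[0] & 0x01) << 5) | (nal[1] >> 3)
    tid_field = nal[1] & 0x07           # temporal ID plus 1
    return nal_type, layer_id, tid_field


def refresh_kind(nal_type: int) -> Optional[str]:
    """Classify a NAL unit type using the ranges given in the text."""
    if 16 <= nal_type <= 23:
        return "irap"                   # decodable without any older pictures
    if 2 <= nal_type <= 5:
        return "temporal-switch"        # temporal layer switching point
    return None
```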

	In the (future) H.265 payloads that support spatial scalability, a	In the (future) H.265 payloads that support spatial scalability, a

	skipping to change at line 535 ¶	skipping to change at line 538 ¶

	All the security considerations of FIR feedback packets [RFC5104]	All the security considerations of FIR feedback packets [RFC5104]
	apply to LRR feedback packets as well. Additionally, media senders	apply to LRR feedback packets as well. Additionally, media senders
	receiving LRR feedback packets MUST validate that the payload types	receiving LRR feedback packets MUST validate that the payload types
	and layer indices they are receiving are valid for the stream they	and layer indices they are receiving are valid for the stream they
	are currently sending, and discard the requests if not.	are currently sending, and discard the requests if not.
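One way a sender might apply the validation rule above is sketched below in Python. The `SentStream` type and `lrr_is_valid` function are hypothetical, and the sketch assumes that valid temporal and layer IDs can be described by simple upper bounds. Real layer ID semantics are codec specific, so an actual implementation would need codec-aware checks.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class SentStream:
    """Hypothetical summary of what the sender is currently transmitting."""
    payload_types: FrozenSet[int]   # payload types in use for this SSRC
    max_tid: int                    # highest temporal-layer ID being sent
    max_lid: int                    # highest layer ID being sent (assumed scalar)


def lrr_is_valid(stream: SentStream, payload_type: int,
                 target_tid: int, target_lid: int) -> bool:
    """Return False for requests that should be discarded."""
    if payload_type not in stream.payload_types:
        return False
    if not 0 <= target_tid <= stream.max_tid:
        return False
    if not 0 <= target_lid <= stream.max_lid:
        return False
    return True
```

Requests for which `lrr_is_valid` returns false would simply be dropped rather than triggering a refresh.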

	8. IANA Considerations	8. IANA Considerations

	This document defines a new entry to the "Codec Control Messages"	This document defines a new entry to the "Codec Control Messages"

	subregistry of the "Session Description Protocol (SDP) Parameters"	registry of the "Session Description Protocol (SDP) Parameters"
	registry, according to the following data:	registry group, according to the following data:

	Value Name: lrr	Value Name: lrr
	Long Name: Layer Refresh Request Command	Long Name: Layer Refresh Request Command
	Usable with: ccm	Usable with: ccm
	Mux: IDENTICAL-PER-PT	Mux: IDENTICAL-PER-PT
	Reference: RFC 9627	Reference: RFC 9627

	This document also defines a new entry to the "FMT Values for PSFB	This document also defines a new entry to the "FMT Values for PSFB

	Payload Types" subregistry of the "Real-Time Transport Protocol (RTP)	Payload Types" registry of the "Real-Time Transport Protocol (RTP)
	Parameters" registry, according to the following data:	Parameters" registry group, according to the following data:

	Name: LRR	Name: LRR
	Long Name: Layer Refresh Request Command	Long Name: Layer Refresh Request Command
	Value: 10	Value: 10
	Reference: RFC 9627	Reference: RFC 9627

	9. References	9. References

	9.1. Normative References	9.1. Normative References


	skipping to change at line 603 ¶	skipping to change at line 606 ¶
	[RFC7798] Wang, Y.-K., Sanchez, Y., Schierl, T., Wenger, S., and M.	[RFC7798] Wang, Y.-K., Sanchez, Y., Schierl, T., Wenger, S., and M.
	M. Hannuksela, "RTP Payload Format for High Efficiency	M. Hannuksela, "RTP Payload Format for High Efficiency
	Video Coding (HEVC)", RFC 7798, DOI 10.17487/RFC7798,	Video Coding (HEVC)", RFC 7798, DOI 10.17487/RFC7798,
	March 2016, <https://www.rfc-editor.org/info/rfc7798>.	March 2016, <https://www.rfc-editor.org/info/rfc7798>.

	[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC	[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
	2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,	2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
	May 2017, <https://www.rfc-editor.org/info/rfc8174>.	May 2017, <https://www.rfc-editor.org/info/rfc8174>.

	[RFC9626] Zanaty, M., Berger, E., and S. Nandakumar, "Video Frame	[RFC9626] Zanaty, M., Berger, E., and S. Nandakumar, "Video Frame

	Marking RTP Header Extension", RFC 9621,	Marking RTP Header Extension", RFC 9626,
	DOI 10.17487/RFC9621, February 2025,	DOI 10.17487/RFC9626, February 2025,
	<https://www.rfc-editor.org/info/rfc9626>.	<https://www.rfc-editor.org/info/rfc9626>.

	9.2. Informative References	9.2. Informative References

	[RFC7656] Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and	[RFC7656] Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and
	B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms	B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms
	for Real-Time Transport Protocol (RTP) Sources", RFC 7656,	for Real-Time Transport Protocol (RTP) Sources", RFC 7656,
	DOI 10.17487/RFC7656, November 2015,	DOI 10.17487/RFC7656, November 2015,
	<https://www.rfc-editor.org/info/rfc7656>.	<https://www.rfc-editor.org/info/rfc7656>.


	skipping to change at line 643 ¶	skipping to change at line 646 ¶

	Danny Hong	Danny Hong
	Vidyo, Inc.	Vidyo, Inc.
	433 Hackensack Avenue	433 Hackensack Avenue
	Seventh Floor	Seventh Floor
	Hackensack, NJ 07601	Hackensack, NJ 07601
	United States of America	United States of America
	Email: danny@vidyo.com	Email: danny@vidyo.com

	Justin Uberti	Justin Uberti

	Google, Inc.	OpenAI
	747 6th Street South	747 6th Street South
	Kirkland, WA 98033	Kirkland, WA 98033
	United States of America	United States of America
	Email: justin@uberti.name	Email: justin@uberti.name

	Stefan Holmer	Stefan Holmer
	Google, Inc.	Google, Inc.
	Kungsbron 2	Kungsbron 2
	SE-111 22 Stockholm	SE-111 22 Stockholm
	Sweden	Sweden

End of changes. 16 change blocks.
	24 lines changed or deleted	27 lines changed or added
This html diff was produced by rfcdiff 1.48.