| rfc9755v1.txt | rfc9755.txt | |||
|---|---|---|---|---|
| Internet Engineering Task Force (IETF) P. Resnick | Internet Engineering Task Force (IETF) P. Resnick | |||
| Request for Comments: 9755 Episteme Technology Consulting LLC | Request for Comments: 9755 Episteme | |||
| Obsoletes: 6855 J. Yao | Obsoletes: 6855 J. Yao | |||
| Category: Standards Track CNNIC | Category: Standards Track CNNIC | |||
| ISSN: 2070-1721 A. Gulbrandsen | ISSN: 2070-1721 A. Gulbrandsen | |||
| ICANN | ICANN | |||
| February 2025 | March 2025 | |||
| IMAP Support for UTF-8 | IMAP Support for UTF-8 | |||
| Abstract | Abstract | |||
| This specification extends the Internet Message Access Protocol, | This specification extends the Internet Message Access Protocol, | |||
| specifically IMAP4rev1 (RFC 3501), to support UTF-8 encoded | specifically IMAP4rev1 (RFC 3501), to support UTF-8 encoded | |||
| international characters in user names, mail addresses, and message | international characters in user names, mail addresses, and message | |||
| headers. This specification replaces RFC 6855. This specification | headers. This specification replaces RFC 6855. This specification | |||
| does not extend IMAP4rev2 (RFC 9051), since that protocol includes | does not extend IMAP4rev2 (RFC 9051), since that protocol includes | |||
| skipping to change at line 126 | skipping to change at line 126 | |||
| A client MUST use the "ENABLE" command [RFC5161] with the | A client MUST use the "ENABLE" command [RFC5161] with the | |||
| "UTF8=ACCEPT" option (defined in Section 4 below) to indicate to the | "UTF8=ACCEPT" option (defined in Section 4 below) to indicate to the | |||
| server that the client accepts UTF-8 in quoted-strings and supports | server that the client accepts UTF-8 in quoted-strings and supports | |||
| the "UTF8=ACCEPT" extension. The "ENABLE UTF8=ACCEPT" command is | the "UTF8=ACCEPT" extension. The "ENABLE UTF8=ACCEPT" command is | |||
| only valid in the authenticated state. | only valid in the authenticated state. | |||
| The IMAP base specification [RFC3501] forbids the use of 8-bit | The IMAP base specification [RFC3501] forbids the use of 8-bit | |||
| characters in atoms or quoted-strings. Thus, a UTF-8 string can only | characters in atoms or quoted-strings. Thus, a UTF-8 string can only | |||
| be sent as a literal. This can be inconvenient from a coding | be sent as a literal. This can be inconvenient from a coding | |||
| standpoint, and unless the server offers IMAP non-synchronizing | standpoint, and unless the server offers IMAP non-synchronizing | |||
| literals [RFC2088], this requires an extra round trip for each UTF-8 | literals [RFC7888], this requires an extra round trip for each UTF-8 | |||
| string sent by the client. When the IMAP server supports | string sent by the client. When the IMAP server supports | |||
| "UTF8=ACCEPT", it supports UTF-8 in quoted-strings with the following | "UTF8=ACCEPT", it supports UTF-8 in quoted-strings with the following | |||
| ABNF syntax [RFC5234]: | ABNF syntax [RFC5234]: | |||
| quoted =/ DQUOTE *uQUOTED-CHAR DQUOTE | quoted =/ DQUOTE *uQUOTED-CHAR DQUOTE | |||
| ; QUOTED-CHAR is not modified, as it will affect | ; QUOTED-CHAR is not modified, as it will affect | |||
| ; other RFC 3501 ABNF non-terminals. | ; other RFC 3501 ABNF non-terminals. | |||
| uQUOTED-CHAR = QUOTED-CHAR / UTF8-2 / UTF8-3 / UTF8-4 | uQUOTED-CHAR = QUOTED-CHAR / UTF8-2 / UTF8-3 / UTF8-4 | |||
| skipping to change at line 161 | skipping to change at line 161 | |||
| quoted syntax with any IMAP argument that permits a string (including | quoted syntax with any IMAP argument that permits a string (including | |||
| astring and nstring). However, if characters outside the US-ASCII | astring and nstring). However, if characters outside the US-ASCII | |||
| repertoire are used in an inappropriate place, the results would be | repertoire are used in an inappropriate place, the results would be | |||
| the same as if other syntactically valid but semantically invalid | the same as if other syntactically valid but semantically invalid | |||
| characters were used. Specific cases where UTF-8 characters are | characters were used. Specific cases where UTF-8 characters are | |||
| permitted or not permitted are described in the following paragraphs. | permitted or not permitted are described in the following paragraphs. | |||
| All IMAP servers that support "UTF8=ACCEPT" SHOULD accept UTF-8 in | All IMAP servers that support "UTF8=ACCEPT" SHOULD accept UTF-8 in | |||
| mailbox names, and those that also support the Mailbox International | mailbox names, and those that also support the Mailbox International | |||
| Naming Convention described in [RFC3501], Section 5.1.3, MUST accept | Naming Convention described in [RFC3501], Section 5.1.3, MUST accept | |||
| UTF8-quoted mailbox names and convert them to the appropriate | UTF-8 in mailbox names and convert them to the appropriate internal | |||
| internal format. Mailbox names MUST comply with the Net-Unicode | format. Mailbox names MUST comply with the Net-Unicode Definition | |||
| Definition ([RFC5198], Section 2) with the specific exception that | ([RFC5198], Section 2) with the specific exception that they MUST NOT | |||
| they MUST NOT contain control characters (U+0000 - U+001F and U+0080 | contain control characters (U+0000 - U+001F and U+0080 - U+009F), a | |||
| - U+009F), a delete character (U+007F), a line separator (U+2028), or | delete character (U+007F), a line separator (U+2028), or a paragraph | |||
| a paragraph separator (U+2029). | separator (U+2029). | |||
| Once an IMAP client has enabled UTF-8 support with the "ENABLE | Once an IMAP client has enabled UTF-8 support with the "ENABLE | |||
| UTF8=ACCEPT" command, it MUST NOT issue a "SEARCH" command that | UTF8=ACCEPT" command, it MUST NOT issue a "SEARCH" command that | |||
| contains a charset specification. If an IMAP server receives such a | contains a charset specification. If an IMAP server receives such a | |||
| "SEARCH" command in that situation, it SHOULD reject the command with | "SEARCH" command in that situation, it SHOULD reject the command with | |||
| a "BAD" response (due to the conflicting charset labels). | a "BAD" response (due to the conflicting charset labels). This also | |||
| applies to any IMAP command or extension that includes an optional | ||||
| charset label and associated strings in the command arguments, | ||||
| including the MULTISEARCH extension. For commands with a mandatory | ||||
| charset field, such as SORT and THREAD, servers SHOULD reject charset | ||||
| values other than UTF-8 with a "BAD" response (due to the conflicting | ||||
| charset labels). | ||||
| 4. "APPEND" Command | 4. "APPEND" Command | |||
| If the server supports "UTF8=ACCEPT", then the server accepts UTF-8 | If the server supports "UTF8=ACCEPT", then the server accepts UTF-8 | |||
| headers in the "APPEND" command message argument. | headers in the "APPEND" command message argument. | |||
| If an IMAP server supports "UTF8=ACCEPT" and the IMAP client has not | If an IMAP server supports "UTF8=ACCEPT" and the IMAP client has not | |||
| issued the "ENABLE UTF8=ACCEPT" command, the server MUST reject, with | issued the "ENABLE UTF8=ACCEPT" command, the server MUST reject, with | |||
| a "NO" response, an "APPEND" command that includes any 8-bit | a "NO" response, an "APPEND" command that includes any 8-bit | |||
| character in message header fields. | character in message header fields. | |||
| skipping to change at line 348 | skipping to change at line 354 | |||
| * UTF8=ALL (OBSOLETE) | * UTF8=ALL (OBSOLETE) | |||
| * UTF8=APPEND (OBSOLETE) | * UTF8=APPEND (OBSOLETE) | |||
| * UTF8=ONLY | * UTF8=ONLY | |||
| * UTF8=USER (OBSOLETE) | * UTF8=USER (OBSOLETE) | |||
| 11. Security Considerations | 11. Security Considerations | |||
| The security considerations of UTF-8 [RFC3629] and SASLprep [RFC4013] | The security considerations of UTF-8 [RFC3629] and SASLprep [RFC8265] | |||
| apply to this specification, particularly with respect to use of | apply to this specification, particularly with respect to use of | |||
| UTF-8 in usernames and passwords. Otherwise, this is not believed to | UTF-8 in usernames and passwords. Otherwise, this is not believed to | |||
| alter the security considerations of IMAP. | alter the security considerations of IMAP. | |||
| Special considerations, some of them with security implications, | Special considerations, some of them with security implications, | |||
| occur if a server that conforms to this specification is accessed by | occur if a server that conforms to this specification is accessed by | |||
| a client that does not, as well as in some more complex situations in | a client that does not, as well as in some more complex situations in | |||
| which a given message is accessed by multiple clients that might use | which a given message is accessed by multiple clients that might use | |||
| different protocols and/or support different capabilities. Those | different protocols and/or support different capabilities. Those | |||
| issues are discussed in Section 8. | issues are discussed in Section 8. | |||
| skipping to change at line 377 | skipping to change at line 383 | |||
| <https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
| [RFC3501] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION | [RFC3501] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION | |||
| 4rev1", RFC 3501, DOI 10.17487/RFC3501, March 2003, | 4rev1", RFC 3501, DOI 10.17487/RFC3501, March 2003, | |||
| <https://www.rfc-editor.org/info/rfc3501>. | <https://www.rfc-editor.org/info/rfc3501>. | |||
| [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO | [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO | |||
| 10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November | 10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November | |||
| 2003, <https://www.rfc-editor.org/info/rfc3629>. | 2003, <https://www.rfc-editor.org/info/rfc3629>. | |||
| [RFC4013] Zeilenga, K., "SASLprep: Stringprep Profile for User Names | ||||
| and Passwords", RFC 4013, DOI 10.17487/RFC4013, February | ||||
| 2005, <https://www.rfc-editor.org/info/rfc4013>. | ||||
| [RFC5161] Gulbrandsen, A., Ed. and A. Melnikov, Ed., "The IMAP | [RFC5161] Gulbrandsen, A., Ed. and A. Melnikov, Ed., "The IMAP | |||
| ENABLE Extension", RFC 5161, DOI 10.17487/RFC5161, March | ENABLE Extension", RFC 5161, DOI 10.17487/RFC5161, March | |||
| 2008, <https://www.rfc-editor.org/info/rfc5161>. | 2008, <https://www.rfc-editor.org/info/rfc5161>. | |||
| [RFC5198] Klensin, J. and M. Padlipsky, "Unicode Format for Network | [RFC5198] Klensin, J. and M. Padlipsky, "Unicode Format for Network | |||
| Interchange", RFC 5198, DOI 10.17487/RFC5198, March 2008, | Interchange", RFC 5198, DOI 10.17487/RFC5198, March 2008, | |||
| <https://www.rfc-editor.org/info/rfc5198>. | <https://www.rfc-editor.org/info/rfc5198>. | |||
| [RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax | [RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax | |||
| Specifications: ABNF", STD 68, RFC 5234, | Specifications: ABNF", STD 68, RFC 5234, | |||
| skipping to change at line 410 | skipping to change at line 412 | |||
| February 2012, <https://www.rfc-editor.org/info/rfc6530>. | February 2012, <https://www.rfc-editor.org/info/rfc6530>. | |||
| [RFC6532] Yang, A., Steele, S., and N. Freed, "Internationalized | [RFC6532] Yang, A., Steele, S., and N. Freed, "Internationalized | |||
| Email Headers", RFC 6532, DOI 10.17487/RFC6532, February | Email Headers", RFC 6532, DOI 10.17487/RFC6532, February | |||
| 2012, <https://www.rfc-editor.org/info/rfc6532>. | 2012, <https://www.rfc-editor.org/info/rfc6532>. | |||
| [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
| 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
| May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
| 12.2. Informative References | [RFC8265] Saint-Andre, P. and A. Melnikov, "Preparation, | |||
| Enforcement, and Comparison of Internationalized Strings | ||||
| Representing Usernames and Passwords", RFC 8265, | ||||
| DOI 10.17487/RFC8265, October 2017, | ||||
| <https://www.rfc-editor.org/info/rfc8265>. | ||||
| [RFC2088] Myers, J., "IMAP4 non-synchronizing literals", RFC 2088, | 12.2. Informative References | |||
| DOI 10.17487/RFC2088, January 1997, | ||||
| <https://www.rfc-editor.org/info/rfc2088>. | ||||
| [RFC2342] Gahrns, M. and C. Newman, "IMAP4 Namespace", RFC 2342, | [RFC2342] Gahrns, M. and C. Newman, "IMAP4 Namespace", RFC 2342, | |||
| DOI 10.17487/RFC2342, May 1998, | DOI 10.17487/RFC2342, May 1998, | |||
| <https://www.rfc-editor.org/info/rfc2342>. | <https://www.rfc-editor.org/info/rfc2342>. | |||
| [RFC4314] Melnikov, A., "IMAP4 Access Control List (ACL) Extension", | [RFC4314] Melnikov, A., "IMAP4 Access Control List (ACL) Extension", | |||
| RFC 4314, DOI 10.17487/RFC4314, December 2005, | RFC 4314, DOI 10.17487/RFC4314, December 2005, | |||
| <https://www.rfc-editor.org/info/rfc4314>. | <https://www.rfc-editor.org/info/rfc4314>. | |||
| [RFC5530] Gulbrandsen, A., "IMAP Response Codes", RFC 5530, | [RFC5530] Gulbrandsen, A., "IMAP Response Codes", RFC 5530, | |||
| skipping to change at line 445 | skipping to change at line 449 | |||
| [RFC6857] Fujiwara, K., "Post-Delivery Message Downgrading for | [RFC6857] Fujiwara, K., "Post-Delivery Message Downgrading for | |||
| Internationalized Email Messages", RFC 6857, | Internationalized Email Messages", RFC 6857, | |||
| DOI 10.17487/RFC6857, March 2013, | DOI 10.17487/RFC6857, March 2013, | |||
| <https://www.rfc-editor.org/info/rfc6857>. | <https://www.rfc-editor.org/info/rfc6857>. | |||
| [RFC6858] Gulbrandsen, A., "Simplified POP and IMAP Downgrading for | [RFC6858] Gulbrandsen, A., "Simplified POP and IMAP Downgrading for | |||
| Internationalized Email", RFC 6858, DOI 10.17487/RFC6858, | Internationalized Email", RFC 6858, DOI 10.17487/RFC6858, | |||
| March 2013, <https://www.rfc-editor.org/info/rfc6858>. | March 2013, <https://www.rfc-editor.org/info/rfc6858>. | |||
| [RFC7888] Melnikov, A., Ed., "IMAP4 Non-synchronizing Literals", | ||||
| RFC 7888, DOI 10.17487/RFC7888, May 2016, | ||||
| <https://www.rfc-editor.org/info/rfc7888>. | ||||
| [RFC8620] Jenkins, N. and C. Newman, "The JSON Meta Application | [RFC8620] Jenkins, N. and C. Newman, "The JSON Meta Application | |||
| Protocol (JMAP)", RFC 8620, DOI 10.17487/RFC8620, July | Protocol (JMAP)", RFC 8620, DOI 10.17487/RFC8620, July | |||
| 2019, <https://www.rfc-editor.org/info/rfc8620>. | 2019, <https://www.rfc-editor.org/info/rfc8620>. | |||
| [RFC9051] Melnikov, A., Ed. and B. Leiba, Ed., "Internet Message | [RFC9051] Melnikov, A., Ed. and B. Leiba, Ed., "Internet Message | |||
| Access Protocol (IMAP) - Version 4rev2", RFC 9051, | Access Protocol (IMAP) - Version 4rev2", RFC 9051, | |||
| DOI 10.17487/RFC9051, August 2021, | DOI 10.17487/RFC9051, August 2021, | |||
| <https://www.rfc-editor.org/info/rfc9051>. | <https://www.rfc-editor.org/info/rfc9051>. | |||
| Appendix A. Design Rationale | Appendix A. Design Rationale | |||
| skipping to change at line 486 | skipping to change at line 494 | |||
| UTF8-related syntax compatible with IMAP4rev2 as defined by [RFC9051] | UTF8-related syntax compatible with IMAP4rev2 as defined by [RFC9051] | |||
| and making it simpler for clients to support IMAP4rev1 and IMAP4rev2 | and making it simpler for clients to support IMAP4rev1 and IMAP4rev2 | |||
| with the same code. | with the same code. | |||
| IMAP4rev2 [RFC9051] provides roughly the same abilities as [RFC6855] | IMAP4rev2 [RFC9051] provides roughly the same abilities as [RFC6855] | |||
| but does not include APPEND's UTF8 item. None of [RFC6855], | but does not include APPEND's UTF8 item. None of [RFC6855], | |||
| IMAP4rev2, or JMAP [RFC8620] specify any way to learn whether a | IMAP4rev2, or JMAP [RFC8620] specify any way to learn whether a | |||
| particular message was stored using the UTF8 data item. As of today, | particular message was stored using the UTF8 data item. As of today, | |||
| an IMAP client cannot learn whether a particular message was stored | an IMAP client cannot learn whether a particular message was stored | |||
| using the UTF8 data item, nor would it be able to trust that | using the UTF8 data item, nor would it be able to trust that | |||
| information even if IMAP4rev1/2 were extended to provide that | information even if IMAP4rev1 and 2 were extended to provide that | |||
| information. | information. | |||
| In July 2023, one of the authors found only one IMAP client that uses | In July 2023, one of the authors found only one IMAP client that uses | |||
| the UTF8 data item, and that client uses it incorrectly (it sends the | the UTF8 data item, and that client uses it incorrectly (it sends the | |||
| data item for all messages if the server supports UTF8=ACCEPT, | data item for all messages if the server supports UTF8=ACCEPT, | |||
| without regard to whether a particular message includes any UTF8 at | without regard to whether a particular message includes any UTF8 at | |||
| all). | all). | |||
| For these reasons, it was judged best to revise [RFC6855] and adopt | For these reasons, it was judged best to revise [RFC6855] and adopt | |||
| the same syntax as IMAP4rev2. | the same syntax as IMAP4rev2. | |||
| B.2. FETCH BODYSTRUCTURE | B.2. FETCH BODYSTRUCTURE | |||
| [RFC6532] defines a new MIME type, message/global, which is | [RFC6532] defines a new media type, message/global, which is | |||
| substantially like message/rfc822 except that the submessage may | substantially like message/rfc822 except that the submessage may | |||
| (also) use the syntax defined in [RFC6532]. [RFC3501] and [RFC9051] | (also) use the syntax defined in [RFC6532]. [RFC3501] and [RFC9051] | |||
| define a FETCH item to return the MIME structure of a message, which | define a FETCH item to return the MIME structure of a message, which | |||
| servers usually compute once and store. | servers usually compute once and store. | |||
| None of the RFCs point out to implementers that IMAP4rev1 and | None of the RFCs point out to implementers that IMAP4rev1 and | |||
| IMAP4rev2 are slightly different, so storing the BODYSTRUCTURE in the | IMAP4rev2 are slightly different, so storing the BODYSTRUCTURE in the | |||
| way servers and clients often do can easily lead to problems. | way servers and clients often do can easily lead to problems. | |||
| This document makes the syntax optional, making it simple for server | This document makes the syntax optional, making it simple for server | |||
| End of changes. 12 change blocks. | ||||
| 21 lines changed or deleted | 29 lines changed or added | |||
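
Illustration (not part of either RFC text): both columns of the Section 3 change block state the same set of code points that mailbox names MUST NOT contain. A minimal Python sketch of that check follows. The names `FORBIDDEN_RANGES`, `has_forbidden_code_point`, and `mailbox_name_from_wire` are hypothetical, and the sketch covers only the code points listed in the diffed text; it does not attempt the remaining Net-Unicode requirements of [RFC5198], Section 2.

```python
# Code points that the diffed text says mailbox names MUST NOT contain.
# All names here are illustrative assumptions, not part of any IMAP library.
FORBIDDEN_RANGES = (
    (0x0000, 0x001F),  # control characters U+0000 - U+001F
    (0x007F, 0x007F),  # delete character
    (0x0080, 0x009F),  # control characters U+0080 - U+009F
    (0x2028, 0x2029),  # line separator and paragraph separator
)


def has_forbidden_code_point(name: str) -> bool:
    """Return True if any character falls in one of the forbidden ranges."""
    return any(
        low <= ord(ch) <= high
        for ch in name
        for low, high in FORBIDDEN_RANGES
    )


def mailbox_name_from_wire(raw: bytes):
    """Decode a mailbox name received as UTF-8 octets.

    Returns the decoded name, or None if the octets are not valid UTF-8
    or the name contains a forbidden code point.
    """
    try:
        name = raw.decode("utf-8")  # strict decoding rejects malformed input
    except UnicodeDecodeError:
        return None
    return None if has_forbidden_code_point(name) else name
```

A server following this sketch would presumably map a `None` result to whatever failure response it already uses for unacceptable mailbox names; the diffed text does not prescribe one here.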
This html diff was produced by rfcdiff 1.48.
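
Illustration (not part of either RFC text): the right-hand column of the Section 3 change block extends the charset rule from "SEARCH" to any command with an optional charset label, and adds a rule for commands with a mandatory charset field such as SORT and THREAD. A hedged Python sketch of that decision follows. The function `should_reject_with_bad` and the two command sets are hypothetical; a real server would also add the command defined by the MULTISEARCH extension to the optional set, and would parse the charset out of the actual command arguments. Treating a missing mandatory charset as a rejection is an assumption of this sketch.

```python
from typing import Optional

# Commands whose charset label is optional (illustrative, not exhaustive).
OPTIONAL_CHARSET_COMMANDS = {"SEARCH"}
# Commands whose charset field is mandatory, as named in the diffed text.
MANDATORY_CHARSET_COMMANDS = {"SORT", "THREAD"}


def should_reject_with_bad(
    command: str, charset: Optional[str], utf8_enabled: bool
) -> bool:
    """Decide whether the charset rule calls for a "BAD" response.

    `charset` is the charset label found in the command, or None if the
    command carried no charset label.
    """
    if not utf8_enabled:
        # The rule only applies after "ENABLE UTF8=ACCEPT".
        return False
    cmd = command.upper()
    if cmd in OPTIONAL_CHARSET_COMMANDS:
        # Any charset specification conflicts once UTF-8 is enabled.
        return charset is not None
    if cmd in MANDATORY_CHARSET_COMMANDS:
        # Only UTF-8 is acceptable; charset names compare case-insensitively.
        return charset is None or charset.upper() != "UTF-8"
    return False
```

Because the text uses SHOULD rather than MUST for these rejections, a result of `True` is a recommendation and not a hard protocol requirement.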