Universal Multiple-Octet Coded Character Set
International Organization for Standardization
Organisation internationale de normalisation
Doc Type: Working Group Document
Title: Request to allow FFFF, FFFE in UTF-8 in the text of ISO/IEC 10646
Source: Unicode Technical Committee
Action: For adoption by JTC1/SC2/WG2
The Unicode Technical Committee requests that WG2 change its definition of UTF-8 to allow the representation of the code points U+FFFF and U+FFFE. ISO/IEC 10646 currently disallows these two code points in UTF-8, which is clearly an anomaly: the other non-characters (U+1FFFE, U+1FFFF, etc.), as well as the new non-characters U+FDD0..U+FDEF, are allowed.
These code points are all legal in HTML; see the SGML declaration.
The 10646 definition of UTF-8 should be amended as soon as possible to allow all non-characters to be represented in UTF-8.
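As an illustration of the anomaly, the following minimal sketch applies the generic UTF-8 bit layout to the non-characters named above. The helper names `utf8_encode` and `is_noncharacter` are hypothetical and are not taken from ISO/IEC 10646 or the Unicode Standard. The sketch assumes code points no higher than U+10FFFF and does not treat surrogates specially.

```python
def is_noncharacter(cp: int) -> bool:
    """True for U+FDD0..U+FDEF and for any code point ending in FFFE or FFFF."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def utf8_encode(cp: int) -> bytes:
    """Encode one code point with the generic UTF-8 bit layout.

    Assumption: the range is limited to U+0000..U+10FFFF, and surrogate
    code points are not rejected, to keep the sketch short.
    """
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point out of range")
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


if __name__ == "__main__":
    # The same bit layout handles every non-character listed in this request.
    for cp in (0xFDD0, 0xFDEF, 0xFFFE, 0xFFFF, 0x1FFFE, 0x1FFFF):
        print(f"U+{cp:04X} noncharacter={is_noncharacter(cp)} utf8={utf8_encode(cp).hex()}")
```

Under these assumptions U+FFFE and U+FFFF come out as the three-octet sequences EF BF BE and EF BF BF, produced by exactly the same rule that yields EF B7 90 for U+FDD0 and F0 9F BF BE for U+1FFFE. Nothing in the encoding form itself distinguishes the two disallowed code points from the other non-characters.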