ISO/IEC  WG2 - Standards

2014-11-16: home | standards | projects | meetings | contacts


Abstract

ISO/IEC 10646:2014 specifies the Universal Character Set (UCS). It is applicable to the representation, transmission, interchange, processing, storage, input and presentation of the written form of the languages of the world as well as additional symbols. It covers 120 585 characters from the world's scripts.

ISO/IEC 10646:2014:

The charts of the ideographic characters are now in multi-column format.

The UCS is an encoding system different from that specified in ISO/IEC 2022. ISO/IEC 10646:2014 specifies the method to designate UCS from ISO/IEC 2022.

A graphic character will be assigned only one code point in the standard, located either in the BMP or in one of the supplementary planes.
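As a rough illustration of the BMP/supplementary-plane split, the sketch below derives the plane number from a code point. It assumes the usual layout of 17 planes of 65,536 code points each (U+0000 to U+10FFFF), with the BMP as plane 0. The helper names `plane_of` and `is_bmp` are made up for this example and do not come from the standard.

```python
# Minimal sketch: classify a code point by plane.
# Assumption: the UCS codespace runs from U+0000 to U+10FFFF, split into
# 17 planes of 0x10000 code points each; plane 0 is the BMP.

MAX_CODE_POINT = 0x10FFFF


def plane_of(code_point: int) -> int:
    """Return the plane number (0-16) that contains the code point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"not a UCS code point: {code_point:#x}")
    return code_point >> 16  # each plane spans 2**16 code points


def is_bmp(code_point: int) -> bool:
    """True if the code point lies in the Basic Multilingual Plane."""
    return plane_of(code_point) == 0


if __name__ == "__main__":
    for ch in ("A", "\u4e2d", "\U00010330", "\U00020000"):
        cp = ord(ch)
        kind = "BMP" if is_bmp(cp) else "supplementary"
        print(f"U+{cp:04X}: plane {plane_of(cp)} ({kind})")
```

Each character maps to exactly one code point, so it falls in exactly one plane, which is the property the paragraph above describes.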

By defining a consistent way of encoding multilingual text, the standard enables the exchange of data internationally. The information technology industry gains data stability, greater global interoperability and data interchange. ISO/IEC 10646 has been widely adopted on the World Wide Web and implemented in modern operating systems and computer languages.

 

Edition: 4 (Monolingual)

ICS: 35.040

Status: Published

Stage: 60.60 (2014-08-29)

TC/SC: ISO/IEC JTC 1/SC 2

Number of Pages: 2467

Related standards

ISO/IEC 10646:2012 Abstract

ISO/IEC 10646 specifies the Universal Character Set (UCS). It is applicable to the representation, transmission, interchange, processing, storage, input and presentation of the written form of the languages of the world as well as additional symbols. This edition covers 110,181 characters from the world's scripts. This edition of ISO/IEC 10646 cancels and replaces ISO/IEC 10646:2011.

Summary contents of ISO/IEC 10646:2012:

•  It specifies the architecture of ISO/IEC 10646.

•  It defines terms used in ISO/IEC 10646.

•  It describes the general structure of the UCS codespace.

•  It specifies the Basic Multilingual Plane (BMP) of the UCS.

•  It specifies supplementary planes of the UCS: the Supplementary Multilingual Plane (SMP), the Supplementary Ideographic Plane (SIP), the Tertiary Ideographic Plane (TIP), and the Supplementary Special-purpose Plane (SSP).

•  It defines a set of graphic characters used in scripts and the written form of languages on a world-wide scale.

•  It specifies the names for the graphic characters and format characters of the BMP, SMP, SIP, SSP and their coded representations within the UCS codespace. (Note: TIP is currently empty.)

•  It specifies the coded representations for control characters and private use characters.

•  It specifies three encoding forms of the UCS: UTF-8, UTF-16, and UTF-32.

•  It specifies seven encoding schemes of the UCS: UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, and UTF-32LE.

•  It specifies the management of future additions to this coded character set.

•  The charts of the ideographic characters are now in multi-column format.
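The difference between the three encoding forms and the seven encoding schemes listed above is easiest to see in bytes. The sketch below uses Python's built-in codecs as a stand-in for the schemes. It assumes that the codec names `utf-16-be`, `utf-16-le`, `utf-32-be` and `utf-32-le` correspond to the byte-order-explicit schemes. The sample character U+10330 and the helper `encode_all` are choices made for this example only.

```python
# Minimal sketch: one supplementary-plane character under several schemes.
# Assumption: Python's codec names map onto the encoding schemes named in
# the list (UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE).

SCHEMES = ("utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le")


def encode_all(text: str) -> dict:
    """Return the hex bytes of `text` under each byte-order-explicit scheme."""
    return {name: text.encode(name).hex(" ") for name in SCHEMES}


if __name__ == "__main__":
    sample = "\U00010330"  # U+10330, a code point outside the BMP
    for name, hex_bytes in encode_all(sample).items():
        print(f"{name:10} {hex_bytes}")
    # utf-8      f0 90 8c b0
    # utf-16-be  d8 00 df 30
    # utf-16-le  00 d8 30 df
    # utf-32-be  00 01 03 30
    # utf-32-le  30 03 01 00
```

Python's unmarked `utf-16` and `utf-32` codecs prepend a byte order mark and use the platform's byte order when encoding, so their output varies by machine and they are left out of the table.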

The UCS is an encoding system different from that specified in ISO/IEC 2022. The method to designate UCS from ISO/IEC 2022 is specified in 12.2.

 

A graphic character will be assigned only one code point in the standard, located either in the BMP or in one of the supplementary planes.

NOTE – The Unicode Standard, Version 6.1 includes a set of characters, names, and coded representations that are identical to those in this International Standard. It additionally provides details of character properties, processing algorithms, and definitions that are useful to implementers.
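To see the shared character names in practice, the sketch below queries Python's standard `unicodedata` module, which is built from the Unicode Character Database. This assumes the installed Python bundles a later Unicode version than 6.1, which is the usual case, so it may know characters that ISO/IEC 10646:2012 does not cover. The helper `describe` is hypothetical.

```python
# Minimal sketch: look up character names and code points.
# Assumption: the names returned by `unicodedata` match those in the
# standard for characters that both cover, as the NOTE above indicates.
import unicodedata


def describe(char: str) -> str:
    """Format a character as 'U+XXXX NAME'."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    print(describe("A"))        # U+0041 LATIN CAPITAL LETTER A
    print(describe("\u00e9"))   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
    # Reverse lookup: from name back to the character.
    print(unicodedata.lookup("GREEK SMALL LETTER ALPHA"))
```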

By defining a consistent way of encoding multilingual text, the standard enables the exchange of data internationally. The information technology industry gains data stability, greater global interoperability and data interchange. ISO/IEC 10646 has been widely adopted on the World Wide Web and implemented in modern operating systems and computer languages.

 

Approved international standards and other specifications that are now freely available include:  

ISO/IEC 10646: 2003(E) - 1st merged edition

ISO/CEI 10646: 2003(F) 

Information technology -- Universal Multiple-Octet Coded Character Set (UCS) - merged earlier editions Part 1 and Part 2 published in 2000 and 2001 respectively.
Technologies de l'information -- Jeu universel de caractères codés sur plusieurs octets (JUC)

ISO/IEC 10646:2011 (2nd edition): Includes all amendments, 1-8, to the 1st merged edition, ISO/IEC 10646:2003. Total character count: 109,379.

ISO/IEC 10646:2012 (3rd edition): Covers 110,181 characters from the world's scripts. This edition cancels and replaces ISO/IEC 10646:2011.

 

WG2, in cooperation with SC18, has developed a freely available technical report: ISO/IEC TR 15285, An operational model for characters and glyphs.

 

There is one amendment in progress for the ISO/IEC 10646:2012 3rd edition, which is currently at the FDIS stage.

ISO/IEC 10646 3rd edition FDIS: 3rd edition of ISO/IEC 10646:2012 (expected). Contains the content of the 2nd edition plus additional characters. Current total character count: 110,181.

Previous Editions

Three previous editions of ISO/IEC 10646 were published earlier but have been withdrawn since the publication of the ISO/IEC 10646:2014 edition:

  1. The second edition of the ISO/IEC 10646-1 standard, "Universal Multiple-Octet Coded Character Set (UCS) - Part 1: Architecture and Basic Multilingual Plane (BMP)", was published in March 2000.

  2. The first edition of IS 10646-1 was published in May 1993.

Here is a list of the earlier withdrawn editions specific to ISO/IEC 10646 and their amendments/corrigenda:

ISO/IEC 10646-1:1993
Information technology -- Universal Multiple-Octet Coded Character Set (UCS) -- Part 1: Architecture and Basic Multilingual Plane

ISO/IEC 10646-1:1993/Amd 1:1996
Transformation Format for 16 planes of group 00 (UTF-16)

ISO/IEC 10646-1:1993/Cor 1:1996

ISO/IEC 10646-1:1993/Amd 2:1996
UCS Transformation Format 8 (UTF-8)

ISO/IEC 10646-1:1993/Cor 2:1998

ISO/IEC 10646-1:1993/Amd 3:1996

ISO/IEC 10646-1:1993/Amd 4:1996

ISO/IEC 10646-1:1993/Amd 5:1998
Hangul syllables

ISO/IEC 10646-1:1993/Amd 6:1997
Tibetan

ISO/IEC 10646-1:1993/Amd 7:1997
33 additional characters

ISO/IEC 10646-1:1993/Amd 8:1997

ISO/IEC 10646-1:1993/Amd 9:1997
Identifiers for characters

ISO/IEC 10646-1:1993/Amd 10:1998
Ethiopic

ISO/IEC 10646-1:1993/Amd 11:1998
Unified Canadian Aboriginal Syllabics

ISO/IEC 10646-1:1993/Amd 12:1998
Cherokee

ISO/IEC 10646-1:1993/Amd 13:1998
CJK unified ideographs with supplementary sources

ISO/IEC 10646-1:1993/Amd 16:1998
Braille patterns

ISO/IEC 10646-1:1993/Amd 17:1999
CJK Unified Ideographs Extension A

ISO/IEC 10646-1:1993/Amd 18:1999
Symbols and other characters

ISO/IEC 10646-1:1993/Amd 19:1998
Runic

ISO/IEC 10646-1:1993/Amd 20:1998
Ogham

ISO/IEC 10646-1:1993/Amd 21:1999
Sinhala

ISO/IEC 10646-1:1993/Amd 23:1999
Bopomofo Extended and other characters

ISO/IEC 10646-1:2000
Information technology -- Universal Multiple-Octet Coded Character Set (UCS) -- Part 1: Architecture and Basic Multilingual Plane

ISO/IEC 10646-2:2001
Information technology -- Universal Multiple-Octet Coded Character Set (UCS) -- Part 2: Supplementary Planes

ISO/IEC 10646-1:2000/Amd 1:2002
Mathematical symbols and other characters

