ISO/IEC JTC 1/SC22/Java SG N 3-12

DATE: 1998-10-17
REPLACES: N/A
DOC TYPE: Plain DOS Text
TITLE: Requirements for supporting non-BMP planes in UCS-4
SOURCE: Japan (Kazuhiro Kazama)
PROJECT: N/A
STATUS: This document is circulated to National Bodies of JTC 1/SC22/Java SG for review and consideration at the October 1998 SC22/Java SG meeting in Tokyo.
ACTION ID: FYI
DUE DATE:
DISTRIBUTION: P and L Members
MEDIUM:
DISKETTE NO.:
NO. OF PAGES: 1

Text of contribution:

The ISO Java standard should support the non-BMP planes of UCS-4. For example, additional ideographs will be added to plane 2 of ISO/IEC 10646. Those characters are necessary for reading and writing Japanese documents.

The current Java specification has several problems with non-BMP plane support. The primitive character data type "char" cannot store UCS-4 data directly because its values are 16-bit unsigned integers. Although the Unicode Standard 2.0 supports surrogate pairs (UTF-16), the Java specification has the following problems.

* Unicode escapes (e.g. \uFFFF) do not support a UCS-4 representation.
* Almost all methods in the java.lang.Character class do not process a surrogate pair as one character.
* Because the "CONSTANT_Utf8" format in class files is not the standard UTF-8 format, surrogate pairs are not converted correctly.

This document reports only the requirements for non-BMP planes and does not specify methods for supporting those planes.
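
The problems listed above can be illustrated with a minimal sketch. It is not part of the contribution: the class name NonBmpDemo and its helper methods are hypothetical, U+20000 is used only as an arbitrary plane 2 code point, and DataOutputStream.writeUTF is used as a stand-in for the class-file "CONSTANT_Utf8" encoding on the assumption that both use the same modified UTF-8 form. The byte counts are those seen on a current JVM.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class NonBmpDemo {
    // Split a UCS-4 code point above U+FFFF into a UTF-16 surrogate pair,
    // because a single 16-bit "char" cannot hold it.
    static char[] toSurrogates(int codePoint) {
        int offset = codePoint - 0x10000;
        char high = (char) (0xD800 + (offset >> 10));
        char low = (char) (0xDC00 + (offset & 0x3FF));
        return new char[] { high, low };
    }

    // Number of bytes writeUTF emits for s, excluding its 2-byte length prefix.
    // Assumption: this modified UTF-8 form matches CONSTANT_Utf8.
    static int modifiedUtf8Length(String s) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeUTF(s);
            out.flush();
            return buffer.size() - 2;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Number of bytes in the standard UTF-8 encoding of s.
    static int standardUtf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        char[] pair = toSurrogates(0x20000);   // arbitrary plane 2 code point
        String s = new String(pair);

        // One character occupies two "char" values.
        System.out.println(s.length());                    // 2

        // Character methods that take a single char see only half of the pair.
        System.out.println(Character.isLetter(pair[0]));   // false

        // Each surrogate is encoded separately (3 bytes each) in modified UTF-8,
        // while standard UTF-8 encodes the code point in 4 bytes.
        System.out.println(modifiedUtf8Length(s));         // 6
        System.out.println(standardUtf8Length(s));         // 4
    }
}
```

The sketch computes the surrogate pair by arithmetic rather than through a library call so that the 16-bit limit of "char" stays visible.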