SC22/WG20 N826

From: Kenneth Whistler [kenw@sybase.com]
Sent: Monday, April 09, 2001 8:52 PM
Subject: WG2 #40 Meeting Summary (Mountain View)

Unicordates,

WG2 met last week in Mountain View. You can go dig out the resolutions (WG2 N2354) and eventually the minutes yourself, but, as usual, I'd like to send around a short--well, kind of long, actually--summary report, emphasizing the issues of relevance to the UTC.

1. FDIS 10646-2 (see Resolution M40.3)

This is approved now, and on its way to publication as an IS. The editor (Michel) accommodated a few minor editorial comments.

The main issue that was discussed was the status of the Extension B font, needed both for publication of 10646-2 and for the final chart publication of Extension B for Unicode 3.1. China reported that the FDIS font was created by 3 vendors. The current font is now the work of a single vendor, so it is stylistically consistent and the quality has been improved. It is still being checked for correctness -- and that will have to be reviewed by the IRG meeting June 18-20 in Hong Kong. China promised delivery of the "final, final font" by July 15.

I'll spare you the rights-to-online-publication hassling about the font. I am assuming that that will be worked out, but if not, then we have a production problem for both standards.

2. Amendment 1 for 10646-1 (see Resolutions M40.4, M40.5, M40.6)

The resolution of comments for PDAM 1 was the main order of business for the meeting. The PDAM passed, with negative votes by Japan and Ireland, and with lots of comments from other national bodies. WG2 accommodated most of the comments, including enough to turn the Japanese and Irish votes to YES. And the amendment now progresses to its FPDAM balloting.

In the process of resolving national ballot comments or as a result of considering independent proposal documents brought into the meeting, a number of characters were added to the amendment. I'll list these here, ordered by their status for the UTC.
And then list other significant changes.

Characters added to the BMP, already approved by the UTC, with the same code point and names. These are non-problematical additions, synching up with what UTC has approved, and require no further action by the UTC:

  0220 LATIN CAPITAL LETTER N WITH LONG RIGHT LEG
  034F COMBINING GRAPHEME JOINER
  066E ARABIC LETTER DOTLESS BEH
  066F ARABIC LETTER DOTLESS QAF
  267A RECYCLING SYMBOL FOR GENERIC MATERIALS
  267B BLACK UNIVERSAL RECYCLING SYMBOL
  2768..2775 (14 Dingbat ornamental brackets)
  FE73 ARABIC TAIL FRAGMENT

Characters added to the BMP, already approved by the UTC, with the same code point, but with different names. The UTC needs to revisit these, to update their approval to the new names:

  10F7 GEORGIAN LETTER YN (UTC approved GEORGIAN LETTER IRRATIONAL VOWEL)
  10F8 GEORGIAN LETTER ELIFI (UTC approved GEORGIAN LETTER AINI)
  267C RECYCLED PAPER SYMBOL (UTC approved RECYCLED PAPER)
  267D PARTIALLY-RECYCLED PAPER SYMBOL (UTC approved PARTIALLY RECYCLED PAPER)

Math characters added to the BMP, considered, but not yet approved by the UTC. There are 74 of these, documented in WG2 N2356. That was based on WG2 N2336 "Additional Mathematical Symbols", which superseded WG2 N2318, the document that resulted from the discussion of these at the last UTC meeting. The most problematical of the characters in the earlier documents were omitted, in favor of progressing those which seemed less controversial and of higher priority. The UTC will need to review WG2 N2336, N2356, and N2341R (the draft charts for the FPDAM, which includes these 74) and decide either to approve them, or ask for modifications or removals from the FPDAM. I won't type up the entire list of 74 here, as it is available in those other documents.

Character whose code point was moved from that shown in the PDAM. This will need to be reviewed and approved by the UTC:

  27D0 WHITE DIAMOND WITH CENTRED DOT (moved from 255F)

Character removed from the PDAM.
This removal accords with the UTC decision to reject this character, and so requires no further action by the UTC:

  17DD KHMER SIGN LAAK

Characters that the U.S. ballot comments requested be removed from the PDAM, but which were not. These are the four extra radicals in the Japanese compatibility ideograph set. At this point, since the removal was refused by WG2, the UTC will need to reconfirm the four characters in question to ensure that the two standards are synchronized. (Or it could ask for their removal one more time in FPDAM ballot comments from the U.S. NB -- but that seems pointless at this time, since the question was already decided by the WG2 and will get the same result in the FPDAM unless someone can come up with a stronger implementation argument for their removal.)

  FA4A CJK COMPATIBILITY CHARACTER-FA4A
  FA5E CJK COMPATIBILITY CHARACTER-FA5E
  FA5F CJK COMPATIBILITY CHARACTER-FA5F
  FA67 CJK COMPATIBILITY CHARACTER-FA67

Characters whose names were changed from those printed in the PDAM text. These fall into two categories:

  1. those requested in U.S. ballot comments, which can be considered to be pre-approved by the UTC, since they came out of decisions made in the joint UTC/L2 ad hoc meeting.

  2. those requested in other NB comments or by the WG2 plenary, which will need to be reviewed and approved by the UTC.

"Pre-approved" name changes:

  2140 DOUBLE-STRUCK N-ARY SUMMATION
  291D LEFTWARDS ARROW TO BLACK DIAMOND
  291E..2920 (same change of "FILLED" to "BLACK" as for 291D)
  2933 WAVE ARROW POINTING DIRECTLY RIGHT
  29A8 MEASURED ANGLE WITH OPEN ARM ENDING IN ARROW POINTING UP AND RIGHT
  29A9..29AB (same removal of "TO THE" as for 2933 and 29A8)
  29D1 LEFT BLACK BOWTIE
  29D2 RIGHT BLACK BOWTIE
  29D3 BLACK BOWTIE
  29D4 LEFT BLACK TIMES
  29D5 RIGHT BLACK TIMES
  29D7 BLACK HOURGLASS
  29EA BLACK DIAMOND WITH DOWN ARROW
  29EB BLACK LOZENGE
  29ED BLACK CIRCLE WITH DOWN ARROW
  29EF ERROR-BARRED BLACK SQUARE
  29F1 ERROR-BARRED BLACK DIAMOND
  29F3 ERROR-BARRED BLACK CIRCLE
  2A28 PLUS SIGN WITH BLACK TRIANGLE

[Incidentally, of these, I think the changes for 29D1..29D5 create misnomers, and should be reverted to FILLED. I will raise that as a UTC issue for comment on the FPDAM.]

Name changes requiring further review and approval:

  2144 TURNED SANS-SERIF CAPITAL Y (changed "INVERTED" to "TURNED")
  23BE DENTISTRY SYMBOL LIGHT VERTICAL AND TOP RIGHT
  23BF..23CC (comparable change of "DENTIST" to "DENTISTRY" in each name)

In addition to all the additions and name changes, WG2 also agreed to a number of glyph corrections -- some of them requested in NB ballot comments (technically out of scope for the PDAM, but accommodated anyway), and others from other sources. Most of these were non-controversial small fixes, and are going to be rolled in as soon as possible.

3. Dis-unification of Brackets for CJK and Math (see Resolution M40.7)

Acting as individuals, Asmus and Michel brought in a proposal to disunify the CJK brackets also used in math, to solve the implementation problem Asmus talked about at the last UTC meeting. This, and the proposal for adding more mathematical symbols (see above), led to an ad hoc meeting on mathematical symbols, whose report got written up as WG2 N2344.

The ad hoc served as a vehicle to get Irish, and in particular, Japanese support for the proposal.
I argued against the disunification, in accord with the official U.S. position. Michel argued for the disunification, in opposition to the U.S. position, and Asmus argued for the disunification, in opposition to the UTC position. Korea effectively abstained in the ad hoc, and China was not officially represented. The net of the ad hoc on this issue was for Kent Karlsson (Sweden), who also supported the disunification, to write up a proposal.

In plenary I argued that proposal down from 10 disunifications to 6, but in the end WG2 approved the 6:

  2B00 MATHEMATICAL LEFT WHITE SQUARE BRACKET
  2B01 MATHEMATICAL RIGHT WHITE SQUARE BRACKET
  2B02 MATHEMATICAL LEFT ANGLE BRACKET
  2B03 MATHEMATICAL RIGHT ANGLE BRACKET
  2B04 MATHEMATICAL LEFT DOUBLE ANGLE BRACKET
  2B05 MATHEMATICAL RIGHT DOUBLE ANGLE BRACKET

These are disunification clones of 301A, 301B, 3008, 3009, 300A, and 300B, respectively.

In addition, WG2 approved the renaming of two characters in Amendment 1:

  2985 MATHEMATICAL WHITE LEFT PARENTHESIS
  2986 MATHEMATICAL WHITE RIGHT PARENTHESIS

and the addition of two CJK (wide) disunification clones of those two:

  33DE WHITE LEFT PARENTHESIS
  33DF WHITE RIGHT PARENTHESIS

The resolution didn't actually add these to the FPDAM text, but in some ways, the wording is actually even worse than if they were added to the FPDAM:

  "WG 2 provisionally accepts to add 6 new math symbols ... (etc.) ... per recommendation of the Math ad hoc group in document N2345R, with the intent of including these in the standard in the FDAM-1 to 10646-1. This provisional acceptance is to permit member bodies and liaison organizations to review and comment by the next meeting of WG 2 in October 2001."

What this means, in effect, is that the disunifications will be added to the FDAM-1 in October, without them having gone through PDAM or FPDAM ballot comment and review, unless one or more member bodies or liaison organizations raise strong enough objections to overturn the consensus that was manufactured at WG2 #40, as captured in Resolution M40.7.

Since the UTC and the U.S. national body are on record as opposing this disunification, that means that if they want the disunification reversed, they will have to line up a very strong objection before the October WG2 meeting -- and will not have the nominal vehicle of the FPDAM ballot comments to do it in, since these disunifications will not actually appear in the ballot text.

4. Future Script Additions: Limbu, Ugaritic Cuneiform, Aegean scripts

WG2 didn't take any resolutions on these, but minuted the fact that these proposals are now considered mature. The revised proposals are in the hopper now, with national body comments invited. And I made it clear that the intent by the proposers is to progress these to amendment balloting by resolution at the Singapore meeting this October. In particular:

  Limbu (for the BMP) would be in Amendment 2 for 10646-1.

  Ugaritic Cuneiform and the Aegean scripts (Linear B, etc.) would be in Amendment 1 for 10646-2 (they go on Plane 1).

Given the mature status of the proposals now, and the buy-in from the relevant academic communities for the historic scripts, these are now the most likely candidate additions we can see coming that would meet the presumptive deadline for inclusion in Unicode 4.0.

The Dai scripts (for the BMP) might also make it, if they get pushed between now and October, since those proposals are also fairly mature, have had a long history in WG2, and since China has an interest in completing them.

5. Korean Ad Hoc Meeting Report (see Resolution M40.1)

The DPRK and ROK were both heavily represented at the meeting.
China also brought along a Korean expert in their delegation. There was a fairly extensive discussion of the meeting report (N2331) that the Korean script ad hoc group put on the record. However, the net effect was no impact on anything currently encoded. Everyone was invited to continue talking, with Kyongsok Kim (ROK) and a player to be named later from the DPRK nominated as co-chairs of the ad hoc to coordinate any future reports to WG2.

Side discussions with the DPRK further clarified the distinction between the standard arrangement of characters in 10646 and the use of tables for arbitrary collation orders. And there was a breakthrough of sorts in clarifying that it is perfectly o.k. to *translate* the English reference names of characters in the standard to whatever may be locally appropriate -- just as the French translators have done for the French edition of 10646. This may help remove the pressure from the DPRK to change the names of all the Korean Hangul characters in 10646.

--Ken