SC22/WG20 N774
L2/00-307

Collection of reactions to the WG20 convenor's
"Personal thoughts about the future of WG20"

Part 1: August 30 through September 6, 2000

 

 

Akio Kido suggested that I collect all reactions to my proposal about the future of WG20 in one document for easy reference.  Due to the interest in this subject, it became a rather lengthy document, so I decided to put a linked index in front of it, which allows you to go straight to the contribution that interests you.  I did not do any formatting; I apologize if the text in HTML does not look as good as it could, but I wanted to maintain the original form of the e-mails as I received them.

 

The document got too long – I had to split it into parts:

 

Parts                                        SC22/WG20   NCITS/L2 - UTC
Part 1, from August 30 – September 6, 2000   N774        L2/00-307
Part 2, from September 6 on …                N775        L2/00-308

 

Index with the latest document on top:

 

National Body  Name               Date        Content                                     Supports N3164
USA            Asmus Freytag      2000-09-06  I18N in PLs                                 Y
Canada         Dave Blackwood     2000-09-06  I18N in PL and in OS                        N
Norway         Keld Simonsen      2000-09-06  I18N API in C++                             N
USA            Ken Whistler       2000-09-06  Technical considerations                    Y
Germany        Marc Küster        2000-09-06  Customer emphasis                           ?N?
Sweden         K.I. Larsson       2000-09-05  SC2 plenary discussion ?                    Y
USA            Ken Whistler       2000-09-05  Answer to Canadian contribution             Y
Canada         Alain LaBonté      2000-09-05  CAC/SC22 meeting result                     N
Ireland        Michael Everson    2000-09-05  Character attributes                        Y
Ireland        Michael Everson    2000-09-05  Character properties, sort                  Y
Ireland        Michael Everson    2000-09-05  Support, 14651                              Y
Japan          Akio Kido          2000-09-01  WG20 work in SC22 or not ?                  Y
Germany        Marc Küster        2000-09-01  Customers of WG20 ?                         Y
Sweden         Kent Karlsson      2000-09-01  Where the experts are                       Y
UK             John Clews         2000-09-01  Value of WG20, role of CEN                  ?N?
Japan          T.K. Sato          2000-08-31  Discuss 2375 for 15897                      Y
Norway         Keld Simonsen      2000-08-31  WG20 has a role to play in I18N             N
Sweden         Kent Karlsson      2000-08-31  Sweden's structure for WG20                 Y
Japan          Masayuki Takata    2000-08-30  Agreement in principle                      Y
Japan          Akio Kido          2000-08-30  Agreement, discuss in CLAUI                 Y
SC22 N 3164    Arnold F. Winkler  2000-08-30  Personal thoughts about the future of WG20  Y

 

 

 

Individual contributions on e-mail:

 

 

Japan, Akio Kido, August 30, 2000

 

 

I agree with Arnold's thoughts.
It is a good idea to work in the CLAUI. Some of our standards and TRs are
tightly related to ISO/IEC 10646, and without the involvement
of SC2 we cannot maintain those ISs and TRs and keep them aligned
with the latest ISO/IEC 10646. Following ISO/IEC 10646 means following
a moving target. So we do need to work together with SC2.

 

Best regards,

Akio Kido (Globalization CoC, Yamato, IBM  & Co-chair person of Li18nux)

 

 

Japan, Masayuki Takata, August 30, 2000

 

As an individual, I totally agree with your thoughts.  Thanks for the
good ideas.  This is not a small thing for us all, so it will take some
time to reach a conclusion in the Japanese working group.  However, I have
no doubt that we will agree with you, at least in principle.

As the Head of the Japanese equivalent of WG20, I'll try to find a group
consensus and, probably, delegate Kido-san to discuss it at the SC22 Nara
Plenary.

 

Regards,

 

TAKATA Masayuki

 

 

Sweden, Kent Karlsson, August 31, 2000

 

        I agree in principle with your suggestions (with the
exception that I would prefer the withdrawal also of 14652,
assuming that no-one is willing to rewrite it from scratch
foregoing POSIX compatibility; similar problem with the
registry standard).

        From a formal point of view, the responsibility of
SC22/WG20 matters within the Swedish NB was very recently
transferred from our AG22 to our AG2, which takes care of
TC304, SC2, SC35, and now also SC22/WG20 matters.  So from
an NB point of view, a transferral of most WG20 projects
to SC2 would (now) not make any difference.

                Kind regards
                /kent k

 

Norway, Keld Simonsen, August 31, 2000

 

> Friends,

>

> After some hefty thinking and soul-searching, I decided to send the attached

> personal contribution to SC22 for consideration at the plenary in Nara.  I

> will also send it to CLAUI for consideration at their meeting in October in

> France.  I wanted you to see it before the official SC22 distribution. 

>

> I do hope, you agree with me, at least in parts. 

 

I think it is good that you have done some thinking about it.

 

I do think that WG20 has a role to play.

In my mind we are now about to begin the real work of WG20.

WG20 was set up to standardize i18n functionality, that is

APIs and also to find out what i18n is all about.

We have completed (more or less) location of i18n and

(with the usual time that it takes) now standardized

kind of what was standardized in other WGs of ISO wrt. i18n.

Then we have done a little more, extended some specifications,

and we made 14651.

 

So now we are "lords in our own house", and we can begin

standardizing APIs and go beyond standard i18n functionality.

There is a lot of functionality to cover before we can have

truly internationalized, portable applications.

 

I think the standardization of APIs and formats for data

specifications are best done in SC22, which standardizes

libraries, and also interacts with the many ISO programming languages.

 

Moving WG20 activities into SC2, as Arnold Winkler proposes, would be an error, IMHO.

APIs are not in the scope of SC2. Neither are sorting or

character attributes. And sorting and character attributes

have for a long time been an SC22 issue, viz. islower(), isupper() etc. in C

and other programming languages. I do not

see the kind of expertise in character attributes at SC2 meetings,

but maybe they are available in Unicode, as the Unicode

Technical Committee chair, Arnold Winkler, is hinting at, and maybe

we should just leave everything to Unicode, and stop making open

world-wide standards. In that way all our culture, not just

our MacDonald hamburgers, can be really standardized:-)

 

Kind regards

Keld

 

 

Japan, T.K. Sato, August 31, 2000

 

Arnold, you are going to do what I wanted.

 

I agree with you in principle.  On individual details, such as the 2375 extension,

I think some more discussion might be necessary.

 

Sato

 

 

United Kingdom, John Clews, September 1, 2000

 

I'm sending my thoughts via <SC22WG20@dkuug.dk> which is probably

similar in content to the list of individuals.

 

I think Keld Simonsen sums up several of the things I'd considered

myself, and I find myself agreeing with several of his points, as

noted below: I also throw in some other issues which may be related

to the wider picture.

 

In message <20000831202309.A3987@rap.rap.dk> Keld wrote:

> I do think that WG20 [still] has a role to play...

> WG20 was set up to standardize i18n functionality...

 

No other ISO/IEC JTC1 committee is doing this at present, although we

should certainly continue our liaison within and outside of ISO/IEC

JTC1 committees - in fact JTC1/SC22/WG20 seems to be quite good at

that.

 

It's also an ideal working group in terms of size, cost

effectiveness, and in what it can get done.

 

> We have completed (more or less) location of i18n and

> (with the usual time that it takes) now standardized

> kind of what was standardized in other WGs of ISO wrt. i18n.

> Then we have done a little more, extended some specifications,

> and we made 14651...

 

Which is certainly our big success story, also involving extremely

valuable liaison and participation with the Unicode Technical

Committee. This still needs more work in its second edition.

 

> There is a lot of functionality to cover before we can have

> truly internationalized, portable applications.

>

> I think the standardization of APIs and formats for data

> specifications are best done in SC22, which standardizes

> libraries, and also interacts with the many ISO programming languages.

>

> Moving WG20 activities into SC2, as Arnold Winkler proposes,

> would be an error, IMHO.

> APIs are not in the scope of SC2. Neither are sorting or

> character attributes. And sorting and character attributes

> have for a long time been a SC22 issue, viz. C, and other

> programming languages...

 

> I do not

> see the kind of expertise in character attributes at SC2 meetings,

> but maybe they are available in Unicode, as the Unicode

> Technical Committee chair, Arnold Winkler, is hinting at...

 

Unicode Consortium and the UTC have an extremely important role.

So does ISO, in enabling more international input at expert level

than the UTC does on its own.

 

It may be useful to have a view on this (not necessarily official) from

the UTC, or somebody within it.

 

The UTC and ISO/IEC JTC1/SC2/WG2 make a valuable complementary pair:

the UTC and ISO/IEC JTC1/SC22/WG20 also make a valuable complementary

pair.

 

In passing I also notice that comments from Europe that I have seen

tend towards keeping ISO/IEC JTC1/SC22/WG20, and that comments from

the USA and Japan that I have seen tend towards moving away from

ISO/IEC JTC1/SC22/WG20, although I wouldn't read anything too much

into that.

 

However, it does remind me that the European Commission has

commissioned Price Waterhouse Coopers, if I have the details correct,

to evaluate future work in CEN/TC304: Information and Communications

Technologies: European Localization Requirements.

 

Considering the degree of overlap of some aspects of work between

ISO/IEC JTC1/SC22/WG20 and CEN/TC304, and between the Unicode

Technical Committee and CEN/TC304 to a lesser degree (probably

complementing each other rather than overlapping) it may be useful for

ISO/IEC JTC1/SC22/WG20 and/or the UTC to provide some input into that

process in due course as well, to see if this work can also provide a

wider picture of a useful future in ICT standardisation.

 

Best regards

 

John Clews

 

 

Sweden, Kent Karlsson, September 1, 2000

 

 

> -----Original Message-----
> From: Keld Jørn Simonsen [mailto:keld@dkuug.dk]
...
> I do think that WG20 has a role to play.
> In my mind we are now about to begin the real work of WG20.
> WG20 was set up to standardize i18n functionality, that is
> APIs and also to find out what i18n is all about.
> We have completed (more or less) location of i18n and
> (with the usual time that it takes) now standardized
> kind of what was standardized in other WGs of ISO wrt. i18n.
> Then we have done a little more, extended some specifications,
> and we made 14651.
>
> So now we are "lords in our own house", and we can begin
> standardizing APIs and go beyond standard i18n functionality.
> There is a lot of functionality to cover before we can have
> truly internationalized, portable applications.
>
> I think the standardization of APIs and formats for data
> specifications are best done in SC22, which standardizes
> libraries, and also interacts with the many ISO programming
> languages.
>
> Moving WG20 activities into SC2, as Arnold Winkler proposes,
> would be an error, IMHO.

> APIs are not in the scope of SC2.

True, API development/standardisation should not be done in SC2.
But nobody is suggesting that it should.  The suggestion is to
cancel the API standard development of WG20, due to lack of
interest and lack of quality.  What is troubling is that if
14652 and the corresponding API standard continue, Linuxers
and C (C++, POSIX) standardisers will be misguided by them.
The only hope for the i18n file format and API standards to
be of any use would be to start over from scratch, essentially
ignore POSIX, but pick up the very best from the others, and
do something completely new.  But I don't see that happening in
WG20 at this time.

> Neither are sorting or
> character attributes. And sorting and character attributes
> have for a long time been a SC22 issue, viz. C, and other
> programming languages islower(), isupper() etc. I do not
> see the kind of expertise in character attributes at SC2 meetings,
> but maybe they are available in Unicode, as the Unicode
> Technical Committee chair, Arnold Winkler, is hinting at,

Character attributes and ordering certainly belong in SC2.
That's where the expertise about such matters is to be found
within ISO, not in SC22.  C (and C++ and POSIX) has botched
both character and character string representation, as well
as character properties. C does NOT specify what wchar_t is,
leaving it open to each implementation to do whatever, nor does
it specify any other suitable datatype, and what char means is
locale-dependent. Ada fares a bit better on that point, in that
Wide_Character and Wide_String are UCS-2 (except in non-conforming
implementations). In C (and C++ and POSIX) islower etc. are
locale-dependent, not character-dependent. C and POSIX are
definitely the wrong places to look for guidance regarding this.
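
To make the distinction between locale-dependent and character-dependent
classification concrete, here is a minimal sketch. It is written in Python
purely for illustration (the language choice and the helper name
is_lowercase_letter are assumptions, not part of the original message). It
derives "is a lowercase letter" from the Unicode general category of the
character alone, so the answer cannot change with the locale the process
happens to run in.

```python
import unicodedata


def is_lowercase_letter(ch: str) -> bool:
    """Return True if ch has the Unicode general category Ll.

    Hypothetical helper: the result depends only on the character itself,
    never on any locale setting of the running process.
    """
    return unicodedata.category(ch) == "Ll"


if __name__ == "__main__":
    # a, A, e-acute, sharp s, Greek lambda, digit five
    for ch in ("a", "A", "\u00e9", "\u00df", "\u03bb", "5"):
        print(repr(ch), is_lowercase_letter(ch))
```

A locale-dependent classifier, by contrast, may give different answers for
the same non-ASCII character depending on which locale is active.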

> and maybe
> we should just leave everything to Unicode, and stop making open
> world-wide standards. In that way all our culture, not just
> our MacDonald hamburgers, can be really standardized:-)

I find that statement to be uncalled for.  In my experience
the Unicode Consortium is quite open to input, more so than
ISO, and definitely more so than W3C, and very much
more so than TC304...  And the results from the Unicode Consortium
are also more open than those from ISO.

 

> -----Original Message-----
> From: Ordering@sesame.demon.co.uk [mailto:Ordering@sesame.demon.co.uk]
...
> However, it does remind me that the European Commission has
> commissioned Price Waterhouse Coopers, if I have the details correct,
> to evaluate future work in CEN/TC304: Information and Communications
> Technologies: European Localization Requirements.

"European Localisation Requirements" sounds good.  Unfortunately
TC304 has come up with some rather useless 'deliverables':
reports that misrepresent Unicode/10646, and seem to argue for
increased use of ISO 2022 (currently at about 0% usage in Europe);
botched MES-1 and MES-2 subsets, lacking MES-3 subsets; a report
on fall-back that is hopelessly outdated; "euro-locales" based
on POSIX 'locales' and that in addition are ambivalent to
localisation (common ordering, but language-varying week/month
names); not to mention an internal quarrel about *exactly* what
constitutes Europe (as if that really mattered for TC304).
And the involvement of IT (and communication) industry in
TC304 has, as far as I can tell, been extremely small.

                Kind regards
                /kent k

 

 

Germany, Marc Küster, September 1, 2000

 

Dear Colleagues,

 

> > After some hefty thinking and soul-searching, I decided to send the attached

> > personal contribution to SC22 for consideration at the plenary in Nara.  I

> > will also send it to CLAUI for consideration at their meeting in October in

> > France.  I wanted you to see it before the official SC22 distribution. 

> >

> > I do hope, you agree with me, at least in parts. 

>

 

Arnold's thoughts are indeed stimulating and not unfounded. 14651

certainly is WG20's most relevant project at this point in time, and it is

going to become an international standard very soon now. While there is

the need for immediate revision to cover at least the

repertoire of 10646-1:2000, the end is in sight.

 

There is no point in keeping WG20 as a cosy debating club.

 

[Keld]

> I do think that WG20 has a role to play.

 

Still, I do agree with Keld that WG20 may have a role to play. Whether it

is currently doing so in the best possible manner is open to debate -- you

know the German views on both 14652 and the API standard --, but that means

neither that such work is superfluous nor that there is a lot of

value in WG20's deliverables.

 

When we agreed in Québec to look for "customers" of WG20's deliverables,

especially for the API standard, it was in this spirit. If no-one is

interested in them, by all means cancel them. Yet, I think it is

worthwhile to try. If that takes, as Kent suggests and I agree, a drastic

overhaul of the papers, why not?

 

Moreover, John is right in pointing to the market study that the European

Commission has ordered on CEN/TC304 and that is to be delivered within a

few months. This study by Price Waterhouse Coopers may or may not filter

out important new areas of work for TC304; it may recommend anything

between closure and drastic extension of responsibilities. In any case

many of the conclusions they draw on a European level will be of value to

WG20, and it is worthwhile to scrutinize them before taking action either

way.

 

What I am driving at is not keeping WG20 alive at all costs, quite the

contrary. I'd, however, counsel patience and level-headed evaluation of

the development of the next six months.

 

      Best regards,

 

            Marc

 

 

***************************************************

Marc Wilhelm Kuester

 

Computing Centre of the University of Tuebingen

Dept. Literary and Documentary Data Processing

 

 

Japan, Akio Kido, September 1, 2000

 

 

 

I think what Arnold proposes is NOT a simple termination of WG20's work;
rather, he proposes to do the work in a more appropriate place.

I would like to understand why some people want to stick with the current WG20.
I'm speaking only from an organizational viewpoint. I personally observe that
other WGs in SC22 seem to have little interest in the further work of WG20,
since we have very little participation by representatives of other
working groups in our meetings. Rather, I observe, they seem to pay
more attention to the work of ISO/IEC SC2 and Unicode.

Of course, if we start to work in a new group, we should discuss
our further work from scratch. In order to do that, we first need to
cancel our projects that have not yet reached the final stage.
If an existing project is still important to the market, we can issue an NWI again
with a new, workable business plan.

I believe that what we need to discuss is NOT the importance of
some existing work, but where and how we can contribute to internationalization
in a timely manner. We should recognize that some of our projects are delayed.
As the convenor's report said, we cannot give priority to the API standard, although
the ISO three-year timer has already expired.

The reason why I agreed with Arnold is that I think he proposes some
workable actions.

1) Complete the ordering standard ASAP.
2) Terminate WG20 activity for now.
3) Bring the work that requires maintenance to appropriate groups
   (those who are interested in the maintenance work can join the group).

That proposal would also imply:
a) if no one is interested in some WG20 work that requires maintenance,
   that standard or TR should be frozen.
b) a project to which we cannot give priority in the current WG20 has a
   chance to have its importance re-evaluated and to be restarted in a new group.
   (Of course, if a project cannot get enough support and
   participation, it should be treated as a dead project; we should
   not restart such canceled projects.)

 

 

Best regards,

Akio Kido (Globalization CoC, Yamato, IBM  & Co-chair person of Li18nux)

 

 

 

 

 

Ireland, Michael Everson, September 5, 2000

 

I am in complete agreement with Arnold's contribution. The only thing I

would say is that I would like the plan for how SC2 is to take over

responsibility for the 14651 table to be elaborated. On the other hand

maybe that is an SC2 matter.

 

Michael Everson  **  Everson Gunn Teoranta  **   http://www.egt.ie

 

 

Ireland, Michael Everson, September 5, 2000

 

>Character attributes and ordering certainly belong in SC2.

 

Maintaining ISO standards on the character attributes would unfairly burden

SC2 and it would be impossible to make timely changes, such as have been made

numerous times as the bugs in the bidi algorithm have been ironed out.

Industry (UTC) is the right place for that work, as it is eminently

practical. The Generic Ordering Template is easier to maintain and has

already been standardized.

 

Michael Everson  **  Everson Gunn Teoranta  **   http://www.egt.ie

 

 

Ireland, Michael Everson, September 5, 2000

 

>Moving WG20 activities into SC2, as Arnold Winkler proposes, would be

>an error, IMHO. APIs are not in the scope of SC2. Neither are

>sorting or character attributes.

 

Character attributes should not be maintained by SC2 because of the nature

of the balloting process. But you are wrong that sorting is not in the

scope of SC2. ALL of these scripts when presented for encoding have had

ordering scrutinized by WG2 experts in order to put the code tables

together. All the expertise for this is in the UTC and in WG2. To maintain

the default table, it is logical and natural for SC2 to handle this,

whether in WG2 or a new WG4.

 

The script and linguistic expertise is NOT available in the Programming

Languages subcommittee.

 

Michael Everson  **  Everson Gunn Teoranta  **   http://www.egt.ie

 

 

Canada, Alain LaBonté, September 5, 2000

 

Outcome of our Québec 2000-09-05 CAC/SC22 meeting on this issue

 

Generally speaking: I18N is a fundamental requirement on programming

languages (PL), and PLs do not take enough care of it (currently the APL,

COBOL, C, POSIX, ADA and FORTRAN communities have dealt with such issues to

a certain point, and most others have not at all; those who did something

did not completely do what needs to be done); that may be the main problem.

We lose something if we weaken SC22/WG20 too much, if we do not cancel it.

The PL community needs to be reminded of I18N issues all the time, at least

in Plenaries. The CPL community thinks this would be a big loss and probably

a mistake, at least for this reason. If SC2 takes the lead on most

of SC22/WG20's program of work, programming language i18n will be

neglected even more than today.

 

On the other hand we know that most, if not all, SC22/WG20 experts are

also working in SC2, and SC22/WG20 and SC2 work is already

integrated in some countries, including Canada, due to the small community

of experts. That would continue anyway.

 

There is a need for the i18n community to keep a handle on PL activities.

 

SC22/WG20 needs to reflect on the reshuffling of current and maintenance

work to perhaps have a greater impact on ISO/IEC PL activities. Canada

would be in favour of reexamining all work having this as a goal.

 

Specific work:

 

IS 14651 Sort Standard: can be anywhere... it has to be in SC22 or in

SC2... SC22's advantage would be to keep it more open to PL standards

development. It is certain that SC2's strong collaboration is required,

and if it were in SC2, strong collaboration would also be required from the

PL community.

 

TR 14652 Specification Method for Cultural Conventions: It is indeed POSIX

oriented but the POSIX WG always said it belonged to WG20... The POSIX WG

now has a lot of challenges and it would not be timely to transfer that

project there (to WG15). Perhaps we need to de-emphasize the perception

that this has more to do with POSIX than with PLs, a perception which

should be wrong, otherwise the TR has at least partly failed. Controversial

issues should be removed and the TR should be enhanced in WG20, with strong

collaboration with SC2.

 

ISO/IEC 15435 API standard project: we believe in Canada that an I18N API

standard is required, and that otherwise kitchen-made solutions will rather

tie customers to some developers, which is the opposite goal of

international standards. If the current proposal is not OK, then we should

at least try, in a short-time study with a precise deadline, if at all

possible without annoying intellectual property, to find the commonalities

of what is being done by producers and see if we can make a standard with

it or at least a TR. That would belong in SC22 in Canada's opinion.

 

ISO/IEC 15897 Cultural Registry: Canada believes that this should be

managed in the same way as the Character set Registry with an advisory

group. This advisory group could very well be formed with a mixed

representation from SC22/WG20, SC2 and perhaps SC35 (User interfaces). As

everybody said in the past, what we need is an independent "IBM green book"

(National Language Design Guide volume 2). We believe that this registry is

about this and should be marketed better. This data is required by a lot of

communities in JTC1 and even the electronic commerce standards community

demonstrated an interest in this in the BT-EC report.

 

TR 10176 Programming languages standards guidelines: its annex on

identifier-related characters is in our opinion linked to SC2's interests,

while all the rest belongs to SC22. Again a strong collaboration between

SC22 and SC2 is required. The place for maintaining this appears to be in

SC22/WG20.

 

Other issues: a lot of I18N issues belong to the user interface domain.

This is dealt with in SC35. We should remember that the whole domain of

I18N is a horizontal issue in JTC1 and that cultural and linguistic

adaptability remains a strategic thrust of that super-committee. The TD on

CLAUI will hold a meeting in Southern France in October. That should be an

opportunity for SC2 and SC22's convenors and editors to reflect on all

these problems and come up with a plan for the reshuffling of those

activities in the whole of JTC1, but more particularly in SC22/WG20 and SC2.

 

 

 

USA, Ken Whistler, September 5, 2000

 

Many thanks to Alain for providing a timely report of the deliberations

on this topic at the CAC/SC22 meeting.

 

I have a few observations on some of the conclusions that Alain

and his colleagues reached.

 

> Outcome of our Québec 2000-09-05 CAC/SC22 meeting on this issue

>

> Generally speaking: I18N is a fundamental requirement on programming

> languages (PL) and PLs don't take care enough about it (currently APL,

> COBOL, C, POSIX, ADA and FORTRAN communities have dealt with such issues to

> a certain point, and most others have not at all; those who did something

> did not completely do what needs to be done), that maybe the main problem.

> We lose something if we weaken SC22/WG20 too much, if we do not cancel it.

> I18N issues need to be reminded all the time to the PL community at least

> in Plenaries. The CPL community think this would be a big loss and probably

> a mistake at least for this reason. If SC2 takes the lead of most

> of  SC22/WG20's program of work, programming language i18n will be

> neglected even more than today.

 

My main concern here is that the problem of internationalization

is somewhat misconstrued as a "requirement on programming

languages." Some of the deep trouble that WG20 is in is the result

of attempting to conceive internationalization as being in the

domain of formal programming languages, and thereby setting up an

agenda to create standards that can be grafted back onto a whole

host of PL's. This is, I am afraid, bound to fail, since it is taking

an inherently complex field, full of user-specific cultural behavior,

and trying to find a way to bolt on extensions to existing programming

language standards (some of them *very* old, like COBOL and FORTRAN)

to deal with it.

 

I, instead, see the *appropriate* adaptation of the programming

languages to consist essentially of making sure they interoperate

with 10646 data and program text, since that seems to be the way

the world is heading. This should consist of specifying that the

languages will work with UTF-8 (as well as Shift-JIS, or whatever)

as program text, and to allow arbitrary textual content into comment

fields, for example. And the C standard has adjusted the definition

of wchar_t to take UCS into account already.
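
As a small illustration of what "interoperating with 10646 data" in more
than one encoding can mean in practice, here is a hedged sketch in Python
(the language choice, the helper name round_trips and the sample string are
assumptions made for this example, not taken from the message). It checks
that the same text survives a round trip through both UTF-8 and Shift-JIS,
even though the byte sequences differ.

```python
# Sample text: three kanji followed by ASCII, written with escapes so the
# example does not depend on the encoding of this file.
SAMPLE = "\u65e5\u672c\u8a9e text"


def round_trips(text: str, codec: str) -> bool:
    """Return True if text can be encoded with codec and decoded back unchanged."""
    return text.encode(codec).decode(codec) == text


if __name__ == "__main__":
    for codec in ("utf-8", "shift_jis"):
        encoded = SAMPLE.encode(codec)
        print(codec, len(encoded), "bytes, round trip ok:", round_trips(SAMPLE, codec))
```

The point of the sketch is only that the program works on characters and
treats the encoding as an explicit, swappable boundary.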

 

*Maybe* some adaptations to extend identifier syntax should be

allowed -- but that would depend on the language. (For example,

there really is no point in messing with FORTRAN in this way.)

 

Otherwise, most internationalization extensions for PL's are just

asking for trouble, if they weren't designed in from the start.

 

Does that mean I don't care about internationalization? Not at all.

It is just that I am convinced it is a *software design* issue,

and not a programming language issue at all.

 

The very best internationalized software I *ever* worked on was

the Metaphor Data Interpretation System. It was Unicode-based,

had multiple language support, both for user messages and for

all aspects of the GUI, including complete dynamic forms generation

that adjusted all graphic objects to the translated text. It

supported localization formatting hierarchies, from individual

cells in spreadsheets, through applications, through a user's

desktop preferences, to network system settings. It had provisions

for user-settable locale-specific collations.

 

Now how did Metaphor do such a thing? Did it depend on internationalization

in the programming language? Hardly. The entire system was programmed

using C -- but all direct OS calls were forbidden (you had to go

through a strictly controlled set of Metaphor kernel routines, so that the

system architects could guarantee portability and stability), and

likewise all locale-related library calls were also forbidden (so that

the Metaphor system was not inexplicably sensitive to differences

in machine set-up that could not be filtered through explicit

user preferences).
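
The design described above, with formatting preferences resolved through a
hierarchy and no reliance on locale-related library calls, could be
sketched roughly as follows. This is an illustrative Python sketch under
stated assumptions: the layer names, preference keys and helper functions
are hypothetical and are not taken from the Metaphor system, which was
written in C.

```python
from collections import ChainMap

# Hypothetical preference layers; keys and values are illustrative only.
network_defaults = {"decimal_sep": ".", "group_sep": ",", "date_order": "MDY"}
desktop_prefs = {"decimal_sep": ",", "group_sep": "."}
application_prefs = {}
cell_prefs = {"group_sep": " "}


def resolve_preferences(*layers: dict) -> ChainMap:
    """Combine layers so that earlier (more specific) layers take precedence."""
    return ChainMap(*layers)


def format_number(value: float, prefs) -> str:
    """Format value with two decimals using only the preferences passed in.

    No global locale state is consulted, so differences in machine set-up
    cannot change the result.
    """
    text = f"{value:,.2f}"  # fixed ASCII separators, independent of locale
    integer_part, fraction = text.split(".")
    integer_part = integer_part.replace(",", prefs["group_sep"])
    return integer_part + prefs["decimal_sep"] + fraction


if __name__ == "__main__":
    prefs = resolve_preferences(
        cell_prefs, application_prefs, desktop_prefs, network_defaults
    )
    print(format_number(1234567.891, prefs))
```

In this sketch the cell-level setting overrides the grouping separator, the
desktop setting supplies the decimal separator, and anything not set at a
more specific level falls through to the network defaults.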

 

Nothing in WG20's program of work related to providing standard

extensions (API's or whatever) for PL's would have helped Metaphor

one whit in that regard -- all such extensions would also have been

tossed so that the system architects could do the software design

that they wanted. C was just treated as it should be -- as a general

purpose programming language widely available on multiple machine

platforms. After that, it is up to the system architects and

software designers to do what they need to do, using the general

purpose programming language as a basic tool for instantiating

algorithms on real machines.

 

Does this mean that internationalization should *never* be a part

of a PL? Well, no -- just that if you want to do that, it needs

to be carefully built into the language by formal language

designers, and preferably from the very start. Java is the

best example we have to date of a language done this way, and even

that has significant flaws. But creating a "standard" for

internationalization, and then telling all the language committees

to add it to their languages so they will better support

internationalization, is just a recipe for failure. That is why

I have been so opposed to the proposed API standard, 15435.

 

>

> ISO/IEC 15435 API standard project: we believe in Canada that an I18N API

> standard is required, and that otherwise kitchen-made solutions will rather

> tie customers to some developers, which is the opposite goal of

> international standards.

 

You are *always* tied to some developers. Someone has to implement

the behavior behind an API, whether you make it an international

standard or not. If you make a particular internationalization API

an international standard and then succeed in getting one or more

language committees to graft it onto their formal language standard,

then all you have managed to do at that point is to push the problem onto

the developers at Microsoft, Symantec, IBM, Borland, Sun, and the gnomes

maintaining Gnu C who then have to implement those extensions. And

customers will in turn be tied to those developers when they use

those tools.

 

If you don't mandate a particular API in an international standard

connected to a programming language, then *other* developers will come

forth with class libraries and components to do internationalization.

And yes, if a customer chooses to use them, they will be "tied" to

some particular developers. But guess what -- those developers are

going to be offering class libraries and components whether or not

an ISO I18N API standard is ever created -- and customers have been

and will continue to be choosing such libraries to accomplish what

they need to do in their applications.

 

If the worry is that this is all too chaotic, and standards-making

would lead to better interoperability in this area, I would argue

that at this point this is a little bit like trying to sweep back

the sea from the beach. Everyone would be better off if those of us

who care about internationalization and interoperability worked at

providing usable, reliable resource lists online for developers to

depend on in building class libraries and components.

 

As an aside, that is why I brought Graham Rhind's address resource

as an exhibit to the Denmark meeting -- it shows the kind of information

collection, compiling, and publication that is actually useful progress

in dealing with internationalization. Instead of just pooh-poohing the

inevitable mistakes in any compilation of that scale, and then

dismissing the effort, it would behoove those involved in international

standards in this area to ask themselves why internationalization

software engineers are immediately attracted to such compilations

as useful for their work, but show little interest in developing

standards for an "internationalization API".

 

Why does the IBM Green Book get an honored place on the shelf of

every internationalization engineer in the UTC, while the

proposed API for 15435 gets laughed at?

 

> If the current proposal is not OK, then we should

> at least try, in a short-time study with a precise deadline, if at all

> possible without annoying intellectual property, to find the commonalities

> of what is being done by producers and see if we can make a standard with

> it or at least a TR. That would belong in SC22 in Canada's opinion.

 

There *is* no commonality at the API level. A C library is different

from a C++ library is different from a Java class library is

different from a software component like a Java Bean. How these

things are structured is a matter of software design which WG20

is ill-equipped to handle -- and which, for that matter, the

PL standards committees are also not prepared to deal with.

 

The commonality is in the set of problems that people are trying to

solve in software and the kinds of data they need to generate the

tables for parsers, formatters, renderers, converters, transliterators, and

translators.

 

In any case, I would urge WG20 to *first* do a market relevance

study *before* starting down the road to do some project to compare

all the commercial internationalization libraries looking for

commonalities that could be turned into a standard. If the market

is not clamoring for a standard in this area, then JTC1 should not

be laboring to produce a standard whether it will be used or not.

 

--Ken

 

 

Sweden, K.I. Larsson, September 5, 2000

 

 

Arnold,

 

I completely agree with your views in your Aug. 30 contribution. Are you

going to add the issue to the SC2 Plenary agenda?

 

Incidentally it seems we foresaw this development here in Sweden, since we

decided earlier this year to transfer responsibility for SC22/WG20 matters

from its traditional Swedish WG into our Character Set WG.

 

Best regards, and see you in Athens!

 

KI

 

 

USA, Ken Whistler, September 6, 2000

 

From: Kenneth Whistler [kenw@sybase.com]

Sent: Friday, September 01, 2000 4:40 PM

Subject: Some technical issues regarding the future of SC22/WG20

 

================================================================

 

Arnold Winkler has recently raised a number of issues regarding the future

of SC22/WG20 and the standards that it maintains or has under

development, for consideration at the upcoming SC22 plenary in Nara.

Chief among the issues he raised is whether WG20 is now at the

end of its useful life, and whether it should be sunsetted, with

its various projects redistributed over time to other committees as

appropriate for maintenance.

 

I want to review some of the technical issues that may have a bearing

on where such maintenance should be done, and to further consider

whether some of the projects currently under development in WG20

have enough technical merit to warrant their continuation in some

other committee, should WG20 itself be dissolved sometime in the

not-so-distant future. (Presumably any such dissolution would be

judiciously staged, over a 1-to-2 year period, to allow completion,

termination, or transfer of responsibilities, as appropriate.)

 

The charter of WG20 was fairly broad: standards in the area of

internationalization, as reflected in the first published TR

developed by WG20: TR 11017, "Framework for internationalization".

However, the committee has, in recent years, focused on a few

significant areas, so I will concentrate my comments on those areas

that have, de facto, constituted the majority of WG20's work.

 

1. Collation

 

WG20 developed ISO 14651, soon to be approved and published as an

international standard. This standard needs an immediate

amendment, to deal with the larger repertoire of characters added

for 10646-1:2000 (= Unicode 3.0). The question arises as to the

appropriate venue for that maintenance, if not WG20. The alternatives

being argued are SC22 or SC2.

 

This issue is actually rather easy to resolve on technical grounds. The

character-related expertise in SC2, and in particular in SC2/WG2

(maintainer of ISO 10646) is exactly what is needed to be able to

do the extensions of the tables required for ISO 14651. And that is

in fact the main work that will need to be done for 14651 maintenance.

The architecture for string ordering in 14651 is complete -- 14651 is

just in need of extension of the weights listed in the tailorable

template table, to keep up with the continual additions of characters

to 10646. The best way to accomplish that is to keep that standard

with the committee that actually does the additions of the

characters -- they know what the characters are and would best be

able to do timely coordination of updates for a related standard that

needs to add those characters to its tables.
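
The maintenance task described above can be pictured with a small
sketch. This is only an illustration under stated assumptions: the
weights, the three-level layout, and the names TEMPLATE, tailor, and
sort_key are invented here and are not taken from ISO 14651, which
defines its own table syntax and weight levels.

```python
# Toy multi-level weight table. The weights are invented for this sketch
# and are NOT taken from ISO 14651.
A_UML = "\u00e4"  # LATIN SMALL LETTER A WITH DIAERESIS

TEMPLATE = {
    "a": (1, 0, 0),
    A_UML: (1, 1, 0),   # same level-1 weight as "a", differs at level 2
    "b": (2, 0, 0),
    "z": (26, 0, 0),
}


def tailor(template, overrides):
    """Return a copy of the template with local overrides applied."""
    table = dict(template)
    table.update(overrides)
    return table


def sort_key(text, table):
    """Compare all level-1 weights first, then level 2, then level 3."""
    weights = [table[ch] for ch in text]
    return tuple(tuple(w[level] for w in weights) for level in range(3))


WORDS = ["za", A_UML + "b", "ab", "b"]

# Untailored template: a-umlaut sorts next to "a".
default_order = sorted(WORDS, key=lambda w: sort_key(w, TEMPLATE))

# Hypothetical Swedish-style tailoring: a-umlaut sorts after "z".
SWEDISH_LIKE = tailor(TEMPLATE, {A_UML: (27, 0, 0)})
tailored_order = sorted(WORDS, key=lambda w: sort_key(w, SWEDISH_LIKE))
```

In this picture the ordering logic never changes; keeping up with
repertoire additions amounts to adding rows to the template table.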

 

Furthermore, among the active participants in WG2 are the experts

on collation (with implementation experience) who actually ended

up authoring much of the content of 14651. Comparable experience is

not obviously available in the SC22 committees other than WG20.

Furthermore, because of the current close working relationship

between WG2 and the Unicode Technical Committee, WG2 is also the

best place to maintain a standard that should stay in synch with

the Unicode Collation Algorithm maintained by the UTC, to prevent

unanticipated "drift" between the two standards.

 

2. Locale Extensions

 

WG20 is developing TR 14652, "Specification Method for Cultural

Conventions". The specifications defined in 14652 are very closely

modeled on the definition of locale in ISO 9945, the POSIX standard,

and as reflected in related documentation such as XPG4 from X/Open.

In effect, it was conceived of as an extension to the locale

constructs: to add more internationalization elements, as mentioned

in TR 11017, into a formal syntactic construct that could be used

to generate machine-readable locale definitions. So it adds

definitions for LC_NAME, LC_ADDRESS, LC_IDENTIFICATION, etc. to

the older groupings LC_COLLATE, LC_CTYPE, LC_MESSAGES, LC_MONETARY,

LC_NUMERIC, and LC_TIME. Furthermore, it attempts to extend the

preexisting categories with new keywords to deal with collation

as defined in 14651, with the new large character set defined in

10646, and new internationalization issues such as monetary

formats involving the euro sign.
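
For readers unfamiliar with the older locale groupings named above,
here is a minimal sketch using Python's standard locale module, which
wraps the C library's locale support. The helper names
available_categories and numeric_conventions are hypothetical, and the
sketch only touches the preexisting categories, not the extensions
proposed in 14652.

```python
import locale

# The older POSIX category names listed in the text above.
POSIX_CATEGORIES = [
    "LC_COLLATE", "LC_CTYPE", "LC_MESSAGES",
    "LC_MONETARY", "LC_NUMERIC", "LC_TIME",
]


def available_categories():
    """Return the category names this platform's locale module exposes."""
    return [name for name in POSIX_CATEGORIES if hasattr(locale, name)]


def numeric_conventions(locale_name="C"):
    """Read the number-formatting data stored under LC_NUMERIC."""
    saved = locale.setlocale(locale.LC_NUMERIC)  # query current setting
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
        conv = locale.localeconv()
        return {
            "decimal_point": conv["decimal_point"],
            "thousands_sep": conv["thousands_sep"],
        }
    finally:
        locale.setlocale(locale.LC_NUMERIC, saved)  # restore
```

Calling numeric_conventions("C") reads the data for the portable "C"
locale; any other locale name depends on what is installed on the
system.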

 

It is pretty clear that the impetus and rationale for 14652 derive

from the POSIX side. As such, it logically belongs in SC22/WG15 for

further development, rather than in SC2. The participants in SC2,

while interested in internationalization issues related to locales,

have no particular interest or expertise in the POSIX-specific

syntax extensions covered by 14652, nor do they have any expertise

in ISO 9945 itself, which has to be closely tracked in the development

of 14652, to avoid superfluous inconsistencies. SC2 also has no

established history of working liaison relationships with SC22/WG15--

a situation which would bode ill for trying to develop what is

effectively a POSIX extension in a committee ill-suited to do so.

 

3. Character Properties

 

The most contentious issue regarding DTR 14652 is the effort to

extend LC_CTYPE to cover the repertoire of ISO 10646-1. The contending

positions effectively reflect a worldview divide among the participants

regarding character properties:

 

Position A: Character properties have not traditionally been covered

by character encoding standards, and have not been viewed as the

domain of the ISO committee responsible for encoding characters: SC2.

Instead, character properties are an implementation issue, traditionally

dealt with in the standards most directly concerned with character

implementation -- namely the formal language standards -- and are

dealt with in ISO by the working groups under SC22. In the context

of 14652, the appropriate place to define character properties is

LC_CTYPE, where the properties would be usable in a POSIX context as

part of locale definitions.

 

Position B: Character properties for the *universal* character set --

namely ISO 10646 (= Unicode) are inherent to *characters*, and should

*not* be defined in locales. The locale model and LC_CTYPE were an

attempt to provide a mechanism for dealing with properties of characters

in alternate encodings, but that model does not scale well for dealing

with properties for the universal repertoire of 10646. Furthermore,

it is inappropriate to assert that character properties are defined

in locales, and are thus subject to locale-specific variation, since

such a position would lead to inconsistent and inexplicable differences

in application behavior, depending on locale, in ways that have

no bearing on the usually understood issues of locale-specific

formatting differences, etc. Because character properties are closely

tied to the characters themselves, responsibility for defining them

should belong with the character encoding committees, rather than

with the language committees -- and thus in SC2, rather than SC22.

 

It is clear that among the rather large community of implementers

of 10646 (= Unicode), Position B has much more widespread support

than Position A. Position A is, however, a vocally held minority

opinion among those committed to the extension of the POSIX framework.

 

In point of actual fact, the *real* work on standardization of

10646 character properties is being done almost entirely

by the Unicode Technical Committee, which for years now has been

publishing machine-readable tables of character properties and

associated technical reports that are in widespread implementation

in many products. A very few character properties, most notably

"combining" and "mirroring", are also formally maintained by SC2/WG2 in

ISO 10646 itself, and those properties are tracked in parallel by

the UTC.

 

On balance, it would seem far preferable to conclude that within

JTC1 any responsibility for character properties should belong

to SC2, rather than SC22. Once again, this is a matter of expertise

regarding the huge number of characters in 10646. That expertise

is in SC2, and not in SC22. And the implementation experience

regarding character properties resides in the UTC, which has a

firm working relationship with SC2, but no close ties to SC22.

 

Regarding LC_CTYPE in particular, the maintenance or extension of

LC_CTYPE should be remanded to WG15, along with all of DTR 14652,

but with the following recommendations: Rather than attempting to

independently extend LC_CTYPE definitions to cover 10646, a mechanism

should be developed whereby POSIX implementations using LC_CTYPE

can make use of the more widespread and better researched and

reviewed character property definitions developed by the UTC, in

cooperation with SC2/WG2's development of 10646. This should be

done by *reference*, rather than by enumerating lists of characters

in SC22 standards or TR's, because of the danger of those lists

getting out of synch or introducing errors that cause interoperability

problems. Furthermore, this practice of dealing with character

properties by reference to UTC and/or SC2 developed standards

for them, should be recommended to *all* the SC22 committees, as

the generic way to deal with character properties in formal

language standards.
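
A minimal sketch of what "by reference" can look like in practice,
assuming Python's standard unicodedata module, which is generated from
the Unicode Character Database. The helper name describe is
hypothetical. The lookup takes no locale argument, which matches the
view in Position B that these properties belong to the characters
themselves.

```python
import unicodedata


def describe(ch):
    """Look up a few properties of one character from the bundled tables."""
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),     # general category
        "combining": unicodedata.combining(ch),   # canonical combining class
        "mirrored": bool(unicodedata.mirrored(ch)),
    }


# Version of the Unicode data the tables were generated from.
DATA_VERSION = unicodedata.unidata_version

capital_a = describe("A")
acute = describe("\u0301")      # COMBINING ACUTE ACCENT
paren = describe("(")
```

Because every caller reads the same tables, there is no second list of
characters to drift out of synch.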

 

4. Internationalization API Standard

 

WG20 has a project on the books, 15435, to develop an API standard

for internationalization. To date, there has been very little

evidence proffered that there is any actual demand for such a

standard. There is no list of IT companies requesting it to solve

some interoperability problem. The big OS and tools vendors are not

requesting it. The Linux internationalization community has rejected it

in favor of other options. The Java community has no interest -- they

already have a sophisticated internationalization architecture. The Unicode

Technical Committee, which has very widespread representation from

the implementing community, has indicated zero interest in the

15435 project.

 

No one in WG20 but the project editor seems to be doing any active

work to develop the API standard for internationalization, and the

committee feedback to date has largely been that the quality of

the drafts is poor. Fundamental questions regarding the nature

of the API design have not been resolved. Furthermore, there has

been a lot of hand-waving over the issue of how closely tied the

proposed API is to the locale extension constructs of DTR 14652.

The API under development for 15435 is locale-centric, in that

it requires information in an "FDCC-set" defined a la DTR 14652,

assuming API behavior will depend on that information, resident

in some implementation-defined "database".

 

Modern internationalization libraries have largely eschewed that

kind of locale-centric design as too constrained, instead breaking up

the problem of internationalization support into more modular

designs that separate out different aspects of the problems

involved.

 

Furthermore, the proposed API standard aspires to platform

independent design. That, however, inappropriately conflates the

issue of designing appropriate behavior for internationalization

with the problem of designing appropriately abstracted API's

for that behavior on distinct platforms. In actual practice,

implementers are tending to make use of available libraries that

surface correct internationalization behavior (such as the

ICU classes) and then writing whatever wrappers are necessary to

abstract that behavior into their systems. The days of trying

to define complex behavior via ISO API standards, to be rolled

out by language compiler vendors in standard C libraries and such,

are being overtaken by object-oriented design and software

component models.

 

At this point, WG20's project 15435 should just be abandoned as

a well-intentioned but obsolete project that has no demonstrated

need or support for its development.

 

5. Cultural Registry Standard

 

WG20 is also charged with the maintenance of the cultural registry

standard, ISO 15897. That registry needs a firm review and

resolution process to ensure its correctness and market relevance.

WG20 should be able to provide the definition of such a resolution

process, along the lines provided by ISO 2375 for the character

set registry. Once the review is done, and ISO 15897 has been

appropriately updated, it should be a stabilized standard, requiring

little further work or attention.

 

It will then be the responsibility of the registering agency (DKUUG)

to follow the registration process and to make the cultural element

registry worthwhile.

 

6. Identifiers

 

An issue that WG20 has had to deal with fairly recently is the

list of recommended characters for identifiers, in Annex A of

TR 10176, "Guidelines for the preparation of programming

language standards". Because the list of recommended characters

for identifiers is based on the repertoire of ISO 10646, this

is another area where repeated maintenance into the future can

be foreseen, as the repertoire of 10646 continues to expand.

 

Once again, because of the location of character expertise regarding

all the characters added to 10646, the logical source for recommendations

about how to extend the list in Annex A in the future is SC2. This

is supported by the additional fact that determination of which

characters are and are not appropriate in identifiers implicitly

depends on specification of a constellation of properties

for those characters -- again an area in which the expertise is

located in SC2.

 

However, there is somewhat of a conundrum here, since the remainder

of the content of TR 10176 is clearly in the domain of SC22, and the

TR as a whole is inappropriate for maintenance in SC2. Perhaps

some kind of understanding could be arranged between the SC's

to guarantee that modifications to Annex A of TR 10176 should only be made

with timely, coequal input from SC2.

 

A better solution, in the long run, would be to sever the contents

of the exact table in Annex A, which has to track character repertoires

and properties that are (or should be) the responsibility of SC2,

from TR 10176 per se, and instead insert a reference there to a

standard list maintained by SC2, either in the context of 10646

itself or in some associated TR to be developed by WG2 for this

purpose. That would more appropriately divide the responsibilities

for the part of TR 10176 associated with formal language syntax

and design and the part which is attempting to track the universal

character encoding repertoire as it expands over time.

 

Another reason for moving in this direction is the particular interest

that the Unicode Technical Committee has in the identifier content

problem. The Unicode Standard has detailed recommendations regarding

identifiers, and the Unicode Technical Committee is currently working

on even more detailed specifications regarding identifiers and

identifier-like constructs for use in various contexts on the Worldwide

Web and the Internet. It is in JTC1's interest to keep this particular

technical issue active in a venue, namely SC2/WG2, where the character

encoding expertise is available and the working relation with the UTC

is strong. Even though on the surface it might seem that programming

identifier syntax clearly belongs to SC22, the real issue is not the

syntax per se (which is quite simple), nor the concept of an identifier

and its relation to other programming language constructs (which the

UTC and SC2 have little interest in and consider to be long ago

fixed and decided by the SC22 standards). No, the *real* issue that

remains open and problematical is how to classify and distribute all

the thousands of additional characters in 10646, and how to deal

with the complex ramifications of inclusions of various compatibility

characters which may or may not change under various kinds of

identifier normalization processes. That is where the UTC and WG2

expertise would be most helpful, and where joint development of

Unicode and ISO standards would be most likely to minimize

interoperability problems for identifiers in different programming

languages and Internet and Web protocols.
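
The interaction between identifier syntax and compatibility characters
can be sketched as follows. This uses Python only as an example of one
language that bases its identifier rules on Unicode properties; the
helper names normalize_identifier and is_identifier are hypothetical,
and nothing here reproduces Annex A of TR 10176.

```python
import unicodedata


def normalize_identifier(name):
    """Apply NFKC, which replaces compatibility characters."""
    return unicodedata.normalize("NFKC", name)


def is_identifier(name):
    """Check the name against Python's own Unicode-based identifier rules."""
    return name.isidentifier()


# U+FB01 LATIN SMALL LIGATURE FI is a compatibility character:
# under NFKC it turns into the two letters "f" and "i".
ligature_name = "\ufb01le"
plain_name = normalize_identifier(ligature_name)

# U+FF21 FULLWIDTH LATIN CAPITAL LETTER A normalizes to plain "A".
fullwidth_a = normalize_identifier("\uff21")
```

Two spellings that look alike to a user are only treated as the same
identifier if the language says which normalization, if any, applies.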

 

This entire issue is, by the way, also of intense interest to

the Database standards arena, where it is of direct relevance

to the SQL standard, for example. So the SC22 working groups are

not the only JTC1 groups with an interest in standard,

interoperable results in this area for 10646 characters.

 

7. Case Mapping and Case Folding

 

WG20 has not spent much time dealing with case mapping and case

folding issues, although those clearly have an internationalization

angle, because of local differences in case mapping preferences.

 

The one point where this has been dealt with by WG20 is in the

LC_CTYPE specification in DTR 14652. This is because LC_CTYPE is

the location of the information used by the tolower() and toupper()

case mapping transforms for C (and by extension, other languages).

As a result, PDTR 14652 includes tables of case pairs for all

of the 10646 characters that have case pairs.

 

However, the inclusion of these case mappings explicitly in the

"i18n" LC_CTYPE definition in DTR 14652 has been controversial in

the committee, in part because of a small number of unexplained

inconsistencies between those tables and the case mappings provided

by the Unicode Consortium on its website. The Unicode case mappings

are very widely implemented in many products, and are being treated

by the industry as a de facto standard. So it is problematical for

DTR 14652 to be proposing slightly different case mappings for

a standards document that contradict widespread practice.

 

This is once again an area where the JTC1 standards arena would be

better served by using references to de facto practice, rather than

trying to reinvent the wheel with long lists in other standards or

TR's, subject to the introduction of error or drift that can

introduce interoperability problems. Perhaps here the SC22 language

working groups could work with SC2/WG2 to find a way to get the

de facto Unicode tables to be referenceable through an SC2 TR of

some sort, to avoid the synchronization issues of trying to maintain

two (huge) lists separately.

 

The area of case folding is related to case mapping, but is subtly

different. WG20 has not dealt with this issue, but it is clear

that SC22 language working groups need to deal with this. In particular,

COBOL, Pascal, and other languages that have case-insensitive

identifiers, need to be able to do reliable case-folding during their

parsing/lexing phases of program text interpretation. For that, they need

reliable definitions of case-folding as applied to 10646 characters

for the domain of characters allowed inside identifiers for each

language.

 

While WG20 has not touched on this issue and the SC22 working groups

are starting to search for an answer, the Unicode Technical Committee

and the IETF have moved ahead, creating de facto solutions that will

see widespread implementation in the near future.

 

The Unicode Technical Committee has already published CaseFolding.txt, a

machine-readable file with recommendations on exactly how to do

case-folding for all Unicode 3.0 characters (i.e. 10646-1:2000 characters).

The SC22 committees should be reviewing that file, and the associated

case mapping information available in UnicodeData.txt and in

SpecialCasing.txt -- also available on the Unicode website -- before

concluding that new standardization efforts need to be initiated in

SC22 (whether in WG20 or in other working groups), to repeat the

work involved in creating those files, which are already freely available

to all implementers.
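
The difference between case mapping and case folding can be shown with
a short sketch. It assumes Python's built-in str.casefold(), which
implements Unicode case folding; the helper name same_ignoring_case is
hypothetical, and the sketch does not read CaseFolding.txt directly.

```python
def same_ignoring_case(a, b):
    """Compare two strings caselessly using case folding."""
    return a.casefold() == b.casefold()


SHARP_S_WORD = "Stra\u00dfe"   # contains U+00DF LATIN SMALL LETTER SHARP S
UPPER_WORD = "STRASSE"

# Case mapping to lowercase leaves the sharp s alone, so the two
# spellings still differ after lower().
lower_matches = SHARP_S_WORD.lower() == UPPER_WORD.lower()

# Case folding maps the sharp s to "ss", so the spellings compare equal.
fold_matches = same_ignoring_case(SHARP_S_WORD, UPPER_WORD)

# Folding also unifies GREEK SMALL LETTER FINAL SIGMA with plain sigma.
final_sigma_folded = "\u03c2".casefold()
```

A parser for a language with case-insensitive identifiers would apply
this kind of folding, restricted to the characters that language
allows in identifiers.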

 

The UTC and the IETF are currently working on the even thornier

problem of determining how best to define identifiers in a context

(such as internationalized domain names) where certain characters

are disallowed (such as punctuation that has other reserved uses in

URL syntax), where case folding is required, where normalization of

data is also required (disallowing of equivalent sequences that might

otherwise appear identical), and where even visual look-alikes of

otherwise different characters are to be avoided if possible because

of the confusion they can pose for user entry and the possibility

of spoofing. This is an area where intimate knowledge of all the

characters in 10646 and their interaction of properties and appearances

is required. Yet again, it would behoove the SC22 working groups

to participate in the joint UTC/IETF effort in this area through

review and feedback, rather than trying to reinvent the wheel in

a committee context where less relevant expertise would be available

to start with.

 

 

 

Germany, Marc Küster, September 6, 2000

 

Dear Alain,

 

> There is a need for the i18n community to keep a handle on PL activities.

>

> SC22/WG20 needs to reflect on the reshuffling of current and maintenance

> work to perhaps have a greater impact on ISO/IEC PL activities. Canada

> would be in favour of reexamining all work having this as a goal.

>

 

IMHO Canada is quite right on this emphasis.

 

That said, SC22/WG20 would benefit from an increased "customer focus". But

who are WG20's direct customers? Not necessarily enterprises or

individuals, but first and above all the other PL (+ POSIX) working groups

who need to consider i18n.

 

That has happened, but in my personal experience rather without WG20's

direct involvement. E. g., let's have a look at the new i18n features of

C++ with its intelligent facet mechanism. I cannot remember that these

extensions to the C++ standard library have ever been discussed in the

context of WG20's own API standard. (I'll gladly stand corrected if these

have been an issue in the past, prior to my personal involvement).

 

On a national level, i. e. in our national SC22 mirror committee, we have

decided to look into i18n features of the different programming languages,

just as before we have studied different OO-techniques (for, while many

PLs nowadays claim to be object-oriented, the differences between the

realizations are significant).

 

This is a kind of work that would have to be performed before WG20 rushes

at an API standard that, as Ken rightly points out, would be best ignored

if it is made without taking into account what has been done elsewhere --

especially in Java. Even then, it is doubtful if a formal WG20 standard is

needed at all.

 

This kind of work can largely be performed online, making a significant

reduction in WG20's meeting schedule feasible.

 

 

> Specific work:

>

> IS 14651 Sort Standard: can be anywhere... it has to be in SC22 or in

> SC2... SC22's advantage would be to maintain the thing open to PL standards

> development more. It is sure that SC2's strong collaboration is required,

> and if it were in SC2, strong collaboration would also be required from the

> PL community.

>

 

Agreed. It would be best, however, to keep 14651 located within the SC22

framework. Standards are not only developed by individuals or individual

working groups. They are firmly bound to an organizational structure --

and that is SC22 and, in many countries, its national counterparts.

 

      Best regards,

 

            Marc

 

 

WG20 convenor, Arnold Winkler, August 30, 2000

 

Personal thoughts about the future of SC22/WG20 - Internationalization

for consideration by the SC22 plenary in Nara

From:  Arnold F. Winkler (convenor)

Date:  August 30, 2000

 

 

The following contribution to the SC22 plenary holds my very personal thoughts about the work of SC22/WG20 (Internationalization) and what I see as the best way to serve the programming language community in SC22. 

 

 

I think it is time to wrap up WG20's life. 

WG20's most important work will hopefully be completed this fall:

* making the world aware of I18N in TR 11017

* carrying ISO 10646 and I18N into programming languages in TR 10176

* establishing a culturally correct sorting method for ISO 10646 encoded data in IS 14651

 

When we started working on these projects, and when we asked for projects for a cultural specification standard and an API standard, mainly using POSIX syntax, there was no other method on the market.  This is no longer true: object orientation, Java, the web, W3C, Linux, and even Microsoft's I18N have changed the playing field forever.  WG20 "inherited" the CEN registration for cultural conventions as IS 15897, once again a bit late for the modern languages and implementations.

 

In my (and the US) opinion, WG20 should not do much more new development work.  It could go away totally, when the sort standard is approved and when we have found good homes for the maintenance of the completed work.

 

I would not touch TR 11017, unless somebody makes a comprehensive contribution that covers the full extent of I18N technologies and requirements as presented in the marketplace today.  The web, the proliferation of ISO 10646, access technologies for disabled persons in all countries - these are subjects that could, but don't NEED to be addressed in TR 11017, in case somebody has the time, resources, and interest to do a revision.

 

TR 10176 is fine; the amendments due to the extended character repertoire (Annex A) could easily be done by SC2.  That's where the experts are.

 

IS 14651, the sort standard, will also need amendments once it is approved, to keep up with the repertoire additions in ISO 10646.  Again, it is the maintenance of the table and could/should be done by SC2.

 

The cultural elements stuff (specification, API, registry) is in my opinion outdated and most likely almost unnecessary.  With lots of input from the US (Ken Whistler), and valuable additions from Japan (Takata), both 14652 and 15435 will get new drafts before the meeting in November in Malvern, Pennsylvania.

 

ISO/IEC 14652 is now a TR, and could be useful to the specific group it was defined for.  However, the US is only interested in ensuring that compliance with this document is never a requirement for modern programming languages, such as Java.

 

Project 22.15435, the API standard, should be withdrawn.  There is no interest in the user community and the project has not seen a ballot document for 3 years. 

 

One concern is the registry ISO/IEC 15897 - DKUUG is the registration authority.  I believe that no real standards work is needed, but good registration procedures need to be established.  We are currently looking into the SC2 registration process for character sets - ISO 2375 is being distributed to WG20 as a template for a working process with all the ingredients: submission process (who - individuals, companies, NBs), review process (who, time), resolution of difficulties, etc...  If we can get this set up correctly, the registry will be helpful, especially if it can be made available on the web. 

And any additional work would be related to character properties - much better located in SC2 where we find all the experts.  We had a short discussion in the last meeting and I was told that I had "no vision" for new work.  I guess this is right, but nobody else came up with anything that fit into the WG20 scope either.  The UK pushed transliterations, the WAP pictograms came up, and user interfaces - none of these is within the knowledge base of WG20, and other subjects are already placed in other WGs in JTC1 or ISO or other SDOs.

 

There will be a meeting of the Technical Direction (CLAUI) for cultural and linguistic adaptability and user interfaces - October 19-20 in France.  I will NOT be able to go there to represent WG20.  This would be the best place to find competent homes for the maintenance of the WG20 completed work and agree on the registration process, at least in principle.

 

I would like to see WG20:

* complete the sort standard ISO 14651

* find home(s) for the maintenance of its completed work (TR 11017, TR 10176, and sort), preferably in SC2

* agree on registration processes for the registration of cultural elements in ISO/IEC 15897 by adjusting the ISO 2375 process

* move the project TR 14652 for the specification of cultural conventions to SC22/WG15

* withdraw ISO 15435, the API standard

* and go out of business in about 1 1/2 years.

 

This would mean for SC22:

* Agree with this plan in principle

* Encourage WG15 to take over TR 14652

* Withdraw project 22.15435

* Ask SC2 for specific support in the complex issues of character properties as they apply to identifiers in programming languages

* Move the maintenance of IS 14651 and TR 10176 to SC2 (provided SC2 agrees, e.g. at the CLAUI meeting)

* Dissolve SC22/WG20 when all above items are completed and the registry is operational.

 

 

Best regards

Arnold

 

 

Norway, Keld Simonsen, September 6, 2000

 

On Wed, Sep 06, 2000 at 02:38:42PM +0200, Marc Wilhelm Küster wrote:

>

> That has happened, but in my personal experience rather without WG20's

> direct involvement. E. g., let's have a look at the new i18n features of

> C++ with its intelligent facet mechanism. I cannot remember that these

> extensions to the C++ standard library have ever been discussed in the

> context of WG20's own API standard. (I'll gladly stand corrected if these

> have been an issue in the past, prior to my personal involvement).

 

We did in WG20 decide (but later reverted) that we wanted a C++ binding,

but we did not explicitly discuss the facet mechanism of C++ for

this. I have gone 2 times to WG21 to discuss the i18n API with them

and one German representative (Dietmar?) promised to help, but later

declined due to lack of time. We have later decided just to do

a C version of the API.

 

Keld

 

 

Canada, Dave Blackwood, September 6, 2000

 

The fact that POSIX has dealt with internationalization issues at all is a

tribute to those involved.  The problems that we have encountered however

are not from a lack of caring but a lack of expertise.  While many

internationalization experts may be willing to devote time to WG20 and/or

SC2, relatively few of them are willing to attend WG15 meetings (and more

importantly IEEE PASC and Austin Group meetings where the real technical

development is done) to explain the issues and help develop the solutions.

It is insufficient to simply have a liaison between working groups whose

primary role is to report what one group is doing that may be of interest to

the other.  We need real, ongoing and substantive involvement.

 

We have also seen many requests for the operating system to fix what are

essentially application problems.  The subtle differences between dictionary

and telephone book sorting across cultures are beyond the functionality that

can be expected from an OS, as is the storage, format, and presentation of

dates very, very far into the past or very, very far into the future as may

be required for astronomical calculations, etc.

 

WG20 could be more effective if it worked to incorporate i18n solutions into

existing PL and OS standards and concentrated less on developing stand-alone

i18n standards that are based on invention rather than existing practice and

consequently are rarely implemented fully by PL and OS vendors.  A C

compiler conforms to the C standard, a POSIX OS conforms to the POSIX

standard, what conforms to an i18n standard?

 

Dave

--

D. J. Blackwood, Chair

Canadian POSIX Working Group

 

 

 

USA, Asmus Freytag, September 6, 2000

 

 

At 12:18 PM 9/5/00 -0400, Alain LaBonté  wrote:

>Outcome of our Québec 2000-09-05 CAC/SC22 meeting on this issue

>

>Generally speaking: I18N is a fundamental requirement on programming

>languages (PL) and PLs don't take care enough about it (currently APL,

>COBOL, C, POSIX, ADA and FORTRAN communities have dealt with such issues

>to a certain point, and most others have not at all; those who did

>something did not completely do what needs to be done),

 

This statement excludes the forward-looking work of languages such as C++

and even more so Java.

 

A more important issue is that while I18n is indeed a fundamental

requirement for PL it is a fundamental requirement for all aspects of IT -

languages, operating systems, applications, data formats, query languages,

markup languages, ....

 

Burying this work in SC22 has the effect of isolating it from all those

fields of application that are not SC22 developed programming language

standards.

 

>If SC2 takes the lead of most of  SC22/WG20's program of work, programming

>language i18n will be neglected even more than today.

 

I'm not sure that I agree. I see a lot of the impetus for strong support

for internationalization go hand in hand with adoption of support for

10646/Unicode. Since the majority of new work on i18n is built upon the use

of 10646/Unicode, it would be natural for those doing the work to look to a

single SC2.

 

 From a JTC1 perspective, the question of where certain work is being done

must address the needs of all of JTC1 and its liaison organizations (such as

IETF and W3C) and not just the needs of a particular SC to motivate its

working groups to get internationalization support added to their

programming language standards.

 

>On the other hand we know that most, if not all, SC22/WG20 experts are

>already working too in SC2, and SC22/WG20 and SC2 work is already

>integrated in some countries, including Canada, due to the small community

>of experts. That would continue anyway.

 

The point is that a move to WG2 with its larger community of experts would

in all likelihood be quite positive from an organizational point of view

and would help to elevate the visibility of the I18n efforts.

 

 

>Specific work:

>

>IS 14651 Sort Standard: can be anywhere... it has to be in SC22 or in

>SC2... SC22's advantage would be to maintain the thing open to PL

>standards development more. It is sure that SC2's strong collaboration is

>required, and if it were in SC2, strong collaboration would also be

>required from the PL community.

>

>TR 14652 Specification Method for Cultural Conventions: It is indeed POSIX

>oriented but the POSIX WG always said it belonged to WG20... The POSIX WG

>now has a lot of challenges and it would not be timely to transfer that

>project there (to WG15). Perhaps we need to de-emphasize the perception

>that this has more to do with POSIX than with PLs, a perception which

>should be wrong, otherwise the TR has at least partly failed.

>Controversial issues should be removed and the TR should be enhanced in

>WG20, with strong collaboration with SC2.

>

>ISO/IEC 15435 API standard project: we believe in Canada that an I18N API

>standard is required, and that otherwise kitchen-made solutions will

>rather tie customers to some developers, which is the opposite goal of

>international standards.

 

My sense is that the nature of APIs itself is still under strong debate and

hasn't settled into a consensus where one could do an i18n API set without

inadvertently taking sides in the larger debates of object-oriented vs.

procedural and whether C++ style or Java style etc. When the first POSIX

standard was written, the world was a simpler place and such an effort made

a lot of sense. Nowadays it's all more difficult.

 

>TR 10176 Programming languages standards guidelines: its annex on

>identifier-related characters is in our opinion linked to SC2's interests,

>while all the rest belongs to SC22. Again a strong collaboration between

>SC22 and SC2 is required. The place for maintaining this appears to be in

>SC22/WG20.

 

The problem of identifier guidelines is so firmly linked with character

issues, that a way needs to be found to separate that part and move it into

SC2. Identifiers are not only needed in programming languages, but in many

other types of languages and internet related services (domain names).

Since the issues connect with the character set standard on which they are

based, SC2 is the right place.

 

>Other issues: a lot of I18N issues belong to the user interface domain.

>This is dealt with in SC35. We should remember that the whole domain of

>I18N is a horizontal issue in JTC1

 

This is indeed the case. While I am firmly in support of moving character

related maintenance and standards into SC2, there are many i18n areas that

should be placed in other places. In all cases, though, if programming

languages are not central to the issue, the work should probably be taken

out of SC22.

 

A./

 

 

For more reactions please see the links at the top of this document.

 

Arnold

September 11, 2000