19112 – Question mark is always used for ANSI equivalents in RTF files

Summary: Question mark is always used for ANSI equivalents in RTF files

Status:	CONFIRMED

Alias:	None

Product:	Writer
Classification:	Application
Component:	code
Version:	OOo 1.1 RC3
Hardware:	All All

Importance:	P3 Trivial with 6 votes
Target Milestone:	---
Assignee:	AOO issues mailing list
QA Contact:

URL:
Keywords:

Depends on:	10538
Blocks:

Reported:	2003-09-05 13:15 UTC by akrioukov
Modified:	2017-05-20 11:26 UTC
CC List:	1 user

See Also:
Issue Type:	ENHANCEMENT
Latest Confirmation in:	---
Developer Difficulty:	---

Description akrioukov 2003-09-05 13:15:14 UTC

According to the MS RTF specification, each Unicode character in RTF should have its ANSI 
equivalent. This is necessary for compatibility with old applications which don't recognize 
Unicode RTF. However, in RTF files generated by OOo this ANSI equivalent is always replaced 
with a question mark. For example, a word of Russian text may look as follows: 
 
\u1055 ?\u1088 ?\u1086 ?\u1074 ?\u1077 ?\u1088 ?\u1082 ?\u1072 ? 
 
This means that any RTF file containing national characters (Cyrillic, Greek, accented Latin...) 
can be opened only in a Unicode-aware application. If we open such a file in any other 
application (like MS Word 6.0/95, Page Maker, Quark Xpress), all national characters will 
be simply lost. So I can't use RTF as a universal interchange format, which was possible with 
MS Word. 
 
Really, however, these question marks should be inserted only for characters which have no 
equivalent in any windows-125* codepage. If such an equivalent exists (for example, Cyrillic 
capital `A' has code 0xC0 in windows-1251, Greek capital alpha is 0xC1 in windows-1253 and 
so on), it should be used in the RTF file. So the Russian word above should look as follows: 
 
\u1055\'cf\u1088\'f0\u1086\'ee\u1074\'e2\u1077\'e5\u1088\'f0\u1082\'ea\u1072\'e0 
 
I understand that this is a feature request rather than a bug; however, I think fixing it is 
important for compatibility with other software.
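
The requested behaviour can be illustrated with a minimal Python sketch. It is not OOo code: the function name `rtf_escape` and its `codepage` parameter are hypothetical, and it assumes a single target codepage per call, single-byte windows-125* codepages, and characters from the Basic Multilingual Plane only. A real exporter would also have to declare a matching codepage or font charset in the RTF header.

```python
def rtf_escape(text: str, codepage: str = "cp1251") -> str:
    """Escape text for RTF, pairing each \\uN with an ANSI equivalent.

    Hypothetical helper for illustration only. Assumes a single-byte
    windows-125* codepage and BMP characters.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code > 0xFFFF:
            # Surrogate pairs are out of scope for this sketch.
            raise ValueError("non-BMP character: %r" % ch)
        if code < 0x80:
            # ASCII passes through; RTF syntax characters get a backslash.
            out.append("\\" + ch if ch in "\\{}" else ch)
            continue
        # \uN takes a signed 16-bit decimal value.
        n = code - 0x10000 if code > 0x7FFF else code
        try:
            # Use the codepage byte as the fallback when one exists.
            fallback = "\\'%02x" % ch.encode(codepage)[0]
        except UnicodeEncodeError:
            # Only now fall back to a question mark.
            fallback = "?"
        out.append("\\u%d%s" % (n, fallback))
    return "".join(out)


if __name__ == "__main__":
    # The Russian word from the description.
    print(rtf_escape("\u041f\u0440\u043e\u0432\u0435\u0440\u043a\u0430"))
```

With the default windows-1251 codepage, the word from the description produces the same sequence as the expected output above. A character with no equivalent in the chosen codepage (for example, Greek alpha under windows-1251) still gets a question mark.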

Comment 1 caolanm 2003-09-05 13:33:02 UTC

cmc: Mine. I have already done something similar to support exporting
unicode to 8bit word 6/95 format, so I should be able to leverage my
unicode character classification into equivalent windows codepage
stuff to .rtf as well.

Comment 2 caolanm 2003-09-10 14:58:24 UTC

Accepted.

Comment 3 caolanm 2003-09-19 16:33:49 UTC

Now that I have issue 10538 implemented I should be about half way to
having this support.

Comment 4 caolanm 2004-03-04 13:49:21 UTC

cmc->mmaher: This would be a nice feature, but it's not critical. It would be great
to get some code from volunteers to implement this (hint hint outside world)

Comment 5 maxbritov 2004-03-05 10:14:54 UTC

I think this is an enhancement, and not critical, only for English-speaking
people; for those of us who write and exchange documents in other languages,
this issue is a BUG! I'm a Russian OO user and I have many problems with this
issue and related international troubles.
I work at a plant with ~100 PCs and we have many partners, and we have many
troubles, of which this issue is one. Please don't ignore these issues: many
people are waiting for this.
I have cast 2 votes. I vote that this is a BUG (for me)!
Please excuse my English. I usually write Russian texts,
and I write them only in the Russian OpenOffice.org from the Russian translation team.
Thanks, all.

Comment 6 vorchun 2004-03-05 12:41:33 UTC

In my opinion it is critical for all non-English languages. I have many problems when I use 
the OO.org RTF filter for document exchange. The range of software in which problems occurred 
is very wide: MS Word, MS Wordpad, Quark Xpress, Corel Draw, AbiWord and so on... So it 
is a critical bug, not a feature or wish!

Comment 7 martin_maher 2005-04-13 17:08:39 UTC

mmaher->flr: Yours I think

Comment 8 Mathias_Bauer 2006-08-30 15:16:40 UTC

reassigning to hbrinkm

Comment 9 Marcus 2017-05-20 11:24:35 UTC

Reset assignee to the default "issues@openoffice.apache.org".

Comment 10 Marcus 2017-05-20 11:26:08 UTC

Reset assignee to the default "issues@openoffice.apache.org".