Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing

Summary: | l10n: Insert LRM automatically for Latin words in RTL mode | |
---|---|---|---
Product: | Internationalization | Reporter: | mbnoimi <mbnoimi>
Component: | code | Assignee: | AOO issues mailing list <issues>
Status: | CONFIRMED --- | QA Contact: |
Severity: | Trivial | |
Priority: | P2 | CC: | hdu, issues
Version: | OOo 3.2.1 RC2 | |
Target Milestone: | --- | |
Hardware: | All | |
OS: | All | |
Issue Type: | FEATURE | Latest Confirmation in: | ---
Developer Difficulty: | --- | |

Description
mbnoimi
2010-06-05 15:16:23 UTC
Created attachment 69815 [details]
Correct rendering (see "C++" word)
Created attachment 69816 [details]
Wrong rendering (see "C++" word)
@mbnoimi: To type C++ in an RTL paragraph, insert an LRM (left-to-right mark) after the last "+" (Insert, Formatting Mark, Left-to-right mark).

This isn't a practical solution, because the user has to insert the mark manually after every Latin word in an RTL document. I think it should be inserted automatically, as Microsoft's editors (Word or WordPad) do, which is why I reported this issue. Anyway, I have changed this issue to a feature request: I believe it isn't a real bug, although the feature is very necessary for RTL documents (I'm suffering a lot from its absence).

Eike, can you find a better owner for this issue?

Inserting an LRM for Latin words certainly is not a solution. Writing direction should be determined by script, and usually is. An LRM might be necessary for the string C++, though, because C isn't exactly a word and the + characters may be weak. However, C is not weak but LTR, and even if + were weak, the correct writing direction should result for C++, if I'm not mistaken. @hdu: any insights?

The direction of the plus signs in the sample text is AFAIK determined by rule N2 in UAX#9's section 3.3.4 "Resolving Neutral Types", which says: "any remaining neutrals take the embedding direction". So unless they are in an LRM context or in a default-LTR paragraph, their default N2 direction needs to be overridden by an LR* mark to get the desired behaviour.

Isn't the embedding direction in this case LTR because the ++ follows C?

Indeed, Unicode defines the plus sign as ET: weak, not neutral. But since there is no European number next to it, W5 does not apply and W6 turns it into ON (neutral), so N2 applies. Several BiDi libraries agree that the way OOo does BiDi is correct. Try an experiment with your favorite web browser and this file:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD></HEAD><BODY DIR="RTL">
ئC++ئ</BODY></HTML>

Change the RTL to LTR and look again: voilà!
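The rule chain described above (weak plus signs become neutral, then N1/N2 decide their direction) can be traced with a small Python sketch. This is not OOo's or any BiDi library's implementation: `resolve_simplified` is a hypothetical helper that covers only the strong classes L, R and AL plus separators and terminators, and it assumes the text contains no digits, whitespace or explicit embeddings. Depending on the Unicode version behind Python's `unicodedata`, "+" may be reported as ES rather than ET; the sketch treats both alike, since either becomes ON when no European number is adjacent.

```python
import unicodedata

LRM = "\u200e"                   # LEFT-TO-RIGHT MARK, Bidi_Class L
SAMPLE = "\u0626C++\u0626"       # the sample string: Arabic letter, C++, Arabic letter


def bidi_classes(text):
    """Return the Unicode Bidi_Class of each character."""
    return [unicodedata.bidirectional(ch) for ch in text]


def resolve_simplified(text, embedding="R"):
    """Reduced sketch of UAX#9 rules W3, W6, N1 and N2.

    Assumption: the text holds only L, R, AL and separator/terminator
    characters (no digits, no whitespace, no explicit embeddings).
    """
    classes = []
    for cls in bidi_classes(text):
        if cls == "AL":                    # W3: Arabic letter counts as R
            cls = "R"
        elif cls in ("ES", "ET", "CS"):    # W6: no adjacent number, so neutral
            cls = "ON"
        classes.append(cls)

    resolved = list(classes)
    i, n = 0, len(classes)
    while i < n:
        if classes[i] != "ON":
            i += 1
            continue
        j = i
        while j < n and classes[j] == "ON":
            j += 1
        # Text boundaries count as the embedding direction.
        before = classes[i - 1] if i > 0 else embedding
        after = classes[j] if j < n else embedding
        if before == after and before in ("L", "R"):
            direction = before             # N1: same strong direction on both sides
        else:
            direction = embedding          # N2: take the embedding direction
        for k in range(i, j):
            resolved[k] = direction
        i = j
    return resolved


print(resolve_simplified(SAMPLE, "R"))                        # RTL paragraph
print(resolve_simplified(SAMPLE, "L"))                        # LTR paragraph
print(resolve_simplified("\u0626C++" + LRM + "\u0626", "R"))  # LRM after "++"
```

In the RTL paragraph the two plus signs resolve to R, so they detach from the C; in the LTR paragraph, or with an LRM after the last "+", they resolve to L and stay with it.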
The topic of UAX#9's rules, applied to unmarked BiDi text, not producing what experienced BiDi users expect comes up regularly: see issue 100737, issue 93325, issue 105623 and maybe issue 85360. There is obviously a need for a more general heuristic that automatically adds BiDi markers to unmarked BiDi text so that the resulting UAX#9 ordering becomes DWIMmy.

s/93325/92325/g

Reset assignee to the default "issues@openoffice.apache.org".
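
As a rough illustration of the kind of heuristic asked for in this issue, the Python sketch below appends an LRM to a run of separator, terminator or other-neutral characters that directly follows Latin (strong LTR) text, unless the next strong character is itself LTR. The function names and the choice of character classes are assumptions made for this example; this is not how OOo, Word or WordPad behave, and it assumes the paragraph direction is RTL.

```python
import unicodedata

LRM = "\u200e"                         # LEFT-TO-RIGHT MARK
STRONG = {"L", "R", "AL"}              # strong Bidi classes
TRAILING = {"ES", "ET", "CS", "ON"}    # weak/neutral classes that may trail a word


def next_strong(text, start):
    """Return the Bidi class of the first strong character at or after start."""
    for ch in text[start:]:
        cls = unicodedata.bidirectional(ch)
        if cls in STRONG:
            return cls
    return None                        # end of text: paragraph direction (assumed RTL)


def add_lrm_after_latin(text):
    """Hypothetical heuristic: keep trailing symbols attached to LTR text."""
    out = []
    prev_strong = None
    i, n = 0, len(text)
    while i < n:
        cls = unicodedata.bidirectional(text[i])
        if cls in TRAILING and prev_strong == "L":
            # Collect the whole run of trailing weak/neutral characters.
            j = i
            while j < n and unicodedata.bidirectional(text[j]) in TRAILING:
                j += 1
            out.append(text[i:j])
            # Without a mark the run would take the RTL embedding direction.
            if next_strong(text, j) != "L":
                out.append(LRM)
            i = j
            continue
        if cls in STRONG:
            prev_strong = cls
        out.append(text[i])
        i += 1
    return "".join(out)


marked = add_lrm_after_latin("\u0626C++\u0626")
print([f"U+{ord(ch):04X}" for ch in marked])
```

Because an existing LRM counts as a strong LTR character, running the function twice adds no second mark. A real heuristic would need more care: sentence punctuation after a Latin word (for example a comma that belongs to the surrounding Arabic sentence) would also receive a mark here, which may not be what the author wants.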