Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing

Field | Value
---|---
Summary | Aqua: Bullets imported from MS Word use ambiguous symbol font
Product | Writer
Component | open-import
Status | ACCEPTED ---
Severity | Trivial
Priority | P3
Version | 605
Target Milestone | ---
Hardware | Mac
OS | Mac OS X, all
Issue Type | DEFECT
Developer Difficulty | ---
Reporter | mdkey74 <mdkey74>
Assignee | AOO issues mailing list <issues>
QA Contact |
CC | crosbeee, don.troodon, hdu, issues, jonathan20824, Mathias_Bauer, smokey.ardisson, sven.jacobi, thomas.lange, timdonahey
Keywords | aqua, ms_interoperability
Latest Confirmation in | ---
Issue Depends on | 113851
Issue Blocks | 92203, 118251, 94172
Attachments |

Description
mdkey74
2008-07-22 20:50:31 UTC
Created attachment 55309 [details]
Comparison of Word and OpenOffice
MRU->HDU: Characters from the Symbol font are not displayed correctly on Mac. They are displayed correctly on Win/Sol/Linux.

Created attachment 55326 [details]
simple MS Word file
Someone draws the codepoint U+F0B7 while the font "Symbol" is selected. Of course a font Symbol on OSX looks different than on Windows. The OSX one doesn't even have something at that codepoint!

@hbrinkm: is import of bullets on your turf? Please use only well defined fonts (e.g. OpenSymbol) if you need well defined glyphs for specific codepoints.

*** Issue 94727 has been marked as a duplicate of this issue. ***

The clapboard symbol is Unicode F087 from Webdings Regular font. On my system I got the "private use" (looks like a W unless you blow up the font size). Docs created on Word 2003 contain Symbol F087 for a bullet and docs created in OO3 (Word format) use the same symbol. In that regard, the behavior seems correct for what is created in OO3. However, it doesn't address the fact that the wrong symbol displays (and prints) from Word docs opened in OO3 on the Mac. Seems like the need is for a hack/workaround rather than correction of improper behavior? In my case, I installed Symbol.ttf from a Windows system and disabled the Symbol included with my Mac; then all is well. However, I do not know the side-effects for Mac apps that might rely on Symbol having certain characters that won't match my Windows version of Symbol.

@hbrinkm: When importing documents like *doc, *docx, etc., substituting ambiguous font requests (like the "Symbol" font in this case) with more specific font requests (e.g. OpenSymbol) is the only way out of this problem.

*** Issue 96512 has been marked as a duplicate of this issue. ***

moved target

There's some funny mapping going on. However even if you fix the mapping for bullets, a document that uses other characters from Symbol won't look right. It appears to me that OO assumes the existence of the MS symbol font. If you put symbol.ttf in OO's font directory, OpenOffice.org.app/Contents/basis-link/share/fonts/truetype, things work. I did a test document with all the forms of bullet and every symbol in the MS symbol font.
They all come across properly to the Mac if you have the MS symbol font installed. Because that raises possible licensing issues, I have constructed an emulation of it starting from the URW symbol font included with Ghostscript. I have converted it to TTF with the correct encapsulation and Unicode values to match the MS original. See http://techdir.rutgers.edu/symbol.ttf. It might make sense to include this with the Macintosh version of OpenOffice. Since this works properly when installed in OO's internal font directory, it won't affect the behavior of the system in general. The question is whether this would cause trouble for documents authored in OO. With this change any document using Symbol in effect uses MS Symbol. I argue that as long as OO on Windows use MS Symbol you want the Mac to do so as well, to provide for proper interchange. However I haven't done a full test of all combinations of interchange. My testing has been with Word 2008 running on the Mac, with documents originated in Word. I created two test documents. One used all forms of bullet that are standard in Word. The other used every character from the Symbol font as shown in Word. Both document came across properly in both .doc and .docx files when opened by OO. They also worked properly when OO saved them in .doc format and they were opened in Word. I haven't tested with versions of Word on Windows. *** Issue 97677 has been marked as a duplicate of this issue. *** *** Issue 99183 has been marked as a duplicate of this issue. *** *** Issue 99287 has been marked as a duplicate of this issue. *** borrowed a symbol.ttf from my xp machine to see if that would fix the problem. Installed it with the os x Font Book, no help. Copied it into the /library/fonts folder, no help. I open a word doc and get the clapboard. This issue is enough to make me purchase MS office for os x when I buy a Macbook this year (gag). 
One thing that seems to work is saving the file as a word doc using the bullet of the arial font. I select it by diving deep down in the bullets formatting menu in OO. This must be made seamless to get casual users to accept OO.

*** Issue 101314 has been marked as a duplicate of this issue. ***

*** Issue 102095 has been marked as a duplicate of this issue. ***

@billmerkel: installing the symbol font does not necessarily fix the problem; it's possible that OSX still selects the Mac symbol font.

To summarize (as this issue has attracted some interest recently): it seems that we have to create several mappings for code point conversions between the different symbol fonts and apply them in the import and export filters. To be able to do this, we must know which application on which operating system created the document (as the font name "symbol" is too imprecise to tell us which mapping table to apply).

Created attachment 62619 [details]
Symbol-Font Codepoint xxB7: OSX vs. Win
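
The summary above proposes per-origin mapping tables applied in the import filter. A rough Python sketch of that idea follows. It is not OOo code: the function name, the origin labels, and the table contents are all hypothetical, and the only entry shown is the bullet discussed in this issue (U+F0B7, mapped here to the standard Unicode bullet U+2022 as an illustrative target).

```python
from typing import Dict

# Hypothetical per-origin tables: source code point -> replacement code point.
# Only the bullet from this issue is listed; a real table would need many more
# entries, and the origin labels below are made up for illustration.
MAPPINGS: Dict[str, Dict[int, int]] = {
    "windows": {0xF0B7: 0x2022},  # PUA bullet written by Word on Windows
    "mac": {},                    # assumption: no remapping known, pass through
}


def remap_symbol_text(text: str, font_name: str, origin: str) -> str:
    """Remap the characters of a text run that requests the 'Symbol' font.

    Runs in other fonts, and runs from an unknown origin, are returned
    unchanged, because the font name alone does not say which table applies.
    """
    if font_name.lower() != "symbol":
        return text
    table = MAPPINGS.get(origin, {})
    return "".join(chr(table.get(ord(ch), ord(ch))) for ch in text)
```

The point of keying the tables by origin is the one made in the summary: the name "Symbol" is shared by fonts with different layouts, so the filter cannot pick a table without knowing where the document came from.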
*** Issue 16337 has been marked as a duplicate of this issue. ***

*** Issue 103871 has been marked as a duplicate of this issue. ***

As reported in my March 29 09 entry, I was afraid I would have to purchase MS office to address this issue. With my new MacBook and MS 2004 for Mac loaded I thought I would not have to fight this problem and other nuisance formatting problems. The problem does appear, although less severely, between Word files. It is not uncommon for me to get empty check boxes where a windows Word user created a dot. If I work with the file, do not change the bullet to another character, and send it back, they get it okay. So far they have not complained about documents that I create. Amazing - 25 years after I was selling the original Mac 128K and IBM PC at ComputerLand (now bankrupt) this kind of thing still "bugs" us.....

*** Issue 104004 has been marked as a duplicate of this issue. ***

I ran some tests around this issue. If I create a text file (Word95 format) using OO 3.x with a standard bullet list (small circle) and a customized bullet list (using a square, diamond, or big circle), save it as .doc, and reopen it, the standard bullet list icons are replaced as described in this issue, but the customized bullet lists are right. So it seems to happen only for the standard bullet list... This problem is terrible for us at our office: as we migrate to OO 3.x, this issue makes it impossible for us to use OO in production. We have too many .doc documents to deal with...

Created attachment 63963 [details]
standard bullet bug, and other list OK
*** Issue 105258 has been marked as a duplicate of this issue. ***

Could this be ONLY related to documents created on old versions of Microsoft Office/Windows? References that seem likely to cause problems:

## The range of characters between U+F020 and U+F0FF in the Private Use Area of Unicode is mapped to symbol fonts in Richedit 4.1
http://support.microsoft.com/kb/897872
Richedit 4.1 maps the range of characters between U+F020 and U+F0FF in the PUA to symbol fonts. Therefore, when you map any character in this range, Richedit 4.1 shows the symbol character instead of the end-user-defined character (EUDC). APPLIES TO: Microsoft Platform Software Development Kit-January 2000 Edition

## Handling of PUA Characters in Microsoft Software
http://scripts.sil.org/cms/SCRIPTs/page.php?site%5Fid=nrsi&item%5Fid=PUACharsInMSSotware
Excerpt: One of the mysteries of text formatted with symbol fonts (at least, in certain Microsoft applications) is that characters appear to be encoded in terms of 8-bit code points even if a document is otherwise encoded in Unicode. When U+F021 was inserted into WordPad from the clipboard, not only did WordPad (more precisely, the Rich Edit control) apply the Wingdings font, it seems that it also changed the code point to 0x21. When the character was reformatted to a non-symbol font, this became U+0021.

To avbentem: no, the described bug is not tied to documents created with old MS software; it appears with OOo alone, just by saving a doc in .DOC format and reopening it immediately. The doc just has to contain a bullet list. See the sample doc (created with OOo 3.1) I sent on 5 August 2009 to this list: test.doc

Thanks. My problem is caused because I save my documents in the MS Word format using Open Office, then reopen them using Open Office again. The problem is definitely caused by a fault in Open Office, as MS Word wasn't used at any point in the process. I am now using the ODT format and the problem no longer happens.
I would rather save my work in a MS Word format, as this means that I can send the documents directly to colleagues using Word. Also, I am also not using Windows, I use OSX. I am slightly disturbed that my "issue" was dismissed without any effort to investigate on the person who cancelled my "issue". I appreciate that there are many people raising similar issues here, but this issue makes the use of bulleted documents or saving in MS Word format not possible for many people. Thanks, Dave. True, especially when reading many of the duplicates, the cause cannot be Office nor Windows. Still, maybe OpenOffice.org somehow has implemented something to circumvent that odd usage of U+F020 through U+F0FF. If that would be the case, thus: if saving in Office format mimics Office just a bit too well, then it still may be the cause? (Sorry if I make things more complicated! I don't need an answer, I just wanted to point out a possible cause.) I imagine that if I went to the office, jumped onto the Windows PC running MS Word and opened a document, created some bullets, saved the document and then opened it up again, the bullets wouldn't have changed into clapperboards. I can't remember a time when that ever happened before. Personally, speaking purely as a user of the software, (I am not a programmer, I am a project manager) OO is brilliant...but it does need to be 100% compatible with MS products for it to be of any use to people. I can't really imagine ever submitting a document to the board of directors of our £100M company and the bullets are all clapperboards. @daveysawyer, this is not the place to complain about OpenOffice.org. This is the place to find a solution for the problem. According to the OO website, this is the place to raise issues and report bugs...which is exactly what I did. I am not trying to start a flame war, and I am not trying to complain about either OO or MS Word. 
I have tried all the suggestions in the previous posts and the fault still exists (since 22Jul08) and I was merely explaining why you need to deal with it. As I say, I am not a programmer, I am merely raising the issue again and asking that it be solved. This problem is totally repeatable, is totally the fault of OO/MS Word conversion and needs to be fixed. It's not like the problem is with something minor, like one of the weird "doughnuts" in the clipart section, bullets are used all the time in business writing and most people need to remain compatible with MS Word. Many thanks. @hbrinkm,@mba: IMHO this interoperability issue deserves a target earlier than 3.x @hdu: 3.x is the earliest target that makes sense. An earlier target should not be given before we have decided to fix that problem in the next release. Otherwise this would just be another issues that gets retargetted at the end of the release cycle. So I read your comment as "please consider to fix that issue in 3.3 and in case you think you can do that, assign it the target 3.3". :-) > So I read your comment as "please consider to fix that issue in 3.3 and in case you think you can do
that, assign it the target 3.3". :-)
@mba: Yep :-) The issue is really annoying and seriously impacts interoperability.
This bug also occurs when using the Sun PDF Import plugin - I just got a document full of clapper boards using that import. There are days when I love open source, and there are days when I find a really irritating bug with no workaround that's been open for over a year.

I found a fix that seems to work for me. I found it at: http://user.services.openoffice.org/en/forum/viewtopic.php?f=17&t=10846 In one of the posts near the bottom it says: "I've had success using the Font Replacement Table, located in the Tools -> Options -> OpenOffice.org -> Fonts. Enable "Apply replacement table", select "Symbol" in the lefthand FONT drop down, select "OpenSymbol" in the righthand "Replace with" dropdown. Press the checkmark to the right to add the substitution to the table. Make sure you check the Always box. Once this is set up, opening and saving in MSWord 97/XP format preserves the bullet characters in both directions." I tried that a few months ago and it seems to work in most bullet import cases. I haven't seen it cause any other issues but I am not a heavy user so I just may not have run into it yet. Thoughts about making this a default setting in the next release?

Another fix for this issue: select the bullet list, use the context menu to customize the bullet, choose the second bullet (bug one) and click OK. The bullet is now preserved through the Word format. It is not simple and must be done manually on every list, but it works.

I fully support daveysawyer's comments: fixing greek fonts and bullets in OO is a mess. The hassle of getting greek fonts to work on the Mac platform is a major obstacle to introducing Open Office at universities and research institutes. Most researchers in natural sciences and medicine have an absolute need to be able to deal with alphas, betas etc. And, scientists collaborate, write grant applications and papers in collaborative projects, and send manuscripts to journals.
With the current encumbrance in using a greek font, bullets etc, there is no way I can lobby for an increased use of OO at our university, despite the fact that the program can handle essentially everything else most scientists need to do with an office package. There is no way one can ask a thousand scientists to apply a "fix" - we are usually scientists, not programmers or computer technicians. The issue needs to be dealt with, with highest priority!

I am a Mac user at an academic institution. I'm suffering with this symbol font and bullet problem and have decided to buy Microsoft Office instead. Maybe entering a comment here isn't the correct way to prioritize this issue, but just getting to this issue log was quite a chore. I strongly suggest that you fix this font problem. The academic community is a big factor in open-source software and writing them off is not a wise thing to do. This not only affects mathematical symbols and equations, but bullets as well.

moved target

Analysis: The reason for the wrong bullet character is that the Symbol fonts on MS Windows and Mac OS X differ. The code-point 0xf0b7 used in the .doc has no equivalent in the Symbol font on Mac OS X. The font fallback on Mac OSX then gets the "best match".

Several users in this thread have clearly explained the compelling importance of resolving this issue to enable migration from the other office suite. 3.2 looks like it is heading for RC1 now and this issue is not in the list of open issues to be addressed in the release. I assume it is too late to be added as an issue for the 3.2 release; I understand the difficulties of adding issues this late in the release cycle. Is there an accepted way to raise the priority of this issue?
By my reading of the priority descriptions this should be a P2. Since I do not understand how the priorities are determined in the open source model (I have been the product manager for large proprietary sw projects) I do not know if we have "pulled all the levers" available to us. My goal would be for it to be in a 3.2.x release rather than wait for 3.3, or to ensure that it has high 3.3 priority.

I would like to second the suggestion to put this in the 3.2 release (even if it means delaying the 3.2 release, *unless* the 3.2 release contains security fixes).

TODO: Add a converter for Symbol(Microsoft)->OpenSymbol to /Volumes/OOoBuilds/OpenOffice/hb33issues01/Source/vcl/source/gdi/fontcvt.cxx

If anyone could provide such a converter I would very much appreciate it. :-)

Adding sj and me to cc since issue 92203 is related to this issue.

*** Issue 109281 has been marked as a duplicate of this issue. ***

Hello, I'm new here, so apologies if I duplicate someone's remarks somewhere else. I just downloaded OO 3.2 and the issue remains, only now it's not a director's clapboard but a square with an ornamental W in it. I take it from the last remark then that the 3.2 version does not have a resolution for this issue? Just checking... I usually fix it by marking the first of the bullet points, and simply changing it back to the bullet in bullets & numbering that I wanted in the first place.

"target 3.3" clearly says that it is not fixed in 3.2.

Created attachment 69098 [details]
Bullet-test.odt
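
The TODO above asks for a Symbol(Microsoft)->OpenSymbol converter in fontcvt.cxx. As a rough illustration only, here is a Python sketch of the general shape such a converter might take, combining it with the U+F020 to U+F0FF folding described in the references quoted earlier in this thread. It is not the fontcvt.cxx code: the function names are hypothetical, the table holds just three entries, and the targets are standard Unicode code points rather than OpenSymbol glyph positions (which are not given anywhere in this issue).

```python
PUA_BASE = 0xF000

# Tiny illustrative table: 8-bit Symbol position -> standard Unicode code point.
# Assumption: targets are plain Unicode, not OpenSymbol-specific positions.
SYMBOL_BYTE_TO_UNICODE = {
    0x61: 0x03B1,  # 'a' position -> Greek small alpha
    0x62: 0x03B2,  # 'b' position -> Greek small beta
    0xB7: 0x2022,  # bullet position -> bullet
}


def fold_symbol_pua(code_point: int) -> int:
    """Fold U+F020..U+F0FF down to the 8-bit range 0x20..0xFF."""
    if 0xF020 <= code_point <= 0xF0FF:
        return code_point - PUA_BASE
    return code_point


def convert_symbol_char(ch: str) -> str:
    """Convert one character of Symbol-formatted text.

    Characters with no table entry are returned unchanged rather than guessed.
    """
    position = fold_symbol_pua(ord(ch))
    return chr(SYMBOL_BYTE_TO_UNICODE.get(position, ord(ch)))
```

For example, `convert_symbol_char("\uf0b7")` yields the Unicode bullet, and both `"a"` and `"\uf061"` yield alpha, since the two encodings of the same Symbol position are folded together before lookup.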
With OpenOffice 3.2.0 (mine is Mac OSX 10.6, OO320m12-Build 9483):

1. Open the previous attachment Bullet-test.odt file, and the « Styles & Formatting » window, section « Bullets ». Move from one bullet line to the other, and watch the bullet style change accordingly.
2. Now save as Bullet-test.doc (Word 97/2000/XP): on screen, nothing changes. You can still move through the same bullet styles.
3. Now close the file, and reopen Bullet-test.doc: watch the differences in the bullet shapes and spacings: styles 1 & 2 have wrong bullets, style 4 spacing is too big.

Watch also the « Bullet » section of the « Styles & Formats » window: bullet styles 1-5 have in fact been changed to WW8Num3-7, which were not visible before reopening: this is treacherous!!! These new styles are only created when a bullet style is actually used in the document: for instance, WW8Num2 is created for the numbering style 1, used in this section, but nothing for numbering style 2. Also, WW8Num4 is used both for the bullet style 2 in the previous section, and for the more indented version of this paragraph.

When actually opened with MS Word, or an older OpenOffice (e.g. NeoOffice's 3.0), these styles look very close to the original styles 1-5, so there is no point in switching back, even if it cures the appearance in OO. The next save would redo the switch anyway.

So apparently, the bug is double:

- why create a new style, when the old one is still there, and seemingly works?
- why is OpenOffice 3.2.0 the only one to misinterpret the « equivalent » WW8Num-styles it created itself?

Same issue, but the bullets don't look like a director's clapboard; they look like a stretched circumflex accent.

Created attachment 69953 [details]
2 pictures of issue - compare
Analysis: The WW8 filter seems to do the appropriate mapping already. OOo 3.3 is nearly final. I change the target of this issue to OOo 3.x. Please find a solution and set a correct target, if you know when a fix can be integrated. OMGosh! I found a solution that works, at least for my system. Having the problem in Word 2004, Word 2011, Open Office both Oracle and free, I took one more crack at a search and on a Microsoft Web Site found a solution. See: http://www.officeformac.com/ms/ProductForums/Word/11721 Perhaps it will work for you. @Bill Thanks for finding this. It did not work for me. I'm running OS X 10.5 and I still get boxed Yen signs for bullets. My FontBook does not work the same as in the article: I don't have a "Find Duplicates" command. Some details. Instead of Find Duplicates I used Resolve Duplicates under the edit menu. But this turned out to not be worth the time. This selected only one version of each font. However I found that the selections were not very rational. For example, I would have three copies of a font with different copyright dates. It would select only one of each variant - normal, bold, italic, Bold italic. but sometimes it would select different copyright dates or unique name. Like regular from 2007 and italic from 2001. So I went through each font family, selected any font older than 2005 (apparently os X had dragged fonts along from my mac purchased in 99) and deleted them. You have to look through the dates in the different fields by selecting Preview/show font info as sometimes the unique name date is older than the copyright date. For example a font family with a 99 version a 2001 version and a 2007 version of Regular, Italic, Bold, and Bold Italic. Select any one of the older ones, for example 99 regular, and press delete, then ok. All four subfonts from 99 are deleted. I cleaned out half my fonts this way. Remember to also check User in the Collection Column. 
I think what helped the most was deleting an ancient symbol font that was in my User collection which made the one under Computer the only Symbol font. This remaining symbol font is copyright 1990-99 Apple, but the unique name is "Symbol;6.1d7e3; 2008-05-12. So it appears to be a TT font from 2008. After deleting everything older than 2005 I did not have any duplicates to resolve. I deleted a lot of old fonts I had never used anyway and do not expect to miss them. I followed the instructions on how to setup the fonts for bullets on the link I gave above. I then opened a document that had given me problems and it looked fine in word and openoffice. Bullets were actually bullets! Clean out all those old fonts using the show information option and make sure you have a modern symbol font. That seems to be what worked for me. Another interesting find, at least in snow leopard. Go to system preferences, keyboard. On the Keyboard tab check the "Show Keyboard and Character Viewer in menu bar" Click on the american flag on the tool bar, click the character viewer. In the tiny little search field at the bottom of the window type the word "bullet". All the bullets available in every font show up. Click the pulldown labeled "Collections:" to see the list of collections. Look! one of the lists is labeled "Windows Office Compatible". Select that collection and the number of fonts with a bullet is reduced. Hmmmm. Is this another path to cross platform bullet symbols? (In reply to hedrick from comment #10) > There's some funny mapping going on. However even if you fix the mapping for > bullets, a document that > uses other characters from Symbol won't look right. It appears to me that OO > assumes the existence of > the MS symbol font. If you put symbol.ttf in OO's font directory, > OpenOffice.org.app/Contents/basis- > link/share/fonts/truetype, things work. I did a test document with all the > forms of bullet and every > symbol in the MS symbol font. 
> They all come across properly to the Mac if you have the MS symbol font installed. Because that raises possible licensing issues, I have constructed an emulation of it starting from the URW symbol font included with Ghostscript. I have converted it to TTF with the correct encapsulation and Unicode values to match the MS original. See http://techdir.rutgers.edu/symbol.ttf. It might make sense to include this with the Macintosh version of OpenOffice. Since this works properly when installed in OO's internal font directory, it won't affect the behavior of the system in general.
>
> The question is whether this would cause trouble for documents authored in OO. With this change any document using Symbol in effect uses MS Symbol. I argue that as long as OO on Windows use MS Symbol you want the Mac to do so as well, to provide for proper interchange. However I haven't done a full test of all combinations of interchange. My testing has been with Word 2008 running on the Mac, with documents originated in Word. I created two test documents. One used all forms of bullet that are standard in Word. The other used every character from the Symbol font as shown in Word. Both document came across properly in both .doc and .docx files when opened by OO. They also worked properly when OO saved them in .doc format and they were opened in Word. I haven't tested with versions of Word on Windows.

thanks so much Hedrick, this problem was bugging me for a while, finally I did some research and the good people of the OO community have come through again. cheers

Considering this issue hasn't been resolved in six years, I resolved it by switching to LibreOffice.

Confirmed in OpenOffice 4.0.1 running under Mac OS 10.4.11.

Reset the assignee to the default "issues@openoffice.apache.org".