Apache OpenOffice (AOO) Bugzilla – Issue 47323
Read Error trying to open MS Word file without extension
Last modified: 2013-08-07 14:42:16 UTC
OOo Writer 1.9.91 fails to open a Microsoft Word file that does not have the '.doc' extension. OOo Writer 1.1.4 opens the file without any problems.
To reproduce:
1. Run OOo Writer (1.9.91) and type in 'fuubar'
2. Save as 'foobar' with type 'Microsoft Word 97/2000/XP' without 'Automatic file name extension'
3. Click 'Yes' to confirm MS format
4. Try to open in Writer (1.9.91) in any of the following ways:
   A. click 'foobar' in KDE
   B. $ /opt/openoffice2/program/soffice foobar
   C. select File->Open from the OOo Writer menu
5. OOo Writer says 'Read-Error.\nError reading file' [OK]
6. Rename to 'foobar.doc' => now it opens in 1.9.91
MRU->MBA: This only happens on Linux. Save a document in MS Word format without a suffix in the filename, close it, and try to open it via the File Open dialog -> error message.
Andreas, one for you
AS: The problem is the following: On Windows we always detect this file as the "MS_Word_97" format. On Linux we detect "MS_Word_97" only if the extension is set. Otherwise the deep detection of Writer detects "writer_MS_Winword_5". Afterwards sfx tries to load this file and runs into an ERROR ... I saw something like "ERROR_IO_BROKENPACKAGE" .... Maybe we can't handle storage-based and non-storage-based formats correctly in such a case. AS->MAV: Can you please investigate what's happening there? THX.
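To illustrate the storage versus non-storage distinction mentioned above, here is a minimal sketch of a content-based check that does not look at the file name at all. It relies on the fact that OLE2 compound ("storage") files, which is what the 'Microsoft Word 97/2000/XP' format uses, start with a fixed 8-byte signature. The function name is hypothetical and this is not the detection code OOo actually uses.

```python
# Signature found at the start of every OLE2 compound ("storage") file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

def is_ole_storage(path):
    """Return True if the file starts with the OLE2 compound file signature.

    Hypothetical helper: it inspects only the first 8 bytes and ignores
    the file name, so a missing '.doc' extension makes no difference.
    """
    with open(path, "rb") as f:
        return f.read(len(OLE2_MAGIC)) == OLE2_MAGIC
```

A file for which this returns True cannot sensibly be handed to a stream-only filter, so a check of this kind could in principle rule out the wrong candidate before loading starts.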
*** Issue 47739 has been marked as a duplicate of this issue. ***
*** Issue 48149 has been marked as a duplicate of this issue. ***
*** Issue 48873 has been marked as a duplicate of this issue. ***
Currently there is no "ERROR_IO_BROKENPACKAGE" error during the loading process. After "ConvertFrom()" is called the error code is set to 0x70B02, which is an application-specific error code. I assume that this is a result of using the wrong filter for loading. So either this is a problem of the filter "MS Winword 5", which is not able to import the document, or the problem is in the Writer deep detection, which returns the wrong filter. MAV->FLR: Please take a look at whether this filter should be able to open documents of 'Microsoft Word 97/2000/XP' format. If not (which means that the Writer deep detection is wrong on Linux), please send this bug to the owner of the Writer deep detection.
hmm... what I'm seeing is that the binfilter (com.sun.star.comp.sfx2.BinaryFormatDetector) typedetection gets called first on an unsuffixed file and its ::detect gets the clipboard id of the file (correctly as the msword type) and uses this to get a filter, and the writer typedetection service doesn't get a look in. The thing is that there are three filters, writer_MS_WinWord_5, writer_MS_WinWord_60 and writer_MS_Word_97, which have this clipboard id. So it looks like it grabs the first one, which is writer_MS_WinWord_5. It's worth saying that all three filters are implemented by the same code in sw/source/filter/ww8, so it's probably feasible in writer to work around the problem by re-detecting what the format really is when given a MSWordDoc type. But it looks sort of fragile if the binfilter detect is going to operate on ClipBoard ids when a single ClipBoard id could map to multiple filters. So I don't know if the real fix is to
a) not have the binfilter detect get called first, or
b) remove the MSWordDoc clipboard id for the non word97 formats in ./registry/modules/org/openoffice/TypeDetection/Types/fcfg_writer_types.xcu, or
c) make them different for different filters to have a single filter to clipboard id mapping, or
d) have writer re-detect, when given one of the three word filternames, which one to really use. In which case it's worth mentioning that writer_MS_WinWord_5 is not a storage format, only a stream format.
A nasty quick fix patch for binfilter attached.
Created attachment 27054 [details] miserable hack
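The ambiguity described in the comment above can be sketched roughly as follows. The three filter names and the MSWordDoc clipboard id are taken from that comment; the table layout, the registration order, and all function names are assumptions made purely for illustration and do not reflect the real binfilter code.

```python
# Hypothetical registry of (filter name, clipboard id) pairs.
# The order is an assumption; the comment above only observes that
# writer_MS_WinWord_5 appears to be the one that gets picked.
FILTERS = [
    ("writer_MS_WinWord_5", "MSWordDoc"),
    ("writer_MS_WinWord_60", "MSWordDoc"),
    ("writer_MS_Word_97", "MSWordDoc"),
]

def first_filter_for(clipboard_id):
    """Fragile lookup: return the first filter registered for the id."""
    for name, cid in FILTERS:
        if cid == clipboard_id:
            return name
    return None

def all_filters_for(clipboard_id):
    """Return every filter registered for the id, in registration order."""
    return [name for name, cid in FILTERS if cid == clipboard_id]

def unambiguous_filter_for(clipboard_id):
    """Return a filter only when the id maps to exactly one filter.

    Returning None signals that the caller should defer to a deeper,
    content-based detection instead of guessing.
    """
    candidates = all_filters_for(clipboard_id)
    return candidates[0] if len(candidates) == 1 else None
```

With this table, the first-match lookup always yields writer_MS_WinWord_5 for MSWordDoc, while the stricter lookup declines to answer because three filters share the id.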
flr->od: take over as discussed
OD->CMC: Hi Caolan, I hope you are fine. Thanks for your patch. We've already decided to fix this defect more generally: the filter service of the binfilter module should take responsibility only for the type detection of files that the binfilter module will handle with its own import filters. Thus, we will exclude more than just the detection of files with a clipboard ID of Microsoft Word documents.
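The general approach described above might look roughly like this sketch: a detector that is asked first claims a file only if it belongs to the set of types its own module can import, and otherwise passes it on. All names here are hypothetical, the legacy type names are placeholders, and the Writer detector is reduced to a stand-in for the real deep detection; this is not the change made in bindetect.cxx.

```python
# Placeholder names for types the legacy module imports itself (assumption).
LEGACY_TYPES = {"legacy_format_a", "legacy_format_b"}

def legacy_detect(clipboard_type):
    """Claim a file only if the legacy module has an import filter for it."""
    return clipboard_type if clipboard_type in LEGACY_TYPES else None

def writer_detect(clipboard_type):
    """Stand-in for Writer's deep detection (greatly simplified)."""
    return "writer_MS_Word_97" if clipboard_type == "MSWordDoc" else None

def detect(clipboard_type, detectors=(legacy_detect, writer_detect)):
    """Ask each detector in order; the first non-None answer wins."""
    for detector in detectors:
        result = detector(clipboard_type)
        if result is not None:
            return result
    return None
```

Because the legacy detector no longer answers for MSWordDoc, the Writer detector gets its turn even though it is consulted second.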
fixed in cws swqbf35 - changed file: /binfilter/binfilterdetect/source/bindetect.cxx, 1.6.62.1
OD->MRU: Checked in internal installation set of cws swqbf35 - please verify. re-open issue and reassign to mru@openoffice.org
reassign to mru@openoffice.org
reset resolution to FIXED
Verified fix in CWS swqbf35.
*** Issue 51030 has been marked as a duplicate of this issue. ***
*** Issue 55524 has been marked as a duplicate of this issue. ***
This is a general problem, also on Windows (see issue 55524!). I hope it has been verified on Windows, too?
Checked fix in 680m132 build.
*** Issue 55975 has been marked as a duplicate of this issue. ***
*** Issue 57244 has been marked as a duplicate of this issue. ***
*** Issue 57660 has been marked as a duplicate of this issue. ***
*** Issue 59326 has been marked as a duplicate of this issue. ***
I am finding the same thing. I have a document server that is serving an MSWord document without an extension. OOo 2.0 gives a "read error: error reading file" message. It will open it with a .txt extension but not with .d or .do extensions. Please fix this. Writer should not require a file extension to determine the type of file.
Please try it out with OOo 2.0.2. If your problem still occurs, file a new issue and do not post to closed/fixed issues.