Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing 
Summary:  OO corrupts formulas when reading Excel spreadsheet  

Product:  Calc  Reporter:  J.Ty. <jty>  
Component:  openimport  Assignee:  AOO issues mailing list <issues>  
Status:  CONFIRMED   QA Contact:  
Severity:  Normal  
Priority:  P5 (lowest)  CC:  jty, oliver.brinzing, orcmid  
Version:  4.1.1  
Target Milestone:    
Hardware:  PC  
OS:  Windows 7  
Issue Type:  DEFECT  Latest Confirmation in:    
Developer Difficulty:    
Attachments: 

Description
J.Ty.
20150623 07:56:59 UTC
Created attachment 84806 [details]
an example spreadsheet corrupted by OO
OO corrupts spreadsheets created under Excel, which contain references to whole rows in the form '1:1'.
E.g., Excel's meaningful =INDEX(1:1,INDEX(2:2,3))
becomes OO's =INDEX($A1:$AMJ1INDEX($A2:$AMJ2)), which makes no sense and produces an error.
The most pervert form is that Excel's =INDEX(2:2,2)
is turned into OO's =INDEX($A2:$AMJ2), which is syntactically correct, but produces generally a different value.
SUMMARY (the TLDR) I changed this to be a problem with openimport, since it involves conversion. I also found that the problem is with the conversion to and/or recording of OpenFormula incorrectly, leading to errors of various kinds. The use of fullrow references in these INDEX formulas is not interoperable and involves a combination of defects. It doubtless extends to cases beyond use in INDEX. DEMONSTRATION I confirmed J.Ty's example XSLX using Microsoft Office 2013 Excel. It reopens fine.    Making an ODS from Excel and interchanging that    I used Excel 2013 to save the example as an .ODS file. Excel reopens the saved .ODS just fine. Note, what is shown to users and what is recorded in the file can be different. For example, the XSLX cell A4 formula entry "=INDEX(1:1,INDEX(2:2,3))" is *recorded* in the Excelproduced ODS file as "of:=INDEX([.1:.1];INDEX([.2:.2];3))". This is perfectly valid in accordance with section 5.8 References of the OpenFormula Specification part of ODF 1.2. It might also be valid to record it as "of:=INDEX([.1];INDEX([.2];3)" since the start and end rows are the same each time. Details at 6.14.6 INDEX determine that the second parameter is the number of the column to select since the data source is a single row vector. Using the Excelproduced ODS, LibreOffice 4.3.5.2 *presents* the A4 formula as "=INDEX(1:1,INDEX(2:2,3))" and the results for all of A4:A6 are correct. However, on performing a Save As of the opened file, that LibreOffice saved version is not reopened correctly. The resaved A4:A6 on opening in LibreOffice *present* the same formulas but the cells show value "Err:502". When LibreOffice recorded that ODS, the A4 formula is *recorded* as "of:=INDEX(1:1;INDEX(2:2;3))" and similarly for A5:A6. That is, the formulas are rewritten in the Save As result to a form where LibreOffice fails. When that file is opened in Excel 2013, Excel 2013 reports that the file is defective. If Excel is allowed to repair the file, it will have the formulas dropped and the values preserved. Using the Excelproduced ODS, Apache OpenOffice 4.1.1 Calc "presents" the A4 formula as "=INDEX(1:1;INDEX(2:2;3))" and similarly for the rest of A4:A6. However, AOO shows "#NAME?" for the value of each of those cells. When the saved version from LibreOffice is opened in AOO, the results are the same ("Err:502") as when LibreOffice opens that file it saved.    Importing the .XSLX    Now that we see there are serious asymmetries in how OpenFormula (recorded in the ODS) INDEX (and full row/column selection) is interpreted by three different products, where some products can't even read what they have written, and AOO can't read any of it, let's go back to the import of .XSLX. On opening of the J.Ty's original XSLX file in LibreOffice, LibreOffice 4.3.5.2 *presents* mangled formulas for A4:A6, showing "=INDEX($A1:$AMJ1INDEX($A2:$AMJ2))", "=INDEX($A2:$AMJ2)", and "=INDEX($A1:$AMJ1A5)". The cell values are shown as "Err:508", "2", and "Err:509". It is cleared that the presented formulas are a defective rewriting of the row references that corrupts the formula as presented. On saving of that file to a .ODS from LibreOffice, the A4:A6 formulas as saved are "of:=INDEX([.$A1:.$AMJ1]INDEX([.$A2:.$AMJ2]))", "of:=INDEX([.$A2:.$AMJ2])", and "of:=INDEX([.$A1:.$AMJ1][.A5])". Excel 2013 reports that file as corrupted. On opening of the J.Ty's original XSLX file in Apache OpenOffice 4.1.1, the A4:A6 values are "Err:508", 3, and "Err:509" but the *presented* formulas are the same as presented by LibreOffice. When the opened file is saved to a .ODS file by AOO, the formulas in the file are the same as those from LibreOffice also. Excel 2013 reports that file as corrupted. The defect is confirmed. There appear to be multiple problems involved. . 