Apache OpenOffice (AOO) Bugzilla – Issue 101471
Import formulas in msoxl: namespace from ODS
Last modified: 2013-08-07 15:14:30 UTC
Excel2007 is able to save in ODF-Format, when SP2 is applied. I'll attach such a file. OOo can open the file, but the formulas have got a msoxl-namespace. OOo should handle this namespace in a "proper" way. Proposal: If "Load and convert the object" "Excel to OOo Calc" in Tool > Options > Load/Save is checked, then the formulas are converted to oooc-namespace, if the formula will be valid in OOo.
Created attachment 61942 [details] Excel 2007 SP2 export to ODF
Excel writes the formulas in its own notation used in the OOXML file format. The formulas should always be converted, regardless of some config settings. For that, the formula parser implemented in the OOXML import filter of OOo could be used.
Yup, this should be a simple process, of re-using the xlsx formula parser already used in the xlsx import filter. I believe the formula syntax used in their ODF formula is the same as the OOXML one.
*** Issue 102325 has been marked as a duplicate of this issue. ***
CCed: er
*** Issue 102362 has been marked as a duplicate of this issue. ***
will be implemented in CWS dr71
well, it is not *that* simple. :-) For example, external references are stored diffently (not indexed as in OOXML, but with explicit URLs in the formula). And, to be able to offer a UNO service, the implementation has to be made independent from some internal base classes such as the omnipresent WorkbookHelper... But then, we have all the advantages of the import parser, such as postprocessing of function parameters etc.
@dr: I figured it would not be *that* simple. :-) I meant to say it should be *conceptually* simple (of re-using the parser written for the ooxml import), but of course there is always tons of road blocks in the implementation. We developers all know that nothing is not quite as simple as it appears. ;-)
The reason that the formulae are in a different namespace is that Excel uses different semantics for formula evaluation - that is what namespaces are for, to indicate different syntax and semantics. Specifically, one of the differences is that direct references (as against range references) to text cells are converted to numbers for expression evaluation. e.g. =A1+A2 will work even if A1 or A2 are a text cell that contains something that can be converted to a number in the current locale settings, whereas =SUM(A1,A2) will treat all text values as 0. If OOo is to support the msoxl namespace, it should support its semantics as well, and should leave the formula in that namespace so that the user knows that the semantics are different to normal OOo operation and can manually debug the conversion to OOo.
Minor correction to myself - it's not that it's a direct reference, since =SUM(A1,A2) is direct references as against =SUM(A1:A2) being a range reference, yet it still does not do automatic number conversion. I am unsure as to the exact rules that are used to determine whether or not conversion should be applied.
Another thing to bear in mind is that my proposal of handling the msoxl namespace with Excel-compatible semantics does not answer the question of interoperability with other ODF implementations that already work in an Excel-like way. Implementing this would give OOo interoperability with Excel, but other spreadsheet applications e.g. Google Docs would evaluate OOo formulae differently to OOo (in respect of string conversion). In my ideal world, OOo should be the gold standard for behaviour in the default ODF namespace that OOo puts its formulae in, and other implementations should change their behaviour in that namespace, and all legacy Excel-compatible behaviour should be shunted out into the msoxl namespace. To me, this is the holy grail situation - Excel spreadsheets continue to work as they always have, and OOo spreadsheets continue to work as they always have, and the end user can migrate formulae from "old" behaviour to "new" if they want, or vice versa. The thorny cases would be down to ODF spreadsheets created with other tools whose default behaviour is Excel-like. Is there a way to tell what tool created the document, and pop up a dialog box asking the user if they want to shunt all formulae into the msoxl namespace the first time a Google Docs ODF file is opened in OOo?
"Is there a way to tell what tool created the document". Yes there is: the meta:generator element in meta.xml. Although it is actually not a required element, it seems to be set by most (all ?) implementations.
Fixed in CWS dr71. After this fix, Calc is able to import formulas from msoxl: namespace: - cell formulas - matrix formulas - conditions of conditional formatting - conditions of data validation Note that this fix creates plain Calc formulas which do not retain the original formula namespace as suggested by philhibbs. This needs a lot more work in the Calc core. You may write an enhancement request for this. This fix does not include support of external references (references to cells or ranges in other spreadsheet documents). See issue 103315 for that.
I will attach a bunch of test documents in ODF 1.0 (generated by StarOffice 7), ODF 1.1 (generated by StarOffice 8), ODF 1.2 (generated by StarOffice 9), and finally the document in ODF 1.1 with msoxl: formulas (generated by MS Excel 2007 SP2). The SO documents are useful to check that the regular ODF import has not been broken by this fix.
Created attachment 63348 [details] ODF 1.0 test document
Created attachment 63349 [details] ODF 1.1 test document
Created attachment 63350 [details] ODF 1.2 test document
Created attachment 63351 [details] ODF 1.1 test document created by XL 2007
back to QA
Issue 103318 created to propose handling of alternative semantics in different namespaces (e.g. msoxl).
verified in internal build cws_dr71
Verified in DEV300m54 on WinXP Closing
I have been sent an ods document created by excel 7 and do get "msoxl:=MONTH(B2)" instead of =MONTH(B2) and other alike entries when opening it. There is no troublesome fancy reference stuff, just pure simple cell references. --> Is this the standard behaviour or did I miss the right option enabled? --> Can I do anythoing about it (beside manual "guess, find and replace") --> Are there any plans to increase the compatibility? I am using oO 3.1.1 reagrds, Walter.
@walterwerner: This issue should be fixed for OOo3.2, please install the new release.