Apache OpenOffice (AOO) Bugzilla – Issue 5658
Spreadsheet thinks a number is a string (text)
Last modified: 2017-05-20 10:33:51 UTC
I opened a .xls file and the integers in the file seem to be acting as strings - - if I use the integers in formulas they behave as strings, not integers. If I open the .xls in Excel the integers are really integers. Also, I could not find a way to force them to integers. I'd be happy to email the .xls file if that is useful.
Hi Alan, please attach this file to this Issue using the 'Create a new attachment' Link. Thanks Frank
.
Created attachment 1902 [details] An .XLS file with integers that OO beieves to be strings
All cells in column C except C25 *are* text (see number format dialog). In difference to Excel, OO Calc does not calculate with cells containing a value as text. Change column C to Default number format and it will work (Cell format- >Number format->Standard).
Invalid
Sorry to tell, it is not that easy to convert text to numbers: Changing only the numberformat does not help, after that you have to enter the numbers again (or to delete the leading apostroph in each cell).
I don't think that this should be closed. It really does seem like a bug -- OO works in unexpected ways, treating what appears to be numbers (there are no apostrophes, for example) as strings. In equations, they are treated as zeros. Excel handles this properly.
Hi Alan, we handle the differnet data types stricter than excel does. This is a kind of philosophical question and we're the guys that represent the other position. Best regards, Peter
Closed.
Hi Alan, Text is Text, and you can't calculate with text. So the bug is not on our side, it's Excels. Sometimes Excel tries to read some numbers out of a cell with mixed text and numbers, and what should I say, it breaks a calculation delivering a wrong result as this software looks into the cristalball and takes the wrong number, e.g. a 1 instead of 10. OpenOffice.org Calc don't do that, as we never calculate with 'text' numbers. So I think we are the good guy's. Best regards Frank
I'll start by saying that as a software engineer, I'm a big big fan of strongly-typed languages. I want my software to shatter into little pieces when I try to use one type as another. (If I am a good engineer, this will happen at compile time). That being said, OO does a very, very bad thing. If OO gagged when a string was used as a number, and fired a "You can't do that!" error, that would be OK. If it took the Excel approach and: -- looked to see if the string can be parsed as a valid number -- if it can, use it as a number -- if it cannot, throw an error that would also be OK in my book. But OO simply treats all strings as 0s. *This is really awful.* It is very easy for folks to use strings (that they think are numbers) in equations, and get bad results without being warned.
Hi Falko, I think we should make it possible to calculate with numbers formatted as text. I checked it in Excel 2000, typed 'Hallo 20' into A2 and '=A2*10' in C3, got an #VALUE. Formatted A5 as text and typed in '20' and '=A5*10' into C5, got 200 as result. So no calculation is made with strings containing numbers and characters but calculation is made if the string contains only numbers. A 2o (2 Ohh)is treated as number / Character combination and will not calculate. Now I think it's time for us to do it inn the same way. Frank
*** Issue 6157 has been marked as a duplicate of this issue. ***
*** Issue 6233 has been marked as a duplicate of this issue. ***
I have seen OOo react differently to formalae created in Excel file too. For example, I had a check feature in one of my columns that stated something like =C3=E3. I expected the answer to be TRUE or FALSE. After bringing the file into OOo, I got an answer of $1.00. It donned on me to check my cell format, and I had it set up to be a currency format. When I changed it to a general format, I got my expected TRUE or FALSE answer. In thinking about how OOo dealt with this, I have to back how it handled the formula somewhat. Excel has trained us to be lazier in our formatting than we should be technically speaking. I vote for not incorporating that laziness into the OOo programs. It might be a bit of a bump in file transfers at first, but it is more exact mathematically speaking. When we start incorporating the features that allow laziness in this, perhaps we end up bringing in the idiosynchrisies that the other office programs have. Is that what we really want? I vote not. I vote for the preciseness of the conventional mathematic rules.
I agree with Franks currently last comment written by himself. And on what Diane wrote: For every cell itself, its mask should determine what is displayed, not the content (formula, result etc); that's correct. But that wouldn't stop the possibility of calculating with numbers formatted as text! (or to be even more precise, perhaps, _displayed_ as text! see http://www.openoffice.org/issues/show_bug.cgi?id=6357 "displayed format should be independant from number if possible" ).
*** Issue 6768 has been marked as a duplicate of this issue. ***
Created attachment 3293 [details] Example of this "feature" causing a poblem with Excel
Even if this is a "feature", it is deeply counter-intuitive. It has cost me several hours of gnashing my teeth, cursing what I thought was an excellent alternative to Microsoft Office. I created a spreadsheet (attached as prh_demo.sxc) with cells that are sometimes blank, and cells with formulae that depend on the blank cells (they are blank if the source cell is blank, otherwise they are a calculated value, such as cells D7 and D8). This spreadsheet worked fine, and if I entered a value in B8, D8 would calculate correctly; if I then overtyped D8, the E8 would calculate correctly. I saved the spreadsheet as .xls, and re-loaded it into OpenOffice. It now behaves incorrectly; when I enter a value in B8 and overtype D8, then it calculates incorrectly. Fortunately the miscalculation is so great that it is immediately obvious. This is so wrong as to be unusable as a replacement for Excel in a heterogenous environment.
I think my problem lies in the Save As Excel functionality. I have created a new Issue #8640 for this purpose.
*** Issue 8950 has been marked as a duplicate of this issue. ***
Another implementation of this bug - if OO is set to check words with numbers (in language settings), it thinks all the numbers containing more than 1 digit are text. It makes really impossible to use this language checking feature if one uses spreadsheet.
*** Issue 11041 has been marked as a duplicate of this issue. ***
Set target (based on PCD relevance)
*** Issue 29115 has been marked as a duplicate of this issue. ***
*** Issue 31407 has been marked as a duplicate of this issue. ***
*** Issue 35809 has been marked as a duplicate of this issue. ***
*** Issue 40513 has been marked as a duplicate of this issue. ***
*** Issue 43116 has been marked as a duplicate of this issue. ***
Corrected the owner added CC.
I've spent 18 months phasing out Office and Lotus while rolling out OOo, before coming across this. Unfortunately we can't force our clients/vendors to stick to the rules and are constrained to take whatever crap they put in their price and quantity columns. Prevalidating every field is something we don't have the expertise to do. This is a showstopper.
FT: I see that this is a serious problem. But for 2.0 I do not see that we can solve this problem w/o getting into deep trouble with legacy data (and users).
I'm a developer but not in the OOo pool, I feel your pain <g> As I see it this can be solved in 3 ways. The M$ way is to make calc detect numbers in text as part of the formula process, but to leave the fields as text. The better=than-MS/intuitive/one-time way is to detect such fields on loading, give a warning message, and convert them. The lowest-effort (for the OOo team) way is to detect on loading and just warn (maybe lighlight such fields?) and leave the user to fix them if needed. Dave
sorry to bang on about this but I've thought of an even lower-effort solution for the OO team. I'm not fluent enough yet with the macro language but would it be possible to have a macro that checked for numbers or formulae in non-number cells and flipped them?
Doing this: "Change column C to Default number format and it will work (Cell format->Number format->Standard)." as mentioned in a comment below does not work, at least not with Openoffice 2.x I had a wedding guestlist and the number of invitees per invitation in a column. Some of these numbers were text for some reason. I tried and tried again and again to change the numbers into numbers but it would never work. I finally got it fixed by converting to csv, then re-importing. I had a similar problem with the postal codes and zip codes in openoffice. It thinks the postal codes, like V6Y1G2 are strings (fair enough), but it thinks the zip codes are numbers. I could not get this column to change to "string format) by using the Cell format->number format->standard prescription. This causes big problems when I create an Openoffice Base (database) linked to this spreadsheet for the data. The databse will assume the data in that field/column is a certain datatype and ignore those things which aren't in the same datatype. So many of the records are missing any content in that field. Again, I had to convert to csv, then re-import the file. Big huge pain in the ass. Might be just a 2.x issue for me, I've never seen this before. The major broken functionality here is the Cell format->number format->standard part. Are there any other bugs for this?
*** Issue 50772 has been marked as a duplicate of this issue. ***
*** Issue 53833 has been marked as a duplicate of this issue. ***
*** Issue 55954 has been marked as a duplicate of this issue. ***
This is a HUGE issue. It almost cost me lots and lots of money. I publish books and I get royalty reports in XLS format emailed to me on a quarterly basis. The reports are auto-generated. I'm supposed to invoice off of the total amount that is displayed at the bottom of a very long and wide table. My default application to open XLS is OO.org calc. I thought that the royalty amount seemed to small, so I opened the XLS file in Excel, Gnumeric, and Documents to Go v7 (PalmOS Spreadsheet app). Excel, Gnumeric, and Docs to Go all had the proper royalty dollar figure. In OO.org calc the number was tens of thousands of dollars less. WOW, a big shocker! So, the fault is the document, but really, OO.org has no choice here, it needs to behave the same as the other spreadsheets to continue to claim XLS document compatibility. I don't think that OO.org calc will get a second chance if it violates and loses the trust (and/or money) of it's users. Please consider this a 2.0 showstopper. I do note that now in Excel 2003, it still calculates properly, but a small indicator is displayed in the cell (small green triangle in upper left corner) that when you hover over it tells you that something is amiss. I'll attach a couple screenshots of Excel 2003 handling this situation.
Created attachment 30449 [details] Excel 2003 UI to indicate something is amiss with mixed number/text cell
Created attachment 30450 [details] Excel 2003 UI with options to user to handle mixed number / text cell
Created attachment 30451 [details] Nasty excel test case with undisplayed text causing OO misscalc
*** Issue 56034 has been marked as a duplicate of this issue. ***
*** Issue 3851 has been marked as a duplicate of this issue. ***
Pls also see issue 57095!
Created attachment 31104 [details] Cell formatted as text, but calculation works
Created attachment 31105 [details] Cell formatted as text, calculation does not work
I have created two new attachments, texttest1.ods and texttest2.ods - can someone tell me why the calculation works in the first, but not in the second? In both cases the cells are formatted as Text as far as can I can tell in OOo. I opened the archives and looked at the content.xml and in the first file I found this: table:table-cell table:style-name="ce2" office:value-type="float" office:value="1" as against this in the second: table:table-cell table:style-name="ce2" office:value-type="string" Where is this difference made clear to the user in the application? I just turned on "Value Highlighting", and that shows the string cell as black in the second spreadsheet, so that is one way to tell but is there a better way? Dark blue and black are not all that clearly different.
OK I see it now. Sorry. The formula bar has '1 in it. This is how I got here: 1. Enter numbers and formula, it works 2. Format B3 as Text, it still works 3. Enter '1 in B3, it stops working 4. Overtype '1 with just 1, and it still doesn't work, this is what was causing the confusion. I was demonstrating this issue to a colleague as a reason that OOo was not a suitable replacement for Excel. It really hurts to have to do that, because I hate Microsoft Office with a passion, but this issue rules out OOo as a replacement.
OK this is very confusing. I closed the document, re-opened it, and the ' is no longer displayed! Somehow, after I turned on Value Highlighting, the ' showed up in the formula bar, but when I closed and discarded the changes and re-opened it, it is back to showing just 1 without the ' in front. So, my original question still stands - where is this difference visible to the user.
philhibbs, You did everything you could to obfuscate the fact that the cell content is text, including formatting everything to be right-aligned. Now demonstrate the same 4 steps from above to your colleague using Excel and use the formula =SUM(B2:B3) instead of =B2+B3 and argue again why OOo is not a replacement for Excel. This issue is about implementing something similar to the green triangle you see in a recent Excel version in these cases.
Real-world spreadsheets are the result of many changes by different users with different levels of experience. I don't think that the (admittedly artificial) steps taken are that far off what might happen in the real world - I've seen people select a column and right-align it because some of the numbers are left-aligned and some are right-aligned. I've written formulae that take a substring of some other cell, and then do maths on them in other cells. Take a look up this thread, and you will see other people using phrases like "showstopper" and "really awful". The suggestions about warning the user are ok, but treating a 1 as a 0 and doing the calculation without generating an obvious error message is the main problem. An error would be just fine.
I'll never get it why people think that treating this the Excel way, 0 in SUM(B2:B3) but 1 in B2+B3, is a better thing to do. I don't think throwing an error would be appropriate, there are too many cases where people just sum up a range of cells with mixed content, for example including column headers or other labels. Or should we just throw an error if individual cells are involved instead of cell ranges?
>I'll never get it why people think that treating this the Excel >way, 0 in SUM(B2:B3) but 1 in B2+B3, is a better thing to do. I don't think that anyone is specifically arguing that that is better, it is an equally bad state of affairs, and I was unaware of this behaviour before today. I guess it comes down to the fact that Excel is already entrenched. The little-green-triangle is in some respects a good development (although it is obviously a hack to work around legacy behaviour without introducing incompatability), although it annoys me when I import a file containing telephone numbers and thousands of green triangles appear, and I can't just get rid of them all. Can anyone explain what is going on in my example - the calculation fails because, at some point in the past, I had typed in '1 and then overtyped it with 1 again. Both B3 cells are formatted as text, but one works and the other doesn't. I just went back to Excel and tried calculating using cells that contain =MID(x,y,z) and this also works if I set the formula to A+B, but not for sum(A:B), and this rings a bell from back when I had to do that. I think I had another column set to =A+1 and summed that. In summary, Excel sucks as well, but I already knew that.
*** Issue 58589 has been marked as a duplicate of this issue. ***
This "bug" is going to end up being a massive public relations problem for OpenOffice.org (which I do not want). I recommend removing support for the XLS format entirely. After all, as fst said in bug 58589, "we are not an Excel clone and will not be !" The current support for XLS is "broken" by any definition. The first time someone blogs or goes to the media with "OpenOffice.org Calc cost me $20,000 USD!!" (as it nearly did for me) nobody will ever trust the software again. You can't earn the trust back once lost. Do what you will with files stored in native OpenOffice.org/OASIS format, but when you open an Excel file, YOU MUST CALCULATE it the SAME as Excel does, other wise the support for Excel files is WORSE THAN NONEXISTENT.
We have some other examples and arguments concerning these "EXCEL-compatibility"-problems in issue, pls decide whether that one should be marked as DUP of this one. A (very) short summary concerning results from Issue 58903: I believe one way to solve those problems might be to create an "Excel-Mode" for OOo. Comparing with modifications in the import filter the advantage would be that such solutions will not only allow to use EXCEL sheets without problems, but even further use with EXCEL, what would ease document exchange. For details pls see Issue 58903. Do we have a META-issue concerning EXCEL-interoperability for these issues?
I see that it's been almost four years since I first reported this issue and, frankly, I'm very surprised that it has not been fixed and is still not a priority. There are two big bugs here: 1. Numbers are sometimes imported as text. 2. Text values are treated as 0s in calculations, rather than throwing an error. 1. is pretty bad. 2. is very, very bad. 1. and 2. together have the makings of an absolute catastrophe when a company or hospital or whatever imports some data, sees a numerical result to a calculation, does something based on this calculation, and later finds out that the calculation was utterly bogus because a number was pulled in as a string. At the VERY LEAST, people need to be warned when using a string in a calculation!!!
*** Issue 61329 has been marked as a duplicate of this issue. ***
*** Issue 58903 has been marked as a duplicate of this issue. ***
*** Issue 63203 has been marked as a duplicate of this issue. ***
>>[Issue 63203] >>IMHO not a bug from Calc but from Excel, they calculate with text values >>as the MID function result is always a text, you get the difference. >> >>Normally I should close this as invalid, but the expectations on the user >>site have to be considered, so double to Issue 5658. I strongly support consideration of "fixing" this. - We can say its the fault of Excel, but I have a rack of beer on that it will not change neither in the old Excel versions nor in any new version [retain backward compatibility] - It is clearly a compatibility issue. Anyone migrating will see this is a bad implementation/handling of OOo, not as Excel's fault, probably not happy to check and fix dozens of spreadsheets, don't even know what to change (besides insert a value(cell) into every cell referenced fuctions) - If OOo treats these as texts, then the following should be treated as text: ="3"+1 should result 1(*), not 4 as it is now. - (*)more properly is should result Error:502 as it is for ="abc"+1 To sum it up, fixing it would cause "customer satisfaction" besides stopping this "bug" to be reported over and over... Thanks
*** Issue 65604 has been marked as a duplicate of this issue. ***
*** Issue 66193 has been marked as a duplicate of this issue. ***
*** Issue 67332 has been marked as a duplicate of this issue. ***
*** Issue 69158 has been marked as a duplicate of this issue. ***
*** Issue 76676 has been marked as a duplicate of this issue. ***
*** Issue 75433 has been marked as a duplicate of this issue. ***
I came to this thread from issue 75433 which was my report of the problem and has been marked duplicate. Someone advised me to "vote" for this issue. But I can see from the long tails on these threads, laced with philosophical declarations about how "Excel is broken", and "we are the good guys", that this serious problem will always be present. So I'll have to use the only solution available to me. I'll just use Excel. It works for me, I can interchange spreadsheets with others without headaches and I trust it. And since I'll have Excel, I don't need OO Calc anymore. Good luck.
I wouldn't put it quite as strongly as billmichaelson above. However, the seriousness of this problem - and the little attention it has gotten - has kept me from giving serious thought to moving from Excel.
*** Issue 79752 has been marked as a duplicate of this issue. ***
Sorry for duplicate post of this bug. Now that I am on the right thread... Those who claim that current functionality of OO is not broken probably have never worked in real life situations with spreadsheets. They are not always prepared by excel/OO gurus, who do everything correctly. They are prepared by very average people and at least by now most of the time in excel. I just happened to approove price offering sent by a supplier only to discover that some fields in his excel spreadsheed were formatted as text and thus caused 0,00 to appear in all the consequitive calculations for some items. Being a coder myself I know it is important to work by strict standards, but relying that user input will always be high quality is stupid. If OO was truly strict about what input is used it would not voluntary convert cells formatted as text to zero for the sake of calculation. So (as mentioned above) OO should do one of three: 1. Use value of the cell when calculating formulas, deisregarding formatting (the bahavior of excel!); 2. Check spreadsheet when opened for numeric entries that are formatted as text and offer a choice between changing their formatting to number or leave as is; 3. Show error messahe in the formula calculation, so that user knows that the result of formula is not correct!!! The only incorrect beahvior is now, because user will most likely not notice the errorous calculation, particulary in big spreadsheets!
Thought we'd have thrown a virtual party to celebrate this issue being active since June 2002, or wasn't I invited? ;) I can't believe this hasn't been tackled yet. Every month I have to buy more copies of Excel and roll back OOo a little more.......
LOL Excel is in-consequent (a) as is OOo (b) AD a) comments from fst Fri Jun 14 2002: I can reproduce in Excel 2003. But now enter number in A6, and fomula =SUM(A5:A6) in A7 ... A5 is nót counted :-) AD b) open Writer and add table A1:B3 - turn off number recognition (context menu or somewhere in options) - make sure formatting of A1 is text - enter number there - turn on number recognition - enter number there - add formula to A3: =sum(<A1:A2>) A1 ís counted :-))) Shall we organize a poll which team is right, sc or sw ;-) No, seriously. It is a problem when receiving wrong formatted xls-documents that you have to count upon. I created a macro that marks text cells with 'numbers' with a color and also changes them, so that the text becomes a real number. Happy to provide that to anyone interested (not yet had time to make a nice extension of it). Furthermore there are other tools to help with this (didn't try those however).
cornouws, could you please provide your macro? It is really hard for average user to "fix" incorrect XLS docs
*** Issue 85970 has been marked as a duplicate of this issue. ***
My mind just got blown. I first reported this incredibly awful problem almost six years ago and it's not fixed! My occupation is managing the design of medical devices. In my field, problems like this - wrong result with absolutely no warning - are considered absolutely the worst. Catastrophic. If someone tried to use OO at my company, or at another FDA-regulated company, I'd have to advise them in the strongest terms not to use it. I think that I need to start writing to the media and to Sun. Maybe to Slashdot. I'm rooting for OO, but it's inconceivable that this is still horribly, dangerously broken. You folks seem to not understand how awful this is.
@ pmike, all, I've done some more checking on the macro - you wouldn't be happy with a formula evaluating to zero, being replaced by 0, for example, I guess - and published it as extension at the wiki: http://extensions.services.openoffice.org/project/CT2N Feedback welcome of course.
I just came across this issue while working on fixing this exact problem in the Novell edition of OO.o per customer request. I'd like to eventually upstream my code, but looks like I'm going to have a tough time based on the discussion taking place here. ;-)
@kohei: Could you please give us a short explanation about how your fix will handle this issue? Thank you!
Sure. Basically, I divide this into two cases: 1) basic arithmetics i.e. addition, subtraction, multiplication and division 2) built-in functions, such as SUM, AVERAGE etc. that takes a cell range which may contains a mixture of text, text numbers and numbers. For case 1), I try to convert a text-number into a real number where applicable, or throw error 529 if that's not possible. For instance, with my fix, the following formulas will calculate with the text numbers being treated as numbers = "3" + 4 (=7) = A1 + 4 (=13) where A1 contains "'9" (text number) but the following formulas will throw Error:529 = "foo" + 4 = A1 + 4 where A1 contains "foo" For case 2) where a built-in function takes a cell range as an argument and that cell range contains a mixture of real numbers and text numbers, I treat as if text numbers don't exist. I'm basically trying to emulate Excel's behavior as best as I can to improve interoperability with Excel. My code is still work in progress, and there are still lots of cases I don't handle yet (like matrix input) or handle "incorrectly". So I still need to do some work before I can submit a patch. Also, my take on the philosophical issue is that, we are not trying to win the argument, but we are just trying to help our users. Sometimes Excel does things a very weird way, but what's "right" in a software behavior is highly subjective & depends a lot on other factors than just theoretical correctness, so I think we should treat it as such.
>For case 2) where a built-in function takes a cell range as an argument and that >cell range contains a mixture of real numbers and text numbers, I treat as if >text numbers don't exist. I would call that a deeply flawed approach, if a cell can be used as the input to a numerical calculation, then that's that, it should work in all cases. Changing a cell from A1+A2+A3 to SUM(A1:A3) should never ever change the result! I don't care if Excel does this, OOo does not and should not have emulation of Excel bugs on its feature list.
@philhibbs Then what would you do when real world users who rely on that behavior approaches and ask you to "fix" it? Would you tell them "tough! take it or leave it" ?
@kohei: Absolutely. You can't introduce a bug into a spreadsheet application just because some users of some other spreadsheet application rely on that bug.
Although, it depends on your context of course - if you are working for a client who wants a customised, Excel-emulating version of OOo or Novell Office, and they are paying you to do the customisation just for them, then of course you should do what your client wants. But don't expect the wider OOo/Novell community to accept your deliberately buggy patch. I think the first part that fixes the A1+A2 code is fine, but it has to fix SUM(A1:A2) as well or it's just replcing one bug with another in my opinion.
That highly depends on how big that "some users" user base is. Again, you're basing solely on the ground of theoretical correctness, and for that, I'd agree with you. But unfortunately there are other factors to consider than just that, and even if you are right, we would still need to provide some solutions to those users who rely on that behavior of Excel. We can't just leave them in the dust.
>But don't expect the wider OOo/Novell community to accept your deliberately buggy patch. By the way, that's the _existing_ behavior. My patch won't touch it. I just merely covered it to keep my intention clear. So, your above statement is incorrect.
On top of that, I had a separate patch just to handle =SUM(A1:A10) the way you like. Should I upstream that as well?
Could we could have a "strict" mode, where the following both throw errors? = "3" + 4 = A1 + 4 where A1 contains "'9" (text number) That would give Calc a huge advantage in mission-critical applications, where Excel (and currently Calc) simply can't be used because of sometimes evaluating text to zero, sometimes to its numeric value, and sometimes throwing an error. Incidentally, see also issue 58903 and http://ca.geocities.com/jrkrideau/OpenOffice/spreadsheet_problems.pdf
@philhibbs "OOo does not and should not have emulation of Excel bugs on its feature list." If the "bug" is needed to have the calculation result that Excel would perform, then absolutely OOo MUST do the same as Excel when opening an Excel spreadsheet. I'm just guessing that you never had Calc almost cost you $23,000.00. From the http://www.openoffice.org/product/calc.html it states, "Of course, you are free to use your old Microsoft Excel spreadsheets, or save your work in Excel format for sending to people who are still locked into Microsoft products." This a malicious LIE! It should say, "Of course, you are free to use your old Microsoft Excel spreadsheets, or save your work in Excel format for sending to people who are still locked into Microsoft products AS LONG AS YOU DON'T CARE ABOUT HAVING THE CALCULATED RESULTS MATCH EXCEL. WE KNOW THAT CALCULATION IS THE WHOLE POINT OF A SPREADSHEET, BUT WE DON'T GIVE A SHIT." That would make a good DIGG headline that would be sure to make the top page coupled with my documented, legitimate story of OO calc nearly costing me $23,000 back in 2005.
@daxkelson: Malicious? You just lost your credibility. Expecting bug-for-bug compatability with Excel is unreasonable and unachievable. @kohei: >we would still need to provide some solutions to >those users who rely on that behavior of Excel. We can't just leave them in >the dust. You have users who rely on SUM(A1:A2) not including a text cell that contains a number? I'm a little surprised. Fixing both aspects of the issue - making it calculate in both A1+A2 and SUM(A1:A2) - will fix far, far more cases than it will break. I even think it will fix more cases than just patching the first behaviour would, I'd expect that most spreadsheets that are failing to include text cells in their calculations are doing so in variance to the creator's intent. >By the way, that's the _existing_ behavior. My patch won't touch it. I just >merely covered it to keep my intention clear. >So, your above statement is incorrect. If a patch deliberately introduces a discrepancy between A1+A2 and SUM(A1:A2) then I call that a bugged patch. "Incorrect" is only true in a purely semantic sense.
@kohei: >On top of that, I had a separate patch just to handle =SUM(A1:A10) the way you >like. Should I upstream that as well? Sorry, didn't spot that. Those two patches should go together, I can't see any reason for A1+A2 to ever give a different result to SUM(A1:A2).
@philhibbs: You misrepresent me and create the classic straw man retort. You imply that I want EVERY bug recreated in Calc. That is not the case. I want OO to calculate the same as Excel when opening Excel spreadsheets, if that requires behavior considered as a "bug" (and this bug report is about one specific "bug", not all bugs), then so be it. This is a case of OO developers flat out REFUSING to do the right thing. How is NOT malicious when this fixable compatibility bug has been open since 2002? It isn't my credibility that the issue. OO calc is has the credibility problem which is why it is flat out banned at a growing number of sites specifically because of this bug.
Questions to those who oppose changing this behaviour (I would say "fixing this bug", but that's prejudicial): What should =A1+0 do, if A1 is text? What should =MID(A1;1;LEN(A1))+0 do, if A1 is text? Why does the former need to be =VALUE(A1)+0 but the latter does not need to be =VALUE(MID(A1;1;LEN(A1)))+0?
philhibbs said: "What should =A1+0 do, if A1 is text? What should =MID(A1;1;LEN(A1))+0 do, if A1 is text? Why does the former need to be =VALUE(A1)+0 but the latter does not need to be =VALUE(MID(A1;1;LEN(A1)))+0?" The answer is so easy in the context of OO calc opening an Excel document. The answer is it should do the exact same as Excel.
@daxkelson: Can we call a truce on this? I think we're mostly on the same side here, unless you strongly disagree over whether SUM(A1:A2) should also be fixed, which I think it should.
The basic issue is this: When opening an Excel file -- calculate 100% like Excel When opening an Quattro Pro file -- calculate 100% like Quattro Pro When opening an 1-2-3 file -- calculate 100% like 1-2-3 When opening an ods or sxc -- Feel free to be "proper" in your own eyes One issue I see is a UI issue in saving/exporting from one format to another where calculated results will change. There should be big fat warning.
Two questions to the crowd: 1. Does anyone watching this issue think that it should not be addressed? The only names I can see above arguing for the status quo have either changed their mind (fst) or are not on the distribution list for updates to this issue (mackmoon). 2. Is anyone watching this bug in a position to progress this through the proper channels, to get some development time allocated to doing an impact assessment, specifying a change, and eventually getting it fixed? How does that work around here?
@kohei: This may be more work than you wanted to do to with your patch, the but please consider the types of UI warnings that Excel gives. See: http://www.openoffice.org/nonav/issues/showattachment.cgi/30449/Excel2003-warning-about-mixed-cell.png and http://www.openoffice.org/nonav/issues/showattachment.cgi/30450/Excel2003-handling-mixed-cell.png
I believe we all agree on fixing case 1), while case 2) is somewhat debatable, and I personally have no idea what the "right" course of action is for case 2). So, I would like to wait on 2) until someone who is a usability expert can step in and call the shot. Meanwhile, since it's an existing behavior, leaving that alone should not cause any short-term damage IMO. @daxkelson: I believe that visual feedback enhancement was filed separately though the issue number escapes me. It may be beyond the scope of my patch-to-be, but I too would like to see some sort of visual feedback implemented for text numbers. So, your points taken.
@daxkelson: I don't understand why you are going so far in your request that it be 100% Excel-emulating. As I understand it you had an issue with direct cell calculations, such as =A1*10. Have you ever encountered a problem where you need Excel's SUM() behaviour to be maintained and not fixed in the same way? If you can't point to examples where this kind of pathological behaviour is needed, then I don't think there is any case to be made to fix A1+A2 behaviour and not SUM(A1:A2). Like I said earlier, I would guess that the majority of cases where Excel is skipping over text values an not summing them, that isn't what the spreadsheet owner intended.
@kohei: I opt for the following: - All cases where a numeric parameter is expected and an argument is passed either as a literal string or a single cell reference where the cell has textual content should generate an error. No differentiation between ="3"+4 and =A1+4. You simply can't handle text as numbers across locales reliably as soon as it involves floating point and maybe even group separators and not integers only. - There could be one exception to this rule: in date calculation context a date string of the ISO 8601 form "yyyy-mm-dd" could be accepted. @philhibbs: To my knowledge all significant spreadsheet applications treat range references (A1:B3) when expecting a numeric parameter such that they are taken as a number sequence, ignoring text cells. Text cells are not touched or converted to 0 if the function specification doesn't say different. This will not be changed. @daxkelson: > When opening a ... file You'd end up with different modes to be maintained in the interpreter, documentation, online help, the user not knowing which mode he's currently in, loosing the source information when saving in a different format, and different calculations just because you (re)opened a file in a different format. You don't really want that. @all: Discussions in issues tend to get lengthy, repeat arguments because newcomers don't read all previous comments, have no proper quoting, and make it hard to follow a thread, if there was one, if one wasn't involved from the beginning. I think such discussions are best carried out on the mailing lists, in this case maybe users@sc.openoffice.org, where at least quoting and threading are helpful instruments. The outcome or certain milestones could be added to the issue then. Just imagine you wanted someone from the User Experience team take a look at this issue, do you really expect her to wade through all this?
@er: Thank you for your thoughtful reply. The issues you raise are just the sort of thing I was thinking about when I mentioned "impact assessment" earlier. Should it also interpret Thai, Tamil, Bengali numerals as numbers as well? An error result would be an acceptable fall-back position. There's also Value Highlighting (Ctrl-F8) to help with this kind of situation.
As an example of what er is talking about in his comment to kohei above, Excel evaluates "1,234" as 1.234 or 1234 depending on whether the user's locale has comma or period as the decimal separator.
@er: At some point, OOo has to decide whether a series of characters is a number or not. What difference does it make if that point is at the time of user entry, or at the time of reading the contents of a cell? I don't see the point of flagging certain information as "do not interpret as number from now on", just 'cos it has been written to a cell. Also, please consider my previous point, why does =MID(A1;1;LEN(A1))+0 work? MID() clearly returns text, yet you can do maths with the result. My opinion is: remove the inconsistency. If OOo can interpret text as a number and do maths on it, then it should be able to do so consistently.
@philhibbs: the difference is that during input the conditions due to the locale specifics (separators and such) are determined, whereas when loading the document in a different locale that information is lost. A string of "1,000" would be exactly which value?
*** Issue 86708 has been marked as a duplicate of this issue. ***
@er: Good point. I hope that everyone is following this debate, it cuts to the very heart of why Excel and OOo behave differently and I am delighted to finally understand why it works this way. I just created an Excel spreadsheet with French locale settings, and entered '1,000 into cell A1, and then =A1+1 into A2, and it calculates the result as 2. I saved, closed, changed my Windows settings to English, re-open the spreadsheet, and it still says 2. If I edit the cell and just press Enter without changing the formula, it recalculates it to being 1,001. Doing the same in OOo with VALUE() calls around the cell reference behaves a little differently, it re-calculates using the current locale when you open the spreadsheet. One way to handle this would be to store locale information with every text value. So when a French person enters "1,000" into a spreadsheet, it is stored internally as, say, "{France}1,000" which is to be interpreted as 1 in a numerical context and maybe even displayed as "1.000" (or displayed as 1,000 with a warning indicator) when an English person opens it. This would require standardisation of the NLS formats so that all platforms can interpret {France} equivalently. Of course this is a huge change to make with a lot of impact so it may be too late to do things this way in OOo, even if it did turn out to solve this issue, and I fully expect that good reasons not to do it this way will surface. Of course, it's perfectly possible to construct spreadsheets in OOo that suffer from exactly the same locale ambiguity, using the VALUE() function. I guess if you go down that path, though, you take responsibility for text-to-number conversions, whereas making it automatic could cause ambiguity that the user does not expect. I still think that the issue should be addressed, though, and OOo should behave as though an implicit VALUE() call were wrapped around the cell reference, using the current locale to interpret text values as numbers. Anyone dealing with text entered in multiple locales should handle that themselves as a people process issue.
Adding locale information to text cells isn't viable. It wouldn't only bloat the information to be stored, it would also require other applications loading the document to have the same locales implemented. I guess as soon as one used Indic or Chinese digits or Arabic separators or the like you'd be lost. Function VALUE() is of course equally error prone if it comes to portability across locales. To overcome this situation, the ODFF specification added the function NUMBERVALUE() that takes an additional parameter for the decimal separator used. For calculation with literal strings and single referred cells of textual content there's only one correct solution: generate an error. Automatic string conversions considered dangerous. They are the GOTO statements of spreadsheets. --Robert Weir on the OpenDocument formula subcommittee's list.
As a advocate of OO.org and someone who has been hurt by this issue, I would like to add my $0.02. This issue has delayed the implementation of OO.org in our company by at least a year. It took me the longest time to figure out that the Excel spreadsheet we are getting from one of our critical vendors has this issue/bug/feature embedded in it. I have been using OO.org for several years myself and had never come across this issue. I guess I am a pretty sophisticated user so I wouldn't expect a text field to work in a calculation of any sort. Now that I know it is an issue, I can address it with my users and work with it. (Thank you cornouws for the extension) Here is my point, the OO developer community needs to address this one way or the other and get this into the roadmap for a future release. (3.0?) This is obviously an issue that needs to be prioritized. (this group has been discussing it for almost 6 years!, with dozens of duplicate issues entered) Personally, I would like it if Calc had a "Excel bugs Compatible" mode but, I would also support a notification of some sort to the issue (ideally, with a "hide all" option). In my case, if I got an error message or flag that was documented as this issue, I would have 250 users on OO.org today. Thank you all for your time and dedication to the OO.org project!
Unlike jrtapper I had OOo rolled out to 100 users, but uncertainty over this issue (which has now had a bit of press) is letting excel get its foot back in the door. 6 licences in December, another 6 this week. We use the Novell version of OOo, which emulates Excel better in the area of macros and fonts. We don't care if Excel is wrong in its handling of this situation, it builds in a workaround. OOo MUST do the same.
@robinsonky: I've had some requests to add some possibilities to my simple extension. If there's anything that is useful for you: pls let me know. However: quite a lot to do, so cannot have a serious look the next two weeks or maybe more :-(
@jrtapper: It has been an active issue for this long because it is a difficult issue. Whilst I agree with you that this issue should be addressed - and my take on it is "conversion of strings to numbers should be consistent" - it isn't simply a case of a bug that needs fixing. The risk is that there may be OOo spreadsheets out there that would start returning different - perhaps incorrect - results if this behaviour were to be changed. What's more important, compatability with Excel or compatability with OOo versions 1 and 2?
@philhibbs: I have read most of this issue thread and I understand that it is a very difficult issue. As an amateur programmer, I agree with you that the way OO handles this is more "right" than the way Excel handles this (although I would prefer an error return over a 0 return). As an IT professional, I have to look at how the programs are used by the end-users. To me, the simplest answer is to flag these situations (as Excel 2003 apparently does) and let the user know what is going on. Somewhere in this thread someone mentioned that this is annoying in certain situations, so add the ability to turn off these flags (either on a per case basis or globally). In this way, the native OO spreadsheets will still work correctly, and any imported Excel sheets will have the issue identified and the user can deal with them. I do not know how complicated it is to program this, but from a high level this seems to me to be a reasonable solution. Thank you for your dedication to this issue and OO in general!
I do not want to step on anyone's toes with the following but, I just loaded the OxygenOffice (2.4 RC) and this situation generates an error code 529. I can't find error code 529 documented anywhere, but it does appear the the folks at ooop have addressed this issue. If this were an "official" documented error code that described this issue, we could put this to bed.
@jrtapper: pls contact the people responsible for OxygenOffice (2.4 RC) if you encouter isues there.
Adding me to CC. We encountered this issue with one of our client, so we have made a custom version with the working patch of Kohei. Even if it's not perfect, it's clearly better than nothing.
*** Issue 88775 has been marked as a duplicate of this issue. ***
Somehow, I do like Excel's handling better. Though it is NOT perfect either. As a spreadsheet user, you can't always control what data is written in the spreadsheet, as someone else might have written numbers as text. BUT having NO easy way to see IF the numbers are text and the calculations are flawed is surely the worst alternative. The argument that numbers are right-aligned and text is left aligned is - well - one of the worst arguments I heard. I have yet to come across a spreadsheet, where text and number-alignments are not modified. Handling localised data is a problem, but this is independent of the "text-handling" (and is also a problem with VALUE() ). This is why I always prefer to use the English-decimal separator. While I am the guy with the strong typing in spreadsheets, [http://www.openoffice.org/issues/show_bug.cgi?id=79924], OOo does NOT enforce strong typing either, but chooses the worst possible outcome. Added to this, the user does NOT have an easy way to convert text-numbers into numerical numbers. Also, this distinction between text and numbers is rather artificial. If it were specific units or currency, than distinction between a scalar and some other type makes sense, but it has little relevance otherwise and I believe that this difference is overplayed in spreadsheets. There are only 2 ways to somehow correctly handle this issue. Both should be implemented: A.) interpret text as numbers - this should apply for every formula / function - offer option per spreadsheet to store this state - also, option per spreadsheet to set the decimal separator B.) raise error when numerical operation involves strings - this option MUST include a method to easily convert string numbers into numerical values The correct handling of this issue will implement both methods. Also, the option how to handle a particular spreadsheet should be stored within the spreadsheet. Of course, for compatibility reasons, it might be feasible to have a 3rd option as well: ignore text. This is of course a little bit more difficult to define. I would presume that ignoring text will handle text-cells as empty cells and will skip them and NOT handle them as numerical zero cells. This is probably not the case in the existing implementation. Also, in this latter case, the user MUST be notified that cells were skipped. I am not sure how this notification should look like. Some icon displayed within the specific cell seems suitable. Propagating this "notification" to downstream calculations seems a wise idea, too. Righlt-clicking on this icon should display the number of "text-cells" that were skipped. And there should be a method to review those cells and convert to numbers. When thinking more thoroughly, I become more and more convinced, that the whole data model and error handling mechanism of Calc needs to be redesigned from bottom up. A more advanced alternative is based on data-handling by R: [http://cran.R-project.org] - e.g. IF a value is missing, one can pass additional arguments to R-functions, how to handle those missing values, e.g.: "na.action: is a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The ‘factory-fresh’ default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful." I believe this is a far more transparent alternative. And it would cover many other areas, not just the text-issue. [By the way, spreadsheets have a very poor handling of missing values, and this would be a good alternative, too.] Setting options at spreadsheet level how to handle text and missing values seems sensible: is.text = FAIL is.NA = OMIT [options: FAIL, OMIT, CONVERT for text -> number] [alternatives to is.xxx functions/options could be: text.action and NA.action]
Hi *, There's a new version of the CT2N extension: http://extensions.services.openoffice.org/project/CT2N The latest version offers a window to make choises about marking of cells, including cells with non-default decimal separators, and more. In my expectation it will help in most situations. Feedback welcome of course. It does not (not yet?) take into consideration the inconsequenses of Excel's handling of text-numbers. Anyone with exact imput on this issue: welcome. Thanks for any feedback – Cor
This issue is important and listed on the quarterly review for Calc: http://wiki.services.openoffice.org/wiki/2008_Q2_Review_of_Spreadsheet_Project Therefore adjusting target to 3.x.
Just to put in my two pennies worth. I use excel and I expect the Add formula to treat text as "0". I achieve this by using the "Options > Transition > Transition Formula Evaluation" option in Excel. This is not supported in OpenOffice. So now when I open an excel sheet in OO it just says ERR 529 everywhere. Now I repeat, I use excel, I cannot view my Excel sheets in OO, so I cannot use OO. Shame. The argument about whether text should be treated as "0" or an error is a real dilemma, so the Excel method of allowing the user to choose is quite a good one. One reason for my liking the text as "0" method is that I can put comments into cells instead of using the comment feature, the other is that a complex formula can also be treated as an error e.g. "=IF('Sheet1'!R5<0,'Sheet1'!R5,"")". At present I am using Gnumeric, which seems to handle this better, which is a shame.
Even though I using OpenOffice intensively (Debian/Windows) I only came across this issue now, since debian testing upgraded to 2.4. I don't mind strong type checking, even if it's not compatible with the market leader. What I don't understand is, why refering to a cell containing "" (empty string) in formulas evaluates to Err:529 in - openoffice.org 2.4 for Linux whereas the following version are interpreting a string as 0 - openoffice.org 2.4 for Windows - openoffice.org 2.2.1 for Linux - openoffice.org 2.0.4 for Linux - openoffice.org 1.x for Windows and Linux
> Even though I using OpenOffice intensively (Debian/Windows) I only came > across this issue now, since debian testing upgraded to 2.4. If you routinely use the Debian OO.o - it's based on ooo-build which has a number of interop improvements in this area: something to be aware of.
The BIG issue with this is the formula I mentioned above =IF('Sheet1'!R5<0,'Sheet1'!R5,"") Using this, if Sheet1!R5 is either blank or less than 0, then the formula should return a blank cell, but if the resulting cell is used in an addition then it generates an error, and I have no idea how to make the formula work in OpenOffice.
Not sure if this is of any use to anyone: =IF(ISERR(VALUE(A1));"text";IF(VALUE(A1)=(A1+0);"num";"numtext")) These tests check whether the value of A1 is a number, a text value that can be converted to a number, or a text value that can't be converted to a number.
*** Issue 92163 has been marked as a duplicate of this issue. ***
> What I don't understand is, why refering to a cell containing "" > (empty string) in formulas evaluates to Err:529 This is definitely an error. Empty cells should not be evaluated as either text or number. They are just empty. Silly, but this is how it should have been handled. Actually, empty has a special value, it is called Not Available, or NaN, or NA. Calc should implement such a special type, as described in one of my previous posts. R does it (the S+ language) and all professional statistical applications. Also, R has a very powerful way to handle NA values. Default is to ignore these values, but the user can specify a different action, e.g. remove or fail. This is also the sensible way how to handle string-numbers: default would be *fail*, while other options are: *ignore* (Current Calc implementation) and *convert* to number (Excel's way). These option parameters should be passed to every function/formula/cell.
> This is also the sensible way how to handle string-numbers: default would > be *fail*, while other options are: *ignore* (Current Calc implementation) > and *convert* to number (Excel's way). Calc does not ignore string-numbers. They are treated as zero. (Try deviding by a cell with text-number.) Excel does not always convert string-numbers. It is done in combination with operators (=A1+A2). Not when the cell is part of a range that is an argument in a function (=Sum(A1:A2)).
Which only goes to show how inconsistent the current treatment is. The treatment of a blank ("") cell introduced from a formula is especially irritating, both Excel and OO treat the resulting content of the cell as "" ie. not as blank, and there is no other way of leaving a cell blank after evaluation that I know of. Excel does have the "transition" option which evaluates text/blank as zero, similar to discoleo's suggestion, but OO does not include this option. I filed a bug report on this but it seems to have been deleted. The work around is to use the sum() formula in all complex calculation, which should evaluate "correctly". BUT beware, adding a sign in the formula will cause the dreaded ERR:529 again, so, =SUM(A1;-A2;A3) can still give ERR:529 but =+SUM(A1;A3)-SUM(A2) will evaluate "correctly"
@fdservices: Please note again that you're using a go-oo build version of OOo that has some changes in this area that the upstream version does not have, there is no Err:529 in these cases. Your problems do not exist in upstream OOo. If you're having problems with Err:529 please file an issue within the Novell bug tracker instead. Discussing them here unnecessarily adds complexity to this issue and confuses the reader. Thank you.
True, and I would prefer to use the ooo build, except that the kde integration module does not include the file picker :-( All the above comments apply to Excel and IMHO are actually Excel bugs (and always have been). My understanding was that OpenOffice were going to change their evaluation method to the "Excel standard". If that it not so then I have no further comments to make here :-) All the best Andrew
*** Issue 94612 has been marked as a duplicate of this issue. ***
I have to chime in with those who argue that this issue almost cost me serious money. A proposal from one of my vendors showed costs that were off by hundreds of dollars per month; my customer would have sued me had I not caught this in time. When loading an XLS spreadsheet, it is absolutely imperative that the results are the same as they were in Excel. That should override any philosophical considerations on what the "right" treatment is - when you are competing with Excel and advertising compatibility, then "right" is what Excel does. Anything else may be appropriate for an ivory tower - but not for the real world of business. As a consultant, I used to tell my customers to replace Microsoft Office with OpenOffice. But until this issue is fixed, I cannot do that any more. And the fact that, as I now see, this discussion has been dragging on for six years does not help to build trust in OpenOffice. How many other such issues are there lurking, and will they also take more than half a decade of discussion without a solution?
After 6.5 years of watching this lethal bug, I've all but given up on it being fixed. So, I've decided to alert the press to it - either this will encourage OpenOffice to finally recognize and fix this bug, or at least the general public will know how dangerous OpenOffice Calc is. First email went out today.
Hi alcohenma, Have you already worked with the CT2N extension, or tested it? If there are needs or wishes for improvement, pls let them know. Thanks, Cor
Oh, about your information to the press: you may want to add that it is unknown how many people out there work with false outcomes of calculations in Excel, because Excel treats text-numbers in two different ways, and it is rather unlikely that everyone realises that. Apart from this, I've noted progress in the discussion towards a solution, which also in my opinion is very important.
I think there is a fallacy in the thinking; it ignores the real-world workflow that you will often see. The number of people who get a wrong result in Excel is going to be minuscule, because just like regular software projects, people do test spreadsheets. In the business world, spreadsheets are designed in Excel. They are tested in Excel. Then they are distributed to the sales force, who fills them out - in Excel. The spreadsheet designers put strings into formulas - it's rarely the end user who accidentally adds quotes when entering values. Because all the testing happens in Excel, you won't see a problem. Then they are sent out to me. Who opens it in OpenOffice - and gets a wrong result. And "movement towards a resolution..." - well, when after 6.5 years there are STILL people arguing that Excel compatibility doesn't matter, I'm not as optimistic as you are.
And that - the resistance that some have to accepting this as an "Excel compatibility" issue - is why I thnk we need to re-focus on this as a consistency issue. =MID(A1;1;LEN(A1))+0 works, yet MID clearly returns a text value. Please, if anyone can give me a god reason why =A1+0 should behave any differently to =MID(A1;1;LEN(A1))+0?
"often" and "rarely" ... Still I can't disagree with you, of course. But it is not what I wrote.
Created attachment 57897 [details] Calc not performed when field defined as text
First of all: I do not use Excel. Second: This is clearly a bug, this is why... If you use a formula to calculate some numbers, and one or more are formatted as text, OOCalc 3.0 treats those numbers as ceros. In early versions (OOCalc 2.4) the formula throws an error, wich is correct because the user (me) are trying to calculate text, wich is not correct. OOCalc 3.0 should shows you an error, but it calculate the cell like it has a cero, wich produce wrong calculations. I don't care how excel do the calculation, this is not excel, but it is very important to avoid misscalculations. Please fix this issue... don't forget that 2.4 do the right thing. I use Windows and Linux to reproduce the error.
clemare: > In early versions (OOCalc 2.4) the formula throws an error, This issue has been around since forever - you may be experiencing some other condition that used to give an error but now calculates as though the cell were zero, but it's always calculated text cells as zero.
add CC
Clemare: You were probably using the Novell build of OO2.4 which did throw an error if text is included in a calculation. But be aware that we do not all agree with you. IMHO text should be treated as zero and not an error e.g. 1 + "not sure what number here" + 1 = 2 not error A side effect of this is 1 + "1" + 1 = 2 which could also be classed as correct since "1" is text and not a number, however I agree that this might confuse the user. One possible option would be to flag cells which contain text, which could be interpreted as a number. e.g red flag - comment, green flag = formula, blue flag = text number Andrew
It seems to me that there is a consensus developing here that it is more important to behave correctly than to be compatible with Excel. Maybe the real solution is to simply remove support for .xls functionality altogether, since it doesn't work anyway?
Actually I would agree with that. Why not just create a superb spreadsheet? Incidentally to whoever wrote that they were converting their clients to OpenOffice, be aware that OpenOffice does not correctly write Excel spreadsheet files. It does a creditable job of reading xls sheets, but complex sheets saved from OpenOffice may well become unusable in Excel afterwards. Andrew
> It seems to me that there is a consensus developing here > that it is more important to behave correctly than to be > compatible with Excel. YES, BUT: Treating 'text' as zero is definitely a BUG. Consider a company doing $1 + '1,000,000' = $1 and going bankrupt afterwards. You don't want that, do you?
I am afraid it is not quite as simple as that. One of the most common problems with a spreadsheet in the real world is the propensity for users to press the space bar to blank a cell. So 1 + "blank cell" = error in your scenario. What about the case of 1 + "1,000,000 people cannot be wrong" = 1,000,001? That is wrong as well. My suggestion was 1 + "1,000,000" = 1 [Watch out, the "1,000,000" is text in the preceding formula - Is that what you really meant?] Andrew
> Consider a company doing $1 + '1,000,000' = $1 and going bankrupt afterwards. That's is the problem. In an a Spreadsheet with undreds of records, one text value algined to the right (impossible to see with naked eye) will produce an error, and the user will not know until he loose his job. I like fdservices suggestion a lot... it's simple and informative. <<1 + "1,000,000" = 1 [Watch out, the "1,000,000" is text in the preceding formula - Is that what you really meant?]>> clemare BTW How many votes we need to get fix this error? My company is considering seriously to buy MS Office licences because of this error.
"My company is considering seriously to buy MS Office licences because of this error." Free solution available: http://extensions.services.openoffice.org/project/CT2N Want some enhancement? Sponsoring welcome ;-)
No, that extension isn't sufficient. When one of my business partners sends me an Excel spreadsheet by email, I need to be able to simply double-click it and trust that when it opens, the numbers will be the same that he used. Having to run a program or the like just in case he used text instead of a number does not cut it. And as long as this bug isn't fixed, only Microsoft Office satisfies that need. That is why I proposed earlier to remove support for .XLS altogether - broken support like this is worse than not supporting Excel at all.
Created attachment 58219 [details] Example of Excel calculation bug
I tend to agree, this is really an excel bug and the question has to be Do we reproduce excel bugs in order to be compatible? If you look at the attached Excel example you can see some of the weird results that can be obtained. I have no idea what the perceived difference is between column C and F!?! Andrew
@ kkeane, Sat Nov 22 2008: OK, you want a program that does exactly the same as Excel, without doing any extra click... hmm then I have the prefect solution for you: Excel :-) And it makes no sense if I or someone else thinks you could consider another choice. There nothing more that I can do. Sorry.
Yes, that would be my conclusion. In summary: If you need Excel, use Excel As for OpenOffice - just create a GREAT spreadsheet Andrew
@ cornouws: OK, you want a program that does exactly the same as Excel, without doing any extra click... hmm then I have the prefect solution for you: Excel :-) Ummm... Have you seen a version of Excel for my Ubuntu distribution? I've come to the same conclusion you have; if I need Excel, I will have to switch back to Microsoft. Unlike you, I'm not on a quest for an ivory tower "perfect spreadsheet" but rather for a program that solves my and my customer's real-world business problems. I suspect that is why Microsoft still rules the desktop. One thing that I don't quite understand is: if OpenOffice really does not even want to be Excel-compatible, why is there an Excel import filter in the first place? And why does the ad they insert into the Java JRE installer emphasize what Microsoft Office compatibility that it can't deliver?
1) Excel runs very well under Wine on Linux. 2) Windows rules the desktop simply because the vast majority of people use it. If you need a spreadsheet for everyday use, then OpenOffice is fine. If you need a spreadsheet that is 100% compatible with Excel then Excel is the one for you. 3) OpenOffice makes a very good effort at importing Excel spreadsheets, but its is definitely not "compatible" at the higher levels, and as I say above, it is better not to import complex excel spreadsheets and then write them back in excel format. You will often lose something in the translation. Andrew
My opinion on the question whether Excel's bugs/misfeatures should be emulated by OOo is that 1) OOo should implement "proper" behaviour. "Proper" in my opinion means not silently producing unexpected (by the user) results. Simply treating text as 0 doesn't fit that description, because that will definitely cause results unexpected by the user. 2) When OOo imports a spreadsheet from Excel, it should strive to produce the exact same computational results. Are the 2 requirements contradictory? No, they aren't! I think the handling of differences between Excel's treatment of strings in computations and OOo's treatment of such strings should be done in the import (and export) filter. The import filter should scan formulas for constructs that will be interpreted differently in OOo and should translate them appropriately. This may require the introduction of new mathematical functions such as DO_WHAT_EXCEL_DOES_WITH_THIS_VALUE_WHEN_ADDING_IT_TO_SOMETHING_ELSE that can be wrapped around offending arguments, but this is trivial (and has the advantage that the export filter can easily kick them out again). And it will allow OOo to implement its own computational behaviour independent of Excel's.
Thanks cornouws, this extension is very helpful. We need more votes to get a change. This issue is supposed to be fixed by now, but it is not (http://wiki.services.openoffice.org/wiki/2008_Q2_Review_of_Spreadsheet_Project). clemare
For all needing 100% Excel compatability: make sure you have all the same version of Excel as well as the same CPU. Both are known sources for differences in more complex spreadsheets ...
discoleo->cornouws > For all needing 100% Excel compatability: make sure you have all > the same version of Excel as well as the same CPU. Both are known > sources for differences in more complex spreadsheets ... This is the kind of comment that doesn't bring anything useful to this issue. By the way, if you really want absolutely identical spreadsheet results, be also warned that different versions of OOo Calc, and gnumeric and probably any other major spreadsheet beyond MS Excel will yield different results, so you NEED to stick to exactly the same version of the same spreadsheet on the same processor. [Yes, a SPARC will handle floating point very different than most MIPS processors.] The problem however is very different than EXCEL compatibility: it is really the broken philosophy in Calc, that allows to compute WRONG results, without notifying the user. So, in the end, the user might well be unsuspecting that his calculations are utterly wrong, and might be so catastrophically wrong. To paraphrase a different user: > The question, also, is not “does OOo behaviour really cause more > real problems than incompatibility to Excel.” The incompatibility > already exists as Issue 5658 shows. Both spreadsheets do illegal > operations, *they just do them differently*. > > The real question is, “Do we want to have a spreadsheet that we know > is faulty, where the faults may be very serious and potentially > life threatening. So, the real issue is that Calc allows me to do: = 1 + '1,000,000 and get a "1" and have NO feedback that the result is probably utterly nonsense. The unsuspecting user will likely miss this wrong calculations. How often did you look into a 10,000 rows spreadsheet if some numbers are actual text? Be warned that Calc often imports csv as text. You end sometimes even with a number transformed to text. Format e.g. a column as text and enter now numbers. They end up as *strings*, and any mathematical operation is broken. You may well miss this on a foreign spreadsheet, and even inspecting the cells will devoid you of any useful information, because the apostrophe in front of the text is missing (for cells formatted as text - how often did you check that the cell was previously formatted as text?). And have you thought of a 50,000 rows spreadsheet. I already work with >100,000 rows (well, obviously not in Calc). What chance do I have to find the faulty string in this spreadsheet? [see e.g. http://www.openoffice.org/issues/show_bug.cgi?id=85328 - I was well unaware that various cells were text - and this was a happy example that I noticed is wrong. !4! different users beyond me did NOT recognize that the cell was actually text.] I would bet, NONE. And on a different note: THERE IS NO SUCH STRING AS 1!!!!!!!! If you mean the string 'one' than this is different from the number '1'. There is only one number '1' and NO string '1'. This is purely a programmers distinction with NO real backup in linguistics. '1' is a number. Basta. Not to confuse the issue with '1 is not a number'. This is indeed a composite string, but simple numerical symbols are always numbers. I already described more advanced ways to handle string-input in a previous post: http://www.openoffice.org/issues/show_bug.cgi?id=5658#desc123 Hope this clarifies some issues.
I think mux2005, discoleo et al. are really missing the basic point here. Excel is inconsistent in its treatment of text and numbers. I would classify that as being a bug, and we do not need to reproduce that in OO. Andrew
@discoleo: saying "excel computes WRONG result" is like saying "I am RIGHT about this" - you aren't winning any friends by asserting your position in caps. @fdservices: What do you think about the inconsistency that I pointed out earlier, that =MID(A1;1;LEN(A1))+0 does arithmetic on a string value? Is this ok because the formula intentionally does arithmetic on a string result, whereas doing it on a cell is not intentional?
I think that it is just another example of Excel muddle headed formula evaluation. I realise that there may be no "right" answer here, but it would be nice to have some consistency. Andrew
@fdservices: Excel is inconsistent in its treatment of text and numbers. I would classify that as being a bug, and we do not need to reproduce that in OO. Not quite. If you import Excel files, then Excel's behavior is the gold standard. It doesn't matter if Excel's behavior is logical, illogical, serpentine, obscure, or even outright bizarre. It doesn't matter if you think there's a bug in Excel - I might even agree with you on that, but it's plain irrelevant. If OO doesn't do the same thing, it is a bug in OO's Excel import filter. As this discussion shows, many people got burned by this bug already, so it's not just an academic discussion. I've been a software developer for 20 years, and I also run a business. One thing I have learned is that customers don't care about the perfect anything - they care that the job gets done. I've seen software developers get fired for implementing a technically elegant solution that didn't meet the customer's needs. Whether that matters really depends on your goal. From an academic standpoint, if you want to build the best possible spreadsheet as a research project, you may be absolutely right. If your goal is to provide an alternative to Microsoft Office - well, then arguing "I'm right and Microsoft is wrong" is not going to earn you much market share. It's more like a famous last word.
"If OO doesn't do the same thing, it is a bug in OO's Excel import filter" - can't disagree with that. As for whether one can produce a better spreadsheet than Excel, I think that there is a crying need for a spreadsheet that works properly, and does not produce the nonsensical discussions that we are having here. I cringe at some of the points raised above. I get frustrated by people using the space bar to erase figures and then getting #VALUE errors in formulas. I am note even sure myself how Excel will treat numbers and text in different circumstances. (see the example I uploaded before) Andrew
*** Issue 93204 has been marked as a duplicate of this issue. ***
I just run into this issue in a spreadsheet which was giving bad results. Nor me or anybody in the office was able to troubleshoot it. I initially thought we've hit a bug. Then, only with help of QA team, I found out it is actually a feature. Arithmetic operations of number and text should yeild either number (like Excel does) or an error, if you want to be puritan. Not 0, which could confuse people and lead to false assumptions, as it actually does now in real-life scenarios. Therefore I think that it is a great (un)usability issue here.
>ondrej_suchy says: >Arithmetic operations of number and text should yeild either number (like Excel >does) We have seen above that Excel does not always give either a number or an error, it depends on the formulas used. >or an error, if you want to be puritan. Not 0, which could confuse people and lead to false assumptions, as it actually does now in real-life scenarios. The option is to treat text as 0, not to give 0 as the result of a formula containing text cells. I would be interested to see a scenario where the inclusion of text in a series of calculation, gives an answer that leads to false assumptions, if the text is treated as 0 in the formula. The only time that this will occur is where a number has been specifically entered as text, which, I admit, is something which needs to be handled e.g. by flagging the cell as suggested above.
Any cell that contains a formula such as =MID(A1;2;999) to trim off the leading £ or $ symbol will behave like this, treated as zero in a calculation. If you're opening an Excel sheet that does this, you'll have to either change it to =VALUE(MID(A1;2;999)), or =MID(A1;2;999)+0. I have a spreadsheet that brings in values from another sheet that I import from a text file that is scraped from a web page, and sums them up with proper numeric cells, and the text values were not being added in. I had to add +0 to all the formulas to make them work in OOo. This goes back to my prior suggestion - if =MID(X;Y;J)+1 works, why doesn't it work when the +1 is in another cell?
>philhibbs says: >Any cell that contains a formula such as =MID(A1;2;999) to trim off the leading >£ or $ symbol will behave like this, treated as zero in a calculation. Since MID() is a string manipulation formula, it will result in a string, which you would need to convert to a value, if that is what you want as a result. >If you're opening an Excel sheet that does this, you'll have to either change >it to=VALUE(MID(A1;2;999)) Exactly right >I have a spreadsheet that brings in values from another sheet that I import >from a text file that is scraped from a web page, and sums them up with proper >numeric cells, I do this all the time, even with text extracted from a pdf file, and it works fine (can't do that with excel!). I find this subject very interesting, can you post an example of the text file that you are trying to import?
*** Issue 101306 has been marked as a duplicate of this issue. ***
There's a post on an MSDN blog explaining how this issue is responsible for Microsoft's well-publicized lack of interoperability with OOo. http://blogs.msdn.com/dmahugh/archive/2009/05/09/1-2-1.aspx
Thinking a little more about that MSDN blog entry, there could be a way out. When Excel saves its formulae in an ODF file, it puts them in the msoxl namespace, as the standard puts it, “a namespace prefix specifying the syntax and semantics used within the formula.” When OOo imports an Excel spreadsheet, it should also import the formulae into the msoxl namespace, and if OOo wants to have "Excel Compatibility" on its feature list, it should process msoxl namespace formulae with automatic string conversion. The alternative is tantamount to saying "The standard allows for different semantics, but OOo will only implement our own semantics, not anyone else's".
How about implementing the following things: 1. If the user enters a formula that contains arithmetic on text cells, then add explicit conversion functions to it. I.e. the user enters "A1+10" and A1 contains text, then when the user submits the formula, write "convertonumber(A1,...) + 10" into the cell, where converttonumber is a conversion function (I don't what the actual function in Calc is called, assuming it exists) that accepts a parameter that specifies the number format. The number format parameter would be filled with the format appropriate for the current locale. That way, we fix the user's mistake and create a portable formula that will work consistently even when the spreadsheet is ported to a different locale (because the conversion is completely explicit). It may be useful to introduce an informational popup message such as "The formula you have just entered tries to perform arithmetic computations on string cells. Appropriate string -> number conversion functions have been automatically added to the formula". I would advise against making this a "Do you want this?" choice dialog, because there is no sensible reason for the user to reject this transformation. 2. When opening/importing, fix all formulas with string arithmetic according to 1. Since we do not usually know the creating environment (creating program, locale,...) and as such cannot with 100% accuracy predict the proper number format to use for conversions, a dialog such as the following would be nice: "The document you are trying to open contains formulas that try to do arithmetic computations with strings. OpenOffice.org can attempt to guess the proper text -> number conversion parameters. However, this may lead to incorrect results in affected cells. If you choose 'No' below, then all affected cells will be flagged with an error and you will have to fix them manually. Do you want OOo to attempt to fix cells automaticall? Yes/No". If the user chooses "Yes", introduce conversion functions as in 1, making a best effort based on whatever information we have about the document to guess the conversion parameters. If the user chooses "No", treat the affected formulas as if they had syntax errors. -------- This approach will - prevent the user from creating documents with implicit string arithmetic (ambiguous or undefined behaviour) - allow the user to open/import documents based on implicit string conversions and have (most of the time) a working spreadsheet (that no longer contains implicit conversions) - allow the user to open/import documents based on implicit string conversions and go the safe route of flagging all problematic cells as error for manual inspection and fixing.
Interesting idea, but it won't fix =SUM(A1:A10)
actually... I'm not sure if Excel will SUM() text values, Google Docs doesn't. Maybe SUM() doesn't need fixing for interop - I still think it should, but I don't think I'm going to win that one.
The above won't work for you, since you can enter a fomula before entering the cell value to which it refers, As far as I know the Go-oo (Novell) version of OpenOffice does what you want, converting text into numbers wherever possibly. I find that this makes spreadsheets less usable, not more, but then the choice is yours. Just a word of warning, neither version is Excel compatible. Both will read and write Excel spreadsheets, but you need to be careful since they cannot handle all the Excel content (Active Components, attached scripts etc.) I still do not know why we need to encourage the development of an Excel clone instead of an Excel beater! Andrew
The marketplace is littered with "better" products that failed because they didn't do what people needed. Excel won because it *was* Lotus 1-2-3 compatible. Ever heard of Next computers? Linux on the Desktop? Even a multi-billion dollar marketing budget didn't save these products. The choice isn't between Excel Clone and Excel Beater, but between Excel Clone and Edsel Beater.
Wow, that is some tirade! Lotus was developed before Excel, and was the spreadsheet that Excel had to beat. Hence the Microsoft mantra "If it runs Lotus it is not MSDOS 6". If Excel was Lotus compatible, then why include a "Tools>Options>Transition>Transition Formula Evaluation" option? Visicalc, Supercalc, Lotus and Excel all have their strengths and weaknesses. Let's try and improve on them all. Andrew
As the originator of this bug, I've been watching this argument for a few days shy of seven years. It's been interesting. Why don't you ask some users what they think about "Excel compatibility" vs. "we think *this* is *better* for you". I'd put $50 on "Excel compatibility" being the overwhelming winner. But try it and find out. "We *think* this is *better* for you" is fine sometimes, but if you're driving acceptance of an office suite by typical users who just want to get their work done, then surprises are generally not going to win you customers. And if the surprise is a wrong mathematical result that's almost impossible to catch... well, that's never welcome. Good luck!
Andrew, here is an easy experiment that may settle this once and for all. It's called Market Research. Implement two versions of OpenOffice. One with all your improvements and only supporting its own file format, and one that can open XLS files, but correctly. See which one people will download.
Hi alcohenma, I have never, and would never, enforce a "I think this is better for you" policy. I am simply putting forward an alternative to Excel. Excel has already proved to be the winner, but, in my opinion, through top notch marketing and (allegedly) underhand practices. Again in my opinion, it was never the best. Your really cannot use the term "wrong mathematical results" when referring to performing a mathematical calculation on a cell which has been deliberately entered as a "text item". It is true that the result will be different, but by no stretch of the imagination can it be "wrong". Imagine your bank balance if you add in the Account Number to the total. Pay off your insurance premium, but don't add in the Policy Number before you pay it. As to "impossible to catch", check out the facilities in Openoffice for Value Highlighting and Mark Invalid Data. Hi kkeane, I think that we actually have all of this; OpenOffice with the "improvements" and reading tolerably well Excel files. Go-oo without the "improvements", also reading Excel files to a reasonable extent, and Excel. To restate my point: 1) I do not want formulae to fail because there is a text string in one of the dependant cells (OpenOffice 1 Excel 0) 2) I do not want my bank account number added into a column of figures, even if I have (correctly) formatted it as text. (OpenOffice 2 Excel 1) Yes, Excel does not work the way you want with the SUM() formula! so: 3) I do want the result of calculations to be consistent, whichever formula I choose to use. In an example spreadsheet OpenOffice got the answer the same in 8 cases, 2 resulted in ERR:502 and 2 were different. The same sheet in Excel produced the same result in 6 cases, 4 resulted in #VALUE and again 2 were different. The example was worked using simple SUM(), +A1+A2 and VLOOKUP() formulae on a range of cells, including the most common user input errors. On a more complex calculation including a VLOOKUP() formula which returns a blank cell if no entry is found, Excel refused to give an answer for both SUM(+A1,-A2...) and +A1-A2... whereas OpenOffice gave the "correct" answer in both cases. Moving on to a previous point, can you explain why Excel included a Transition Formula Evaluation option if it is never needed? Personally I thought the latter was a good option. Add an option which allows the user to treat text as numbers wherever possible. The suggestion was rubbished by the developers when I posted it. One final thought, how do users input numbers into a spreadsheet, format them as text, and then align them right without realising it? Ah well. Andrew
@fdservices: how do users input numbers into a spreadsheet, format them as text, and then align them right without realising it? One step at a time, possibly performed by different users each building on the work of previous users. Issue 101471 has been created to discuss the handling of the msoxl namespace, which is related to this issue.
Best give your users some training then :-) Andrew
On a recent engagement with a local authority in the UK to move from Microsoft to an open source software stack, this very issue was on the list of reasons not to move. They decided not to in the end. I don't know how big this one was in the decision. Training was not an option, since they were dealing with a large number of legacy documents, a few of which would change behaviour in OOo.
Andrew, apparently, you never worked in a large company. Unless a CxO is a major OpenOffice supporter or Microsoft hater, if you tell them to retrain users for such an issue, the answer will be "you are fired for putting OpenOffice on our computers in the first place. You should have bought MS Office all along!" Then he'd write a memo putting OpenOffice on the banned-software list. And he'd be right. A single incident of such a problem can easily cost far more than a copy of MS Office for everybody. In any case, training users is not an option for me. Me lone consultant am not going to tell a multimillion dollar company to retrain all their users AND to review every single Excel spreadsheet companywide (because more often than not, users pass spreadsheets around and copy them, rather than using templates) for an issue that 99% of the recipients never have because they use Microsoft Office. Me lone consultant WILL tell my customers that OpenOffice Calc is dangerous to their bottom line.
Well, I happen to think that training is an important activity in any company, and training in the correct use of software is equally important. I admit that this is often a neglected area which gives rise to many inaccurate, inefficient and ineffective spreadsheets, and, consequently, wrong results. Just for the record, I was that CFO in a (several) large company, and I am also now a consultant to several companies, so you may assume that I have some small amount of experience. I have never come across the problem defined here (numbers entered as strings by accident) but I have often come across the problem of having to enter hugely complex formulas to avoid #VALUE errors when Excel tries to include a string in a calculation. Fortunately, as a fairly competent Excel user, I can get round the problem, but I still do not understand why you are adamant that I should have to! For this and other reasons, I want OpenOffice to succeed, and to be a better spreadsheet! I disagree that OpenOffice calc is dangerous, except to the extent that ALL spreadsheets are dangerous if the user does not take reasonable precautions to ensure he is getting accurate results, i.e. the spreadsheet is a tool and the user is responsible for the results of using that tool. I do agree that it is not Excel, it does not have the same bugs/features that Excel has,it cannot correctly exchange files with Excel at present, and it cannot therefore be used as a direct replacement for Excel. But if we all work hard at it and encourage its continuing development, it may well become the standard in time, and we can all finally dump Excel with a big sigh of relief. Andrew
> I have never come across the problem defined here. I do have. One of our suppliers sent an excel sheet with proposal. I reviewed it (in OOO) and approoved seeing that the totals were the cheapest of all the offers. Already later when the agreement was signed it turned out that totals in the same spreadsheet differed for me and him. After some ponderig I discovered that some of the items in the spreadsheet were formed as text, apparently to make them look the right way or whatever. And as his excel calculated all the numbers correctly, he was not aware of problems that will cause for me. And as my OOO did not hint in any way that it is going to ignore some numbers-formed-as-text I did not catch the error. Those who suggest retraining users -- please return to planet earth. You can't retrain everybody even if you wanted to. Spreadsheets originate from so many sources that you simply do not control. In fact relying on the user to enter data correctly in the system is wrong approach anyway. Any system should be prepared to receive all kinds of sh*t thrown at it and handle it correctly. To those who preach purity. Fine! But then do not ignore numbers formed as text silently. Either pop ut a notice when opening the file or put appropriate error notice in the cell that attempts to use numbers-formed-as-text. Silent ignoring is the only wrong approach!
Karlis, I think in this bug report we have seven years worth of evidence that this problem is *extremely* common even if Andrew personally may not have come across it. Which makes me wonder why he even cares about it one way or the other? I actually found the bug exactly the same way you did: a vendor sent me a proposal that looked like the cheapest. In my case, it wasn't for my own business but for a customer - which means that I could have gotten sued over it, and it could have put out of business. I'm not even sure if an E&O insurance would cover such matters. Fortunately, I caught it before placing the order. This vendor was a large company, and I'm sure they are sending out dozens of these spreadsheets every day. You are absolutely right. Retraining users is an absurd suggestion. And it wouldn't even help. That horse has left the barn long ago. You know the old joke about how God could create the world in seven days? He didn't have to worry about an installed base. Most likely, they have a sales team of maybe 20 people. Each of them probably keeps a copy of each spreadsheet sent out, and when a new prospect calls, pulls up one of the old spreadsheets, changes the numbers, and resaves it. Maybe they are using SharePoint or something like it to manage the large number of spreadsheets - who knows. He certainly will NOT go back and spend hours double-checking every single cell of a 5-page spreadsheet with 3000 rows and 30 columns - even less so when he sees with his own eyes that the total is correct. And since he can use any of thousands of previous versions of this spreadsheet as a starting point, cleaning up all the places where this "error" (which isn't an error at all from Excel's perspective) lurks is just plain a ridiculous proposal. Add to that the problem of personnel turnover. In many companies, sales people last maybe six months. Do you really seriously propose that each of these people be trained in such all such subtle issues that doesn't even affect the software that they and 99% of the world is using? On the first day on the job? With that kind of turnover, training is going to be 2 hours of "here is the phone, here is the price list, and for the proposals just ask John over there to send you a copy of the spreadsheet he did. Now get to work, and I want you to make 20 sales in your first week!" And training for long-term employees? Maybe after a year on the job they'll be sent to a one-week Excel class taught by a MOUS (Microsoft Office User Specialist). Most likely somebody who has never heard of OO. Training people? Come on, get real.
I'm interested to hear any objections to working around this issue by implementing Issue 101471 in the way that I have suggested there - by implementing support for the msoxl namespace with Excel semnantics, and additionally bringing imported .xls spreadsheet formulae into the msoxl namespace.
I think I may actually have stumbled upon WHY this problem occurs in so many Excel-generated spreadsheets. Turns out that it no amount of training would solve it, because Excel actually makes the change automatically. In some cases, users actually have to go out of their way to NOT enter numbers as text. Actually, there is at least one, and probably two more ways this happens: users may intentionally do it for various reasons. And it probably also happens when you import data from CSV and other formats, although I have not confirmed that. Try this (I tried it in Excel 2007). It is important to do it in exactly this sequence. open a blank spreadsheet. Do not touch the format of cell A1 (it should be formatted as General) Type the number 7 into cell A1 Change the format for cell A2 to "Text" Type the number 6 into cell A2. Make sure you type JUST the digit - no quotes or anything else. Note that the number six will be left-aligned due to the text formatting. Note, but do not change, the formatting of cell A3. It should be "General" Type the formula "=A1+A2" (without quotes) into A3. You will see the number 13, left-aligned. Note the formatting of cell A3 again. You will see that it has changed to "Text" Save the spreadsheet as XLS file. Open the spreadsheet in OpenOffice. Voila. Remember: you haven't entered a single quote in Excel, only digits. Yet just based on the formatting of the cell, Excel treats the 6 that you typed into A2 as a string. Even if you later change the cell format to "General" Excel will keep the number as text! In the real world, this is very likely to happen to parts of a spreadsheet by accident. Imagine a spreadsheet with 10 columns. You format each column appropriately for the data it should hold. Later you insert a new column between two columns that are both formatted as text. How likely is it that you'd realize the new cells are also formatted as text? Maybe you'll realize it an hour later when the formatting is off. But it's too late: once you enter data into a text-formatted cell, the data stays a string even if you later change the format. The second way this probably happens: it may also be an intentional "trick of the trade" of experienced Excel users: "you can save yourself a lot of work formatting the cells by typing in the numbers in the correct format, if you start it with a quote". I came across http://excel.tips.net. Several of the tips listed there deal with this and a few related issues, including a question "If you have a range of numeric values in your worksheet, you may want to change them from numbers to text values. Here's how you can make the switch." (the answer: copy the cells to the clipboard, format the cells as text, paste the data back in - exactly as I outlined above).
Since this issue is messed up by disfunctional comments anyway, I might ad some lines: > working with OOo support/migrations for 5 years (full time business) > only had one real case complaint about this issue > the issues I see because people don't know what they do with their office suite / documents, are hunderd folds ... > yes, it is important, so I created the CT2N extension
I think I now understand the problem better. If, in the real world, you are trying to use OpenOffice as an Excel substitute, then I recommend that you do not. As I have said time and again, "If you need Excel, use Excel". OpenOffice is not compatible in many ways. Also Excel is inconsistent in its evaluation of formula as (I think) we all know, so absolutely: Use Excel to read Excel sheets. @karlis - "To those who preach purity. Fine! But then do not ignore numbers formed as text silently. Either pop up a notice when opening the file or put appropriate error notice in the cell that attempts to use numbers-formed-as-text. Silent ignoring is the only wrong approach!" - I agree. @kkeane - "Which makes me wonder why he even cares about it one way or the other?" - for the reasons specified e.g. assuming you can accept this as a proper formula "=IF(ISERROR(VLOOKUP($A6,'Path to Workbook\[Workbook.xls]Sheet1'!$A$1:$F$500,5,FALSE)),"",VLOOKUP($A6,'Path to Workbook\[Workbook.xls]Sheet1'!$A$1:$F$500,5,FALSE))" then this formula will be needed to test whether a result is Debit or Credit and then add and subtract figures across a sheet correctly "=IF(SUM(E6,G6,I6,K6,N6)-SUM(F6,H6,J6,L6,M6)>0,SUM(E6,G6,I6,K6,N6)-SUM(F6,H6,J6,L6,M6),"")" If you want a detailed explanation of why this should work and nothing else, I can give it to you, but I wish such complexities were not necessary. "Training people? Come on, get real." - Oh dear! @philhibbs - seems like a sensible suggestion. Treat Excel created formula the Excel way. To repeat myself, this is similar to the Transition Formula Evaluation option in Excel, but executed automatically. I think, however, that there should be a warning issued when such a sheet is opened to prompt the user that Excel rules apply. @kkeane - you seem to have discovered another Excel "feature"! Importing csv and text files works fine, you can even specify the format of each column imported, but it is not usually necessary since Excel makes a creditable guess in each case. "Even if you later change the cell format to "General" Excel will keep the number as text!" - you are right again, yet another "feature"! "you can save yourself a lot of work formatting the cells by typing in the numbers in the correct format, if you start it with a quote". - This, from an experienced user, beggars belief! There have been a number of extremely sensible suggestions made in this thread to achieve backward compatibility with Excel formula evaluation: 1) Automatically mark cells with text content which will evaluate to numbers. (this can be done in OpenOffice already, but it is not automatic) 2) When opening an ".xls" file, warn the user that this is an Excel type file and then use transition formula evaluation as per philhibbs suggestion. But, what is the correct method of then saving the ".xls" file in a different format, and what would happen if you chose to save an OpenOffice file in ".xls" format? may be the kiss principle is best... 3) have a Transition Formula Evaluation option which is the reverse of the Excel implementation. If you want Excel formula evaluation, tick the box, if you do not, leave it clear. Andrew
I totally agree with karlis. Whoever said that OOO is superior because bla bla bla, is wrong, not because OOO is wrong, but because the compatibility issues like this should be warned to the user. Besides there are so many users that can't understand how to use the excel correctly, all of them work outside my company, but sent us excel files with this error in them. I vote to put a warning message to the user indicating that there are cells with numbers involved in formulas that are formated as text, and then put a little mark in every cell (or painting then with other color or whatever) so the user can correct them. clemare
Andrew, ---- If, in the real world, you are trying to use OpenOffice as an Excel substitute, then I recommend that you do not. As I have said time and again, "If you need Excel, use Excel". ---- I know you have said so before, but I find this comment very strange because OpenOffice is primarily and explicitly *advertised* as a Microsoft Office replacement. You see the banner every time you install Java, for instance. If this statement really was true, then the solution actually is very simple: remove the Excel support altogether. According to your statement, it's not needed anyway, and it is broken anyway. But if you leave the Excel support in, then it absolutely *must* work. Reliably. Every time. Re. the second comment: more importantly, I answered philhibbs' question about "how users do it without realizing" I identified what to my knowledge nobody had done before: why this issue arises so frequently in the first place. You can call it a feature or a misfeature or whatever you like. What bothers me most about all this isn't actually the bug itself. What bothers me is the absolute willful unresponsiveness of the OpenOffice team to a very obvious user demand - with the seventh anniversary coming up in a few days - just based on some ivory-tower idea of "Excel is wrong". And then people wonder why Linux on the Desktop isn't taking off.
One more thought. It is *entirely* possible that Microsoft has done this intentionally in order to break OpenOffice compatibility. It would be very much in line with their strategies in many other cases. And right now, OpenOffice is actually Microsoft's marketing for them by letting this issue fester for so long.
Friends of the Italian Opera, would you PLEASE STOP discussing off-topic in this issue? Thank you. This issue will be addressed, but adding non-technical comments just made it grow beyond recognizability. No one will read 203 comments.
ER - given that this issue has its SEVENTH anniversary coming up on Monday without a resolution, I have a very hard time believing that it will be addressed. Even more so since quite a few team members seem to have dismissed this issue or even explicitly stated that it should *not* be addressed. What you are seeing is the resulting frustration from the user community.
@ kkeane: then how much tips, carrots, sticks, suggestions, ... did the frustrated part of the community give on the CT2N extension, even after being invited to do so ? Sorry, but this gives an impression, which may not be intended, but that for me definitely is not positive.
Good point. To be honest, I hadn't even heard about CT2N until you mentioned it yesterday; I all but gave up on OpenOffice and am just following this in case this ever changes. This is one of the two main issues that are keeping me from switching to Linux on the desktop (the other one is the lack of good accounting software). I'm also a bit reluctant to rely too much on extensions for bug-fixes. At some point, you run into software management and update-compatibility issues. Another concern I have is that a user wouldn't necessarily KNOW that he'd need to take the CT2N action. All he would notice is the total of the spreadsheet, and if he is lucky he'd notice that the numbers make no sense. But I'd still be very interested to learn more about CT2N as a stopgap. Is it supported on all platforms?
@ kkeane: Thanks for your reply. I think first time it was mentioned here was http://www.openoffice.org/issues/show_bug.cgi?id=5658#desc82 More info to find from there.
If you want a change: VOTE FOR THIS ISSUE!!!!
@cornouws: "... did the frustrated part of the community give on the CT2N extension, even after being invited to do so ?" An extension is ok for individual use, but not corporate. Large organizations are resistant to anything non-standard, and for good reasons if they are expecting to exchange documents with other orgs that might not use the same extension (and therefore will misinterpret their documents, possibly causing financial loss and lawsuits). Either this issue gets resolved in the core application, or... well, there is no "or", that's my point.
Issue 103318 created to propose handling of alternative semantics in different namespaces (e.g. msoxl).
*** Issue 85242 has been marked as a duplicate of this issue. ***
Grabbing issue.
Working on this.
In cws odff06: revision 275712 sc/source/core/inc/interpre.hxx sc/source/core/tool/interpr2.cxx sc/source/core/tool/interpr4.cxx Converts string content to numeric value, or sets #VALUE! error if no unambiguous conversion is possible. Converted are only integer numbers including exponent, and ISO 8601 dates and times in their extended formats with separators. Anything else, especially fractional numeric values with decimal separators or dates other than ISO 8601 would be locale dependent and is a no-no. Leading and trailing blanks are ignored. The following ISO 8601 formats are converted: CCYY-MM-DD CCYY-MM-DDThh:mm CCYY-MM-DDThh:mm:ss CCYY-MM-DDThh:mm:ss,s CCYY-MM-DDThh:mm:ss.s hh:mm hh:mm:ss hh:mm:ss,s hh:mm:ss.s The century CC may not be omitted and the two-digit year setting is not taken into account. Instead of the T date and time separator exactly one blank may be used. If a date is given, it must be a valid Gregorian calendar date. In this case the optional time must be in the range 00:00 to 23:59:59.99999... If only time is given, it may have any value for hours, taking elapsed time into account; minutes and seconds are limited to the value 59 as well. NOTES: The conversion is done for single arguments, either referenced cell content or inline string, cell range arguments are not affected, so SUM(A1:A2) now will differ from A1+A2 if at least one of the cells contain a convertible string, as in Excel ... Because the conversion is also done for formula inline strings, such as ="1999-11-22"+42, calculations involving inline localized dates now return an error, e.g. with "11/22/1999" or "22.11.1999".
Created attachment 64497 [details] testcase for the fix
Reassigning to QA for verification.
Missed the formula cell case, revision 275776 sc/source/core/tool/interpr4.cxx
Created attachment 64535 [details] testcase now with formula cell results as well
Are leading and trailing spaces stripped? " 17" + "4 " = 21 Can cells with spaces be calculated? " " + 21 = 21 " " - 21 = -21 " " * 21 = 0 " " / 21 = 0 Can I get a sum on values with spaces? A1=1; B1=" "; C1=SUM(A1:B1) Thank you, Tom
Also, Excel will interpret all kinds of numeric data, such as 100.000,00 or $75.99. I assume that they resolve ambiguities (is 10.000 10 or 10000?) based on the locale. Not sure if it's the locale of the OS, of Excel itself or whether the locale is stored with the spreadsheet.
@tomvigl: Just READ my comment #desc215 Wed Sep 2 14:14:06 +0000 2009 And yes, an empty or blanks only string currently evaluates to 0. @kkeane: Up to Excel2003 the result depends on the locale of the Regional Settings, in Excel2007 it seems always en-US separators are assumed, but I didn't investigate deeper yet.
verified in internal build cws_odff06
*** Issue 105427 has been marked as a duplicate of this issue. ***
*** Issue 106944 has been marked as a duplicate of this issue. ***
*** Issue 107375 has been marked as a duplicate of this issue. ***
I have never used Excel, but my OpenOffice 3.0 spreadsheets now show "#VALUE!" all over the place because some arrogant wingnut decided that his medical spreadsheets were important than all the existing OOo spreadsheets out there. I'm off to search for FOSS spreadsheet software for gnu/Linux that does not drastically and arbitrarily change functionality from one minor version to the next. Thanks for nothing OpenOffice folks. Good luck keeping your user base. You'll need luck since you certainly don't have any respect for us!
finally OOo work well with numbers inside a string. Good job.
@jtienhaara: you can better use version 3.2.0 or 3.2.1 That should work in your situation. And for true converting text numbers to real numbers, use the extension CT2N.
@jtienhaara: can you attach an example that this change breaks?
Created attachment 69985 [details] Example: notes in fields, +/- treated text as 0 in OOo 3.0 but not in 3.2
Created attachment 69986 [details] Screenshot: OOo 3.0 / notes_in_number_fields.ods
Created attachment 69987 [details] Screenshot: OOo 3.2 (Windoze) / notes_in_number_fields.ods
@philhibbs: Thanks for taking the time to look at this further. I've attached a small example, notes_in_number_fields.ods, as well as screenshots from OOo 3.0 (on CentOS) and OOo 3.2 (on Windoze). N.b.: I'm aware that SUM() and PRODUCT() can be used instead of +/-. But that is not really a viable option in my rather bloated, 3 year old family finance spreadsheet... As with most spreadsheets, it has taken on a life of its own.
jtienhaara: This is a bigger issue than someone deciding "his medical spreadsheets were important than all the existing OOo spreadsheets out there". This is an Excel compatibility issue, and I understand that some people don't care about the 25 year legacy of the most widely-used piece of business software ever written, but OOo can't afford to ignore it. There's a hard choice to make here - do you silently treat a badly-formed number (e.g. "l2345") as a zero, or do you highlight it as an error to the user? I'm currently on the fence. My instincts are to go with the patch for this issue, and highlight it to the user (that's what Excel also does). That it breaks your spreadsheet is highly unfortunate, but I'm unsure whether it's a serious enough backwards-compatibility bug to revert this aspect of the behaviour (and treat non-numeric strings as zero rather than raising #VALUE!). In any case, I think it needs to be raised as a new issue for consideration, as this one is closed.
It is nice to see that others are putting their minds to this issue. Excel is the standard, but that does not, per se, mean that it cannot be improved. I, for once, would like to persuade Oracle that it is worth developing an office product which is superior to the MS offering. The problem which would need to be addressed is how to make the super-office compatible with the MS office. A previous suggestion I made was to introduce options similar to the "Tools > Options > Transition" which are included in Excel 2003. (File > Options > Advanced in late editions)
All this talk of Excel irritates me to no end. A minor revision of OpenOffice (3.2) changed the behaviour fundamentally. That should not have happened. Now that it has happened, an "OpenOffice <= 3.0 text in numeric cells compatibility mode" option should at least be added into a future release. What on earth has this got to do with Excel? (^^^rhetorical question, I don't care about the whole religious debate. I use OOo and nothing else, so I don't care whether Excel and QuattroPro and Lotus Notes do things differently. I do care that 3.0 and 3.2 do things differently.) In any case a new bug WAS opened, and the opener was told to close it and vote on this issue instead: http://www.openoffice.org/issues/show_bug.cgi?id=109721
@jtienhaara: The opener of Issue 109721 was told to refer to Issue 109165, not this one.
I want to thank the OOo team for fixing this very long-standing nuisance. @jitenhaara - OOo advertises itself as a drop-in replacement for Microsoft Office. So Excel compatibility matters more than making a "better" spreadsheet program that is useless to 99% of the world. And, yes, I understand your compatibility concern. You have exactly the same problem in reverse that people in a mixed Excel/OOo environment had since 2002, when this bug was first reported. I am very glad and appreciative that the OOo development team has listened to users. For me, it came too late - I had to advise all my clients to move away from Open Office because of this problem. But it is very much appreciated anyway!