Apache OpenOffice (AOO) Bugzilla – Issue 68636
Checkbox "Save cell contents as shown" should be unchecked as standard
Last modified: 2013-08-07 15:12:27 UTC
I often work with csv files containing high-precision numeric data (10 decimals or so) generated by statistical software. When I open these files in calc it automatically rounds these numbers to 2 decimals for display purposes, while still maintaining the underlying high-precision numbers. However, when I save the file after editing (again in csv format) it only saves the rounded numbers. I believe this is caused by the calc feature "Save cell content as shown", which is by default enabled in the dialog window that appears when I select 'Save file as' -> text(csv). Unfortunately this same window only appears if I am currently working on a openoffice calc file, not with an opened csv file. I believe two things need to be changed: 1) The default behaviour of saving csv files must be to save the original data, not the 'data as shown' because this can lead to serious data loss. At the very least, this behaviour should be configurable in the calc settings. 2) The 'Export of text files' dialog window should always appear when selecting 'save file as' and selecting 'text(csv)' as file format. To reproduce: 1) open a CSV files that contains numeric data with more than 2 decimals 2) Save the same CSV file again and notice how all data is now rounded to 2 decimals.
Ok, I found that it is possible to get the 'Export of text files' dialog for opened csv files if I first select the 'edit filter settings' checkbox in the save-as dialog. Regardless of this, I still believe that the default behaviour of calc should always be to save the complete, unrounded data to prevent data loss, and make it optional to only save the data 'as displayed'.
I checked with "2.0.2 German version WIN XP: [680m5(Build9011)]" and was not able to reproduce any problem, because option "Save cell content as shown" does not exist yet in this version. I checked with "2.0.X German version WIN XP: [680m180(Build9056)]" and was able to reproduce the dataloss. Checkbox "Save cell contents as shown" is always checked, and has to be unchecked newly for each new file that is to be saved as ".csv". That's dangerous and might cause dataloss. I belieev checkbox "Save cell contents as shown" should be unchecked as standard and if wanted it should work so that it must be checked all times nely for a new "save as .csv" @hvbakel Pls. specify your OS and Platform!
@Rainer: Export as displayed is a new feature as of OOo2.0.3, see http://sc.openoffice.org/servlets/ReadMsg?list=features&msgNo=199 and http://development.openoffice.org/releases/2.0.3.html The default "enabled" was chosen on purpose because that is what most people expect, see specification http://specs.openoffice.org/calc/filters/csv/save-to-csv.odt and issue 4925. If precision of the underlying values is needed, a corresponding number format can be applied. There is a request for enhancement (RFE) to automatically display the precision according to the value, see issue 46511.
Closing.
@er: You are right, it's some more sophisticated than I thought. Saving with checked "Save cell contents as shown" can cause precision dataloss, saving without checked box can cause dataloss concerning formatting, what might be even more serious. I don't believe that there can be a really "simple" solution, so for first let it be how it is.
Ok, I can understand the reason why this option was added, but I think it is problematic to have it enabled by default. To me, the fact that OO decides to display only two decimals (by default) has more to do with visual formatting than with the actual underlying data. If you look at how the data is saved in opendocument format there is also a clear distinction between formatting and the actual underlying data. If an ods file would loose all decimals by default upon saving there would be a riot... If I consider the case where I open a csv file with a data matrix of 10x40,000 and I edit two cells and then save the data, I expect only the two cells that I edited to have changed, not to have lost all the precision in the rest of the data as well. This just doesn't make sense, especially since there is no warning whatsoever that this is going to happen. Two cells changed should mean only two changes in the file, not the added 'bonus' of having all other data modified. To be honest, I just don't see much advantage of saving the data 'as shown', because even if I save high-precision data and re-open it again in openoffice, it will again show the data formatted the same as before (there is consistency for the user when re-opening the same file, regardless of precision). So this feature only affects how the data looks when you open it in another application. In that respect, stripping all precision data by default just seems like a very bad idea. Based on the above, I'd really like to see this issue reopened
see comment above
one for the requirements team
*** Issue 68871 has been marked as a duplicate of this issue. ***
I think this bug should be re-opened. Here's why. A common use for OO (and CSV) has got to be accounting-type calculations, where everything is rounded to 2 decimal places. In this case, the default of "Save cell contents as shown" makes sense. Another (possibly less common, but nonetheless important) use is scientific, or otherwise high-precision calculations. In this case CSV export is also far more important, because that's often the easiest way to transfer data to/from custom applications. The current setup fails miserably in this regard: a) Excel, last I checked, preserves all decimal places by default. So, people who are coming from Excel are unaware of the problem. b) Data loss occurs, but it mightn't be immediately apparent. People waste a lot of time as a result. c) The checkbox is not easy to notice. An inattentive or unsophisticated user will conclude that OpenOffice CSV export is broken, and will likely go right back to Excel. d) If need full precision, the odds are that you need it all the time. Remembering to uncheck the box, and making those extra clicks, is a nuisance. So, to summarize, this checkbox is terrible for people who care about decimal places. In contrast, somebody doing accounting type math: a) (I surmise) does not need CSV export as often. b) Does not suffer a major setback when a CSV is saved with full decimal places. c) Can achieve the desired effect even without finding the checkbox: by using ROUND, by doing search-and-replace in a text editor, by hand. So, the current state of affairs royally screws a (smaller?) group of people, and gives a small benefit to (larger?) group of people. I propose a variation on hvbakel's suggestions. The save dialog should look like this when CSV file type is selected: File list [...] File name: File type: [x] Automatic file name extension [ ] Save with password /grayed out/ [ ] Save cell contents as shown (may lose decimal places) /after the initial install, this is *off*; the checkbox remembers its state between save dialogs/ |> More options This is an expanding pane, instead of the silly dialog. It is hidden in the initial install, but it remembers its state (hidden/shown) from one save dialog to the next. It contains: Character set [...] Field delimiter: Text delimiter: [ ] Fixed column width This modification addresses: a) Dataloss by default b) Discoverability of the option c) Annoyance at having to set the option to your preferred value every time I also snuck in "change dialog to a pane", but this is optional. It's certainly better for people with a reasonably large screen (1024x768). It should be usable at smaller resolutions too, provided the layout is done right.
Is this duplicate of 3687?
duplicate of issue 3687 i.e.?
Fixing 3687 would make this bug less of an annoyance. However, I think that there's an independent aspect of this bug that needs to be tackled -- data loss by default. I see two options (the first one is better in my opinion): 1) Change the default to preserve all digits (see my previous comment: Excel compatibility, harmless, easy to work around even without the box). 2) Make it READILY APPARENT that leaving this option checked will discard data. E.g.: Save cell contents as shown ([bold]precision will be lost[/bold]) I think it's necessary to warn the user that the default (currently) settings lead to data loss.
lesha, see issue 4925 - people wanted something contradictory to point (1). saving of options should be the best choice (as mentioned in issue 3687) - this would also solve point (1). 'save as shown' => precision will be lost :) but adding a warning should be easy - we need to probably ask the UI team. muthusuba
Okay, I see that there has been an explicit request to discard digits (abeelias in issue 4925). So, I don't think it's worth fighting over the default at this point, although once issue 3687 is resolved. However, once it's fixed, I think it might be better to switch to the "safer" default of not discarding digits. How is the following? This bug can be closed when issue 3687 is resolved, and when a UI warning is added.
It seems that issue 4925 was more about formatting as money and as date, which are separate data types by the way. It was a tragic decision that ordinary numbers got chopped as well. Calc basically lost with one move all the scientific community.
This issue has caused me many hours of work, especially since precision data loss is hardly discoverable. I frequently use CSV as exchange format for latitude longitude coordinates, so a truncation to 2 digits shifts all locations by a few 100 meters each. If one infrequently works with OO this always leads to data loss and requires lots of diagnostic and data recovery time. I do not see any use-case or application that would benefit from rounded CSV output data. OO Spreadsheet is a not-recommended tool for our customers until this is fixed, even though it excels in UTF-8 handling and CSV reading capabilities. Data loss without *HUGE* warning or alaert panel is not acceptable. Solutions: 1. Import the CSV with all digits shown by default. If rounding is desired the CSV should already be rounded. Make no assumptions about the imported data. 2. Export with a HUGE warning that precision loss may occur.
This also causes me much more additional work as what a normal save operation should be done in one Ctrl+S shortcut, has now become a 4 click operation with unacceptable delays in between. I'm considering rolling back to an old version that did not have issue.