Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing |
Summary: | In "standard" data type csv import should never convert items with one dot to dates | ||
---|---|---|---|
Product: | Calc | Reporter: | mfrasca <mfrasca> |
Component: | open-import | Assignee: | AOO issues mailing list <issues> |
Status: | CONFIRMED --- | QA Contact: | |
Severity: | Trivial | ||
Priority: | P3 | CC: | cno, discoleo, higuita, issues, jbf.faure, kyoshida, ooo, rb.henschel, tuharsky |
Version: | OOo 2.4.0 | Keywords: | oooqa |
Target Milestone: | --- | ||
Hardware: | All | ||
OS: | All | ||
Issue Type: | DEFECT | Latest Confirmation in: | --- |
Developer Difficulty: | --- |
Description
mfrasca
2008-04-07 10:47:33 UTC
When you are in the import dialog, click on the column head. This enables a drop-down list "column type" above. In that list choose "US English". Then the values will be imported as float in English notation. They are shown in your local later on in the spreadsheet. Using this list you can control the date format too, where the order of year, month and day differs, and you can force a content to be handled as text. It is really "worksforme". I understand that it is possible to select each of the columns containing numbers like 17.03 and 6.12, inform the importer that the column contains "US English" data, but the problem is: rule for date: dd-mm-yyyy rule for number: x,d (but on the same system M$Excel uses x.d) input 17.03 does not respect either. having this interpreted as a date is surprising... I tried with this option: Tools->Options->Language Settings->Languages->Decimal separator key even if I leave it unchecked, it still does not work for me. moreover, this behaviour is not only in the importer but also for typed data. again: really no European would expect 1.13 to mean January 13th... What Operating system do you use? What is the local setting of your OS? What GUI-language has your OOo? What is the local setting in OOo? Is in "Tools>...>Decimal seperator key" the option "Same as local setting" checked? From where do you call that csv-import dialog? File>Open, Insert>Sheet, clipboard or...? Please attach such a file, which imports wrong. I cannot reproduce your problem. All files import correct for me. Do you really chose the correct column type in the import dialog? From your example it should be "Date (DMY)", "Time", "Text", "US English", "US English", and "US English". The way the float values are shown in the spreadsheet does not depend on the import, but on the language you use for the cells in your spreadsheet. no, I do not instruct OpenOffice to import my data (see original post) as US English and the whole point is that I would like not needing that. the data is not US English data and my colleagues here see this behaviour in OOo as an extra non obvious step as compared to M$Excel. at home I have a linux-powerpc machine, with locale set to 'en_DK.UTF-8' and I don't experience this defect there. here at office they gave me a windows machine (XP), Dutch localized. I installed the English version of OOo 2.4 (same as at home). I don't know where to look, in windows, for the answer to "What is the local setting of your OS?" but I assume it's Dutch (context menus in explorer are in Dutch). the OOo locale is set to Default Is in "Tools>...>Decimal seperator key" the option "Same as local setting" checked? setting or unsetting this option does not alter the behaviour "type/import 1.13 and get 13-01-08" as I see it, my locale settings say that I should type "13-1" to mean 13th of January of the current year. the same setting also say that I should type 13,1 if I mean 13+1/10. so actually the string 13.1 should either be left uninterpreted as a string (does not respect any known format) or if the program wants to be kind and helpful, it could be interpreted as a floating point number, which I believe would be the case in most cases. interpreting it as a date is based on the German format for dates, where the Germans use the '.' to make an ordinal number out of the preceding cardinal number, but the fact is that: 1) both day and month should be followed by that dot in the German notation for dates 2) Dutch, Italian, French, Spanish, are all not German and we all don't understand why 13.1 should mean January 13th at all... I notice that the English (Eire) locale comes quite close to what I expect and I think I can live with this. but I insist that you reconsider the fallback to parsing as date (using a sloppy rule from a different locale) as a default behaviour for unparseable input (which would be parseable as float in the most widespread locale) also check this: http://www.cl.cam.ac.uk/~mgk25/iso-time.html where it is stated that the German format dd.mm.yy(yy) has been obsoleted in May 1995. I have changed the issue type to an enhancement and changed the summary to reflect the problem. I hope, that I got you right. A similar discussion took place in December on the OOo Sc-mailing list: http://sc.openoffice.org/servlets/ReadMsg?listName=dev&msgNo=2751 It was my experience that the automatic date-recognition mechanism will never reach an acceptable level of reliability in true life. It annoys me over and over again when OOo Calc tries to figure out what date it is, when there is really *NO* date. And "outsmarting" the user is a sure way to earn you a lot of anger. It was therefore my belief, that OOo needs a mechanism to set the date-interpretation algorithm (aka which numbers are dates): - e.g. 'dd.mm' should be interpreted as date - others might choose only 'dd/mm' - and still others only 'mm/dd' Indeed, this would give complete control back to the users. Especially users in multi-locale environments would benefit from this feature, but many others would be grateful, too. well, this would be a clear enhancement _for us_, but if a user is using DIN 5008 data prior to the 1995 correction, he might have reasons to complain... as for Leonard, it's the "never" which worries me a bit. I would say "never if the locale does not say date:==dd.mm.yyyy", i.e.: "only if the locale explicitly says so". maybe "in case on non match with locale, prefer converting to number above converting to date" I agree with Leonard when he says that it would be nice if we could specify our own date format(s) -overruling the locale- and that input data (imported or typed) is recognized as date if and only if it matches and absolutely not otherwise. if you offered this option, I would define these formats and be happy: dd/mm dd/mm/yyyy yyyy-mm-dd The automatic date-issue pops up over and over again, but these issues get either closed as WORKSFORME, or for many other reasons, so I will keep a list on this (still open) issue: http://www.openoffice.org/issues/show_bug.cgi?id=13509 http://www.openoffice.org/issues/show_bug.cgi?id=13929 http://www.openoffice.org/issues/show_bug.cgi?id=20804 http://www.openoffice.org/issues/show_bug.cgi?id=22725 http://www.openoffice.org/issues/show_bug.cgi?id=23024 http://www.openoffice.org/issues/show_bug.cgi?id=23328 http://www.openoffice.org/issues/show_bug.cgi?id=23945 http://www.openoffice.org/issues/show_bug.cgi?id=25201 [...] and many more will follow when I have more time [...] http://www.openoffice.org/issues/show_bug.cgi?id=88020 http://www.openoffice.org/issues/show_bug.cgi?id=1145 [closed only for Data-Pilot] http://www.openoffice.org/issues/show_bug.cgi?id=7905 [touches on locale - but same underlying problem] [Partial Matches] http://www.openoffice.org/issues/show_bug.cgi?id=71997 http://www.openoffice.org/issues/show_bug.cgi?id=17146 In the meantime, I have re-opened the discussion on the mailing-lists using the previous date-enhancement proposal: see e.g. http://ux.openoffice.org/servlets/ReadMsg?list=discuss&msgNo=1573 [for the UX-mailing list] As I side-note, the enhancement I propose is called in computer-jargon: a data *input mask*. ;-) As said, this issue pops up at least a dozen times on Bugzilla, but the real number is probably more like over 100 times. I'll investigate further when I have some spare time. Some new additions (as always closed as WORKSFORME): http://www.openoffice.org/issues/show_bug.cgi?id=88243 http://www.openoffice.org/issues/show_bug.cgi?id=88271 [...] definitely more to come if you are affected by this issue, please vote for it! One more member to the list of similar issues: http://www.openoffice.org/issues/show_bug.cgi?id=77973 [I forgot about this one. It contains also a detailed discussion about automatic date-conversions.] regina, would you agree marking this back as a 'defect' rather than 'enhancement'? 'defect', in the sense of removing/modifying a surprising behaviour, not 'enhancement' in the sense of adding a feature. friendly regards. Mario. Issue 33723 (http://www.openoffice.org/issues/show_bug.cgi?id=33723) covers the same problem: automatic number recognition (more specifically, the automatic date conversion is the culprit). That issue has gathered some 31 votes. + 2 I also had the problem: http://www.openoffice.org/issues/show_bug.cgi?id=101286 + 2 same problems for Belgium wher we hav to cope with 3 locals -:) Just to spite the bigots who think everyone should be forced to use their poorly thought out features, I've submitted an issue. Issue 102488 http://www.openoffice.org/issues/show_bug.cgi?id=102488 *** Issue 102488 has been marked as a duplicate of this issue. *** BTW, people, try not to swear ok? We are all humans. No one likes to be called names. no-one likes to be called names, that's clear and should be avoided... I assume on the other hand that most of us reporting bugs here are working as advocates of openoffice.org, each of us inside of our respective companies, and we feel the duty to defend something we sometimes don't like and (as in this case) don't understand. please note that issues like this one have been opened and quickly closed as a "works for me" since April 2003 (check http://www.openoffice.org/issues/show_bug.cgi?id=13929), this is over 6 years ago. you maybe can understand some impatience on users' side. any plan to schedule a solution to this for any specific future release? thanks a lot for your attention, Mario Frasca Sure. I understand the impatience of the users. On the other hand, there are currently thousands of issues like this with very many frustrating users, and there are only a few developers writing code. So, there are also equal amount (if not greater) of frustrations coming from the developers side as well. All we can do is fix the issues one at a time, but there are still 999 more issues to go. It's a very painfully slow process, and sometimes it feels endless. It is quite understandable that it can take quite a while before issues are fixed as there are so many. Personally I sometimes find it rewarding to look for some low hanging fruit, which brings down the number of issues down quickly, even if it is intellectually utterly boring. In this case: I really think someone who knows what he's doing can fix this within an hour: it's just deleting support for a totally obsolete German-only format. Oops, now I may be offending some senior Germans .... a possibly relevant link: http://specs.openoffice.org/calc/ease-of-use/enhance_number_recognition.sxw The behaviour is illogical more than that. The way the data are parsed is applied BY CELL! So that it ends up like a useless mess of (bogus) dates and numbers. CSV is usually used as simple data transfer format, not intended for any formatting or so. We can quite reliably assume, that the format of any column, excepting the first line, should be of same type. So, if the program dosen't know for sure, how to eat the values, it should at least keep up with the same resolution for the whole column (excepting the first line). The most secure way to do is simply interpret them as text and let the user decide later, how to deal with them (he can always select them and say "they are all numbers"). That seems for me quite reasonable approach: "Hmm, there are some strange values in the column C, that don't hold together with other values in that column.. I will better mark them as text and let the user decide". At least, there is no dataloss using this approach. This is much more favorable than generating dates from everything that remotely resembles a date, thus destroying original values. @tuharsky: perhaps this feature of mine http://wiki.services.openoffice.org/wiki/Calc/Features/Numbers_import_for_plain_text_files will allow you to do that once integrated. The implementation is done, but due to insufficient time to perform proper QA & other testing, chances of getting that into 3.2 is becoming very very slim. @kohei: thank you!!! and thank God!!! something is happening along these lines finally. Being a citizen of one of the concerned countries (Germany), I have been waiting for this from the ages of StaOffice 5.2. I hope I get this right: downloading & installing the file /~kohei/builds/cws/koheicsvimport/OOo_3.2.0_090807_Win32Intel_install.exe gives me an en-US installation pack to install in addition to my german one? But even if it kills that: I am going to put up with quite a lot of trouble for this feature, which, as other folks have stated here ..and there.., is the only logical way to do this in a world of wildly different conventions as far as special numbers are concerned. THANK YOU AGAIN!!! again, @kohei: install failed, error: 2etup -that's what it read!- not found *** Issue 104913 has been marked as a duplicate of this issue. *** *** Issue 105137 has been marked as a duplicate of this issue. *** *** Issue 110263 has been marked as a duplicate of this issue. *** It's time to resolve this issue (defect). Thats completely crazy if the column with row numbering or us-US style numbers with decimal point (imported to cal from hardware devices - any type of meters) are converted to data type automatically. Date from electronic meters usually are complex and huge data file, and manually correcting this is unacceptable. I think that giving the option to mark "do not convert to data" on the import form is the good way. The better is to automatically convert us-US style numbers with period (.) to comma (,) style. |