Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing

Summary: | CSV import is wrong on CR (0x13) value | |
---|---|---|---
Product: | Calc | Reporter: | fcartegnie <fcartegnie>
Component: | open-import | Assignee: | AOO issues mailing list <issues>
Status: | CONFIRMED --- | QA Contact: |
Severity: | Trivial | |
Priority: | P3 | CC: | helenrussian, issues, kpalagin, kyoshida, ooo
Version: | OOo 2.2 RC4 | |
Target Milestone: | --- | |
Hardware: | All | |
OS: | All | |
Issue Type: | ENHANCEMENT | Latest Confirmation in: | ---
Developer Difficulty: | --- | |
Attachments: | | |

Description
fcartegnie
2007-10-24 22:53:05 UTC
Please attach sample file.

Created attachment 49762: sample dataset
1) "0x13 alone never meant newline on any system." How about Commodore machines, the Apple II family, and Mac OS up to version 9? http://en.wikipedia.org/wiki/Line_feed
2) 0x10 and 0x13 are never used as newline. CR = 13 (dec) = 0D (hex), LF = 10 (dec) = 0A (hex). The Win/DOS newline style is the CR LF pair, the *nix style is a single LF -- both are mentioned in RFC 4180 (http://tools.ietf.org/html/rfc4180) as valid line separators.
3) A single CR is not mentioned in RFC 4180 as a valid line separator. I don't know how many "single CR separator" files are out there.

Forget the previous comment; I was thinking of it as just newlines in a text file. Anyway, the mentioned RFC says that a line is terminated by CRLF, which is "CR LF" (ABNF sequence, http://tools.ietf.org/html/rfc2234#section-3.1), not "CR / LF" (ABNF alternative). Section 6.1 of RFC 2234 says the same. So it should break only on the two-byte value CRLF, no?

Yes, RFC 4180 specifies the CR LF sequence, not the CR / LF alternative. However, a CSV file is a *text* file above all, and the community uses the historical LF/CRLF/CR newlines in text files. So we usually don't know which system a received file comes from. Maybe we need an option (enhancement) that controls the import of a single CR: use it as a newline character (as now) or as a line break (as Ctrl+Enter in Calc and Shift+Enter in Writer).

Confirmed. Plus, some duplicates exist.

*** Issue 81470 has been marked as a duplicate of this issue. ***

*** Issue 83768 has been marked as a duplicate of this issue. ***

*** Issue 98274 has been marked as a duplicate of this issue. ***

*** Issue 95958 has been marked as a duplicate of this issue. ***

This is an enhancement, not a defect. New record delimiter LF/CRLF (not CR) needs either be selectable in CSV dialog for both import and export, as RFC 4180 seems to allow both. http://tools.ietf.org/html/rfc4180

Sorry, phrasing failed completely in previous comment, should be: This is an enhancement, not a defect. The 'New Record' delimiter LF/CRLF (not CR!) needs to be selectable in dialogs for both CSV import and export, as RFC 4180 seems to allow both.

*** Issue 106740 has been marked as a duplicate of this issue. ***
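
For illustration, the option discussed in this issue (treat a lone CR as a record separator, or keep it inside the cell as a line break) could be sketched roughly as below. This is only a minimal Python sketch, not Calc's import code: the function name `split_records` and the `lone_cr` parameter are hypothetical, and quoted fields are deliberately ignored.

```python
import re


def split_records(data: str, lone_cr: str = "newline") -> list[str]:
    """Split CSV text into records (hypothetical helper, ignores quoting).

    lone_cr="newline":   a CR not followed by LF ends the record,
                         which is the behaviour reported in this issue.
    lone_cr="linebreak": only CRLF and LF end a record; a lone CR stays
                         inside the field as an in-cell line break.
    """
    if lone_cr == "newline":
        # Try CRLF first so the pair counts as one separator, not two.
        records = re.split(r"\r\n|\r|\n", data)
    elif lone_cr == "linebreak":
        records = re.split(r"\r\n|\n", data)
    else:
        raise ValueError(f"unknown lone_cr mode: {lone_cr!r}")

    # A terminator after the last record leaves an empty trailing entry.
    if records and records[-1] == "":
        records.pop()
    return records


# CR = 0x0D (13 decimal), LF = 0x0A (10 decimal)
sample = "a,b\rc\r\nd,e\n"
print(split_records(sample, "newline"))    # ['a,b', 'c', 'd,e']
print(split_records(sample, "linebreak"))  # ['a,b\rc', 'd,e']
```

In the second mode the CR is left untouched in the field; a real importer would presumably map it to whatever the cell uses for a manual line break.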