Issue 82952

Summary: CSV import is wrong on CR (0x13) value
Product: Calc Reporter: fcartegnie <fcartegnie>
Component: open-import Assignee: AOO issues mailing list <issues>
Status: CONFIRMED --- QA Contact:
Severity: Trivial    
Priority: P3 CC: helenrussian, issues, kpalagin, kyoshida, ooo
Version: OOo 2.2 RC4   
Target Milestone: ---   
Hardware: All   
OS: All   
Issue Type: ENHANCEMENT Latest Confirmation in: ---
Developer Difficulty: ---
Attachments:
Description Flags
sample dataset none

Description fcartegnie 2007-10-24 22:53:05 UTC
CSV import splits lines incorrectly.
If it finds a single 0x13 value, it treats it as the newline separator.

The right behaviour would be to trigger a newline only on 0x10, and on the
0x10 0x13 sequence. 0x13 alone never meant newline on any system.
Comment 1 helenrussian 2007-11-18 16:53:51 UTC
Please attach a sample file.
Comment 2 fcartegnie 2007-11-20 07:29:00 UTC
Created attachment 49762 [details]
sample dataset
Comment 3 bormant 2007-11-21 12:52:32 UTC
1) "0x13 alone never meant newline on any system." How about Commodore 
machines, Apple II family and Mac OS up to version 9? 
http://en.wikipedia.org/wiki/Line_feed
2) 0x10 and 0x13 are never used as newlines. CR=13(dec)=0D(hex), LF=10(dec)=0A(hex).
The Win/DOS newline style is the CR LF pair, the *nix style is a single LF -- both are
mentioned in RFC 4180 (http://tools.ietf.org/html/rfc4180) as valid line
separators.
3) A single CR is not mentioned in RFC 4180 as a valid line separator. I don't know
how many "single CR separator" files exist in practice.
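
The byte values in point 2 can be checked with a short Python sketch. It is an illustration only and is not taken from the Calc import code; the names `CR`, `LF` and `newline_styles` are made up for this example.

```python
# Byte values of the control characters discussed in this issue.
CR = b"\r"  # carriage return: 13 decimal, 0x0D hex
LF = b"\n"  # line feed:       10 decimal, 0x0A hex

# Hypothetical lookup table of historical newline conventions.
newline_styles = {
    "Win/DOS": CR + LF,
    "*nix": LF,
    "Commodore, Apple II, Mac OS up to 9": CR,
}

for name, seq in newline_styles.items():
    hex_bytes = " ".join(f"0x{b:02X}" for b in seq)
    print(f"{name}: {hex_bytes}")

# The values quoted in the original description are different characters:
# 0x10 is 16 decimal (DLE) and 0x13 is 19 decimal (DC3).
```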
Comment 4 fcartegnie 2007-11-21 13:15:41 UTC
Forget the previous. I thought of it as just newlines in a text file.

Anyway, the mentioned RFC says that a line is terminated by CRLF, which is
"CR LF" (ABNF sequence http://tools.ietf.org/html/rfc2234#section-3.1), not "CR
/ LF" (ABNF alternative). Section 6.1 of RFC 2234 defines CRLF the same way.

So it should break only on the two-byte sequence CRLF, no?
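
A minimal Python sketch of that strict reading, for illustration: records end only at the two-byte CR LF sequence, and only outside double-quoted fields. The function name `split_records_strict` is hypothetical and this is not how Calc implements its import.

```python
from typing import List


def split_records_strict(data: bytes) -> List[bytes]:
    """Split CSV data into records on CR LF only (strict RFC 4180 reading).

    A lone CR or lone LF is left inside the record. A CR LF inside a
    double-quoted field does not end the record.
    """
    records = []
    start = 0
    in_quotes = False
    i = 0
    while i < len(data):
        if data[i] == 0x22:  # double quote; an escaped "" toggles twice
            in_quotes = not in_quotes
        elif not in_quotes and data[i:i + 2] == b"\r\n":
            records.append(data[start:i])
            i += 2
            start = i
            continue
        i += 1
    if start < len(data):  # last record without a trailing CR LF
        records.append(data[start:])
    return records


sample = b'a,b\r\nc\rd,e\r\n"x\r\ny",z'
for record in split_records_strict(sample):
    print(record)
```

With this reading the lone CR in the second record stays part of the field data instead of starting a new row.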
Comment 5 bormant 2007-11-21 14:06:15 UTC
Yes, RFC 4180 specifies the CR LF sequence, not the CR / LF alternative.
However, a CSV file is a *text* file above all, and the community uses the
historical LF/CRLF/CR newlines in text files. So usually we don't know which
system a received file comes from.
Maybe we need an option (enhancement) that controls the import of a single CR:
use it as a newline char (as now) or use it as a line break inside the cell
(as Ctrl+Enter in Calc and Shift+Enter in Writer).
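
Such an option could be sketched in Python as follows. The function `split_records` and its flag `lone_cr_is_newline` are hypothetical names for this illustration, and quoted fields are ignored to keep the example short.

```python
import re
from typing import List


def split_records(text: str, lone_cr_is_newline: bool = True) -> List[str]:
    """Split text into records, with a switch for how a lone CR is treated.

    lone_cr_is_newline=True  : CR LF, lone CR and lone LF all end a record
                               (the behaviour reported in this issue).
    lone_cr_is_newline=False : only CR LF and lone LF end a record; a lone
                               CR stays inside the record as cell content.
    """
    # "\r\n" comes first so that a CR LF pair is consumed as one separator.
    pattern = r"\r\n|\r|\n" if lone_cr_is_newline else r"\r\n|\n"
    records = re.split(pattern, text)
    if records and records[-1] == "":
        records.pop()  # drop the empty piece after a trailing separator
    return records


sample = "a,b\rc,d\r\ne,f\n"
print(split_records(sample, lone_cr_is_newline=True))
print(split_records(sample, lone_cr_is_newline=False))
```

A real importer would also need to honour quoting, and would have to decide how to display the retained CR inside the cell.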
Comment 6 peter.junge 2009-07-22 06:40:17 UTC
Confirmed. Plus, some duplicates exist.
Comment 7 peter.junge 2009-07-22 06:41:22 UTC
*** Issue 81470 has been marked as a duplicate of this issue. ***
Comment 8 peter.junge 2009-07-22 06:43:40 UTC
*** Issue 83768 has been marked as a duplicate of this issue. ***
Comment 9 peter.junge 2009-07-22 06:45:38 UTC
*** Issue 83768 has been marked as a duplicate of this issue. ***
Comment 10 peter.junge 2009-07-22 06:46:54 UTC
*** Issue 98274 has been marked as a duplicate of this issue. ***
Comment 11 peter.junge 2009-07-22 06:48:06 UTC
*** Issue 95958 has been marked as a duplicate of this issue. ***
Comment 12 peter.junge 2009-07-22 07:05:30 UTC
This is an enhancement, not a defect. New record delimiter LF/CRLF (not CR)needs
either be selectable in CSV dialog for both import and export, as RFC 4180 seems
to allow both.
http://tools.ietf.org/html/rfc4180
Comment 13 peter.junge 2009-07-22 07:37:54 UTC
Sorry, phrasing failed completely in previous comment, should be:

This is an enhancement, not a defect. The 'New Record' delimiter LF/CRLF (not
CR!) needs to be selectable in dialogs for both CSV import and export, as RFC
4180 seems to allow both.
Comment 14 Regina Henschel 2009-11-09 18:29:26 UTC
*** Issue 106740 has been marked as a duplicate of this issue. ***