Issue 127034

Summary: xlsx file: imported DateTime cells are empty (Excel 2010 compatible)
Product: Calc Reporter: Falcon <ss15>
Component: open-importAssignee: AOO issues mailing list <issues>
Status: RESOLVED FIXED QA Contact:
Severity: Normal    
Priority: P5 (lowest) CC: charlie.clark, damjan, mseidel
Version: 4.1.2Keywords: interop_OOXML
Target Milestone: 4.1.14   
Hardware: All   
OS: All   
Issue Type: ENHANCEMENT Latest Confirmation in: 4.2.0-dev
Developer Difficulty: Simple
Attachments:
Description Flags
xlsx with 2 datetime notation: 1 works fine and another is empty in Calc none

Description Falcon 2016-07-09 13:01:55 UTC
Created attachment 85603 [details]
xlsx with 2 datetime notation: 1 works fine and another is empty in Calc

Calc do not read DateTime in cells when date in ISO 8601 format (like "1961-04-12T07:55:00.123")!

By documentation (Ecma Office Open XML paragraph 18.18.11) ST_CellType cell type for DateTime are:
1. "n" and value for date is number (like "22383.32986")
example:
   <c r="C3" s="5" t="n">
    <v>22383.32986</v>
   </c>

2. "d" and value for date is datetime (like "1961-04-12T07:55:00.123")
example:
   <c r="C4" s="5" t="d">
    <v>1961-04-12T07:55:00.123</v>
   </c>

Calc ignores the second case.

In attachment xlsx in cells B3 and C3 datetime as number - and Calc ok, in cells B4 and D4 datetime in ISO 8601 format - and Calc ignores dates.

In Excel 2010 all cells shown fine.

Tested in OO Calc 4.1.2 in Windows 7.
Also it's reproducible for LibreOffice Calc 4.3.3.2 in Debian 8 and 5.1.3.2 in Windows 7: https://bugs.documentfoundation.org/show_bug.cgi?id=100822

(Document was created using ZEXMLSS 0.0.11)
Comment 1 orcmid 2016-07-09 16:37:51 UTC
(In reply to Falcon from comment #0)
> Created attachment 85603 [details]
> xlsx with 2 datetime notation: 1 works fine and another is empty in Calc
> 
> Calc do not read DateTime in cells when date in ISO 8601 format (like
> "1961-04-12T07:55:00.123")!
> 
> By documentation (Ecma Office Open XML paragraph 18.18.11) ST_CellType cell
> type for DateTime are:
> 1. "n" and value for date is number (like "22383.32986")
> example:
>    <c r="C3" s="5" t="n">
>     <v>22383.32986</v>
>    </c>
> 
> 2. "d" and value for date is datetime (like "1961-04-12T07:55:00.123")
> example:
>    <c r="C4" s="5" t="d">
>     <v>1961-04-12T07:55:00.123</v>
>    </c>
> 
> Calc ignores the second case.
> 

That is plausible.  Which ECMA Office Open XML version are you checking?

The AOO .xlsx import filter is probably older than the time when the clarification of date handling was improved in the ECMA/ISO specification and in the Microsoft Excel 2010 implementation.  This would have been part of satisfying various parties about "strict" handling of ISO dates while also having the more-interoperable "n" case (the default in Excel, I believe).  It is likely that the AOO filter is more "OOOXML transitional."

The actual approach in Apache OpenOffice and in ODF 1.2 OpenFormula is a counterpart of the Excel "n" case.  

We need to check whether the equivalent of the "d" case is producible in AOO Calc.  Then having the .xslx import filter work appropriately is an additional challenge (along with .ods interop cases).

More analysis is needed.  Thanks for digging into this, Falcon.
Comment 2 Falcon 2016-07-09 19:01:59 UTC
It's in "ECMA-376, 4th Edition Office Open XML File Formats — Fundamentals and Markup Language Reference" (2012) on page 2432.

http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-376,%20Fourth%20Edition,%20Part%201%20-%20Fundamentals%20And%20Markup%20Language%20Reference.zip

Hm, there are 5-th edition here: http://www.ecma-international.org/publications/standards/Ecma-376.htm
Comment 3 damjan 2023-01-07 16:42:17 UTC
Confirming on the latest Git.

It should be easy to add support for the ISO 8601 dates, it's just function SheetDataContext::onEndElement() in main/oox/source/xls/sheetdatacontext.cxx.
Comment 4 damjan 2023-01-08 08:32:49 UTC
Fixed by the following commit, resolving FIXED.


commit 9621e552cdf723df9a998b3af4218407d6c66e37
Author: Damjan Jovanovic <...>
Date:   Sun Jan 8 10:24:33 2023 +0200

Add support for the new XLSX date type in cells, denoted with attribute t="d",
used by Excel 2010.
    
Also refactor the code so the datetime attribute in pivot tables is also parsed
by the same function, and increase the parsing accuracy to the maximum (HundredthSeconds, instead of just Seconds).
Comment 5 damjan 2023-01-08 08:45:54 UTC
*** Issue 125745 has been marked as a duplicate of this issue. ***
Comment 6 Matthias Seidel 2023-01-08 12:46:52 UTC
Cherry-picked for AOO42X with:
https://github.com/apache/openoffice/commit/f42644c3cfb90ca4340e57144e49d9c75b42db6b
Comment 7 damjan 2023-02-07 17:46:07 UTC
Cherry-picked for AOO41X in commit 1f71f1d8c317442f96cd6d435bb275ff6e97eccb.