Issue 95900

Summary: UTF-8 is not in the selectable character encoding list when importing/exporting DIF format
Product: Calc
Reporter: zhangweiwu <zhangweiwu>
Component: save-export
Assignee: AOO issues mailing list <issues>
Status: CONFIRMED ---
QA Contact:
Severity: Trivial
Priority: P3
CC: issues, lohmaier
Version: OOo 3.0
Keywords: oooqa
Target Milestone: ---
Hardware: Unknown
OS: All
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---

Description zhangweiwu 2008-11-06 03:38:20 UTC
1) Open a DIF format spreadsheet encoded in UTF-8.
2) oocalc prompts for a character encoding; UTF-8 is missing from the list of
encodings.

expected:
2) user can choose to import in UTF-8 encoding.

The same problem occurs when saving DIF format.

With this flaw it is not possible to correctly open a DIF file saved from
gnumeric (which by default saves in the user's locale, often UTF-8 on Linux).
If the user wishes to avoid the Excel format, then DIF is the only format that
allows users of gnumeric and oocalc to exchange spreadsheets with data type
settings (digit or text). Thus it makes sense to correct this bug.
Comment 1 zhangweiwu 2008-11-06 03:39:51 UTC
> If the user wishes to avoid the Excel format, then DIF is the only format
> that allows users of gnumeric and oocalc to exchange spreadsheets with data
> type settings (digit or text).

Partly because ODS support in Gnumeric is still experimental. They should
enhance ODS support, but it also makes sense to let OOo be stronger in
import/export.
Comment 2 peter.junge 2009-07-22 05:17:02 UTC
I would think this is an issue with Gnumeric. AFAIK, DIF uses ASCII for
encoding, hence OOo is correct in suppressing the UTF-8 option both for import
and for export.

Hi Oliver,
I'm trying to push some issues submitted by the Beijing (non-RF2000!) OOo
community. Would you be so kind as to comment on my assumption and set the
resolution accordingly? (Maybe ask Eike?!)

Greetings from Beijing,
Peter
Comment 3 zhangweiwu 2009-07-22 05:50:00 UTC
Hi, thanks for your comment.

The issue started from a practical (not "in theory xxx should") requirement:
we use Linux for all our office work, and some people choose gnumeric for
gnome/lightness while others choose oocalc, so we find we have to exchange
spreadsheets in xls format, which we prefer to stay away from. The
requirements for most spreadsheets are not high: correct rows/columns and data
types would be enough. So I thought of DIF, but that failed too because of the
Chinese ideographs the files contain.

Nowadays it is difficult to tell whether something is ASCII or not, thanks to
the multiple extensions to ASCII. The only difference that remains is between
multi-byte and single-byte charsets.

Quoted below from Wikipedia:

DIF stores everything in an ASCII text file to mitigate many cross-platform
issues back in the days of its creation. However modern spreadsheet software,
e.g. OpenOffice.org Calc and Gnumeric, offer more character encoding to
export/import.
Comment 4 lohmaier 2009-11-13 16:49:20 UTC
confirming.

This is an artificial limitation. What makes UTF-8 so useful is that it is an
ASCII-compatible encoding that only needs 8-bit-clean handling. For the file
format it makes no difference whether UTF-8 or ASCII is stored (when only
characters from the ASCII range are used, the result is even identical to
ASCII).

If it can handle Windows code pages, Latin-#, etc., then it can also handle
UTF-8. There's no technical reason for not supporting UTF-8.
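
To illustrate the point, here is a minimal Python sketch (not OOo code). The
sample header lines and the cell value are only illustrative. It checks that
ASCII-only DIF text is byte-identical in ASCII and UTF-8, and that the bytes of
multi-byte UTF-8 characters never fall in the ASCII range, so they cannot be
mistaken for DIF's structural characters (digits, commas, quotes, newlines).

```python
# Illustrative start of a DIF header; contains ASCII characters only.
ascii_text = 'TABLE\n0,1\n"EXCEL"\n'

# For ASCII-only content the two encodings produce the same bytes.
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

# A quoted string value containing two Chinese ideographs (sample data).
cell = '"中文"'
encoded = cell.encode("utf-8")

# Every byte of a multi-byte UTF-8 sequence has the high bit set (>= 0x80),
# so only the two quote characters remain in the ASCII range.
non_ascii_bytes = [b for b in encoded if b >= 0x80]
structural = bytes(b for b in encoded if b < 0x80)

print(same_bytes)            # True
print(len(non_ascii_bytes))  # 6 (two ideographs, three bytes each)
print(structural)            # b'""'
```

A parser that splits on ASCII delimiters therefore sees the same structure
whether the string payload is ASCII, a single-byte code page, or UTF-8.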
Comment 5 lohmaier 2009-11-13 16:52:34 UTC
Furthermore:
http://wiki.services.openoffice.org/wiki/Documentation/DevGuide/Spreadsheets/Filter_Options

contains UTF-8 in the following section:

"Filter Options for Lotus, dBase and DIF Filters

These filters accept a string containing the numerical index of the used
character set for single-byte characters, that is, 0 for the system character set.
[...]
Unicode (UTF-8) 	76 
[...]"

So apparently it is already possible to load/save DIF with UTF-8 via the API,
just not via the UI.
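
A rough, untested sketch of what loading via the API might look like from
Python-UNO. Assumptions: the filter name is "DIF", the FilterOptions string is
just the character set index (76 for UTF-8, per the table quoted above), and
`desktop` is a com.sun.star.frame.Desktop object obtained elsewhere from a
running office instance. The helper names `dif_load_args` and `load_dif` are
made up for this example.

```python
UTF8_CHARSET_INDEX = 76  # "Unicode (UTF-8)" in the DevGuide table quoted above


def dif_load_args(charset_index=UTF8_CHARSET_INDEX):
    """Return the media descriptor entries as a plain dict.

    Assumption: the DIF filter is registered under the name "DIF" and takes
    the numerical charset index as its whole FilterOptions string.
    """
    return {"FilterName": "DIF", "FilterOptions": str(charset_index)}


def load_dif(desktop, url, charset_index=UTF8_CHARSET_INDEX):
    """Load a DIF file with an explicit charset (needs a UNO-enabled Python)."""
    import uno  # installs the import hook for the com.sun.star namespace
    from com.sun.star.beans import PropertyValue

    props = []
    for name, value in dif_load_args(charset_index).items():
        prop = PropertyValue()
        prop.Name = name
        prop.Value = value
        props.append(prop)
    # Arguments: URL, target frame name, search flags, media descriptor
    return desktop.loadComponentFromURL(url, "_blank", 0, tuple(props))
```

Passing the same two properties to storeToURL on the loaded document should
cover the export direction, under the same assumptions.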
Comment 6 oc 2011-03-01 08:09:44 UTC
Hi Eike, please have a look
Comment 7 Marcus 2017-05-20 11:35:23 UTC
Reset assignee to the default "issues@openoffice.apache.org".