Issue 38494 - Copy/Paste numbers from Writer to Calc: decimal separator interpretation error (RTF format)
Status: ACCEPTED
Alias: None
Product: Calc
Classification: Application
Component: open-import
Version: OOo 1.1.2
Hardware: All All
Importance: P3 Trivial with 22 votes
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-12-06 13:02 UTC by gbadoi
Modified: 2017-05-20 11:11 UTC

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Description gbadoi 2004-12-06 13:02:35 UTC
Conditions: "Locale Settings" set to a language that uses a comma as the decimal
separator (German).
 
1. Enter a formatted number in Writer document: 100.100
2. Select and copy the number.
3. Switch to Calc spreadsheet
4. Paste the number into any cell.

The result is 100,1 instead of 100100

5. Try the same with 100.100.100. The result is 100100100, which is correct.


Tested with OOo 1.1.2/Linux, 1.1.3/Windows, 1.1.4/Windows.
Comment 1 frank 2004-12-08 15:09:11 UTC
Hi,

this is because the default paste format is RTF, which assumes the number format
is English in this case. To avoid this, please use Paste Special (CTRL+SHIFT+V).

Sorry for no better reply.

Frank
Comment 2 frank 2004-12-08 15:09:48 UTC
forgot to mention the format you should use. It's unformatted text.

Frank
Comment 3 gbadoi 2004-12-08 16:20:53 UTC
Hi,

I don't agree. Using Paste Special is just a workaround.

Default paste must NOT assume the English format, but the user-selected format.

Moreover, all my computer's locale settings are German (not only OOo!), so why 
English?

Gigi
Comment 4 frank 2004-12-09 09:26:38 UTC
Hi,

the possible outcome compared to the investment in time and resources for such
a seldom-used case would be on the negative side. Nevertheless, let the User
Experience team decide.

Corrected Prio and set target. This is in no way a Prio 2 task. Please have a
look at the priority settings help file.

Frank
Comment 5 gbadoi 2004-12-09 17:00:44 UTC
Hi,

Sorry for P2, but I lost data copying text tables to spreadsheets with copy/
paste.

Gigi
Comment 6 jcdamgaard 2005-03-18 10:23:10 UTC
Hi
Gigi wrote:
> Default paste must NOT assume English format, but user-selected format.
>More, all my computer's locale settings are German (not only OOo!), so why 
>English?

I agree.
Calc always interprets a single .(point) as a decimal separator, no matter what 
settings you have - and this is a serious bug.

The error is still not fixed in 2.0 beta, and it definitely should be a Prio 2 
task. It is causing data loss for all of us not using English locale settings!
It affects not only paste from Writer, but also from HTML pages, Word etc.

JC
Comment 7 ooo 2005-03-29 19:20:46 UTC
Please don't intermix file formats here. HTML behaves similarly because the same
procedure is currently applied to both HTML and RTF import. However, pasting
Writer text from the clipboard in this case is done in RTF, so let's stick with
that. Issue 39898 will be kept for HTML.

RTF does have a language/locale designator, \lang# where # is the MS-LangID,
which is written by Writer but is not evaluated by Calc. Evaluating that would
enable us to correctly parse strings to values.

Grabbing issue, adapting priority to normal P3, setting target to OOoLater.
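
A minimal Python sketch of the \lang# evaluation proposed in comment 7, for illustration only. It is not Calc code: the function names, the three-entry LangID table, and the fallback to 1033 (en-US) when no tag is present are all assumptions made for this example.

```python
import re

# Small illustrative subset of Windows language IDs; not a complete table.
# Each entry: (locale tag, decimal separator, group separator).
LANGID_TO_SEPARATORS = {
    1033: ("en-US", ".", ","),
    1031: ("de-DE", ",", "."),
    1030: ("da-DK", ",", "."),
}

# Matches \lang followed directly by digits (so \langfe1031 is not matched).
LANG_TAG = re.compile(r"\\lang(\d+)")


def rtf_language(rtf: str, default: int = 1033) -> int:
    """Return the first \\langN value in an RTF fragment, or the assumed default."""
    match = LANG_TAG.search(rtf)
    return int(match.group(1)) if match else default


def parse_with_rtf_language(rtf: str, number_text: str) -> float:
    """Parse number_text with the separators implied by the RTF language tag."""
    _tag, decimal_sep, group_sep = LANGID_TO_SEPARATORS[rtf_language(rtf)]
    # Drop grouping, then normalise the decimal separator for float().
    cleaned = number_text.replace(group_sep, "").replace(decimal_sep, ".")
    return float(cleaned)


german = r"{\rtf1\ansi\lang1031 100.100}"
english = r"{\rtf1\ansi\lang1033 100.100}"
print(parse_with_rtf_language(german, "100.100"))   # 100100.0
print(parse_with_rtf_language(english, "100.100"))  # 100.1
```

With the tag evaluated, the same text 100.100 yields 100100 for a German text portion and 100.1 for an English one, independent of the locale Calc itself runs under.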
Comment 8 jcdamgaard 2005-03-30 09:44:43 UTC
ER wrote:
>RTF does have a language/locale designator, \lang# where # is the MS-LangID,
>which is written by Writer but is not evaluated by Calc. Evaluating that would
>enable us to correctly parse strings to values.

If that were the case, Calc would always presume English number formats when 
pasting RTF. But it does not.

Example:
Copy the number 12,000 from Writer:

Then paste into calc and you get these results:
Calc (English locale settings):12000
Calc (German locale settings): 12

Everything is fine - Calc uses the locale settings correctly when parsing the 
string.

But try the same with 12.000, and you will see that Calc stops caring about 
language settings.

What we wish for is for Calc to parse the strings consistently.
Comment 11 ooo 2005-03-30 10:24:24 UTC
JC,

> >RTF does have a language/locale designator, \lang# where # is the MS-LangID,
> >which is written by Writer but is not evaluated by Calc. Evaluating that would
> >enable us to correctly parse strings to values.

> If that was the case, Calc would allways presume English number formats when
> pasting RTF. But it is not.

No one said it did. And the above doesn't imply it would.

> But try the same with 12.000, and you will see that Calc stops caring about
> language settings.

Because the decimal separator of your locale doesn't occur in the string, and
the '.' may either be an en_US decimal separator, which is taken in this case,
or a group separator of your locale. That's a weakness of the current
algorithm, which is used for both HTML and RTF; the implementation is in
sc/source/filter/rtf/eeimpars.cxx r1.12, lines 389-429.

> What we wish for, is making Calc parse the strings consistently.

Which evaluating the \lang# tags present in RTF and parsing values accordingly
would do, as long as the document language or language of a text portion is set
correctly.

  Eike
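
The ambiguity described in comment 11 can be reproduced with a short Python sketch. This is not the code from eeimpars.cxx; it is a hypothetical heuristic (the name guess_number and its rules are assumptions) written only to match the results reported in this issue.

```python
def guess_number(text: str, locale_decimal: str, locale_group: str) -> float:
    """Mimic the paste results reported in this issue (illustrative only)."""
    if locale_decimal in text:
        # The locale's decimal separator occurs: parse strictly by locale.
        cleaned = text.replace(locale_group, "").replace(locale_decimal, ".")
        return float(cleaned)
    if text.count(".") == 1:
        # A single '.' is ambiguous; the reported behaviour takes it as an
        # en_US decimal separator.
        return float(text.replace(",", ""))
    # Otherwise the locale's group separators are treated as grouping.
    return float(text.replace(locale_group, ""))


# German locale: decimal separator ',' and group separator '.'
print(guess_number("100.100", ",", "."))      # 100.1
print(guess_number("100.100.100", ",", "."))  # 100100100.0
print(guess_number("12,000", ",", "."))       # 12.0
# English locale: decimal separator '.' and group separator ','
print(guess_number("12,000", ".", ","))       # 12000.0
```

The second branch is where the reported data loss happens: 100.100 under a German locale becomes 100.1, while 100.100.100 cannot be an en_US decimal and so is read as grouped digits.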
Comment 12 jcdamgaard 2005-04-01 11:12:39 UTC
Hi Eike

Thanks for your explanation.
A few comments:
>Because the decimal separator of your locale doesn't occur in the string, and
>the '.' may either be an en_US decimal separator, which is taken in this case,
>or a group separator of your locale.

If the locale decimal separator doesn't occur in the number/string, we do not 
want Calc adding a decimal separator!

>That's a weakness of the current algorithm, which is used for both, HTML and
>RTF, the implementation is in sc/source/filter/rtf/eeimpars.cxx r1.12 lines
>389-429.

You may call it a weakness, but I think most users would call it a bug 
(especially when they get their numbers messed up).
Is it possible to fix this?
And perhaps later make the enhancement of evaluating language tags?

JC
Comment 13 ooo 2005-04-01 13:04:31 UTC
JC,

> If the locale decimal separator doesn't occur in the number/string, we do not 
> want Calc adding a decimal separator!

It doesn't add one, it takes the 12.000 as an en_US decimal number.

IMHO, if not doing it the right way for RTF like I mentioned, and sticking to
both HTML and RTF using this mechanism, it is not possible to fix this reliably
without introducing some UI, see my comment in issue 39898, Tue Mar 8 13:08:23
-0800 2005. Fixing it without UI and preferring the current locale would break
functionality for those who wish to import en_US documents from the web.

The only possibility would be to leave it as is for HTML, and to change it only
for RTF to use the current locale only. But this again could be wrong too, as
the document's content may not be of the current locale.

Eike
Comment 14 jcdamgaard 2005-04-04 09:28:17 UTC
Eike,
>Fixing it without UI and prefering the current locale would break
>functionality for those who wish to import en_US documents from the web.

If I wish to copy-paste English numbers, I can simply change the locale settings 
to UK/US, and everything works fine.
But if I wish to paste e.g. German numbers, there is no way I can do this
(except by changing the original figures manually).

If we were talking about an import function, I agree there should be some 
analysis of the content, and perhaps a question to the user about the number 
format. 
But this issue is simply about pasting, which should follow the same logic as 
typing the numbers using the keyboard:
If I TYPE the number 3.986, I expect Calc to keep the number 3986.
If I PASTE the number 3.986, Calc now changes the number to 3,986 (showing 
3,99). I expect Calc to keep the number 3986.
All other spreadsheets do what the user expects! Why not Calc?

Right now a lot of new users are trying out OpenOffice.org. But if they find 
bugs like this, they would have second thoughts about letting Calc handle their 
numbers.

JC
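
The "paste should behave like typing" rule asked for in comment 14 could look roughly like the Python sketch below. It assumes the only inputs are the current locale's decimal and group separators; parse_like_typed is a hypothetical helper, not an OpenOffice.org function.

```python
import re


def parse_like_typed(text: str, decimal_sep: str, group_sep: str) -> float:
    """Parse text using only the given locale separators (hypothetical helper)."""
    g, d = re.escape(group_sep), re.escape(decimal_sep)
    # Integer part: either grouped digits (1-3 digits, then groups of exactly
    # three) or plain digits; optionally followed by a decimal part.
    pattern = rf"[+-]?(\d{{1,3}}(?:{g}\d{{3}})+|\d+)(?:{d}\d+)?"
    if not re.fullmatch(pattern, text):
        raise ValueError(f"not a number in this locale: {text!r}")
    return float(text.replace(group_sep, "").replace(decimal_sep, "."))


# German/Danish style separators: ',' decimal and '.' grouping
print(parse_like_typed("3.986", ",", "."))  # 3986.0
print(parse_like_typed("3,986", ",", "."))  # 3.986
# English style separators: '.' decimal and ',' grouping
print(parse_like_typed("3.986", ".", ","))  # 3.986
```

Under this rule the pasted text is never reinterpreted with another locale's separators. As comment 13 points out, the cost is that en_US content pasted into a non-English locale would then be read with the wrong separators.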
Comment 15 henrikjordt 2005-04-04 11:21:23 UTC
It is extremely important that Calc interprets the decimal separator correctly
using the locale setting. This means that when I copy my budget using copy/paste,
Calc shall not misinterpret the values. In Denmark 300.000 is the same as
300000. When I copy that number it shall not be reduced by a factor of 1000! 

The correspondence between ER and jcdamgaard shows that Calc makes an
interpretation of the decimal separator. There may be a technical reason for
that, but from the user's point of view it is an error which must be corrected as
soon as possible.
Comment 16 stp 2005-05-26 18:33:55 UTC
The patch that is now available in issue 39898 fixes this issue, too.
Comment 17 niklas.nebel 2006-07-13 08:16:34 UTC
This is solved with the fix for issue 50670 (available in m176).
Comment 18 stp 2006-07-13 10:11:58 UTC
I am not sure, but I am no expert either.

Eike wrote Tue Mar 29 12:20:46 -0700 2005:
"RTF does have a language/locale designator, \lang# where # is the MS-LangID,
which is written by Writer but is not evaluated by Calc. Evaluating that would
enable us to correctly parse strings to values."

I understand Eike's comments to mean that, e.g., if 1.000 is written in an
English text in Writer and copied to a spreadsheet, Calc may be able to parse
this as 1 with three zero decimals, regardless of Calc's locale, by using the
lang tag in RTF, which is the clipboard carrier format.

Either way, this issue is probably not a defect any longer; more likely it is
an enhancement.

Thanks.
Comment 19 ooo 2006-07-17 13:29:22 UTC
The RTF import shares code with the HTML import, so currently the HTML option
also applies to the RTF import. The RTF \lang# tag, if present, should be
evaluated instead. The issue type for this is somewhat on the borderline
between defect and enhancement.
Comment 20 frank.loehmann 2008-05-22 09:35:29 UTC
This issue is important and listed on the quarterly review for Calc:
http://wiki.services.openoffice.org/wiki/2008_Q2_Review_of_Spreadsheet_Project
Therefore adjusting target to 3.x.
Comment 21 Marcus 2017-05-20 11:11:08 UTC
Reset assignee to the default "issues@openoffice.apache.org".