
Bug 247042 - Encoding error after patch export
Status: REOPENED
Alias: None
Product: versioncontrol
Classification: Unclassified
Component: Mercurial
Version: 8.0
Hardware: PC Linux
Importance: P4 normal with 1 vote
Assignee: Ondrej Vrabec
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-09-11 16:52 UTC by ulfzibis
Modified: 2014-09-16 21:53 UTC
CC List: 0 users

See Also:
Issue Type: DEFECT
Exception Reporter:


Description ulfzibis 2014-09-11 16:52:26 UTC
[ BUILD # : 8.0.1 FCS ]
[ JDK VERSION : 1.7.0_67 ]

STEPS:
   * Java project is ISO-8859-1.
   * Use German Umlaut in java source file.
   * Mercurial->Patches->Export Diff to location outside the project.

ACTUAL:
   Message: The file xxx.patch can not be safely opened with encoding UTF-8
....

EXPECTED:
   Should work.

WORKAROUND:
   Save the patch inside the project.
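
The message in ACTUAL is consistent with a strict UTF-8 decode failing on ISO-8859-1 bytes. A minimal sketch of that check follows; the helper name `can_decode` and the sample line are illustrative assumptions, not NetBeans code:

```python
def can_decode(data: bytes, encoding: str) -> bool:
    """Return True if data decodes strictly (no replacement) with encoding."""
    try:
        data.decode(encoding, errors="strict")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical Java source line containing German umlauts.
line = 'String s = "Grüße";\n'
latin1_bytes = line.encode("iso-8859-1")  # how an ISO-8859-1 project stores it
utf8_bytes = line.encode("utf-8")         # how a UTF-8 project would store it

print(can_decode(latin1_bytes, "utf-8"))       # False
print(can_decode(latin1_bytes, "iso-8859-1"))  # True
print(can_decode(utf8_bytes, "utf-8"))         # True
```

In ISO-8859-1 the bytes for "ü" (0xFC) and "ß" (0xDF) do not form valid UTF-8 sequences here, so an editor that insists on UTF-8 cannot open the file safely.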
Comment 1 Ondrej Vrabec 2014-09-11 19:41:41 UTC
When does it say so? When you try to open it in the IDE?
Comment 2 Ondrej Vrabec 2014-09-11 19:54:23 UTC
> * Use German Umlaut in java source file.
in source file? or in a source file's name?
Comment 3 ulfzibis 2014-09-11 21:17:04 UTC
(In reply to Ondrej Vrabec from comment #1)
> When does it say so? When you try to open it in the IDE?

No, before that: right after Mercurial->Patches->Export Diff, NB automatically tries to open the patch.

(In reply to Ondrej Vrabec from comment #2)
> > * Use German Umlaut in java source file.
> in source file? or in a source file's name?

As far as I understand, the problem is caused by the umlaut in the source file, but additionally there is an umlaut in 2 of the file names; see the project attached to bug 247043.
Comment 4 Ondrej Vrabec 2014-09-12 09:57:31 UTC
Just help NetBeans and Mercurial recognize your required encoding by running NB with -J-Dmercurial.encoding=ISO-8859-1. It seems to work, and when I use this I can open the patch file just fine.
Comment 5 ulfzibis 2014-09-12 13:38:35 UTC
(In reply to Ondrej Vrabec from comment #4)
> just help NetBeans and Mercurial to recognize your required encoding and run
> NB with -J-Dmercurial.encoding=ISO-8859-1 . It seems to work and when i use
> this i can open the patch file just fine.

Hm, I disagree. Using -J-Dmercurial.encoding=ISO-8859-1 would disturb another open project which is encoded in UTF-8.
IMO the Mercurial encoding setting should depend on the project's setting. This should work fine as long as there are not multiple projects with different encodings hosted in one repository.
Comment 6 Ondrej Vrabec 2014-09-12 13:51:19 UTC
Versioning systems are project-independent; they work at the file level, not at the project level. I simply cannot make it work the way you want because I would break other use cases. If you do not want to use the switch and want to keep mixing different encodings for files (UTF-8 according to your environment settings) and projects (ISO-8859-1), then I cannot help you at the moment.
Comment 7 Ondrej Vrabec 2014-09-12 15:38:48 UTC
After further attempts, it occurs to me that the mercurial.encoding property kind of works only by lucky chance: no matter what I try to pass to hg export as a parameter, it always seems to produce diffs in ISO-8859-1 encoding, never UTF-8. I will try on Linux on Monday. You can try playing with the hg export command, passing --encoding or setting HGENCODING as an environment variable, and let me know if it ever produces anything other than ISO-8859-1.
Comment 8 ulfzibis 2014-09-14 21:07:08 UTC
Hi,
I'm not sure it would make sense for hg to output a diff in a different encoding, because then there would have to be a tag in the diff indicating the encoding, so that the correct encoding could be determined on later import.

In the diff here, we have 2 encodings mixed, the UTF-8 according to the environment settings for the file path, and ISO-8859-1 for the text content. I think it's correct and reasonable, that hg handles the file's encoding transparently.

I think the problem here is that after the diff creation NB automatically forces opening the diff file in the editor. Do we really need this? I think at least the user should have the choice to open the diff file or not; then he would not be assailed by the encoding failure message.

Actually this bug seems not a big problem, if the user is aware of the logic behind the message. The bigger problem occurs with bug 247043.

I do not really understand, what the --encoding option is meant for, I didn't find an explanation in the Mercurial guide: http://hgbook.red-bean.com/read/ .
Comment 9 Ondrej Vrabec 2014-09-15 06:59:44 UTC
(In reply to ulfzibis from comment #8)
> In the diff here, we have 2 encodings mixed, the UTF-8 according to the
> environment settings for the file path, and ISO-8859-1 for the text content.
Are you sure? Can you find any final specification for this? I googled but did not find anything. Are you aware of any spec for diff files or at least mercurial exported diff files?
> Actually this bug seems not a big problem, if the user is aware of the logic
> behind the message. The bigger problem occurs with bug 247043.
That's probably because of the UTF-8 encoded paths.
> I do not really understand, what the --encoding option is meant for, I
> didn't find an explanation in the Mercurial guide:
hg help -v => it's used for encoding commit messages, author names etc.
Comment 10 ulfzibis 2014-09-16 11:27:50 UTC
(In reply to Ondrej Vrabec from comment #9)
> Are you sure? Can you find any final specification for this? I googled but
> did not find anything. Are you aware of any spec for diff files or at least
> mercurial exported diff files?
No, I don't know of any official document on this topic; I was just guessing from the results I've seen. Additionally, such behaviour seems pretty reasonable to me. IMO the diff just divides the files into chunks at line breaks and then compares the bytes without interpreting their meaning, e.g. their encoding. The file paths are encoded in UTF-8 for exchangeability between arbitrary systems.
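
The guess above (a diff that compares raw bytes and so ends up with two encodings in one file) can be reproduced with a small sketch. This uses Python's `difflib.diff_bytes`, not Mercurial's own diff code, and the file name and source lines are hypothetical:

```python
import difflib

# File content stored as ISO-8859-1, as in the reporter's project.
old = ['String s = "Gruss";\n'.encode("iso-8859-1")]
new = ['String s = "Gruß";\n'.encode("iso-8859-1")]

# Hypothetical file name with umlauts, encoded as UTF-8 (environment setting).
path = "src/Grüße.java".encode("utf-8")

# diff_bytes compares the lines byte-wise and never interprets an encoding.
patch = b"".join(
    difflib.diff_bytes(
        difflib.unified_diff, old, new,
        fromfile=b"a/" + path, tofile=b"b/" + path,
    )
)

try:
    patch.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# ISO-8859-1 maps every byte to a character, so this never fails,
# but the UTF-8 encoded path comes out garbled.
as_latin1 = patch.decode("iso-8859-1")

print(utf8_ok)                 # False
print("Grüße" in as_latin1)    # False
```

Under these assumptions no single encoding reads the whole patch correctly: UTF-8 rejects the content bytes, and ISO-8859-1 accepts everything but mangles the path.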

> > The bigger problem occurs with bug 247043.
> That's probably because of the UTF-8 encoded paths.
Yes, it looks like it. For some reason NB doesn't accept this if the change history is in a different encoding.

> hg help -v => it's used for encoding commit messages, author names etc.
I only get:
    --encoding ENCODE   Setzt die Zeichenkodierung (Voreinstellung: UTF-8)
    --encodingmode MODE Setzt den Modus der Zeichenkodierung (Voreinstellung:
                        strict)
Translated to English:
    --encoding ENCODE   Sets the character encoding (Default: UTF-8)
    --encodingmode MODE Sets the mode of the character encoding (Default:
                        strict)
Nothing about commit messages, author names etc.

Do you remember the command on Linux to force English output? I tried: LANG=C hg help -v
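
A possible approach, sketched under the assumption that LC_ALL or LANGUAGE is set in the environment and takes precedence over LANG (which would explain why LANG=C alone had no effect): override all of the locale variables for the child process. The helper names are illustrative, and `hg` must be on the PATH for the second function to work:

```python
import os
import subprocess


def english_locale_env(base_env=None):
    """Return a copy of the environment with the message locale forced to C."""
    env = dict(os.environ if base_env is None else base_env)
    # LC_ALL overrides LANG; LANGUAGE is consulted by gettext-based programs.
    for name in ("LC_ALL", "LC_MESSAGES", "LANGUAGE", "LANG"):
        env[name] = "C"
    return env


def hg_help_in_english():
    """Run `hg help -v` with the overridden environment and return its output."""
    result = subprocess.run(
        ["hg", "help", "-v"],
        env=english_locale_env(),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

On a shell, the equivalent would be prefixing the command with the same variable assignments instead of LANG alone.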
Comment 11 ulfzibis 2014-09-16 11:32:48 UTC
(In reply to ulfzibis from comment #8)
> I think, the problem here is, that after the diff creation NB automatically
> forces opening the diff file in the editor. Do we really need this? I think,
> at least the user should have the choice to open the diff file or not, then
> he would not be assailed by the encoding failure message.

Have you considered solving this bug as suggested above? For me it would then make sense to change the summary to something like:
Do not automatically force the result of Export Patch to display in editor.
Comment 12 Ondrej Vrabec 2014-09-16 12:39:41 UTC
(In reply to ulfzibis from comment #11)
> Have you considered solving this bug as suggested above? For me it would
> then make sense to change the summary to something like:
> Do not automatically force the result of Export Patch to display in editor.
No, I haven't. In 99% of cases it works. If the file cannot be opened, simply dismiss the warning and the file will not be opened.
Comment 13 ulfzibis 2014-09-16 21:53:21 UTC
I'm still wondering about this default behaviour of Export Patch. Is this really what users expect? I can't recall any other software where exporting something automatically forces the result to be opened.