This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 248599 - Editor opens UTF-8 files in wrong encoding (causing data loss)
Status: RESOLVED FIXED
Alias: None
Product: projects
Classification: Unclassified
Component: Generic Infrastructure
Version: 8.0.1
Hardware: PC Windows 7
Importance: P1 normal
Assignee: Svata Dedic
URL:
Keywords:
Depends on: 246953
Blocks:
Reported: 2014-11-12 17:18 UTC by _ gtzabari
Modified: 2015-04-03 03:07 UTC

See Also:
Issue Type: DEFECT
Exception Reporter:


Attachments
Testcase (3.40 KB, application/octet-stream)
2014-11-12 17:20 UTC, _ gtzabari
Log from 8.0.1 that shows the race between threads on LazyLookupProviders (286.99 KB, application/zip)
2015-03-24 10:47 UTC, Svata Dedic
Proposed patch (2.24 KB, patch)
2015-03-24 10:54 UTC, Svata Dedic
binary patch for NetBeans 8.0.2 (5.53 KB, application/octet-stream)
2015-04-02 11:33 UTC, Marian Petras
Description _ gtzabari 2014-11-12 17:18:40 UTC
Product Version: NetBeans IDE 8.0.1 (Build 201408251540)
Updates: NetBeans IDE is updated to version NetBeans 8.0.1 Patch 1.1
Java: 1.8.0_25; Java HotSpot(TM) 64-Bit Server VM 25.25-b02
Runtime: Java(TM) SE Runtime Environment 1.8.0_25-b18
System: Windows 7 version 6.1 running on amd64; Cp1252; en_CA (nb)
User directory: C:\Users\Gili\AppData\Roaming\NetBeans\8.0.1
Cache directory: C:\Users\Gili\AppData\Local\NetBeans\Cache\8.0.1

Repro steps:
1. Open Netbeans
2. Open attached testcase
3. Notice that Main.java contains String "Montréal".
4. Close Netbeans. Restart Netbeans.
5. File Main.java will reopen automatically.
6. Notice that Editor now renders "Montréal" as "MontrÃ©al".

So far no damage has been done. The editor has mistakenly rendered the UTF-8 file as ASCII encoding. The underlying file is still correct.

7. To prove that the underlying file is correct, run the testcase. It should output "Montréal" correctly.

8. Make any modification to the file (e.g. add a comment) and save.
9. Netbeans overwrites the UTF-8 file using ASCII encoding, and in so doing causes permanent data loss.
10. Run the testcase. Notice that the output is now incorrect: "MontrÃ©al"

Expected behavior: Editor should open file in UTF-8 mode in step 6.
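
The rendering in step 6 is what results when UTF-8 bytes are decoded with a single-byte Western charset. Below is a minimal sketch in Java, not NetBeans code. The class and method names are hypothetical. ISO-8859-1 is used as a stand-in for the system default because it maps the two bytes of "é" to the same characters as Cp1252. Which charset the IDE actually used when saving is not established in this report, so the re-save step assumes UTF-8.

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Decode UTF-8 bytes with a single-byte charset, as an editor
    // using the wrong encoding would do when opening the file.
    public static String openWithWrongCharset(String fileContent) {
        byte[] onDisk = fileContent.getBytes(StandardCharsets.UTF_8);
        return new String(onDisk, StandardCharsets.ISO_8859_1);
    }

    // Assumption: the save step writes the already garbled editor text as UTF-8.
    public static byte[] saveAsUtf8(String editorText) {
        return editorText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Montr\u00e9al"; // e-acute written as a Unicode escape
        String shown = openWithWrongCharset(original);
        byte[] resaved = saveAsUtf8(shown);

        // Print sizes only, so the result does not depend on the console encoding.
        System.out.println("bytes on disk before: " + original.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("chars shown in editor: " + shown.length());
        System.out.println("bytes on disk after:  " + resaved.length);
    }
}
```

Under these assumptions the single character "é" becomes two characters in the editor, and saving them back writes four bytes where there were two, so the original text can no longer be read from the file as UTF-8.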
Comment 1 _ gtzabari 2014-11-12 17:20:34 UTC
Created attachment 150456
Testcase
Comment 2 Jiri Kovalsky 2014-11-13 10:50:42 UTC
Product Version: NetBeans IDE 8.0.2 (Build 201411102027)
Java: 1.8.0_25; Java HotSpot(TM) Client VM 25.25-b02
Runtime: Java(TM) SE Runtime Environment 1.8.0_25-b18
System: Windows 7 version 6.1 running on x86; Cp1252; en_US (nb)
User directory: C:\Users\Jiri Kovalsky\AppData\Roaming\NetBeans\8.0.2
Cache directory: C:\Users\Jiri Kovalsky\AppData\Local\NetBeans\Cache\8.0.2

I am sorry Gili, but I cannot reproduce it. I always see the correct "Montréal". I followed your steps exactly. Closing as worksforme. The only difference I see is the JDK: I am on a 32-bit system, while you use 64-bit. The difference between 8.0.1 and 8.0.2 should not matter. I have also tried this on Linux Mint 17 and it worked fine there too.
Comment 3 _ gtzabari 2014-11-13 15:26:17 UTC
Jiri,

Did you try this under Netbeans 8.0.1 under Windows 7? If not, can you please have someone test that configuration? This might be OS-specific.
Comment 4 _ gtzabari 2014-11-13 15:40:44 UTC
Let me know if you can reproduce this under Windows. If not, leave it open and I will get back to you with an update within 2-3 days. If I can't reproduce it on a clean installation I will resolve this issue myself.
Comment 5 Jiri Prox 2014-11-13 17:14:49 UTC
I cannot reproduce it either.

Btw, if I see correctly, Jirka tested it on both OSes: Win7 and Linux.


Product Version: NetBeans IDE 8.0.1 (Build 201408251540)
Updates: NetBeans IDE is updated to version NetBeans 8.0.1 Patch 1.1
Java: 1.8.0_25; Java HotSpot(TM) 64-Bit Server VM 25.25-b02
Runtime: Java(TM) SE Runtime Environment 1.8.0_25-b18
System: Windows 7 version 6.1 running on amd64; Cp1250; en_US (nb)
User directory: C:\Users\jprox\AppData\Roaming\NetBeans\8.0.1
Cache directory: C:\Users\jprox\AppData\Local\NetBeans\Cache\8.0.1
Comment 6 _ gtzabari 2014-11-13 17:17:01 UTC
Thanks for the clarification Jiri. I will test against a fresh 8.0.1 installation and figure out what combination of settings is triggering this bug.
Comment 7 _ gtzabari 2014-11-13 18:00:27 UTC
Okay. There is some sort of race condition at play.

1. I uninstalled Netbeans 8.0.1, reinstalled it.
2. Opened the testcase. Montréal displayed correctly.
3. Restarted Netbeans. MontrÃ©al displayed (bug).
4. Started a recorder session using http://www.sketchman-studio.com/rylstim-screen-recorder/
5. Repeated steps 2-3.
6. Shockingly, Montréal displays properly on step 3 even if I repeat the steps multiple times.
7. Stopped the screen recording.
8. Repeated steps 2-3. MontrÃ©al displayed (bug).

So... I'm guessing my computer is running slightly faster than yours.

If I download a dev build, could you add extra logging that would help you figure out what is going on on my side (what order of events trigger the problem)?
Comment 8 Vladimir Voskresensky 2014-11-13 19:03:08 UTC
Does your testcase file belong to a project that specifies UTF-8 encoding?
By default, Cp1252 is used on your system.
System: Windows 7 version 6.1 running on amd64; Cp1250; en_US (nb)
Right?

Have you switched off the Welcome Page, so that on restart the file is displayed immediately instead of the Welcome Page?
(Then the cause could be that your project, which is the provider of the encoding, is not yet loaded when the file is opened, so the default system encoding was used.)

If you close and reopen the file, does it have the correct encoding then?
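
The suspected sequence can be sketched as follows. This is a hypothetical, simplified model: the class `LateProjectEncoding` and its methods are made up for illustration and are not the NetBeans FileEncodingQuery implementation. It only shows that a file opened before the project has registered its encoding silently gets the system default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicReference;

public class LateProjectEncoding {
    // Stays null until the (asynchronously loaded) project reports its encoding.
    private final AtomicReference<Charset> projectEncoding = new AtomicReference<>();
    private final Charset systemDefault;

    public LateProjectEncoding(Charset systemDefault) {
        this.systemDefault = systemDefault;
    }

    // Called when project loading has finished.
    public void projectLoaded(Charset encoding) {
        projectEncoding.set(encoding);
    }

    // Called when the editor opens a file; falls back if the project is not ready.
    public Charset encodingForOpen() {
        Charset fromProject = projectEncoding.get();
        return fromProject != null ? fromProject : systemDefault;
    }

    public static void main(String[] args) {
        // ISO-8859-1 stands in for the system default here.
        LateProjectEncoding query = new LateProjectEncoding(StandardCharsets.ISO_8859_1);
        System.out.println("opened before project load: " + query.encodingForOpen());
        query.projectLoaded(StandardCharsets.UTF_8);
        System.out.println("opened after project load:  " + query.encodingForOpen());
    }
}
```

In this model the answer depends only on whether the open happens before or after `projectLoaded`, which would match a problem that appears on some machines and disappears when startup timing changes.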
Comment 9 _ gtzabari 2014-11-13 19:46:40 UTC
Vladimir,

I think you nailed it.

(In reply to Vladimir Voskresensky from comment #8)
> Does your testcase file belong to a project that specifies UTF-8 encoding?
> By default, Cp1252 is used on your system.
> System: Windows 7 version 6.1 running on amd64; Cp1250; en_US (nb)
> Right?

Yes. So even though my system default is Cp1252 I expect UTF-8 to get used.

> Have you switched off the Welcome Page, so that on restart the file is
> displayed immediately instead of the Welcome Page?

Yes, that is correct. I disabled the Welcome Page so on restarting the file is displayed immediately.

> (Then the cause could be that your project, which is the provider of the
> encoding, is not yet loaded when the file is opened, so the default system
> encoding was used.)

Yes. That sounds like the problem.

> If you close and reopen the file, does it have the correct encoding then?

Yes. It does.

So ... now that we understand what is going on, how do we go about fixing it? Perhaps:

1. When a project opens it should scan the editor files.
2. If an opened file belongs to the project, and the file was opened using a different encoding than the project default, then reopen that file with the right encoding (making sure to maintain the caret position).

Is that possible?
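
The two steps proposed above can be sketched roughly as follows. Everything here is hypothetical (`ReopenOnProjectLoad`, `fileOpened`, `projectOpened` are illustrative names, not NetBeans APIs), and the sketch only tracks which files would need a reload. Restoring the caret and handling unsaved edits are left out.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReopenOnProjectLoad {
    // Path of each open editor file mapped to the charset it was decoded with.
    private final Map<String, Charset> openFiles = new LinkedHashMap<>();

    public void fileOpened(String path, Charset usedEncoding) {
        openFiles.put(path, usedEncoding);
    }

    public Charset encodingOf(String path) {
        return openFiles.get(path);
    }

    // Step 1: scan the open files when the project becomes available.
    // Step 2: mark every file decoded with a different charset for reload.
    // A real implementation would have to skip or prompt for modified documents.
    public List<String> projectOpened(Charset projectEncoding) {
        List<String> reloaded = new ArrayList<>();
        for (Map.Entry<String, Charset> entry : openFiles.entrySet()) {
            if (!entry.getValue().equals(projectEncoding)) {
                entry.setValue(projectEncoding);
                reloaded.add(entry.getKey());
            }
        }
        return reloaded;
    }

    public static void main(String[] args) {
        ReopenOnProjectLoad editor = new ReopenOnProjectLoad();
        editor.fileOpened("src/Main.java", StandardCharsets.ISO_8859_1);
        editor.fileOpened("src/Other.java", StandardCharsets.UTF_8);
        System.out.println("reloaded: " + editor.projectOpened(StandardCharsets.UTF_8));
    }
}
```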
Comment 10 Miloslav Metelka 2014-11-28 10:18:15 UTC
I'm also unable to reproduce this, even when simulating a CP1252 environment for the NB run.
But projects open asynchronously, so there may be a race condition.
The file encoding is determined in DataEditorSupport by querying FileEncodingQuery.
I'll do some more research.
Comment 11 Vladimir Voskresensky 2014-11-28 10:49:34 UTC
Mila, try with a file that is external to the project but was included in it (so the simple FEQ will not be able to find the project via the simple FileOwnerQuery, which checks for the nbproject folder).
Comment 12 Marian Petras 2014-12-05 08:38:29 UTC
I confirm this happens, but not consistently. In my environment, it happens in approximately 30-40% of cases.

I have a Maven project of type "EJB module". The project has its encoding set to UTF-8. If I leave some Java files open before IDE shutdown, the files are sometimes re-opened with the system's default encoding (Windows-1250 in my case) after the next startup. If this happens, simply reopening the file does not fix the issue. The most reliable workaround I found is to close the file, restart the IDE and open the file.

--
NetBeans 8.0.2
JDK: 1.8.0_25 64-bit, Server VM
OS: Windows 7 Professional, 64-bit
OS charset: Windows-1250
Comment 13 Marian Petras 2014-12-05 08:39:49 UTC
Bug #247578 ("When netbeans restores opened files they are opened using environment encoding") seems to be related to this one.
Comment 14 okki.nb 2015-01-22 13:56:02 UTC
Same problem here.

After a crash a single file opens in latin-1, but the project setting is utf-8. The file is included in some other projects (maybe 4), and some other projects also include this file. But all projects are utf-8.

I tried to reproduce this issue by manually killing the process with the Windows Task Manager, but the error only occurred once.

Product Version: NetBeans IDE 8.0.2 (Build 201411181905)
Java: 1.8.0_25; Java HotSpot(TM) 64-Bit Server VM 25.25-b02
Runtime: Java(TM) SE Runtime Environment 1.8.0_25-b18
System: Windows 7 version 6.1 running on amd64; Cp1252; de_DE (nb)
Comment 15 NwDx 2015-03-11 10:38:01 UTC
Have the same issue too!

On start it shows wrong characters, but if I close the files and reopen them, the correct encoding is used.

Product Version: NetBeans IDE 8.0.2 (Build 201411181905)
Updates: NetBeans IDE is updated to version NetBeans 8.0.2 Patch 1
Java: 1.8.0_40; Java HotSpot(TM) Client VM 25.40-b25
Runtime: Java(TM) SE Runtime Environment 1.8.0_40-b25
System: Windows 8 version 6.2 running on x86; Cp1252; de_DE (nb)
Comment 16 jaale 2015-03-16 09:04:26 UTC
I can confirm this behavior, also using Win7 x64 and projects with source encoding set to UTF-8 on project level.

If I remember correctly, this behavior started with NetBeans 8.0.1; version 8 was getting/guessing the right encoding.

Product Version: NetBeans IDE 8.0.2 (Build 201411181905)
Updates: NetBeans IDE is updated to version NetBeans 8.0.2 Patch 1
Java: 1.8.0; Java HotSpot(TM) 64-Bit Server VM 25.0-b70
Runtime: Java(TM) SE Runtime Environment 1.8.0-b132
System: Windows 7 version 6.1 running on amd64; Cp1252; de_DE (nb)
Comment 17 Vladimir Voskresensky 2015-03-23 13:55:23 UTC
A possible improvement could be to store the last seen encoding associated with each file. Then, when files are reopened on IDE restart, we would use the saved encoding rather than the default one.
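
One way this idea could look, as a rough sketch only: a per-file table of the last encoding used, persisted between sessions. The class `LastSeenEncodingStore` and its methods are hypothetical, and `java.util.Properties` text is an assumed storage format chosen for brevity. It is not how NetBeans stores editor state.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class LastSeenEncodingStore {
    // File path mapped to the canonical name of the charset last used for it.
    private final Properties entries = new Properties();

    public void remember(String path, Charset encoding) {
        entries.setProperty(path, encoding.name());
    }

    // Returns the remembered charset, or the fallback if the file is unknown.
    public Charset lookup(String path, Charset fallback) {
        String name = entries.getProperty(path);
        return name != null ? Charset.forName(name) : fallback;
    }

    // Serialize to text, e.g. to be written on IDE shutdown.
    public String serialize() {
        StringWriter out = new StringWriter();
        try {
            entries.store(out, null);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return out.toString();
    }

    // Restore from text, e.g. on IDE startup before any file is reopened.
    public static LastSeenEncodingStore deserialize(String text) {
        LastSeenEncodingStore store = new LastSeenEncodingStore();
        try {
            store.entries.load(new StringReader(text));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return store;
    }

    public static void main(String[] args) {
        LastSeenEncodingStore beforeShutdown = new LastSeenEncodingStore();
        beforeShutdown.remember("src/Main.java", StandardCharsets.UTF_8);
        LastSeenEncodingStore afterRestart = deserialize(beforeShutdown.serialize());
        System.out.println(afterRestart.lookup("src/Main.java", StandardCharsets.ISO_8859_1));
    }
}
```

With such a table, a file reopened before its project is loaded would fall back to the encoding it was last shown with instead of the system default.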
Comment 18 Svata Dedic 2015-03-24 10:46:24 UTC
I found a potential race condition in SimpleFileOwnerQueryImpl - attaching a proposed patch. The rest of the behaviour reported by _gtzabari et al. relates to 8.0.1 and 8.0.2 and is MOST LIKELY caused by another race condition fixed by tstupka as issue #246953. Inspecting the dev version code, the race condition is gone.

I'll attach a zipped log that exhibits multiple entries to beforeLookup in an attempt to locate the per-project FileEncodingQueryImpl; the first terminated incorrectly because of the race condition.
Comment 19 Svata Dedic 2015-03-24 10:47:36 UTC
Created attachment 152810
Log from 8.0.1 that shows the race between threads on LazyLookupProviders
Comment 20 Svata Dedic 2015-03-24 10:54:10 UTC
Created attachment 152811
Proposed patch

Tomas (and Tomas), please review.
Comment 21 Tomas Stupka 2015-03-25 10:39:49 UTC
(In reply to Svata Dedic from comment #20)
> Created attachment 152811
> Proposed patch
> 
> Tomas (and Tomas), please review.
thanks for the patch.

no comments from my side
Comment 22 Svata Dedic 2015-03-26 12:31:55 UTC
Will be pushed as jet-main#baa7c0df1259
Comment 23 Marian Petras 2015-04-02 11:33:09 UTC
Created attachment 152998
binary patch for NetBeans 8.0.2

I tried the fix by creating and applying a binary patch of this bug to my NetBeans 8.0.2 installation.

I downloaded sources of NetBeans 8.0.2, applied the patch prepared by Sváťa Dědic, built the sources (full IDE) and then packed the bytecode of the modified class, thus creating a binary patch. I then applied the binary patch according to the guide (http://wiki.netbeans.org/DevFaqModulePatching). According to the entry in the IDE log ("INFO [org.netbeans.core.startup.NbEvents]: Module patch or custom extension: C:\Program Files\NetBeans 8.0\ide\modules\patches\org-netbeans-modules-projectapi\bug248599-patch.jar"), the binary patch file was recognized and used by the IDE. 

But still, the IDE opened the file with an incorrect encoding.
I attached the binary patch so you can try yourself.
Comment 24 Quality Engineering 2015-04-03 03:07:41 UTC
Integrated into 'main-silver', will be available in build *201504030001* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)

Changeset: http://hg.netbeans.org/main-silver/rev/baa7c0df1259
User: Svata Dedic <sdedic@netbeans.org>
Log: #248599: wait for the cache to deserialize to provide correct answer
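
The changeset log describes waiting for a cache to deserialize before answering. The general pattern can be sketched as below. This is not the actual patch: `WaitForCacheDemo`, `cacheLoaded` and `findOwner` are hypothetical names, and a `CountDownLatch` is just one assumed way to make a query wait for a background loader.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class WaitForCacheDemo {
    private final CountDownLatch cacheReady = new CountDownLatch(1);
    private volatile String cachedOwner; // stand-in for the deserialized cache content

    // Called by the (hypothetical) background loader once deserialization is done.
    public void cacheLoaded(String owner) {
        cachedOwner = owner;
        cacheReady.countDown();
    }

    // Waits for the cache instead of answering "unknown" too early.
    // Returns null on timeout or if the waiting thread is interrupted.
    public String findOwner(long timeoutMillis) {
        try {
            if (!cacheReady.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return null;
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return null;
        }
        return cachedOwner;
    }

    // Runs one query that races with a deliberately slow loader thread.
    public static String demo() {
        WaitForCacheDemo query = new WaitForCacheDemo();
        Thread loader = new Thread(() -> {
            try {
                Thread.sleep(50); // simulate slow deserialization
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
            query.cacheLoaded("demo-project");
        });
        loader.start();
        return query.findOwner(5000);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The trade-off in a pattern like this is that the caller blocks, so a timeout (as assumed here) or an off-UI-thread caller would be needed to keep the editor responsive.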