121004 – I18N - FileEncodingQuery not working for Ant build scripts

This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 121004 - I18N - FileEncodingQuery not working for Ant build scripts

Summary: I18N - FileEncodingQuery not working for Ant build scripts

Status:	RESOLVED FIXED

Alias:	None

Product:	xml
Classification:	Unclassified
Component:	Text-Edit (show other bugs)
Version:	6.x
Hardware:	All All

Importance:	P2 blocker (vote)
Assignee:	Tomas Zezula

URL:
Keywords:	I18N

Depends on:	121135
Blocks:
	Show dependency tree

Reported:	2007-11-02 17:31 UTC by Ken Frank
Modified:	2007-11-05 13:53 UTC (History)
CC List:	6 users (show)

See Also:
Issue Type:	DEFECT
Exception Reporter:

Attachments
Patch (2.71 KB, text/plain) 2007-11-03 10:30 UTC, Tomas Zezula	Details
Patch including unit test (10.94 KB, text/plain) 2007-11-03 11:22 UTC, Tomas Zezula	Details
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Ken Frank 2007-11-02 17:31:18 UTC

from community users


to summarize

1. project is in windows31j; user is running in windows31j locale

2. build.xml has utf-8 encoding tag
(its not seeded with encoding of project as is new xml files
and this can be ok)

3. user adds chararcters to build.xml comments and saves
(goes to files window to open it)

4. run project fails with encoding errors in xml file since build is using
project encoding
to build/process which is windows3j (which is correct)  but seems like the build.xml saved
the file
using windows31j and not utf8 -- but shouldnt it save using the utf8 since
that takes precedence over project encoding ?

errror msg is:

>> Error reading project file C:\Documents and Settings\My Documents\NetBeansProjects\JavaApplication6\build.xml:
Invalid byte 1 of 1-byte UTF-8 sequence.
>>
>> build.xml has "<?xml version="1.0" encoding="UTF-8"?>" in top, so
>> it should be opened and stored as UTF-8 but the characters
>> are stored in ShiftJIS(Windows-31j).


===> is this an issue ? Could it be that build.xml is not
covered/handled by feq rules?

Comment 1 Tomas Zezula 2007-11-02 17:40:19 UTC

Probably the XmlFEQImpl doesn't answer for Ant files. I will try.

Comment 2 Ken Frank 2007-11-02 17:47:28 UTC

could Tomas and others in this area also look at other possible cases
situations where the special kind of xml files that are not covered
by the feq as to seeding with encoding - like those in web and ejb projects,
that still use utf-8 -- if those files can be changed directly or indirectly
by users who are in some other project encoding than utf8 -- does the code
make sure to use feq to handle the processing/reading/writing of those special
files ?

I realize it might require separate issues but since now is last date for p2 issues
to be fixed, will be good to have some info so more can be filed, or to use this/change
this one to some umbrella issue that would allow others to be fixed/filed even after Monday.

ken.frank@sun.com

Comment 3 Tomas Zezula 2007-11-02 18:20:52 UTC

Sorry Ken, I cannot help with xml specialization much, the only xml files I know about is plain XML file and build
script, I have no experience with the j2ee cluster, I don't know which specializations of the XMLDataObject exists in
this cluster.

Comment 4 Masaki Katakai 2007-11-03 00:08:19 UTC

Please don't forget to put jf4jbug in Cc list.

Thanks.

Comment 5 Ayub Khan 2007-11-03 01:12:10 UTC

IMO this issue do not belong to xml component, because the Editor Kit load and save of xml document is working fine. You
can verify this (even in standard locale like cp1521 on windows by entering wrong encoding in the build.xml file as

<?xml version="1.0" encoding="XYZ"?>

The editor kit will throw exceptions. This means the editor kit is trying to use the encoding specified in the build.xml.

I suspect the bug is in the code (the project action handlers for "Build", "Run" etc) that tries to read the build.xml
from the file and not honoring the encoding in the build.xml.

Btw, I was able to reproduce this bug on a friends machine.

Comment 6 Ayub Khan 2007-11-03 02:14:17 UTC

Assigning to ant component, since its related to ant build processor.

Comment 7 Tomas Zezula 2007-11-03 10:29:44 UTC

The problem is that AntDataObject doesn't provide FileEncodingQueryImplementation.
The XMLFileEncodingQuery follows Jarda's proxy pattern and is quite complicated, the easies way to fix this issue is to
add dependency on the XML Core and use it's FileEncodingQuery.

I'am attaching patch of the suggested fix. Jesse can you review it? I will add test testing presence of FEQImpl later today.

Comment 8 Tomas Zezula 2007-11-03 10:30:59 UTC

Created attachment 52438 [details]
Patch

Comment 9 Tomas Zezula 2007-11-03 11:22:51 UTC

Created attachment 52441 [details]
Patch including unit test

Comment 10 Jesse Glick 2007-11-03 16:46:25 UTC

Patch would work as an emergency fix for this case. But I think we need to rethink the design here a little bit. It does
not make sense for the FEQI in this case to be taken from a DataObject. It should be picked up according to file MIME
type, since for any text/*+xml the same FEQI should be used. It is silly to have dozens of different XML-based
DataObject subtypes all needing to separately use this same API. And note that apisupport templates for new file types
do not create code like this. We can fix individual cases but there will always be other cases we do not know about
which do not work.

Better fix IMO is as follows: place XmlFileEncodingQueryImpl in global lookup and set it to work on any file whose MIME
type is text/xml or matches the pattern */*+xml. XFEQI would probably not need to be an API in this case. It can also be
removed from the lookup of an XMLDataObject. Probably it should be placed after DataObjectFEQI so that data loaders with
unusual needs can still override this.

Comment 11 Tomas Zezula 2007-11-03 17:09:40 UTC

So for NB 6.0 we can either put XML into global lookup or apply the attached patch, I'm not sure if all the
XML specialized DataObjects provide correct mime type - the patch is safer.
For NB 6.1 we can add MimeLookup based FileEncodingQuery used after the DataObjectFileEncodingQuery as I suggested on
API review.

Comment 12 Jesse Glick 2007-11-03 17:36:06 UTC

"I'm not sure if all the XML specialized DataObjects provide correct mime type - the patch is safer" - I don't think I
agree. We can easily look to see who is using XFEQI right now, and check that it is used from DOs with the right MIME
type. (If any are not using an XML MIME type, we can either fix that, or temporarily give them a DO-based FEQI.) It
would be much harder to find all data loaders using XML-based MIME types and ensure that they have the right FEQI -
especially since anyone with an external module who has made an XML-based file type by wizard will almost certainly not
have specially added a FEQI.

When you have an XML file type, you expect syntax coloring to work without special effort, you expect code folding and
indentation to work without special effort, you even expect DTD and Schema code completion to work without special
effort. So I think it is clearly wrong if a special effort is needed to just read and write international characters
correctly according to the XML specification.

A MimeLookup-based FEQI would indeed be nice and I think we should try this for 6.1. But the very important special case
of XML subtypes could be handled easily and without an API change in 6.0, I think.

Comment 13 Tomas Zezula 2007-11-03 19:17:04 UTC

I agree that no API change is needed if we don't want to remove the DO based XML file encoding query (it's a part of
API). But I prefer to move it outside the API package, it's strange anyway to have SPI implementation in the API but
this is a API change which I a bit afraid. Should be probably decided by xml core owner.

Comment 14 Nam Nguyen 2007-11-03 21:22:20 UTC

I don't understand why we need to remove the XFEQI from XMLDataObject and doing the cleanup on all of its client DO's ?
Wouldn't putting it on global lookup and setting it to cover text/xml and */*+xml be enough?

Jesse could you please help by posting a patch?

Comment 15 Jesse Glick 2007-11-03 22:02:34 UTC

It would not be necessary to remove usage of XFEQI from existing DataObject types; this would still work, though it
would become gratuitous and we would at some point probably want to deprecate explicit usage of XFEQI. To add support
for a global impl for XML content types, with lower priority than DataObject.lookup, the simplest code would probably be
something like this (not yet tested):

---%<--- src/META-INF/services/org.netbeans.spi.queries.FileEncodingQueryImplementation
org.netbeans.modules.xml.core.DefaultXmlFileEncodingQueryImpl
#position=105
---%<--- src/org/netbeans/modules/xml/core/DefaultXmlFileEncodingQueryImpl.java
public class DefaultXmlFileEncodingQueryImpl extends FileEncodingQueryImplementation {
  public Charset getEncoding(FileObject f) {
    if (f.getMIMEType().endsWith("xml")) {
      return XmlFileEncodingQueryImpl.singleton().getEncoding(f);
    } else {
      return null;
    }
  }
}
---%<---

Comment 16 Nam Nguyen 2007-11-03 22:08:45 UTC

Thanks Jesse, will try.

Comment 17 Nam Nguyen 2007-11-03 22:39:36 UTC

Patch checked in.  I only have somewhat testing for a few XML types: wsdl, build.xml, bpel, xsd... on Solaris charsets.
Ken, Sonali:  could you please verify the exact test case of this bug.

/cvs/xml/core/src/META-INF/services/org.netbeans.spi.queries.FileEncodingQueryImplementation,v  <-- 
org.netbeans.spi.queries.FileEncodingQueryImplementation
initial revision: 1.1
/cvs/xml/core/src/org/netbeans/modules/xml/core/DefaultXmlFileEncodingQueryImpl.java,v  <-- 
DefaultXmlFileEncodingQueryImpl.java
initial revision: 1.1

Comment 18 Nam Nguyen 2007-11-03 22:42:50 UTC

Open issue 121074 to complete this work.
Close this one as fixed.

Comment 19 Sonali Kochar 2007-11-04 02:22:21 UTC

Nam,

The fix looks good on winXP. thx

Comment 20 Tomas Zezula 2007-11-04 10:02:06 UTC

The patch is fine, it's placed behind DOFileEncodingQueryImpl.
But anyway even if you leave the XmlFileEncodingQueryImpl in the API you should probably find its usages and remove it
from the DataObject's which are covered by the DefaultXmlFileEncodingQueryImpl due to performance reasons (when the XML
file has no encoded encoding the XMLXmlFileEncodingQueryImpl is used twice before the other queries impl are used,
firstly by XmlFileEncodingQueryImpl and then by DefaultXmlFileEncodingQueryImpl).

Comment 21 Jesse Glick 2007-11-04 11:46:45 UTC

It seems this may be causing commit validation failures?

org.netbeans.modules.masterfs.filebasedfs.utils.FSException: Cannot get shared access to
/hudson/workdir/jobs/trunk/workspace/visualweb/test/work/projects/SanityProject/SanityProject/web/WEB-INF/faces-config.xml
(probably opened for writing).
	at org.netbeans.modules.masterfs.filebasedfs.utils.FSException.io(FSException.java:125)
	at org.netbeans.modules.masterfs.filebasedfs.fileobjects.MutualExclusionSupport.addResource(MutualExclusionSupport.java:97)
	at org.netbeans.modules.masterfs.filebasedfs.fileobjects.FileObj.getInputStream(FileObj.java:129)
Caused: java.io.FileNotFoundException
	at org.netbeans.modules.masterfs.filebasedfs.fileobjects.FileObj.getInputStream(FileObj.java:154)
	at org.openide.filesystems.MIMESupport$CachedFileObject.getInputStream(MIMESupport.java:284)
	at org.netbeans.core.filesystems.MIMEResolverImpl$Type.accept(MIMEResolverImpl.java:583)
	at org.netbeans.core.filesystems.MIMEResolverImpl$Type.access$1400(MIMEResolverImpl.java:458)
[catch] at org.netbeans.core.filesystems.MIMEResolverImpl$FileElement.resolve(MIMEResolverImpl.java:425)
	at org.netbeans.core.filesystems.MIMEResolverImpl$FileElement.access$100(MIMEResolverImpl.java:410)
	at org.netbeans.core.filesystems.MIMEResolverImpl$Impl.findMIMEType(MIMEResolverImpl.java:139)
	at org.openide.filesystems.MIMESupport$CachedFileObject.resolveMIME(MIMESupport.java:253)
	at org.openide.filesystems.MIMESupport$CachedFileObject.getMIMEType(MIMESupport.java:241)
	at org.openide.filesystems.MIMESupport.findMIMEType(MIMESupport.java:115)
	at org.openide.filesystems.FileUtil.getMIMETypeOrDefault(FileUtil.java:1004)
	at org.openide.filesystems.FileObject.getMIMEType(FileObject.java:489)
	at org.netbeans.modules.masterfs.MasterFileObject.getMIMEType(MasterFileObject.java:219)
	at org.netbeans.modules.xml.core.DefaultXmlFileEncodingQueryImpl.getEncoding(DefaultXmlFileEncodingQueryImpl.java:42)
	at org.netbeans.api.queries.FileEncodingQuery.getEncoding(FileEncodingQuery.java:91)
	at org.openide.text.DataEditorSupport.saveFromKitToStream(DataEditorSupport.java:368)
	at org.openide.text.CloneableEditorSupport$1SaveAsReader.run(CloneableEditorSupport.java:847)
	at org.netbeans.editor.BaseDocument.render(BaseDocument.java:1206)
	at org.openide.text.CloneableEditorSupport.saveDocument(CloneableEditorSupport.java:894)
	at org.openide.text.DataEditorSupport.saveDocument(DataEditorSupport.java:390)
	at org.netbeans.modules.web.jsf.JSFConfigEditorSupport.saveDocument(JSFConfigEditorSupport.java:265)
	at org.netbeans.modules.web.jsf.JSFConfigEditorSupport$1.save(JSFConfigEditorSupport.java:102)
	at org.netbeans.modules.visualweb.insync.models.FacesConfigModel.save(FacesConfigModel.java:209)
	at org.netbeans.modules.visualweb.insync.models.FacesConfigModel.addManagedBean(FacesConfigModel.java:180)
	at org.netbeans.modules.visualweb.insync.models.FacesConfigModel.ensureManagedBean(FacesConfigModel.java:148)
	at org.netbeans.modules.visualweb.insync.models.FacesModel.ensureManagedBeansEntry(FacesModel.java:939)
	at org.netbeans.modules.visualweb.insync.models.FacesModel.syncImpl(FacesModel.java:1081)
	at org.netbeans.modules.visualweb.insync.Model.sync(Model.java:219)
	at org.netbeans.modules.visualweb.insync.ModelSet.syncAll(ModelSet.java:646)
	at org.netbeans.modules.visualweb.insync.models.FacesModelSet.syncAll(FacesModelSet.java:1384)
	at org.netbeans.modules.visualweb.insync.models.FacesModelSet.doModeling(FacesModelSet.java:439)
	at org.netbeans.modules.visualweb.insync.models.FacesModelSet.<init>(FacesModelSet.java:406)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
	at org.netbeans.modules.visualweb.insync.ModelSet.createInstance(ModelSet.java:257)
	at org.netbeans.modules.visualweb.insync.ModelSet.getInstance(ModelSet.java:242)
	at org.netbeans.modules.visualweb.insync.ModelSet$1.run(ModelSet.java:209)
	at java.lang.Thread.run(Thread.java:595)

Comment 22 Nam Nguyen 2007-11-04 18:01:12 UTC

The file should be already locked by same thread as the saveDocument. I am investigating.

Comment 23 Tomas Zezula 2007-11-04 18:35:59 UTC

Don't forget that you are called form DataEditorSupport.saveStreamToKit, the DataEditorSupport already holds the write
lock on the file you want to save in, the filesystem lock like UNIX flock is not reentrant, you cannot open input stream
on the FileObject.

You can use this simple test to demonstrate it:

public void testFS () throws Exception {
        File f = getWorkDir();
        FileObject fo = FileUtil.toFileObject(f);
        FileObject t = FileUtil.createData(fo, "foo");
        FileLock l = t.lock();
        try {
            OutputStream out = t.getOutputStream(l);
            try {
                t.getInputStream();
            } finally {
                out.close();
            }
        } finally {
            l.releaseLock();
        }
        
    }

Comment 24 Nam Nguyen 2007-11-04 18:58:07 UTC

Thanks for the info.
Strangely.  I tried the test case manually and it works fine.
Anyway, I am looking into work-around in DataEditorSupport.saveStreamToKit, to obtain the encoding before open outputStream.

Comment 25 Nam Nguyen 2007-11-04 19:13:35 UTC

Actually i would only catch IOException in DefaultXmlFileEncodingQueryImpl.getEncoding and return null.

Comment 26 Tomas Zezula 2007-11-04 19:54:05 UTC

I doubt it will work, the query will not answer to the legal xml file. It will cause that the Ant build file and any
other file which doesn't provide XmlFileEncodingQuery in its' DataObject lookup will be written in wrong encoding
(probably in project encoding) in case the MIMEResolver will not already know the MIME type. In case the
XmlFileEncodingQuery is in DataObject lookup it will return the correct encoding => no regression, catching exception
and returning null can be used as a hot fix.
I am afraid that the correct fix is more complicated. Thinking more about the store document to stream case it's
suspicious that the already "non valid" disk data, which are being rewritten, are used to find out the MIME type. What
happens when user replaced the part of document which determined the MIME type? The correct MIME type recognition should
be done on the data stored into the file in similar way the XmlFileEncodingQueryImpl finds the encoding.

Comment 27 Nam Nguyen 2007-11-04 20:21:51 UTC

So it seems to me that:
(1) a partial fix for now is for DataEditorSupport obtain the encoding before locking the file.
(2) the long term complete fix would be for client of saveDocument to specify encoding.
Who would be owner to for (1)?

Comment 28 Tomas Zezula 2007-11-04 20:26:15 UTC

(1) Is not needed since you get the content of document, so you can find the MIME type later, this will also fix the (2).

Comment 29 Tomas Zezula 2007-11-04 20:34:42 UTC

If it helps the query can be called before lock is obtained  but it will not solve (2). You can fill a issue to
openide/loaders and add me to cc please. The correct solution of (2) is to rewrite the XmlFileEncodingQuery to get MIME
type from incoming data, but I am not sure if it should be done to NB 6.0.

Comment 30 Nam Nguyen 2007-11-04 21:45:41 UTC

I disable DefaultXmlFileEncodingQueryImpl for now because it does not help.

Comment 31 Nam Nguyen 2007-11-04 21:49:09 UTC

It seems the best limited (safest) 6.0 fix for this particular issue now is Tomas patch?

Comment 32 Tomas Zezula 2007-11-04 22:00:57 UTC

I will ask Jarda (openide/loaders) tomorrow about possibility to call the query before opening the stream and let you
know. For now I believe the original patch is the safest but non generic solution for 6.0.

Comment 33 Jesse Glick 2007-11-05 09:21:53 UTC

Another fix which would be safer than the current DefaultXmlFileEncodingQueryImpl.java but more general than Tomáš'
original patch would be to just check

if (file.getExt().equals("xml")) ....

which would not try to open any InputStream on the file, but would catch most of the cases including nearly all Ant
scripts (since most XML subtypes are still named *.xml). Would not provide correct encoding for *.jnlp, for example.

Comment 34 Tomas Zezula 2007-11-05 10:52:12 UTC

Jarda has rewritten the DataEditorSupport to call FEQ before opening the OutputStream, the (1) should be fixed, you can
integrate Jesse's patch again. The (2) is not very important the use case when someone changes xml file content to
something else than xml is obscure anyway.

Comment 35 Tomas Zezula 2007-11-05 12:56:00 UTC

Now it's save to use Jesse's patch, I will integrate it to get it into CVS before NB 6.0 branch.

Comment 36 Tomas Zezula 2007-11-05 13:53:03 UTC

Checking in src/META-INF/services/org.netbeans.spi.queries.FileEncodingQueryImplementation;
/cvs/xml/core/src/META-INF/services/org.netbeans.spi.queries.FileEncodingQueryImplementation,v  <-- 
org.netbeans.spi.queries.FileEncodingQueryImplementation
new revision: 1.3; previous revision: 1.2
done
Checking in src/org/netbeans/modules/xml/core/XMLDataObject.java;
/cvs/xml/core/src/org/netbeans/modules/xml/core/XMLDataObject.java,v  <--  XMLDataObject.java
new revision: 1.41; previous revision: 1.40
done