MS Word documents are dragged and dropped in through WebDAV, and POI reads the title from each Word document. Our CMS server runs on Sun Solaris, and the WebDAV URL is configured for each user through Windows Explorer. POI does not read special characters such as è and ® from the title field of the Word document when the file is dragged and dropped in through WebDAV. It works fine if the server is on Windows and WebDAV is also on Windows; it does not work if the CMS server is on Sun Solaris and the WebDAV URL is accessed through Windows Explorer.
I don't understand what you are doing, and in particular I don't know what it means to "read from the MS Word Document thru POI and webdav". Please give more details so we can help better. However, I suppose that this is not a POI problem since, as you say, reading the POI file under Windows works. Did you set the LANG environment variable under Solaris to a sensible value? If you don't, the JVM reads ASCII characters only and transforms anything else to '?' characters.
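To check what the JVM on the Solaris box actually uses, a small stand-alone program can help. This is only a minimal sketch using the standard library; the class name CharsetCheck and the helper roundTrip are made up for illustration and are not part of POI or the CMS. It prints the JVM's default charset and shows that è and ® turn into '?' once they pass through US-ASCII, while ISO-8859-1 and UTF-8 preserve them.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of POI or the CMS.
class CharsetCheck {

    // Encode the text with the given charset and decode it again.
    // Characters the charset cannot represent are replaced with '?'.
    static String roundTrip(String text, Charset cs) {
        return new String(text.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file itself encoding-independent.
        String sample = "\u00e8\u00ae"; // e-grave and registered sign

        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));

        System.out.println("US-ASCII keeps sample:   "
                + sample.equals(roundTrip(sample, StandardCharsets.US_ASCII)));
        System.out.println("ISO-8859-1 keeps sample: "
                + sample.equals(roundTrip(sample, StandardCharsets.ISO_8859_1)));
        System.out.println("UTF-8 keeps sample:      "
                + sample.equals(roundTrip(sample, StandardCharsets.UTF_8)));
    }
}
```

If the first line reports an ASCII-only charset when run in the same environment as the CMS server, the LANG setting is probably not reaching the JVM.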
I did change the lang property to UTF8/ISO8859-1, but I still have the problem. What I am trying to do here is:

1. WebDAV folders work like Windows Explorer folders, follow the HTTP protocol, and are accessible from My Network Places.
2. WebDAV: dropping the MS Word doc into the WebDAV folder through My Network Places should automatically check the doc in to the CMS server.
3. When I drop in the file, I apply the POI library to read the title from the MS Word doc before checking it in to the Content Server. (Just FYI, the content server allows check-in filters, and the code is enclosed here.)

    public int doFilter(Workspace ws, DataBinder binder, ExecutionContext cxt)
            throws DataException, ServiceException {
        if (isWordDoc(binder)) {
            String fileName = binder.getLocal("primaryFile:path");
            try {
                POIFSReader r = new POIFSReader();
                MyPOIFSReaderListener listner = new MyPOIFSReaderListener();
                //r.registerListener(listner,"");
                r.registerListener(listner, "\005SummaryInformation");
                r.read(new FileInputStream(fileName));
                String title = listner.getTitle();
                System.out.println(" My Title: \"" + title + "\"");
                if (title != null)
                    binder.putLocal("dDocTitle", title);
            } catch (java.io.FileNotFoundException e) {
                System.out.println("FileNotFoundException : " + fileName);
            } catch (java.io.IOException e) {
                System.out.println("IOException : " + fileName);
            }
        }
        // filter executed correctly. Return CONTINUE
        return CONTINUE;
    }
Two questions: - What does your MyPOIFSReaderListener look like? - Can you provide a sample document together with the output of your CMS filter code?
Created attachment 12537 Checkinfilter.java
Created attachment 12538 Sample Document
I have attached CheckinFilter.java and the sample document that I am reading from. The output of getTitle is: ������ Network Appliance - Press Release - 02/17/2004�议��
The sample document contains those funny characters in the title, and POI extracts them correctly. The rest of the sample document looks fine. How the special characters got into the title property, and whether that is correct, is outside the scope of POI and HPSF.
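One way to confirm this independently of the console encoding is to print the title's Unicode code points instead of the raw string. The following is a minimal sketch; the class TitleDump and the method toCodePoints are hypothetical names, and the sample string in main is an illustration, not the title from the attached document.

```java
// Hypothetical diagnostic helper, not part of POI or HPSF.
class TitleDump {

    // Render each code point of the string as U+XXXX, separated by spaces,
    // so the result is pure ASCII and survives any console encoding.
    static String toCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustrative input only; in the filter this would be the title string.
        String title = "Caf\u00e9 \u00ae";
        System.out.println(toCodePoints(title));
    }
}
```

Calling toCodePoints(title) in the check-in filter, in place of printing the title directly, would show whether the odd characters are really stored in the string or are only introduced when System.out writes it.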