Bug 47875

Summary: When reading a Word document written in Chinese, the paragraph count is not correct.
Product: POI
Reporter: inthendsun
Component: HWPF
Assignee: POI Developers List <dev>
Status: RESOLVED WORKSFORME    
Severity: normal    
Priority: P2    
Version: 3.2-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   

Description inthendsun 2009-09-18 23:46:15 UTC
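import java.io.FileInputStream;
import org.apache.poi.hwpf.extractor.WordExtractor;

// Open the .doc file and extract its text one paragraph per array entry
FileInputStream fileIn = new FileInputStream("D:\\111.doc");
WordExtractor extractor = new WordExtractor(fileIn);
String[] paras = extractor.getParagraphText();
// Print the number of paragraphs the extractor found
System.out.println(paras.length);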


Why is the paragraph count not correct? Reading a document written in English seems to work without problems, but my Word document is written in Chinese.

Thanks!
Comment 1 Yegor Kozlov 2011-06-25 12:38:04 UTC
Please attach the problematic file; without it we can't do much to help you.

Yegor
Comment 2 Dominik Stadler 2016-02-14 18:40:34 UTC
There has been no update for a long time, so I am closing this. Please reopen with more information if this is still a problem for you.
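
As an illustration of the kind of "more information" that might help if this is reopened, here is a minimal sketch of a per-paragraph report to accompany the attached file. It is only an assumption about what would be useful, not something requested in this thread. `ParagraphReport`, `countNonBlank` and `describe` are hypothetical names and not part of POI; the sample array in `main` is made-up data standing in for the `String[]` returned by `extractor.getParagraphText()`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of POI: summarizes a paragraph array so the
// reported count can be compared with what Word itself shows.
public class ParagraphReport {

    // Counts entries that contain something other than whitespace.
    // Note: String.trim() only strips characters at or below U+0020, so a
    // paragraph holding just a full-width space (U+3000) counts as non-blank.
    public static int countNonBlank(String[] paras) {
        int n = 0;
        for (String p : paras) {
            if (!p.trim().isEmpty()) {
                n++;
            }
        }
        return n;
    }

    // Builds one line per paragraph: index, UTF-16 length, code point count,
    // and whether the paragraph is blank after trimming.
    public static List<String> describe(String[] paras) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < paras.length; i++) {
            String p = paras[i];
            int codePoints = p.codePointCount(0, p.length());
            lines.add(i + ": chars=" + p.length()
                    + " codePoints=" + codePoints
                    + " blank=" + p.trim().isEmpty());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up sample data; in practice this would be the array
        // returned by extractor.getParagraphText().
        String[] paras = { "\u7b2c\u4e00\u6bb5\r\n", "\r\n", "Second paragraph\r\n" };
        System.out.println("total=" + paras.length
                + " nonBlank=" + countNonBlank(paras));
        for (String line : describe(paras)) {
            System.out.println(line);
        }
    }
}
```

Posting this output together with the paragraph count that Word shows would make it easier to see whether the difference comes from empty paragraphs or from the Chinese text itself.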