Bug 54880 - Problem with Chinese characters
Summary: Problem with Chinese characters
Status: RESOLVED DUPLICATE of bug 55902
Alias: None
Product: POI
Classification: Unclassified
Component: XSLF
Version: 3.9-FINAL
Hardware: Other All
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-04-24 05:45 UTC by Abhinav Mathur
Modified: 2014-12-20 11:06 UTC



Attachments
PPT to reproduce the bug (101.50 KB, application/vnd.ms-powerpoint)
2013-04-24 05:45 UTC, Abhinav Mathur

Description Abhinav Mathur 2013-04-24 05:45:19 UTC
Created attachment 30224
PPT to reproduce the bug

Chinese characters are rendered as squares when a PPT containing Chinese characters is converted into images. The same problem is discussed in [0].


[0] http://stackoverflow.com/questions/2687522/problem-with-using-apache-poi-to-convert-ppt-to-image


This can be reproduced using the attached PPT file.
Comment 1 Nick Burch 2013-05-31 21:20:05 UTC
In r1488403 I have added unit tests based on the sample file supplied, which show that POI can correctly fetch all of the Chinese characters without issue.

Could you please suggest a unit test that highlights the problem?
Comment 2 saiyedzahid 2013-08-29 02:17:16 UTC
Please refer to this Stack Overflow question.

This is my code. When I convert a PPTX to PNG, Unicode characters are not rendered properly; they are converted to rectangular boxes. I am attaching the PPTX file I am using with the program below.

package foo;

import java.awt.Color;
import java.awt.Dimension;
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;
import java.io.FileInputStream;
import java.io.FileOutputStream;

import org.apache.poi.xslf.usermodel.XMLSlideShow;
import org.apache.poi.xslf.usermodel.XSLFSlide;

public class PptToPng {
    public static void main(String[] args) throws Exception {

        FileInputStream is = new FileInputStream("C:/Temp/aspose/word/font_test.pptx");
        XMLSlideShow ppt = new XMLSlideShow(is);
        is.close();
        double zoom = 2; // magnify it by 2
        AffineTransform at = new AffineTransform();
        at.setToScale(zoom, zoom);
        Dimension pgsize = ppt.getPageSize();
        System.out.println("DONE1");
        XSLFSlide[] slide = ppt.getSlides();
        System.out.println("DONE2");
        // The loop over all slides is commented out; only the first slide is rendered.
        //for (int i = 0; i < slide.length; i++) {
        BufferedImage img = new BufferedImage((int)Math.ceil(pgsize.width*zoom), (int)Math.ceil(pgsize.height*zoom), BufferedImage.TYPE_INT_RGB);
        Graphics2D graphics = img.createGraphics();
        graphics.setTransform(at);
        System.out.println("DONE3");
        // Paint a white background before drawing the slide.
        graphics.setPaint(Color.white);
        graphics.fill(new Rectangle2D.Float(0, 0, pgsize.width, pgsize.height));
        slide[0].draw(graphics);
        System.out.println("DONE4");
        FileOutputStream out = new FileOutputStream("C:/Temp/aspose/word/slide-" + (10 + 1) + ".png");
        System.out.println("DONE5");
        javax.imageio.ImageIO.write(img, "png", out);
        out.close();
        System.out.println("DONE");
        // }
    }
}
Comment 3 saiyedzahid 2013-08-29 02:18:19 UTC
Characters in my pptx are - /ˌinəˈvāSHən/

The Stack Overflow question is http://stackoverflow.com/questions/18498228/unicode-characters-not-converting-from-pptx-to-png
Comment 4 saiyedzahid 2013-09-01 04:24:46 UTC
Anyone here?
Comment 5 Nick Burch 2013-09-01 13:24:44 UTC
Are you sure that the font being used has the glyphs for those characters? My hunch is that you don't have the right fonts available when running the program, and the font being used instead doesn't handle Chinese characters.

(Based on the unit test I added in r1488403, we know POI can read the Chinese characters in your file just fine, so the problem is something output-related.)
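
One way to test that hunch is to ask the JVM which fonts can actually draw the text. The following is a minimal sketch, not part of POI: the `GlyphCheck` class and its `missing` helper are hypothetical names, and the default sample string is an assumption (the phonetic string from comment 3 followed by two Chinese characters). It relies only on the standard `java.awt.Font.canDisplay(int)` and `GraphicsEnvironment.getAllFonts()` calls.

```java
import java.awt.Font;
import java.awt.GraphicsEnvironment;
import java.util.function.IntPredicate;

// Hypothetical helper (not part of POI): reports which characters a font cannot draw.
class GlyphCheck {

    // Returns the distinct code points of text that the predicate rejects,
    // in order of first appearance.
    static String missing(IntPredicate canDisplay, String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints()
            .filter(cp -> !canDisplay.test(cp))
            .distinct()
            .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed sample: the phonetic string from comment 3 plus two Chinese
        // characters, written with Unicode escapes to avoid source-encoding issues.
        String sample = args.length > 0
                ? args[0]
                : "/\u02CCin\u0259\u02C8v\u0101SH\u0259n/ \u4E2D\u6587";

        // Every font the JVM can see; the result depends on the machine.
        for (Font f : GraphicsEnvironment.getLocalGraphicsEnvironment().getAllFonts()) {
            String gaps = missing(f::canDisplay, sample);
            if (!gaps.isEmpty()) {
                System.out.println(f.getFontName() + " lacks: " + gaps);
            }
        }
    }
}
```

If the font named in the presentation is absent from the output of `getAllFonts()`, or appears here as lacking some characters, the squares in the rendered image are most likely a font availability problem rather than a parsing problem.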
Comment 6 saiyedzahid 2013-09-02 04:02:51 UTC
I am registering a font file, but it makes no difference. There should be a method to specify a directory where all the font files are stored, instead of specifying one font file at a time. My PPTX contains the word /ˌinəˈvāSHən/, but when the PPTX is converted to PNG, I see square boxes inside /ˌinəˈvāSHən/ in the image.

package foo;

import java.awt.Dimension; 
import java.awt.Font;
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;

import org.apache.poi.xslf.usermodel.XMLSlideShow;
import org.apache.poi.xslf.usermodel.XSLFSlide;

public class PPTXToPNG {

    public static void main(String[] args) throws Exception {

        FileInputStream is = new FileInputStream("C:/Temp/PPTXToImage/unicode_test.pptx");

        XMLSlideShow ppt = new XMLSlideShow(is);
        is.close();
        double zoom = 2;
        AffineTransform at = new AffineTransform();
        at.setToScale(zoom, zoom);
        Dimension pgsize = ppt.getPageSize();
        XSLFSlide[] slide = ppt.getSlides();

        BufferedImage img = new BufferedImage((int)Math.ceil(pgsize.width*zoom),
                (int)Math.ceil(pgsize.height*zoom), BufferedImage.TYPE_INT_RGB);
        Graphics2D graphics = img.createGraphics();

        graphics.setTransform(at);
        graphics.fill(new Rectangle2D.Float(0, 0, pgsize.width, pgsize.height));

        // Load a single TrueType font file and set it on the graphics context
        InputStream iss = new FileInputStream("C:/Temp/font/GEInspRg.ttf");
        Font font = Font.createFont(Font.TRUETYPE_FONT, iss);
        iss.close();
        graphics.setFont(font);

        // Draw first page in the PPTX. First page starts at 0 position
        slide[0].draw(graphics);

        FileOutputStream out = new FileOutputStream("C:/Temp/PPTXToImage/ConvertedSlide.png");
        javax.imageio.ImageIO.write(img, "png", out);
        out.close();
        System.out.println("DONE");

    }
}
Comment 7 saiyedzahid 2013-09-02 04:20:50 UTC
As you can see above, my program is very simple: it reads a PPTX file from disk, gets the slides, and converts just the first slide to PNG. The production requirement is that this program will accept PPTX files that could contain letters and numbers from any language, so I don't expect to have to specify and set a font for each language. I have a directory that holds all the TTF files. I hope you can resolve this quickly. I need to get this into production this week and get manager approval by Monday, Sep 2.
Comment 8 Nick Burch 2013-09-02 11:43:48 UTC
(In reply to saiyedzahid from comment #6)
> I am registering font file but of no use. There should be a method to
> specify a directory where all these fonts files stored instead of just
> specifying one font file at a time.

Font loading in Java is a JVM issue; it's not anything POI has control over. There are well-documented ways to provide a directory or two of fonts to the JVM, and I would suggest you read the JVM documentation on that. It's nothing POI-specific!
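
As an illustration of one such JVM-level approach, here is a minimal sketch that registers every `.ttf` file in a directory using the standard `Font.createFont` and `GraphicsEnvironment.registerFont` calls. The `FontDirectory` class and its method names are hypothetical, and the default path is the directory from comment 6. It assumes the renderer looks fonts up by family name through the usual `java.awt.Font` mechanism; that would also explain why `graphics.setFont(font)` in comment 6 appears to have no effect, presumably because the renderer picks the font for each text run itself.

```java
import java.awt.Font;
import java.awt.GraphicsEnvironment;
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not part of POI): makes a directory of fonts visible to the JVM.
class FontDirectory {

    // Lists the .ttf files directly inside dir; returns an empty list
    // if dir does not exist or cannot be read.
    static List<File> listTrueTypeFiles(File dir) {
        List<File> result = new ArrayList<>();
        File[] children = dir.listFiles();
        if (children == null) {
            return result;
        }
        Arrays.sort(children);
        for (File f : children) {
            if (f.isFile() && f.getName().toLowerCase(Locale.ROOT).endsWith(".ttf")) {
                result.add(f);
            }
        }
        return result;
    }

    // Registers each font file with the JVM and returns how many were accepted.
    // registerFont returns false when a font with the same name already exists.
    static int registerAll(File dir) throws Exception {
        GraphicsEnvironment ge = GraphicsEnvironment.getLocalGraphicsEnvironment();
        int count = 0;
        for (File f : listTrueTypeFiles(dir)) {
            Font font = Font.createFont(Font.TRUETYPE_FONT, f);
            if (ge.registerFont(font)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // "C:/Temp/font" is the directory used in comment 6.
        File dir = new File(args.length > 0 ? args[0] : "C:/Temp/font");
        System.out.println("Registered " + registerAll(dir) + " fonts");
    }
}
```

In this sketch, `FontDirectory.registerAll(...)` would be called once at startup, before any slide is drawn. Installing the fonts at the operating-system level is an alternative that needs no code.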
Comment 9 saiyedzahid 2013-09-03 16:15:59 UTC
Yes, it looks like a font issue. Let me check.

Thank you.
Comment 10 saiyedzahid 2013-09-04 15:44:18 UTC
This bug can be closed. It was a font issue.
Comment 11 Wu Huajie 2014-12-20 08:17:52 UTC
I now know how to solve this.

I spent several days on this issue. With no hints from search engines, I spent most of that time investigating the problem on the POI side, but in the end the cause was the JVM font settings.

What a waste of time!
It would be better to let developers know how to handle this kind of problem.
Comment 12 Andreas Beeker 2014-12-20 11:06:45 UTC

*** This bug has been marked as a duplicate of bug 55902 ***