Bug 52164

Summary: Problems with accents and master page font
Product: POI
Reporter: fastlock <nbodin78>
Component: HSLF
Assignee: POI Developers List <dev>
Status: RESOLVED WORKSFORME
Severity: major
CC: nbodin78
Priority: P2    
Version: 3.8-dev   
Target Milestone: ---   
Hardware: All   
OS: All   
Attachments: Example; Without accents

Description fastlock 2011-11-09 17:02:52 UTC
When we create a slide containing text, and the font of the TextShape differs from the "default" font of the master page, then:
- all characters use the font specified in the TextShape (this is CORRECT)
BUT
- accented characters such as é à è ù î use the default font, and it is not possible to change it.

a) I tried using Unicode escapes such as \u00e9 instead of é, but the problem remains.

b) I tried setting the value as HTML and encoding é as &eacute;, but this just displays the literal &eacute; and does not work either.

c) I tried to FORCE the font of the RichText to the default font. This works, and strings with accents now look fine on screen (with the other font), BUT it is not usable: the size of the TextShape changes, and I get overlapping text or gaps between text.

d) I tried to change the default font of the master page through the API, but this does not seem to be possible.

e) Finally, the only workaround that works is to compare the default font with the TextShape font and, if they differ, to call:
public static String removeAccents(String s)
	{
	    if (s != null && s.length() > 0)
	    {
	        // Lowercase accented letters
	        s = s.replaceAll("[èéêë]", "e");
	        s = s.replaceAll("[ûù]", "u");
	        s = s.replaceAll("[ïî]", "i");
	        s = s.replaceAll("[àâ]", "a");
	        s = s.replaceAll("ô", "o");

	        // Uppercase accented letters
	        s = s.replaceAll("[ÈÉÊË]", "E");
	        s = s.replaceAll("[ÛÙ]", "U");
	        s = s.replaceAll("[ÏÎ]", "I");
	        s = s.replaceAll("[ÀÂ]", "A");
	        s = s.replaceAll("Ô", "O");
	    }
	    return s;
	}
With this, all accents are gone. The text is displayed correctly, but end users will complain about the missing accents until this issue is fixed.
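For reference, the same accent-stripping workaround can probably be written without listing every letter by hand. The following is only a sketch, assuming the standard java.text.Normalizer class is available; the class name AccentUtil is hypothetical and not part of POI. It decomposes each character (NFD) and then drops the combining marks, so letters such as ö and ü, which the hand-written version above does not cover, are handled too. The sample string uses \uXXXX escapes, which the compiler turns into exactly the same characters as the literal accents.

```java
import java.text.Normalizer;

// Hypothetical helper (not part of POI): strips accents via Unicode decomposition.
final class AccentUtil {

    private AccentUtil() {
    }

    /** Returns s with combining marks removed; null and empty input are returned unchanged. */
    static String removeAccents(String s) {
        if (s == null || s.isEmpty()) {
            return s;
        }
        // NFD splits an accented letter into its base letter plus combining mark(s).
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        // \p{M} matches any Unicode mark, i.e. the accents left over after decomposition.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // Same sample text as in this report, written with Unicode escapes.
        String sample = "Helloworld \u00e9 \u00e8 \u00eb \u00ea \u00e0 \u00f6 \u00ef \u00ee \u00f4 \u00fc \u00fb";
        System.out.println(removeAccents(sample)); // prints "Helloworld e e e e a o i i o u u"
    }
}
```

Letters that have no canonical decomposition (for example ø or ß) are left as they are, so this is still only a workaround for the display problem, not a fix.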

***
Code example found on the internet, assuming test.ppt uses Arial as its default font and contains some text boxes using Times New Roman:
 
HSLFSlideShow hslfsh = new HSLFSlideShow("template/test.ppt");
SlideShow ppt = new SlideShow(hslfsh);

Slide slide[] = ppt.getSlides();

for (int i = 0; i < slide.length; i++) {
    Slide curSlide = slide[i];
    Shape sh[] = curSlide.getShapes();

    for (int j = 0; j < sh.length; j++) {
        Shape curSh = sh[j];

        TextBox tb = new TextBox();
        tb.setAnchor(curSh.getAnchor());
        tb.setText("Helloworld é è ë ê à ö ï î ô ü û");

        TextBox shape = (TextBox) curSh;

        RichTextRun rt = shape.getTextRun().getRichTextRuns()[0];
        RichTextRun newRt = tb.getTextRun().getRichTextRuns()[0];

        // style copy
        newRt.setAlignment(rt.getAlignment());
        newRt.setBold(rt.isBold());
        newRt.setFontColor(Color.black);
        newRt.setFontName(rt.getFontName());
        newRt.setFontSize(rt.getFontSize());
        newRt.setItalic(rt.isItalic());
        newRt.setUnderlined(rt.isUnderlined());

        // remove old shape
        curSlide.removeShape(shape);

        curSlide.addShape(tb);
    }
}
Comment 1 fastlock 2011-11-09 17:20:06 UTC
Created attachment 27915
Example
Comment 2 fastlock 2011-11-09 17:34:29 UTC
Created attachment 27916
Without accents :

Same result after removing accents
Comment 3 Yegor Kozlov 2012-02-22 12:14:10 UTC
I can't reproduce it. Here is my test code and all accents are OK in the output: 



        SlideShow ppt = new SlideShow();
        Slide slide = ppt.createSlide();

        TextBox shape = new TextBox();
        RichTextRun rt = shape.getTextRun().getRichTextRuns()[0];
        shape.setHorizontalAlignment(TextBox.AlignLeft);
        rt.setFontName("Times New Roman");
        shape.setText("Helloworld é è ë ê à ö ï î ô ü û");
        rt.setFontSize(16);
        shape.setAnchor(new java.awt.Rectangle(495, 375, 210, 115));
        slide.addShape(shape);

        FileOutputStream out = new FileOutputStream("52164.ppt");
        ppt.write(out);
        out.close();

Can you post Java code that generates the problematic output with missing accents? If it requires a template, please attach it as well.

Yegor
Comment 4 Andreas Beeker 2015-11-29 01:37:48 UTC
I also can't reproduce the error and there wasn't any response, so I'm closing this as "works for me".
This was my test code:

    public void bug52164() throws IOException {
        HSLFSlideShow ppt = new HSLFSlideShow();
        HSLFSlide sl = ppt.createSlide();
        HSLFTextBox tb = sl.createTextBox();
        
        tb.getTextParagraphs().get(0).getTextRuns().get(0).setFontFamily("Cambria");
        tb.setText("Helloworld é è ë ê à ö ï î ô ü û");
        tb.moveTo(100, 100);
        FileOutputStream fos = new FileOutputStream("bla.ppt");
        ppt.write(fos);
        fos.close();
        ppt.close();
    }