Bug 45590

Summary: Header/footer extraction must work for .ppt files saved from PPT 2007
Product: POI
Reporter: Dmitry Goldenberg <dgoldenberg>
Component: HSLF
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED
Severity: critical
CC: dgoldenberg
Priority: P1    
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   
Attachments:
- The presentation as a ppt with no header/footer data.
- The presentation as a pptx, no header/footer data.
- The presentation as ppt, with header and footer data.
- The presentation as pptx, with header and footer data.

Description Dmitry Goldenberg 2008-08-07 09:11:06 UTC
Created attachment 22402
The presentation as a ppt with no header/footer data.

I have attempted to use the code outlined here for the extraction of headers and footers from .ppt files:

http://poi.apache.org/hslf/how-to-shapes.html#HeadersFooters

However, all the values I got were null.

The test file I used was a .ppt file created in PowerPoint 2007, i.e. it was saved in compatibility mode. This is a very important use case: I have no access to any other version of Office, and our customers are very likely to have many files like this.

I am attaching the following:

1. a simple presentation, both in pptx and ppt formats (ppt saved in compatibility mode from '07)
2. the same presentation with set headers / footers, both in ppt and pptx formats.

Note that for the files with headers/footers, the following header/footer data is defined:

1. Slide header/footer data:
- Fixed date: August 06, 2008
- Slide number is checked
- Slide footer is set to "THE FOOTER TEXT" for all slides but the second, where it is actually set to "THE FOOTER TEXT FOR SLIDE 2"

2. Notes and Handouts
- Date and time is set to 'update automatically'
- Header is set to "THE NOTES HEADER TEXT"
- Page number is selected
- Footer is set to "THE NOTES FOOTER TEXT"

This feature is critical for us.
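
Because the attachments come as both .ppt and .pptx, a quick container check can help rule out feeding the wrong format to HSLF, which reads only the binary .ppt format. The sketch below is a minimal, standard-library-only illustration and is not part of POI; the names PptFormatSniffer, sniffBytes and sniffFile are hypothetical. It assumes that a binary .ppt starts with the OLE2 compound-document signature and that a .pptx, being a ZIP package, starts with the bytes "PK" 0x03 0x04.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not a POI class: guesses the container format
// of a presentation file from its first few bytes.
final class PptFormatSniffer {
  // OLE2 compound-document signature (binary Office files such as .ppt).
  private static final byte[] OLE2_MAGIC = {
      (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
      (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1};
  // ZIP local-file-header signature (.pptx packages are ZIP archives).
  private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

  // Returns "OLE2", "ZIP" or "UNKNOWN" for the given leading bytes.
  static String sniffBytes(byte[] head) {
    if (startsWith(head, OLE2_MAGIC)) return "OLE2";
    if (startsWith(head, ZIP_MAGIC)) return "ZIP";
    return "UNKNOWN";
  }

  // Reads up to 8 bytes from the file and classifies them.
  static String sniffFile(Path file) throws IOException {
    try (InputStream in = Files.newInputStream(file)) {
      byte[] head = new byte[8];
      int n = 0;
      while (n < head.length) {
        int r = in.read(head, n, head.length - n);
        if (r < 0) break; // end of file before 8 bytes were read
        n += r;
      }
      return sniffBytes(Arrays.copyOf(head, n));
    }
  }

  private static boolean startsWith(byte[] data, byte[] prefix) {
    if (data.length < prefix.length) return false;
    for (int i = 0; i < prefix.length; i++) {
      if (data[i] != prefix[i]) return false;
    }
    return true;
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 1) {
      System.err.println("usage: PptFormatSniffer <file>");
      return;
    }
    System.out.println(sniffFile(Paths.get(args[0])));
  }
}
```

Under these assumptions, the two .ppt attachments should be reported as OLE2 and the two .pptx attachments as ZIP; only the former are candidates for the HSLF code path discussed here.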
Comment 1 Dmitry Goldenberg 2008-08-07 09:12:29 UTC
Created attachment 22403
The presentation as a pptx, no header/footer data.
Comment 2 Dmitry Goldenberg 2008-08-07 09:13:11 UTC
Created attachment 22404
The presentation as ppt, with header and footer data.
Comment 3 Dmitry Goldenberg 2008-08-07 09:13:41 UTC
Created attachment 22405
The presentation as pptx, with header and footer data.
Comment 4 Yegor Kozlov 2008-08-11 23:57:44 UTC
Fixed in r685054. 
I hope I have figured out how PPT 2007 stores headers / footers; it does so quite differently from PPT 2003. If you have a large set of such PPT 2007 files, please exercise the code against them.

Yegor
Comment 5 Dmitry Goldenberg 2008-08-12 10:41:45 UTC
I just got the latest POI sources and my testing code still returns all nulls for all the headers and footers. I used the document you can find attached to this issue, marked as "The presentation as ppt, with header and footer data."

Thanks. Below is my tester code:


package com.attivio.test;

import java.io.FileInputStream;

import org.apache.poi.hslf.model.HeadersFooters;
import org.apache.poi.hslf.model.Slide;
import org.apache.poi.hslf.usermodel.SlideShow;

public class HslfHeaderFooterExtractor {
  public static void main(String[] args) throws Exception {
    FileInputStream fis = new FileInputStream(args[0]);
    SlideShow ppt = new SlideShow();
    fis.close();
    Slide[] slides = ppt.getSlides();

    // presentation-scope headers / footers
    HeadersFooters hdd = ppt.getSlideHeadersFooters();
    String headerText = hdd.getHeaderText();
    String footerText = hdd.getFooterText();

    System.out.println(">> Global header: " + headerText);
    System.out.println(">> Global footer: " + footerText);
    
    HeadersFooters notesHdd = ppt.getNotesHeadersFooters();
    headerText = notesHdd.getHeaderText();
    footerText = notesHdd.getFooterText();
    String dateTimeText = notesHdd.getDateTimeText();
    
    System.out.println(">> Notes header: " + headerText);
    System.out.println(">> Notes footer: " + footerText);
    System.out.println(">> Notes date time text: " + dateTimeText);

    // per-slide headers / footers
    for (int i = 0; i < slides.length; i++) {
      
      System.out.println(">> SLIDE #" + (i + 1));
      
      HeadersFooters hdd2 = slides[i].getHeadersFooters();
      headerText = hdd2.getHeaderText();
      footerText = hdd2.getFooterText();
      dateTimeText = hdd2.getDateTimeText();
      int slideNum = slides[i].getSlideNumber();
      
      System.out.println(">> HEADER: " + headerText);
      System.out.println(">> FOOTER: " + footerText);
      System.out.println(">> DATE TIME: " + dateTimeText);
      System.out.println(">> SLIDE NUM: " + slideNum);
      
    }

  }
}
Comment 6 Yegor Kozlov 2008-08-13 06:20:30 UTC
Dmitry,

You don't pass the FileInputStream to the SlideShow constructor:

      FileInputStream fis = new FileInputStream(args[0]);
      SlideShow ppt = new SlideShow();  //ERROR
      fis.close();
      Slide[] slides = ppt.getSlides();

It should be:

      SlideShow ppt = new SlideShow(fis);  //OK

Below is the output:

>> Global header: null
>> Global footer: THE FOOTER TEXT
>> Notes header: THE NOTES HEADER TEXT
>> Notes footer: THE NOTES FOOTER TEXT
>> Notes date time text: null
>> SLIDE #1
>> HEADER: null
>> FOOTER: THE FOOTER TEXT
>> DATE TIME: Wednesday, August 06, 2008
>> SLIDE NUM: 1
>> SLIDE #2
>> HEADER: null
>> FOOTER: THE FOOTER TEXT FOR SLIDE 2
>> DATE TIME: August 06, 2008
>> SLIDE NUM: 2
>> SLIDE #3
>> HEADER: null
>> FOOTER: THE FOOTER TEXT
>> DATE TIME: Wednesday, August 06, 2008
>> SLIDE NUM: 3


Regards,
Yegor
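
One way to make this kind of slip harder is to route file loading through a helper that cannot be called without handing the stream to a parser. The following is only a sketch using the standard library; StreamLoading, StreamParser, load and countBytes are hypothetical names and not part of POI. The byte-counting parser is a stand-in so the example runs without POI on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not a POI class.
final class StreamLoading {
  // Parses a document of type T from an open stream.
  interface StreamParser<T> {
    T parse(InputStream in) throws IOException;
  }

  // Opens the file, passes the stream to the parser, and always
  // closes the stream afterwards, even if parsing fails.
  static <T> T load(Path file, StreamParser<T> parser) throws IOException {
    try (InputStream in = Files.newInputStream(file)) {
      return parser.parse(in);
    }
  }

  // Stand-in parser: just counts the bytes in the stream.
  static long countBytes(InputStream in) throws IOException {
    byte[] buf = new byte[8192];
    long total = 0;
    int r;
    while ((r = in.read(buf)) != -1) {
      total += r;
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 1) {
      System.err.println("usage: StreamLoading <file>");
      return;
    }
    long size = load(Paths.get(args[0]), StreamLoading::countBytes);
    System.out.println("Read " + size + " bytes");
  }
}
```

With POI on the classpath, the tester in this thread could presumably call load(Paths.get(args[0]), SlideShow::new), relying on the SlideShow(InputStream) constructor shown in the fix above; that usage is an assumption and has not been compiled against POI here.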

(In reply to comment #5)

Comment 7 Dmitry Goldenberg 2008-08-14 11:33:34 UTC
OMG I had a dumb bug in my tester, sorry! :)