Bug 33557 - Initial PowerPoint Support
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: POI Overall
Version: 3.0-dev
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-02-14 12:07 UTC by Nick Burch
Modified: 2005-05-28 12:26 UTC
0 users



Attachments
Code for initial powerpoint support (5.90 KB, application/octet-stream)
2005-02-14 12:09 UTC, Nick Burch
Updated code to support powerpoint (8.71 KB, application/x-tar)
2005-02-21 12:28 UTC, Nick Burch
Next version of PPT support (10.89 KB, application/x-tar)
2005-04-12 13:19 UTC, Nick Burch
Next version of PPT support (14.09 KB, application/x-tar)
2005-04-13 13:50 UTC, Nick Burch
Next version of PPT support (15.44 KB, application/x-tar)
2005-04-19 18:16 UTC, Nick Burch
Next version of PPT support (19.64 KB, application/x-gzip)
2005-04-24 20:00 UTC, Nick Burch
Next version of PPT support (24.75 KB, application/x-tar)
2005-05-20 17:24 UTC, Nick Burch
Unit tests for powerpoint code (8.39 KB, application/x-tar)
2005-05-20 17:25 UTC, Nick Burch
Quick guide to using the PowerPoint code (1.45 KB, text/plain)
2005-05-20 17:27 UTC, Nick Burch
New Unit tests for powerpoint code (8.38 KB, application/x-tar)
2005-05-27 19:53 UTC, Nick Burch

Description Nick Burch 2005-02-14 12:07:28 UTC
Initial support for powerpoint, as described in post to poi-user. Supports
finding slides and meta slides (by a process of scanning the file for known byte
sequences), and getting text out from slides and meta slides.
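
For illustration only, scanning a file for a known byte sequence can be sketched as below. This is not the attached code: the class name ByteSequenceScanner and the sample pattern are hypothetical, and the real marker bytes used by the attachment are not shown in this bug.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: naive search for every occurrence of a byte pattern.
class ByteSequenceScanner {
    /** Returns every offset in data at which pattern starts. */
    static List<Integer> findAll(byte[] data, byte[] pattern) {
        List<Integer> offsets = new ArrayList<>();
        if (pattern.length == 0) {
            return offsets; // an empty pattern matches nothing useful
        }
        for (int i = 0; i + pattern.length <= data.length; i++) {
            boolean match = true;
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                offsets.add(i);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        byte[] data = {0, 1, 2, 3, 1, 2, 9};
        byte[] pattern = {1, 2}; // made-up marker, not a real PowerPoint signature
        System.out.println(findAll(data, pattern));
    }
}
```

A scan like this can match the pattern inside unrelated data, which is one reason a structure-aware approach is usually preferable.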
Comment 1 Nick Burch 2005-02-14 12:09:05 UTC
Created attachment 14278
Code for initial powerpoint support
Comment 2 Nick Burch 2005-02-21 12:28:50 UTC
Created attachment 14332
Updated code to support powerpoint

New version, with help from Shaheed Haque. Instead of blindly scanning the file
looking for interesting byte sequences, now mostly understands how records
work, and tries to find interesting ones that way.
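
As a rough sketch of what "understanding how records work" can look like (not the attached code), the walker below assumes the usual PowerPoint binary record layout: an 8-byte little-endian header holding a 16-bit version/instance field, a 16-bit record type, and a 32-bit body length, with a version nibble of 0xF marking a container whose body is itself a sequence of records. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: walk a tree of length-prefixed records and collect their types.
class RecordWalker {
    static final int HEADER_SIZE = 8;

    static int readUShortLE(byte[] b, int off) {
        return (b[off] & 0xFF) | ((b[off + 1] & 0xFF) << 8);
    }

    static long readUIntLE(byte[] b, int off) {
        return (b[off] & 0xFFL)
                | ((b[off + 1] & 0xFFL) << 8)
                | ((b[off + 2] & 0xFFL) << 16)
                | ((b[off + 3] & 0xFFL) << 24);
    }

    /** Visits every record in data[start, end) depth-first, appending each type to typesOut. */
    static void walk(byte[] data, int start, int end, List<Integer> typesOut) {
        int pos = start;
        while (pos + HEADER_SIZE <= end) {
            int verAndInstance = readUShortLE(data, pos);
            int type = readUShortLE(data, pos + 2);
            long length = readUIntLE(data, pos + 4);
            int bodyStart = pos + HEADER_SIZE;
            // Refuse lengths that run past the enclosing record instead of trusting them.
            if (length > end - bodyStart) {
                throw new IllegalArgumentException("Record at offset " + pos + " overruns its parent");
            }
            int bodyEnd = bodyStart + (int) length;
            typesOut.add(type);
            if ((verAndInstance & 0x0F) == 0x0F) {
                walk(data, bodyStart, bodyEnd, typesOut); // container: recurse into children
            }
            pos = bodyEnd;
        }
    }

    public static void main(String[] args) {
        // One container (type 1000) holding one atom (type 4008) with a 2-byte body.
        byte[] data = {
            0x0F, 0x00, (byte) 0xE8, 0x03, 0x0A, 0, 0, 0,
            0x00, 0x00, (byte) 0xA8, 0x0F, 0x02, 0, 0, 0, 'H', 'i'
        };
        List<Integer> types = new ArrayList<>();
        walk(data, 0, data.length, types);
        System.out.println(types);
    }
}
```

Interesting records can then be picked out by type rather than by guessing at byte patterns.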
Comment 3 Nick Burch 2005-04-12 13:19:38 UTC
Created attachment 14689
Next version of PPT support

Next release of powerpoint support. Now able to write files back out to disk,
and have PowerPoint still be able to open them. Also includes some bug fixes to
text runs.

Still doesn't let you edit files though! Can only load, extract text, and save
again.
Comment 4 Nick Burch 2005-04-13 13:50:36 UTC
Created attachment 14701
Next version of PPT support

Functionally quite similar to the last version. A few more records are
implemented, but not enough to edit text yet. Also has a few bug fixes.
Comment 5 Nick Burch 2005-04-19 18:16:20 UTC
Created attachment 14760
Next version of PPT support

Functionally very similar to last version.

Better structure for how Containers write themselves out, which should speed
development. Bug fix for some NotesAtom entries being longer than 6 bytes,
which broke save.
Comment 6 Nick Burch 2005-04-24 20:00:57 UTC
Created attachment 14825
Next version of PPT support

Slide text is now properly record based, as is the corresponding model code.

It is now possible to edit the text of bits of powerpoint slides (not notes),
*BUT* only if you don't change the length of the text!
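
To make the same-length restriction concrete, here is a minimal sketch (not the attached code) of an in-place text overwrite that refuses any change in length. It assumes the text is stored one byte per character; a record storing 16-bit characters would need a different encoding. The class and method names are hypothetical. The likely reason for the restriction is that record lengths and offsets stored elsewhere in the file would otherwise have to be updated as well.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: overwrite a text record body only if the byte length is unchanged.
class SameLengthTextEdit {
    static void overwriteText(byte[] recordBody, String newText) {
        // Assumption: one byte per character, as in a byte-based text record.
        byte[] encoded = newText.getBytes(StandardCharsets.ISO_8859_1);
        if (encoded.length != recordBody.length) {
            throw new IllegalArgumentException(
                "New text is " + encoded.length + " bytes, but the record holds " + recordBody.length);
        }
        System.arraycopy(encoded, 0, recordBody, 0, encoded.length);
    }

    public static void main(String[] args) {
        byte[] body = "Hello".getBytes(StandardCharsets.ISO_8859_1);
        overwriteText(body, "World"); // same length, so this is accepted
        System.out.println(new String(body, StandardCharsets.ISO_8859_1));
    }
}
```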
Comment 7 Avik Sengupta 2005-05-12 20:59:16 UTC
Nick, thanks for all the work!

A few lines of docs would be useful. I looked at the code, but couldn't figure
out where to start, i.e., couldn't easily work out how to use the API. I'm
sure it's obvious once you know where to look, but...

A couple of nits.. 
1. Do the methods of TextMunger do anything that the methods of
o.a.p.util.StringUtil do not do? We're quite paranoid about duplication; it's a
pain to maintain.

2. HSLFSlideShow should probably be in a usermodel package?

..Later, reading the javadocs makes things clearer, but a quickstart guide would
probably still be useful. 

Also, here are some simple tests you can take inspiration from. Look at
o.a.p.hssf.records.TestStringRecord for a very simple low level test. Look at
o.a.p.hssf.model.TestSheet for a higher level test on the model. Look at
o.a.p.hssf.usermodel.TestHSSFCell for a high level test.  
Comment 8 Nick Burch 2005-05-20 17:24:03 UTC
Created attachment 15088
Next version of PPT support

Much better usermodel and model code. Several bug fixes, and improved text
extraction
Comment 9 Nick Burch 2005-05-20 17:25:01 UTC
Created attachment 15089
Unit tests for powerpoint code

First version of unit tests for powerpoint code. Tests writing out, text
extraction, and some parts of the record layer (but not yet all)
Comment 10 Nick Burch 2005-05-20 17:27:00 UTC
Created attachment 15090
Quick guide to using the PowerPoint code

Basic introduction to using the PowerPoint code. Describes how to extract text,
how to change text, warnings about changing text, how to get Slides, key
classes etc.
Comment 11 Avik Sengupta 2005-05-20 20:04:57 UTC
TestReWrite fails. Any ideas?

single-scratchpad-test:
    [junit] Running org.apache.poi.hslf.TestReWrite
    [junit] Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 0.161 sec

    [junit] Testsuite: org.apache.poi.hslf.TestReWrite
    [junit] Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 0.161 sec

    [junit] Testcase: warning took 0.006 sec
    [junit]     FAILED
    [junit] Exception in constructor: testWritesOutTheSame (java.lang.NegativeArraySizeException
    [junit]     at org.apache.poi.ddf.EscherClientAnchorRecord.fillFields(EscherClientAnchorRecord.java:74)
    [junit]     at org.apache.poi.ddf.EscherContainerRecord.fillFields(EscherContainerRecord.java:55)
    [junit]     at org.apache.poi.ddf.EscherContainerRecord.fillFields(EscherContainerRecord.java:55)
    [junit]     at org.apache.poi.ddf.EscherContainerRecord.fillFields(EscherContainerRecord.java:55)
    [junit]     at org.apache.poi.hslf.record.PPDrawing.findEscherChildren(PPDrawing.java:108)
    [junit]     at org.apache.poi.hslf.record.PPDrawing.<init>(PPDrawing.java:85)
    [junit]     at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:159)
    [junit]     at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:102)
    [junit]     at org.apache.poi.hslf.record.DummyRecordWithChildren.<init>(DummyRecordWithChildren.java:50)
    [junit]     at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:155)
    [junit]     at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:102)
    [junit]     at org.apache.poi.hslf.HSLFSlideShow.readFIB(HSLFSlideShow.java:173)
    [junit]     at org.apache.poi.hslf.HSLFSlideShow.<init>(HSLFSlideShow.java:102)
    [junit]     at org.apache.poi.hslf.TestReWrite.<init>(TestReWrite.java:44)
Comment 12 Nick Burch 2005-05-23 10:53:32 UTC
The failure is because no-one has applied my patch to
ddf.EscherClientAnchorRecord, in bug #34787.

Without that patch, ddf.EscherClientAnchorRecord will do nasty things, because
the current version assumes a different size of record to what's really there.

Hopefully, if you try again having applied the ddf patch, the test will work.
Comment 13 Avik Sengupta 2005-05-25 17:21:14 UTC
OK thanks, that makes sense. It's been a busy couple of days for me, and I'll
get to it in a day or two. Sorry.
Comment 14 Avik Sengupta 2005-05-25 20:24:28 UTC
Ok, so I now get a failure. Any ideas? thanks! Are the ppt files attached to the
testcases correct?

single-scratchpad-test:
    [junit] Testsuite: org.apache.poi.hslf.TestRecordCounts
    [junit] Tests run: 3, Failures: 1, Errors: 0, Time elapsed: 0.17 sec

    [junit] Testcase: testSheetsCount took 0.008 sec
    [junit]     FAILED
    [junit] expected:<2> but was:<0>
    [junit] junit.framework.AssertionFailedError: expected:<2> but was:<0>
    [junit]     at junit.framework.Assert.fail(Assert.java:47)
    [junit]     at junit.framework.Assert.failNotEquals(Assert.java:282)
    [junit]     at junit.framework.Assert.assertEquals(Assert.java:64)
    [junit]     at junit.framework.Assert.assertEquals(Assert.java:201)
    [junit]     at junit.framework.Assert.assertEquals(Assert.java:207)
    [junit]     at org.apache.poi.hslf.TestRecordCounts.testSheetsCount(TestRecordCounts.java:54)
Comment 15 Nick Burch 2005-05-27 19:53:19 UTC
Created attachment 15187
New Unit tests for powerpoint code

Updated the unit tests so they all actually work

(I was sure I'd done that last time, but it seems I was a muppet and missed one
test)
Comment 16 Avik Sengupta 2005-05-28 20:26:05 UTC
Committed. Thanks Nick! Please verify.

Please provide new stuff as diffs against CVS, attached to a new bug.