Bug 65639 - org.apache.poi.util.RecordFormatException: Tried to allocate an array of length 4276190, but 1000000 is the maximum for this record type
Status: NEEDINFO
Alias: None
Product: POI
Classification: Unclassified
Component: HSLF
Version: 4.1.2-FINAL
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
Reported: 2021-10-19 14:03 UTC by redmanmale
Modified: 2021-10-22 16:07 UTC
Description redmanmale 2021-10-19 14:03:19 UTC
I tried to parse a ppt document and got this error:

<business logic>

Caused by: org.apache.poi.hslf.exceptions.HSLFException: Couldn't instantiate the class for type with id 1000 on class org.apache.poi.hslf.record.RecordTypes$$Lambda$1302/2069825217@6c8f6293 : org.apache.poi.hslf.exceptions.HSLFException: Couldn't instantiate the class for type with id 1010 on class org.apache.poi.hslf.record.RecordTypes$$Lambda$1308/936605483@4abd8f87 : org.apache.poi.hslf.exceptions.HSLFException: Couldn't instantiate the class for type with id 2005 on class org.apache.poi.hslf.record.RecordTypes$$Lambda$1327/708554163@431de604 : org.apache.poi.hslf.exceptions.HSLFException: Couldn't instantiate the class for type with id 4024 on class org.apache.poi.hslf.record.RecordTypes$$Lambda$1345/56643443@5fd52ce1 : org.apache.poi.util.RecordFormatException: Tried to allocate an array of length 4276190, but 1000000 is the maximum for this record type.
If the file is not corrupt, please open an issue on bugzilla to request
increasing the maximum allowable size for this record type.
As a temporary workaround, consider setting a higher override value with IOUtils.setByteArrayMaxOverride()
at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:190)
at org.apache.poi.hslf.record.Record.buildRecordAtOffset(Record.java:118)
at org.apache.poi.hslf.usermodel.HSLFSlideShowImpl.read(HSLFSlideShowImpl.java:270)
at org.apache.poi.hslf.usermodel.HSLFSlideShowImpl.buildRecords(HSLFSlideShowImpl.java:251)
at org.apache.poi.hslf.usermodel.HSLFSlideShowImpl.<init>(HSLFSlideShowImpl.java:150)
at org.apache.poi.hslf.usermodel.HSLFSlideShow.<init>(HSLFSlideShow.java:163)
at org.apache.tika.parser.microsoft.HSLFExtractor.parse(HSLFExtractor.java:83)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:178)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:131)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:281)
... 11 more
Caused by: org.apache.poi.hslf.exceptions.HSLFException: Couldn't instantiate the class for type with id 1010 on class org.apache.poi.hslf.record.RecordTypes$$Lambda$1308/936605483@4abd8f87 : org.apache.poi.hslf.exceptions.HSLFException: Couldn't instantiate the class for type with id 2005 on class org.apache.poi.hslf.record.RecordTypes$$Lambda$1327/708554163@431de604 : org.apache.poi.hslf.exceptions.HSLFException: Couldn't instantiate the class for type with id 4024 on class org.apache.poi.hslf.record.RecordTypes$$Lambda$1345/56643443@5fd52ce1 : org.apache.poi.util.RecordFormatException: Tried to allocate an array of length 4276190, but 1000000 is the maximum for this record type.
If the file is not corrupt, please open an issue on bugzilla to request
increasing the maximum allowable size for this record type.
As a temporary workaround, consider setting a higher override value with IOUtils.setByteArrayMaxOverride()
at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:190)
at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:143)
at org.apache.poi.hslf.record.Document.<init>(Document.java:133)
at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
... 20 more
Caused by: org.apache.poi.hslf.exceptions.HSLFException: Couldn't instantiate the class for type with id 2005 on class org.apache.poi.hslf.record.RecordTypes$$Lambda$1327/708554163@431de604 : org.apache.poi.hslf.exceptions.HSLFException: Couldn't instantiate the class for type with id 4024 on class org.apache.poi.hslf.record.RecordTypes$$Lambda$1345/56643443@5fd52ce1 : org.apache.poi.util.RecordFormatException: Tried to allocate an array of length 4276190, but 1000000 is the maximum for this record type.
If the file is not corrupt, please open an issue on bugzilla to request
increasing the maximum allowable size for this record type.
As a temporary workaround, consider setting a higher override value with IOUtils.setByteArrayMaxOverride()
at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:190)
at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:143)
at org.apache.poi.hslf.record.Environment.<init>(Environment.java:54)
at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
... 23 more
Caused by: org.apache.poi.hslf.exceptions.HSLFException: Couldn't instantiate the class for type with id 4024 on class org.apache.poi.hslf.record.RecordTypes$$Lambda$1345/56643443@5fd52ce1 : org.apache.poi.util.RecordFormatException: Tried to allocate an array of length 4276190, but 1000000 is the maximum for this record type.
If the file is not corrupt, please open an issue on bugzilla to request
increasing the maximum allowable size for this record type.
As a temporary workaround, consider setting a higher override value with IOUtils.setByteArrayMaxOverride()
at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:190)
at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:143)
at org.apache.poi.hslf.record.FontCollection.<init>(FontCollection.java:53)
at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
... 26 more
Caused by: org.apache.poi.util.RecordFormatException: Tried to allocate an array of length 4276190, but 1000000 is the maximum for this record type.
If the file is not corrupt, please open an issue on bugzilla to request
increasing the maximum allowable size for this record type.
As a temporary workaround, consider setting a higher override value with IOUtils.setByteArrayMaxOverride()
at org.apache.poi.util.IOUtils.throwRFE(IOUtils.java:630)
at org.apache.poi.util.IOUtils.checkLength(IOUtils.java:208)
at org.apache.poi.util.IOUtils.safelyAllocateCheck(IOUtils.java:610)
at org.apache.poi.util.IOUtils.safelyAllocate(IOUtils.java:596)
at org.apache.poi.hslf.record.FontEmbeddedData.<init>(FontEmbeddedData.java:70)
at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
... 29 more

Maybe we should bump the default max size for this record type.

I would have attached the file, but it's more than 1 MB (around 5 MB). If you need it, I can upload it somewhere and paste a link.
Comment 1 PJ Fanning 2021-10-19 14:52:10 UTC
Is there a reason you can't use IOUtils.setByteArrayMaxOverride() ?

The max is there to protect users from malicious files. We would be reluctant to change the default, and you would be stuck waiting for a release anyway.
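
To illustrate the kind of guard being discussed, here is a minimal, self-contained sketch. It is not POI's actual code: the class AllocationGuard and its methods are hypothetical stand-ins, loosely modeled on the safelyAllocate/checkLength frames in the stack trace, and the rule "a positive override replaces the per-record limit" is an assumption made for the example. The two numbers come from the error message above.

```java
// Hypothetical stand-in for the allocation guard discussed above; not POI's code.
final class AllocationGuard {
    // A value <= 0 means "no override": each caller's own limit applies.
    private static int maxOverride = -1;

    static void setMaxOverride(int value) {
        maxOverride = value;
    }

    // Allocates a buffer only if the declared length passes the limit check.
    static byte[] safelyAllocate(long length, int maxLength) {
        if (length < 0 || length > Integer.MAX_VALUE) {
            throw new IllegalStateException("Invalid array length: " + length);
        }
        // Assumption for this sketch: a positive override replaces the record's limit.
        int limit = maxOverride > 0 ? maxOverride : maxLength;
        if (length > limit) {
            throw new IllegalStateException("Tried to allocate an array of length "
                    + length + ", but " + limit + " is the maximum for this record type");
        }
        return new byte[(int) length];
    }

    public static void main(String[] args) {
        long declared = 4_276_190L;   // length from the error message
        int recordLimit = 1_000_000;  // limit from the error message
        try {
            safelyAllocate(declared, recordLimit);
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
        setMaxOverride(5_000_000);    // arbitrary value above the declared length
        byte[] buffer = safelyAllocate(declared, recordLimit);
        System.out.println("Allocated " + buffer.length + " bytes");
        // With POI on the classpath, the corresponding real call would be
        // org.apache.poi.util.IOUtils.setByteArrayMaxOverride(5_000_000);
    }
}
```

Because the override in this sketch is a single static value, raising it relaxes the check for every caller, not only for the font record. That is why the size should stay as small as the documents require.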
Comment 2 redmanmale 2021-10-20 09:03:43 UTC
I've already used setByteArrayMaxOverride and it fixed the problem for me.

But there's a comment in this method to open an issue if you're using it.
>>and please open up issues on POI's bugzilla to bump values for specific records.

That's it.
Comment 3 PJ Fanning 2021-10-20 09:08:32 UTC
It would seem odd to have a font config that is so big. It's my opinion that it is not worth changing the POI default in this case. Someone else might have a different opinion.
Comment 4 Nick Burch 2021-10-20 09:52:00 UTC
4 MB of font metadata feels very high and likely broken, but if it is actually holding the full font with lots of hinting/design then that might be reasonable. If it is holding multiple fonts, 4 MB seems quite likely.

Anyone have 15 minutes to find the relevant link to the MS docs on this kind of font record, to check if it is just metadata or can contain full embedded fonts?
Comment 5 redmanmale 2021-10-21 09:53:28 UTC
I've processed 30k documents and found ~60 docx files with a huge record length (~37000000).
Comment 6 PJ Fanning 2021-10-21 10:09:48 UTC
I've added r1894438

PS this issue does affect docx files, it affects ppt files.
Comment 7 PJ Fanning 2021-10-21 10:27:58 UTC
I meant 'this issue does not affect docx files, it affects ppt files.'
Comment 8 PJ Fanning 2021-10-21 10:55:09 UTC
@redmanmale could you attach a sample ppt file that exhibits this issue - so we can add a regression test for it?
Comment 9 redmanmale 2021-10-22 16:07:39 UTC
>I meant 'this issue does not affect docx files, it affects ppt files.'
Sorry, I'll open another issue for this thing.

>could you attach a sample ppt file that exhibits this issue
I'll check whether there's one I can upload (without sensitive or private data).