Bug 59858

Summary: NullPointerException thrown by VBAMacroReader
Product: POI
Reporter: brooke
Component: POIFS
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED
Severity: normal
CC: brooke
Priority: P2
Version: 3.15-dev   
Target Milestone: ---   
Hardware: All   
OS: All   
Attachments:
  34038: Example xls that causes readMacros() to throw a NullPointerException
  34039: The results of running org.apache.poi.poifs.dev.POIFSDump.main on the problem document.

Description brooke 2016-07-14 15:23:56 UTC
I am getting a NullPointerException when trying to extract the VBA macros from a particular Excel file.

I am using org.apache.poi.poifs.macros.VBAMacroReader

The following code consistently reproduces the NullPointerException:

import java.io.File;
import java.util.Map;
import org.apache.poi.poifs.macros.VBAMacroReader;

File file = new File("npe_example.xls");
VBAMacroReader reader = new VBAMacroReader(file);
Map<String, String> macros = reader.readMacros();

I have attached the file which causes the error.
Comment 1 brooke 2016-07-14 15:27:40 UTC
Created attachment 34038
Example xls that causes readMacros() to throw a NullPointerException
Comment 2 brooke 2016-07-14 15:28:21 UTC
Created attachment 34039
The results of running org.apache.poi.poifs.dev.POIFSDump.main on the problem document.
Comment 3 Javen O'Neal 2016-07-14 15:29:37 UTC
Could you provide a stack trace?
Comment 4 brooke 2016-07-14 15:31:49 UTC
Here is the stack trace:

Exception in thread "main" java.lang.NullPointerException
	at org.apache.poi.poifs.macros.VBAMacroReader.readMacros(VBAMacroReader.java:258)
	at org.apache.poi.poifs.macros.VBAMacroReader.findMacros(VBAMacroReader.java:148)
	at org.apache.poi.poifs.macros.VBAMacroReader.findMacros(VBAMacroReader.java:153)
	at org.apache.poi.poifs.macros.VBAMacroReader.findMacros(VBAMacroReader.java:153)
	at org.apache.poi.poifs.macros.VBAMacroReader.findMacros(VBAMacroReader.java:153)
	at org.apache.poi.poifs.macros.VBAMacroReader.readMacros(VBAMacroReader.java:115)
	at poitester.POITester.main(POITester.java:39)
Comment 5 Javen O'Neal 2016-07-14 15:46:11 UTC
A module offset was not set before trying to read the stream.
https://svn.apache.org/viewvc/poi/trunk/src/java/org/apache/poi/poifs/macros/VBAMacroReader.java?revision=1738674&view=markup#l258
Comment 6 Javen O'Neal 2016-07-15 05:29:13 UTC
In r1752776, added a unit test that reproduces the problem.
Comment 7 Javen O'Neal 2016-07-15 06:13:36 UTC
In r1752778, replaced the NullPointerException with an IOException whose message names the module that VBAMacroReader failed to read.
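
For readers following along, here is a minimal sketch of the kind of guard described in this comment. It is not the code committed to POI: the class ModuleSketch, its fields, and the message wording are all hypothetical, and it reads the raw bytes after the offset rather than decompressing them as a real VBA reader would. It only illustrates failing with a named IOException when the module offset was never set.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for the reader's per-module state; not POI's actual class.
class ModuleSketch {
    Integer offset; // stays null if no module offset was recorded before reading
    byte[] buf;     // bytes read from the module stream (no decompression here)

    // Reads everything after the module offset. A missing offset is reported
    // as an IOException naming the module instead of failing on a null value.
    static void readModule(InputStream stream, String name, ModuleSketch module)
            throws IOException {
        if (module.offset == null) {
            // Assumed message wording, for illustration only.
            throw new IOException("Module offset for '" + name + "' was never read.");
        }
        long skipped = stream.skip(module.offset);
        if (skipped != module.offset) {
            throw new IOException("Could not skip to offset " + module.offset
                    + " in module '" + name + "'.");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = stream.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        module.buf = out.toByteArray();
    }
}
```

A caller that previously saw a bare NullPointerException would instead get an exception it can catch and report per module.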
Comment 8 Tim Allison 2016-10-18 16:31:33 UTC
This file has two _VBA_PROJECT_CUR directories; both contain Sheet1, Sheet2, Sheet3 and thisWorkbook.  The _VBA_PROJECT_CUR under MDB00082648 has only empty (zero-byte) streams for Sheet1, etc., whereas the _VBA_PROJECT_CUR under root has meaningful content.

We are keying only off the name of the stream (e.g. "Sheet2") in our module map.  This means that we're overwriting (or skipping) the other "Sheet2".
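
The collision can be illustrated with a small sketch. Everything in it is hypothetical: the class ModuleKeySketch, the simplified storage paths, and the placeholder content are stand-ins for the layout described above, not POI's data structures. It shows that a map keyed by stream name alone keeps one "Sheet2", while a path-qualified key (shown only for contrast) keeps both.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of name-keyed vs. path-keyed module maps.
class ModuleKeySketch {
    // Each row is {parent storage path, stream name, content}.
    // Paths are simplified; "(module source)" is placeholder content.
    static final String[][] STREAMS = {
        {"MDB00082648/_VBA_PROJECT_CUR", "Sheet2", ""},
        {"_VBA_PROJECT_CUR", "Sheet2", "(module source)"},
    };

    // Keyed by stream name only: the second "Sheet2" overwrites the first.
    static Map<String, String> byName(String[][] streams) {
        Map<String, String> modules = new LinkedHashMap<>();
        for (String[] s : streams) {
            modules.put(s[1], s[2]);
        }
        return modules;
    }

    // Keyed by parent path plus stream name: both entries survive.
    static Map<String, String> byPath(String[][] streams) {
        Map<String, String> modules = new LinkedHashMap<>();
        for (String[] s : streams) {
            modules.put(s[0] + "/" + s[1], s[2]);
        }
        return modules;
    }
}
```

With a name-only key, which "Sheet2" survives depends on the order in which the directories are visited, so the empty stream could just as easily be the one that wins.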

For now, I propose checking if the module.buf is null.  If it is, then we expect an offset and we can go about reading it.

Longer term, we might consider a way to prevent overwriting/skipping of streams with the same module name?

Perhaps this is what is meant by "TODO Refactor this to fetch dir then do the rest"?
Comment 9 Tim Allison 2016-10-18 16:44:30 UTC
r1765479

For now, I've added a check to see if we've already read the module with that name.
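
One possible reading of that check, as a sketch only: the class ReadOnceSketch and its members are hypothetical, and this is not the code from r1765479. It assumes "already read" means the module's buffer is non-null, and it skips any later stream with the same name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a "skip if already read" guard keyed by module name.
class ReadOnceSketch {
    static class Module {
        byte[] buf; // null until a stream with this name has been read
    }

    final Map<String, Module> modules = new HashMap<>();

    // Returns true if the stream was stored, false if a module with the
    // same name had already been read and this stream was skipped.
    boolean readModule(String name, byte[] streamBytes) {
        Module module = modules.get(name);
        if (module == null) {
            module = new Module();
            modules.put(name, module);
        }
        if (module.buf != null) {
            return false; // already read under this name; do not overwrite
        }
        module.buf = streamBytes.clone();
        return true;
    }
}
```

Under this reading the first stream seen for a given name wins, so the result still depends on traversal order; the longer-term question about duplicate module names remains open.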