Bug 68335 - java.lang.IllegalArgumentException in org.apache.poi.hssf
Summary: java.lang.IllegalArgumentException in org.apache.poi.hssf
Status: RESOLVED WORKSFORME
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF
Version: 5.2.3-FINAL
Hardware: PC Linux
Importance: P2 major
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-12-13 10:58 UTC by Xiaohan Zhang
Modified: 2024-02-25 12:43 UTC
0 users



Attachments
Crash samples (4.38 KB, application/zip)
2023-12-13 10:58 UTC, Xiaohan Zhang
POC xls file (3.01 KB, application/vnd.ms-excel)
2023-12-13 11:20 UTC, Xiaohan Zhang
POC xls file2 (16.99 KB, application/vnd.ms-excel)
2023-12-13 11:21 UTC, Xiaohan Zhang

Description Xiaohan Zhang 2023-12-13 10:58:24 UTC
Created attachment 39457 [details]
Crash samples

Recently we discovered a bug in POI (5.2.3).
Because we lack contextual knowledge of the POI library, we cannot thoroughly fix these bugs ourselves, so we look forward to any plan the developers can propose for fixing them.

# Crash Stack
```
('org.apache.poi.hssf.record.aggregates.FormulaRecordAggregate.<init>', 'FormulaRecordAggregate.java:73'),
Exception in thread "main" java.lang.IllegalArgumentException: Unexpected base token id (-64)
        at org.apache.poi.ss.formula.ptg.Ptg.createBasePtg(Ptg.java:170)
        at org.apache.poi.ss.formula.ptg.Ptg.createPtg(Ptg.java:92)
        at org.apache.poi.ss.formula.ptg.Ptg.readTokens(Ptg.java:66)
        at org.apache.poi.ss.formula.Formula.getTokens(Formula.java:89)
        at org.apache.poi.hssf.record.FormulaRecord.getParsedExpression(FormulaRecord.java:213)
        at org.apache.poi.hssf.record.aggregates.FormulaRecordAggregate.handleMissingSharedFormulaRecord(FormulaRecordAggregate.java:94)
        at org.apache.poi.hssf.record.aggregates.FormulaRecordAggregate.<init>(FormulaRecordAggregate.java:73)
        at org.apache.poi.hssf.record.aggregates.ValueRecordsAggregate.construct(ValueRecordsAggregate.java:179)
        at org.apache.poi.hssf.record.aggregates.RowRecordsAggregate.<init>(RowRecordsAggregate.java:113)
        at org.apache.poi.hssf.model.InternalSheet.<init>(InternalSheet.java:189)
        at org.apache.poi.hssf.model.InternalSheet.createSheet(InternalSheet.java:128)
        at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:382)
        at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:431)
        at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:411)
        at com.test.Entry.main(Entry.java:34)
```

```
('org.apache.poi.hssf.record.common.UnicodeString.<init>', 'UnicodeString.java:96'),
Exception in thread "main" java.lang.IllegalArgumentException: Cannot create a ChainLoopDetector with negative size, but had: -2147483648
        at org.apache.poi.poifs.filesystem.BlockStore$ChainLoopDetector.<init>(BlockStore.java:89)
        at org.apache.poi.poifs.filesystem.POIFSMiniStore.getChainLoopDetector(POIFSMiniStore.java:237)
        at org.apache.poi.poifs.filesystem.POIFSStream$StreamBlockByteBufferIterator.<init>(POIFSStream.java:195)
        at org.apache.poi.poifs.filesystem.POIFSStream.getBlockIterator(POIFSStream.java:96)
        at org.apache.poi.poifs.filesystem.POIFSStream.iterator(POIFSStream.java:87)
        at org.apache.poi.poifs.filesystem.POIFSDocument.getBlockIterator(POIFSDocument.java:177)
        at org.apache.poi.poifs.filesystem.DocumentInputStream.<init>(DocumentInputStream.java:92)
        at org.apache.poi.poifs.filesystem.DirectoryNode.createDocumentInputStream(DirectoryNode.java:160)
        at org.apache.poi.poifs.filesystem.DirectoryNode.createDocumentInputStream(DirectoryNode.java:137)
        at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:369)
        at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:431)
        at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:411)
        at com.test.Entry.main(Entry.java:34)
```
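
Both messages above report negative numbers, which is how out-of-range bytes in a malformed file tend to surface once Java reads them as signed values. The sketch below uses only the JDK, not POI. The class name `SignedValues`, the idea that the token id was read as a single signed byte, and the idea that the size came from four little-endian bytes are all assumptions for illustration; the traces do not show how POI actually arrived at either value.

```java
public class SignedValues {
    // Assumption: the token id is read as one signed byte, so raw 0xC0 becomes negative.
    static int asSignedByte(int rawUnsigned) {
        return (byte) rawUnsigned;
    }

    // Interpret four little-endian bytes as a signed 32-bit int.
    // Assumption: a size field read this way could produce the negative value in the message.
    static int littleEndianInt(int b0, int b1, int b2, int b3) {
        return (b0 & 0xFF)
                | ((b1 & 0xFF) << 8)
                | ((b2 & 0xFF) << 16)
                | ((b3 & 0xFF) << 24);
    }

    public static void main(String[] args) {
        // Raw byte 0xC0 read as a signed byte
        System.out.println(asSignedByte(0xC0));
        // Bytes 00 00 00 80 read as a signed little-endian int
        System.out.println(littleEndianInt(0x00, 0x00, 0x00, 0x80));
    }
}
```

Under these assumptions, the first call yields the -64 from the first message and the second yields -2147483648 (`Integer.MIN_VALUE`) from the second, which is consistent with arbitrary fuzzed bytes rather than a very large but valid file.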

# Test Program

```
package com.test;

import java.io.FileInputStream;
import java.io.IOException;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;

public class Entry {
        public static void main(String[] args) throws IOException {
                assert args.length == 1;
                System.out.println("Testing Harness with args[0]: " + args[0]);
                // try-with-resources closes the stream and the workbook, even on failure.
                // Both reported exceptions are thrown from the HSSFWorkbook constructor.
                try (FileInputStream fis = new FileInputStream(args[0]);
                     Workbook workbook = new HSSFWorkbook(fis)) {
                        int numberOfSheets = workbook.getNumberOfSheets();
                        for (int i = 0; i < numberOfSheets; i++) {
                                Sheet sheet = workbook.getSheetAt(i);
                                for (Row row : sheet) {
                                        for (Cell cell : row) {
                                                // Print only string and numeric cells; skip the rest
                                                if (cell.getCellType() == CellType.STRING) {
                                                        String name = cell.getStringCellValue().trim();
                                                        System.out.println("Random data::" + name);
                                                } else if (cell.getCellType() == CellType.NUMERIC) {
                                                        System.out.println("Random data::" + cell.getNumericCellValue());
                                                }
                                        }
                                }
                        }
                } catch (IOException e) {
                        e.printStackTrace();
                }
        }
}
```
Comment 1 PJ Fanning 2023-12-13 11:14:57 UTC
* POI is a volunteer project and the community is no longer very active
* The test case and stacktrace you provided are not very useful. Please provide an xls file that reproduces the issue.
* The IllegalArgumentException could be because there is a number overflow somewhere - but this could be a sign that you have a very big file
* I am not going to open your zip file. I have no idea what is in it but it doesn't appear to be an xls file. 

At the end of the day, users will need to get used to the idea that they will need to roll up their own sleeves and do a lot of the investigation themselves. The POI code is plain Java. It's fairly complicated, but a motivated developer should be able to make reasonable progress in working out how it works.
Comment 2 PJ Fanning 2023-12-13 11:17:58 UTC
At this stage, there isn't much interest in maintaining or enhancing the HSSF code. The xls format is prehistoric. xlsx is much better supported. I, for one, will occasionally look at XSSF issues (xlsx files) but have very little interest in HSSF issues.
Comment 3 Xiaohan Zhang 2023-12-13 11:20:34 UTC
Created attachment 39459 [details]
POC xls file

Sorry for the inconvenience. I have attached the xls file that crashes the test program.
Comment 4 Xiaohan Zhang 2023-12-13 11:21:16 UTC
Created attachment 39460 [details]
POC xls file2
Comment 5 PJ Fanning 2023-12-13 11:27:03 UTC
I tried 'POC xls file' and it won't open in Excel (outlook.com).

I also tried 'POC xls file2' and it was 'repaired' by Excel but nothing was left after it was repaired.

Those files are corrupt.
Comment 6 PJ Fanning 2023-12-13 11:43:24 UTC
See https://bz.apache.org/bugzilla/show_bug.cgi?id=68336 -- this user is just fuzzing files and expecting someone else to deal with them.
Comment 7 Dominik Stadler 2024-02-25 12:43:21 UTC
Apache POI does not try to handle broken documents without throwing exceptions.

It tries to not allocate endless amounts of memory and not run into endless loops/stackoverflow-exceptions. 

Therefore, in this case it seems fine to get this type of exception when the input data is actually a document produced by a fuzzer.

See https://github.com/google/oss-fuzz/tree/master/projects/apache-poi/src/main/java/org/apache/poi for some fuzz-targets and which exceptions they currently handle "gracefully".
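
For callers who feed untrusted files to POI, one way to act on this is to treat a chosen set of exception types as "rejected input" rather than as a crash. The following is a minimal sketch under stated assumptions, not POI's or the fuzz targets' actual code: `GracefulParse`, `Outcome`, and the `EXPECTED` list are hypothetical names, the choice of tolerated types is the caller's, and the `Callable` stands in for a call such as `new HSSFWorkbook(stream)`.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;

public class GracefulParse {
    enum Outcome { PARSED, REJECTED }

    // Assumption: the caller decides which exception types mean "malformed input".
    static final List<Class<? extends Exception>> EXPECTED =
            Arrays.asList(IOException.class, IllegalArgumentException.class);

    // Runs the parser; expected failures become REJECTED, anything else is re-raised.
    static Outcome tryParse(Callable<?> parser) {
        try {
            parser.call();
            return Outcome.PARSED;
        } catch (Exception e) {
            for (Class<? extends Exception> type : EXPECTED) {
                if (type.isInstance(e)) {
                    return Outcome.REJECTED;
                }
            }
            // Unexpected failure: keep the original exception as the cause.
            throw new RuntimeException("unexpected failure while parsing", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a parse that succeeds
        System.out.println(tryParse(() -> "workbook"));
        // Stand-in for a parse that fails the way the first trace in this report does
        System.out.println(tryParse(() -> {
            throw new IllegalArgumentException("Unexpected base token id (-64)");
        }));
    }
}
```

Keeping the tolerated list explicit means a new exception type shows up as an unexpected failure instead of being silently swallowed, which is roughly the distinction the fuzz targets draw between exceptions handled "gracefully" and real findings.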