Bug 62948 - Row ranges are transformed to cell ranges during parsing
Summary: Row ranges are transformed to cell ranges during parsing
Status: NEW
Alias: None
Product: POI
Classification: Unclassified
Component: SS Common (show other bugs)
Version: 4.0.x-dev
Hardware: PC All
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2018-11-25 16:29 UTC by gallon.fizik@gmail.com
Modified: 2018-11-25 19:18 UTC (History)
0 users

file for the testcase (136.04 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2018-11-25 19:17 UTC, gallon.fizik@gmail.com

Note You need to log in before you can comment on or make changes to this bug.
Description gallon.fizik@gmail.com 2018-11-25 16:29:37 UTC
Cell cell = new XSSFWorkbook().createSheet().createRow(0).createCell(0);

EvaluationWorkbook evaluationWorkbook = XSSFEvaluationWorkbook.create((XSSFWorkbook) cell.getSheet().getWorkbook());

Ptg[] ptgs = FormulaParser.parse(cell.getCellFormula(), (FormulaParsingWorkbook) evaluationWorkbook, FormulaType.CELL, 0, 0);

String reconstructed = FormulaRenderer.toFormulaString((FormulaRenderingWorkbook) evaluationWorkbook, ptgs);

// reconstructed == "SUM(sheet!$A85:$XFD85)" // wuut

I'm on it and will hopefully publish a fix soon
Comment 1 gallon.fizik@gmail.com 2018-11-25 19:14:44 UTC
1. HSSF converts row range to cell range right away at setCellFormula. Demo:
        Cell cell = new HSSFWorkbook().createSheet().createRow(0).createCell(0);
        System.out.println(cell.getCellFormula()); // SUM($A85:$IV85)
2. XSSF does parse the formula string during a Cell.setCellFormula() but the original string is stored. So for XSSF the case may seem insignificant but...

3. However I found a valid showcase: an expression like SUM(5:5) may be a master formula for a shared formula. To produce formula strings for secondary cells, the master formula is parsed, shifted as Ptg[] and rendered back to string. And here's the case that the original form is overwritten.

    public void demo() throws IOException {
        Row row = new XSSFWorkbook("res/test.xlsx").getSheet("sheet").getRow(0);
        System.out.println(row.getCell(0).getCellFormula()); // SUM(5:5)
        System.out.println(row.getCell(1).getCellFormula()); // SUM($A5:$XFD5)

Although seemingly insignificat for any evaluations, it's bad that formulas get to look different.

4. And another thing. A case of a reference with column range (F:F) is handled in AreaPtgBase.formatReferenceAsString() but with row range is not. It's pretty easy to add. However,

*** formatReferenceAsString (and the whole FormulaRenderer) uses hard-coded SpreadSheetVersion == EXCEL97 ***

so the formatter has now way to detect correctly if a reference covers a whole row/range of rows.

Furthermore, FormulaParser *is* aware of the version... therefore the formatter won't detect a reference as a row reference because it ises excel97, and the sizes won't match. 

My proposition:
* add FormulaRenderingWorkbook.getSpreadSheetVersion()
* pass the version to any Ptg.formatReferenceString()
* use EXCEL97 as the default version
* version-sensitive ptgs (well... a ptg doesn't care much about the version but at least it formats itself) will use the version to produce a correct string.

It may also be a good thing gor the future in case the next spreadsheet format has some differences in syntax/formatting.
Comment 2 gallon.fizik@gmail.com 2018-11-25 19:17:03 UTC
Created attachment 36277 [details]
file for the testcase