Bug 60260

Summary: Sheet shiftRows exception at rowShifter.updateNamedRanges
Product: POI Reporter: Zero <tuyenivt>
Component: XSSFAssignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P2    
Version: 3.15-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: excel file used in test code

Description Zero 2016-10-17 03:07:10 UTC
Created attachment 34376 [details]
excel file used in test code

I meet excetion when shift rows in sheet name Sheet・1 and named_range test_named_range include sheet name

I add file in attactment and code used (tested with 3.15 and 3.16-beta1)

    @Test
    public void testShiftRows() throws IOException {
        String in = "data/book2.xlsx";
        try (XSSFWorkbook wb = new XSSFWorkbook(new FileInputStream(in))) {
            XSSFSheet sheet = wb.getSheetAt(0);
            sheet.shiftRows(1, 2, 3);
        }
    }

Exception message:

org.apache.poi.ss.formula.FormulaParseException: Specified named range 'Sheet' does not exist in the current workbook.
	at org.apache.poi.ss.formula.FormulaParser.parseNonRange(FormulaParser.java:898)
	at org.apache.poi.ss.formula.FormulaParser.parseRangeable(FormulaParser.java:490)
	at org.apache.poi.ss.formula.FormulaParser.parseRangeExpression(FormulaParser.java:311)
	at org.apache.poi.ss.formula.FormulaParser.parseSimpleFactor(FormulaParser.java:1509)
	at org.apache.poi.ss.formula.FormulaParser.percentFactor(FormulaParser.java:1467)
	at org.apache.poi.ss.formula.FormulaParser.powerFactor(FormulaParser.java:1454)
	at org.apache.poi.ss.formula.FormulaParser.Term(FormulaParser.java:1827)
	at org.apache.poi.ss.formula.FormulaParser.additiveExpression(FormulaParser.java:1955)
	at org.apache.poi.ss.formula.FormulaParser.concatExpression(FormulaParser.java:1939)
	at org.apache.poi.ss.formula.FormulaParser.comparisonExpression(FormulaParser.java:1896)
	at org.apache.poi.ss.formula.FormulaParser.intersectionExpression(FormulaParser.java:1869)
	at org.apache.poi.ss.formula.FormulaParser.unionExpression(FormulaParser.java:1849)
	at org.apache.poi.ss.formula.FormulaParser.parse(FormulaParser.java:1997)
	at org.apache.poi.ss.formula.FormulaParser.parse(FormulaParser.java:170)
	at org.apache.poi.xssf.usermodel.helpers.XSSFRowShifter.updateNamedRanges(XSSFRowShifter.java:91)
	at org.apache.poi.xssf.usermodel.XSSFSheet.shiftRows(XSSFSheet.java:3068)
	at org.apache.poi.xssf.usermodel.XSSFSheet.shiftRows(XSSFSheet.java:2925)
	at test.poi.TestExcel.testShiftRows(TestExcel.java:69)

Thank you very much for reading and support
Comment 1 Zero 2017-01-12 02:11:01 UTC
The exception also occurred while rename sheet
Comment 2 Javen O'Neal 2017-01-12 08:16:03 UTC
NamedRange(
    name = "test_named_range",
    refersTo = "'Sheet・1'!A1:A9"
)

This is the first time I have seen the sheet name be non-ASCII, which probably breaks some of our assumptions and regular expressions.

How do you think this should be fixed?
The rename sheet [1] case will be easier to start with as it doesn't require understanding the parse tree.

[1] https://svn.apache.org/viewvc/poi/trunk/src/ooxml/java/org/apache/poi/xssf/usermodel/XSSFWorkbook.java?revision=1777855&view=markup#l1656
Comment 3 Javen O'Neal 2017-01-12 08:20:29 UTC
I stand corrected. Renaming a sheet requires updating named ranges (and formulas) that refer to that sheet, which requires parsing the formula. We'll need to dig through the parse tree code either way.

https://svn.apache.org/viewvc/poi/trunk/src/ooxml/java/org/apache/poi/xssf/usermodel/helpers/XSSFFormulaUtils.java?revision=1754521&view=markup
Comment 4 Javen O'Neal 2017-01-12 08:32:39 UTC
The formula parse tree assumes the formula contains only ASCII. It reads one **char** from the formula string at a time via GetChar() [1]. For multi-byte symbols in the formula, a char is returned for each byte in the symbol [2]. Perhaps we should be using String.codePointAt(int index) instead [3].

> private char look;
> ...
> private void GetChar() {
> ...
>     look=_formulaString.charAt(_pointer);

[1] https://svn.apache.org/viewvc/poi/trunk/src/java/org/apache/poi/ss/formula/FormulaParser.java?revision=1776796&view=markup#l1134

[2] http://docs.oracle.com/javase/6/docs/api/java/lang/Character.html#unicode
[3] http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#codePointAt(int)
Comment 5 Zero 2017-01-12 10:13:55 UTC
In my test, maybe multi-byte char is invalid when isUnquotedSheetNameChar check sheet name chars [1], char '・' not match any case in 
> private static boolean isUnquotedSheetNameChar(char ch) {
>   if (Character.isLetterOrDigit(ch)) {
>     return true;
>   }
>   switch(ch) {
>     case '.': // dot is OK
>     case '_': // underscore is OK
>         return true;
>   }
> ...
[1] https://svn.apache.org/viewvc/poi/trunk/src/java/org/apache/poi/ss/formula/FormulaParser.java?revision=1776796&view=markup#l1178
Comment 6 Javen O'Neal 2017-01-12 10:57:53 UTC
Added support for unicode (one and multi-char symbols) in r1778418 and docs in 1778419.
Note that with these changes, look is a codepoint value, not a char, in the formula parser.

Thanks for the simple problem description, stack trace, unit test and test file.

These changes will be included in the next release, 3.16 beta 2.
Comment 7 Javen O'Neal 2017-01-12 11:15:01 UTC
http://graphemica.com/%E3%83%BB
This is katakana middle dot (U+30FB)
Comment 8 Javen O'Neal 2017-01-12 17:38:58 UTC
(In reply to Zero from comment #1)
> The exception also occurred while rename sheet

Do you have a unit test for this as well? Oddly, the two other unit tests that I wrote in r1778418 that started with a blank workbook passed without modifying Formula parser. It'd be good to have another meaningful unit test if you have it.
Comment 9 Zero 2017-01-13 02:16:08 UTC
(In reply to Javen O'Neal from comment #8)
> (In reply to Zero from comment #1)
> > The exception also occurred while rename sheet
> 
> Do you have a unit test for this as well? Oddly, the two other unit tests
> that I wrote in r1778418 that started with a blank workbook passed without
> modifying Formula parser. It'd be good to have another meaningful unit test
> if you have it.

Your changes work with me, I try to test based on your unit test (renameSheetWithNamedRangeReferringToUnicodeSheetName), and all my case passed

> @Test
> public void renameSheetWithNamedRangeReferringToUnicodeSheetName() {
>     List<String> specChars = Arrays.asList(
>             "\u2018", "\u300C", "\u300D", "\uFFE5", "\uFF5B",
>             "\uFF5D", "\uFF5C", "\uFF1B", "\u2019", "\uFF1A", "\u201D",
>             "\u30FC", "\uFF5E", "\uFF01", "\uFF20", "\uFF03", "\uFF04",
>             "\uFF05", "\uFF3E", "\uFF06", "\uFF0A", "\uFF08", "\uFF09",
>             "\uFF3F", "\uFF1C", "\uFF1E", "\uFF1F", "\u3001", "\u3002",
>             "\u30FB", "\u2605", "\u2606", "\uFF0B", "\u25CF", "\u203B",
>             "\uFF1D", "\u2460", "\u2461", "\u2462", "\u2463", "\u2464",
>             "\u2465", "\u2466", "\u2467", "\u2468", "\u2469", "\u246A",
>             "\u246B", "\u246C", "\u246D", "\u246E", "\u246F", "\u2470",
>             "\u2471", "\u2472", "\u2473", "\u24EB", "\u24EC", "\u24ED",
>             "\u24EE", "\u24EF", "\u24F0", "\u24F1", "\u24F2", "\u24F3",
>             "\u24F4", "\u3007", "\uFF0E", "\u30FB", "\u3002", "\u25CB",
>             "\u25CE", "\u21D4", "\u2192", "\u2190", "\u2191", "\u2193",
>             "\u21D2", "\u3010", "\u3011", "\u3014", "\u3015", "\u300A",
>             "\u300B", "\u3008", "\u3009", "\u300E", "\u300F", "\uFF1C",
>             "\uFF1E", "\u226A", "\u226B", "\u301D", "\u301F");
>     XSSFWorkbook wb = new XSSFWorkbook();
>     for (String specChar : specChars) {
>         wb.createSheet("Sheet" + specChar + "1");
> 
>         Name name = wb.createName();
>         name.setNameName("test_named_range");
>         name.setRefersToFormula("'Sheet" + specChar + "1'!A1:A6");
> 
>         wb.setSheetName(0, "Sheet 1");
>         
>         wb.removeName(name);
>         wb.removeSheetAt(0);
>     }
>     IOUtils.closeQuietly(wb);
> }

Thank you so much, Javen O'Neal