Bug 60811 - Possibility to keep structured references string
Summary: Possibility to keep structured references string
Status: NEW
Alias: None
Product: POI
Classification: Unclassified
Component: XSSF (show other bugs)
Version: 3.15-FINAL
Hardware: PC All
: P2 enhancement (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-03-03 11:42 UTC by Phili0815
Modified: 2017-03-03 21:37 UTC (History)
0 users



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Phili0815 2017-03-03 11:42:23 UTC
When you parse a formula with structured references like '[columnname]', they will be replaced with an area reference! But this results in two problems when rendering the formula back to a string via FormulaRenderer:

- You will definetely loose the orginal reference string e.g. '[columnname]' because it will be replaced with something like 'sheetname!A2:A10'. However it may be desired to keep the original structured reference string because it is more readable. Especially in large tables with many columns where you don't know immediately what data is in column e.g. BF.

- Another problem is that when you copy the formula to different cells in the table and then add some new rows to the table, all the copied formulas will be invalid because the area reference does not cover the complete column anymore. If you open such a file in Excel 2016 it will give you warnings like 'This cell is inconsistent with the column formula'. If the structured references would be kept, this would never be a problem because such a reference remain valid independent of the number of rows in the table.


So for example a use case from me:

I have a template sheet that contains a table with a single row. Some of the cells in the row contain formulas. Now I load some data from a database and insert the data row by row into the table. Therefore the formulas have to be copied. The formulas are mixed with structured references and normal cell references like E2, A4,... which depend on the cell's row. Hence I have to parse the orginal formula, update the normal cell references depending on the current row and set the updated formula. But exactly this results now in the described problem because the structured reference is now an area reference that gets not updated with the increasing table. After inserting all the data I update the table area. I can't do that before because I don't know how many data there will be. So adding an option to keep the structural reference as a structural reference would be realy nice.
Comment 1 Javen O'Neal 2017-03-03 15:28:19 UTC
This seems doable. I'm assuming a structures reference is its own Ptg that can be eval'd to an AreaReference.
Comment 2 Greg Woolsey 2017-03-03 15:54:02 UTC
When I implemented structured reference formula parsing last year, I was focused on evaluation, not reversibility, so I didn't go to the work of creating a new Ptg type, I just converted the reference to an area Ptg as part of parsing the formula.  So I see the work you would like to have here.

However, why do you need to parse the formula into a Ptg[] when adding new rows?  Why can't you convert all the normal cell references to structured row-based column references?  Then you don't have to manipulate the formula string at all, just copy it to new rows.
Comment 3 Phili0815 2017-03-03 21:01:57 UTC
Yeah, of course there are several workarounds for me. Currently I escape the structured references such that they are handled as a string and unescape them after rendering the formula back. The problem is that I have multiple templates, not all of them use structured references, not all of them have even set a table and some use both etc. So my programm have to deal with all these cases. Thats why it would be nice to have.

Or may be you know a better way. I noticed that Excel stores column formulas for every column. These formulas can be found in CTTableColumn.getCalculatedColumnFormula() and they also causes the error message in Excel: 'This cell is inconsistent with the column formula' if you set the manipulated formulas with the wrong translated structural reference. Does there exists a possibility in POI to use this formula by default for a table column? Copying in every cell is not working because then again normal cell references are wrong.
Comment 4 Greg Woolsey 2017-03-03 21:37:49 UTC
I have a rather large class that does pretty much what you are wanting, that I use for my day job.  It digs into the XSSFTable for the related CTConnection, uses that to query the database, and updates the table definition with the new results, wiping out previous rows.

I combine the 

CTQueryTableFields.getQueryTableFieldList() 
and the 
CTTable.getTableColumns().getTableColumnList()

by index into a composite object that tracks the column definitions.  For calculated columns, that don't come from the query results, I just set the cell formula to:

CTTableColumn.getCalculatedColumnFormula().getStringValue()

which works for me just fine in all my tests so far.  I've not had to do any manual adjusting of formulas, but I don't know that I've tried any with references that weren't already in structured query format.

If the table isn't an actual table, but just a formatted range of cells, then you will have to write your own handling.  But this works for me for QueryTable updates, which is a key part of my company's product.

I've thought about making this part of POI, but haven't had the time or inclination to figure out what a reasonable API would be.  Currently I use ResultSet.next() and ResultSet.getObject(CTQueryTableField.getName()) to match things up.  I suppose I could write an Interface that encapsulated them, with some checked exceptions for missing/invalid column names in the input vs. the table definition.

Doesn't do anything for reversing formula evaluation, but does handle QueryTable updates.

That should be enough details to point you in the right direction if you want to take a crack at it.