Attempts to load in my 'spreadsheet from hell' have advanced slightly since the previous reported bug was closed, I downloaded the 20020610 nightly build, and re-ran my test script, now I get a new error: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: Unknown PTG = 3d (61) at org.apache.poi.hssf.record.formula.Ptg.createPtg(Ptg.java:241) at org.apache.poi.hssf.record.NameRecord.getParsedExpressionTokens(NameRecord.java:604) at org.apache.poi.hssf.record.NameRecord.fillFields(NameRecord.java:576) at org.apache.poi.hssf.record.Record.fillFields(Record.java:142) at org.apache.poi.hssf.record.Record.<init>(Record.java:104) at org.apache.poi.hssf.record.NameRecord.<init>(NameRecord.java:116) at java.lang.reflect.Constructor.newInstance(Native Method) at org.apache.poi.hssf.record.RecordFactory.createRecord(RecordFactory.java:247) at org.apache.poi.hssf.eventmodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:201) at org.apache.poi.hssf.eventmodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:139) at PoiTest.main(PoiTest.java:268) Exception in thread "main" org.apache.poi.hssf.record.RecordFormatException: Unable to construct record instance, the following exception occured: null at org.apache.poi.hssf.record.RecordFactory.createRecord(RecordFactory.java:260) at org.apache.poi.hssf.eventmodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:201) at org.apache.poi.hssf.eventmodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:139) Unfortunately I dont know what a PTG is, so I can't offer a tremendous amount of insight yet. I can provide additional information if necessary.
Sounds like a formula thing (which is still being worked on in the trunk). Checkout the 1.5 branch. (Branch tag: REL_1_5_BRANCH). Attn: Formula guys! Is this fixable?
This error seems to be a reoccurring problem. The way I am using HSSF, I read in existing Excel workbooks which have a wide variety of "features". I seem to get this error ( but with different PTG = ?? values ) quite often. The crux is, I am using the EventModel, and this cripples me with an exception. Is this unknown PTG problem going to be continuous, and if so, can we throw this exception somewhere else in the API, thus shielding the EventModel from failure? My hope for the EventModel is that it allows me to "only care about what I need". In this case, I couldn't care about parsing errors on formulas. I want the data. just my 1.5 cents... Jason Hoffman
you shouldn't get this with 1.5 if you have EXPERIMENTAL_FORMULA_SUPPORT compiled out (default). If this is in the 2.0-dev branch well...thats why its the dev branch. If not then we should target this for 1.5.1 and fix it.
Aha.. works fine on the 1.5 branch -- excellent.. Thank you.
BTW, 0x3D is ptgAreaErr3dR .. whatever that means :). Given that we are not likely to implement the semantics of *that* ptg very soon, maybe we should remove the remove the exception and use an UnknownPtg pattern so that we can write back everything we read even if we dont understand it yet. Seems to go in with the philosophy of rest of poi. No?
Hi Avik, If you think about it you'll realize why its not possible. While we *could* have the formula record able to do this without exception, PTGs have an indeterminant size. Records always explicitly declare their size. So, while I'd love to do it that way if I have 10 ptgs, and the 5th one can't be parsed, then neither can the one's following (Because the size of the 5th can't be determined). What we *should* do is make the *formula* record lazy parse these. Only if someone asks it for something related to the PTGS should it parse them. If it can't...Then throw the exception and let the user handle it. For the moment this should be a *checked* exception. (which will make people hate us, but its the *right* thing to do, because its the same sitation as say reading a file...even if you *know* its there....one day it won't be). The formula record should keep some flag on whether to just write the raw stored bytes or un-parse the array of interperated PTG objects. Make sense? (reopening against 2.0-dev)
Learning for the day. Thanks a lot Andy. I dont know about a checked exception, since the exception is being thrown due to an issue with our code, rather than an issue with the environment. What will happen when we finally manage to handle all ptg's. Or do u forsee that the file format will keep evolving and we will never be able to understand *ALL* ptgs? Once i am thru with my current work in a day or two, i'll look at implementing all ptgs (read and write, but not generate). I think i'll be able to do them pretty fast (unlike the current struggle against undocumented formula semantics). Then maybe the problem will go away?
I like the idea of "lazy" parsing. Also, my co-worker and I were discussing the idea of stepping back and registering the records you care about as the ultimate first step. This would essentially mean "lazily" pasring EVERYTHING, unless you have registered that you would like a thorough parsing of certain record types. This could possibly reduce the memory usage of "Reading" existing spreadsheets. I'm implying taking this "lazy" approach as far back as is possible, thus reducing the unnecessary creation of Java objects, which will never be used. ( my additional .5 cents, added to earlier comment = 2 cents.. ( I'll shutup now ))
Registering interest .. well, thats what the Event API is meant for. I have fixed this problem for the moment as follows: If its a PTG we cannot recognise, then the FormulaRecord falls back to dumb "store the whole byte thingy" behaviour. Trying to get the formula from usermodel will result in a #NAME , which is what Excel itself uses when it cant understand a formula. Therefore, people who dont need to read formulas will not be bother by unknown ptg exceptions, and it will be written back properly. As a result, one method on FormulaRecord CAN return a null. It is therefore noted in its javadoc comments, callers should check. However the method is meant to be called primarily from within poi. so we should be okay. I've tried to check all its callers in poi. I think this fixes jason's 1.5 cents, the other .5 cents i am not sure we need ... just my own 2c :)