Summary: [Patch] A Simple Extractor and Workbook are proposed
Component: XSSF
Assignee: POI Developers List <dev>
Description ssmeets 2010-10-11 19:54:15 UTC
Created attachment 26160: svn diff output

Proposed are a SimpleExtractor and an XSSFSimpleWorkbook, which give Tika a more efficient, SAX-based way of parsing XLSX spreadsheets. This is related to TIKA-521 (https://issues.apache.org/jira/browse/TIKA-521). Test cases will follow once the proposed approach is approved.
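For readers unfamiliar with the SAX approach mentioned above, here is a minimal sketch using only the JDK's SAX parser. It is not the code from the attached patch: the class name `SheetSaxSketch` and the method `readCells` are hypothetical, and the sketch assumes a simplified worksheet part whose cells hold literal or inline-string values. Cells that point into the shared strings table (`t="s"`) would come out as raw indices here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; not part of POI or of the attached patch.
class SheetSaxSketch {

    // Streams through the sheet XML and returns "ref=value" for each cell value seen.
    static List<String> readCells(String sheetXml) throws Exception {
        final List<String> cells = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private String cellRef;       // value of the current <c r="..."> attribute
            private StringBuilder value;  // non-null while inside <v> or <t>

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("c".equals(qName)) {
                    cellRef = attrs.getValue("r");
                } else if ("v".equals(qName) || "t".equals(qName)) {
                    value = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Text may arrive in several chunks, so accumulate it.
                if (value != null) {
                    value.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (value != null && ("v".equals(qName) || "t".equals(qName))) {
                    cells.add(cellRef + "=" + value);
                    value = null;
                }
            }
        };
        // The default factory is not namespace-aware, so qName is the plain tag name.
        SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(sheetXml.getBytes(StandardCharsets.UTF_8)), handler);
        return cells;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<worksheet><sheetData><row r=\"1\">"
                + "<c r=\"A1\"><v>42</v></c>"
                + "<c r=\"B1\" t=\"inlineStr\"><is><t>hello</t></is></c>"
                + "</row></sheetData></worksheet>";
        System.out.println(readCells(xml)); // prints [A1=42, B1=hello]
    }
}
```

Because the handler only keeps the current cell in memory, the sheet never has to be loaded as a full object model, which is the efficiency gain the proposal is after.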
Comment 2 Nick Burch 2010-11-19 13:18:28 UTC
I've done some refactoring of XSSFEventBasedExcelExtractor in r1036968, which should help on the Tika side when it comes to outputting the values as XHTML.

Next, I'll need to expand your XSSFSimpleWorkbook to cover all the different file parts we might need in order to replicate the functionality of XSSFExcelExtractorDecorator. This may need some more POI refactoring as well as new code.

Finally, we'd need to go to the Tika side and update XSSFExcelExtractorDecorator to use the new simple workbook, and implement a SheetContentsHandler which generates the XHTML events.
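The last step could look roughly like the sketch below. To stay self-contained it does not use POI's or Tika's classes: `RowCellCallbacks` is a hypothetical stand-in for a row/cell callback interface such as SheetContentsHandler, and `XhtmlRowEmitter` and `XhtmlEmitterDemo` are likewise made-up names. The real interface and the real Tika XHTML handler may differ in method names, signatures, and namespace handling.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for a row/cell callback interface (an assumption, not POI's API).
interface RowCellCallbacks {
    void startRow(int rowNum) throws SAXException;
    void cell(String cellReference, String formattedValue) throws SAXException;
    void endRow() throws SAXException;
}

// Turns row/cell callbacks into <tr>/<td> SAX events on a downstream ContentHandler.
class XhtmlRowEmitter implements RowCellCallbacks {
    private static final Attributes NO_ATTRS = new AttributesImpl();
    private final ContentHandler xhtml;

    XhtmlRowEmitter(ContentHandler xhtml) {
        this.xhtml = xhtml;
    }

    @Override
    public void startRow(int rowNum) throws SAXException {
        // The sketch uses an empty namespace URI; real XHTML output would use the XHTML namespace.
        xhtml.startElement("", "tr", "tr", NO_ATTRS);
    }

    @Override
    public void cell(String cellReference, String formattedValue) throws SAXException {
        xhtml.startElement("", "td", "td", NO_ATTRS);
        char[] chars = formattedValue.toCharArray();
        xhtml.characters(chars, 0, chars.length);
        xhtml.endElement("", "td", "td");
    }

    @Override
    public void endRow() throws SAXException {
        xhtml.endElement("", "tr", "tr");
    }
}

class XhtmlEmitterDemo {
    // Feeds one row of two cells through the emitter and records the resulting events as text.
    static String render() throws SAXException {
        final StringBuilder out = new StringBuilder();
        ContentHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes a) {
                out.append('<').append(qName).append('>');
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                out.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                out.append("</").append(qName).append('>');
            }
        };
        RowCellCallbacks emitter = new XhtmlRowEmitter(recorder);
        emitter.startRow(0);
        emitter.cell("A1", "42");
        emitter.cell("B1", "hello");
        emitter.endRow();
        return out.toString();
    }

    public static void main(String[] args) throws SAXException {
        System.out.println(render()); // prints <tr><td>42</td><td>hello</td></tr>
    }
}
```

Keeping the emitter behind a plain ContentHandler means the same callbacks could feed whatever XHTML handler Tika supplies, without the sheet parser knowing about the output format.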