Summary: | Bring over missing constants from Tika | ||
---|---|---|---|
Product: | POI | Reporter: | Nick Burch <apache> |
Component: | POI Overall | Assignee: | POI Developers List <dev> |
Status: | NEW --- | ||
Severity: | enhancement | ||
Priority: | P2 | ||
Version: | 3.17-dev | ||
Target Milestone: | --- | ||
Hardware: | All | ||
OS: | All | ||
Attachments: |
a quick comparison of Tika and POI constants
a quick comparison of Tika and POI constants |
Description
Nick Burch
2017-07-13 20:31:45 UTC
Created attachment 35138 [details]
a quick comparison of Tika and POI constants

https://github.com/apache/tika/tree/master/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/

git clone https://github.com/apache/tika.git apache-tika
pushd apache-tika
cd tika-parsers/src/main/java/org/apache/tika/parser/microsoft/
grep -r -P "(static final|final static|http://schemas|vnd|urn)" .

Most notably,
* ./ooxml/AbstractOOXMLExtractor.java has 8 relationship schema URLs and 1 OOXML MIME type
* ./ooxml/OOXMLWordAndPowerPointTextHandler.java has 6 schema URLs and 2 URNs
* ./POIFSContainerDetector.java has several MIME types

And a few others.

See attachment for a list of current constants that could be copied over.

Created attachment 35139 [details]
a quick comparison of Tika and POI constants
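
For anyone without a grep that supports -P, the scan from the description can be approximated in plain Java. This is only a rough sketch, not part of Tika or POI: the class name ConstantScan and the method matchingLines are made up for illustration, and it assumes only *.java files encoded as UTF-8 are of interest (grep -r looks at every file).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, named here for illustration only.
public class ConstantScan {

    // Same alternation as the grep -P pattern in the description.
    static final Pattern CANDIDATE =
            Pattern.compile("(static final|final static|http://schemas|vnd|urn)");

    /** Returns the lines that contain at least one match, in input order. */
    public static List<String> matchingLines(List<String> lines) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            if (CANDIDATE.matcher(line).find()) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Directory to scan; defaults to the current directory, like "grep -r ... ."
        Path root = Paths.get(args.length > 0 ? args[0] : ".");

        // Assumption: only regular *.java files are scanned.
        List<Path> sources;
        try (Stream<Path> walk = Files.walk(root)) {
            sources = walk
                    .filter(Files::isRegularFile)
                    .filter(p -> p.toString().endsWith(".java"))
                    .sorted()
                    .collect(Collectors.toList());
        }

        // Print "path:line" for each hit, roughly like grep's output.
        for (Path source : sources) {
            List<String> lines = Files.readAllLines(source, StandardCharsets.UTF_8);
            for (String hit : matchingLines(lines)) {
                System.out.println(source + ":" + hit.trim());
            }
        }
    }
}
```

As with the grep, the bare "urn" alternative also matches "return", so the output is a list of candidates to review by hand rather than a clean list of constants.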
Yup. Sorry. I've been meaning to do this.

Thank you, Nick and Javen!

Speaking of which... is there any interest in moving over the SAX-based docx/pptx code from Tika into POI?

Yes, absolutely! |