Created attachment 28574 [details] Patch
Created attachment 28575 [details] Source xls file to show the structure of drawing records
Created attachment 28576 [details] Target file which represents structure of workbook.xls in xml format
Hi, I have created this utility as part of my GSoC proposal. http://www.google-melange.com/gsoc/proposal/review/google/gsoc2012/evgeniy44/1 I look forward to your feedback.
Hi Evgeniy, A very good start! I see that for now only a few records override toXML(): EscherSplitMenuColorsRecord, EscherBSERecord and EscherDggRecord. Do you plan to implement toXML for the rest of DDF ? I think the answer is 'yes', just want to confirm it. For EscherOptRecord it would be great to show properties as child nodes, something like this: <EscherOptRecord> <Property id="127" value="32440321" name="protection.lockagainstgrouping"/> <Property id="128" value="809474328" name="text.textid"/> <Property id="191" value="262148" name="text.sizetexttofitshape"/> <Property id="447" value="1114113" name="fill.nofillhittest"/> <Property id="511" value="1638417" name="linestyle.nolinedrawdash"/> <Property id="831" value="524288" name="shape.backgroundshape"/> <Property id="959" value="131072" name="groupshape.print"/> </EscherOptRecord> Some records show recordId as decimals, others as hex. It should be unified, I think hex everywhere would be OK. Records thar are not supported (UnknownEscherRecord) should print their data in hex form: <UnknownEscherRecord> 50 4B 03 04 14 00 06 00 08 00 00 00 21 00 98 28 26 </UnknownEscherRecord> Other than that, very cool! Looking forward to working with you on GSoC! Regards, Yegor
Created attachment 28597 [details] Updated patch
Created attachment 28598 [details] Updated target file which represents structure of workbook.xls in xml format
Hi, I have updated patch according to your advices. Also I have updated target xml file. I look forward to your feedback. With best regards, Evgeniy Berlog
Very cool. It is nearly complete in my opinion, I only have two things to say: 1. Complete the use case : modify the source workbook with POI. Call sheet.getDrawingPatriarch() for each sheet, save the result and run BiffDrawingToXml. We expect that the output will be identical to the source but I guess it will be not. 2. RecordId, Version and Instance are record attributes (as opposed to the record body) and they ought to be XML attributes instead of child nodes: <EscherClientAnchorRecord recordId="0xF010" version"0x00" instance="0x00"> ... </EscherClientAnchorRecord> This way we clearly separate record attributes (i.e record header) from the record body. Other than that, very cool. Regards, Yegor (In reply to comment #7) > Hi, I have updated patch according to your advices. > Also I have updated target xml file. > > I look forward to your feedback. > With best regards, Evgeniy Berlog
Created attachment 28609 [details] Updated patch
Created attachment 28610 [details] The copy of workbook.xls made by POI Code for copying workbook.xls into target.xls: public static void main(String[] args) { String input = "workbook.xls"; String output = "target.xls"; try{ POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream(input)); HSSFWorkbook workbook = new HSSFWorkbook(fs); HSSFSheet sheet = workbook.getSheetAt(0); HSSFPatriarch patriarch = sheet.createDrawingPatriarch(); FileOutputStream fileOut = new FileOutputStream(output); workbook.write(fileOut); fileOut.close(); } catch(IOException e){ System.out.println("Can't open file "+input+" or "+output); System.exit(1); } }
Created attachment 28611 [details] Xml file which represents structure of target.xls
Created attachment 28612 [details] Xml file which represents structure of workbook.xls
Hi Yegor, Thanks much for your advices! I have added 'recordId', 'version' and 'instance' fields as record attributes. Also I copied workbook.xls into target.xls using the code above to compare their structures. You were right. The stuctures were different, what you can see while comparing target.xml and workbook.xml. Please, check if any more changes are required or not. My GSoC proposal includes resolving of this problem. With best regards, Evgeniy Berlog
Created attachment 28622 [details] Xml file which represents structure of target.xls
Evgeniy, this is nearly complete. The final touch will be a command-line interface. Something like this: BiffDrawingToXml -sheet <sheetId> <inputFileName> or BiffDrawingToXml -sheet "sheetName" <inputFileName> The name of the output xml file should be derived from the input file name, i.e. xmlFileName = inputFileName.replace(".xls", ".xml") Please complete these changes and I will check this utility in SVN. Yegor
Created attachment 28698 [details] Updated patch Hi, I have added command line interface to this utility. Here is utility description: Synopsis: BiffDrawingToXml [Options] <Source file> Options: -exclude-workbook (excludes records which belong to the workbook) -sheet-name "sheetName" (addes to xml file records of the sheet with specified name) -sheet-indexes <indexesList> (addes to xml file records of the sheets with specified indexes. Indexes must be separated by comma) If no sheets is specified, utility converts all of them into xml file Thanks for your advices. With best regards, Evgeniy Berlog