Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing

| Summary: | excel to html conversion | | |
|---|---|---|---|
| Product: | App Dev | Reporter: | pmakde <pdmakde> |
| Component: | api | Assignee: | stephan.wunderlich |
| Status: | CLOSED NOT_AN_OOO_ISSUE | QA Contact: | issues@api <issues> |
| Severity: | Trivial | | |
| Priority: | P4 | CC: | issues, svante.schubert |
| Version: | 3.3.0 or older (OOo) | | |
| Target Milestone: | --- | | |
| Hardware: | All | | |
| OS: | Linux, all | | |
| Issue Type: | DEFECT | Latest Confirmation in: | --- |
| Developer Difficulty: | --- | | |
| Attachments: | | | |
Description
pmakde
2004-08-24 06:15:43 UTC
Created attachment 17303 [details]
MS Excel spreadsheet with single sheet
Created attachment 17304 [details]
The html output for excel spreadsheet(appenda.xls)
SW->pmakde: First of all, this is not an API issue, since the same happens when you use the Save As entry in the File menu. Second, it is supposed to be that way: the headers are meant for navigating through the saved sheets. You wouldn't create a table of contents when you only have one chapter, would you? And last but not least, even if this were an issue, it would certainly not be a P1 ;-) I set it to invalid because I don't think it is an issue at all.

I think you didn't get the point. A sheet name in a spreadsheet (even if it has only a single sheet) carries information about the contents. I think we just can't ignore that part. Losing any data from the source document should be a big issue. I checked the same with Microsoft Excel (Save as Web Page), and the HTML output had the sheet name information.

SW->pmakde: Hmm, I just opened your document in Excel and saved it as a web page ... no sheet name is visible when I open it in a browser :-( I attached the saved file.

Created attachment 17313 [details]
html file as it is produced by excel
I can see the sheet name in your HTML output file. If you open it with Notepad or a similar editor, you can see that the HTML source contains the sheet name. The sheet name data is there in the HTML source; it is not lost when converting the XLS file to HTML. What I mean is that we can generate the sheet name as part of the output (as an <H1> tag or something like that).

In our OOo XML document, the spreadsheet name is saved as an attribute of the table:

    /office:document/office:body/table:table/@table:name

How do you propose to map it to XHTML tables? An XHTML table is described in 'xhtml1-strict.dtd' as:

    <!ELEMENT table (caption?, (col* | colgroup*), thead?, tfoot?, (tbody+ | tr+))>
    <!ATTLIST table
      %attrs;
      summary     %Text;    #IMPLIED
      width       %Length;  #IMPLIED
      border      %Pixels;  #IMPLIED
      frame       %TFrame;  #IMPLIED
      rules       %TRules;  #IMPLIED
      cellspacing %Length;  #IMPLIED
      cellpadding %Length;  #IMPLIED
    >

The name is NOT the summary, which is used for further information. Remember that the OpenOffice.org XML format in general contains more information than (X)HTML. Every transformation from OOo XML to (X)HTML is therefore a lossy transformation (filter). We might write the name out as a comment in the XHTML export filter, which has been usable since StarOffice7. Remember that the first rule of a filter is to keep the exported document as close to the original as possible, so that a round trip (reloading) is possible. Inserting additional elements such as headings (e.g. <h1>) would break this rule.

If we have the sheet name data in some XML file, we can just as well generate it in the HTML file. We can always add it in such a way that reverting back is not a problem. Anyway, we are already doing that for a document with multiple sheets. In my application I need the sheet name (it is important for me) as part of the HTML output. Thanks again.

Created attachment 17318 [details]
XHTML export filter (state before SO8 EA)
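For readers following the attribute path quoted above, here is a minimal Python sketch of how the sheet name could be read from a flat OOo XML document. The namespace URIs are assumed to be the ones used by OpenOffice.org 1.x files, and the sample document and the helper name `sheet_names` are hypothetical; none of this is part of the attached filter.

```python
import xml.etree.ElementTree as ET

# Assumption: OpenOffice.org 1.x namespace URIs. Adjust them if your
# files declare different ones.
NS = {
    "office": "http://openoffice.org/2000/office",
    "table": "http://openoffice.org/2000/table",
}


def sheet_names(flat_xml):
    """Return the table:name of every table under office:body."""
    root = ET.fromstring(flat_xml)  # root element is office:document
    name_attr = "{%s}name" % NS["table"]
    tables = root.findall("office:body/table:table", NS)
    return [t.get(name_attr) for t in tables]


# Hypothetical minimal document, used only to exercise the function.
SAMPLE = (
    '<office:document xmlns:office="http://openoffice.org/2000/office" '
    'xmlns:table="http://openoffice.org/2000/table">'
    '<office:body><table:table table:name="SIGNATUR_SHEET"/></office:body>'
    "</office:document>"
)

print(sheet_names(SAMPLE))
```

This only reads the name; how (or whether) to represent it in XHTML is the open question discussed in this issue.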
> If we have sheet name data in some xml file.. we can as well generate it in the html file.

We could, but there is no adequate HTML node to put it into. A spreadsheet is a simple table, and the name of a table is not shown in HTML.

> We can always put it in such way that reverting back should not be an problem.

It is a problem, as there is no element or attribute (in general, no node) for it in HTML. This is what I tried to explain to you earlier.

> Anyway we are doing that for a document with multiple sheets.

Yes, and I personally dislike it very much, as it alters the original document by adding information that did not exist before.

> In my application I need the sheet name ( its important for me) as part of the html output.

I can only offer you a compromise. Install a newer version of OpenOffice (e.g. 1.1.2). Under Tools -> XML Settings -> New, add a new XSLT transformation, which you will find attached as a ZIP:
- Name might be "SO8 EA XHTML export"
- Choose Application "Calc"
- Name of Filetype "XHTML1.0"
- File Extension "xhtml"

Under Label, just enter the path of the unzipped ooo2xhtml.xsl file as the XSLT for export. Via File -> Export you can now export as XHTML, where the table:name is written in a comment. But you are going to have problems if you try to import it again. Don't forget that it is an export, not a save option.

Last notes:
1) You need to install a JRE/JDK 1.4, because the optional feature uses its XSLT engine.
2) If you install the new OOo 1.1.2, don't forget to enable the optional "XSLT Sample Filter" during setup.

Just put myself on CC.

*** Issue 33282 has been marked as a duplicate of this issue. ***

Thanks, sus. I could get an XML file based on your filter settings. But I need to generate HTML output (with the sheet name). I am using Java OO SDK API calls to do this. Is there any solution to this? Thanks again for your help.

So far so good, you receive XML. But you should receive XHTML with the spreadsheet name included as a comment (I added it for you in the attached stylesheets). Anyway, we shouldn't discuss this in a RESOLVED INVALID issue (or in an issue at all). Please switch to a newsgroup such as openoffice.xml.dev if you have problems using the XSLT transformation via OOo 1.1.2, or to an API-related group if you do not know how to access this filter via the API. That way others can benefit from your questions as well. I will close this issue, as there won't be any further fix for the reasons mentioned above.

I don't see any comment included in the XHTML file. Am I missing something here? Thanks.

Yes, you missed something somehow. I installed the XSLT stylesheets from the ZIP on another computer with StarOffice7 pp3, following the procedure I described, and it works fine again. It creates valid strict XHTML 1.0 containing the following:

    <table border="0" cellspacing="0" cellpadding="0" class="ta1">
    <!--@table:name=SIGNATUR_SHEET-->
    <colgroup>

Sorry, I didn't phrase my question properly. I wanted to know how I can get an HTML file from the XHTML file (generated using the new filter).

The question is still ambiguous. Do you want to know how to reimport the XHTML into OOo? This would be done with an XHTML import filter, which is currently not provided. But in my opinion it would be the wrong approach anyway, as you should rather work top-down to the XHTML. That means you would only edit the Office document and export it to XHTML when it is ready. Otherwise you might lose information in HTML. If you still have questions, please post them on the appropriate mailing list: http://www.openoffice.org/mail_list.html
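
If the sheet name is needed as visible text, one option is to post-process the exported XHTML outside OpenOffice. The following Python sketch assumes the export contains a comment of the form `<!--@table:name=...-->` directly after the opening `<table>` tag, as in the sample quoted above. The function names are hypothetical, and the regular expressions are a simplification that is only expected to work on output shaped like that sample.

```python
import re
from html import escape

# Matches the comment written by the export filter, as quoted in the thread.
NAME_COMMENT = re.compile(r"<!--@table:name=(.*?)-->")

# Matches an opening <table ...> tag followed by that comment.
TABLE_WITH_NAME = re.compile(r"<table\b[^>]*>\s*<!--@table:name=(.*?)-->")


def sheet_names(xhtml):
    """Return all sheet names found in table:name comments."""
    return [m.group(1).strip() for m in NAME_COMMENT.finditer(xhtml)]


def add_sheet_headings(xhtml):
    """Insert an <h1> with the sheet name before each named table."""
    def repl(match):
        name = match.group(1).strip()
        # Keep the original table tag and comment untouched after the heading.
        return "<h1>%s</h1>\n%s" % (escape(name), match.group(0))
    return TABLE_WITH_NAME.sub(repl, xhtml)


# Fragment taken from the thread, used only as a demonstration input.
SAMPLE = (
    '<table border="0" cellspacing="0" cellpadding="0" class="ta1"> '
    "<!--@table:name=SIGNATUR_SHEET--> <colgroup>"
)

print(sheet_names(SAMPLE))
print(add_sheet_headings(SAMPLE))
```

Note that the added heading is extra content that was not in the original document, so the result is intended for display only and not for a round trip back into OOo.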