Bug 39277

Summary: [PATCH] [WebDAV] Create pipelines, stylesheets for outputting .wordml files to edit with Word 2003
Product: Lenya
Reporter: Jonathan Addison <jonathan.addison>
Component: Miscellaneous
Assignee: Lenya Developers <dev>
Status: NEW
Severity: normal    
Priority: P2    
Version: Trunk   
Target Milestone: 2.0.1   
Hardware: Other   
OS: other   
Attachments: add ability to edit wordml files with webdav
added change to odt module to previous patch

Description Jonathan Addison 2006-04-11 22:25:01 UTC
This is a follow-up to bug 39019 (removing read-only restriction in Word 2003).
The goal is to output a .wordml file using a server-side xhtml2wordml
transformation, edit it with Word 2003, save it, and transform it back using a
server-side wordml2xhtml transformation.
Comment 1 Jonathan Addison 2006-04-11 22:46:45 UTC
Created attachment 18075
add ability to edit wordml files with webdav

This patch creates a second output file for each document with a .wordml
extension, so the tutorial page shows as both tutorial_en.html and
tutorial_en.wordml in the webdav folder.  I used wordml as the extension to
make it explicit but it could be changed to xml or even doc.  The default
webdav GET matcher was also changed from davget.xml to davget.html to match its
output format and simplify the pipelines.

Two stylesheets were added for transforming to and from wordml on the server:
xhtml2wordml.xsl and wordml2xhtml.xsl.  Originally these transformations were
being done within Word, but Word has a bug when fetching stylesheets from a
webdav or http address on saving, so we switched to doing the transformations
server-side.  A benefit is that the Professional version of Word 2003 is no
longer needed because the advanced xml features aren't needed when the file is
wordml.  There is also a free plugin available that lets you use the wordml
format with older versions of Word, but I have not tested it yet.
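
For readers who want to see the shape of the xhtml-to-wordml direction without
opening the patch, here is a minimal Python sketch.  It is not the
xhtml2wordml.xsl from the attachment.  It assumes the Word 2003 WordprocessingML
namespace and the style ids Heading1 to Heading6, the function name
xhtml_to_wordml is hypothetical, and inline markup is flattened to plain text
to keep it short.

```python
import xml.etree.ElementTree as ET

# Namespaces assumed for this sketch (Word 2003 WordprocessingML and XHTML 1.0).
W = "http://schemas.microsoft.com/office/word/2003/wordml"
XHTML = "http://www.w3.org/1999/xhtml"
ET.register_namespace("w", W)  # serialize with the conventional w: prefix


def w(name):
    """Return a namespace-qualified WordML name for ElementTree."""
    return f"{{{W}}}{name}"


# Assumed mapping: XHTML heading element -> Word paragraph style id.
HEADINGS = {f"{{{XHTML}}}h{n}": f"Heading{n}" for n in range(1, 7)}
PARAGRAPH = f"{{{XHTML}}}p"


def xhtml_to_wordml(source):
    """Hypothetical helper: turn p and h1-h6 elements into w:p paragraphs."""
    html = ET.fromstring(source)
    doc = ET.Element(w("wordDocument"))
    body = ET.SubElement(doc, w("body"))
    for el in html.iter():
        if el.tag != PARAGRAPH and el.tag not in HEADINGS:
            continue  # everything outside the supported subset is skipped
        p = ET.SubElement(body, w("p"))
        if el.tag in HEADINGS:
            ppr = ET.SubElement(p, w("pPr"))
            ET.SubElement(ppr, w("pStyle"), {w("val"): HEADINGS[el.tag]})
        run = ET.SubElement(p, w("r"))
        text = ET.SubElement(run, w("t"))
        # Simplification: em, strong and a are flattened to their text.
        text.text = "".join(el.itertext())
    return ET.tostring(doc, encoding="unicode")
```

A real stylesheet would also emit runs with bold and italic properties and
hyperlinks instead of flattening them.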

The stylesheets only implement a basic subset of xhtml at the moment (p, h1-h6,
em, strong, a).  They also restrict what styles are available while editing in
Word, which is useful for enforcing markup guidelines.  The wordml2xhtml.xsl
also has support for tables and lists; a big thanks to Josias for adding this.
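
The "keep only the elements you want" idea in the reverse direction can be
sketched in Python as below.  This is an illustration under assumptions, not
the attached wordml2xhtml.xsl: the namespace URI and the HeadingN style ids are
what Word 2003 is assumed to write, the names STYLE_TO_TAG and wordml_to_xhtml
are hypothetical, and links, tables and lists are left out.

```python
import xml.etree.ElementTree as ET

# Word 2003 WordprocessingML namespace (assumed; check against real files).
W = "http://schemas.microsoft.com/office/word/2003/wordml"

# Hypothetical whitelist: Word paragraph style id -> XHTML element name.
# Any paragraph style not listed here falls back to a plain <p>.
STYLE_TO_TAG = {f"Heading{n}": f"h{n}" for n in range(1, 7)}


def w(name):
    """Return a namespace-qualified WordML name for ElementTree."""
    return f"{{{W}}}{name}"


def append_text(parent, text):
    """Append text after the last child of parent, or as its own text."""
    if len(parent):
        parent[-1].tail = (parent[-1].tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def convert_paragraph(p):
    """Map one w:p to h1-h6 or p, and bold/italic runs to strong/em."""
    style = p.find(f"{w('pPr')}/{w('pStyle')}")
    style_id = style.get(w("val")) if style is not None else None
    out = ET.Element(STYLE_TO_TAG.get(style_id, "p"))
    for run in p.findall(w("r")):
        text = "".join(t.text or "" for t in run.findall(w("t")))
        props = run.find(w("rPr"))
        # Simplification: any w:b or w:i counts, even with w:val="off".
        bold = props is not None and props.find(w("b")) is not None
        italic = props is not None and props.find(w("i")) is not None
        if bold or italic:
            target = out
            if bold:
                target = ET.SubElement(target, "strong")
            if italic:
                target = ET.SubElement(target, "em")
            target.text = text
        else:
            append_text(out, text)
    return out


def wordml_to_xhtml(source):
    """Hypothetical helper: convert every w:p in the document to XHTML."""
    root = ET.fromstring(source)
    body = ET.Element("body")
    for p in root.iter(w("p")):
        body.append(convert_paragraph(p))
    return ET.tostring(body, encoding="unicode")
```

Because every other run property (fonts, colours, sizes) is simply never
copied, the output cannot contain anything outside the whitelist.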
Comment 2 Jonathan Addison 2006-04-18 22:05:58 UTC
Created attachment 18131
added change to odt module to previous patch
Comment 3 Andreas Hartmann 2007-04-30 09:05:55 UTC
Could someone please review + apply this? Or should we schedule it for 1.4.1?
Comment 4 J 2007-04-30 09:21:03 UTC
i don't use word.
the patch looks very clean, but i don't like how everything gets stuffed into
the xhtml module. using the format mechanism for wordml certainly makes sense,
but i'd be more comfortable if we had a separate xhtml2wordml module... 
but i wonder: how much demand is there for such a feature? the word-to-web stuff
i've seen in the past has been the antithesis of standards-compliance and
semantic structure - does it really make sense to encourage people to publish
from word?
Comment 5 Jonathan Addison 2007-04-30 17:49:38 UTC
Considering that this is over a year old now, I'm sure the patch would need to
be resubmitted with corrections to the sitemaps.

We could also create a word module similar to the opendocument module to hold
the stylesheets.

And Joern, if you are curious about the semantic structure you may want to read
the description above and have a look at the stylesheets.  Most Word-to-web
stuff doesn't use WordML directly but rather has other transformations performed
before saving, which is where a lot of the non-compliance comes in.  These
stylesheets work directly on the wordml and keep only the elements you want. 
Right now there is only a minimal set of xhtml elements supported so it is
standards compliant.

I would save this for 1.4.1.