Bug 46706 - Hexadecimal and decimal representations of characters are converted to the utf-8 character by the oneform editor.
Summary: Hexadecimal and decimal representations of characters are converted to the utf...
Status: NEW
Alias: None
Product: Lenya
Classification: Unclassified
Component: Form Editor
Version: 2.0.2
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Lenya Developers
Reported: 2009-02-12 07:04 UTC by Lambert Utz
Modified: 2009-02-16 13:12 UTC (History)



Description Lambert Utz 2009-02-12 07:04:05 UTC
Hexadecimal and decimal representations of characters are converted to the UTF-8 character by the oneform editor.
A non-breaking space reference (e.g. &#160;) is transformed to a blank. If there are several of them, they are converted to blanks and shown as one blank. So it's impossible to have several leading blanks as in Lenya 1.2.x.
Comment 1 Andreas Hartmann 2009-02-12 07:46:16 UTC
The change happens when the document is written. If the document already contains a non-breaking space reference (e.g. &#160;), it is displayed correctly in the editor.
Comment 2 Andreas Hartmann 2009-02-12 07:49:53 UTC
The content parameter still contains the non-breaking space entity (e.g. &#160;) when the usecase handler is called.
Comment 3 Andreas Hartmann 2009-02-12 07:51:44 UTC
I guess the problem is the DOM conversion. Unfortunately it is necessary for link rewriting.
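
To illustrate the suspected cause, here is a minimal sketch using only the standard JAXP API (not Lenya code; the class and method names are made up for this example). It parses a fragment containing numeric character references into a DOM and serializes it again as UTF-8. The parser resolves each reference to the character itself, so the DOM no longer knows how the character was originally written.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Lenya.
public class DomRoundTrip {

    private static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Returns the text content of the root element after DOM parsing. */
    public static String parsedText(String xml) throws Exception {
        return parse(xml).getDocumentElement().getTextContent();
    }

    /** Parses the XML into a DOM tree and serializes it again as UTF-8 text. */
    public static String roundTrip(String xml) throws Exception {
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(parse(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String input = "<p>a&#160;&#160;b</p>";
        // The DOM holds two U+00A0 characters, not the references.
        System.out.println(parsedText(input).equals("a\u00A0\u00A0b"));
        // The serialized form no longer contains the original reference text.
        System.out.println(roundTrip(input).contains("&#160;"));
    }
}
```

If this is indeed what happens in the editor, the characters are still non-breaking spaces after writing, but they are indistinguishable from ordinary blanks when viewed as source text.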
Comment 4 Andreas Hartmann 2009-02-12 08:04:21 UTC
Using SAX instead would be virtually impossible because we use XPath to specify the link attributes, which doesn't sound like a good idea now :(

Maybe we should allow a different, more SAX-compatible way to specify the link attributes, similar to the configuration of the LinkRewritingTransformer classes:

<link-attribute namespace="http://www.w3.org/1999/xhtml" element="a" attribute="href"/>
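
A rough sketch of how such a declaration could drive a SAX filter without XPath. All names here (SaxLinkRewriter, LinkAttribute, the rewriter function) are hypothetical and only illustrate the idea; this is not the existing Lenya API. Note that a SAX parser also resolves character references, so this only addresses the XPath dependency, not necessarily the entity issue itself.

```java
import java.util.List;
import java.util.function.UnaryOperator;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical SAX filter that rewrites declared link attributes.
public class SaxLinkRewriter extends XMLFilterImpl {

    /** Hypothetical mirror of one <link-attribute/> declaration. */
    public static final class LinkAttribute {
        final String namespace;
        final String element;
        final String attribute;

        public LinkAttribute(String namespace, String element, String attribute) {
            this.namespace = namespace;
            this.element = element;
            this.attribute = attribute;
        }

        boolean matchesElement(String uri, String localName) {
            return namespace.equals(uri) && element.equals(localName);
        }
    }

    private final List<LinkAttribute> declarations;
    private final UnaryOperator<String> rewriter;

    public SaxLinkRewriter(List<LinkAttribute> declarations, UnaryOperator<String> rewriter) {
        this.declarations = declarations;
        this.rewriter = rewriter;
    }

    /** Returns a copy of the attributes with every declared link attribute rewritten. */
    public Attributes rewrite(String uri, String localName, Attributes attrs) {
        AttributesImpl copy = new AttributesImpl(attrs);
        for (LinkAttribute declaration : declarations) {
            if (declaration.matchesElement(uri, localName)) {
                // Unprefixed attributes such as href are in no namespace ("").
                int index = copy.getIndex("", declaration.attribute);
                if (index >= 0) {
                    copy.setValue(index, rewriter.apply(copy.getValue(index)));
                }
            }
        }
        return copy;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs)
            throws SAXException {
        // Forward the event downstream with the rewritten attributes.
        super.startElement(uri, localName, qName, rewrite(uri, localName, attrs));
    }
}
```

The matching needs only the namespace URI, local name and attribute name that SAX already delivers with each startElement event, which is why a declarative form like the one above would fit a streaming pipeline.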
Comment 5 Andreas Hartmann 2009-02-12 08:07:37 UTC
(In reply to comment #4)

> Maybe we should allow a different, more SAX-compatible way to specify the link
> attributes, similar to the configuration of the LinkRewritingTransformer
> classes:

BTW, in Lenya 2.2 I introduced two classes for this purpose:

* LinkRewriteAttributes
* LinkRewriteAttribute

A LinkRewriteAttributes object (singleton) is basically the declaration of the link attributes of a document type. I think this can be generalized.
Comment 6 J 2009-02-16 13:12:10 UTC
what part of the rewriting behaviour is the problem? imho, an entity should be equivalent to the corresponding utf-8 character, and any transformation between those two is a valid identity operation in the semantic space (if not the lexical one).
i wonder what lambert is trying to accomplish here?

would it help to optionally convert everything that is not 7bit-ascii into numerical entities?
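
A minimal sketch of that optional conversion, assuming it would be applied to the serialized text just before it is stored. The class and method names are made up for illustration and are not part of Lenya. Every code point outside 7-bit ASCII is replaced by a decimal character reference; ASCII, including markup characters, is passed through unchanged.

```java
// Hypothetical helper, not part of Lenya.
public class NumericEntities {

    /** Replaces every non-ASCII code point with a decimal character reference. */
    public static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        // Iterate over code points so surrogate pairs become a single reference.
        text.codePoints().forEach(cp -> {
            if (cp < 0x80) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeNonAscii("<p>a\u00A0\u00A0b</p>"));
    }
}
```

This would make several leading non-breaking spaces visible again in the stored source, at the cost of also escaping every other non-ASCII character such as umlauts.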