Bug 9888 - Parser corrupts document end tag
Summary: Parser corrupts document end tag
Status: NEW
Alias: None
Product: Crimson
Classification: Unclassified
Component: other
Version: 1.1
Hardware: Other All
Importance: P3 major
Target Milestone: ---
Assignee: Edwin Goei
URL:
Keywords:
Depends on: 34387
Blocks:
Reported: 2002-06-15 00:32 UTC by loney
Modified: 2004-11-16 19:05 UTC



Description loney 2002-06-15 00:32:32 UTC
Running Crimson 1.1.1 on the XML document shown below corrupts the input 
stream. The error is highly sensitive to the exact sequence of input 
characters. For example, the following line begins with a tab:

		   stopX="xxxxx"/>

If a space is substituted for the initial tab, the input parses successfully. 
Equally bizarre, if the initial comment is removed, the input also parses 
successfully.
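
Since the difference between a failing and a passing input can be a single leading tab, it may help to dump the file with its leading whitespace made visible before comparing copies. The following is a minimal sketch, not part of Crimson; the class name `LeadingWhitespace` and the `\t` / `_` notation are choices made for this illustration only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: shows leading tabs as "\t" and leading spaces as "_".
class LeadingWhitespace {

    /** Returns the line with its leading tabs and spaces made visible. */
    static String reveal(String line) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < line.length()
                && (line.charAt(i) == '\t' || line.charAt(i) == ' ')) {
            out.append(line.charAt(i) == '\t' ? "\\t" : "_");
            i++;
        }
        // Everything after the leading whitespace is copied unchanged.
        return out.append(line.substring(i)).toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LeadingWhitespace <input.xml>");
            return;
        }
        List<String> lines =
            Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        for (int n = 0; n < lines.size(); n++) {
            System.out.println((n + 1) + ": " + reveal(lines.get(n)));
        }
    }
}
```

Running it over a failing copy and a passing copy and diffing the two outputs should show exactly which lines differ in indentation.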

Letters whose value appears not to affect the parse have been masked with 'x'. 
They must be present for the parse to fail, but it appears that any letter can 
be substituted for 'x'.

The remaining letters are significant. E.g., replacing every occurrence of the 
string 'xxxxxxxs' in the attribute values with the string 'xxxxxxxx' results in 
a successful parse.
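
The three perturbations described above (space for the leading tab, comment removed, 'xxxxxxxs' changed to 'xxxxxxxx') can be applied mechanically. The sketch below is an illustration only: the class and method names are hypothetical, it uses whichever JAXP parser is first on the classpath, and it parses from memory, so it is an assumption that it exercises the same buffering path as the original failure.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical harness: applies each perturbation and reports whether it parses.
class VariantCheck {

    /** Replaces the first tab at the start of each line with a space. */
    static String tabToSpace(String xml) {
        return xml.replaceAll("(?m)^\\t", " ");
    }

    /** Removes XML comments (adequate for this test case, not a general stripper). */
    static String stripComments(String xml) {
        return xml.replaceAll("(?s)<!--.*?-->", "");
    }

    /** Replaces the significant trailing 's' in 'xxxxxxxs' with 'x'. */
    static String maskS(String xml) {
        return xml.replace("xxxxxxxs", "xxxxxxxx");
    }

    /** Returns true if the JAXP parser on the classpath accepts the text. */
    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java VariantCheck <input.xml>");
            return;
        }
        String xml = new String(Files.readAllBytes(Paths.get(args[0])),
                                StandardCharsets.UTF_8);
        System.out.println("original:        " + parses(xml));
        System.out.println("tab -> space:    " + parses(tabToSpace(xml)));
        System.out.println("comment removed: " + parses(stripComments(xml)));
        System.out.println("'s' masked:      " + parses(maskS(xml)));
    }
}
```

With Crimson selected as the JAXP implementation and a true copy of the test case, the expectation from this report is that only the first line prints `false`.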

Examining the stream content indicates that the parser corrupts the characters 
of the end tag. Execution fails with the exception:

org.xml.sax.SAXParseException: Expected "</application>" to terminate element starting on line 5.
	at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3108)
	at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3102)
	at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java:1500)
	at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:500)
	at org.apache.crimson.parser.Parser2.parse(Parser2.java:305)
	at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:433)
	at org.apache.crimson.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:185)

This error does not occur with Xerces 2. Needless to say, the problem took a 
long time to isolate. The only workaround is to make random changes to valid 
input until something parses, which is hardly a satisfying solution.
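
To compare parsers on the same bytes, the input can be run through several JAXP factories in turn. This is a minimal sketch under stated assumptions: the class `ParserCompare` is hypothetical, and the factory class names given in the usage note must be on the reader's classpath. `DocumentBuilderFactory.newInstance(String, ClassLoader)` requires JAXP 1.4 (Java 6) or later, and whether a Crimson release of this age loads cleanly on such a runtime is not verified here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.FactoryConfigurationError;
import org.w3c.dom.Document;

// Hypothetical harness: parses one file with the default and any named factories.
class ParserCompare {

    /** Parses the bytes with the given factory and summarizes the outcome. */
    static String tryParse(DocumentBuilderFactory factory, byte[] xml) {
        try {
            Document doc = factory.newDocumentBuilder()
                                  .parse(new ByteArrayInputStream(xml));
            return "OK: root <" + doc.getDocumentElement().getTagName() + ">";
        } catch (Exception e) {
            return "FAILED: " + e;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println(
                "usage: java ParserCompare <input.xml> [factoryClassName ...]");
            return;
        }
        byte[] xml = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("default: "
            + tryParse(DocumentBuilderFactory.newInstance(), xml));
        for (int i = 1; i < args.length; i++) {
            try {
                // A null class loader means the context class loader is used.
                DocumentBuilderFactory f =
                    DocumentBuilderFactory.newInstance(args[i], null);
                System.out.println(args[i] + ": " + tryParse(f, xml));
            } catch (FactoryConfigurationError e) {
                System.out.println(args[i] + ": not available (" + e.getMessage() + ")");
            }
        }
    }
}
```

For this report the factory names of interest would be `org.apache.crimson.jaxp.DocumentBuilderFactoryImpl` and `org.apache.xerces.jaxp.DocumentBuilderFactoryImpl`, passed as the second and third arguments.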

Input content
-------------
<?xml version="1.0" encoding="UTF-8"?>

<!-- Cf. "Administrator's Guide" for information about this file -->

<application>
  <xxxxxxx xxxxxxxy="xxxxxx" xxxx="log4j"
           xxxxxxxxxxxxxxxxx="xxx.xxxxxxxxxx.xxxxxxxxx.services.Logger"
           xxxxxxxx="xxx.xxxxxxxxxx.xxxxxxxxx.xxxx.Log4jLogger"/>
  <xxxxxxx xxxxxxxy="formatter" xxxx="standard"
           xxxxxxxxxxxxxxxxx="xxx.xxxxxxxxxx.xxxxxxxxx.xxxxxxxs.MessageFormatter"
           xxxxxxxxxxxxxxxxxxxx="xxx.xxxxxxxxxx.xxxxxxxxx.xxxx.MessageFormatterImplMBean"
           provider="xxx.xxxxxxxxxx.xxxxxxxxx.xxxx.MessageFormatterImpl"/>
  <xxxxxxx xxxxxxxy="xxxxxxx.administrator" xxxx="jmx"
           xxxxxxxxxxxxxxxxx="xxx.xxxxxxxxxx.xxxxxxxxx.xxxxxxxs.ServiceAdministrator"
           xxxxxxxxxxxxxxxxxxxx="xxx.xxxxxxxxxx.xxxxxxxxx.xxxxxxxs.jmx.JmxHtmlServiceAdministratorMBean"
           provider="xxx.xxxxxxxxxx.xxxxxxxxx.xxxxxxxs.jmx.JmxHtmlServiceAdministrator"/>
  <xxxxxxx xxxxxxxy="id.generator" xxxx="IETF"
           xxxxxxxxxxxxxxxxx="xxx.xxxxxxxxxx.xxxxxxxxx.xxxxxxxx.IdGenerator"
           provider="xxx.xxxxxxxxxx.xxxxxxxxx.xxxx.IETFIdGenerator"/>
  <xxxxxxx xxxxxxxy="cache" xxxx="simple"
           xxxxxxxxxxxxxxxxx="xxx.xxxxxxxxxx.xxxxxxxxx.xxxxxxxs.Cache"
           xxxxxxxxxxxxxxxxxxxx="xxx.xxxxxxxxxx.xxxxxxxxx.persistence.ObjectCacheMBean"
           provider="xxx.xxxxxxxxxx.xxxxxxxxx.xxxxxxxxxxx.ObjectCache"
	       stopX="xxxxx"/>
  <xxxxxxx xxxxxxxy="encryptor" xxxx="base64"
           xxxxxxxxxxxxxxxxx="xxx.xxxxxxxxxx.xxxxxxxxx.xxxxxxxs.Encryptor"
           provider="xxx.xxxxxxxxxx.xxxxxxxxx.xxxx.Base64Encryptor"/>
  <xxxxxxx xxxxxxxy="schema" xxxx="xx"
           xxxxxxxxxxxxxxxxx="xxx.xxxxxxxxxx.xxxxxxxxx.xxxxxxxs.Schema"
           xxxxxxxxxxxxxxxxxxxx="xxx.xxxxxxxxxx.xxxxxxxxx.domain.SchemaMBean"
           provider="xxx.xxxxxxxxxx.xx.domain.SxSchema"/>
  <xxxxxxx xxxxxxxy="xxxxxxxxxxx" xxxx="transient"
           xxxxxxxxxxxxxxxxx="xxx.xxxxxxxxxx.xxxxxxxxx.xxxxxxxs.PersistentStoreFactory"
           xxxxxxxxxxxxxxxxxxxx="test.xxxxxxxxx.xxxxxxxxxxx.TransientStoreFactoryMBean"
           provider="test.xxxxxxxxx.xxxxxxxxxxx.TransientStoreFactory"/>
  <xxxxxxx xxxxxxxy="xxxxxxxxxxx" xxxx="xml"
           xxxxxxxxxxxxxxxxx="xxx.xxxxxxxxxx.xxxxxxxxx.xxxxxxxs.PersistentStoreFactory"
           xxxxxxxxxxxxxxxxxxxx="xxx.xxxxxxxxxx.xxxxxxxxx.xxxxxxxxxxx.PersistentStoreFactoryMBean"
           provider="xxx.xxxxxxxxxx.xx.xxxxxxxxxxx.xxx.SxXmlStoreFactory"/>
  <xxxxxxx xxxxxxxy="xxxxxxxxxxx" xxxx="ejb"
           xxxxxxxxxxxxxxxxx="xxx.xxxxxxxxxx.xxxxxxxxx.xxxxxxxs.PersistentStoreFactory"
           xxxxxxxxxxxxxxxxxxxx="xxx.xxxxxxxxxx.xxxxxxxxx.xxxxxxxxxxx.PersistentStoreFactoryMBean"
           provider="xxx.xxxxxxxxxx.xx.enterprise.xxxxxxxxxxx.SxEjbStoreFactory"/>
</application>
Comment 1 loney 2002-06-15 00:37:41 UTC
Unfortunately, Bugzilla mangles the formatting of the file included in the bug 
report. Each attribute should be on its own line. Send me a note if you'd like 
a true copy of the input test case.