Issue 46016 - Unopkg cannot parse an XML manifest with leading whitespace
Summary: Unopkg cannot parse an XML manifest with leading whitespace
Status: CONFIRMED
Alias: None
Product: utilities
Classification: Unclassified
Component: code
Version: OOo 2.0 Beta
Hardware: PC Windows XP
Importance: P4 Trivial
Target Milestone: OOo Later
Assignee: AOO issues mailing list
QA Contact: issues@tools
URL:
Keywords: oooqa
Depends on:
Blocks:
 
Reported: 2005-03-25 17:09 UTC by raymondb
Modified: 2017-05-20 11:33 UTC
CC List: 3 users

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments

Description raymondb 2005-03-25 17:09:29 UTC
Discovered that a manifest.xml file included in a component won't be parsed by
unopkg if it has a leading line of whitespace before the XML declaration. Unopkg
will install the bundle but will not deploy anything listed in the manifest.xml
file.

The workaround is to remove the whitespace, but it will catch people off guard
who think their component must have done something wrong.
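
A minimal sketch of that workaround, assuming the manifest uses UTF-8 or another
ASCII-compatible encoding. Both helper names are hypothetical and are not part
of unopkg:

```python
# Whitespace characters as defined by the XML "S" production.
XML_WHITESPACE = b" \t\r\n"


def has_misplaced_xml_declaration(data: bytes) -> bool:
    """Return True if whitespace precedes an XML declaration in `data`."""
    stripped = data.lstrip(XML_WHITESPACE)
    if stripped == data:
        return False  # no leading whitespace at all
    # Require whitespace after "<?xml" so that processing instructions
    # such as "<?xml-stylesheet" are not mistaken for the declaration.
    return (
        stripped.startswith(b"<?xml")
        and len(stripped) > 5
        and stripped[5:6] in XML_WHITESPACE
    )


def strip_leading_whitespace(data: bytes) -> bytes:
    """Return `data` without leading XML whitespace (the workaround)."""
    return data.lstrip(XML_WHITESPACE)
```

Running such a check over manifest.xml before packaging the component would
flag the problem before unopkg silently skips the manifest entries.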
Comment 1 hennes.rohling 2005-05-06 12:50:49 UTC
.
Comment 2 jsc 2005-05-09 08:24:42 UTC
jsc -> dbo: it's yours
Comment 3 Daniel Boelzle [:dbo] 2005-05-09 09:37:17 UTC
As I am using the com.sun.star.packages.manifest.ManifestReader, I think it is a
problem in that area.
@MAV: you are the best one I know for this, please take over. Martin Gallwey
does not contribute/fix anything anymore, right?
Comment 4 mikhail.voytenko 2005-05-09 11:51:58 UTC
MAV->DVO: The ManifestReader implementation uses a SAX parser to parse the
stream. So it looks like the parser is confused by the line of whitespace.
Could you please take a look?
Comment 5 joerg.barfurth 2005-05-10 08:28:02 UTC
Hi. I just noticed this issue accidentally.

If an XML document contains an XML declaration, that declaration MUST be at the
very beginning of the document. In particular, whitespace preceding the xml
declaration is not allowed in a well-formed XML document. The only exception is
that a single Unicode byte order mark is permitted.

I suggest closing this issue as INVALID.
Comment 6 raymondb 2005-05-10 21:54:54 UTC
I looked through the specification (current, third edition). It discusses how
validating and nonvalidating processors differ when parsing documents, and how
to treat whitespace, but I don't see anywhere in the specification where it
calls this out as a problem. I might have missed it, but my sense is that it
should be handled in some way. I know other parsers I've used don't like it and
will throw an error, but I don't think it should silently fail.
Comment 7 openoffice 2005-05-11 10:53:28 UTC
It's in chapter 2.1 in combination with chapter 2.8 of the XML specification
(http://www.w3.org/TR/REC-xml/). 

Chapter 2 defines the physical layout of a document. You will find the following
productions in the respective chapters:
  [1]  document ::= prolog element Misc*
  [22] prolog ::= XMLDecl? Misc* (doctypedecl Misc*)?
  [23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'

Note that allowed whitespace is explicit in those productions (the S rule), and
there is none allowed in front of the '<?xml' text. Hence, the parser is correct.
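
A minimal sketch of what these productions mean in practice, assuming Python's
bundled expat parser as a stand-in for the SAX parser behind ManifestReader.
The helper name is_well_formed and the sample documents are hypothetical:

```python
import xml.parsers.expat


def is_well_formed(data: bytes) -> bool:
    """Return True if expat accepts `data` as a well-formed XML document."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
        return True
    except xml.parsers.expat.ExpatError:
        return False


# XML declaration at the very beginning: matches production [23].
WITH_DECL = b'<?xml version="1.0" encoding="UTF-8"?>\n<manifest/>'

# A leading newline before the declaration: no production allows this.
LEADING_WS = b"\n" + WITH_DECL

# No XML declaration at all: leading whitespace is then covered by Misc*.
NO_DECL = b"\n<manifest/>"
```

Under these assumptions, is_well_formed returns True for WITH_DECL and NO_DECL
and False for LEADING_WS, which matches the reading of the grammar above.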

dvo->raymondb: I'm tempted to close as INVALID as well. Or, if the problem is
actually about the silent failure, send it back up the chain to let
higher-level components deal with that (if possible).
Comment 8 raymondb 2005-05-14 05:06:26 UTC
raymondb -> dvo: Given that I have seen other processors deal with it without
failing, I assumed this was a case that should be handled. Your comments and
further checking have shown that stricter processors will fail on this;
however, they will throw an error. I think that is important for usability,
because I imagine most people who stumble over this will take some time before
they realize what the problem is.
Comment 9 andreas.martens 2005-06-28 16:29:05 UTC
Redistributing dvo's issues.
Comment 10 lohmaier 2005-08-16 23:15:58 UTC
The issue is confirmed (OOo doesn't handle a file beginning with whitespace),
so this issue should no longer stay unconfirmed. Everything else (whether the
issue is closed as invalid/wontfix, or OOo will issue a warning or something)
is another story.
Comment 11 michael.brauer 2005-08-23 14:09:19 UTC
MIB->DBO,MAV: If the manifest.xml is invalid, then an error should be displayed
to the user.
Comment 12 Daniel Boelzle [:dbo] 2005-08-23 14:49:13 UTC
@MAV: IMO the manifest reader ought to reflect that error somehow (e.g. by
throwing an exception upon readManifestSequence()). As DVO figured out, the
parser is right not to trigger anything. But because even a document-started
event is missing, the manifest reader should signal that as an exception. The
package manager cannot distinguish whether the file is corrupt or just does not
contain any entries.
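
A minimal sketch of that idea, using Python's xml.sax rather than the UNO
ManifestReader. ManifestError and read_manifest_entries are hypothetical
names, and the manifest:file-entry / manifest:full-path names are assumed for
illustration:

```python
import xml.sax


class ManifestError(Exception):
    """Hypothetical error type: the manifest could not be parsed at all."""


class _EntryCollector(xml.sax.ContentHandler):
    """Collects the path attribute of each file entry (names are assumed)."""

    def __init__(self):
        super().__init__()
        self.entries = []

    def startElement(self, name, attrs):
        if name == "manifest:file-entry":
            self.entries.append(attrs.get("manifest:full-path"))


def read_manifest_entries(data: bytes) -> list:
    """Return the listed entries, or raise ManifestError on malformed XML."""
    collector = _EntryCollector()
    try:
        xml.sax.parseString(data, collector)
    except xml.sax.SAXParseException as exc:
        # Surface the parse failure instead of returning an empty list.
        raise ManifestError(
            f"manifest.xml is not well-formed: {exc.getMessage()}"
        ) from exc
    return collector.entries
```

With this shape, a caller gets an empty list only for a well-formed manifest
without entries, while a manifest that starts with whitespace before the XML
declaration raises ManifestError and can be reported to the user.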

Comment 13 mikhail.voytenko 2005-08-29 11:51:32 UTC
Setting the target to the issue.
Comment 14 Marcus 2017-05-20 11:33:46 UTC
Reset assignee to the default "issues@openoffice.apache.org".