Issue 26733 - uncompressed dictionary of words appearing in document
Summary: uncompressed dictionary of words appearing in document
Status: CONFIRMED
Alias: None
Product: xml
Classification: Code
Component: code
Version: OOo 1.0.0
Hardware: All All
Importance: P4 Trivial
Target Milestone: AOO Later
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-03-19 13:06 UTC by papeye
Modified: 2013-02-07 22:41 UTC
CC List: 1 user

See Also:
Issue Type: FEATURE
Latest Confirmation in: ---
Developer Difficulty: ---


Description papeye 2004-03-19 13:06:58 UTC
"It would be nice" if documents, spreadsheets, etc. were saved with an
uncompressed list of the words appearing in the document.  Because the list is
uncompressed, it would be easier to find all documents containing specific
words.

Alternatively, the entire user-entered text of a document could be saved
uncompressed.  This would make it possible to search for phrases, but would
increase the size of the document.

(Check box options for one, the other, or neither?)
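
A minimal sketch of the idea, not an existing OpenOffice.org feature: the document body is written to a zip package with compression, while a plain word list is written as a stored (uncompressed) entry, so a byte-level search of the file can find the words. The entry name `words.txt`, the helper `save_with_word_list`, and the crude tag stripping are assumptions made for illustration.

```python
import io
import re
import zipfile


def save_with_word_list(content_xml: str) -> bytes:
    """Build a zip package whose word list is stored without compression."""
    # Crudely strip markup, then collect the distinct lowercase words.
    text = re.sub(r"<[^>]+>", " ", content_xml)
    words = sorted(set(re.findall(r"[^\W\d_]+", text.lower())))

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The document body stays compressed, as usual.
        zf.writestr("content.xml", content_xml,
                    compress_type=zipfile.ZIP_DEFLATED)
        # Hypothetical extra entry: the word list, stored as plain bytes.
        zf.writestr("words.txt", "\n".join(words),
                    compress_type=zipfile.ZIP_STORED)
    return buf.getvalue()


package = save_with_word_list(
    "<doc><p>Quarterly budget forecast</p><p>budget review</p></doc>"
)
# A grep-like search over the raw bytes of the file finds the word.
found = b"forecast" in package
```

The trade-off is the one named above: the stored entry adds roughly the size of the word list to every saved file.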
Comment 1 jogi 2004-03-19 14:07:51 UTC
I don't think this makes sense, because it is very easy to look into a
compressed archive (.jar), but maybe User Experience has a different view.
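
As a rough illustration of how little is needed to look inside such an archive, here is a sketch that assumes the document is a zip package with its text in an entry named `content.xml`. The helper `document_contains` and the sample data are hypothetical.

```python
import io
import re
import zipfile


def document_contains(path_or_file, word: str) -> bool:
    """Return True if `word` occurs in the text of content.xml in the archive."""
    with zipfile.ZipFile(path_or_file) as zf:
        xml = zf.read("content.xml").decode("utf-8")
    # Drop markup so that tag and attribute names are not matched.
    text = re.sub(r"<[^>]+>", " ", xml)
    pattern = rf"\b{re.escape(word)}\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


# Build a small in-memory archive to try the helper on.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("content.xml", "<doc><p>Quarterly budget forecast</p></doc>")

has_budget = document_contains(sample, "budget")
has_invoice = document_contains(sample, "invoice")
```

Passing a file path instead of the in-memory buffer works the same way, since `zipfile.ZipFile` accepts either.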
Comment 2 papeye 2004-03-19 14:41:28 UTC
On Windows 2000 the XML file is some kind of zip archive that WinZip can open.
It could probably be searched with zgrep.

If the XML is stored in a .jar on another platform, that is another good reason
to keep a portion of the XML containing words or phrases in uncompressed
format.  It would then be searchable with grep on every platform, without
"inside" knowledge of the compression scheme used for the XML format.
Comment 3 papeye 2007-05-24 00:16:52 UTC
Hi, this wish has been more or less fulfilled by the current generation of file
indexers (such as Strigi), because they unpack archive files to index the files
inside them.

Thanks!
Comment 4 bettina.haberer 2010-05-21 14:59:24 UTC
To make the issues easier to grep via "requirements", I have reassigned the
issues currently owned by me to the owner "requirements".