Issue 106032 - linguistic: make human-readable user-dicts the default format ?
Summary: linguistic: make human-readable user-dicts the default format ?
Alias: None
Product: Writer
Classification: Application
Component: code
Version: DEV300m61
Hardware: All Linux, all
Importance: P3 Trivial
Target Milestone: ---
Assignee: stefan.baltzer
QA Contact: issues@sw
Depends on: 60698
Reported: 2009-10-19 12:30 UTC by caolanm
Modified: 2013-08-07 14:44 UTC

See Also:
Issue Type: PATCH
Latest Confirmation in: ---
Developer Difficulty: ---

trivial patch (626 bytes, patch)
2009-10-19 12:31 UTC, caolanm

Description caolanm 2009-10-19 12:30:56 UTC
The default binary format is editable within OOo but not human readable. This
makes it hard or impossible to work with the format outside of OOo. For
instance, it would be very nice to be able to extend a minority-language
spelling dictionary by simply getting users to attach and send their personal
wordlists, which could then be processed easily to extract words and add them
to the upstream projects.
Comment 1 caolanm 2009-10-19 12:31:15 UTC
Created attachment 65440
trivial patch
Comment 2 caolanm 2009-10-19 12:32:49 UTC
The new format (issue 60698) was brought in at 2.0.3; how about making it the
default for 3.2? By this stage, the problem of loading a 3.2 dict in an old
version would only arise in the by now very ancient versions < 2.0.3.
Comment 3 thomas.lange 2009-10-19 12:55:11 UTC
tl->cmc: I'm a bit worried about people actually changing the content of the
human-readable version, because the code that was provided as a patch is quite
sensitive to white space and line ends. Changing them by accident will not
work with the provided code.
It was ok as long as this was only used by the patch provider or some automated
tooling. But if we are to allow user-editable dictionary content, the code
should be more robust with respect to changes in white space and line ends.
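
To illustrate the robustness concern above, here is a minimal Python sketch of
a reader that tolerates stray white space and mixed line ends. It is not the
code from the attached patch or from OOo. The header layout it expects (an
`OOoUserDict1` magic line, `key: value` header lines, a `---` separator, then
one word per line) and the name `parse_user_dict` are assumptions made for
illustration only.

```python
MAGIC = "OOoUserDict1"  # assumed magic first line of the text format
SEPARATOR = "---"       # assumed separator between header and word list


def parse_user_dict(text):
    """Return (header, words) from a text user dictionary.

    Hypothetical helper: tolerates leading/trailing white space on every
    line, blank lines, and \n, \r\n or \r line ends.
    """
    # splitlines() splits on \n, \r\n and \r alike; strip() drops stray
    # spaces and tabs around each line. A leading BOM is ignored.
    lines = [line.strip() for line in text.lstrip("\ufeff").splitlines()]
    lines = [line for line in lines if line]  # skip blank lines

    if not lines or lines[0] != MAGIC:
        raise ValueError("missing magic line")

    header = {}
    words = []
    in_body = False
    for line in lines[1:]:
        if in_body:
            words.append(line)
        elif line == SEPARATOR:
            in_body = True
        else:
            # Accept "key:value", "key: value" and "key : value".
            key, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"malformed header line: {line!r}")
            header[key.strip().lower()] = value.strip()

    if not in_body:
        raise ValueError("missing header separator")
    return header, words
```

With a reader along these lines, a file saved with CRLF line ends or with
trailing spaces after a header value would still load, while a file missing
the magic line or the separator would still be rejected.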

The other reason is that, in the longer run, it would still be much nicer if
the dictionary format were the same as in hunspell dictionaries.
Problems here:
- hunspell does not allow hyphenation points to be provided in spell check
dictionaries, thus we would need an extra file for those.
- and hunspell has no support for exception dictionaries

For all of the above we are still stuck with the binary format.

tl->nemeth: any comments from your side?
Comment 4 thomas.lange 2009-10-19 13:32:11 UTC
SBA said this is not something to push into 3.2 at this late stage. He prefers
to have something like this changed among the first 3.3 changes, and preferably
with somewhat more flexible code for reading that format.
Thus changing target to OOo 3.3.
Comment 5 thomas.lange 2010-04-23 07:27:53 UTC
Fixed in cws tl80. No improvement regarding white space handling etc. was
implemented, since in the end the argument won out that such dictionaries are
already in use and thus we don't want to introduce new variability to the
format. Thus the user has to be careful not to modify the header lines by
accident.
Comment 6 thomas.lange 2010-05-14 06:41:19 UTC
tl->sba: please verify. Thanks!
Comment 7 stefan.baltzer 2010-06-11 14:14:39 UTC
Verified in CWS tl80.
Comment 8 caolanm 2010-06-17 20:46:44 UTC
closing, integrated m83