Issue 23541 - Index from concordance file to ignore optional hyphens in doc
Summary: Index from concordance file to ignore optional hyphens in doc
Status: CONFIRMED
Alias: None
Product: Writer
Classification: Application
Component: code
Version: OOo 1.1 RC5
Hardware: Other Windows XP
Importance: P3 Trivial
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords:
Depends on:
Blocks: 128492
Reported: 2003-12-14 07:58 UTC by othr
Modified: 2021-11-01 21:16 UTC

See Also:
Issue Type: ENHANCEMENT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
Test Case with Generated Index (27.06 KB, application/vnd.oasis.opendocument.text)
2008-05-16 07:35 UTC, pesala

Description othr 2003-12-14 07:58:10 UTC
Action: generating alphabetical indexes from a concordance file.

Enhancement: It would be nice if optional hyphens in words were ignored when
matching words against the concordance file.
Comment 1 h.ilter 2003-12-15 11:12:30 UTC
Reassigned to BH
Comment 2 pesala 2006-11-04 13:14:17 UTC
If one uses optional hyphens [-] in index entries, each variant is also treated
as a different entry. For example, the following would generate four entries in the index:

concordance
con[-]cordance
concor[-]dance
con[-]cor[-]dance

It would be better if all optional hyphens [-] could be ignored when comparing index
entries.

Perhaps ordinary hyphens should be ignored too, though this is debatable. 
Perhaps the following should be treated as a single entry? (anti-semitism)

anti-semitism
antisemitism
anti[-]semitism
antisemit[-]ism
anti-semit[-]ism
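The comparison requested above can be sketched as follows. This is only an illustration, not Writer's actual indexing code: it assumes the optional hyphen (written [-] in this issue) is stored as the Unicode soft hyphen U+00AD, and the helper name match_key is hypothetical.

```python
# Assumption: the optional hyphen shown as [-] in this issue is U+00AD.
SOFT_HYPHEN = "\u00ad"


def match_key(term: str) -> str:
    """Return the term with optional hyphens removed, for comparison only."""
    return term.replace(SOFT_HYPHEN, "")


# The four variants from the comment above, with [-] written as U+00AD.
variants = [
    "concordance",
    "con\u00adcordance",
    "concor\u00addance",
    "con\u00adcor\u00addance",
]

# All four variants collapse to a single comparison key.
keys = {match_key(v) for v in variants}
```

Ordinary hyphens are deliberately left alone here, since the comment treats ignoring them as debatable.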
Comment 3 pesala 2008-05-16 07:35:05 UTC
Created attachment 53695
Test Case with Generated Index
Comment 4 pesala 2008-05-16 07:41:40 UTC
This issue still affects release 2.4.

Optional hyphens should be ignored for indexing purposes, though it would be useful to 
include them in the index so that long words still break in the desired place in the index.

Non-breaking hyphens and regular hyphens should be treated as the same.

Optionally, hyphenated and unhyphenated terms that are otherwise identical could be 
combined under a single index entry, i.e. for indexing purposes anti-semitism = 
antisemitism = Anti-semitism. Whichever spelling was used first would take precedence 
as the index entry. 
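A minimal sketch of the grouping rules proposed in the comment above, under stated assumptions: optional hyphens are taken to be U+00AD and non-breaking hyphens U+2011, and the names index_key, build_index and merge_hyphenated are hypothetical, not part of Writer. The first spelling encountered is kept as the displayed entry, including any optional hyphens, so that long words can still break in the index.

```python
SOFT_HYPHEN = "\u00ad"          # assumption: optional hyphen
NON_BREAKING_HYPHEN = "\u2011"  # assumption: non-breaking hyphen


def index_key(term, merge_hyphenated=False):
    """Build the key used to decide whether two terms share an index entry."""
    key = term.replace(SOFT_HYPHEN, "")          # always ignore optional hyphens
    key = key.replace(NON_BREAKING_HYPHEN, "-")  # same as a regular hyphen
    if merge_hyphenated:
        # Optional behaviour: anti-semitism = antisemitism = Anti-semitism.
        key = key.replace("-", "").casefold()
    return key


def build_index(occurrences, merge_hyphenated=False):
    """Group (term, page) pairs, given in document order, into index entries.

    Returns a list of (displayed_term, pages); the displayed term is the
    first spelling seen for each key.
    """
    entries = {}  # dicts keep insertion order in Python 3.7+
    for term, page in occurrences:
        key = index_key(term, merge_hyphenated)
        if key not in entries:
            entries[key] = (term, [])
        entries[key][1].append(page)
    return list(entries.values())


occurrences = [
    ("anti-semitism", 3),
    ("antisemitism", 7),
    ("Anti\u2011semitism", 9),
    ("anti\u00adsemitism", 12),
]
merged = build_index(occurrences, merge_hyphenated=True)
separate = build_index(occurrences)
```

With merge_hyphenated=True all four occurrences fall under the first spelling, "anti-semitism"; with the default, only the optional-hyphen variant is folded into "antisemitism".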
Comment 5 bettina.haberer 2010-05-21 14:50:27 UTC
To make these issues easier to find by searching for "requirements", I have reassigned
the issues currently owned by me to the owner "requirements".