Issue 91750 - Japanese word cannot be searched in Search tab in Help window
Summary: Japanese word cannot be searched in Search tab in Help window
Status: ACCEPTED
Alias: None
Product: Internationalization
Classification: Code
Component: ui
Version: DEV300m23
Hardware: All All
Importance: P2 Trivial
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords: CJK
Depends on:
Blocks:
 
Reported: 2008-07-17 09:40 UTC by yuko
Modified: 2017-05-20 11:13 UTC
CC List: 6 users

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
search result in localisation29 build (64.19 KB, image/gif)
2008-07-17 09:42 UTC, yuko
search result in SS Beta2 build (67.54 KB, image/gif)
2008-07-17 09:43 UTC, yuko

Description yuko 2008-07-17 09:40:52 UTC
Tested build: DEV300m23 (Build:9326) [CWS:localisation29]
Tested platform: Solaris SPARC, WinXP

In the Search tab of the Help window, entering a Japanese word finds no help
topics. ASCII words can be searched, but no Japanese word can.
Please see the attached CWSl10n29_Help_ja.gif.

I also tried the same steps in the StarSuite Beta 2 build.
There, Japanese words can be searched, as shown in the attached
SS-Beta2_Help_ja.gif.
The Japanese word in this image is the translation of the English word "help".
Comment 1 yuko 2008-07-17 09:42:16 UTC
Created attachment 55169
search result in localisation29 build
Comment 2 yuko 2008-07-17 09:43:03 UTC
Created attachment 55170
search result in SS Beta2 build
Comment 3 ivo.hinkelmann 2008-07-17 11:27:47 UTC
ab, is this related to the new indexer?!?
Comment 4 kai.sommerfeld 2008-07-17 14:47:57 UTC
jsc: Can you please take over as ab is on vacation. Thanks.
Comment 5 jsc 2008-07-17 15:31:32 UTC
accepted
Comment 6 jsc 2008-07-18 09:10:48 UTC
The error must be in the Lucene part of the index search. I can debug our own
code, and everything looks fine so far (the same as for the English version).
But the Lucene search returns 0 hits; nothing is found. I am still not able to
debug the Lucene code; I am investigating ...
Maybe the index is wrong, but a simple rebuild of the index changed nothing.
Comment 7 jsc 2008-07-23 10:31:06 UTC
fixed on cws helpsearch

I integrated lucene-analyzers-2.3.jar and now use the CJK analyzer for Japanese.
For the moment this is for Japanese only, because Korean and Chinese also worked
with the standard analyzer. It may be worth checking whether language-specific
analyzers would give better search results in the future; this has to be
evaluated.

Changes in:

xmlhelp: introduce a new lang parameter in HelpSearch.java, HelpIndexer, and
resultsetforquery.cxx, and use the CJK analyzer for ja.

helpcontent2: use the new lang parameter for HelpIndexer

lucene: simplify the makefile and now build lucene-analyzers-XY.jar as well.

scp2: add lucene-analyzers-2.3.jar

config_office: add lucene-analyzers jar
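
The language-dependent analyzer choice described in this comment can be sketched roughly as follows. This is only an illustration in Python, not the code in HelpSearch.java or HelpIndexer: the function name `analyzer_for` and the string labels "cjk" and "standard" are hypothetical. The only thing taken from the comment is that "ja" gets the CJK analyzer while other languages, including Korean and Chinese, keep the standard one.

```python
# Hypothetical sketch: pick an analyzer label from the new lang parameter.
# Only the "ja" -> CJK analyzer mapping comes from the comment above;
# all names and labels here are illustrative assumptions.
CJK_ANALYZER_LANGS = {"ja"}


def analyzer_for(lang: str) -> str:
    """Return "cjk" for languages routed to the CJK analyzer, else "standard"."""
    return "cjk" if lang.lower() in CJK_ANALYZER_LANGS else "standard"


if __name__ == "__main__":
    for lang in ("ja", "ko", "zh-CN", "en-US"):
        print(lang, "->", analyzer_for(lang))
```

Keeping the set of CJK-analyzer languages in one place would make it easy to add further languages later if the evaluation mentioned above shows that they benefit.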
Comment 8 rene 2008-07-23 17:07:39 UTC
jsc: sorry, reopening.

xmlhelp2 missed the $(PATH_SEPERATOR)$(LUCENE_ANALYZERS_JAR) in the
SYSTEM_LUCENE case

config_office is not committed to the cws at all.
Comment 9 rene 2008-07-23 17:15:02 UTC
sorry, taking config_office back...

xmlhelp just fixed by me
Comment 10 kla 2008-07-25 11:24:37 UTC
I take it.
Comment 11 kla 2008-07-25 11:25:51 UTC
Seen ok in cws helpsearch. -> verified
Comment 12 kla 2008-08-29 08:58:37 UTC
Seen ok in current master -> closed
Comment 13 yuko 2008-09-12 11:00:18 UTC
Searching for Japanese words does not work as expected in OOo 3.0 RC1,
so I am reopening this and changing the target milestone to OOo 3.0.1.

Only the first two characters are used as the search word.

When entering the Japanese word グループ (English 'group'), 'No topics
found.' appears.
グルー is not found either. However, グル shows some topics.

Ex.
    English       Japanese (not found)    -> (found)

    Wizard        ウィザード              -> ウィ
    Japanese      日本語                  -> 日本
    Spellcheck    スペルチェック          -> スペ


グル is not a correct Japanese word, so topics for グループ need to be found.

Also, the Japanese word 言語設定 (English 'Language Settings') shows some
topics, but only 言語 is highlighted in the chosen page, so here too the first
two characters are used as the search word.

Comment 14 rafaella.braconi 2008-09-12 11:53:25 UTC
adding ihi and ufi in cc:
Comment 15 ivo.hinkelmann 2008-09-12 12:38:22 UTC
ab, please have a look. It seems the Japanese hc2 search function still has
problems ...
Comment 16 ab 2008-09-12 12:57:13 UTC
STARTED for now

Probably I will create a new issue for this as reusing an old issue 
usually leads to some confusion concerning cws assignment etc.
Comment 17 Uwe Fischer 2008-09-12 13:00:27 UTC
pls see also issue 38553 for problems with the English search. Some more info
also in issue 61820
Comment 18 yuko 2008-09-16 02:55:48 UTC
I think that this issue is different from issue 38554/61820.
Issue 38554/61820 is about searching for the word "custom" versus the phrase
"custom shape".
This ja issue, however, is about the search word itself: the word "custom"
("カスタム") is divided into "カス" and "タム", and only the first two ja
characters ("カス") are used as the search word.
Comment 19 ab 2008-11-10 10:07:11 UTC
Evaluation showed that this is the default behavior of Lucene's
CJKAnalyzer. This cannot be changed for 3.0.1. -> 3.1 for now.
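
For reference, Lucene's CJKAnalyzer is documented to split runs of CJK characters into overlapping two-character tokens rather than dictionary words. The sketch below imitates only that bigram splitting in plain Python. `cjk_bigrams` is a hypothetical helper, not Lucene code, and it does not model how the help search turns the resulting tokens into a query.

```python
from typing import List


def cjk_bigrams(text: str) -> List[str]:
    """Split a run of CJK characters into overlapping two-character tokens.

    Illustrative only: mimics the documented bigram behavior of a CJK
    tokenizer, without any handling of mixed ASCII/CJK input.
    """
    if len(text) < 2:
        # A single character (or empty input) cannot form a bigram.
        return [text] if text else []
    return [text[i:i + 2] for i in range(len(text) - 1)]


if __name__ == "__main__":
    print(cjk_bigrams("グループ"))  # ['グル', 'ルー', 'ープ']
    print(cjk_bigrams("日本語"))    # ['日本', '本語']
```

With this kind of tokenization the index contains two-character terms such as グル, which is consistent with the report that a two-character query finds topics while the full word does not.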
Comment 20 ab 2009-01-09 08:18:24 UTC
Not enough time left to check / use another analyzer -> 3.2
Comment 21 ab 2009-09-07 16:23:28 UTC
-> OOo 3.3
Comment 22 amy2008 2010-04-22 08:29:33 UTC
The same problem in OOo_zh 
Li Meiying
Comment 23 ab 2010-06-04 12:13:08 UTC
Not enough time left for 3.3 -> 3.4
Comment 24 ab 2010-12-06 09:27:26 UTC
-> OOo 3.x
Comment 25 Marcus 2017-05-20 11:13:40 UTC
Reset assignee to the default "issues@openoffice.apache.org".