Bug 160276 - JavaHelp Index Is Built Improperly
Status: VERIFIED FIXED
Alias: None
Product: apisupport
Classification: Unclassified
Component: Harness
Version: 6.x
Hardware: All All
Importance: P3 blocker
Assignee: rmichalsky
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-03-13 18:22 UTC by tomwheeler
Modified: 2009-03-26 02:03 UTC (History)
0 users

See Also:
Issue Type: DEFECT
Exception Reporter:


Attachments
Simple suite to illustrate the described problems (14.61 KB, application/x-compressed)
2009-03-13 18:23 UTC, tomwheeler

Description tomwheeler 2009-03-13 18:22:02 UTC
We've discovered two problems with the JavaHelp index in a platform application.  I've created a minimal application
that reproduces them; I will upload it momentarily.  I believe both problems stem from the same root cause, so I am
filing them as one issue.

The first problem is that all documents with a .html extension seem to get indexed, regardless of whether or not they
are referenced in the map file.  They do not show up in the table of contents or index, but do show up in searches.

The second problem is that it seems indexing is not performed for any file which has an extension other than .html, and
in particular, not for files with a .htm extension (commonly produced by several Windows HTML editors).  This is true
even when the file is properly referenced in the map, toc and idx help system files.  Although such pages are shown in
the table of contents and/or index as expected, they do not show up in searches.
Comment 1 tomwheeler 2009-03-13 18:23:01 UTC
Created attachment 78161
Simple suite to illustrate the described problems
Comment 2 tomwheeler 2009-03-13 18:31:46 UTC
First unpack the suite, open it in the NetBeans IDE and then run it.

To reproduce the first problem:

1. Open the help contents (Help -> Help Contents)
2. Note that the unreferenced HTML file does not appear on the contents tab (as expected).
3. Click the Search tab and do a search for a word in the unreferenced HTML file (such as 'metallic' or 'elephant').
4. Note that the file appears in the search results, even though it is unreferenced.

To reproduce the second problem:

1. Open the help contents (Help -> Help Contents)
2. Click the Contents tab
3. Note that the "Three Letter Extension" page is displayed (as expected).  
4. Click the Search tab and do a search for a word in the .htm file (such as 'workaround' or 'presumably').
5. Note that the search does not locate this page.
Comment 3 Victor Vasilyev 2009-03-23 01:30:29 UTC
You are right: both problems have the same root cause.
The code responsible is available in source form in <NetBeansInstallDir>/harness/build.xml.
See the target with name="javahelp".

About the first problem, "Indexing HTML documents that are not listed in the map file":
I don't think this is an issue, because:
1) The map file defines only the identifiers of documents, not the actual content of the help set. Not every document in
a help set necessarily needs an identifier in a project.
2) You can easily exclude files from indexing by specifying the javahelp.excludes property in the
project.properties file of the project that contains the help set.
For example, in your sample suite you can define
javahelp.excludes=unreferenced.html
in the javahelpsuite/helpme/nbproject/project.properties file.
Note that you can use Ant-style patterns as the value of the javahelp.excludes property, e.g. "**/*.htm".

About the second problem, "Indexing of the HTML documents with .htm extension is not provided":
I agree it is a bug.
Currently, only "**/*.html" is defined as the file set to include for indexing.
Moreover, it is hardcoded in the call of the jhindexer task.
I think that to fix this issue we need to establish an additional property, "javahelp.includes", and pass its value to
the call of the jhindexer task. The user would then have full control over the indexing process from inside the IDE.
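
A rough sketch of how that proposal might look in the harness build.xml follows. It is only an illustration: javahelp.includes is the property suggested above, not one the harness defines, and the default value shown is an assumption chosen to preserve the current behavior. The sketch mirrors the way javahelp.excludes is already passed to a nested element.

```xml
<!-- Sketch only: javahelp.includes is the proposed property, not one the harness defines today. -->
<!-- Assumed default: keeps the current behavior when project.properties does not set the property. -->
<property name="javahelp.includes" value="**/*.html"/>

<jhindexer basedir="${build.javahelp.dir}/${javahelp.base}"
           db="${build.javahelp.dir}/${javahelp.base}/${javahelp.search}">
    <classpath>
        <pathelement location="${jhall.jar}"/>
    </classpath>
    <!-- Replaces the hardcoded <include name="**/*.html"/> element -->
    <include name="${javahelp.includes}"/>
    <exclude name="${javahelp.search}/"/>
    <exclude name="${javahelp.excludes}"/>
</jhindexer>
```

Because a nested include element takes a single pattern in its name attribute, a project would have to set one pattern that covers everything it wants indexed, for instance javahelp.includes=**/*.htm* to match both .htm and .html files.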

WORKAROUND:
-----------
To change the set of files involved in indexing, you can modify the
<NetBeansInstallDir>/harness/build.xml file in your NetBeans IDE installation directory like this [see line 177
(NB 6.5)]:

        <jhindexer basedir="${build.javahelp.dir}/${javahelp.base}"
                   db="${build.javahelp.dir}/${javahelp.base}/${javahelp.search}">
            <classpath>
                <pathelement location="${jhall.jar}"/>
            </classpath>
<!--        <include name="**/*.html"/> THIS ELEMENT IS COMMENTED OUT TO FIX THE ISSUE WITH INDEXING OF .htm FILES -->
            <exclude name="${javahelp.search}/"/>
            <exclude name="${javahelp.excludes}"/>
        </jhindexer>

Note that, by default, the jhindexer task will include all files with either a .html or a .htm extension.
Comment 4 Victor Vasilyev 2009-03-23 18:09:14 UTC
Proposal for fixing this issue:
file apisupport.harness\release\build.xml line 240:
-------------------------------------------------------------------------------------
        <jhindexer basedir="${build.javahelp.dir}/${javahelp.base}"
                   db="${build.javahelp.dir}/${javahelp.base}/${javahelp.search}">
            <classpath>
                <pathelement location="${jhall.jar}"/>
                <pathelement location="${harness.dir}/tasks.jar"/>
            </classpath>
            <include name="**/*.html"/>
            <include name="**/*.htm"/> <!-- Fix for Issue #160276 -->
            <exclude name="${javahelp.search}/"/>
            <exclude name="${javahelp.excludes}"/>
        </jhindexer>
-------------------------------------------------------------------------------------
Comment 5 Jesse Glick 2009-03-23 19:02:10 UTC
Including **/*.htm sounds like the correct fix to me. (Including all files is not acceptable - you would be indexing
images etc.)

All HTML files should be indexed whether or not they are directly listed in the map. After all, they might be reachable
via hyperlinks from some other pages which are listed.
Comment 6 rmichalsky 2009-03-24 10:24:19 UTC
OK, committed as changeset core-main #c3b3cf1e4173, thanks for the fix.
Comment 7 Quality Engineering 2009-03-25 21:22:05 UTC
Integrated into 'main-golden', will be available in build *200903251400* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)
Changeset: http://hg.netbeans.org/main-golden/rev/c3b3cf1e4173
User: Richard Michalsky <rmichalsky@netbeans.org>
Log: #160276: .htm files not included in help index
Comment 8 tomwheeler 2009-03-26 02:03:52 UTC
Verified in a JavaSE build I grabbed off deadlock.netbeans.org today.

Thanks, guys!!