Bug 65586 - JarContents#mightContainResource doesn't return true when finding directory in jar file by using bloom filter
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 9
Classification: Unclassified
Component: Catalina
Version: unspecified
Hardware: PC Linux
Importance: P2 normal
Target Milestone: -----
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-09-22 02:17 UTC by DigitalCat
Modified: 2021-09-27 20:57 UTC
Description DigitalCat 2021-09-22 02:17:12 UTC
Dear all

When a bloom filter is used to speed up archive lookups (useBloomFilterForArchives="true" in context.xml) in Tomcat 9, Tomcat fails to find some resources in JAR files under certain conditions.
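
For reference, a minimal sketch of how this setting is typically enabled. It assumes the attribute is placed on the Resources element nested inside Context; the exact placement should be checked against the configuration reference for the Tomcat version in use.

```xml
<!-- context.xml (assumed placement of the attribute) -->
<Context>
    <Resources useBloomFilterForArchives="true" />
</Context>
```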

For example, when looking up a directory resource in xxx.jar with the bloom filter enabled:

If we use cl.getResource("/org/apache/coyote", "/WEB-INF/classes") to get the resource, nothing is returned.
(By the way, the same kind of lookup is used in xmlbeans-4.0.0.jar, in org.apache.xmlbeans.impl.schema.SchemaTypeLoaderImpl#isPath30.)

But if we use cl.getResource("/org/apache/coyote/", "/WEB-INF/classes"), the resource is returned successfully.

If the bloom filter is not used, both calls return the resource successfully.

This is caused by the constructor org.apache.catalina.webresources.JarContents#JarContents, which hashes JarEntry.getName() for every entry. If the JarEntry is a directory, its name ends with "/".

So when the path passed in does not end with "/", its hash differs from the one stored for the directory entry, and org.apache.catalina.webresources.JarContents#mightContainResource returns false.

For example:

        // Requires: import java.util.jar.JarFile;
        //           import org.apache.catalina.webresources.JarContents;
        JarFile jarFile = new JarFile("D:\\tomcat-coyote.jar");
        JarContents jarContents = new JarContents(jarFile);

        // false: path without trailing slash
        System.out.println(jarContents.mightContainResource("/org/apache/catalina", "/WEB-INF/classes"));

        // true: path with trailing slash
        System.out.println(jarContents.mightContainResource("/org/apache/catalina/", "/WEB-INF/classes"));
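
The mismatch can also be reproduced without a Tomcat jar. The following is only a simplified stand-in for the filter, not Tomcat's actual code: the class name, the table size of 2048 and the multipliers 31 and 17 are assumptions made for this sketch. It records a directory entry name as the jar reports it ("org/") and then looks it up with and without the trailing slash.

```java
import java.util.BitSet;

// Simplified two-hash bloom filter over jar entry names (hypothetical sketch).
class TrailingSlashBloomSketch {
    static final int TABLE_SIZE = 2048; // assumed table size
    static final int PRIME_1 = 31;      // assumed hash multipliers
    static final int PRIME_2 = 17;

    final BitSet bits1 = new BitSet(TABLE_SIZE);
    final BitSet bits2 = new BitSet(TABLE_SIZE);

    // Hashes every character from startPos onward, including any trailing '/'.
    static int hash(String s, int startPos, int prime) {
        int h = prime / 2;
        for (int i = startPos; i < s.length(); i++) {
            h = prime * h + s.charAt(i);
        }
        return h < 0 ? h * -1 : h;
    }

    // Records an entry name exactly as given, e.g. "org/" for a directory.
    void add(String entryName) {
        bits1.set(hash(entryName, 0, PRIME_1) % TABLE_SIZE);
        bits2.set(hash(entryName, 0, PRIME_2) % TABLE_SIZE);
    }

    // Looks up a path, skipping one leading '/'.
    boolean mightContain(String path) {
        int startPos = path.startsWith("/") ? 1 : 0;
        return bits1.get(hash(path, startPos, PRIME_1) % TABLE_SIZE)
            && bits2.get(hash(path, startPos, PRIME_2) % TABLE_SIZE);
    }

    public static void main(String[] args) {
        TrailingSlashBloomSketch filter = new TrailingSlashBloomSketch();
        filter.add("org/"); // directory entries end with '/'
        System.out.println(filter.mightContain("/org"));  // false
        System.out.println(filter.mightContain("/org/")); // true
    }
}
```

Because the trailing '/' is hashed like any other character, "org" and "org/" set and test different bits, so the lookup without the slash is rejected before the jar is ever opened.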
 
So I suggest changing JarContents#hashcode like this, so that a trailing slash in the path is ignored:

    private int hashcode(String content, int startPos, int hashPrime) {
        int h = hashPrime/2;
        int contentLength = content.length();

        if (contentLength > 1 && content.charAt(contentLength - 1) == '/') {
            // ignore the trailing slash so that "dir" and "dir/" hash identically
            contentLength--;
        }

        for (int i = startPos; i < contentLength; i++) {
            h = hashPrime * h + content.charAt(i);
        }

        if (h < 0) {
            h = h * -1;
        }
        return h;
    }
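
As a quick check of the suggestion above, here is a standalone sketch of the same hashing logic. The class name and the multiplier 31 are assumptions for illustration; it is not the fix that was eventually committed. It compares the hash of a directory entry name with the hash of the same path requested without the trailing slash.

```java
// Standalone copy of the suggested hash with the trailing slash ignored (hypothetical sketch).
class NormalizedHashSketch {
    static int hash(String content, int startPos, int hashPrime) {
        int h = hashPrime / 2;
        int end = content.length();
        if (end > 1 && content.charAt(end - 1) == '/') {
            end--; // ignore the trailing slash
        }
        for (int i = startPos; i < end; i++) {
            h = hashPrime * h + content.charAt(i);
        }
        return h < 0 ? h * -1 : h;
    }

    public static void main(String[] args) {
        // Directory entry name as stored in the jar.
        int dirEntry = hash("org/apache/coyote/", 0, 31);
        // Same path as requested by a caller: leading slash skipped, no trailing slash.
        int request = hash("/org/apache/coyote", 1, 31);
        System.out.println(dirEntry == request); // true
    }
}
```

Both calls hash the same character sequence, "org/apache/coyote", so the entry and the request map to the same bits whether or not the caller appends the slash.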

	
	
Sorry, I am not a native speaker. I hope I made it clear!
Comment 1 Mark Thomas 2021-09-27 17:57:56 UTC
That is a very clear description. Thank you. I am working on a fix now.
Comment 2 Mark Thomas 2021-09-27 20:57:28 UTC
Fixed in:
- 10.1.x for 10.1.0-M6 onwards
- 10.0.x for 10.0.12 onwards
- 9.0.x for 9.0.54 onwards