Bug 62287

Summary: ValueExpressionImpl#equals is wrong
Product: Tomcat 9 Reporter: Mark Struberg <struberg>
Component: ELAssignee: Tomcat Developers Mailing List <dev>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P2    
Version: 9.0.6   
Target Milestone: -----   
Hardware: PC   
OS: Mac OS X 10.1   
Attachments: patch for ValueExpressionImpl

Description Mark Struberg 2018-04-12 09:42:42 UTC
Created attachment 35859 [details]
patch for ValueExpressionImpl

ValueExpressionImpl#equals is likely not enough.

The current code just compares the hashCode():

>return (obj instanceof ValueExpressionImpl && obj.hashCode() == this.hashCode());

It's possible that there is the same hashCode for different objects. For some Java Classes we alwasy get 0 even.


For better tracking: this came up during a bug report against Apache OpenWebBeans as reported by Holger Sunke.
https://issues.apache.org/jira/browse/OWB-1233
Comment 1 Mark Thomas 2018-04-23 13:21:06 UTC
Given the current implementation, the odds of a hash collision look to be very small. The probably explains why the code has been like this since 6.0.x without any bugs reported. That said, the Javadoc for #hashCode() and equals() is clear that non-equal objects may have equal hash codes so checking equals in the event of equal hash codes looks like the safe thing to do.

I'm assuming the #hashCode() was used for performance reasons.
Comment 2 Mark Thomas 2018-04-23 13:36:09 UTC
Thanks for the patch.

Fixed in:
- trunk for 9.0.8 onwards
- 8.5.x for 8.5.31 onwards
- 8.0.x for 8.0.52 onwards
- 7.0.x for 7.0.87 onwards
Comment 3 Christopher Schultz 2018-04-23 16:47:56 UTC
It's a little late, but I have a few comments about the patch.

1. There is no reason to check the hashcode if you are going to call .equals. Presumably, Node.equals() knows the best way to compare most efficiently.

2. I don't know enough about ValueExperssionImpl (etc.) to know whether this is true, but often using "instanceof" in an equals(Object) method is inappropriate. Is there any reason to use a concrete class here rather than an interface? Is there any reason two objects might have different runtime classes, yet still be considered "equals"?

Under normal circumstances, a proper equals(Object) method would look like this:

{
  return null != o
      && o.getClass().equals(this.getClass()
      && ((HighestPossibleInterface)o).getNode().equals(this.getNode())
  ;
}
Comment 4 Mark Thomas 2018-04-23 17:36:21 UTC
(In reply to Christopher Schultz from comment #3)
> It's a little late, but I have a few comments about the patch.
> 
> 1. There is no reason to check the hashcode if you are going to call
> .equals. Presumably, Node.equals() knows the best way to compare most
> efficiently.

My assumption was that hashCode() was used originally because it was faster. Therefore the patch uses hashCode() and only falls back to equals if the hash codes are the same. Whether that makes sense depends on the relative speed of hashCode() vs equals() and how often the tested objects are equal or not. I opted not to dig in to that and go with the patch as written. I've no objection to dropping the hashCode() if that is what folks prefer.


> 2. I don't know enough about ValueExperssionImpl (etc.) to know whether this
> is true, but often using "instanceof" in an equals(Object) method is
> inappropriate. Is there any reason to use a concrete class here rather than
> an interface? Is there any reason two objects might have different runtime
> classes, yet still be considered "equals"?

Not that I can think of. There would only ever be a single EL implementation in use.
Comment 5 Christopher Schultz 2018-04-23 17:57:00 UTC
I would actually prefer to drop the hashCode because:

1. It complicates code unnecessarily (!)
2. It makes a guess about the implementation of Node.equals() that a hash-code check is faster than any other check
2a. Some hashCode calls are expensive (eg SimpleNode.hashCode where "children" is long)
2b. Checking node's runtime type is probably the quickest possible rule-out
2c. Node.equals(Object) already checks object types
Comment 6 Mark Thomas 2018-04-23 18:40:37 UTC
No objection here.
Comment 7 Mark Struberg 2018-04-25 20:48:48 UTC
Thanks for the feedback and applying. Have been sick over the weekend, so sorry for the delay.

> I would actually prefer to drop the hashCode because:

Yes, you are both right. I first intended to ship a minimal impact patch. 
But of course hashCode() will likely again be called in Node#equals(). That would make it being called twice - which makes no sense.
Comment 8 Christopher Schultz 2018-04-26 15:30:14 UTC
(In reply to Mark Struberg from comment #7)
> But of course hashCode() will likely again be called in Node#equals(). That
> would make it being called twice - which makes no sense.

I've rarely seen an equals() method use hashCode for anything.

It's not used in SimpleNode#equals, for instance.
Comment 9 Mark Struberg 2018-04-26 17:48:45 UTC
> It's not used in SimpleNode#equals, for instance.

Oki, in that case please leave the hashCode() check!

Reason: the hashCode is internally cached as far as I've seen. So if you can quickly find NON matching nodes you are much faster. Expecially since the non-equals case is obviously much more often the case than equals.