183162 – provide efficient char sequence implementation

This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 183162 - provide efficient char sequence implementation

Summary: provide efficient char sequence implementation

Status:	RESOLVED FIXED

Alias:	None

Product:	platform
Classification:	Unclassified
Component:	-- Other -- (show other bugs)
Version:	6.x
Hardware:	All All

Importance:	P2 normal (vote)
Assignee:	Vladimir Voskresensky

URL:
Keywords:	API, API_REVIEW_FAST

Depends on:
Blocks:	181684
	Show dependency tree

Reported:	2010-03-30 09:34 UTC by Vladimir Voskresensky
Modified:	2010-04-07 04:43 UTC (History)
CC List:	6 users (show)

See Also:
Issue Type:	ENHANCEMENT
Exception Reporter:

Attachments
proposed patch-1 (38.35 KB, patch) 2010-03-30 10:53 UTC, Vladimir Voskresensky	Details \| Diff
final patch (42.83 KB, patch) 2010-03-31 12:28 UTC, Vladimir Voskresensky	Details \| Diff
next patch (46.02 KB, patch) 2010-04-01 14:58 UTC, Vladimir Voskresensky	Details \| Diff
Show Obsolete (2) Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Vladimir Voskresensky 2010-03-30 09:34:24 UTC

To reduce memory footprint NB Platform should use efficient char sequences implementations in place of strings.
It is advised to create such support in org.openide.util.CharSequences by moving CND's CharSequenceKey implementation into platform (see issue #181684)

Comment 1 Vladimir Voskresensky 2010-03-30 10:53:22 UTC

Created attachment 96315 [details]
proposed patch-1

patch with initial impl + tests

Comment 2 Vladimir Voskresensky 2010-03-30 10:54:47 UTC

Please, review proposed patch

Comment 3 Vitezslav Stejskal 2010-03-30 19:58:13 UTC

VS1: typo in javadoc "@author Vladiir Voskresensky"
VS2: CharSequences class should be declared final
VS3: missing apichanges.xml entry

Comment 4 Vladimir Voskresensky 2010-03-30 20:46:22 UTC

(In reply to comment #3)
> VS1: typo in javadoc "@author Vladiir Voskresensky"
Thanks
> VS2: CharSequences class should be declared final
class has private constructor
> VS3: missing apichanges.xml entry
OK

Comment 5 Vitezslav Stejskal 2010-03-30 21:16:54 UTC

(In reply to comment #4)
> (In reply to comment #3)
> > VS2: CharSequences class should be declared final
> class has private constructor

Yes, but declaring it final does not hurt anything and it's immediately clear to everybody that it really can't be subclassed.

Comment 6 Vladimir Voskresensky 2010-03-30 21:25:50 UTC

(In reply to comment #5)
> (In reply to comment #4)
> > (In reply to comment #3)
> > > VS2: CharSequences class should be declared final
> > class has private constructor
> 
> Yes, but declaring it final does not hurt anything and it's immediately clear
> to everybody that it really can't be subclassed.
I don't have real objections. 
Some others can have (Jesse?)... Utility classes with only static methods not often marked with "final", but must have private constructor

Comment 7 Jaroslav Tulach 2010-03-31 07:22:08 UTC

Y01 The javadoc is not really finished, is it? Sentences start with lower case, are short, not really describing what is going on, etc.

Y02 Modify apichanges.xml

Y03 Use assertSize. As the primary motivation for this is to save some memory, it is wise to enhanced the tests with assertSize guards.


I guess I'd make the class final as VS02 suggests.

Comment 8 Vladimir Voskresensky 2010-03-31 09:18:30 UTC

(In reply to comment #7)
> Y01 The javadoc is not really finished, is it? 
semantics of create methods is the same as constructors of String class and has appropriate links.
>Sentences start with lower case,are short, not really describing what is going on, etc.
Will fix
> 
> Y02 Modify apichanges.xml
(VS3) OK
> 
> Y03 Use assertSize. As the primary motivation for this is to save some memory,
> it is wise to enhanced the tests with assertSize guards.
Great. Will use.
> 
> 
> I guess I'd make the class final as VS02 suggests.
OK

Comment 9 Vladimir Voskresensky 2010-03-31 12:28:33 UTC

Created attachment 96423 [details]
final patch

Please, check patch with addressed comments. If not objections I will integrate tomorrow

Comment 10 Jesse Glick 2010-03-31 17:10:09 UTC

Generally I suggest to not mark utility classes final since final and abstract modifiers draw your attention toward subclasses, whereas for utility classes, subclasses are not so much prohibited as meaningless - so a final modifier is just a distraction. Since the ctor is private there is no risk of subclassing anyway. But I do not have a strong opinion.


[JG01] Get rid of ComparatorIgnoreCase. No one is likely to use it, and anyway it is not doing a locale-sensitive comparison like you would expect.


[JG02] CharSequenceKey might be clearer if it were split into two classes, one holding a byte[], one a char[].


[JG03] CharSequencesTest ctor should take a String name and pass it to super. That is how test cases get their names.


[JG04] The test does not appear to ever check non-ASCII character sequences.

Comment 11 Alexander Simon 2010-03-31 18:02:59 UTC

(In reply to comment #10)
> Generally I suggest to not mark utility classes final since final and abstract
> modifiers draw your attention toward subclasses, whereas for utility classes,
> subclasses are not so much prohibited as meaningless - so a final modifier is
> just a distraction. Since the ctor is private there is no risk of subclassing
> anyway. But I do not have a strong opinion.
> 
> 
> [JG01] Get rid of ComparatorIgnoreCase. No one is likely to use it, and anyway
> it is not doing a locale-sensitive comparison like you would expect.
Not agree, CND use it.
> 
> 
> [JG02] CharSequenceKey might be clearer if it were split into two classes, one
> holding a byte[], one a char[].
> 
Thanks, good idea. Improve performance.
> 
> [JG03] CharSequencesTest ctor should take a String name and pass it to super.
> That is how test cases get their names.
> 
> 
> [JG04] The test does not appear to ever check non-ASCII character sequences.
Absolutely agree.

Comment 12 Vladimir Voskresensky 2010-04-01 14:37:21 UTC

(In reply to comment #10)
> Generally I suggest to not mark utility classes final since final and abstract
> modifiers draw your attention toward subclasses, whereas for utility classes,
> subclasses are not so much prohibited as meaningless - so a final modifier is
> just a distraction. Since the ctor is private there is no risk of subclassing
> anyway. But I do not have a strong opinion.
> 
> 
> [JG01] Get rid of ComparatorIgnoreCase. No one is likely to use it, and anyway
> it is not doing a locale-sensitive comparison like you would expect.
We use it (Fortran is case insensitive language and we use comparator to match with known keywords). How to improve it?

> 
> 
> [JG02] CharSequenceKey might be clearer if it were split into two classes, one
> holding a byte[], one a char[].
Will update.

> 
> [JG03] CharSequencesTest ctor should take a String name and pass it to super.
> That is how test cases get their names.
OK

> 
> 
> [JG04] The test does not appear to ever check non-ASCII character sequences.
will add

Thanks.

Comment 13 Vladimir Voskresensky 2010-04-01 14:58:32 UTC

Created attachment 96545 [details]
next patch

patch with dedicated class for byte-based and char-based impls + updated comparator + tests for Unicode and contentEquals

Comment 14 Jesse Glick 2010-04-02 15:50:15 UTC

(In reply to comment #12)
>> [JG01] Get rid of ComparatorIgnoreCase
>
> We use it (Fortran is case insensitive language and we use comparator to match
> with known keywords). How to improve it?

Can this be implemented in the Fortran support code instead? Case-insensitive keyword matching is a relatively uncommon task, since nearly all modern languages are case-sensitive. (Ant is a notable and unfortunate exception, which has led to problems for Turkish users - the infamous I/i/İ/ı problem.)

If it must remain for reasons of encapsulation - i.e. there is no efficient way to implement comparable functionality externally - then make it clear in the Javadoc that this is not a locale-sensitive comparison, thus similar to String.equalsIgnoreCase or .compareToIgnoreCase rather than Collator.

Comment 15 Vladimir Voskresensky 2010-04-02 16:32:54 UTC

(In reply to comment #14)
> (In reply to comment #12)
> >> [JG01] Get rid of ComparatorIgnoreCase
> >
> > We use it (Fortran is case insensitive language and we use comparator to match
> > with known keywords). How to improve it?
> 
> Can this be implemented in the Fortran support code instead? 
I think, I can move it into Fortran. This comparator currently doesn't use knowledge about optimized classes internal structures (opposite to case sensitive)

> Case-insensitive
> keyword matching is a relatively uncommon task, since nearly all modern
> languages are case-sensitive. 
Do not agree, but off-topic.

Comment 16 Vladimir Voskresensky 2010-04-02 16:35:59 UTC

re JG01: I would make comparator() method without arguments. Is it ok?

Comment 17 Jesse Glick 2010-04-02 17:49:55 UTC

(In reply to comment #16)
> re JG01: I would make comparator() method without arguments.

Yes, that is what I meant.

Comment 18 Vladimir Voskresensky 2010-04-06 14:22:11 UTC

Thanks for review.
I will integrate.

Comment 19 Vladimir Voskresensky 2010-04-06 14:35:34 UTC

integrated:
http://hg.netbeans.org/cnd-main/rev/2a8a057f9721

Comment 20 Vladimir Voskresensky 2010-04-06 16:24:08 UTC

fixed subSequence in changed byte and char long seqs
http://hg.netbeans.org/cnd-main/rev/1d4ace847b93

Comment 21 Quality Engineering 2010-04-07 04:43:19 UTC

Integrated into 'main-golden', will be available in build *201004070201* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)
Changeset: http://hg.netbeans.org/main/rev/2a8a057f9721
User: Vladimir Voskresensky <vv159170@netbeans.org>
Log: fixed #183162 -  provide efficient char sequence implementation