
Bug 88966 - Lexer creates (too) many instances of my Lexer
Summary: Lexer creates (too) many instances of my Lexer
Status: RESOLVED DUPLICATE of bug 89000
Alias: None
Product: editor
Classification: Unclassified
Component: Lexer
Version: 5.x
Hardware: All All
Importance: P3 blocker
Assignee: issues@editor
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-11-08 19:17 UTC by _ briansmith
Modified: 2006-12-14 15:00 UTC

See Also:
Issue Type: DEFECT
Exception Reporter:

Description _ briansmith 2006-11-08 19:17:42 UTC
I was asked to file an issue here after posting on NBDEV "Lexer module creates
instances of my lexer very often":
http://www.netbeans.org/servlets/ReadMsg?list=nbdev&msgNo=35943

The main issue is that the Lexer API's design requires many instances of the
user's Lexer implementation to be instantiated, and that this happens at least
once for every character the user types. As a result, if construction of a Lexer
instance is expensive, then the whole editor becomes inefficient.

In my message, I suggest that the Lexer API should change, so that an
already-existing Lexer can be restarted. That is, replace
LanguageHierarchy#createLexer(LexerRestartInfo) with
LanguageHierarchy#createLexer() and then add Lexer#restart(LexerRestartInfo).

Once this API change is made, then the Lexer module's implementation could
change to allow it to cache and reuse existing lexer instances.
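
The proposal above could be sketched roughly as follows. This is only an
illustration, not the real NetBeans Lexer API: RestartInfo, RestartableLexer,
RestartableHierarchy, WordLexer and LexerCache are all hypothetical stand-ins.
It shows a no-argument createLexer() plus a restart() method, and a
framework-side cache that reuses an idle instance instead of constructing a
new one for every request.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for LexerRestartInfo: carries only the text to scan.
final class RestartInfo {
    final CharSequence text;
    RestartInfo(CharSequence text) { this.text = text; }
}

// Hypothetical lexer contract with the proposed restart method.
interface RestartableLexer {
    void restart(RestartInfo info);   // proposed Lexer#restart(LexerRestartInfo)
    String nextToken();               // returns null at end of input
}

// Hypothetical hierarchy with the proposed no-argument factory method.
interface RestartableHierarchy {
    RestartableLexer createLexer();   // proposed LanguageHierarchy#createLexer()
}

// Toy lexer that splits on whitespace and counts how often it is constructed.
final class WordLexer implements RestartableLexer {
    static int constructed = 0;
    private CharSequence text = "";
    private int pos = 0;

    WordLexer() { constructed++; }    // imagine this being expensive

    @Override public void restart(RestartInfo info) {
        text = info.text;
        pos = 0;
    }

    @Override public String nextToken() {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) pos++;
        if (pos >= text.length()) return null;
        int start = pos;
        while (pos < text.length() && !Character.isWhitespace(text.charAt(pos))) pos++;
        return text.subSequence(start, pos).toString();
    }
}

// Framework side (assumed single-threaded here): keep idle lexers, restart them.
final class LexerCache {
    private final RestartableHierarchy hierarchy;
    private final Deque<RestartableLexer> idle = new ArrayDeque<>();

    LexerCache(RestartableHierarchy hierarchy) { this.hierarchy = hierarchy; }

    List<String> tokenize(CharSequence text) {
        // Reuse an idle lexer if one exists; construct only when none is free.
        RestartableLexer lexer = idle.isEmpty() ? hierarchy.createLexer() : idle.pop();
        lexer.restart(new RestartInfo(text));
        List<String> tokens = new ArrayList<>();
        for (String t = lexer.nextToken(); t != null; t = lexer.nextToken()) {
            tokens.add(t);
        }
        idle.push(lexer);             // hand the instance back for the next request
        return tokens;
    }
}
```

With this shape, calling tokenize() once per typed character restarts the same
WordLexer each time rather than constructing a new one.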

Alternatively, please provide a method Lexer#finished(), which the lexer module
would call to indicate that the Lexer is no longer going to be used. This would
allow the user to implement his own pool of lexer instances in his
LanguageHierarchy implementation.
Comment 1 Miloslav Metelka 2006-11-28 16:16:56 UTC
In response to discussion on nbdev:
I do not think that caching per document or per thread is necessary. When the
lexer framework needs a lexer for a particular language, a free lexer will
either come from a cache or be created from scratch if all the lexers for that
language are in use. I would leave the caching to the Lexer implementations.
Handcoded lexers will not need it at all, and lexer implementations that use
caching must ensure that all instance variables that might cause leaks are
cleared in release().

Personally I would cache at most two lexer instances: one for user
modifications to a document (all typing requests are handled in AWT) and the
other one for a background parser (typically there is a single parsing queue for
the given language that processes a single parsing request at a time).
Especially if the lexers consume a lot of memory, it would not be good to cache
many of them.
BTW, couldn't JFlex be tweaked to use only the LexerInput and not an extra
buffer? That way the per-lexer memory could be decreased.

So if we can agree on restoring Lexer.release(), I will go ahead and implement
issue 89000.
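
The implementation-side caching described in this comment might look roughly
like the sketch below. It is an assumption-laden illustration rather than the
actual fix for issue 89000: PooledLexer and LexerPool are hypothetical names,
and the real Lexer and LanguageHierarchy types are not used. It shows a pool
capped at two instances and a release() that clears the lexer's reference to
its input before the instance is cached.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical expensive lexer; 'input' is the state that must not leak.
final class PooledLexer {
    static int constructed = 0;
    private final LexerPool owner;
    CharSequence input;               // reference into the text being lexed

    PooledLexer(LexerPool owner) {
        this.owner = owner;
        constructed++;                // imagine this being expensive
    }

    void restart(CharSequence newInput) { input = newInput; }

    // Called when the framework no longer needs this instance.
    void release() {
        input = null;                 // clear state so a cached lexer cannot pin a document
        owner.giveBack(this);
    }
}

// Pool that would live inside the user's language hierarchy (hypothetical).
final class LexerPool {
    private static final int MAX_CACHED = 2;  // one for typing, one for background parsing
    private final Deque<PooledLexer> free = new ArrayDeque<>();

    synchronized PooledLexer createLexer(CharSequence input) {
        // Take a free instance if available; otherwise construct from scratch.
        PooledLexer lexer = free.isEmpty() ? new PooledLexer(this) : free.pop();
        lexer.restart(input);
        return lexer;
    }

    synchronized void giveBack(PooledLexer lexer) {
        // Keep at most MAX_CACHED instances; extras are dropped for the GC.
        if (free.size() < MAX_CACHED) {
            free.push(lexer);
        }
    }

    synchronized int cachedCount() { return free.size(); }
}
```

The cap keeps memory bounded when lexers are large, at the cost of occasional
re-construction if more than two are in use at once.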
Comment 2 _ briansmith 2006-12-06 13:18:09 UTC
I am happy either way, but I think that the way I suggested is better
because it gives you (the author of the Lexer module) greater
flexibility in the future regarding threading, and it is easier for users
to code against, because they don't have to deal with any pooling.

It is possible to tweak JFlex-generated lexers so that they have a very small 
buffer, but that doesn't decrease the number of objects that need to be 
instantiated per lexer. To get rid of the buffer requirement completely, JFlex 
itself needs to be patched.
Comment 3 Miloslav Metelka 2006-12-14 15:00:28 UTC
I have committed issue 89000. Making this issue a dup of it.

*** This issue has been marked as a duplicate of 89000 ***