I was asked to file an issue here after posting on NBDEV, "Lexer module creates instances of my lexer very often": http://www.netbeans.org/servlets/ReadMsg?list=nbdev&msgNo=35943 The main issue is that the Lexer API's design requires many instances of the user's Lexer implementation to be created, and that this happens at least once for every character the user types. As a result, if constructing a Lexer instance is expensive, the whole editor becomes slow. In my message, I suggest changing the Lexer API so that an already-existing Lexer can be restarted. That is, replace LanguageHierarchy#createLexer(LexerRestartInfo) with LanguageHierarchy#createLexer() and add Lexer#restart(LexerRestartInfo). Once this API change is made, the Lexer module's implementation could be changed to cache and reuse existing lexer instances. Alternatively, please provide a method Lexer#finished(), which the lexer module would call to indicate that the Lexer is no longer going to be used. This would allow users to implement their own pool of lexer instances in their LanguageHierarchy implementation.
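To make the first proposal concrete, here is a minimal sketch of a restartable lexer. The types RestartInfo, RestartableLexer and WordLexer are hypothetical stand-ins written for illustration only; they are not the real NetBeans Lexer API classes, and the real LexerRestartInfo carries more than an input sequence.

```java
// Hypothetical stand-in for LexerRestartInfo; only carries the text to scan.
final class RestartInfo {
    final CharSequence input;

    RestartInfo(CharSequence input) {
        this.input = input;
    }
}

// Hypothetical shape of the proposed API: construct once, restart many times.
interface RestartableLexer {
    void restart(RestartInfo info);

    // Returns the next token text, or null when the input is exhausted.
    String nextToken();
}

// Toy lexer that splits on whitespace. The constructor stands in for the
// expensive setup that should not be repeated on every keystroke.
final class WordLexer implements RestartableLexer {
    static int instancesCreated = 0;

    private CharSequence input = "";
    private int pos = 0;

    WordLexer() {
        instancesCreated++;
    }

    @Override
    public void restart(RestartInfo info) {
        // Reset only the cheap per-run state; the instance itself is reused.
        input = info.input;
        pos = 0;
    }

    @Override
    public String nextToken() {
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) {
            pos++;
        }
        if (pos >= input.length()) {
            return null;
        }
        int start = pos;
        while (pos < input.length() && !Character.isWhitespace(input.charAt(pos))) {
            pos++;
        }
        return input.subSequence(start, pos).toString();
    }
}
```

With this shape, the framework could keep one WordLexer and call restart() for each modification instead of constructing a new instance each time.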
In response to the discussion on nbdev: I do not think that caching per document or per thread is necessary. When the lexer framework needs a lexer for a particular language, a free lexer will either come from a cache or be created from scratch if all the lexers for that language are in use. I would leave the caching to the Lexer implementations. Hand-coded lexers will not need it at all, and lexer implementations that use caching must ensure that all of the lexer's instance variables that might cause leaks are cleared in release(). Personally, I would cache at most two lexer instances: one for user modifications to a document (all typing requests are handled in AWT) and the other for a background parser (typically there is a single parsing queue for a given language that processes a single parsing request at a time). Especially if the lexers consume a lot of memory, it would not be good to cache many of them. BTW, couldn't JFlex be tweaked to use only the LexerInput and not an extra buffer? That way the per-lexer memory could be decreased. So if we can agree on returning Lexer.release(), I will go on and implement issue 89000.
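The pooling approach described above might look roughly like the following sketch. LexerPool and ExpensiveLexer are hypothetical names invented for this example and are not part of the Lexer module; the cap of two cached instances follows the suggestion above, and release() clears the lexer's reference to its input before the instance goes back to the pool.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical bounded pool: hands out a cached instance if one is free,
// otherwise creates a new one. Instances released beyond the cap are dropped.
final class LexerPool<L> {
    private final Deque<L> free = new ArrayDeque<>();
    private final int maxCached;
    private final Supplier<L> factory;

    LexerPool(int maxCached, Supplier<L> factory) {
        this.maxCached = maxCached;
        this.factory = factory;
    }

    synchronized L acquire() {
        L cached = free.pollFirst();
        return cached != null ? cached : factory.get();
    }

    synchronized void release(L lexer) {
        if (free.size() < maxCached) {
            free.addFirst(lexer);
        }
        // Otherwise the instance is simply left for the garbage collector.
    }

    synchronized int cachedCount() {
        return free.size();
    }
}

// Hypothetical lexer whose construction is assumed to be costly.
final class ExpensiveLexer {
    // At most two cached instances: one for typing, one for a background parser.
    private static final LexerPool<ExpensiveLexer> POOL =
            new LexerPool<>(2, ExpensiveLexer::new);

    private CharSequence input;

    private ExpensiveLexer() {
        // Expensive table setup would happen here.
    }

    // Plays the role of LanguageHierarchy#createLexer in this sketch.
    static ExpensiveLexer create(CharSequence input) {
        ExpensiveLexer lexer = POOL.acquire();
        lexer.input = input;
        return lexer;
    }

    CharSequence input() {
        return input;
    }

    // Called when the framework is done with this lexer.
    void release() {
        input = null; // clear references so the pooled instance leaks nothing
        POOL.release(this);
    }
}
```

A lexer released after one run is the same object handed out for the next run, so the constructor cost is paid at most once per concurrently active lexer.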
I am happy either way. But I think the way I suggested is better, because it gives you (the author of the Lexer module) greater flexibility in the future regarding threading, and it is easier for the user to code, because they don't have to deal with any pooling. It is possible to tweak JFlex-generated lexers so that they have a very small buffer, but that doesn't decrease the number of objects that need to be instantiated per lexer. To get rid of the buffer requirement completely, JFlex itself needs to be patched.
I have committed issue 89000. Making this issue a dup of it. *** This issue has been marked as a duplicate of 89000 ***