Some use cases require that API clients be able to create language embeddings dynamically:

1) A Java string literal may contain, e.g., SQL statement text. Because there is already a default embedding that recognizes escaped characters such as "\n", a single token must be able to carry more than one embedding (the default embedding must remain available, since clients may rely on its presence).
2) A <script> HTML tag does not need to specify the type of the scripting language, and the default language may be overridden by <META http-equiv="Content-Script-Type" content="type">. Although the lexer could in theory recognize such a declaration and store the content type in the state object of each token that follows it, this is impractical: recognizing the declaration is a task that should be reserved for an HTML parser.
3) A script that follows the <script> tag may be written as a comment surrounded by <!-- // -->, and there may be no default embedding for the comment, so the parser must explicitly request embedding creation for the comment token.

Requirements:

1) An API method must be added to TokenSequence for custom embedding creation.
2) The notification model must be extended so that clients (e.g. syntax coloring) can notice the creation of a new embedding.
3) Clients must also listen for the case when a new token eligible for custom embedding gets created. Also, if a token with a custom embedding is damaged by the user's typing, the custom embedding is lost, so the clients must recreate it.
4) If more than one embedding exists for a single token, one of the embeddings must be chosen for syntax coloring. As there is not yet a use case with more than one custom embedding, the solution can be that syntax coloring uses the custom embedding if one exists and otherwise uses the default embedding.
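The selection rule in requirement 4 could be sketched roughly as follows. This is only an illustration: EmbeddingInfo and ColoringEmbeddingChooser are hypothetical stand-ins, not classes of the lexer API, and the MIME type strings used with them are made up.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical description of one embedding attached to a token (not a lexer API class). */
final class EmbeddingInfo {
    final String languageMimeType; // illustrative identifier of the embedded language
    final boolean custom;          // true if created explicitly by a client, false if default

    EmbeddingInfo(String languageMimeType, boolean custom) {
        this.languageMimeType = languageMimeType;
        this.custom = custom;
    }
}

/** Hypothetical helper that picks the embedding syntax coloring should use. */
final class ColoringEmbeddingChooser {
    private ColoringEmbeddingChooser() {
    }

    /**
     * Returns the first custom embedding if there is one, otherwise the first
     * (default) embedding, or an empty Optional if the token has no embedding.
     */
    static Optional<EmbeddingInfo> choose(List<EmbeddingInfo> embeddings) {
        EmbeddingInfo fallback = null;
        for (EmbeddingInfo e : embeddings) {
            if (e.custom) {
                return Optional.of(e); // a custom embedding always wins
            }
            if (fallback == null) {
                fallback = e; // remember the first default embedding
            }
        }
        return Optional.ofNullable(fallback);
    }
}
```

For a Java string literal token that carries both the default escape-sequence embedding and a custom SQL embedding, choose() would return the SQL one. If the token is damaged and relexed so that only the default embedding remains, choose() falls back to the default one.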
The attached diff contains an implementation of this request. It makes the following changes:

1. Extracted the TokenHierarchyEvent.Type inner enum into a top-level TokenHierarchyEventType enum for better readability.
2. Added the TokenSequence.createEmbedding() method for creating a custom embedding. A new TokenHierarchyEventType.EMBEDDING value is fired after the custom embedding is created.
3. Moved the affected offset area information, affectedStartOffset() and affectedEndOffset(), from TokenChange to TokenHierarchyEvent, because that is more useful and clearer for the clients of these methods: e.g. syntax coloring can just query these offsets without digging into the (possibly embedded) token change(s).
4. Removed the tokenComplete parameter from LanguageHierarchy.embedding() because it is currently unused and token incompleteness will be handled in a different way in the future (see also issue 87014).
5. Swapped the order of the token and languagePath parameters in LanguageProvider to be in sync with LanguageHierarchy.embedding().
6. LanguageEmbedding is now a final class (instead of an abstract class) with a private constructor and a static create() method. That allows better control over the evolution of the class, and it also allows the created embeddings to be cached to save memory.
7. LanguageEmbedding is now generified as LanguageEmbedding<T extends TokenId>, matching the generification of the language it contains.
8. The TokenHierarchy.languagePaths() set contains all language paths used in the particular token hierarchy. TokenHierarchyEventType.LANGUAGE_PATHS is fired after a change of the language paths set.
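The pattern behind changes 6 and 7 (a final generic class with a private constructor and a caching static create() method) might look roughly like the sketch below. All names here (TokenKind, Lang, Embedding, and the two skip-length parameters) are hypothetical stand-ins chosen for illustration. The sketch does not reproduce the actual LanguageEmbedding signature from the diff.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the token id type. */
interface TokenKind {
}

/** Hypothetical stand-in for a language description, compared by identity. */
final class Lang<T extends TokenKind> {
    final String mimeType;

    Lang(String mimeType) {
        this.mimeType = mimeType;
    }
}

/** Final class with a private constructor: instances only come from create(). */
final class Embedding<T extends TokenKind> {
    // Cache of already created instances. Entries are never evicted in this sketch.
    private static final Map<List<Object>, Embedding<?>> CACHE = new HashMap<>();

    private final Lang<T> language;
    private final int startSkip; // assumed parameter: chars skipped at the token start
    private final int endSkip;   // assumed parameter: chars skipped at the token end

    private Embedding(Lang<T> language, int startSkip, int endSkip) {
        this.language = language;
        this.startSkip = startSkip;
        this.endSkip = endSkip;
    }

    /** Returns a shared instance for the given arguments, creating it on first use. */
    @SuppressWarnings("unchecked")
    static synchronized <T extends TokenKind> Embedding<T> create(
            Lang<T> language, int startSkip, int endSkip) {
        List<Object> key = Arrays.<Object>asList(language, startSkip, endSkip);
        Embedding<?> cached = CACHE.get(key);
        if (cached == null) {
            cached = new Embedding<T>(language, startSkip, endSkip);
            CACHE.put(key, cached);
        }
        // Safe: the key contains the language, so the cached value was built for Lang<T>.
        return (Embedding<T>) cached;
    }

    Lang<T> language() {
        return language;
    }

    int startSkip() {
        return startSkip;
    }

    int endSkip() {
        return endSkip;
    }
}
```

Because clients can neither subclass nor construct the class directly, the factory is free to return a shared instance for equal arguments, which is where the memory saving mentioned in change 6 would come from.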
Created attachment 36320 [details] Diff of the change
Marking for fasttrack review.
BTW "diff -u" is generally more readable than "diff -c", especially in an enormous patch like this one. Easiest to append "diff -u" to your ~/.cvsrc file.
Created attachment 36454 [details] List of committed files
Committed into trunk.
Uh, did you mean M6?
Sorry, I meant M6. Thanks, Jesse.