Friday, February 24, 2012

Better JavaScript content assist in Eclipse Orion

Even in the month since I started using and working on Eclipse Orion, there have been significant improvements to it. The UI is getting cleaned up, navigation is getting easier, git integration is progressing, and search is improving. It's great to be contributing to such a fast moving project, even if the rate of change can be dizzying sometimes. Also, I'm new to JavaScript and I have never worked on such a large code-base written in a dynamic language. One of the things that I have been missing the most is good tool support that really knows about the code you are working on. Traditional IDEs set the bar high in this area.

I'm happy to introduce a small step in the direction of making the JavaScript editor smarter. I've just released an Orion plugin that provides semantically aware content assist in the JavaScript editor. The plugin uses the Esprima JavaScript parser with some extra error recovery added by Andy Clement. I have found the Esprima parser to be fast, clean, and easy to use and using it as the core of the content assist plugin has been the right thing to do, even though this project, too, is fast moving and hard to keep up with.

How it works


Like content assist in Java editors in Eclipse, the Esprima content assist plugin is based off of a semantically rich abstract syntax tree (AST) and so content assist proposals are more likely to be relevant than if we were using a lexical approach to content assist. Here is what happens:

  1. On a content assist invocation, the contents of the buffer are parsed by Esprima.
  2. The resulting AST is walked by the content assist plugin.
  3. While walking the AST, the target type of any AST node is recorded as well as assignments and declarations. This information helps us keep track of what properties are available on each known type at any given point in the AST.
  4. After walking a sufficient amount of the AST (we don't need to walk the entire tree since parts of it are not going to be relevant for a given content assist invocation), all available proposals are calculated based on the target type of the invocation offset and the prefix.

The best way to understand how this works is through examples.

What it can do


Recognizing function scopes


As you can see in this screenshot, scoping is respected and identifiers that are not accessible in the current scope (vInnerInner, v3, v4,…) are not shown in content assist.


Object literals


The key/values of object literals are appropriately proposed:

Even nested object literals are recognized:


Simple control flow


Simple control flow is recorded by the plugin, so that assignments are remembered:


Pre-defined types


Some (but not all) predefined types are available in content assist.


Currently, the plugin recognizes JSON, MATH, Number, String, Boolean, and Date, but I will probably add more as it makes sense.

Constructors


Functions that start with capital letters are considered constructors


Parser recovery


Finally, Andy Clement has been doing some work on making the esprima parser recoverable from errors. Actually, some error recovery is already in esprima, but we need to tweak it a bit for content assist. Hopefully, this work can be contributed back to esprima after we have a good solution. Currently, the recovery work is focussed on errant dots. A common case is that you will type a variable name and then a '.' and expect content assist to provide all reasonable answers. Most JavaScript parsers will fail after the first error, which makes them quite useless when editing code.


As you can see in the screenshot, despite all of the funky dots, the plugin is able to realize that myVar is of type Number and is providing appropriate proposals.

What it can't do (yet)


This is still early for the content assist plugin and there is quite a bit of work to do. For example:

  1. There is no pre-defined window object, which probably should be there, along with possibly other predefined objects, like dojo, dijit, and $ (jquery).
  2. There is no analysis of function return types
  3. No inter-file type inferencing, which will be crucial for getting anything really smart working
  4. The plugin should recognize /*global */ comments
  5. The Esprima-based proposals are currently intermingled with proposals from the default JS content assist plugin and so duplicates appear. (Esprima proposals are always prefixed with a handle (Esprima) so you know where they come from, but they are always on the bottom).

I hope to deal with each of these issues eventually, but I also need to make sure that performance remains reasonable, which it currently seems to be, but is something I need to watch.

How to get it


Mark Macdonald has already added the pluign to the Orion plugin page, so after you log into Orion, click to the "Get Plugins" link and select the Install link for the Esprima content assist plugin:


The github page is located here: https://github.com/aeisenberg/esprimaContentAssist so try it out, have a look at the code and let me know what you think!