Friday, February 24, 2012

Better JavaScript content assist in Eclipse Orion

Even in the month since I started using and working on Eclipse Orion, there have been significant improvements to it. The UI is getting cleaned up, navigation is getting easier, Git integration is progressing, and search is improving. It's great to be contributing to such a fast-moving project, even if the rate of change can be dizzying sometimes. Also, I'm new to JavaScript and I have never worked on such a large code base written in a dynamic language. One of the things that I have been missing the most is good tool support that really knows about the code you are working on. Traditional IDEs set the bar high in this area.

I'm happy to introduce a small step in the direction of making the JavaScript editor smarter. I've just released an Orion plugin that provides semantically aware content assist in the JavaScript editor. The plugin uses the Esprima JavaScript parser with some extra error recovery added by Andy Clement. I have found the Esprima parser to be fast, clean, and easy to use, and using it as the core of the content assist plugin has been the right thing to do, even though this project, too, is fast-moving and hard to keep up with.

How it works


Like content assist in Eclipse's Java editors, the Esprima content assist plugin is based on a semantically rich abstract syntax tree (AST), so its proposals are more likely to be relevant than those from a purely lexical approach. Here is what happens:

  1. On a content assist invocation, the contents of the buffer are parsed by Esprima.
  2. The resulting AST is walked by the content assist plugin.
  3. While walking the AST, the target type of any AST node is recorded as well as assignments and declarations. This information helps us keep track of what properties are available on each known type at any given point in the AST.
  4. After walking a sufficient amount of the AST (we don't need to walk the entire tree since parts of it are not going to be relevant for a given content assist invocation), all available proposals are calculated based on the target type of the invocation offset and the prefix.
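
To make these steps concrete, here is a minimal sketch of the walk-and-record idea. It is not the plugin's actual code: the AST is written by hand (in the Mozilla Parser API shape that Esprima emits) so the example runs without Esprima, and inferType, walk, collectTypes, and proposals are hypothetical helper names invented for illustration. It also walks the whole tree and ignores scopes, which the real plugin does not.

```javascript
// Hand-written AST for:  var count = 1; var label; label = "hi";
var ast = {
  type: "Program",
  body: [
    { type: "VariableDeclaration", kind: "var", declarations: [
      { type: "VariableDeclarator",
        id: { type: "Identifier", name: "count" },
        init: { type: "Literal", value: 1 } }
    ] },
    { type: "VariableDeclaration", kind: "var", declarations: [
      { type: "VariableDeclarator",
        id: { type: "Identifier", name: "label" },
        init: null }
    ] },
    { type: "ExpressionStatement", expression: {
      type: "AssignmentExpression", operator: "=",
      left: { type: "Identifier", name: "label" },
      right: { type: "Literal", value: "hi" } } }
  ]
};

// Hypothetical helper: guess a type name for an expression node.
function inferType(node) {
  if (node && node.type === "Literal") {
    var t = typeof node.value;
    if (t === "number") { return "Number"; }
    if (t === "string") { return "String"; }
    if (t === "boolean") { return "Boolean"; }
  }
  return "Object"; // fallback when nothing better is known
}

// Generic depth-first walk: call visit on every AST node.
function walk(node, visit) {
  if (!node || typeof node.type !== "string") { return; }
  visit(node);
  Object.keys(node).forEach(function (key) {
    var child = node[key];
    if (Array.isArray(child)) {
      child.forEach(function (c) { walk(c, visit); });
    } else if (child && typeof child === "object") {
      walk(child, visit);
    }
  });
}

// Record the inferred type of each declared or assigned variable.
function collectTypes(root) {
  var types = {};
  walk(root, function (node) {
    if (node.type === "VariableDeclarator") {
      types[node.id.name] = inferType(node.init);
    } else if (node.type === "AssignmentExpression" &&
               node.left.type === "Identifier") {
      types[node.left.name] = inferType(node.right);
    }
  });
  return types;
}

// Compute proposals: every known name that starts with the prefix.
function proposals(types, prefix) {
  return Object.keys(types)
    .filter(function (name) { return name.indexOf(prefix) === 0; })
    .map(function (name) { return name + " : " + types[name]; });
}

console.log(proposals(collectTypes(ast), "l")); // [ 'label : String' ]
```

Note that label starts out with the fallback type and is updated to String when the later assignment is visited, which is the kind of bookkeeping step 3 describes.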

The best way to understand how this works is through examples.

What it can do


Recognizing function scopes


As you can see in this screenshot, scoping is respected and identifiers that are not accessible in the current scope (vInnerInner, v3, v4,…) are not shown in content assist.
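
One way to picture this is as a stack of scopes that grows when the walker enters a function and shrinks when it leaves. The sketch below is a simplification under that assumption, not the plugin's code; ScopeStack and its methods are hypothetical names, and the variable names are made up for the example.

```javascript
// Hypothetical scope stack: each scope is a plain map of name -> type.
function ScopeStack() {
  this.scopes = [{}]; // start with the global scope
}

// Enter a new (function) scope.
ScopeStack.prototype.push = function () {
  this.scopes.push({});
};

// Leave the current scope; its names are no longer visible.
ScopeStack.prototype.pop = function () {
  this.scopes.pop();
};

// Declare a name in the innermost scope.
ScopeStack.prototype.declare = function (name, type) {
  this.scopes[this.scopes.length - 1][name] = type;
};

// Look a name up from the innermost scope outwards.
ScopeStack.prototype.lookup = function (name) {
  for (var i = this.scopes.length - 1; i >= 0; i--) {
    if (Object.prototype.hasOwnProperty.call(this.scopes[i], name)) {
      return this.scopes[i][name];
    }
  }
  return undefined;
};

// All visible names with the given prefix, innermost scope first.
ScopeStack.prototype.visibleNames = function (prefix) {
  var seen = Object.create(null);
  var names = [];
  for (var i = this.scopes.length - 1; i >= 0; i--) {
    Object.keys(this.scopes[i]).forEach(function (name) {
      if (name.indexOf(prefix) === 0 && !seen[name]) {
        seen[name] = true;
        names.push(name);
      }
    });
  }
  return names;
};

var stack = new ScopeStack();
stack.declare("v1", "Number");      // global variable
stack.push();                       // enter function outer
stack.declare("vOuter", "String");
stack.push();                       // enter function inner
stack.declare("vInner", "Boolean");
stack.pop();                        // leave inner

// The cursor is now inside outer but outside inner.
console.log(stack.visibleNames("v")); // [ 'vOuter', 'v1' ]
```

Because the inner scope has been popped by the time proposals are computed, vInner is not offered, which matches the behavior described above.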


Object literals


The key/values of object literals are appropriately proposed:

Even nested object literals are recognized:
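
As a rough illustration of how property names can be read off an object literal, here is a small sketch over a hand-written ObjectExpression node (again in the shape Esprima produces). The helpers describeObject and propertiesAt are hypothetical and only handle number and string literals; the plugin's real inference is richer than this.

```javascript
// Hand-written AST for the literal: { name: "pt", pos: { x: 1, y: 2 } }
var objectNode = {
  type: "ObjectExpression",
  properties: [
    { type: "Property", kind: "init",
      key: { type: "Identifier", name: "name" },
      value: { type: "Literal", value: "pt" } },
    { type: "Property", kind: "init",
      key: { type: "Identifier", name: "pos" },
      value: { type: "ObjectExpression", properties: [
        { type: "Property", kind: "init",
          key: { type: "Identifier", name: "x" },
          value: { type: "Literal", value: 1 } },
        { type: "Property", kind: "init",
          key: { type: "Identifier", name: "y" },
          value: { type: "Literal", value: 2 } }
      ] } }
  ]
};

// Hypothetical helper: map each property name to a type name,
// recursing into nested object literals.
function describeObject(node) {
  var props = {};
  node.properties.forEach(function (prop) {
    // Keys may be identifiers (name: ...) or literals ("name": ...).
    var key = prop.key.type === "Identifier" ?
      prop.key.name : String(prop.key.value);
    var value = prop.value;
    if (value.type === "ObjectExpression") {
      props[key] = describeObject(value);
    } else if (value.type === "Literal" && typeof value.value === "number") {
      props[key] = "Number";
    } else if (value.type === "Literal" && typeof value.value === "string") {
      props[key] = "String";
    } else {
      props[key] = "Object";
    }
  });
  return props;
}

// Hypothetical helper: property names available after a dotted path,
// e.g. ["pos"] stands for the text "obj.pos." in the editor.
function propertiesAt(type, path) {
  var current = type;
  path.forEach(function (segment) {
    current = (current && typeof current === "object") ?
      current[segment] : undefined;
  });
  return (current && typeof current === "object") ? Object.keys(current) : [];
}

var pointType = describeObject(objectNode);
console.log(propertiesAt(pointType, []));      // [ 'name', 'pos' ]
console.log(propertiesAt(pointType, ["pos"])); // [ 'x', 'y' ]
```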


Simple control flow


Simple control flow is recorded by the plugin, so that assignments are remembered:


Pre-defined types


Some (but not all) predefined types are available in content assist.


Currently, the plugin recognizes JSON, Math, Number, String, Boolean, and Date, but I will probably add more as it makes sense.

Constructors


Functions whose names start with a capital letter are considered constructors.
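
The naming heuristic itself is easy to sketch. The check below is an assumption about how such a test might look, not the plugin's implementation; looksLikeConstructor is a hypothetical name, and the FunctionDeclaration nodes are written by hand.

```javascript
// Hypothetical check: does the name begin with an upper-case letter?
// Characters without case (such as _ or $) do not count.
function looksLikeConstructor(name) {
  var first = name.charAt(0);
  return first !== "" &&
    first === first.toUpperCase() &&
    first !== first.toLowerCase();
}

// Hand-written nodes for: function Point() {}  and  function area() {}
var declarations = [
  { type: "FunctionDeclaration",
    id: { type: "Identifier", name: "Point" },
    params: [],
    body: { type: "BlockStatement", body: [] } },
  { type: "FunctionDeclaration",
    id: { type: "Identifier", name: "area" },
    params: [],
    body: { type: "BlockStatement", body: [] } }
];

// Keep only the declarations that would be treated as constructors.
var constructors = declarations
  .filter(function (decl) { return looksLikeConstructor(decl.id.name); })
  .map(function (decl) { return decl.id.name; });

console.log(constructors); // [ 'Point' ]
```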


Parser recovery


Finally, Andy Clement has been doing some work on making the Esprima parser recover from errors. Actually, some error recovery is already in Esprima, but we need to tweak it a bit for content assist. Hopefully, this work can be contributed back to Esprima after we have a good solution. Currently, the recovery work is focussed on errant dots. A common case is that you type a variable name followed by a '.' and expect content assist to provide all reasonable answers. Most JavaScript parsers fail at the first error, which makes them quite useless when editing code.


As you can see in the screenshot, despite all of the funky dots, the plugin is able to realize that myVar is of type Number and is providing appropriate proposals.

What it can't do (yet)


This is still early for the content assist plugin and there is quite a bit of work to do. For example:

  1. There is no pre-defined window object, which probably should be there, along with possibly other predefined objects, like dojo, dijit, and $ (jQuery).
  2. There is no analysis of function return types.
  3. There is no inter-file type inferencing, which will be crucial for getting anything really smart working.
  4. The plugin should recognize /*global */ comments
  5. The Esprima-based proposals are currently intermingled with proposals from the default JS content assist plugin and so duplicates appear. (Esprima proposals are always prefixed with a handle (Esprima) so you know where they come from, but they are always on the bottom).

I hope to deal with each of these issues eventually, but I also need to make sure that performance remains reasonable. It currently seems to be, but it is something I need to watch.

How to get it


Mark Macdonald has already added the plugin to the Orion plugin page, so after you log into Orion, click the "Get Plugins" link and select the Install link for the Esprima content assist plugin:


The GitHub page is located here: https://github.com/aeisenberg/esprimaContentAssist so try it out, have a look at the code, and let me know what you think!

9 comments:

  1. Very cool Andrew. I have your plugin installed and it's working great so far. It's a big step up from the basic content assist we had in Orion so far, and a much better base to build on.

  2. Great job! This plugin makes a lot of JavaScripters happy.
    I will try to port your Eclipse plugin to a Vim plugin.

  3. Hey,
    I'm trying to understand the parser more deeply. Where exactly is the scope calculated?
    How does the plugin know what scope the cursor is in, so that it fetches only relevant proposals?
    Regards,
    Mattan.

    Replies
    1. @Mattan, the piece that you are interested in is an AST visitor implemented in the esprimaVisitor.js file in the scripted/orion code base. This component knows how to walk a parse tree of a js file. And in the esprimaJsContentAssistPlugin.js file, there is logic that generates proper scoping information. Not sure if this answers your question, so be more specific if you need more help.

    2. Andrew,
      Thanks for the quick answer! I'll give a more specific example:
      Let's say this is the code I have:

      var test = 1;
      function gcd(a, b) {
        function abc() {
          var test = 1;
        }
        return (b === 0) ? a : gcd(b, a % b);
        // And in the next line I ask for IntelliSense for
        // anything starting with ab
        ab (*)
        // + ctrl + space
      }

      how does the parser know that line (*) is in the scope of the function gcd, leading it to propose function abc?
      Thanks again!

    3. I see what you're asking. To oversimplify, the inferencer uses a visitor pattern to visit the AST of the JS file. We use a stack to keep track of the current scope. Every time we visit a new scope, we push the scope onto the stack. When we leave the scope, it gets popped from the stack. To look up the type of a variable, we peek at the top of the scope stack to see if it contains the variable. If not, we work down through the remaining scopes in the stack until we either find the variable or have looked at every scope.

  4. ok, and what is the "range" object used for?
    Thanks again :)
    Mattan.

  5. The range is the source location of the AST node. It is a two-element array holding the start and end offsets of the node.

  6. when sometimes we press the . operator or use ctrl+space key combination in JSP/Javascript pages, Eclipse won't suggest anything but at times the java EE version does show the options. I just learnt about some plugins from eclipse content assist
