Introducing Sunlight


This article was originally published in my blog (affectionately referred to as blargh). The original blog no longer exists as I've migrated everything to this wiki.

The original URL of this post was at https://tmont.com/blargh/2011/4/introducing-sunlight. Hopefully that link redirects back to this page.

So, I wrote a syntax highlighter. Why? Because it was fun. And also because I was displeased with the inadequacy of other ones, such as SyntaxHighlighter and Prettify.

Specifically, those highlighters rely on hideously complicated regular expressions, which work in the general case, but most languages are a bit more sophisticated than just having keywords. For example, C# has contextual keywords (like get, set and value) that are keywords only when used in a certain context. Obviously a regular expression isn't going to be able to detect that kind of idiosyncrasy.

So what I did was write a syntax highlighting library that could. And I called it Sunlight.

Note how awesome this is:

Sunlight syntax highlighting example for C#

A more complete demo (including other languages) is here.

Technical Details

Sunlight behaves much more like an actual language parser. It iterates over each character of the string it's highlighting, invoking rules that convert the text into a stream of tokens. Then it analyzes the tokens, which generally just means converting them to HTML. I know that sounds slow, but it's actually a little bit faster than Prettify (in the benchmarks I did, which just compared the time each took to highlight the C# demo code; Sunlight's C# language definition is by far the most complicated).
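To make the character-by-character idea concrete, here is a minimal sketch of a rule-driven tokenizer followed by a token-to-HTML pass. It is not Sunlight's actual code: the rule signature, the token shape, the keyword list and the CSS class names are all assumptions made up for illustration.

```javascript
// Hypothetical keyword list, just enough for a small demo.
const KEYWORDS = new Set(["public", "class", "private", "int", "return"]);

const isIdentStart = (ch) => /[A-Za-z_]/.test(ch);
const isIdentPart = (ch) => /[A-Za-z0-9_]/.test(ch);

// Hypothetical rule shape: each rule looks at position `i` and returns
// a token ({ type, value }) or null if it does not apply there.
const rules = [
  // Double-quoted string (no escape handling, to keep the sketch short).
  (src, i) => {
    if (src[i] !== '"') return null;
    const end = src.indexOf('"', i + 1);
    const stop = end === -1 ? src.length : end + 1;
    return { type: "string", value: src.slice(i, stop) };
  },
  // Identifier, promoted to keyword if it is in the keyword list.
  (src, i) => {
    if (!isIdentStart(src[i])) return null;
    let j = i + 1;
    while (j < src.length && isIdentPart(src[j])) j++;
    const value = src.slice(i, j);
    return { type: KEYWORDS.has(value) ? "keyword" : "ident", value };
  },
];

// Walk the input, letting the first matching rule consume characters.
function tokenize(src) {
  const tokens = [];
  let i = 0;
  while (i < src.length) {
    let token = null;
    for (const rule of rules) {
      token = rule(src, i);
      if (token) break;
    }
    // Anything no rule claims becomes a one-character default token.
    if (!token) token = { type: "default", value: src[i] };
    tokens.push(token);
    i += token.value.length;
  }
  return tokens;
}

function escapeHtml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// The "analysis" step in this sketch: wrap each typed token in a span.
function toHtml(tokens) {
  return tokens
    .map((t) =>
      t.type === "default"
        ? escapeHtml(t.value)
        : `<span class="${t.type}">${escapeHtml(t.value)}</span>`
    )
    .join("");
}
```

Calling toHtml(tokenize(code)) yields markup that a stylesheet can color by class name. Because every rule sees the full source and the current position, a rule is free to look ahead or behind before deciding what a piece of text is.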

Each language is defined by an object that specifies a few things like keywords, operators, scopes (like strings and comments) and so forth. It also gives the option for full customization by defining your own parse rules. The C# language definition does this so that it can, for example, detect contextual keywords like get, set and value and color them appropriately.
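As a rough illustration of a language object with a custom rule, the sketch below treats get and set as keywords only when the next non-whitespace character is an opening brace. The object shape (keywords, contextualKeywords, customParseRules) and the helper functions are hypothetical and are not taken from Sunlight's real language definitions. The heuristic is also deliberately naive: it would miss auto-properties such as `get;`, for instance.

```javascript
// Hypothetical language definition; property names are illustrative only.
const csharpSketch = {
  keywords: ["public", "private", "class", "int", "return"],
  contextualKeywords: ["get", "set"],
  customParseRules: [
    // `get`/`set` count as keywords only when followed by `{`.
    function contextualAccessor(word, src, end) {
      if (!csharpSketch.contextualKeywords.includes(word)) return null;
      let j = end;
      while (j < src.length && /\s/.test(src[j])) j++;
      return src[j] === "{" ? "keyword" : null;
    },
  ],
};

// Custom rules get the first say; plain keyword lookup is the fallback.
function classifyWord(language, word, src, end) {
  for (const rule of language.customParseRules) {
    const type = rule(word, src, end);
    if (type) return type;
  }
  return language.keywords.includes(word) ? "keyword" : "ident";
}

// A regex is used here only to split out words and keep the sketch short.
function classifyAll(language, src) {
  const results = [];
  const wordPattern = /[A-Za-z_]\w*/g;
  let match;
  while ((match = wordPattern.exec(src)) !== null) {
    const end = match.index + match[0].length;
    results.push([match[0], classifyWord(language, match[0], src, end)]);
  }
  return results;
}
```

With this sketch, the get in `private int get;` is classified as an identifier, while the get in `get { return get; }` is a keyword the first time and an identifier the second.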

Sunlight also has a fairly easy way to detect what I called "name idents" (e.g. class names) and color them appropriately. You can also write rules to indicate when an identifier should be "named" (like, say, when it comes after the new keyword).
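The "comes after the new keyword" rule could look something like the sketch below, which runs over an already tokenized stream. The token shape, the rule signature and the "named-ident" type name are assumptions for illustration and not Sunlight's actual API.

```javascript
// Hypothetical rule: an identifier is "named" when the previous
// non-whitespace token is the keyword `new`.
function followsNewKeyword(tokens, index) {
  for (let i = index - 1; i >= 0; i--) {
    const t = tokens[i];
    // Skip whitespace-only default tokens while looking backwards.
    if (t.type === "default" && t.value.trim() === "") continue;
    return t.type === "keyword" && t.value === "new";
  }
  return false;
}

// Re-type every identifier that satisfies at least one rule.
function markNamedIdents(tokens, rules) {
  return tokens.map((token, index) =>
    token.type === "ident" && rules.some((rule) => rule(tokens, index))
      ? { ...token, type: "named-ident" }
      : token
  );
}
```

Keeping each rule as a small predicate over (tokens, index) means more of them can be added to the list without touching the marking pass, such as one for the identifier that follows the class keyword.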

Usage

Usage is simple, and pretty similar to the other syntax highlighters. Include the JavaScript file(s) and a CSS file for styling, then wrap your block of code in an element with a special class name, like so:

html
<pre class="sunlight-highlight-csharp">public class MyClass {
  private int get;
  private int set;
  private int value;

  public int Property {
    get { return get; }
    set {
      set = value;
    }
  }
}</pre>

And then call Sunlight.highlightAll(); and the rest is magic.
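One way to make sure that call runs only after the markup exists is to wait for the DOM to finish loading. The helper below is a sketch: highlightWhenReady is a made-up name, and the only Sunlight call it relies on is Sunlight.highlightAll() as mentioned above.

```javascript
// Hypothetical helper: run highlightAll() once the document is parsed.
function highlightWhenReady(doc, sunlight) {
  if (doc.readyState === "loading") {
    // Still parsing: defer until the DOM is ready.
    doc.addEventListener("DOMContentLoaded", () => sunlight.highlightAll());
  } else {
    // Already parsed: highlight right away.
    sunlight.highlightAll();
  }
}

// In a browser page that has already loaded the Sunlight script:
if (typeof document !== "undefined" && typeof Sunlight !== "undefined") {
  highlightWhenReady(document, Sunlight);
}
```

Passing the document and the highlighter in as arguments is only there to keep the helper easy to try out with stand-in objects outside a browser.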

Sunlight is capable of many other magical things, which are spelled out in some tediously detailed documentation.