Syntax Coloring for Blog Posts
If you were reading Objective-C 2.0 Tutorial: Part II yesterday, you might have noticed the first signs of syntax coloring starting to break through the surface. It looked a bit weird for a few hours, but I think the final result was worth it. It turns out this wasn't so easy to do.
I had punted on syntax coloring for a while, but it really felt like something was missing when the Objective-C 2.0 posts came up. With so much code, it really helps to have visual cues. So I sat down to take a look at this. I knew of three possible strategies:
1. JavaScript-based parsing/coloring engine
2. Server-side parsing/coloring engine
3. TextMate HTML generation
The JavaScript solution does exist as a Google Code project called google-code-prettify. It looks nice and simple, but I couldn't get anything to happen with Objective-C source. There's also CodeHighlighter, which I found on Joe Maller's post on the subject, but there's no Objective-C support here, either.
Server-side parsing. Do I want to write an Objective-C parser? Nope. Moving on.
The TextMate solution, it turns out, produces stunningly beautiful results. The TextMate bundle (yes, there's a bundle for TextMate itself) has a few interesting commands in this area:
Create CSS from Theme
Create HTML from Document / Selection
Create HTML from Document / Selection with Lines
The "Create CSS from Theme" command converts your current TextMate color theme into a CSS file. And it works well. Really well. I actually thought the first output test was the original TextMate window.
The "HTML from Document" command converts TextMate's document structure into HTML, adds span tags with appropriate CSS class names for styling, then slaps the matching CSS at the top, producing a free-standing web page with the properly-styled document contents.
This all works because TextMate exposes the document contents to bundles in the form of scopes. Scopes are cascading selectors (much like CSS, in fact), which is why you can write a command that applies to all types of source, or just to, say, Objective-C. It's also why the "objacc" tab trigger generates different results in an @interface block than in an @implementation block.
You can see what the scope is for any given block of text by selecting the text and choosing "Show Scope" from the "Bundle Development" bundle (Control-Shift-P, by default):
An NSString type in the context of a property declaration, for example, has these scopes:
source.objc
meta.interface-or-protocol.objc
meta.scope.interface.objc
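To make the cascading idea concrete, here is a minimal Ruby sketch of how a selector could be checked against a stack of scopes like the one above. This is a deliberate simplification, not TextMate's actual matching code: it only does dot-boundary prefix matching, and the method names selector_matches_scope? and selector_matches_stack? are made up for illustration.

```ruby
# Simplified sketch: a selector matches a scope if it is the same name or a
# dot-separated prefix of it. Real TextMate selectors support more than this.
def selector_matches_scope?(selector, scope)
  scope == selector || scope.start_with?(selector + '.')
end

# An element carries a stack of scopes; the selector applies if any one matches.
def selector_matches_stack?(selector, scope_stack)
  scope_stack.any? { |scope| selector_matches_scope?(selector, scope) }
end

stack = [
  'source.objc',
  'meta.interface-or-protocol.objc',
  'meta.scope.interface.objc'
]

puts selector_matches_stack?('source', stack)       # true: applies to all source
puts selector_matches_stack?('source.objc', stack)  # true: applies to Objective-C only
puts selector_matches_stack?('source.ruby', stack)  # false
```

A command scoped to "source" would apply here, and so would one scoped to "source.objc", while one scoped to another language would not.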
The syntax coloring themes use scopes for styling text. So the "Create HTML from Document" command really does three things:
1. Converts scope names in the theme to CSS class selectors
2. Converts the theme attributes to CSS properties
3. Converts the document structure to HTML with scope names converted to CSS class names
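The name conversion in those steps can be sketched in a few lines of Ruby. This is not the code from the actual generator. It is a guess at the mapping based on the class names that show up in the generated HTML, and the method names scope_to_css_classes and theme_selector_to_css are hypothetical.

```ruby
# Sketch: each dot-separated prefix of a scope becomes its own CSS class,
# with dots replaced by underscores (inferred from the generated output).
def scope_to_css_classes(scope)
  parts = scope.split('.')
  (1..parts.length).map { |n| parts.first(n).join('_') }
end

# Sketch: a theme's scope selector becomes a CSS class selector (assumed form).
def theme_selector_to_css(selector)
  '.' + selector.tr('.', '_')
end

puts scope_to_css_classes('support.class.cocoa').join(' ')
# prints: support support_class support_class_cocoa
puts theme_selector_to_css('support.class')
# prints: .support_class
```

Because every prefix gets its own class, a theme rule for a general scope still applies to elements with a more specific scope.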
This works great for one-off documents, but the catch is that the HTML generator applies all scopes for all elements in the document in the form of span tags, even those which are not styled by the current theme. The result is a lot of extra span tags and class names, many of which are never used for styling.
There is immense scripting potential with all of that metadata inline, but for my purposes, I just wanted something a bit more streamlined. The great thing about TextMate bundles is that you can go in and muck around with them yourself. So I did.
The HTML generator was created by Brad Choate. The script is written in Ruby and resides in the TextMate app package:
/Applications/TextMate.app/Contents/SharedSupport/
Bundles/TextMate.tmbundle/Support/lib/doctohtml.rb
The script is in the Support folder, so you can't edit it directly from the Bundle Editor. You can still open it directly, of course. Make a backup of the original before doing so.
I'm not a Ruby expert, but I knew just enough to hack something together. I changed the script to build up a list of scopes that the current theme has styling information for, and only output CSS class names for those selectors. In an extreme case, the result is going from this:
<span class="support support_class support_class_cocoa">NSString</span>* director;director = <span class="meta meta_bracketed meta_bracketed_objc"><span class="punctuation punctuation_section punctuation_section_scope punctuation_section_scope_objc">[</span><span class="meta meta_bracketed meta_bracketed_objc"><span class="punctuation punctuation_section punctuation_section_scope punctuation_section_scope_objc">[</span><span class="meta meta_bracketed meta_bracketed_objc"><span class="punctuation punctuation_section punctuation_section_scope punctuation_section_scope_objc">[</span>movie <span class="meta meta_function-call meta_function-call_objc"><span class="support support_function support_function_any-method support_function_any-method_objc">director</span></span><span class="punctuation punctuation_section punctuation_section_scope punctuation_section_scope_objc">]</span></span> <span class="meta meta_function-call meta_function-call_objc"><span class="support support_function support_function_any-method support_function_any-method_objc">fullName</span></span><span class="punctuation punctuation_section punctuation_section_scope punctuation_section_scope_objc">] (snipped for length)
To this:
<span><span class="support_class">NSString</span>* director;director = <span><span>[</span><span><span>[</span><span><span>[</span>movie <span><span class="support_function">director</span></span><span>]</span></span> <span><span class="support_function">fullName</span></span><span>]</span></span> <span><span class="support_function">capitalizedString</span></span><span>]</span></span>;<span class="support_type">NSUInteger</span> movieTitleLength;movieTitleLength = <span><span>[</span><span><span>[</span>movie <span><span class="support_function">title</span></span><span>]</span></span> <span><span class="support_function">length</span></span><span>]</span></span>;</span>
And the final result looks identical.
I couldn't eliminate all of the unnecessary span tags because the converter doesn't build up a structure — it just runs straight through and outputs HTML as it encounters each element. If I left out the opening tags, there'd be a lot of closing tags with no counterpart. This really isn't a criticism, though. I understand the command was not designed to be the ultimate HTML generator. I sent Brad an email about this, so we'll see what happens.
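The filtering idea looks roughly like the Ruby sketch below. This is not my actual patch to doctohtml.rb. It assumes the theme's styled selectors have already been collected into a set of CSS class names, and the names open_span and themed_classes are made up for illustration.

```ruby
require 'set'

# Sketch: each dot-separated prefix of a scope becomes a CSS class name.
def scope_to_css_classes(scope)
  parts = scope.split('.')
  (1..parts.length).map { |n| parts.first(n).join('_') }
end

# Emit the opening tag for an element, keeping only class names the theme
# styles. The tag itself is always written so every closing tag has a partner.
def open_span(scope, themed_classes)
  kept = scope_to_css_classes(scope).select { |name| themed_classes.include?(name) }
  kept.empty? ? '<span>' : %(<span class="#{kept.join(' ')}">)
end

# Assumed set of class names that the current theme has styling for.
themed_classes = Set['support_class', 'support_function', 'support_type']

puts open_span('support.class.cocoa', themed_classes)
# prints: <span class="support_class">
puts open_span('punctuation.section.scope.objc', themed_classes)
# prints: <span>
```

The bare span for the unstyled punctuation scope is the leftover described above: since the output is written in a single pass, dropping the opening tag would orphan its closing tag.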
The great thing about this generator, though, is that it generates output specific to the currently-selected theme. So you can generate multiple CSS files, load them into the same document, and have multiple blocks of code displayed using different themes for contrast. For example, client-side code can be displayed with a lighter-colored theme, and the server-side code is in a darker theme.
You can edit themes using the built-in theme editor:
You can also add coloring for scopes which are not already styled in the current theme. I've added custom colors for project-specific symbols in the past by editing the language bundle and adding coloring for them in the theme editor. I consider this a huge productivity gain.
I'm happy to release my modifications, but they're really hacky and I'm not sure what the license requirements are. I'll post an update if I figure it out with Brad.
Posted Nov 5, 2007 — 15 comments below
Matthew Flanagan — Nov 05, 07 4998
I use Pygments to do syntax highlighting in my blog. It is a python module and script that works on any platform that supports python. It has support for an impressive list of languages and other markup including Objective-C (not sure about 2.0).
Scott Stevenson — Nov 06, 07 5000
Pretty slick. I could probably use this instead. Everything else here is PHP, but I guess it would be easy enough to call out into Python.
Michael Sheets — Nov 06, 07 5001
Ciarn Walsh — Nov 06, 07 5002
If you modified the file in the application package directly don’t forget to move the bundle to /Library/Application Support/TextMate/Bundles/ so that your changes aren’t overwritten when TextMate is updated.
stubblechin — Nov 06, 07 5004
Magnus Nordlander — Nov 06, 07 5006
Dominik Wagner — Nov 06, 07 5008
Scott Stevenson — Nov 06, 07 5009
I haven't messed around with the color much, but one nice thing it does is make Cocoa class names clickable with links to the official documentation. Wow. Very nice touch.
pavel — Dec 11, 07 5190
i have no ruby skills :(
Alex — Jun 09, 08 6057
Donavan — Jun 30, 08 6126
Glenn — Sep 07, 08 6351
Not to put too fine a point on it, but...I don't think so.
The first example formats:
NSString* director;director = [[[movie director] fullName]...
while the second formats:
NSString* director;director = movie.director.fullName...
I don't know if all of those punctuation... classes would still be there if the first example used dot notation.
Scott Stevenson — Sep 07, 08 6356
Yeah, it looks like you're right. I pasted the wrong text into the second example. The default output is still far more verbose, though, so the main point still applies.
Pete K — Feb 04, 09 6606
I use Snipt (http://snipt.net) now for hosted code coloring, and it's pretty damn spiffy.
http://pkarl.com/articles/introduction-git-version-control-you-and-me-os-x-1/
I just had to paste my code into a box & pick a syntax. They give you an embeddable script tag.
Oli — Jun 29, 09 6820
Thanks in advance!