I’ve written about Markdown before. Today I had another opportunity to be reminded of how truly great and useful it is.
Markdown is two things: a simple page formatting syntax and a Perl script that converts that markup to HTML. The script runs in either BBEdit or TextWrangler. It is also offered as a plugin for a number of blogs and CMSs.
Much of the markup that Markdown uses is implicit. Two carriage returns automatically insert paragraph tags. Adding a couple of spaces at the end of a line creates a br tag.
The rest is simple. Add a # in front of a line and Markdown wraps that line in h1 tags. Two (##) make h2 tags, three (###) make h3 tags and so on. A * in front of a line makes a bulleted list item, a > creates a blockquote and a numbered list is automatically turned into an HTML numbered list. You get the idea. A complete syntax guide is available at the site of the author, John Gruber, though fewer than 10 tags handle the common formatting chores. This is simple, folks.
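To make those rules concrete, here is a rough sketch of them in Python. It is not Gruber's Perl script and it covers only the handful of rules mentioned above (blank-line paragraphs, two-space line breaks, # headings, > blockquotes and simple lists); the function name `tiny_markdown` is made up for this illustration.

```python
import html
import re

ORDERED = re.compile(r"\d+\.\s+")   # "1. item"
BULLET = re.compile(r"[*+-]\s+")    # "* item"


def _list(lines, tag, marker):
    """Wrap each line in <li>, dropping the leading list marker."""
    items = "".join(
        "<li>" + html.escape(marker.sub("", line, count=1)) + "</li>"
        for line in lines
    )
    return f"<{tag}>{items}</{tag}>"


def tiny_markdown(text):
    """Convert a small, illustrative subset of Markdown to HTML."""
    out = []
    # A blank line (two returns) separates blocks.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.split("\n")
        heading = re.match(r"(#{1,6})\s+(.*)", lines[0])
        if heading and len(lines) == 1:
            level = len(heading.group(1))  # number of # marks = heading level
            out.append(f"<h{level}>{html.escape(heading.group(2))}</h{level}>")
        elif all(line.startswith(">") for line in lines):
            body = " ".join(re.sub(r"^>\s?", "", line).strip() for line in lines)
            out.append(f"<blockquote><p>{html.escape(body)}</p></blockquote>")
        elif all(ORDERED.match(line) for line in lines):
            out.append(_list(lines, "ol", ORDERED))
        elif all(BULLET.match(line) for line in lines):
            out.append(_list(lines, "ul", BULLET))
        else:
            pieces = []
            for i, line in enumerate(lines):
                # Two trailing spaces before a return become a <br />.
                hard_break = line.endswith("  ") and i < len(lines) - 1
                pieces.append(html.escape(line.strip()) + ("<br />" if hard_break else ""))
            out.append("<p>" + "\n".join(pieces) + "</p>")
    return "\n".join(out)
```

Feeding it "# Title" followed by a blank line and "Some text" gives back an h1 and a paragraph. The real script handles far more (emphasis, links, nested lists, code), so treat this as a picture of the idea and nothing else.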
With Markdown, writing semantically correct HTML in a text editor is as fast as, or faster than, using the menus in Word or TextEdit to format the text for printing. It is almost too cool for words.
Don’t get me started with the idea of composing in TextWrangler with Markdown, then copying the resulting page out of Safari into TextEdit. That can actually be faster than trying to create a structured document in TextEdit using styles and the mouse, especially if you want to include hyperlinks in the document.
The great thing about Markdown is that it is drop-dead simple to format a page for HTML while you write. It makes Dreamweaver seem glacial by comparison. But that’s pretty much the case whenever a function can be accessed via the keyboard. WYSIWYG editing is wonderful for its short learning curve but, once learned, a markup protocol like Markdown is so much faster.
Today, I got a Word document that I needed to turn into a website. It is a novel, about 450 pages long, that will end up as about 20 web pages, one per chapter.
I broke the file down, one new document per chapter. The initial plan was to use Word’s HTML export function then clean up the pages in Dreamweaver, using its clean Word HTML feature. What a pain.
Section one was 69 pages in Word. That exported to a 1.3 MB file, a bit big for quick loading. Cleaning up the junk that Word threw into the page helped quite a bit. The new file was only 800 KB or so. That’s a lot smaller but not even in the ballpark for a reasonable load time. Add to that the fact that Dreamweaver was molasses-slow with a file that size, and the thought of trying to continue this way was not appealing.
Next I tried the Word HTML cleaner function of Tidy. It never even got through the file, just died on me. That was both with the standalone program and with the BBEdit plugin BBTidy. Not looking good.
Just for reference, Word had decided that the page needed 29 separate classes and thousands of font and span tags. No wonder the file was so large. Back to square one.
Looking at the document, there didn’t need to be more than three levels of styled headings, two paragraph styles, a blockquote and styled pre tags for some embedded poetry. That’s really a very simple document. It turned out that the easiest solution was to use Markdown in Word, then copy and paste the file into BBEdit for some search and replace.
It had to start in Word, or another rich text editor. The big problem was that the author used a lot of italics, and that needed to be addressed somehow. The italics would have been lost if the contents had been moved into BBEdit first. So the initial step went through Word. Fortunately, most of the italicized words were foreign, either Serbian or German, and there was a glossary included. So an hour of global search and replace put about half the italics (em) Markdown syntax in place. Another three hours of hand editing got the rest. But the worst part was over.
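For anyone who would rather script that glossary pass, here is a minimal sketch of the idea in Python. I did the real thing with Word's search and replace, so this is an after-the-fact illustration: the function name `mark_italics` and the sample words are invented, not taken from the novel, and it assumes the glossary is available as a plain list of terms.

```python
import re


def mark_italics(text, glossary_terms):
    """Wrap each glossary term in *asterisks*, Markdown's syntax for <em>."""
    if not glossary_terms:
        return text
    # Longest terms first, so a multiword entry wins over a shorter one inside it.
    terms = sorted(glossary_terms, key=len, reverse=True)
    alternatives = "|".join(re.escape(term) for term in terms)
    # The lookarounds skip partial-word hits and terms already wrapped in asterisks.
    pattern = re.compile(r"(?<![\w*])(" + alternatives + r")(?![\w*])")
    return pattern.sub(r"*\1*", text)


# Hypothetical sample terms, standing in for the book's glossary.
print(mark_italics("They drank rakija and praised the Gemütlichkeit.",
                   ["rakija", "Gemütlichkeit"]))
```

A pass like this only catches words that are in the glossary, which is why about half the italics still had to be found by hand.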
There was some more search and replace in BBEdit to get the rest of the Markdown syntax in place, mostly making sure that there was the proper number of line breaks. This was also an opportunity to place a “go to page top” link between sections.
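The same two chores can be sketched as regular expressions in Python. This is an approximation of what I did by hand in BBEdit, not a record of the actual patterns. It assumes each section starts with a ## heading, and the link text and `#top` anchor are placeholders.

```python
import re

TOP_LINK = "[Go to page top](#top)"  # hypothetical link text and anchor


def tidy_breaks(text):
    """Normalize line endings and leave exactly one blank line between blocks."""
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Any run of blank (or whitespace-only) lines collapses to a single blank line.
    # Trailing spaces on text lines are left alone, since two of them mean <br />.
    return re.sub(r"\n(?:[ \t]*\n)+", "\n\n", text)


def add_top_links(text, level=2):
    """Insert TOP_LINK before every section heading except the first."""
    marker = "#" * level + " "
    out, seen_section = [], False
    for block in text.split("\n\n"):
        if block.startswith(marker):
            if seen_section:
                out.append(TOP_LINK)  # closes the previous section
            seen_section = True
        out.append(block)
    return "\n\n".join(out)
```

Running `add_top_links(tidy_breaks(chapter_text))` on a chapter then leaves one link between each pair of sections, ready for the Markdown script.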
With the pages marked up, running the Markdown script only took a few seconds for each page. There were now twenty long pages of text, 99% web ready, with all the tags and “go to page top” links in place, ready to paste into the template pages.
Since the page templates were quite simple, it only took another hour and a half to get the content pages, home page and TOC ready for the book. Then it was just copy and paste 20 times and the site was up. None of the finished pages exceeded 125 KB in size, which is less than a tenth of what Word was able to produce.
That was one work day to convert 180,000+ words into a complete web site with a total of 22 pages: 20 pages of content, a cover page and a contact page. That included over 400 internal links. Score one for Markdown (and BBEdit’s regular expression search and replace engine).