David Eisinger


Keep Markdown Links in Order With mdrenum

Posted 2023-11-15

I write all these posts in Markdown, and I tend to include a lot of links. I use numbered reference-style links and I like the numbers to be in sequential order. (Here’s the source of this post to see what I mean.) I wrote a Ruby script to automate the process of renumbering links when I add a new one, and as mentioned in last month’s dispatch, I spent some time iterating on it to work with some new posts containing code blocks that I’d imported into my Elsewhere section.
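To make the renumbering idea concrete, here is a minimal TypeScript sketch of a regular-expression approach. It is not the Ruby script mentioned above. The function name `renumberLinks` is made up for illustration, and the sketch assumes every reference looks like `[text][N]`, every definition looks like `[N]: url` at the start of a line, and links should be numbered in order of first use.

```typescript
// Naive renumbering of numeric reference-style links with regular expressions.
// Assumption: references are written as [text][N] and definitions as "[N]: url".
function renumberLinks(markdown: string): string {
  const newNumber: Record<string, string> = {};
  let next = 1;

  // Pass 1: assign new numbers in order of first use in the text.
  const reference = /\]\[(\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = reference.exec(markdown)) !== null) {
    const old = match[1];
    if (!(old in newNumber)) {
      newNumber[old] = String(next++);
    }
  }

  // A definition that is never referenced keeps its old number, which may
  // collide with a new one; a real tool would have to handle that case.
  const lookup = (old: string): string =>
    old in newNumber ? newNumber[old] : old;

  // Pass 2: rewrite the references, then the definitions at line starts.
  return markdown
    .replace(/\]\[(\d+)\]/g, (_m: string, old: string) => "][" + lookup(old) + "]")
    .replace(/^\[(\d+)\]:/gm, (_m: string, old: string) => "[" + lookup(old) + "]:");
}

const draft =
  "See [foo][3] and [bar][1], then [foo][3] again.\n\n" +
  "[1]: https://example.com/bar\n" +
  "[3]: https://example.com/foo\n";

console.log(renumberLinks(draft));
```

This version only changes the numbers; it does not reorder the definition lines. It also has no idea what a code block is, so a line of code such as `rows[i][0]` matches the same pattern and gets rewritten too.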

As I was working on the script, it was pretty easy to think of cases in which it would fail – it can handle fenced code blocks, for example, but not indented ones (set off by four leading spaces). I thought it’d be cool to build something in Go that uses a proper Markdown parser instead of regular expressions. This might strike you as an esoteric undertaking, but as Dr. Drang put it when he embarked on a similar journey:

“But there is an attraction to putting everything in apple pie order, even when no one but me will ever see it.”

First Attempts with Go

My very first attempt involved the gomarkdown package. It was super straightforward to turn a Markdown document into an AST, but after an hour or so of investigation, it was pretty clear that I wasn’t going to be able to get the original text and position of the links. I switched over to goldmark, which is what this website uses to turn Markdown into HTML. This seemed a lot more promising – it has functions for retrieving the content of nodes, as well as start and stop attributes that indicate position in the original text. I thought I had it nailed, but as I started writing tests, I realized there were certain cases where I couldn’t perfectly locate the links – two links smashed right up against one another, as an example. I spent a long time trying to come up with something that covered all the weird edge cases, but eventually gave up in frustration.

Both of these libraries are built to take Markdown, parse it, and turn it into HTML. That’s fine, that’s what Markdown is for, but for my use case, they came up short. I briefly considered forking goldmark to add the functionality I needed, but instead decided to look elsewhere.

A Promising JavaScript Library

I searched for generic Markdown/AST libraries just to see what else was out there, and a helpful Stack Overflow comment led me to mdast-util-from-markdown, a JavaScript library for working with Markdown without a specific output format. I pulled it down and ran the example code, and it was immediately obvious that it would provide the data I needed.
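The data in question is positional: mdast nodes carry a `position` with start and end offsets into the original text. The sketch below shows why that matters, using hand-written nodes shaped like mdast's `linkReference`. The offsets were typed in by hand rather than produced by the library, the interfaces are cut-down stand-ins, and `relabel` is a hypothetical helper, not mdrenum's actual code.

```typescript
// Hand-written stand-ins for the unist/mdast shapes; only the fields used here.
interface Point { line: number; column: number; offset: number }
interface Position { start: Point; end: Point }
interface LinkReference {
  type: "linkReference";
  identifier: string; // the label, e.g. "2" in [the docs][2]
  referenceType: "full" | "collapsed" | "shortcut";
  position: Position;
}

// Rewrite the label of each full reference ([text][label]) using its offsets.
function relabel(
  source: string,
  refs: LinkReference[],
  labels: Record<string, string>
): string {
  // Apply edits from the last node to the first so earlier offsets stay valid.
  const lastFirst = refs
    .slice()
    .sort((a, b) => b.position.start.offset - a.position.start.offset);

  let result = source;
  for (const ref of lastFirst) {
    // Collapsed ([label][]) and shortcut ([label]) forms are skipped in this sketch.
    if (ref.referenceType !== "full" || !(ref.identifier in labels)) continue;
    const { start, end } = ref.position;
    const original = result.slice(start.offset, end.offset);
    // The label is the last bracketed group in the node's source text.
    const linkText = original.slice(0, original.lastIndexOf("["));
    result =
      result.slice(0, start.offset) +
      linkText + "[" + labels[ref.identifier] + "]" +
      result.slice(end.offset);
  }
  return result;
}

const sampleSource = "Read [the docs][2] and [the code][1].";
const sampleRefs: LinkReference[] = [
  {
    type: "linkReference",
    identifier: "2",
    referenceType: "full",
    position: {
      start: { line: 1, column: 6, offset: 5 },
      end: { line: 1, column: 19, offset: 18 },
    },
  },
  {
    type: "linkReference",
    identifier: "1",
    referenceType: "full",
    position: {
      start: { line: 1, column: 24, offset: 23 },
      end: { line: 1, column: 37, offset: 36 },
    },
  },
];

console.log(relabel(sampleSource, sampleRefs, { "2": "1", "1": "2" }));
```

Because each node says exactly where it begins and ends, the rewrite touches only the link itself and never has to guess whether nearby brackets belong to code or prose.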

But now I had a new problem: I like JavaScript (and especially TypeScript) just fine, but I find the ecosystem around it bewildering, and furthermore, most of it is tailored for delivering complex functionality to browsers, not distributing simple command-line programs. I even went so far as to investigate using AI to convert the JS code to Go; the solution I found has some pretty severe character limitations, but I wonder if seamlessly converting code written in one language to another will be a thing in five years.

New JS Runtimes to the Rescue

On a whim, I decided to check out Deno, a newer alternative to Node.js for server-side JS. Turns out it has the ability to compile JS into standalone executables. I downloaded it and ran it against the example code, and it worked! I got a (rather large) executable with the same output as running my script with Node. A coworker recommended I check out Bun, which has a similar compilation feature – it worked just as well, and the resulting executable was about a third the size of Deno’s, so I opted to go with that.

Once I had a working proof-of-concept and a toolchain I was happy with, the rest was all fun; writing recursive functions that work with tree structures to do useful work is extremely my shit (here’s an old post I wrote about The Little Schemer along these same lines). I added Jest and pulled in all my Go tests, as well as Prettier to stand in for gofmt. I wrapped things up earlier this week and published the result, which I’ve imaginatively called mdrenum, to GitHub.
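For a flavor of the recursive tree work, here is a small sketch that walks an mdast-like tree depth-first and numbers reference labels in order of first appearance. The `MdNode` type is a simplified stand-in, the tree is written by hand, and `collectReferences` and `buildNumbering` are illustrative names; mdrenum itself may be organized quite differently.

```typescript
// Simplified stand-in for an mdast node: a type, an optional reference
// identifier, and optional children.
interface MdNode {
  type: string;
  identifier?: string;
  children?: MdNode[];
}

// Depth-first walk that records each linkReference label the first time it appears.
function collectReferences(node: MdNode, found: string[] = []): string[] {
  if (node.type === "linkReference" && node.identifier !== undefined) {
    if (found.indexOf(node.identifier) === -1) {
      found.push(node.identifier);
    }
  }
  for (const child of node.children ?? []) {
    collectReferences(child, found);
  }
  return found;
}

// Map each old label to its new sequential number (1, 2, 3, ...).
function buildNumbering(root: MdNode): Record<string, string> {
  const numbering: Record<string, string> = {};
  collectReferences(root).forEach((label, index) => {
    numbering[label] = String(index + 1);
  });
  return numbering;
}

const tree: MdNode = {
  type: "root",
  children: [
    {
      type: "paragraph",
      children: [
        { type: "linkReference", identifier: "3", children: [{ type: "text" }] },
        {
          type: "emphasis",
          children: [
            { type: "linkReference", identifier: "1", children: [{ type: "text" }] },
          ],
        },
        { type: "linkReference", identifier: "3", children: [{ type: "text" }] },
      ],
    },
    {
      type: "blockquote",
      children: [
        {
          type: "paragraph",
          children: [
            { type: "linkReference", identifier: "7", children: [{ type: "text" }] },
          ],
        },
      ],
    },
    // Definitions are not references, so the walk ignores them.
    { type: "definition", identifier: "1" },
    { type: "definition", identifier: "3" },
    { type: "definition", identifier: "7" },
  ],
};

console.log(collectReferences(tree)); // labels in order of first appearance
```

The walk finds links nested inside emphasis and blockquotes without any special cases, which is the appeal of working on the tree instead of the raw text.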

Bun (compiler) + TypeScript (type checking) + Prettier (code formatting) gets me most of what I like about working with Go, and I’m excited to use this tech in the future. The resulting executable is big (~45MB, as compared with ~2MB for my Go solution), but, hey, disk space is cheap and this actually works.

Integrating with Helix

I’ve been a happy Helix user for the last several months, and I thought it’d be cool to configure it to automatically renumber links every time I save a Markdown file. The docs do a good job explaining how to add a language-specific formatter:

“The formatter for the language, it will take precedence over the lsp when defined. The formatter must be able to take the original file as input from stdin and write the formatted file to stdout”

This was simple to add to the program, and then I added the following to ~/.config/helix/languages.toml:

[[language]]
name = "markdown"
auto-format = true
formatter = { command = "mdrenum", args = ["--stdin"] }

This totally works, and I’ll say that it’s uniquely satisfying to save a document and see the link numbers get instantly reordered properly. I’ve done it probably 100 times in the course of writing this post.


Thanks for coming on this journey with me, and if this seems like a tool that might be useful to you, grab it from GitHub and open an issue if you have any questions.

