1 July 2024

Using string similarity to allow loose link hrefs in a blog

One of the main properties I aim for when architecting my blog is ease of authoring. That includes being able to drop links into pages without looking up the exact slug of the target page.

There are a few components to this. One is having multiple ways to create a link, depending on whether it’s external and how accurate the href needs to be. For this purpose I have:

  • a Link component that accepts either a to prop (a slug) or a page prop (an exact page object; all pages have access to the entire site map as a Svelte context object, so this could be e.g. <Link page={site.bySlug["some-page"]}/>)

  • remark-wiki-link with:

    • Slugifying the href, so I can write [[learning rust 1]] without the dashes

    • A list of aliases, e.g. lfb-mit for Lisa Feldman Barrett’s talk at MIT.

    • Slugs that aren’t aliases resolve to /r/[slug] (see Redirecting below)

  • MDSveX set up to automatically hyperlink bare URLs
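
As a rough illustration of the wiki-link behaviour described above, here is a minimal sketch of slugifying a link name and checking it against an alias table. The function names, the alias target slug, and the choice to send aliases to /p/[slug] are all assumptions made for the example, not code from the blog. Functions like these could presumably be wired into remark-wiki-link’s pageResolver and hrefTemplate options.

```typescript
// Hypothetical alias table: shorthand -> canonical slug.
// The target slug here is invented purely for illustration.
const aliases = new Map<string, string>([
  ["lfb-mit", "lisa-feldman-barrett-mit-talk"],
]);

// Lowercase the text, collapse every run of non-alphanumeric
// characters into a single dash, then trim dashes from both ends.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Aliases point straight at a known page (assumed to live under /p/);
// everything else is handed to the fuzzy /r/ redirect route.
function resolveWikiHref(name: string): string {
  const slug = slugify(name);
  const target = aliases.get(slug);
  return target !== undefined ? `/p/${target}` : `/r/${slug}`;
}
```

With this sketch, resolveWikiHref("learning rust 1") gives /r/learning-rust-1, while resolveWikiHref("lfb-mit") goes directly to the aliased page.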

Redirecting

I got this idea from a Hacker News thread: https://news.ycombinator.com/item?id=40641598. There’s some interesting discussion there on whether it’s a good idea or not, but I like it for now. I’m less bothered about SEO than I am about people being able to find my posts when they’ve been linked under different routing schemes in older versions of my blog.

I thought it would be nice to have a specific route for this, so I used /r/[approximate-slug] to say “redirect me to the page with the most similar slug to [approximate-slug]”. Wiki-style links that don’t point to an alias are transformed to /r/${slugify(href)}.

This means that if I know a post is the first in a series on learning Rust, I can probably just type [[rust 1:Learning Rust #1]] to link to it.
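
To make “most similar slug” concrete, here is one possible way to pick the closest match. The post doesn’t name a similarity measure, so this sketch assumes the Sørensen–Dice coefficient over character bigrams; the function names and the sample slugs are hypothetical.

```typescript
// Count each pair of adjacent characters in the string.
function bigrams(s: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (let i = 0; i < s.length - 1; i++) {
    const pair = s.slice(i, i + 2);
    counts.set(pair, (counts.get(pair) ?? 0) + 1);
  }
  return counts;
}

// Sørensen–Dice coefficient on bigrams: 2 * shared / (total in a + total in b).
// Returns a score between 0 (nothing shared) and 1 (identical).
function similarity(a: string, b: string): number {
  if (a === b) return 1;
  if (a.length < 2 || b.length < 2) return 0;
  const left = bigrams(a);
  const right = bigrams(b);
  let shared = 0;
  left.forEach((count, pair) => {
    shared += Math.min(count, right.get(pair) ?? 0);
  });
  return (2 * shared) / (a.length - 1 + (b.length - 1));
}

// Return the known slug with the highest score, or undefined if
// there are no slugs at all. Ties go to the earliest slug in the list.
function mostSimilarSlug(query: string, slugs: string[]): string | undefined {
  let best: string | undefined;
  let bestScore = -1;
  for (let i = 0; i < slugs.length; i++) {
    const score = similarity(query, slugs[i]);
    if (score > bestScore) {
      best = slugs[i];
      bestScore = score;
    }
  }
  return best;
}
```

Given the hypothetical slugs about, learning-rust-1 and learning-rust-2, the query rust-1 shares five bigrams with learning-rust-1 and only four with learning-rust-2, so the first post in the series wins. Other measures, such as edit distance, would work too; the trade-off is mostly in how they treat short queries against long slugs.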

No 404s

I never know quite how my site is going to be treated in Vercel’s deployments, or even whether it will have a backend or be statically generated. From experience messing around with different deployment settings and having things break, I’ve learnt that it’s good to take a belt-and-braces approach to 404s.

There are two elements to this:

  • The main /p/[slug] route does a server-side redirect to the most similar slug if the requested one isn’t found.

  • The +error.svelte does a client-side redirect to /r/[slug] for 404s.
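
The decision logic behind these two fallbacks could look something like the sketch below. It is written as plain functions so it stays independent of any framework; the names are hypothetical, the matcher is passed in rather than implemented, and the guard against redirecting from /r/ back to /r/ is an added assumption rather than something the post describes. In SvelteKit, the returned paths would presumably be handed to the redirect helper on the server and to goto on the client.

```typescript
// Any function that picks the closest known slug for a query.
type Matcher = (query: string, slugs: string[]) => string | undefined;

// Server side, for /p/[slug]: return null when the slug exists (render
// the page as usual), otherwise the path of the closest known page.
function serverRedirectTarget(
  requested: string,
  slugs: string[],
  closest: Matcher,
): string | null {
  if (slugs.indexOf(requested) !== -1) return null;
  const match = closest(requested, slugs);
  return match === undefined ? null : `/p/${match}`;
}

// Client side, for the error page: turn the path that 404'd into a
// /r/[slug] path, using its last non-empty segment as the approximate slug.
function fallbackPath(pathname: string): string | null {
  const segments = pathname.split("/").filter((s) => s.length > 0);
  if (segments.length === 0) return null; // nothing to match against
  if (segments[0] === "r") return null; // assumed guard: avoid a redirect loop
  return `/r/${segments[segments.length - 1]}`;
}
```

Using the last path segment means a link from an older routing scheme, say /blog/2019/old-post/, would still end up at /r/old-post.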

This is all a bit sloppy, of course, but that’s one of the things I like about it. My blog isn’t a critical web service; it’s a place where I want to be able to quickly put things on the internet for people to read and hopefully enjoy or get some value out of. I want full control over the content, features, and deployment using text files, which necessitates something other than a hosted CMS, but apart from that, the less time I spend programming it, the better.