
Use Hyphenopoly for better line breaks in your Markdown content

Let’s get wacky and wild on the web and justify our text without guilt.

Adding on to my previous post, “Remove runts from your Astro Markdown files with ~15 lines of code,” I wanted to share another quick tip for improving the layout of your Markdown prose.

We’re going to write a cus­tom Re­mark plu­gin to sug­gest line breaks in our Mark­down con­tent. Doing this al­lows us to use the hyphens: manual CSS prop­erty while avoid­ing awk­ward breaks in places they shouldn’t be.

Wait, can’t CSS fix this?

CSS does have a prop­erty for break­ing words and in­sert­ing hy­phens (hyphens: auto), but it’s not super gran­u­lar about the words it breaks. For ex­am­ple, words with only 5 let­ters will get bro­ken just as eas­ily as words with 20 let­ters. Yuck!

There is an­other CSS prop­erty to con­trol the length of words that get bro­ken (hyphenate-limit-chars), but it’s not yet widely sup­ported.
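
If you want to check support for yourself, here is a minimal sketch that asks the browser directly. The helper name supportsHyphenateLimitChars and the sample value "6 3 3" are my own picks for illustration, not something from the spec or from hyphenopoly. Outside a browser (in Node, say) there is no CSS global, so the helper just reports false.

```javascript
// Hypothetical helper: returns true only when the current environment
// is a browser that recognizes the hyphenate-limit-chars property.
function supportsHyphenateLimitChars() {
	return (
		typeof CSS !== "undefined" &&
		typeof CSS.supports === "function" &&
		// "6 3 3" is an arbitrary sample value (word length, chars before, chars after).
		CSS.supports("hyphenate-limit-chars", "6 3 3")
	)
}

console.log(supportsHyphenateLimitChars())
```

Paste it into a browser console to see what your browser says.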

This custom plugin I wrote sprinkles line break opportunities into your markup by inserting soft hyphens (the character U+00AD, written as the &shy; entity in HTML). A soft hyphen is a suggestion to the browser: it may break the word there and show a hyphen. With the hyphens: manual CSS property, the browser breaks words only at those suggestions.
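
To make the soft hyphen a little less mysterious, here is a small sketch of how it behaves in a plain JavaScript string. The names SOFT_HYPHEN and stripSoftHyphens are just illustrative. The thing to notice is that the character is invisible but still counts: it changes the string's length and breaks naive equality checks.

```javascript
// U+00AD is the soft hyphen character (the same thing &shy; produces in HTML).
const SOFT_HYPHEN = "\u00AD"

// "hyphenation" with two break suggestions added.
const word = `hy${SOFT_HYPHEN}phen${SOFT_HYPHEN}ation`

// Illustrative helper: remove every soft hyphen from a string.
function stripSoftHyphens(text) {
	return text.split(SOFT_HYPHEN).join("")
}

console.log(word.length) // 13 (11 letters + 2 soft hyphens)
console.log(word === "hyphenation") // false
console.log(stripSoftHyphens(word) === "hyphenation") // true
```

This is worth keeping in mind if anything downstream compares or searches the hyphenated text.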

Hy­phe­nop­oly to the res­cue

Under the hood, I’m using the pack­age hyphenopoly to in­sert these soft hy­phens every­where. Per the pack­age’s doc­u­men­ta­tion, hy­phe­nop­oly uses Franklin M. Liang’s al­go­rithm de­vel­oped for TeX. If you’re cu­ri­ous how that al­go­rithm works, I sug­gest you check out the project’s README.
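
For a rough feel of the idea, here is a toy sketch of pattern-based hyphenation. It is not hyphenopoly's code, and the nine patterns are a hand-picked handful from the commonly cited “hyphenation” example; real pattern sets contain thousands of entries, and real implementations also enforce minimum characters before and after a break. Each pattern is a string of letters with digits between them. Wherever a pattern matches, its digits are written into the gaps between letters, keeping the highest value per gap. Odd values allow a break and even values forbid one.

```javascript
// Toy pattern subset, for illustration only.
const PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n"]

// Split "hy3ph" into letters "hyph" and the digit in front of each
// letter position, plus one trailing slot: [0, 0, 3, 0, 0].
function parsePattern(pattern) {
	const letters = []
	const digits = [0]
	for (const char of pattern) {
		if (char >= "0" && char <= "9") {
			digits[digits.length - 1] = Number(char)
		} else {
			letters.push(char)
			digits.push(0)
		}
	}
	return { letters: letters.join(""), digits }
}

function hyphenate(word, separator = "\u00AD") {
	// Dots mark the word boundaries, as in the original pattern format.
	const padded = `.${word.toLowerCase()}.`
	// points[k] is the gap in front of padded[k].
	const points = new Array(padded.length + 1).fill(0)

	for (const { letters, digits } of PATTERNS.map(parsePattern)) {
		let start = padded.indexOf(letters)
		while (start !== -1) {
			digits.forEach((digit, offset) => {
				points[start + offset] = Math.max(points[start + offset], digit)
			})
			start = padded.indexOf(letters, start + 1)
		}
	}

	let result = ""
	for (let i = 0; i < word.length; i++) {
		// word[i] is padded[i + 1], so its gap is points[i + 1].
		if (i > 0 && points[i + 1] % 2 === 1) result += separator
		result += word[i]
	}
	return result
}

console.log(hyphenate("hyphenation", "-")) // hy-phen-ation
```

Passing "-" as the separator makes the result visible; the default separator is the soft hyphen.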

Get­ting started

Note: If you need help set­ting up your Astro blog for Mark­down (.md or .mdx), I rec­om­mend fol­low­ing Astro’s “Build a Blog” tu­to­r­ial.

This time, we’ll need two npm pack­ages:

npm install unist-util-visit hyphenopoly

Writ­ing our plu­gin

Now, we’ll cre­ate the JS file that will ac­tu­ally do the trans­for­ma­tion. I like to store my cus­tom plu­g­ins in a folder called /remark-plugins at the root of my project, but you’re wel­come to put them wher­ever.

remark-plugins/remark-hyphenate.mjs
import hyphenopoly from "hyphenopoly"
import { readFileSync } from "node:fs"
import { dirname } from "path"
import { visit } from "unist-util-visit"
import { fileURLToPath } from "url"

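// Hyphenopoly calls this with a pattern file name (e.g. "en-us.wasm").
// The path assumes this file sits one level below the project root.
function loaderSync(file) {
	const cwd = dirname(fileURLToPath(import.meta.url))
	return readFileSync(`${cwd}/../node_modules/hyphenopoly/patterns/${file}`)
}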

const hyphenator = hyphenopoly.config({
	exceptions: {
		"en-us": "Houston",
	},
	loaderSync,
	// dontHyphenateClass: 'donthyphenate',
	// minWordLength: 6,
	require: ["en-us"],
	defaultLanguage: "en-us",
	sync: true,
})

export function remarkHyphenate() {
	function transformer(tree) {
		visit(tree, "text", function (node) {
			const hyphenated = hyphenator(node.value)

			node.value = hyphenated
		})
	}

	return transformer
}

This Re­mark plu­gin has a bit more going on than my last one. That’s be­cause hy­phe­nop­oly must have a pat­tern spec­i­fied to apply the soft hy­phens. The pack­age ships with a ton of built-​in lan­guage sup­port (71 by my count!), so you have to tell it which .wasm file to load up.

In my case, I’m using the en-us pattern. You can find the full list of patterns in the package’s patterns directory (node_modules/hyphenopoly/patterns).

Warn­ing: Bor­ing code ex­pla­na­tion time!

  1. The loaderSync function is a bit of a hack: it reads the requested pattern file (en-us.wasm in my case) from the hyphenopoly package directory inside node_modules. I’m not sure if there’s a better way to do this, but it works for now.
  2. hyphenator con­fig­ures hy­phe­nop­oly:
    • We pass in our cus­tom loader with the .wasm pat­tern, and spec­ify which lan­guage(s) we want to use.
    • You can spec­ify ex­cep­tions to the hy­phen­ation rules, which I’ve done for the word “Hous­ton.”
    • You can spec­ify a min­i­mum word length for hy­phen­ation, but I’ve left that com­mented out for now. Six is the de­fault, and that was good enough for me.
    • I’ve also com­mented out the op­tional dontHyphenateClass set­ting, which al­lows you to spec­ify a CSS class that will pre­vent hy­phen­ation from hap­pen­ing.
  3. We’re using unist-util-visit to traverse the Markdown AST and replace the value of each text node (node.value) with our hyphenated text. These are AST nodes, not DOM nodes: the plugin runs before any HTML is generated.
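
Step 3 can be sketched without any dependencies. The visitText function below is a hand-rolled stand-in for unist-util-visit (the real package does much more), the tree is a hand-written approximation of what Remark produces, and shout is a placeholder for the hyphenator so the effect is easy to see. Only nodes of type "text" are touched, which is why inline code keeps its original value.

```javascript
// Simplified stand-in for unist-util-visit: call `visitor` on every
// node whose type is "text", walking children depth-first.
function visitText(node, visitor) {
	if (node.type === "text") visitor(node)
	for (const child of node.children ?? []) visitText(child, visitor)
}

// Hand-written approximation of the AST for: Run `npm install` to install.
const tree = {
	type: "root",
	children: [
		{
			type: "paragraph",
			children: [
				{ type: "text", value: "Run " },
				{ type: "inlineCode", value: "npm install" },
				{ type: "text", value: " to install." },
			],
		},
	],
}

// Placeholder transformation standing in for the hyphenator.
const shout = (text) => text.toUpperCase()

visitText(tree, (node) => {
	node.value = shout(node.value)
})

console.log(tree.children[0].children.map((child) => child.value))
// [ 'RUN ', 'npm install', ' TO INSTALL.' ]
```

Because code nodes are never of type "text", your code samples won't pick up stray soft hyphens.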

En­abling the plu­gin in Astro

The last thing we have to do is tell Astro to use our new plu­gin. We can do this by adding a remarkPlugins array to our astro.config.mjs file:

astro.config.mjs
import { defineConfig } from "astro/config"

import mdx from "@astrojs/mdx"

import { remarkHyphenate } from "./remark-plugins/remark-hyphenate.mjs"

export default defineConfig({
	integrations: [mdx()],
	markdown: {
		remarkPlugins: [remarkHyphenate],
	},
})

En­able word breaks in CSS

To reap the full ben­e­fit of our cus­tom plu­gin, we need to add at least one CSS prop­erty to our prose con­tent:

hyphens: manual;
text-align: justify; /* Optionally, you can now use `justify` */

I added the jus­ti­fi­ca­tion be­cause I like the way it looks, but it’s not nec­es­sary. You can safely use it be­cause we’ve told the browser where it can break up words in our prose.

There’s not much to it other than that! Feel free to tin­ker with the hy­phe­nop­oly set­tings to your heart’s con­tent.

Fur­ther im­prove­ments

I’m still not 100% happy with the way this plu­gin works. I’d like a bet­ter method to load in the .wasm pat­tern file. My loader func­tion seems rather brit­tle.

Out of the box, hy­phe­nop­oly pro­vides or­phan con­trol (they ac­tu­ally mean “runts”), which I’d like to test. If it does as good a job as my last plu­gin, then that’s re­dun­dant code I can hap­pily rip out.

Feel free to drop me a line or tweet at me. I am al­ways open to im­prov­ing this tu­to­r­ial or my code ex­am­ples.

