Ripgrep is one of my favorite command-line tools. It is primarily a better grep, but its search & replace features make it useful for much more.
On macOS you can install it with Homebrew:
brew install ripgrep
Ripgrep is available at the command line as rg. There's a Guide available, which covers some of the command-line options. A complete list of command-line options is printed by rg --help.
Move localized Markdown sources from one file structure to another
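With the Hugo static site generator, your localized posts might be organized with language suffixes, along these lines (where <path> stands for a post's folder):
src/<path>/index.en.md
src/<path>/index.es.md
On the other hand, Eleventy recommends organizing languages in separate folders:
src/en/<path>/index.md
src/es/<path>/index.md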
To migrate from the Hugo file structure to the Eleventy file structure, we can use rg to produce the necessary arguments for two standard Unix command-line tools that will get the job done:
- mkdir -p, which creates the appropriate folder structure; and
- mv, which moves/renames the source files.
For the first step we need a list of folder structures to create:
find src/**/index.*.md | rg "src/(.+)/index.(en|es|it|de).md" --replace 'src/$2/$1'
We can pipe the output of this command to mkdir -p, which recursively creates all these folders, using xargs:
find src/**/index.*.md | rg "src/(.+)/index.(en|es|it|de).md" --replace 'src/$2/$1' | xargs mkdir -p
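Before creating any folders, it can help to check the replacement on a few sample paths. The sketch below wraps the same pattern in a small function; the name to_eleventy_dir and the sample paths are made up for illustration and are not part of the original recipe.

```shell
# Hypothetical helper: reads Hugo-style paths on stdin and prints the
# Eleventy-style folder for each one, using the same pattern as above.
to_eleventy_dir() {
  rg "src/(.+)/index.(en|es|it|de).md" --replace 'src/$2/$1'
}

# Made-up sample paths, one per line.
printf '%s\n' \
  'src/posts/hello/index.en.md' \
  'src/posts/hello/index.es.md' |
  to_eleventy_dir
# src/en/posts/hello
# src/es/posts/hello
```

If the printed folders look right, the same rg call can be piped to xargs mkdir -p.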
For the second step we'll be using mv, which needs two arguments: the source and the destination path. We can adapt the first command to produce the original match (via $0), along with the replacement, on separate lines:
find src/**/index.*.md | rg "src/(.+)/index.(en|es|it|de).md" --replace $'$0\nsrc/$2/$1/index.md'
Let's unpack the replacement string:
- the $'…' quoting (ANSI-C quoting, available in shells such as Bash and Zsh) is needed so we can insert the newline character \n;
- $0 is the original string, printed on the first line;
- for the second line, src/$2/$1/index.md shuffles around the groups captured in the matched pattern into the desired configuration.
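The $'…' quoting is not available in every shell. As a sketch of an alternative that should also work in a plain POSIX sh, a literal line break inside single quotes hands rg the same two-line replacement. The variable name replacement, the function name to_move_pairs, and the sample path are illustrative assumptions.

```shell
# Replacement string with a real newline between the two parts.
# Single quotes keep $0, $1 and $2 away from the shell, so rg receives them.
replacement='$0
src/$2/$1/index.md'

# Hypothetical helper: prints each source path followed by its destination.
to_move_pairs() {
  rg "src/(.+)/index.(en|es|it|de).md" --replace "$replacement"
}

# Made-up sample path.
printf '%s\n' 'src/posts/hello/index.en.md' | to_move_pairs
# src/posts/hello/index.en.md
# src/en/posts/hello/index.md
```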
We print the source and destination paths on separate lines because that allows us to pipe the output to xargs -n2, which takes the input two items at a time (here two lines, as long as the paths contain no whitespace) and uses them as the two arguments to mv:
find src/**/index.*.md | rg "src/(.+)/index.(en|es|it|de).md" --replace $'$0\nsrc/$2/$1/index.md' | xargs -n2 mv
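Since mv changes files on disk, a dry run can be worthwhile first. One way to sketch that is to put echo in front of mv, so that xargs prints each command instead of running it. The pairs below are hand-written sample input in the shape the rg command produces, not output from a real project, and preview_moves is a made-up name.

```shell
# Hypothetical helper: prints the mv commands instead of executing them.
preview_moves() {
  xargs -n2 echo mv
}

# Source and destination paths alternate, one per line.
printf '%s\n' \
  'src/posts/hello/index.en.md' \
  'src/en/posts/hello/index.md' \
  'src/posts/hello/index.es.md' \
  'src/es/posts/hello/index.md' |
  preview_moves
# mv src/posts/hello/index.en.md src/en/posts/hello/index.md
# mv src/posts/hello/index.es.md src/es/posts/hello/index.md
```

This assumes the paths contain no spaces, since xargs splits its input on any whitespace by default.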
Use hyperglot to find how many glyphs a font is missing for supporting each language
This recipe is very specific, but I'm adding it because, beyond hyperglot's output, it's mainly about wrangling text with a combination of CLI tools.
Hyperglot is a database and tools for detecting language support in fonts. Its --verbose output contains information about which characters are missing from the font for it to support a certain language, with lines in the form of:
hyperglot font.otf --verbose 2>&1
Missing from base language ron: ș (537) Ș (536) ț (539) ă (259) Ț (538) Ă (258)
Can we match the language code and count the missing characters from lines such as the one above?
hyperglot font.otf --verbose 2>&1 |\
rg 'Missing from base language (.+):|\((\d+)\)' -or '$1$2' |\
rg --passthrough '\d+' -r '????' |\
uniq -c |\
paste - - |\
rg '1 ([a-z]+)[^\d]+(\d+)' -or '$2 $1' |\
sort -n
The command does the following:
- Extracts the important bits (the language code on the one hand, and the set of Unicode character numbers on the other) to separate lines, by way of two capturing groups. In the command, -or is short for --only-matching --replace.
- Replaces all lines matching a number with the same sequence of characters, so we can count the lines. The sequence itself doesn't really matter; here I'm using ????. Lines that don't match remain unchanged via the --passthrough flag.
- Counts the occurrences of consecutive matching lines with uniq -c. This produces all the data we need, but it's in an awkward order.
- To coax the data into a more useful shape, paste - - puts each pair of consecutive lines side by side on a single line, then rg extracts the important bits (the language code and the character count) and prints them in reverse order, via the replacement '$2 $1'.
- Sorts by the first number on each line with sort -n.
And there we have it. Here's an example output:
1 aht
1 cab
1 cak
1 chj
(...)
4 dga
4 ebu
4 fat
4 fuv
(...)
242 tir
300 vai
756 kor
(...)
11172 kor
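To try the pipeline without a font at hand, it can be fed the sample line from earlier. This is only a sketch: count_missing is a made-up function name, and the input is the single ron line shown above rather than real hyperglot output.

```shell
# Hypothetical helper: the same pipeline, reading hyperglot-style lines on stdin.
count_missing() {
  rg 'Missing from base language (.+):|\((\d+)\)' -or '$1$2' |
    rg --passthrough '\d+' -r '????' |
    uniq -c |
    paste - - |
    rg '1 ([a-z]+)[^\d]+(\d+)' -or '$2 $1' |
    sort -n
}

# The sample line from the --verbose output above.
printf '%s\n' 'Missing from base language ron: ș (537) Ș (536) ț (539) ă (259) Ț (538) Ă (258)' |
  count_missing
# 6 ron
```

Running the stages one at a time (dropping the later pipes) is a quick way to see what each step contributes.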