Glyphs in popular monospaced fonts

Linus Torvalds famously uses an ancient uemacs editor that he updated to support UTF-8 some years ago. This post is not about Torvalds or his editor but about the file named UTF-8-demo.txt that caught my attention while I was glancing through the editor's repo.

Chrome displays it almost correctly--except for the "formulas" block. Judging from the shape of the glyphs, the default monospace font I use in Fedora

$ fc-match monospace
LiberationMono-Regular.ttf: "Liberation Mono" "Regular"

doesn't contain all the characters from UTF-8-demo.txt; hence, Chrome does font fallback. In its devtools it reports:

Liberation Mono     — Local file (5,438 glyphs)
DejaVu Sans         — Local file (1,179 glyphs)
Droid Sans Thai     — Local file (415 glyphs)
Droid Sans Ethiopic — Local file (320 glyphs)
Segoe UI Historic   — Local file (45 glyphs)
Noto Sans Math      — Local file (7 glyphs)
Droid Sans Fallback — Local file (5 glyphs)
Segoe UI Symbol     — Local file (1 glyph)
Noto Color Emoji    — Local file (1 glyph)
Times New Roman     — Local file (1 glyph)

(Your list would be, of course, completely different.)

Interestingly, xterm, gvim and gedit render the file better than Chrome (they don't mangle the "formulas" block), while Emacs 29.1 not only fails the "formulas" test but also misaligns the pseudo-graphics at the end of the file.

[Screenshot: UTF-8-demo.txt in gedit]

Anyhow, I became curious about how many of my installed monospace fonts can render that file without font substitution (spoiler: none).
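A rough way to check this without opening any editor is to compare the code points in the file with the charset that fontconfig reports for a font. The sketch below is not from the original workflow; the function names (codepoints, in_charset, missing) are made up for illustration, and it assumes that fc-match prints the charset as hex ranges such as "20-7e a0-17f 2c7". Keep in mind that fc-match does its own substitution: if the family isn't installed, the answer describes some other font.

```shell
# Print the distinct code points of UTF-8 file $1 as hex, one per line.
codepoints() {
  iconv -f UTF-8 -t UTF-32BE "$1" | od -An -v -tx1 |
    tr -s ' \n' '\n' | sed '/^$/d' | paste -d '\0' - - - - |
    sed 's/^0*\(.\)/\1/' | sort -u
}

# Succeed if hex code point $1 falls inside charset string $2, given in
# the "20-7e a0-17f 2c7" form (assumed output of fontconfig's %{charset}).
in_charset() {
  n=$((0x$1))
  for r in $2; do
    lo=$((0x${r%-*})) hi=$((0x${r#*-}))
    [ "$n" -ge "$lo" ] && [ "$n" -le "$hi" ] && return 0
  done
  return 1
}

# List the code points of file $1 that font family $2 doesn't cover.
missing() {
  cs=$(fc-match --format='%{charset}' "$2") || return
  codepoints "$1" | while read -r hex; do
    [ "$((0x$hex))" -lt 32 ] && continue   # skip control characters
    in_charset "$hex" "$cs" || printf 'U+%04X\n' "$((0x$hex))"
  done
}
```

Something like `missing UTF-8-demo.txt 'DejaVu Sans Mono' | wc -l` then gives a number to compare fonts by.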

Is there a way to disable font fallback? We can provide a custom fontconfig config via

$ FONTCONFIG_FILE=~/tmp/lol.conf gedit

or convert UTF-8-demo.txt to the pdf format using a tool that, by default, doesn't know about font substitution:

# input output font
txt2pdf() {
  # indent every line by 4 spaces, so that pandoc sees a markdown code block
  awk '{print "    " $0}' < "${1:-/dev/null}" | \
    pandoc --pdf-engine=xelatex \
    -V "monofont:${3:-Roboto Mono}" -V "mainfont:${3:-Roboto Mono}" \
    -V geometry:"top=1cm,left=1cm,bottom=1.5cm,right=1cm" \
    -t pdf -o "${2:-${1%.*}.pdf}"
}

(Yes, we trick pandoc into thinking the input is markdown; IRL you'd probably want to add -V papersize=a4 and -V fontsize=12pt options to it.)

Then we can type:

$ txt2pdf UTF-8-demo.txt lol.pdf 'Ubuntu Mono'

and examine lol.pdf to see that a typeface, created "to complement the Ubuntu tone of voice", not only has "a contemporary style" but also "conveys a precise, reliable and free attitude":

[Screenshot: UTF-8-demo.txt in gedit]

How do we do the same for all installed fixed-width fonts? First, we obtain the list of such fonts:

$ type fc.mono
fc.mono is aliased to `fc-list :mono family | awk -F, "{print \$1}" | sort -u'
$ fc.mono
Bitstream Vera Sans Mono
Courier 10 Pitch
Courier New
DejaVu Sans Mono
Droid Sans Mono
Liberation Mono
Material Icons
Material Icons Outlined
Material Icons Round
Material Icons Sharp
Material Icons Two Tone
Nimbus Mono PS
Noto Color Emoji
Source Code Pro
Ubuntu Mono

(I have Roboto Mono v2 installed as well, but fontconfig doesn't recognise it as a monospace font.)


$ (IFS=$'\n'; for fn in `fc.mono`; do txt2pdf UTF-8-demo.txt "$fn".pdf "$fn"; done)
$ ls *pdf -1
'Bitstream Vera Sans Mono.pdf'
'Courier New.pdf'
'DejaVu Sans Mono.pdf'
'Droid Sans Mono.pdf'
'Liberation Mono.pdf'
'Nimbus Mono PS.pdf'
'Source Code Pro.pdf'
'Ubuntu Mono.pdf'
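The listing is shorter than the output of fc.mono, so some conversions left nothing behind. A small sketch for printing the failed fonts directly (failed_fonts is a hypothetical helper, not part of the original setup): it reads font names from stdin, runs the given command with each name appended, and prints the names for which the command returned a non-zero status.

```shell
# Read font families from stdin, one per line; run "$@" <family> for each
# and print the families for which the command failed.
failed_fonts() {
  while IFS= read -r fn; do
    # </dev/null keeps the command from eating the rest of the list
    "$@" "$fn" </dev/null >/dev/null 2>&1 || printf '%s\n' "$fn"
  done
}
```

With a wrapper like `conv() { txt2pdf UTF-8-demo.txt "$1.pdf" "$1"; }` it could be run as `fc.mono | failed_fonts conv`. An exit status of 0 only means that a PDF was written, not that every glyph was found.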

In my case, not every .txt→.pdf conversion was successful, but among those that succeeded, DejaVu Sans Mono produced the best result, glyph-wise.
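As for the other option mentioned earlier, the custom fontconfig config: the post doesn't show what goes into ~/tmp/lol.conf, so here is a minimal sketch under the assumption that exposing a single font directory is enough to leave fontconfig with nothing to fall back to. The helper name single_font_conf and the directory in the usage line are assumptions; check where your font actually lives with fc-list.

```shell
# Write a fontconfig file to $2 that exposes only the fonts found
# under directory $1 (both paths are up to the caller).
single_font_conf() {
  cat > "$2" <<EOF
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <dir>$1</dir>
  <cachedir prefix="xdg">fontconfig</cachedir>
</fontconfig>
EOF
}
```

For example, `single_font_conf /usr/share/fonts/liberation-mono ~/tmp/lol.conf` followed by `FONTCONFIG_FILE=~/tmp/lol.conf gedit`. Characters missing from the font should then show up as placeholder boxes instead of glyphs borrowed from another font, although the toolkit has the final say on that.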

Share a PulseAudio sink with an Android phone

While attempting to find a pair of semi-decent USB-only speakers, I became so frustrated that I contemplated creating a mock simulator with "greater dynamic range", "improved bass", & even "lower distortion" to reliably replicate the advertised "quality" of these fake USB speakers. (Why fake? Because they all (?) use USB-A for power & require a 3.5mm jack input for sound.) Most speakers in the $0-$100 range sound as if someone placed a phone in a metal bucket & started playing the Quake 2 theme music using the phone's mighty speaker.

Here's a high-precision simulator:

  1. Load the module-simple-protocol-tcp module into a PulseAudio server.
  2. The module can use any sink you prefer, typically the default one.
  3. Initiate a regular playback using the selected sink (with mpv, vlc, a web browser, whatever).
  4. Connect to the server from an Android phone using Simple Protocol Player.
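
Steps 1 and 2 can be sketched in a couple of shell functions. This is an illustration, not the script from this post: simple_tcp_args and share_default_sink are made-up names, the source is addressed by the conventional "<sink name>.monitor" name, and pactl get-default-sink is assumed to be available (it is in recent PulseAudio releases).

```shell
# Print module arguments for streaming the monitor of sink $1
# on TCP port $2 (4713 if omitted).
simple_tcp_args() {
  printf 'rate=44100 format=s16le channels=2 source=%s.monitor record=true port=%s\n' \
    "$1" "${2:-4713}"
}

# Load the module for the current default sink; $1 is an optional port.
share_default_sink() {
  # the unquoted $(...) is intentional: every key=value pair must
  # become a separate argument
  pactl load-module module-simple-protocol-tcp \
    $(simple_tcp_args "$(pactl get-default-sink)" "$1")
}
```

The rate, format and channels values have to match what is configured in the player on the phone.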

For an additional milieu, put the phone in a bucket. $100 saved, "greater dynamic range" achieved.

The scheme is compatible with PipeWire too.

Incidentally, this can be used as a rustic spying technique: you can listen to everything that a machine with a loaded module-simple-protocol-tcp module plays.

A script (requires dialog(1) and jq(1)) that draws a menu with available sinks and starts listening on a socket:

#!/usr/bin/make -f

# the standard pulseaudio tcp port
port := 4713
# a name for a temporary file that will hold the chosen sink index
answer := $(shell mktemp -u)

status:; pactl list | grep tcp -B1
stop:; -pactl unload-module module-simple-protocol-tcp

# draw the menu & save the choice (recipe lines must start with a tab)
$(answer):
	pactl -f json list sinks \
	 | jq -r '.[] | @sh "\(.index) \(.description)"' \
	 | xargs dialog --keep-tite --menu "Local sink" 0 0 0 2>$(answer)

start: $(answer) stop
	pactl load-module module-simple-protocol-tcp rate=44100 format=s16le channels=2 source=`cat $<` record=true port=$(port)
	rm $<


Run it as

$ android-pa-share start

Beware that the module-simple-protocol-tcp module doesn't support authentication, so protect the port (4713 in the script above) from the WAN.

Forgotten finite automata techniques

"I built the earliest demos [of Google Code Search in 2006] using Ken Thompson's Plan 9 grep, because I happened to have it lying around in library form. The plan had been to switch to a "real" regexp library, namely PCRE, probably behind a newly written, code reviewed parser, since PCRE's parser was a well-known source of security bugs.

"The only problem was my then-recent discovery that none of the popular regexp implementations - not Perl, not Python, not PCRE - used real automata. This was a surprise to me, and even to Rob Pike, the author of the Plan 9 regular expression library. (Ken was not yet at Google to be consulted.) I had learned about regular expressions and automata from the Dragon Book, from theory classes in college, and from reading Rob's and Ken's code. The idea that you wouldn't use the guaranteed linear time algorithm had never occurred to me. But it turned out that Rob's code in particular used an algorithm only a few people had ever known, and the others had forgotten about it years earlier."

(From Regular Expression Matching with a Trigram Index by Russ Cox.)

No salute at sea

'[1893] The time was when ships passing one another at sea backed their topsails and had a "gam," and on parting fired guns; but those good old days have gone. People have hardly time nowadays to speak even on the broad ocean, where news is news, and as for a salute of guns, they cannot afford the powder. There are no poetry-enshrined freighters on the sea now; it is a prosy life when we have no time to bid one another good morning.'

(From Sailing Alone Around The World, Ch. 5 by Joshua Slocum.)

