Alexander Gromnitsky's Blog

Offline Math: Converting LaTeX to SVG with MathJax


Pandoc can prepare LaTeX math for MathJax via its eponymous --mathjax option. It wraps formulas in <span class="math"> elements and injects a <script> tag that points to cdn.jsdelivr.net, which means rendering won't work offline or when the 3rd-party server fails. You can mitigate this by providing your own copy of the MathJax library, but the mechanism still fails when the target device doesn't support JavaScript (e.g., many epub readers).

At the same time, practically all browsers support MathML. Use it (pandoc's --mathml option) if you care only about the information superhighway: your formulas will look good on every modern device and scale delightfully. Otherwise, SVGs are the only truly portable option.

Now, how can we transform the html produced by

$ echo 'Ohm'\''s law: $I = \frac{V}{R}$.' |
  pandoc -s -f markdown --mathjax

into a fully standalone document where the formula gets converted into SVG nodes?

  1. Use an html parser like Nokogiri, and replace each <span class="math"> node with an image. There are multiple ways to convert a TeX-looking string to an SVG: using MathJax itself (which provides a corresponding CLI example), or by doing it in a 'classical' fashion with pdflatex. (You can read more about this method in A practical guide to EPUB, chapters 3.4 and 4.6.)
  2. Alternatively, load the page into a headless browser, inject MathJax scripts, and serialise the modified DOM back to html.
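
To make the 1st approach a bit more concrete, here is a minimal sketch of its first half: locating the math spans and recovering the TeX source from them. It assumes the markup pandoc emits with --mathjax (class "math inline" or "math display", with the formula wrapped in \( \) or \[ \]). It uses a regex instead of a real html parser purely to stay dependency-free; extractMath and decodeEntities are hypothetical names, and turning the TeX into an SVG is left out.

```javascript
// Find pandoc's math spans in an html string and return their TeX source.
// Assumed input markup (what pandoc --mathjax produces):
//   <span class="math inline">\(...\)</span>
//   <span class="math display">\[...\]</span>
function extractMath(html) {
  const re = /<span class="math (inline|display)">([\s\S]*?)<\/span>/g
  const found = []
  for (const m of html.matchAll(re)) {
    const body = decodeEntities(m[2])
    // strip the \( \) or \[ \] delimiters wrapped around the TeX
    const tex = body.replace(/^\\[(\[]/, '').replace(/\\[)\]]$/, '')
    found.push({ display: m[1] === 'display', tex })
  }
  return found
}

// Undo the few entities that can appear inside a formula.
function decodeEntities(s) {
  return s.replace(/&lt;/g, '<').replace(/&gt;/g, '>')
          .replace(/&quot;/g, '"').replace(/&amp;/g, '&')
}
```

Each returned item carries the raw TeX and a flag telling whether it was a display formula, which is what a TeX-to-SVG converter would need to know.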

I tried the 2nd approach in 2016 with the now-defunct phantomjs. It worked, but debugging was far from enjoyable due to the strangest bugs in phantomjs. I can still run the old code, but it depends on an ancient version of the MathJax library that, for obvious reasons, isn't easily upgradable within the phantomjs pre-es6 environment.

Nowadays, Puppeteer would certainly do, but for this kind of task I prefer something more lightweight.

There's also jsdom. Back in 2016, I tried it as well, but it was much slower than running phantomjs. Recently, I gave jsdom another try and was pleasantly surprised. I'm not sure what exactly tipped the scales: computers, v8, or jsdom itself, but it no longer feels slow in combination with MathJax.

$ wc -l *js *conf.json
  24 loader.js
 105 mathjax-embed.js
  12 mathjax.conf.json
 141 total

Roughly 50% of the code is Node.js infrastructure junk (including command-line parsing); the rest is a MathJax config and jsdom interactions:

import { Script } from 'node:vm'
import { JSDOM } from 'jsdom'

let dom = new JSDOM(html, {
  url: `file://${base}/`,
  runScripts: /* very */ 'dangerously',
  resources: new MyResourceLoader(), // block ext. absolute urls
})

dom.window.my_exit = function() {
  cleanup(dom.window.document) // remove mathjax <script> tags
  console.log(dom.serialize())
}

dom.window.my_mathjax_conf = mathjax_conf // user-provided

let script = new Script(read(`${import.meta.dirname}/loader.js`))
let vmContext = dom.getInternalVMContext()
script.runInContext(vmContext)
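
MyResourceLoader isn't shown in the excerpt above. The decision it has to make ("is this url safe to load?") could look roughly like the sketch below. isLocalResource is a hypothetical helper, not the code from the repo; in jsdom one would presumably call it from a ResourceLoader subclass that overrides fetch(), which is left out here so that the snippet runs without jsdom installed.

```javascript
// Decide whether a resource url may be loaded: only file: urls located
// under the base directory pass; http(s) and everything else is refused.
function isLocalResource(url, base) {
  let u
  try {
    u = new URL(url)
  } catch (_) {
    return false // not an absolute url at all
  }
  if (u.protocol !== 'file:') return false
  const root = new URL(base.endsWith('/') ? base : base + '/')
  // URL() has already normalised '..' segments at this point
  return u.pathname.startsWith(root.pathname)
}
```

Because the URL constructor resolves '..' segments before the comparison, a path that tries to climb out of the base directory is rejected as well.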

The most annoying step here is setting the url property that jsdom uses to resolve paths to relative resources. The my_exit() function is called by MathJax when its job is supposedly finished. The loader.js script is executed in the context of the loaded html:

window.MathJax = {
  output: { fontPath: '@mathjax/%%FONT%%-font' },
  startup: {
    ready() {
      MathJax.startup.defaultReady()
      MathJax.startup.promise.then(window.my_exit)
    }
  }
}

Object.assign(window.MathJax, window.my_mathjax_conf)

function main() {
  var script = document.createElement('script')
  script.src = 'mathjax/startup.js'
  document.head.appendChild(script)
}

document.addEventListener('DOMContentLoaded', main)
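
One thing to keep in mind about the Object.assign() call above: it merges only the top level, so a user config that contains its own startup key would replace the whole startup object, ready() hook included. Whether that matters depends on what goes into the user config. If it does, a recursive merge is one way around it; deepMerge and isPlain below are hypothetical helpers, a sketch rather than what loader.js actually does.

```javascript
// Recursively merge plain objects from `src` into `dst`; anything that is
// not a plain object (functions, arrays, strings, ...) is overwritten.
function deepMerge(dst, src) {
  for (const [key, val] of Object.entries(src || {})) {
    if (key === '__proto__') continue // don't let a config touch prototypes
    const both = isPlain(val) && isPlain(dst[key])
    dst[key] = both ? deepMerge(dst[key], val) : val
  }
  return dst
}

function isPlain(v) {
  return v !== null && typeof v === 'object' && !Array.isArray(v)
}
```

With this, a user config like { startup: { typeset: false } } adds a key next to ready() instead of wiping it out.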

The full source is on GitHub.

Intended use is as follows:

$ echo 'Ohm'\''s law: $I = \frac{V}{R}$.' |
  pandoc -s -f markdown --mathjax |
  mathjax-embed > 1.html

The resulting html doesn't use JavaScript and doesn't fetch any external MathJax resources. The mathjax-embed script itself always works offline.


Tags: ойті
Authors: ag

The Size of Adobe Reader Installers Through The Years


At the time of writing, the most recent Adobe Reader 25.x.y.z 64-bit installer for Windows 11 weighs 687,230,424 bytes. After installation, the program includes 'AI' (of course), an auto-updater, ads for Acrobat online services sprinkled everywhere, and 2 GUIs: 'new' and 'old'.

For comparison, the size of SumatraPDF-3.5.2 installer is 8,246,744 bytes. It has no 'AI', no auto-updater (though it can check for new versions, which I find unnecessary, for anyone sane would install it via scoop anyway), and no ads for 'cloud storage'.
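
To put the 2 numbers side by side, here is a throwaway calculation. The byte counts are the ones quoted above; ratio and toMiB are just illustrative helper names.

```javascript
// Installer sizes in bytes, as quoted in the text.
const adobeReader = 687230424
const sumatraPDF = 8246744

function ratio(a, b) { return a / b }
function toMiB(bytes) { return bytes / (1024 * 1024) }

// How many SumatraPDF installers fit into one Adobe Reader installer.
console.log(ratio(adobeReader, sumatraPDF).toFixed(1))
console.log(toMiB(adobeReader).toFixed(1), 'MiB vs', toMiB(sumatraPDF).toFixed(1), 'MiB')
```

The ratio comes out at roughly 83 to 1.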

The following chart shows how the Adobe Reader installer has grown in size over the years. When possible, 64-bit versions of installers were used.

[Chart: adobe reader vs sumatrapdf]

Next Day Update:

Best comment on Hacker News: "Looks like a chart crime scene."

Alright, here's your linear graph, along with the source from which both graphs were generated. All point labels are version numbers.

[Chart: adobe reader vs sumatrapdf (linear scale)]

Tags: ойті
Authors: ag

Not a bug


Peter Weinberger (the "w" in awk), while working at Bell Labs, wrote an experimental implementation of a network file system. Included with Research Unix v8 (Feb 1985, licensed strictly for educational use), it allowed sharing / (yes) with other machines running v8 by specifying a mapping between a local uid/gid and the desired view from the LAN.

Weinberger described one peculiarity of his netfs as follows:

"If A mounted B's file system somewhere, and B mounted A's, then the directory tree was infinite. That's mathematics, not a bug."

His /usr/src/netfs/TODO contained an existential question:

'why does it get out of synch?'

The connection of this netfs and Sun's NFS is murky.

Steve Johnson:

"I remember Bill Joy visiting Bell Labs and getting a very complete demo of RFS and being very impressed. Within a year, Sun announced NFS."

Unix System V SVR3, released by AT&T in 1987, included a different version of netfs, which they officially began calling RFS. Appearing 18 months after Sun announced NFS, it briefly attempted to compete, but failed on 2 fronts simultaneously: ⓐ big vendors (DEC, IBM, HP) disliked its licensing terms, and ⓑ the protocol's brittleness discouraged ports to non-Unix systems. NFS won, becoming widely used--even by NeXTSTEP.

Lyndon Nerenberg:

'We ran RFS on a "cluster" of four 3B2s [AT&T microcomputers], and while it worked, to varying degrees, the statefulness of the protocol inevitably led to the whole thing locking up, requiring a reboot of all four machines to recover.'


Tags: ойті
Authors: ag

Batocera 35 and Vontar X3


In the dining room there is an old (2013) 1080p telly with an old (2020) Android TV box connected to it. The TV box is built around an ancient Amlogic S905X3 SoC. It has just enough power to play YouTube & 1080p movies from a network drive but not much else.

Some time ago I heard about repurposing this particular model (Vontar X3) as a retro-gaming console, but anticipating battles similar to those with openwrt-on-routers--where 2 devices with the exact same name have slightly different hardware revisions (& as a result, nothing works as expected)--I've been putting off the adventure.

The easiest Linux gaming distro to deploy is a French one called Batocera[1]. Its wiki describes the performance of a particular device in terms of console generation support:

Gen  Consoles
3    NES
4    SNES, Sega Mega Drive
5    PlayStation (psx), PlayStation Portable (psp)

(I've skipped the irrelevant generations.)

In my tests, while the modest S905X3 runs most psx & psp games acceptably, some titles suffer from frame drops (which do not occur on a desktop PC running the same emulator as Batocera) bad enough to make them unplayable. The prominent unsuccessful examples are CTR: Crash Team Racing (psx) and MotorStorm: Arctic Edge (psp).

The last Batocera version for the Vontar X3 is 35 (the OS images for this device haven't been updated since '22). I dd'ed batocera-s905gen3-tvbox-gen3-35-20220910.img onto a 32GB SD card, inserted the card into the TV box, pressed its reset button with a toothpick, plugged in the power cable, & saw this:

The TV's info popups indicated that the resolution of this shaky image was 1080i (interlaced?) instead of the expected 1080p. I then tried 2 completely different (albeit much newer) TVs, as well as a capture card--none of them had any problems negotiating a proper resolution with Batocera.

After pointlessly suffering with various kernel parameters, I ended up with the following kludge to disable the interlaced mode:

  1. Connect the device to your LAN via the ethernet port. Batocera runs Avahi, hence we can just say

     $ ssh root@batocera.local
    

    (The password is 'linux'.)

  2. Run

     # batocera-resolution listModes | head -10
     max-1920x1080:maximum 1920x1080
     max-640x480:maximum 640x480
     0.0.1920x1080.60:HDMIA 1920x1080 60Hz (1920x1080i)
     0.1.1920x1080.60:HDMIA 1920x1080 60Hz (1920x1080)
     0.2.1920x1080.60:HDMIA 1920x1080 60Hz (1920x1080)
     0.3.1920x1080.60:HDMIA 1920x1080 60Hz (1920x1080i)
     0.4.1920x1080.50:HDMIA 1920x1080 50Hz (1920x1080)
     0.5.1920x1080.50:HDMIA 1920x1080 50Hz (1920x1080i)
     0.6.1920x1080.24:HDMIA 1920x1080 24Hz (1920x1080)
     0.7.1920x1080.24:HDMIA 1920x1080 24Hz (1920x1080)
    

    inside Batocera. Take note of the mode you'd like to use.

  3. Turn the device off. Remove the SD card from it & insert it into a PC. The card has 2 partitions: the 1st one is fat32 and contains the batocera-boot.conf file. Add a line to it:

     es.resolution=0.1.1920x1080.60
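
Picking a mode by eye from the listing in step 2 is easy to get wrong, so here is a small sketch that filters it. It assumes the output format shown above, where interlaced modes end in "i)" and the mode id is everything before the first colon; progressiveModes and resolutionLine are hypothetical helper names.

```javascript
// Parse the output of `batocera-resolution listModes` and keep only the
// ids of progressive (non-interlaced) modes; interlaced ones end in "i)".
function progressiveModes(listing) {
  return listing.split('\n')
    .map(line => line.trim())
    .filter(line => /^\d+\.\d+\./.test(line)) // skip the max-* entries
    .filter(line => !/i\)$/.test(line))
    .map(line => line.slice(0, line.indexOf(':')))
}

// Build the line to add to batocera-boot.conf for a chosen mode id.
function resolutionLine(modeId) {
  return `es.resolution=${modeId}`
}
```

Feeding it the listing from step 2 and passing the first result to resolutionLine() yields the same line as in step 3.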
    

The picture will still incessantly jerk from left to right, but only during the boot phase:

# batocera-info
Disk format: ext4
Temperature: 66°C
Architecture: tvbox-gen3
Model: Shenzhen Haochuangyi Technology Co., Ltd H96 Max
System: Linux 5.10.134
Available memory: 624/932 MB
Cpu model: ARMv8 Processor rev 0 (v8l)
Cpu number: 4
Cpu max frequency: 1908 MHz

There is little to add here. You copy your .nes/.sfc/.chd/.iso files to /userdata/roms/{nes,snes,psx,psp} either directly onto the 2nd partition of the SD card, or via ssh, or even smb, for Batocera runs Samba.
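
If you have a pile of files to sort, the extension-to-directory mapping can be written down as a tiny function. Note the assumption: I pair the extensions with the directories in the order they are listed above (.nes to nes, .sfc to snes, .chd to psx, .iso to psp), which won't hold for every collection, since .chd and .iso images exist for other consoles too. ROM_DIRS and romDestination are made-up names.

```javascript
// Assumed mapping, taken from the order of the 2 lists in the text.
const ROM_DIRS = { '.nes': 'nes', '.sfc': 'snes', '.chd': 'psx', '.iso': 'psp' }

// Return the target path on the Batocera box, or null for unknown types.
function romDestination(filename) {
  const dot = filename.lastIndexOf('.')
  const ext = dot === -1 ? '' : filename.slice(dot).toLowerCase()
  const dir = ROM_DIRS[ext]
  return dir ? `/userdata/roms/${dir}/${filename}` : null
}
```

Files with an unknown extension come back as null, so nothing gets copied to a wrong directory silently.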


  1. I couldn't find any guidance on how to pronounce it: there is a type of beetle called /bə'tosərə/, but some youtubers say it as /bαto'sɛrα/.

Tags: untagged
Authors: ag