Alexander Gromnitsky's Blog

node.js 0.12, stdin & spawnSync

If you have a quick hack like this in your code:

stdin = fs.readFileSync('/dev/stdin').toString()

& it works fine & nothing bad really happens, you may one day start wondering why everyone considers it a temporary solution.

Node's readFileSync() uses stat(2) to get the size of the file it tries to read. By definition, you can't know the size of stdin ahead of time. As one dude put it on SO:

'Imagine stdin is like a water tap. What you are asking is the same as "How much water is there in a tap?".'

By relying on stat(2), readFileSync() will read only up to whatever length value the kernel lies/guesses about /dev/stdin.
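
To see what number the kernel actually reports, here is a minimal sketch. It is not taken from node's sources, & `reportedSize` is a made-up helper name; it simply asks fstat(2) about an already opened descriptor:

```javascript
var fs = require('fs')

// Hypothetical helper (not a node API): return the size that fstat(2)
// reports for an already opened file descriptor.
function reportedSize(fd) {
    return fs.fstatSync(fd).size
}
```

Calling reportedSize(0) in a script started as `node script.js < some-file` gives the real file size. Under `echo hello | node script.js`, Linux usually reports 0 for the pipe, which says nothing about how many bytes are about to arrive.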

Another issue comes w/ testing. If you have a CL utility & want to write an acceptance test for it using the 'new' node 0.12 child_process.spawnSync() API, expect funny errors.

Suppose we have a node version of cat that's written in a dumb 'synchronous' way. Call it cat-1.js:

#!/usr/bin/env node

var rfs = require('fs').readFileSync

if (process.argv.length == 2) {
    process.stdout.write(rfs('/dev/stdin'))
} else {
    process.argv.slice(2).forEach(function(file) {
        process.stdout.write(rfs(file))
    })
}

Now we write a simple test for it:

var assert = require('assert')
var spawnSync = require('child_process').spawnSync

var r = spawnSync('./cat-1.js', { input: 'hello' })
assert.equal('hello', r.stdout.toString())

& run:

$ node test-cat-1-1.js

assert.js:86
  throw new assert.AssertionError({
        ^
AssertionError: 'hello' == ''
    at Object.<anonymous> (/home/alex/lib/writing/gromnitsky.blogspot.com/posts/2015-02-07.1423330840/test-cat-1-1.js:5:8)

What just happened? (I've cut irrelevant trace lines.) Why is the captured stdout empty? Let's change the test to:

var assert = require('assert')
var spawnSync = require('child_process').spawnSync

var r = spawnSync('./cat-1.js', { input: 'hello' })
console.error(r.stderr.toString())

then run:

$ node test-cat-1-2.js
fs.js:502
  return binding.open(pathModule._makeLong(path), stringToFlags(flags), mode);
                 ^
Error: ENXIO, no such device or address '/dev/stdin'
    at Error (native)
    at Object.fs.openSync (fs.js:502:18)
    at fs.readFileSync (fs.js:354:15)

At this point, unless you want to dive into libuv internals, the quick hack of explicitly reading /dev/stdin should be replaced w/ something else.
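
As a side note, an empty stdout is easier to diagnose when the test checks the child's exit status & stderr up front. A minimal sketch, assuming only the documented fields of the spawnSync() result (error, status, stderr, stdout); `run` is a hypothetical helper name, not part of node:

```javascript
var spawnSync = require('child_process').spawnSync

// Hypothetical helper: run a command w/ the given stdin contents &
// return its stdout, failing loudly instead of returning an empty buffer.
function run(cmd, args, input) {
    var r = spawnSync(cmd, args, { input: input })
    // r.error is set when the child could not be started at all
    if (r.error) throw r.error
    // a non-zero (or null, if killed by a signal) status means failure;
    // include the child's stderr so the real cause shows up in the test
    if (r.status !== 0) {
        throw new Error(cmd + ' exited w/ ' + r.status + ':\n' +
                        r.stderr.toString())
    }
    return r.stdout
}
```

W/ such a helper, the first test would have failed w/ the ENXIO trace in the error message instead of a puzzling 'hello' == ''.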

In the past, node maintainers disdained sync reads of stdin & called them an antipattern. The recommended way was to use the streams API & treat process.stdin as a readable stream. Still, what if we really want a sync read?
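
For comparison, the stream-based approach could look roughly like this. It's a sketch, not an official recipe: `readAll` is a made-up helper name, & it assumes no encoding was set on the stream, so every chunk is a Buffer.

```javascript
// Hypothetical helper: collect everything a readable stream emits &
// pass the result to a node-style callback as a single Buffer.
function readAll(stream, callback) {
    var chunks = []
    stream.on('data', function(chunk) {
        chunks.push(chunk)
    })
    stream.on('end', function() {
        callback(null, Buffer.concat(chunks))
    })
    stream.on('error', callback)
}

// Usage in a cat-like script:
//   readAll(process.stdin, function(err, data) {
//       if (err) throw err
//       process.stdout.write(data)
//   })
```

This works w/ pipes & interactive input alike, but it forces the rest of the program into a callback, which is exactly what a sync read is supposed to avoid.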

The easiest way is to make a wrapper around readFileSync() that checks the filename argument & invokes the real readFileSync() when it's not equal to /dev/stdin. For example, let's create a simple module, readFileSync:

var fs = require('fs')

module.exports = function(file, opt) {
    if ( !(file && file.trim() === '/dev/stdin'))
        return fs.readFileSync(file, opt)

    var BUFSIZ = 65536
    var chunks = []
    while (1) {
        try {
            var buf = new Buffer(BUFSIZ)
            var nbytes = fs.readSync(process.stdin.fd, buf, 0, BUFSIZ, null)
        } catch (err) {
            if (err.code === 'EAGAIN') {
                // node is funny
                throw new Error("interactive mode isn't supported, use pipes")
            }
            if (err.code === 'EOF') break
            throw err
        }

        if (nbytes === 0) break
        chunks.push(buf.slice(0, nbytes))
    }

    return Buffer.concat(chunks)
}

It's far from ideal, but at least it doesn't use stat(2) for determining stdin size.

We modify our cat version to use this module (call it cat-2.js):

#!/usr/bin/env node

var rfs = require('./readFileSync')

if (process.argv.length == 2) {
    process.stdout.write(rfs('/dev/stdin'))
} else {
    process.argv.slice(2).forEach(function(file) {
        process.stdout.write(rfs(file))
    })
}

& modify the original version of the acceptance test to use it too:

var assert = require('assert')
var spawnSync = require('child_process').spawnSync

var r = spawnSync('./cat-2.js', { input: 'hello' })
assert.equal('hello', r.stdout.toString())

& run:

$ node test-cat-2-1.js

Yay, it doesn't throw an error & apparently works!

To be sure, generate a big file, like 128MB:

$ head -c $((128*1024*1024)) < /dev/urandom > 128M

then run:

$ cat 128M | ./cat-2.js > 1
$ cmp 128M 1
$ echo $?
0

cmp exits w/ 0 only when the two files are identical, i.e. everything was fine & no bytes were lost.
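
The same kind of check can be done from a node test as well. A rough sketch, assuming the spawnSync() `maxBuffer` option is raised enough to hold the whole output; `roundTrip` is a hypothetical helper name:

```javascript
var crypto = require('crypto')
var spawnSync = require('child_process').spawnSync

// Hypothetical helper: feed `size` random bytes to a cat-like command &
// report whether exactly the same bytes came back on its stdout.
function roundTrip(cmd, args, size) {
    var input = crypto.randomBytes(size)
    var r = spawnSync(cmd, args, {
        input: input,
        maxBuffer: size + 1024  // room for the whole output plus some slack
    })
    if (r.error) throw r.error
    return r.status === 0 && Buffer.compare(input, r.stdout) === 0
}
```

For example, roundTrip('./cat-2.js', [], 1024*1024) is expected to return true if the wrapper loses nothing. Since both input & output are held in memory, it suits moderate sizes better than the 128MB file above.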


Tags: ойті
Authors: ag