Thursday, September 29, 2022

Flush node's process.stdin

I wrote an interactive program that uses the read package to prompt the user.

But it suffered from type-ahead buffering, particularly at a prompt for confirmation to proceed. An accidental extra <CR> typed before the prompt appeared, while some slow processing was still running, would be consumed as the answer to the prompt and select the default. That wasn't good whether the default was to confirm or to reject.

I needed a way to ensure that only something entered after the prompt would be accepted as a response to the prompt. In other words: defeat the type-ahead by flushing (purging) the stdin buffer before prompting for the confirmation.

For other prompts the type-ahead was fine: these are routine prompts that the user might become very familiar with and answer ahead of time, knowing what they want, despite the slow processing. It just wasn't good for the confirmation.

After much searching I couldn't find a simple way to flush the input buffer.

So, I did it the hard way:

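function flushStdin () {
  return new Promise((resolve) => {
    let n = 0; // consecutive polls that found nothing to read
    const interval = setInterval(() => {
      // In paused mode, read() returns null when nothing is buffered.
      const chunk = process.stdin.read();
      if (chunk === null) {
        // Require several empty polls in a row before declaring it flushed.
        if (++n > 3) {
          clearInterval(interval);
          resolve();
        }
      } else {
        // Discard the chunk and start counting empty polls again.
        n = 0;
      }
    }, 0);
  });
}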

It probably isn't the best way, but it is the best way I have found so far and it seems to work OK. It's human interaction, speed isn't a great concern.
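The same polling idea can be written against any Readable, which makes it possible to try it without a terminal. This is only a sketch: drainReadable and its idlePolls parameter are names made up for illustration, and a PassThrough stream stands in for process.stdin, so it says nothing about how the terminal driver behaves.

```javascript
const { PassThrough } = require('stream');

// Hypothetical generalisation of flushStdin: poll stream.read() until it has
// returned null more than `idlePolls` times in a row, discarding what it reads.
// Resolves with the number of bytes (or characters) that were thrown away.
function drainReadable(stream, idlePolls = 3) {
  return new Promise((resolve) => {
    let emptyPolls = 0;
    let discarded = 0;
    const timer = setInterval(() => {
      const chunk = stream.read();
      if (chunk === null) {
        emptyPolls += 1;
        if (emptyPolls > idlePolls) {
          clearInterval(timer);
          resolve(discarded);
        }
      } else {
        discarded += chunk.length;
        emptyPolls = 0; // something arrived, so start counting again
      }
    }, 0);
  });
}

// Stand-in for type-ahead: two "lines" written before anyone asked for them.
const fakeInput = new PassThrough();
fakeInput.write('\n');
fakeInput.write('y\n');
drainReadable(fakeInput).then((count) => {
  console.log('discarded:', count, 'still buffered:', fakeInput.readableLength);
});
```

Passing process.stdin instead of the PassThrough should behave like flushStdin above, with the same caveat that a fixed number of empty polls is a guess rather than a guarantee.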

I have tried many variations. Most surprising was this:

function flushStdin () {
  return new Promise((resolve, reject) => {
    process.stdin.resume();
    setTimeout(() => {
      resolve();
    }, 10);
  });
}

This works, somewhat. It takes time to flush all the input, so it doesn't work with a timeout of 0, or with no timeout at all (calling resolve synchronously, in the same phase of node's event loop). But wait long enough and all the input will be flushed. The problem is that the time required to flush it all is indeterminate: it depends on how much input is buffered, and I have found no way to check how much data is in the buffer.

I haven't found documentation of this, and I haven't read the node source code to find out what it actually does, so this is my guess, based on observations:

The Linux terminal driver is buffering input. Absent a pending read, the terminal driver will buffer some amount of data. I don't know how much. There must be a limit. Eventually the input must be blocked.

It appears node gets data from the terminal driver in line mode: it reads one line of input and buffers it. So process.stdin.readableLength never sees more than the length of one line of input plus the terminating linefeed. Node doesn't fetch the next line from the terminal driver until the current line has been read.

So, even if there are several lines of input buffered by the terminal driver, it takes several iterations of reading a line and processing it before the buffer is drained. I have found no interface in node for inspecting how much data is in the terminal driver buffer.

It seems odd to me that resuming the input without a data event listener or anything else to read the input actually flushes the input, given sufficient time, but doesn't interfere with subsequent reads (e.g. by readline). Yet it does.

The call to resume causes a resume event to be emitted, followed by a series of readable and data events: one pair of events for each line of input. Contrary to my understanding of what the documentation says, these events are emitted despite there being no listeners for them. Effectively, the buffer is cleared and the data discarded even though nothing read the data. But it takes time.
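The Node stream documentation does describe readable.resume() as a way to fully consume a stream's data without processing it, which matches what is observed here. A small sketch of that behaviour, using a PassThrough stream in place of process.stdin (so it shows only Node's own buffer, not the terminal driver's), with resumeWithoutListener being a hypothetical helper name:

```javascript
const { PassThrough } = require('stream');

// Hypothetical helper: report how much is buffered before and after resume()
// is called on a stream that has no 'data' listener attached.
function resumeWithoutListener(stream) {
  return new Promise((resolve) => {
    const before = stream.readableLength;
    stream.resume(); // switch to flowing mode; nothing is listening
    // Let the event loop turn once so the stream has a chance to flow.
    setImmediate(() => {
      resolve({
        before,
        after: stream.readableLength,
        dataListeners: stream.listenerCount('data'),
      });
    });
  });
}

const typeAhead = new PassThrough();
typeAhead.write('one\n');
typeAhead.write('two\n');
resumeWithoutListener(typeAhead).then((result) => console.log(result));
```

With an in-memory stream a single turn of the event loop is enough. With a terminal, where node seems to fetch one line at a time, the number of turns needed is exactly the indeterminate part.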

If you too are curious, you might try this:

const oldEmitter = process.stdin.emit;

process.stdin.emit = function () {
  console.log(Date.now(), 'emit: ', arguments);
  // Pass the original return value back to whatever called emit.
  return oldEmitter.apply(process.stdin, arguments);
};

This lets you observe the events emitted from process.stdin. It must interfere with them somewhat. It takes time to write to the console. But my understanding is that it is synchronous: at least, the output is written to an output buffer synchronously, even if it doesn't immediately appear on the display (e.g. if the display is connected by a low speed tty).
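If you want to switch the tracing off again, the same trick can be wrapped in a helper that returns a restore function. This is a sketch only: traceEmits and its log parameter are invented names, and it is shown on a plain EventEmitter rather than process.stdin.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: log every emit on `emitter`, preserving emit's return
// value, and hand back a function that puts the original emit back.
function traceEmits(emitter, log = console.log) {
  const originalEmit = emitter.emit;
  emitter.emit = function tracedEmit(eventName, ...args) {
    log(Date.now(), 'emit:', eventName, args);
    return originalEmit.call(this, eventName, ...args);
  };
  return function restore() {
    emitter.emit = originalEmit;
  };
}

const demoEmitter = new EventEmitter();
demoEmitter.on('line', () => {});
const stopTracing = traceEmits(demoEmitter);
demoEmitter.emit('line', 'hello'); // logged
stopTracing();
demoEmitter.emit('line', 'again'); // not logged
```

Calling traceEmits(process.stdin) should give the same kind of output as the snippet above.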

But this is mostly speculation: deductions based on observations of various tests using various methods and properties of process.stdin and, via the read package, what I think is the readline interface.

Node documentation says two seemingly inconsistent things about process.stdin when connected to a TTY:

In TTY it says:

When Node.js detects that it is being run with a text terminal ("TTY") attached, process.stdin will, by default, be initialized as an instance of tty.ReadStream and both process.stdout and process.stderr will, by default, be instances of tty.WriteStream. The preferred method of determining whether Node.js is being run within a TTY context is to check that the value of the process.stdout.isTTY property is true.

But in process.stdin it says:

The process.stdin property returns a stream connected to stdin (fd 0). It is a net.Socket (which is a Duplex stream) unless fd 0 refers to a file, in which case it is a Readable stream.

So, which is it? Is it an instance of tty.ReadStream or an instance of net.Socket? Or do they deem that the returned object is at the same time an instance of both? Is a tty.ReadStream an instance of net.Socket?
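One way to check is to ask node directly. The tty documentation lists tty.ReadStream as extending net.Socket, so the two statements need not conflict. The sketch below tests that relationship and then reports what process.stdin is in the current run; describeStream is a hypothetical helper name, and the process.stdin result depends on whether stdin is a terminal, a pipe, or a file.

```javascript
const net = require('net');
const tty = require('tty');

// Is tty.ReadStream a subclass of net.Socket?
const ttyReadStreamIsSocket = tty.ReadStream.prototype instanceof net.Socket;
console.log('tty.ReadStream extends net.Socket:', ttyReadStreamIsSocket);

// Hypothetical helper: summarise what kind of object a stream is.
function describeStream(stream) {
  return {
    isTTY: Boolean(stream.isTTY),
    isTtyReadStream: stream instanceof tty.ReadStream,
    isNetSocket: stream instanceof net.Socket,
  };
}

// The answer varies: run this from a terminal, then again with input piped in.
console.log('process.stdin:', describeStream(process.stdin));
```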

 
