Thursday, September 29, 2022

Flush node's process.stdin

I wrote an interactive program. It uses read to prompt the user.

But it suffered from type ahead buffering, particularly at a prompt for confirmation to proceed. An accidental extra <CR> entered before the prompt, while waiting for some slow processing to occur, would be processed after the prompt and select the default, which wasn't good whether the default was to confirm or reject.

I needed a way to ensure that only something entered after the prompt would be accepted as a response to the prompt. In other words: defeat the type ahead - flush / purge the stdin buffer before prompting for the confirmation.

For other prompts the type ahead was fine: a user who is very familiar with a set of routine prompts might type ahead, knowing what they want, despite the slow processing. But it is not so good for the confirmation.

After much searching I couldn't find a simple way to flush the input buffer.

So, I did it the hard way:

function flushStdin () {
  return new Promise((resolve, reject) => {
    let n = 0;
    const interval = setInterval(() => {
      const chunk = process.stdin.read();
      if (chunk === null) {
        if (++n > 3) {
          clearInterval(interval);
          resolve();
        }
      } else {
        n = 0;
      }
    }, 0);
  });
}

It probably isn't the best way, but it is the best way I have found so far and it seems to work OK. It's human interaction, speed isn't a great concern.
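To show where the flush fits, here is a minimal sketch of a confirmation prompt that discards type-ahead before asking. It assumes Node's built-in readline rather than the read package, and the name confirmAfterFlush and its parameters are hypothetical. The flush argument is any function returning a promise, such as flushStdin above.

```javascript
'use strict';
const readline = require('readline');

// Ask a yes/no question, discarding type-ahead first.
// `flush` is any function returning a promise that resolves once pending
// input has been discarded (for example, flushStdin above).
// `input` and `output` default to the terminal streams; they are parameters
// only so that the sketch can be exercised with other streams.
async function confirmAfterFlush (question, flush, input = process.stdin, output = process.stdout) {
  await flush(); // drop anything typed before the prompt appears
  const rl = readline.createInterface({ input, output });
  try {
    const answer = await new Promise(resolve => rl.question(question, resolve));
    // Only an explicit y or yes confirms; an empty line counts as no
    return /^y(es)?$/i.test(answer.trim());
  } finally {
    rl.close();
  }
}
```

Usage would be something like: const ok = await confirmAfterFlush('Proceed? (y/N) ', flushStdin). Requiring an explicit answer means a stray <CR> that survives the flush rejects rather than confirms.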

 I have tried many variations. Most surprising was this:

function flushStdin () {
  return new Promise((resolve, reject) => {
    process.stdin.resume();
    setTimeout(() => {
      resolve();
    }, 10);
  });
}

This works somewhat, but it takes time to flush all the input. It doesn't work with a timeout of 0, or with no timeout at all (calling resolve synchronously, in the same phase of node's event loop). But wait long enough and all the input will be flushed. The problem is that the time required to flush it all is indeterminate: it depends on how much input is buffered, and I have found no way to check how much data is in the buffer.

I haven't found documentation of it and I haven't read the node source code to find out what it actually does, but my guess, based on observations, is:

The Linux terminal driver is buffering input. Absent a pending read, the terminal driver will buffer some amount of data. I don't know how much. There must be a limit. Eventually the input must be blocked.

It appears node gets data from the terminal driver in line mode: it reads one line of input and buffers it. So process.stdin.readableLength never sees more than the length of one line of input plus the terminating linefeed. Node doesn't fetch the next line from the terminal driver until the current line has been read.

So, even if there are several lines of input buffered by the terminal driver, it takes several iterations of reading a line and processing the line before the buffer is drained and I have found no interface in node for inspecting how much data is in the terminal driver buffer.

It seems odd to me that resuming the input without a data event listener or anything else to read the input actually flushes the input, given sufficient time, but doesn't interfere with subsequent reads (e.g. by readline). Yet it does.

The call to resume causes a resume event to be emitted, followed by a series of readable and data events: one pair of events for each line of input and, contrary to my understanding of what the documentation says, these events are emitted despite there being no listeners for them. Effectively, the buffer is cleared and the data discarded even though nothing read the data. But it takes time.

If you too are curious, you might try this:

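const oldEmitter = process.stdin.emit;

process.stdin.emit = function () {
  // Log every event with a timestamp, then pass it on unchanged
  console.log(Date.now(), 'emit: ', arguments);
  // Return the original result so callers still learn whether there were listeners
  return oldEmitter.apply(process.stdin, arguments);
};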

This lets you observe the events emitted from process.stdin. It must interfere with them somewhat. It takes time to write to the console. But my understanding is that it is synchronous: at least, the output is written to an output buffer synchronously, even if it doesn't immediately appear on the display (e.g. if the display is connected by a low speed tty).

But this is mostly speculation: deductions based on observations of various tests using various methods and properties of process.stdin and, via the read package, I think the readline interface.

Node documentation says two seemingly inconsistent things about process.stdin when connected to a TTY:

In TTY it says:

When Node.js detects that it is being run with a text terminal ("TTY") attached, process.stdin will, by default, be initialized as an instance of tty.ReadStream and both process.stdout and process.stderr will, by default, be instances of tty.WriteStream. The preferred method of determining whether Node.js is being run within a TTY context is to check that the value of the process.stdout.isTTY property is true.

But in process.stdin it says:

The process.stdin property returns a stream connected to stdin (fd 0). It is a net.Socket (which is a Duplex stream) unless fd 0 refers to a file, in which case it is a Readable stream.

So, which is it? Is it an instance of tty.ReadStream or an instance of net.Socket? Or do they deem that the returned object is at the same time an instance of both? Is a tty.ReadStream an instance of net.Socket?
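One way to check this empirically is to ask node itself. The sketch below uses only the built-in net and tty modules; the function names are hypothetical. The TTY documentation lists tty.ReadStream as extending net.Socket, which would make both statements true at once, and the first function tests exactly that.

```javascript
'use strict';
const net = require('net');
const tty = require('tty');

// True if tty.ReadStream inherits from net.Socket
function ttyReadStreamIsSocket () {
  return tty.ReadStream.prototype instanceof net.Socket;
}

// Report which of the two classes a given stream is an instance of
function describeStream (stream) {
  return {
    isTTY: Boolean(stream.isTTY),
    ttyReadStream: stream instanceof tty.ReadStream,
    netSocket: stream instanceof net.Socket
  };
}
```

Running console.log(ttyReadStreamIsSocket(), describeStream(process.stdin)) from a terminal, and again with input redirected from a file, shows how the answer depends on what fd 0 refers to.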

 

Wednesday, June 15, 2022

Linux duplex scanning

I have been using XSane for scanning but it seems not to have an option for duplex batch scanning.

I found this gist. It didn't quite work but I adapted it to work with my scanner:

#!/bin/bash
scanimage -d 'brother4:net1;dev0' --batch --batch-double
read -p "Flip papers in the feeder, last page on top, press Enter..."
endpage=$(echo "$(ls *.pnm -1 | wc -l) * 2" | bc)
scanimage -d 'brother4:net1;dev0' --batch --batch-increment -2 --batch-start $endpage
for file in *.pnm; do convert $file $file.jpg; done
rm *.pnm
convert $(ls *.jpg -1v | paste) -compress jpeg -page A4 output.pdf
rm *.jpg
 

This works, but the scanner device is specific to my setup. To find the scanner I ran: scanimage -L

Otherwise, it would be better if it created a temporary directory, put all the files in it then copied only the output file back to current directory and deleted the temporary directory. But in the meantime, this works and is easier than struggling with the X(In-)Sane UI.

I had to change ImageMagick policy to allow writing the PDF using convert: in /etc/ImageMagick-6/policy.xml I changed "none" to "write" in the line:   <policy domain="coder" rights="write" pattern="PDF" />

This allows 'convert' (and presumably other commands of the ImageMagick set) to write PDF files but not to read them. Reading them is a security concern but writing them should be safe enough.

Friday, June 3, 2022

Calibre-Web epub url churn

I am using calibre-web to view epub documents, with Firefox plugin zhongwen for lookup of Chinese characters.

The zhongwen plugin selects text matching dictionary entries as the cursor moves over the document and opens a popup to display the matching definitions.

The published zhongwen plugin doesn't work with epub documents displayed by calibre-web because it doesn't handle the iframe that the epub document appears in, so I am using a modified version of the plugin: ig3/zhongwen.

This modified version of the plugin works. I can view Chinese epub documents with lookup of the Chinese characters, which is very helpful for my Chinese study.

However, every time the zhongwen plugin changes the selected text, the epub reader updates the URL with a new epubcfi that specifies the selected text. This churn of the URL results in many browser history entries. When I look for something in my browser history, I have to scroll over thousands of these history entries to find what I want.

I don't want a browser history entry every time zhongwen selects text.

With a bit of poking about in the calibre-web source, I found that it uses the Futurepress reader and epub.js libraries to render epub documents.

The epub.js library emits events when selection changes but doesn't have code that would update browser history.

The reader does have code that updates browser history on selection change.

Looking at ~/.local/lib/python3.7/site-packages/calibreweb/cps/static/js/libs/reader.min.js I see "history:!0", which appears to be setting default value of the "history" option which determines if history is updated.

I changed "history:!0" to "history:!1", reloaded the epub document and now the URL is not changed when zhongwen plugin selects text.

But now there are no updates to the URL at all: not even when I move to a different page. So I have gone from too much history to too little.

So, I grabbed a copy of the non-minified reader and edited the function that updates the URL to:

EPUBJS.Reader.prototype.selectedRange = function(cfiRange){
  var cfiFragment = "#"+cfiRange;

  // Update the History Location
  if(this.settings.history &&
      window.location.hash != cfiFragment &&
      cfiFragment.indexOf(',') === -1
  ) {
    // Add CFI fragment to the history
    history.pushState({}, '', cfiFragment);
    this.currentLocationCfi = cfiRange;
  }
};

The addition is "cfiFragment.indexOf(',') === -1", which prevents the update if the fragment includes a comma, which it does if it is a "RANGE" of text.

With this change, I get URL updates / history when I use the page navigation but no URL updates / history when text is selected. 
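The decision made in that function can be isolated and tried outside the browser. This is only a sketch: the function names are hypothetical, and the range example is a made-up CFI in the parent,start,end form. A comma can in principle also appear inside a bracketed assertion, so the comma test is a heuristic rather than a CFI parser.

```javascript
'use strict';

// Heuristic: a CFI range has the form epubcfi(parent,start,end),
// so a comma suggests a text selection rather than a page position.
function isRangeFragment (cfiFragment) {
  return cfiFragment.indexOf(',') !== -1;
}

// Mirrors the condition in the edited selectedRange function
function shouldPushHistory (historyEnabled, currentHash, cfiFragment) {
  return Boolean(historyEnabled) &&
    currentHash !== cfiFragment &&
    !isRangeFragment(cfiFragment);
}

// A position, as produced by page navigation
const position = '#epubcfi(/6/8[id_4]!/4/2/1:0)';
// A hypothetical range, as produced by a text selection
const range = '#epubcfi(/6/8[id_4]!/4/2,/1:0,/1:5)';

console.log(shouldPushHistory(true, '', position)); // true
console.log(shouldPushHistory(true, '', range)); // false
```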

This seems a reasonable compromise and, thus far, everything works. The change hasn't broken anything that I have noticed.

 

Thursday, May 26, 2022

Janne Schra

The world is full of hard working, talented people who don't get the attention they deserve. Meanwhile, there are Tik Tok and OnlyFans taking people's time and money for trash.

Just one example: I was listening to an old music selection today: Janne Schra: Different. I enjoyed it so I looked her up and listened to / watched several of her YouTube videos. I enjoyed them all. A lot of work went into them - a lot of talent. Yet they had only a few hundred views each.

She collaborated with M. Ward. I liked most of what he posted to YouTube also. They too had almost no views.

YouTube's algorithms work. Alphabet makes a lot of money. But they don't promote what is good. They merely monetize us and our reactions to triggers: titles and thumbnails.

Friday, May 20, 2022

Greg Ellis - The Respondent

I just watched this livestream discussion with Greg Ellis:

 
He speaks calmly but persuasively. 

See his website: The Respondent

And his YouTube channel: The Respondent


Tuesday, April 19, 2022

Anki Backlog Hell

 People have written about Anki Ease Hell:

But there is another kind of hell with Anki, which I call Backlog Hell.

Backlog Hell results from two features of Anki:

  1. Cards are presented in the order they are due
  2. New cards are presented regardless of backlog

When you have a backlog of due cards, you do not see them promptly when they are due. This isn't an issue with a small backlog but becomes an issue with a larger backlog.

Consider what happens if you are reviewing about 1,000 cards per day and you take a break from study for a week. You may take a break because you have exams, vacation, work or illness. It doesn't matter why. When you return, how many cards do you have due? After one week, you will have about 7,000 cards due for review. Maybe a bit less: some of those cards are short interval.

You normally are reviewing 1,000 cards per day. Maybe this takes you 30 minutes per day (about 2 seconds per review if they are quick) or maybe 2 hours per day (about 7 seconds per review if you take a bit more time on each card).

To get through your backlog, you might be able to double your study time. If you do, you can review 2,000 cards per day instead of your usual 1,000 cards. You should clear your backlog quickly, right? In one week, you should have cleared your backlog, right?

Wrong! It will take much more than 1 week to clear your backlog. It might never happen. Instead, your backlog might get larger every day.

If you know your cards well (i.e. they are almost always Good or Easy, even if you don't see them promptly) then you will clear your backlog.

If a card has an interval of 100 days and you see it after 107 days, it doesn't make much difference. If it was going to be Good at 100 days, it will probably be Good at 107 days. You will clear these from your backlog. Even if you don't get to it the first day it is due. Seeing them after 114 or 120 days isn't too much different from 100 days. Not optimum, but often you will still remember.

But what about the cards that you were just learning: the majority of your daily review? What happens to a card that you should have reviewed after 7 days? You will not see it again for at least 14 days, maybe longer. Will it still be Good after two or three weeks?

You end up clicking Again. You should see these cards again soon. But you still have a backlog, so you don't see it until you have seen all the other cards in your backlog. Maybe not for a few days or weeks.

You continue reviewing cards long after they were due. Can you recall them, long after they were due? Sure, sometimes you will. But if the scheduling algorithm is good, seeing them long after they were due will not be as effective as seeing them when they were due: the whole point of the scheduler is to present them at the optimum interval: shorter or longer intervals are sub-optimum.

So, just when you need to optimize your study: to clear the backlog as quickly as possible - Anki goes into a sub-optimal mode.

If your backlog is too big, it will be so sub-optimal that you can't catch up.
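The arithmetic can be sketched with a toy model. This is not Anki's scheduler: it assumes a fixed number of cards falling due each day, a fixed review capacity, and a fixed fraction of reviews that fail and go straight back into the queue. The function and parameter names are hypothetical, and new cards are ignored.

```javascript
'use strict';

// Toy model of clearing a review backlog (not Anki's actual algorithm).
// Returns the first day on which everything due fits in one session,
// or Infinity if that never happens within maxDays.
function daysToClear ({ backlog, duePerDay, reviewsPerDay, lapseRate, maxDays = 3650 }) {
  let pending = backlog;
  for (let day = 1; day <= maxDays; day++) {
    pending += duePerDay; // cards that fall due today join the queue
    if (pending <= reviewsPerDay) return day; // caught up
    // A full session is reviewed; the failed cards rejoin the queue
    const lapsed = Math.round(reviewsPerDay * lapseRate);
    pending = pending - reviewsPerDay + lapsed;
  }
  return Infinity;
}

const base = { backlog: 7000, duePerDay: 1000, reviewsPerDay: 2000 };
console.log(daysToClear({ ...base, lapseRate: 0 })); // 7
console.log(daysToClear({ ...base, lapseRate: 0.25 })); // 13
console.log(daysToClear({ ...base, lapseRate: 0.5 })); // Infinity
```

In this model, doubling the study time clears a 7,000 card backlog in a week only if nothing is forgotten. If a quarter of the late reviews fail it takes nearly two weeks, and if half of them fail the backlog never shrinks.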

And Anki adds new cards every day. What happens to a new card added today? You should see it again in a few minutes, or maybe tomorrow. But when do you see it?

Because Anki presents cards according to due date and all the cards in your backlog were due before today, you will not see that new card again until you have reviewed all the cards in your backlog. If you have 7,000 cards in your backlog and you can review 2,000 per day, that's over 3 days until you can see that card again.

Can you learn your new cards if you only see them once every 3 days? How about once a week? Or once a month? If I wasn't doing anything else, maybe I could. But when I am reviewing 2,000 cards per day, I can't. It comes up again in a few days but I should have seen it again in 10 minutes. I can't remember it. So I click Again. It is due in 10 minutes but there are still 7,000 cards that I have to review first, before I will see it again. And I can't review 7,000 cards in 10 minutes, so I see it late again: days late.

And still Anki makes the problem worse by adding more new cards every day.

Seeing cards significantly after their due date, your learning will be sub-optimal.  With a small backlog: no problem. With a large backlog: you will never catch up. You will study more and more but see your backlog get a bit bigger each day. Remember: you were studying 1,000 cards per day. Some of these were short interval cards but some were long interval cards: there will be more of these added to your backlog each day. 

Eventually, you will stop learning almost completely: forever reviewing but never progressing: a Sisyphean torment: Backlog Hell.

Every time you see a card that you can't remember, because you haven't seen it in too long and you are overloaded, you click Again. Doing that a few times sends you into Ease Hell.

Ease Hell is bad enough, but combine it with Backlog Hell and you are doomed to failure. Your backlog will grow. You will never catch up. You will spend more and more time to learn less and less. When you don't see your cards on time: when you see your new cards days late, you will not learn effectively.

How to escape Backlog Hell?

First, manually reduce your new cards per day to 0. There is no point adding to the problem. You can start learning new cards again after you clear your backlog.

Then, study only a subset of cards each day. You can study select decks or create custom decks. Your objective should be to review cards on time. You need to optimize your learning to get through the backlog.

It is better to review some cards on time and learn them than it is to review all your cards (or as many of them as you can review in a day) days or weeks late and not learn any of them.

With no new cards being added, if you continue learning effectively you should eventually clear your backlog.

Alternatively, you can set all the cards in your backlog back to new. Then they will be re-introduced gradually, according to your new card limit.

But, do you really want to set 7,000 cards back to new? That's a tremendous setback: another torment.

I ended up giving up on Anki. I didn't want to reset all my cards to new or other manual workarounds, and I didn't want to forever be in Ease Hell and Backlog Hell. I wanted to learn and I wanted to spend my time learning, not fussing with Anki. I couldn't fix the Anki scheduler so I wrote srf. I still take occasional breaks from study and have backlogs. It still takes time to clear a backlog, but I make reliable progress and I don't have to fuss with the scheduler: it just works. 

On the other hand, the srf scheduler is easy to change if I want to: there are parameters for minor adjustments and it is just some JavaScript and SQL queries, if I want to make more fundamental changes. I have changed it several times since first writing srf. It is quite different from Anki in its details, though similar in its principles and objectives. It is now settled to something I am quite comfortable with and a vast improvement over the Anki scheduler.

Wednesday, April 13, 2022

Testing Express based apps

All of the examples of automated testing Express based apps that I have found use supertest. But supertest is based on superagent and I don't like the API of superagent and, therefore, supertest.

But it is not necessary to use supertest. All it does is run an instance of the app and submit queries to it, then muddle assertions with the HTTP client. It is easy, more flexible and cleaner to run the app directly, then use whatever means you like to submit requests and make assertions: preferably from independent packages.

For example:


'use strict';

const t = require('tape');
const app = require('../app');
const http = require('http');
const axios = require('axios');

const server = http.createServer(app);
server.listen(0, '127.0.0.1');
server.on('listening', () => {
  const addr = server.address();
  const address = addr.address;
  const port = addr.port;
  console.log('addr: ', addr);
  console.log('listening on ' + address + ':' + port);
  runTests(port)
  .catch(err => {
    console.log('tests failed with: ', err);
  })
  .finally(() => {
    // Close the server whether the tests passed or failed, so the process can exit
    server.close();
  });
});

function runTests (port) {
  return new Promise((resolve, reject) => {
    const request = axios.create({
      // Use the address the server is bound to: localhost may resolve to ::1
      baseURL: 'http://127.0.0.1:' + port + '/'
    });

    t.test('GET / returns 200', (t) => {
      request({
        url: '/',
      })
      .then(res => {
        t.equal(res.status, 200, 'Status is 200');
      })
      .catch(err => {
        t.fail(err);
      })
      .finally(() => {
        t.end();
      });
    });
 

    t.onFinish(() => {
      resolve();
    });

    t.onFailure(() => {
      reject(new Error('something went wrong'));
    });
  });
}


This is simple enough: almost as simple as using supertest, and it has advantages:

  • Use any http client code you like
  • Use any assertion library you like
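To illustrate the two points above, here is a sketch of the same idea with no test or HTTP client packages at all: only node's built-in http module. The helper name getStatus is hypothetical. It relies on an Express app being a request handler function that can be passed to http.createServer, as in the example above.

```javascript
'use strict';
const http = require('http');

// Start `app` on an ephemeral port, GET `path`, then shut the server down.
// Resolves with the HTTP status code of the response.
function getStatus (app, path) {
  return new Promise((resolve, reject) => {
    const server = http.createServer(app);
    server.on('error', reject);
    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address();
      // agent: false avoids keep-alive, so the server can close promptly
      http.get({ host: '127.0.0.1', port, path, agent: false }, res => {
        res.resume(); // discard the body
        res.on('end', () => server.close(() => resolve(res.statusCode)));
      }).on('error', err => server.close(() => reject(err)));
    });
  });
}
```

With this, a test is one line with any assertion library, for example node's own: assert.strictEqual(await getStatus(app, '/'), 200).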

 

Thursday, March 24, 2022

Recording audio that is playing

I sometimes want to record what is playing on my system. Not from the microphone, but what is playing to the headphones or speakers.

I am running Debian Bullseye. It has a particular configuration of ALSA and PulseAudio that makes it easy.

The command 'pactl list' lists PulseAudio devices, including all the sources.

On my system, one of those sources has the name 'alsa_output.pci-0000_00_1f.3-platform-skl_hda_dsp_generic.HiFi__hw_sofhdadsp__sink.monitor' with description 'Monitor of Tiger Lake-LP Smart Sound Technology Audio Controller Speaker + Headphones'. I see this description on the mixer Input Devices tab and it seems to correspond to what is playing on the headphones or speakers.

 arecord -f cd -t wav --max-file-time 3600 --device=pulse:alsa_output.pci-0000_00_1f.3-platform-skl_hda_dsp_generic.HiFi__hw_sofhdadsp__sink.monitor mon.wav

I suspect that the form of device name (`pulse:xxx`) depends on the configuration in /etc/alsa/conf.d/50-pulseaudio.conf:

# Add a specific named PulseAudio pcm and ctl (typically useful for testing)

pcm.pulse {
    @args [ DEVICE ]
    @args.DEVICE {
        type string
        default ""
    }
    type pulse
    device $DEVICE
    hint {
        show {
            @func refer
            name defaults.namehint.basic
        }
        description "PulseAudio Sound Server"
    }
}

ctl.pulse {
    @args [ DEVICE ]
    @args.DEVICE {
        type string
        default ""
    }
    type pulse
    device $DEVICE
}


I think this takes the xxx from 'pulse:xxx' as a parameter and allows access to the PulseAudio device by name.



Wednesday, March 16, 2022

Thunderbird keyboard shortcut customization

I am running Thunderbird 91.4.

I don't like that Ctrl-V pastes with formatting by default.

I installed the tbkeys-lite add-on and searched for two hours for any examples of configuration of the compose section. Didn't find a single example, and very few examples for the main section.

Issue 80 asked for examples and there was 1 reply.

http://kv5r.com/computers/thunderbird-paste-without-formatting/

Thunderbird Paste Without Formatting looks like it's relevant, but it is for an old version of Thunderbird - the files it references don't exist in my installation.

Finally, I tried the following and, fortunately, it worked:

{
  "ctrl+v": "cmd:cmd_pasteNoFormatting",
  "ctrl+shift+v": "cmd:cmd_paste"
}

With this, Ctrl-V pastes without formatting and Ctrl-Shift-V pastes with formatting.

Monday, February 28, 2022

npm install -g

As of about npm v5.0.0, `npm install <directory>` no longer copies the package from the directory: it creates a symbolic link to the directory, much as `npm link` does. This is fine in many cases, but `npm link` already existed and sometimes what one wants to do is install, not link.

For example, if a locally developed package is being installed globally so that all users can run a command it provides, it is generally best to install the package globally rather than link it, so that it is independent of the directory it was installed from.

Fortunately, it is not too difficult to overcome the elimination of this install option:

  • cd <package-directory>
  • npm pack
  • npm install -g <package>-<version>.tgz

This works because when installing from a tgz file, `npm install` still installs rather than linking.
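One detail of the steps above: the tarball is only named `<package>-<version>.tgz` for unscoped packages. As far as I know, `npm pack` drops the leading `@` of a scoped name and replaces the `/` with `-`. The sketch below reproduces that naming so a script can predict the file name; the function name and the package names in the examples are hypothetical.

```javascript
'use strict';

// Predict the file name written by `npm pack` (assumed naming rule:
// strip the leading @ of a scope and replace the / with -)
function tarballName (name, version) {
  return `${name}-${version}.tgz`.replace(/^@/, '').replace(/\//, '-');
}

console.log(tarballName('mytool', '1.2.3')); // mytool-1.2.3.tgz
console.log(tarballName('@myscope/mytool', '1.2.3')); // myscope-mytool-1.2.3.tgz
```

`npm pack` also prints the name of the file it wrote, so a script can capture that instead of computing it.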

Saturday, February 26, 2022

Worthless Guarantees from AliExpress and Netac Official Store

 I purchased an NVME SSD from Netac Official Store via AliExpress.

It is offered with "75-Day Buyer Protection Money Back Guarantee":

We promise your money back if the item you received is not as described, or if your item is not delivered within the Buyer Protection period. You can get a refund 15 days after the claim process finishes. This guarantee is in addition to and does not limit your statutory rights as a consumer, as granted by all mandatory laws and regulations applicable in your country of residence.

The package was delivered via postal service.

My local post office delivered the package to the wrong address and updated status to delivered.

I contacted my local post office and they confirmed that they delivered the package to the wrong address. They attempted to recover the package and deliver it to me but were unable to do so. They deemed the package to be lost.

My local post office cannot compensate me for the loss but they are willing to pay compensation for the loss if the sender makes a claim through the origin post office.

But AliExpress and the Netac Official Store do not accept the emails I have forwarded from my post office confirming that they lost the package and did not deliver it to me.

They deem the communications from my local post office insufficient: "The evidence from NZPOST is not so clear to show the problems, please provide more clear evidence as required so that we can tell the problems obviously"

This is what NZ Post said:

Thank you for your patience. 

Sadly, your parcel has been delivered incorrectly, despite our attempts to recover this parcel from the incorrect address we have been unsuccessful. We have deemed this parcel to be missing. 

We recommend you contact the sender and request they initiate an enquiry/claim with the postal authority of the sending country. 

Please note claims for compensation for the loss or return of an international item can only be made by the sender from the country of origin. New Zealand Post has no provision for compensation for incoming international items. 

Our sincere apologies for the inconvenience caused by this incident.

How is this not clear? They made a mistake. They admit it. They are willing to pay compensation through the normal channels for international shipping.

International shipping is fraught with human error and acts of god that cause losses. Ships sink. Shipping containers go overboard. Warehouses go up in flames. All sorts of things. 

I don't mind that there was an error.

But obtuse refusal to accept the clear acceptance of fault and liability, to obtain the compensation offered and to send a replacement or refund is offensive.

AliExpress' and Netac Official Store's guarantees are worth nothing because they are in China and the guarantees are unenforceable from New Zealand, and neither AliExpress nor Netac Official Store are willing to honour them.

Update: NZ Post has changed their story. Now they say the package was scanned "near" my address and they don't know if it was delivered to my address and stolen or delivered to some other address. They maintain that in either case the package was "delivered" so they refuse to change the status in their tracking system. Still AliExpress is at fault for the delivery failure: they sent a valuable package with no signature required. Had a signature been required, the package would not have been left where it could be stolen. I would have gone in to pick it up and sign for it.

AliExpress has my money and I have nothing, because they were negligent in how they sent it: by not requiring a signature, they allowed it to go missing rather than reach me. But from their perspective, all they have to say is: "The evidence from NZPOST is not so clear to show the problems,please provide more clear evidence as required so that we can tell the problems obviously"

The problem is: they have my money and I have nothing. But they don't care about that.

Saturday, February 19, 2022

Limitations of Calibre and Calibre-Web as an epub book viewer

When viewing an epub book using Calibre, one's position in the book is indicated as percent, in the bottom right corner of the reader window. This is OK for a small book but it does not show fractions of a percent and therefore is the same for many pages of a longer book. It does have the virtue that it is independent of the size of the reader window. Calibre maintains a constant position when the window is resized: at least some of the text is common between the larger and smaller window.

The Calibre-Web reading window is worse: it doesn't indicate position in the book at all. This is made worse by the fact that if one resizes the reader window the reader jumps to a different position in the book. If one alternates between a small and large window, just flipping back and forth, position gradually moves to the beginning of the current chapter. There is no practical way to get back to the original position except to go back to the start of the chapter then scan forward to the desired location. This is very poor UX.

In the case of Calibre-Web, the behaviour is a result of using epubjs to display the book. I have made a small test app that uses default configuration and it exhibits the same faults: no indication of position and position jumps when window is resized.

The epubjs package has many options. There may be options to display position and to maintain position when window is resized, but I haven't found them yet. I tried capturing and repositioning after resize, but it is a bit of a nightmare of rapid-fire events and deciding when to save and when to restore position. There does not appear to be any option to disable the automatic redisplay on window size change and there is essentially no documentation of the implementation.

As with so much software these days, the documentation of epubjs provides the syntax of the API but very little to nothing regarding the semantics.

There are some clues about getting page information in issue 744. None of this defines what is meant by "page" but most of the comments imply that what is meant is: what is presented in the reader window.

Calibre-Web has an inbuilt table of contents for epub books but if it is opened the right side of the content goes off-screen with no scrollbar. It is impossible to read the text with the table of contents open.

Calibre-Web allows bookmarks to be saved but the presentation of them is as a cryptic string like "epubcfi(/6/8[id_4]!/4/2/1:0)". This is a standard epub cfi but it is not human friendly. Given a list of these, how would a normal person know which refers to what? There is no way to annotate them. In contrast, a bookmark in Firefox can be edited to change the title, icon, etc.

The bookmarks persist between browser sessions.  And they persist through clearing cookies and site data. This suggests that they are stored on the server. But presumably not in the book itself as each user could have different bookmarks.

It appears that Calibre-Web has its own database, in addition to the Calibre database it accesses. Default on Linux is ~/.calibre-web/app.db. This includes table 'bookmark' with fields user_id, book_id, format and bookmark_key, the latter two being epub and an epubcfi string. So, bookmarks are per user, per book and persistent across devices and sessions.

Both epub2 and epub3 have support for page lists but various posts suggest that almost no epub2 and few epub3 readers actually use them.

The navigation file provides some guidance on navigation for epub3.

A book might be published in hard copy, with a particular layout of pages. The same book in epub format will break the text into different sections, depending on font, window size, etc. One might be interested in what page of the original hard copy publication is being viewed, regardless of how many reader windows it takes to view the complete page, or one might be interested in pages as determined by the reader window: one reader window full is one page.

Is it possible to position the reader window at an arbitrary position in the text? If so, then what 'page' is the reader at when the displayed text starts at the second character of the book? Or the third? etc.

If the definition of "page" depends on the size of the reader window, font, screen resolution for displaying images, etc. then page number is only relevant in the current reader window context. If one reads the same book on a different system, with a different reader size, with a different font size, etc. then the page numbers will all be different. "Page 237" will contain different text depending on all these (and probably other) factors.

Counting characters might be more consistent: current display starts at character "1073648 of 27634287", for example. But this isn't very human friendly.

Pages are a familiar concept in paper books, but what do they mean in an e-book?

How does one return to the same position in the book, when one re-opens it?

How does one refer to a part of the book in computer friendly terms (where character offset might be fine) and human friendly terms?

Is it possible to make a fixed page list that is independent of the reader window size, screen resolution, font size, etc.? 

There is a good discussion of epub3 page lists at epubsecrets. This includes use cases where consistency across different media formats, independent of the individual reader details, is useful.

The epub3 spec includes a page-list nav element.

epub3 has support for fixed layout documents, with the introduction pointing out that by default epub3 documents are intended to adapt to the reader with reflow, etc.

Firefox / Librewolf unsigned add-ons

In short:

Enable unsigned add-ons by setting xpinstall.signatures.required to false in about:config.

Make sure the add-on manifest.json has an id, as in the following:

    "browser_specific_settings": {
      "gecko": {
        "id": "zhongwen@example.org"
      }
    },

Zip the add-on into an 'xpi' file:

    $ zip -r -FS ../zhongwen.xpi * --exclude '*.git*'

Then install the add-on from the about:addons page.

 

But it took me an unreasonably long time to learn to do this because Mozilla doesn't document it very well.

Mozilla has various guides to developing add-ons but they are all oriented towards having them signed by Mozilla. They say that it is possible to install unsigned add-ons to select versions of Firefox but give only hints about what is required.

Some add-ons that install successfully as a temporary add-on, via about:debugging, cannot be installed permanently as an unsigned add-on, with the unhelpful message:

Installation aborted because the add-on appears to be corrupt.

They could have omitted "because..." - it would have made the message no less informative.

Mozilla support gets reports of the message but offers no explanation. Multiple reports, with all sorts of complex details but no overview of how an add-on might be corrupt or how to diagnose the problem, just specific trial and error advice. Mozilla is getting to be as bad as Microsoft.

The browser console has more information:

1645324229834 addons.xpi WARN Invalid XPI: Error: Cannot find id for addon /home/ian/dev/zhongwen.xpi(resource://gre/modules/addons/XPIInstall.jsm:1531:19) JS Stack trace: loadManifest@XPIInstall.jsm:1531:19

Why isn't the error message to the user: Installation aborted because the add-on does not have an id? Or some such? Nonetheless, if one jumps through enough esoteric hoops, the information is available.

So, how to provide an ID? 

This page gives some hints. 

The workshop gives some hints. Mostly it tells you when you don't have to set an ID explicitly and when you do. At the very end (did you read all the way through the irrelevant details to the last sentence?) it says:

See browser_specific_settings in manifest.json for the syntax of setting the extension ID.

Could they have made it any less obvious? I don't think so. To make it this obscure, one would have to be deliberately trying to make it difficult to succeed in any way other than having the add-on signed by Mozilla, and one would have to develop and refine the obscurity with diligence.

In any case, the documentation of browser_specific_settings describes the id, including the two supported formats.

So, I edited manifest.json of my plugin, adding:

    "browser_specific_settings": {
      "gecko": {
        "id": "zhongwen@example.org"
      }
    },

They don't say anything more about this format than: a string formatted like an email address, and the guidance that if it is a real email address it will attract spam. This suggests that it doesn't have to be a real email address: it just has to have the format of an email address. There is nothing about whether or how the address is used or validated, uniqueness constraints or anything else. Really, it just describes the syntax and that is all.
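Since a missing id only shows up in the browser console, it can be checked before zipping. Here is a minimal sketch. The function name findGeckoId is hypothetical, and the two patterns are my approximation of the two formats (a string shaped like an email address and, to my understanding, a GUID in braces); they are assumptions for illustration, not Firefox's actual validation.

```javascript
// Approximate patterns for the two id formats; these are assumptions for
// illustration, not Firefox's actual validation rules.
const EMAIL_LIKE = /^[A-Za-z0-9._-]*@[A-Za-z0-9._-]+$/;
const GUID_LIKE = /^\{[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\}$/;

// Return the gecko id from a parsed manifest, or null if it is missing
// or matches neither pattern.
function findGeckoId (manifest) {
  const id = manifest?.browser_specific_settings?.gecko?.id;
  if (typeof id !== 'string') return null;
  return (EMAIL_LIKE.test(id) || GUID_LIKE.test(id)) ? id : null;
}

const manifest = JSON.parse(`{
  "name": "zhongwen",
  "browser_specific_settings": {
    "gecko": { "id": "zhongwen@example.org" }
  }
}`);

console.log(findGeckoId(manifest));               // the id from the manifest
console.log(findGeckoId({ name: 'no id here' })); // null: no id present
```

To check a real add-on, the manifest could be loaded with JSON.parse(fs.readFileSync('manifest.json', 'utf8')) instead of the inline string.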

I then zipped the contents of the add-on directory, excluding the .git directory, to an xpi file (which is just a zip file with an unconventional extension):

    $ zip -r -FS ../zhongwen.xpi * --exclude '*.git*'

I also set 'xpinstall.signatures.required' to false via about:config.

I was then able to install the add-on.

Why do they need an ID when they don't need it to install temporarily? Obviously they don't need it, otherwise they would need it to install the add-on temporarily. It is an arbitrary restriction, making installation more difficult without adding any obvious value and possibly without adding any value at all. One more brick in the wall.

I have installed the add-on to Librewolf and a Nightly build of Firefox. It seems to work OK.

Zhongwen

Zhongwen is an add-on for Chrome and Firefox that looks up Chinese characters. I use it on Firefox.

I was using Calibre to view e-books but then I wanted to view them in a browser so I could use the Zhongwen add-on to look up the characters I don't know.

I installed Calibre-Web and began studying some of my Chinese books that were in pdf format. It worked great. I could read along and easily look up the characters I didn't know.

But then I tried reading a Chinese book in epub format and I couldn't look up the characters. The Zhongwen add-on wasn't working, except on the title line at the top of the page.

It turns out that the Zhongwen add-on doesn't work on content in an iframe and Calibre-Web presents the contents of epub books in an iframe.

I developed an enhancement of Zhongwen (see the issue85 branch) that works on content in iframes. I have only tried it with Calibre-Web but it works fine there. With this, I can view epub books via Calibre-Web and look up the Chinese characters I don't know with the modified Zhongwen add-on.

The only remaining problem is that Firefox no longer lets me install a plugin that isn't signed by Mozilla, unless I use an unbranded release. Security is good and it would be good if requiring signed add-ons was default, but removing all possibility of installing my own add-ons is going too far. Not every Firefox user is incompetent to create their own add-on and manage add-ons safely. If someone has access to my system to manipulate my add-ons, they can just replace the Firefox executable. Requiring signed add-ons is a limitation without benefit.

It is possible to install an unsigned add-on temporarily from about:debugging. But then the add-on must be re-installed every time Firefox is restarted.  

I installed an unbranded nightly build of Firefox but there are still hoops to jump through in order to install the add-on permanently. It works fine installed temporarily, but there is no documentation that is easy to find and follow about how to install a locally developed add-on permanently, only very lengthy procedures about how to register with Mozilla and publish an add-on, which is a very involved and time consuming process to learn and exercise.

So, at the moment I am stuck with loading the add-on temporarily. It seems that Mozilla, despite a good start, has decided that making the life of users difficult and less productive is the way to go.

I have submitted a pull request to Zhongwen but thus far there has been no response from the author. Who knows when or if an update of the add-on will allow it to work on content in an iframe. I will try to be patient. I have something that works well enough for my study in the meantime, though it is a nuisance having to use a different browser.

If the add-on isn't enhanced for too long, I will consider releasing my own version, but that requires getting involved with Mozilla's publishing process, which I would rather not do. It is really unfortunate that I can't put something up on GitHub or the like that people can download, review and install for themselves.

Maybe it is time to switch to a more permissive browser. Or maybe all the major browsers are now in the business of building walled gardens. It's a sad world. I wonder how long it will be before Firefox starts making it difficult to browse sites other than Mozilla's own sites and sites of those who fund it? They have already made it impossible to browse sites for which Mozilla deems the security to be inadequate. This was good when I could review the problem and manually override where appropriate, but more recent releases give no option to proceed to the site: it is simply and utterly blocked with an unhelpful message. Attempting to browse other sites results in an alert but, thus far, one can still proceed to the site: it is just a matter of dealing with the alerts and clicking through. Add-ons? There are no practical options: only Mozilla approved add-ons are allowed. The walls are going up and, no doubt, will become higher with time, then the excuse of lack of resources will be used to bring them in to reduce the scope of support.

It's a sad world we live in. Mozilla used to develop good, free software. It still is free, to the extent that I could fork Firefox and remove the restrictions, but I don't have time for that.

Friday, January 14, 2022

gmusicbrowser on Debian bullseye

gmusicbrowser is no longer available as a package for Debian Linux. It was removed in 2019, due to lack of maintenance and Debian bug 912882. The root cause was that gmusicbrowser depended on libgtk2-perl, which was being dropped.

gmusicbrowser issue 57 has tracked the migration to gtk3 since 2013. The 1.1.99.1 release is the first based on gtk3 but there has been a year of development since then, as yet unreleased.

Fortunately, it is easy to install the gtk3 based gmusicbrowser on Debian bullseye with the xfce4 desktop. It is probably as easy to install with any other desktop, but I haven't tried any of them.

$ git clone https://github.com/squentin/gmusicbrowser.git
$ cd gmusicbrowser
$ sudo make install

I didn't have to install any packages beyond what was already installed.

I haven't tested it extensively, but I haven't had any problems. If you too like gmusicbrowser, you can easily install it on a current Debian system this way, and probably on many other systems.

Saturday, January 1, 2022

Preparing to configure Windows

I bought a new laptop - a dynabook tecra. It came pre-installed with Windows 10. Completing the Windows 10 setup was easy but for the next few days, every time I rebooted it installed more updates.

This time, it has been displaying 'Preparing to configure Windows' for over 20 minutes with no indication of progress or when it might finish. 

The Windows Club has a page on this. They recommend waiting for 2 hours before giving up and trying anything other than waiting.

TWO HOURS! For an update. Not even a full install.

I can complete an update of Linux in about 2 minutes, typically.

What is Microsoft thinking, to make such an obtuse update process? It's not like they are beginners, without experience. They have been doing this for years.

It is because of nonsense like this that I prefer Linux. I never have such problems with Linux.

I would just abort the Windows update but, being conservative, I want to back up the Windows partitions before I wipe them and install Linux, and I don't want the backups corrupted by an incomplete update. I am having to reboot a few times to confirm I have all the Linux issues sorted (modules and firmware for all the essential hardware). It is extremely annoying having to deal with Windows again.

Eventually it completed its update, and now I am unable to access the BIOS setup or boot menu. It boots straight to Windows every time, whether after a restart or after a shutdown and boot. Fucking Windows update - it was working normally until the fucking update.

