Saturday, February 19, 2022

Limitations of Calibre and Calibre-Web as an epub book viewer

When viewing an epub book using Calibre, one's position in the book is shown as a percentage in the bottom right corner of the reader window. This is OK for a small book, but it does not show fractions of a percent and is therefore the same for many pages of a longer book. It does have the virtue of being independent of the size of the reader window. Calibre also maintains position when the window is resized: at least some of the text is common between the larger and smaller window.

The Calibre-Web reading window is worse: it doesn't indicate position in the book at all. To compound this, if one resizes the reader window, the reader jumps to a different position in the book. If one alternates between a small and a large window, just flipping back and forth, the position gradually moves to the beginning of the current chapter. There is no practical way to get back to the original position except to go back to the start of the chapter and then scan forward to the desired location. This is very poor UX.

In the case of Calibre-Web, the behaviour is a result of using epubjs to display the book. I have made a small test app that uses the default configuration and it exhibits the same faults: no indication of position, and the position jumps when the window is resized.

The epubjs package has many options. There may be options to display position and to maintain position when the window is resized, but I haven't found them yet. I tried capturing the position and restoring it after a resize, but it is a bit of a nightmare of rapid-fire events and of deciding when to save and when to restore position. There does not appear to be any option to disable the automatic redisplay on window size change, and there is essentially no documentation of the implementation.

As with so much software these days, the documentation of epubjs provides the syntax of the API but very little to nothing regarding the semantics.

There are some clues about getting page information in issue 744. None of this defines what is meant by "page" but most of the comments imply that what is meant is: what is presented in the reader window.

Calibre-Web has a built-in table of contents for epub books, but if it is opened, the right side of the content goes off-screen with no scrollbar. It is impossible to read the text with the table of contents open.

Calibre-Web allows bookmarks to be saved, but it presents them as cryptic strings like "epubcfi(/6/8[id_4]!/4/2/1:0)". This is a standard EPUB CFI (Canonical Fragment Identifier), but it is not human friendly. Given a list of these, how would a normal person know which refers to what? There is no way to annotate them. In contrast, a bookmark in Firefox can be edited to change the title, icon, etc.
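
To make the point concrete, here is a minimal sketch of how such a string could be turned into something slightly more readable. It assumes only the simple shape shown above (a package path, "!", a content path and an optional ":offset"); the function name describe_cfi is my own invention and is not part of Calibre-Web or epubjs, and full CFI syntax (ranges, assertions in the content path, etc.) is not handled.

```python
import re

# Matches only the simple shape shown above: a package path, "!", a content
# path and an optional ":offset". Ranges and other CFI features are ignored.
CFI_PATTERN = re.compile(
    r"^epubcfi\((?P<package>[^!]+)!(?P<content>[^:)]+)(?::(?P<offset>\d+))?\)$"
)
STEP_PATTERN = re.compile(r"/(\d+)(?:\[([^\]]*)\])?")


def describe_cfi(cfi: str) -> str:
    """Return a short description of a simple EPUB CFI (hypothetical helper)."""
    match = CFI_PATTERN.match(cfi)
    if match is None:
        raise ValueError(f"unsupported CFI: {cfi!r}")
    steps = STEP_PATTERN.findall(match.group("package"))
    if not steps:
        raise ValueError(f"no steps found in CFI: {cfi!r}")
    # The last step before "!" selects the spine itemref. Even step numbers
    # count element children, so step 8 is the 4th itemref in the spine.
    step_number, item_id = steps[-1]
    label = f"spine item {int(step_number) // 2}"
    if item_id:
        label += f" ({item_id})"
    offset = match.group("offset") or "0"
    return f"{label}, path {match.group('content')}, character offset {offset}"


print(describe_cfi("epubcfi(/6/8[id_4]!/4/2/1:0)"))
```

This still says nothing about chapter titles or surrounding text, which is what a person would actually want to see; that would need the book itself.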

The bookmarks persist between browser sessions, and they persist through clearing cookies and site data. This suggests that they are stored on the server, but presumably not in the book itself, as each user could have different bookmarks.

It appears that Calibre-Web has its own database, in addition to the Calibre database it accesses. The default on Linux is ~/.calibre-web/app.db. This includes a table 'bookmark' with fields user_id, book_id, format and bookmark_key, the latter two being, for example, epub and an epubcfi string. So bookmarks are per user and per book, and they persist across devices and sessions.
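
A minimal sketch of listing one user's bookmarks from such a table, using Python's sqlite3 module. To keep it self-contained it builds a stand-in in-memory database with only the four fields named above; the real table may well have more columns or different types, and list_bookmarks is just an illustrative name.

```python
import sqlite3


def list_bookmarks(conn: sqlite3.Connection, user_id: int) -> list:
    """Return (book_id, format, bookmark_key) rows for one user."""
    cursor = conn.execute(
        "SELECT book_id, format, bookmark_key FROM bookmark "
        "WHERE user_id = ? ORDER BY book_id",
        (user_id,),
    )
    return cursor.fetchall()


# Stand-in database with only the four fields named above (an assumption;
# the real app.db schema has not been checked here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookmark (user_id INTEGER, book_id INTEGER, "
    "format TEXT, bookmark_key TEXT)"
)
conn.execute(
    "INSERT INTO bookmark VALUES (?, ?, ?, ?)",
    (1, 42, "epub", "epubcfi(/6/8[id_4]!/4/2/1:0)"),
)
print(list_bookmarks(conn, user_id=1))
```

To try this against a real installation, one would presumably connect to a copy of app.db instead of ":memory:" and skip the CREATE and INSERT statements.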

Both epub2 and epub3 have support for page lists but various posts suggest that almost no epub2 and few epub3 readers actually use them.

The navigation file provides some guidance on navigation for epub3.

A book might be published in hard copy, with a particular layout of pages. The same book in epub format will break the text into different sections, depending on font, window size, etc. One might be interested in what page of the original hard copy publication is being viewed, regardless of how many reader windows it takes to view the complete page, or one might be interested in pages as determined by the reader window: one reader window full is one page.

Is it possible to position the reader window at an arbitrary position in the text? If so, then what 'page' is the reader at when the displayed text starts at the second character of the book? Or the third? etc.

If the definition of "page" depends on the size of the reader window, font, screen resolution for displaying images, etc. then page number is only relevant in the current reader window context. If one reads the same book on a different system, with a different reader size, with a different font size, etc. then the page numbers will all be different. "Page 237" will contain different text depending on all these (and probably other) factors.

Counting characters might be more consistent: current display starts at character "1073648 of 27634287", for example. But this isn't very human friendly.
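
A small sketch of the arithmetic, using the example numbers above. It shows a character offset expressed as a percentage with fractional digits, and how a window-based "page" number for the same offset changes with the window capacity. The capacities of 1500 and 4000 characters per window are made-up values, and real readers lay out text, not fixed character counts.

```python
def percent_position(offset: int, total: int, digits: int = 2) -> str:
    """Format a character offset as a percentage with fractional digits."""
    return f"{100 * offset / total:.{digits}f}%"


def window_page(offset: int, chars_per_window: int) -> int:
    """1-based 'page' number if every window held chars_per_window characters."""
    return offset // chars_per_window + 1


offset, total = 1073648, 27634287
# Independent of window size:
print(percent_position(offset, total))
# Same offset, two hypothetical window capacities, two different "pages":
print(window_page(offset, 1500), window_page(offset, 4000))
```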

Pages are a familiar concept in paper books, but what do they mean in an e-book?

How does one return to the same position in the book, when one re-opens it?

How does one refer to a part of the book in computer friendly terms (where character offset might be fine) and human friendly terms?

Is it possible to make a fixed page list that is independent of the reader window size, screen resolution, font size, etc.? 

There is a good discussion of epub3 page lists at epubsecrets. This includes use cases where consistency across different media formats, independent of the individual reader details, is useful.

The epub3 spec includes a page-list nav element.
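
As a rough illustration, a page list can be read out of an epub3 navigation document with a few lines of standard-library Python. The navigation document below is made up for the example, and read_page_list is a hypothetical helper; a real book would need the navigation file located via the package document first.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
EPUB = "http://www.idpf.org/2007/ops"

# Made-up navigation document for illustration only.
NAV_DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
  <body>
    <nav epub:type="toc"><ol><li><a href="ch1.xhtml">Chapter 1</a></li></ol></nav>
    <nav epub:type="page-list">
      <ol>
        <li><a href="ch1.xhtml#page1">1</a></li>
        <li><a href="ch1.xhtml#page2">2</a></li>
      </ol>
    </nav>
  </body>
</html>"""


def read_page_list(nav_xml: str) -> list:
    """Return (page label, href) pairs from the page-list nav, if present."""
    root = ET.fromstring(nav_xml)
    for nav in root.iter(f"{{{XHTML}}}nav"):
        # epub:type may hold several space-separated values.
        types = nav.get(f"{{{EPUB}}}type", "").split()
        if "page-list" in types:
            return [
                ("".join(a.itertext()).strip(), a.get("href", ""))
                for a in nav.iter(f"{{{XHTML}}}a")
            ]
    return []


print(read_page_list(NAV_DOC))
```

Each entry maps a page label to a fixed target in the content, so the list does not change with the reader window size.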

epub3 has support for fixed layout documents, with the introduction pointing out that by default epub3 documents are intended to adapt to the reader with reflow, etc.
