Ian's World

Wednesday, June 2, 2021

Spaced Repetition Flashcards

After two years using Anki, I decided to write my own spaced repetition flashcards software.

I was using the Anki v2 scheduler, but it had various faults and limitations. The most serious were a fixed number of new cards per day, the lack of an API for implementing an alternate scheduler, a frequently changing add-on API, and an overly complex build environment that requires familiarity with several languages and build tools.

I created add-ons that overcame some of the faults, but it became too difficult to maintain them, and they didn't achieve all that I wanted. Because of how many of the scheduling features are implemented, many problematic aspects are difficult or practically impossible to alter or improve. It became evident that writing my own software would be less work, with a better outcome, than continuing to struggle with Anki.

I use only a small subset of Anki features: only a single instance of the desktop version (so no syncing) and only simple cards - no cloze cards. The Anki application was vastly more complicated than I needed.

So, I wrote my own. It is browser based with the server running on nodejs. It is simple: less than 1000 lines of JavaScript and a few templates. It's still a bit crude, but I have been using it to study for over a week now. It presents all my cards as well as Anki did. It has, in my opinion, superior scheduling. The build tools are simple and well known - in other words: easy to install and use. I only need to know one programming language: JavaScript, or five if you include SQL, Handlebars templates, HTML and CSS as languages. All but the Handlebars templates are ubiquitous and familiar, and Handlebars templates, at least those sufficient for this purpose, are simple.

About the same time, I updated my Anki add-on again, to support up to the then current 2.1.43 (and probably a few subsequent - until the internals change again). This is likely to be the last time I will update the add-on as I am not using it or Anki any more.

Scheduling is simpler and more consistent than Anki. New cards are automatically regulated to maintain a steady workload of about 1 to 2 hours of study per day. The number of cards to be reviewed each day is somewhat random, but if study time in a day (actual or projected) exceeds 1 hour then no new cards are presented. This allows me to focus on the cards I am already learning, until I learn them well enough that cards to be studied / study time fall below the threshold, after which new cards are presented again. Other than the limit on total study time, there is no limit on new cards.
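The rule for new cards can be sketched in a few lines. This is only an illustration in Python, not the code from my app (which is JavaScript): the names `DayLoad`, `projected_study_seconds` and `may_show_new_card` are made up for the example, and the projection from an average time per card is an assumption about how "projected" study time might be estimated.

```python
from dataclasses import dataclass

# The 1 hour threshold described above, in seconds.
STUDY_TIME_LIMIT_SECONDS = 60 * 60


@dataclass
class DayLoad:
    """Hypothetical summary of today's workload."""
    seconds_studied: float       # time already spent studying today
    reviews_remaining: int       # cards still due today
    avg_seconds_per_card: float  # assumed estimate of time per review


def projected_study_seconds(load: DayLoad) -> float:
    """Actual time so far plus the estimated time for the remaining reviews."""
    return load.seconds_studied + load.reviews_remaining * load.avg_seconds_per_card


def may_show_new_card(load: DayLoad, limit: float = STUDY_TIME_LIMIT_SECONDS) -> bool:
    """New cards are presented only while projected study time is within the limit."""
    return projected_study_seconds(load) <= limit
```

For example, with 20 minutes studied and 100 reviews remaining at 12 seconds each, the projection is 40 minutes, so a new card may be shown. With 300 reviews remaining, it is 80 minutes, so new cards are held back.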

The scheduler prioritizes cards with shorter intervals over cards with longer intervals. This doesn't make much difference when you are up to date but it makes a big difference with a backlog, as I had after working on this new app for a few days instead of studying. Even with a large backlog, the challenging cards don't get deferred by the backlog. It seems, in this way, to be much better than Anki. A few days after I got back to studying, I had cleared my backlog and was seeing more new cards than I had in a long time with Anki.
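A minimal sketch of this ordering, in Python for illustration only. The `Card` fields, the use of seconds as the unit and the tie-break on due time are assumptions made for the example, not a description of my actual database or code.

```python
from dataclasses import dataclass


@dataclass
class Card:
    """Hypothetical card record; field names are illustrative."""
    card_id: int
    interval_seconds: int  # current interval between reviews
    due: int               # due time as epoch seconds


def review_order(cards, now):
    """Return the due cards, shortest interval first.

    Ties are broken by due time (an assumption), so that among cards
    with the same interval the longest overdue card comes first.
    """
    due_cards = [c for c in cards if c.due <= now]
    return sorted(due_cards, key=lambda c: (c.interval_seconds, c.due))
```

With a backlog, a card on a 10 minute interval is presented before a card on a 30 day interval, however long the 30 day card has been waiting.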

I have no research that proves my scheduler is more effective than Anki's but at least it doesn't have the misfeatures that I experienced with Anki: fixed number of new cards per day; excessive complexity of new, learning, re-learning and review cards and queues; a tendency to gross overload and dysfunctional review ordering, resulting in overall failure to learn; adverse interactions of new card limits between nested decks; lack of control of new card order; and limitation of the scheduling algorithm to units of days.

It is based on an import and modification of my old Anki database. I haven't yet written anything to import an Anki (or other) shared deck, but I probably will. However, currently I have about 30,000 new cards, so no need for more any time soon. I wrote a utility to import the Anki database but probably won't use it again: it would be too much of a setback to revert to what I was doing in Anki weeks ago and I don't study with Anki any more. I suppose I might import a new database if I used Anki to download some other decks and wanted to keep multiple databases, but it would probably be better to write an import of shared decks directly than to do it through Anki.

There is a very crude interface for editing notes: just the fields. So, I can correct errors as I find them. I'll add features to add and remove notes at some point.

New card ordering isn't lost when a card is studied. As a result, it is possible to reset a deck to its original state and start over with the original ordering of new cards. This isn't possible with Anki because the initial ordering is in the due field of the cards, which is overwritten when the card is viewed.

Some of the serialized data in the Anki database is no longer used but there is more to be done. There is nothing fundamentally wrong about storing serialized data structures, but using a non-standard format is bad. I might revise some of the serializations to JSON. Alternatively, I might deserialize them into separate fields in the database. But, for the moment, I have cracked the Rust serialization algorithm, so there is no immediate need.

There is room for improvement in the review log, but I can produce basic statistics such as counts of cards studied and due per day, study time per day, number of new cards viewed per day, etc.

Wednesday, December 16, 2020

Learning a Language with Anki and other resources

I like what this response to this post says about words vs sentences for learning a language, and the use of Anki vs other resources. And this post expresses the same ideas about context vs isolated words, though I am more inclined to somewhat longer, more complex sentences than they are. When learning a language, I am not concerned about remembering the idea expressed by a sentence. I am only concerned with learning to understand the sentence, to extract the meaning. The issue of complexity might be much more significant if one were trying to learn the expressed ideas (e.g. math, physics, economics or whatever) rather than learning to understand the expression.

Wednesday, December 9, 2020

Anki Scheduling

I have been using Anki on Linux to help my study of Mandarin. I am just a beginner, learning basic vocabulary.

Anki has been very helpful but after a few months of study, with default scheduling, I became overwhelmed. It was taking me several hours to complete all scheduled reviews and many days I didn't have enough time. I was seeing cards too infrequently and not learning effectively. I was failing. Something had to change.

I'm not the only one. Many others have faced this problem and much has been written. Search for Anki Scheduler and you can find much more, and many good ideas.

For me, the fundamental problem was that I had too many cards to learn and review, and new cards were being added every day - faster than I could learn them. Everyone has limits. My memory is poor so my limit is quite low. I had to reduce the number of new cards I was seeing.

I was studying three main decks: a large deck of characters, a large deck of phrases and a growing deck of my own - characters and phrases that I had seen in my other study (reading and watching movies and serials). I merged the latter two into a single deck and then put the remaining two decks under a single parent deck. This gave me one deck (the parent deck) to study each day.

I set new cards per day low but it wasn't enough. I was so overwhelmed, I wasn't making progress, even with a low number of new cards per day. I realized I had to stop the new cards and focus on the cards I was already learning, at least until I had recovered.

I could have set new cards limit to 0 until I recovered, then either increase the limit or add new cards through custom study, but I didn't want to have to make such decisions every day.

I could have set new cards to show only after I had completed all scheduled reviews. If I didn't complete scheduled reviews I would see no new cards. It would be self-regulating. But it was difficult to learn new cards after a long period of study, and not completing all that was scheduled, including new cards, felt too much like failure.

I wanted new cards to be mixed with review cards but I didn't want too many new cards, and I didn't want to be adjusting the new card limit manually. I was using the new version 2 scheduler but there was no way to achieve this except by manually adjusting the new card limit up and down. I wanted something more automatic. Something that challenged me but didn't overwhelm me.

I wanted to study between one and two hours per day. If I wasn't able to complete all scheduled reviews in that time, I didn't want to add any new cards. On the other hand, if I was completing scheduled reviews on time I wanted to see new cards, to make progress and continue to be challenged. I wanted to be studying near my limits but not go beyond them.

The scheduler couldn't do this for me, but Anki is extensible, so I wrote an add-on to limit new cards. The add-on limits the number of new cards based on the number of reviews scheduled and completed. It allows me to mix new cards with reviews, to reduce the number of new cards when I am overwhelmed but see enough new cards to challenge me when I am not. It keeps me near my target of one to two hours of study per day. It is not all-or-nothing. As the number of cards to study increases, the number of new cards decreases gradually. Even when I am near my limit, I see a few new cards but if I am overwhelmed, there are no new cards until I recover.

The add-on is based on the number of card views rather than time, but that's OK. Time per card is fairly consistent and it's only an approximation to my capacity in either case. I configured it to begin limiting new cards when scheduled reviews exceed 150, with no new cards at all if scheduled reviews exceed 250.
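The gradual reduction between those two thresholds can be sketched as follows. This is not the add-on's code: the linear taper, the rounding up and the name `new_card_allowance` are assumptions chosen to illustrate the idea, and `base_limit` stands in for whatever new cards per day limit is configured in the deck options.

```python
def new_card_allowance(scheduled_reviews, base_limit=5, taper_start=150, taper_end=250):
    """Return how many new cards to allow today (illustrative sketch).

    Up to taper_start scheduled reviews, the full base_limit applies.
    From taper_end onward, no new cards are allowed.
    In between, the allowance falls linearly (an assumption), rounded up
    so that a few new cards still appear when close to the upper limit.
    """
    if scheduled_reviews <= taper_start:
        return base_limit
    if scheduled_reviews >= taper_end:
        return 0
    remaining = taper_end - scheduled_reviews
    span = taper_end - taper_start
    # Integer ceiling division avoids floating point rounding surprises.
    return -(-base_limit * remaining // span)
```

With these values, 100 scheduled reviews allow 5 new cards, 200 allow 3, 249 still allow 1, and 250 or more allow none.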

It worked as expected. Initially, because I had such a large backlog of unlearned cards, I saw no new cards at all. Gradually I learned the cards I was already studying. After a few weeks, I was consistently completing all scheduled reviews. My daily study time came down, so I wasn't exhausted. I began to learn more effectively again. As my number of daily reviews came down, I began to see new cards again. It was a success.

Now that I am through the worst of the overload, my daily study time is usually between 1.5 and 2 hours. On a bad day (if I have had no sleep) sometimes more. On a really good day, only a little over an hour. This fits my overall schedule. If I miss a day or two, it takes me a few days to catch up. While I am catching up, I see no new cards, but when all is going well I see a few new cards each day - more or less, depending on my workload, keeping me near my configured limit of about 200 reviews per day.

It is still early days. I expect it will take me many years to learn enough Mandarin to have conversations, watch movies without subtitles and read news and novels. But now, I learn a little every day, and my time with Anki is effective. I see enough new cards to make progress and keep me challenged, but not too many. I am no longer overwhelmed.

Everyone is different, learning differently, with different capacity and limits. What works for me (I have poor memory, getting worse as I get older) may seem trivially easy for others. But Anki is configurable and each user will have to find the configuration that suits their ability and interest. The default configuration is a good starting point, but I have made adjustments.

To understand the configuration, you have to understand the Anki scheduling algorithm. Read the manual for a basic introduction to the options and the FAQ on the Anki Algorithm. I also found this video helpful.

Consolidating decks helped. I am only studying Mandarin. If I were studying different subjects I would probably keep them in separate decks, but for Mandarin, I prefer characters and phrases to be mixed, rather than separate.

Spending as much time on other study (watching movies and reading) as I spend with Anki also helps. My ears need practice, and the repetition and seeing things in different contexts really help.

Currently, I have new cards per day limited to 5. My memory really is poor. I have little capacity. I could increase this, as the add-on limits new cards. I haven't seen 5 new cards in a day for a long time now, but I am still working through the backlog I accumulated before writing the add-on. It will probably take me a few more months to reach a steady state of learning. Maybe then I will increase the limit on new cards. If not, I can live with 5 new cards per day. Slow and steady is the best I can do these days.

For new cards, I have Graduating interval at 1 day, Easy interval at 4 days and starting ease 175%. Initial steps are 1, 2, 5, 10 and 20 minutes. I need lots of repetition to learn but after the initial repetitions, I prefer to let the Anki scheduler adjust the intervals, rather than a fixed schedule. Some cards are easier and require fewer, less frequent repetitions. Others, not obviously different, I find difficult to learn - they require a lot of frequent repetition. The scheduler takes care of this for me.

For reviews, I have a maximum of 1000 per day. I want to complete all reviews on schedule. I don't want them delayed by this limit. Easy bonus is the default 150%. Interval modifier is default 100%. But I decreased the Hard interval to 80%. If I find a card hard, I want to see it sooner. If it continues hard, I want the interval to decrease until I am seeing it often enough that I start to remember. I really do need a lot of repetition.

For lapses, I have steps 3, 5, 10 and 60 minutes - a little burst of frequent repetitions. Then, New interval at 30%. I want to see the card again soon. I'm not concerned about leeches, so I set leech threshold to 50 lapses and Leech action to Tag Only.

With these settings, I am making progress. But I have noticed a few faults / features of the V2 scheduler that I don't like. So, I wrote some more add-ons to fix/improve these.

With Hard interval set above 100% and low ease, the minimum increment in interval is 2 days. Cards progress through 1 day, 3 days, 5 days etc. This rapid progression is sometimes too much for me - it is effectively a minimum ease of 300%, only decreasing to the card's actual ease when the interval has reached a week or more. I am good for one or two reviews then lapse again. Cards end up with minimal ease (Hard and Again both decrease ease) but still progress too quickly.

Worse, with Hard interval set below 100% and low ease (anything below 150% - and minimal ease is only 130% and Hard and Again both reduce ease) they get stuck at an interval of 2 days - not progressing at all. Easy will lift them out of this, but Good leaves them stuck at 2 days indefinitely. And I want Hard interval at 80% or 90% - if I find a card hard, I want to see it more frequently, not less.

So I wrote this add-on to fix the calculation of new intervals for review cards. It fixes both problems. Regardless of Hard interval, the minimum increment in interval is 1 day on Good. Not 2 days and never 0 days. If a card has an interval of 1 day, Hard leaves it at 1 day (1 day is the minimum interval for a review card) and Good progresses it to 2 days. That's still an effective ease of 200%, but it is the minimum increment in interval and better than 300%. If a card has an interval of 2 days, Hard reduces it to 1 day (I have Hard interval at 80%) and Good increases it to 2 days. I was surprised, but I have noticed a significant improvement in my success with new and lapsed cards after making this change. I'm still working through a large number of cards with low interval and minimal ease - a bit of 'Ease Hell', but progress is better with this change. It seems I have an aversion to Easy, which would increase the ease, so progress is slow but at least it's steady.
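A rough sketch of the interval rule, for illustration only. It ignores fuzz, overdue days and the interval modifier, and the function names and truncation to whole days are assumptions rather than the add-on's actual code.

```python
MIN_REVIEW_INTERVAL_DAYS = 1  # 1 day is the minimum interval for a review card


def hard_interval(interval_days, hard_factor=0.8):
    """Hard scales the interval by the Hard interval setting, never below 1 day."""
    return max(MIN_REVIEW_INTERVAL_DAYS, int(interval_days * hard_factor))


def good_interval(interval_days, ease):
    """Good multiplies the interval by ease, but always adds at least 1 day.

    ease is a multiplier here, e.g. 1.3 for an ease of 130%.
    """
    return max(interval_days + 1, int(interval_days * ease))
```

With Hard interval at 80%, `hard_interval(1)` stays at 1 day and `hard_interval(2)` drops to 1 day. At the minimum ease of 130%, `good_interval(1, 1.3)` moves a card from 1 day to 2 days, so Good never leaves the interval unchanged and never jumps by the 2 day minimum described earlier.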

Then I wrote this add-on to fix the fuzzed interval ranges. This is a really trivial issue. Just something I noticed as I was watching my progression and the impact of the previous add-on. At low intervals, the fuzzed ranges were sometimes larger but sometimes smaller, even 0, as the interval increased. The add-on makes the fuzzed range increase monotonically with interval, with a logarithmic roll-off. The fuzzed interval range is about two weeks at an interval of one year. That seems plenty to ensure pairs of cards don't keep appearing on the same day.
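One way to get a range that grows monotonically with a logarithmic roll-off is sketched below. The formula is an assumption picked only to match the figure above (about two weeks at an interval of one year). It is not the formula in the add-on, and `fuzz_range_days` and `fuzz_bounds` are made-up names.

```python
import math

RANGE_AT_ONE_YEAR_DAYS = 14.0  # about two weeks at an interval of one year


def fuzz_range_days(interval_days):
    """Width of the fuzz range in days: grows with the log of the interval.

    Scaled so that an interval of 365 days gives a 14 day range.
    """
    return RANGE_AT_ONE_YEAR_DAYS * math.log1p(interval_days) / math.log1p(365)


def fuzz_bounds(interval_days):
    """Lowest and highest interval the fuzz may pick, centred on the interval."""
    half = fuzz_range_days(interval_days) / 2
    low = max(1, round(interval_days - half))
    high = max(low, round(interval_days + half))
    return low, high
```

Because the logarithm always increases, a longer interval never gets a narrower range than a shorter one, which was the inconsistency I had noticed.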

Finally, I wrote this add-on to slightly increase ease on Good. I had read about 'Ease Hell' and with my backlog of cards with many lapses and many Hard cards, I had experienced it - many cards with low ease and low intervals, slowly progressing through Good, Good, Good. I really do have an aversion to Easy, unless a card is very, very easy. I could just choose Easy, but it seems more reasonable to me that if a card is Good, ease should increase a little and if it is Easy it increases a lot. This way, ease won't get stuck, every review changes it a little:

Again: decrease ease by 20%
Hard: decrease ease by 15%
Good: increase ease by 5%
Easy: increase ease by 15%

Note: ease is itself a percentage of the interval. For example, if ease is 200%, then on Good the interval is increased to approximately 200% of the current interval (i.e. the interval is doubled). The change is approximate because of interval 'fuzz'. The changes listed above are percentage points (percent of interval), not percent of ease. If ease is 200%, Good increases it to 205% and Easy increases it to 215%.
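The ease adjustments above amount to a small lookup table. This is a sketch for illustration, not the add-on itself: the names are made up, and applying the 130% minimum ease as a floor here is an assumption.

```python
MIN_EASE = 130  # minimal ease, in percent

# Change in ease, in percentage points, for each answer.
EASE_DELTA = {"again": -20, "hard": -15, "good": 5, "easy": 15}


def next_ease(ease, answer):
    """Return the new ease (percent) after answering a review card."""
    return max(MIN_EASE, ease + EASE_DELTA[answer])
```

Starting from an ease of 200%, Good gives 205% and Easy gives 215%, while a card at 135% answered Hard stops at the 130% floor.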

These changes are all hard coded in the original scheduler and this add-on. I might add configuration but the add-on is just a bit of python script so it is easy to change them if you want something different.

If a card continues Good through many reviews, ease will slowly increase. Eventually it should reach an ease/interval at which it becomes Hard, which will decrease the ease (and, if Hard interval is less than 100%, also decrease the interval, which I think is better than the default of increasing the interval to 120%). Ease should then go up and down as reviews alternate between Good and Hard - keeping interval near the limits of retention, which research suggests is good for learning. What it won't do is get stuck at an unreasonably low ease, with very slowly increasing interval as you keep finding the card Good.

Ideally, the card will never be Easy (i.e. Anki will progress interval fast enough that you are not wasting your time reviewing it too often) and never lapse (Again). It may be Hard sometimes, but will mostly be Good. That's the sweet spot of challenging your memory but not waiting so long you forget (lapse) and have to relearn the card.

It's early days for this add-on but I have many cards with minimal ease (130%) and I can now see them gradually increasing. My 'Ease Hell' is being alleviated. I can still use Easy to progress more rapidly, if I want to, but the ease and intervals will sort themselves if I persist with Good. And this is good!

While I haven't proved it with objective analysis of my study outcomes, collectively, these changes have been good. I am no longer overwhelmed, I am making progress and I am enjoying study again.

The first add-on is published to AnkiWeb. The others are relatively minor and recent, so I haven't published them yet, but if you are interested you can download them from GitHub and install them yourself. If there is interest, I'll add zip files to make the installation easier, or publish them to AnkiWeb. Let me know if you are interested but in the meantime, they are very easy to install manually.

I think Anki is great and with these few changes to the scheduling, it is even better for me. If you try it, I hope you too find it helpful and maybe my experience with it helps you.

Tuesday, January 10, 2017

Windows 7 - Source path too long

I was trying to copy my user folder to a USB drive for backup, but the copy failed with 25 files remaining to be copied and the alert "Source path too long". The problematic path was not given in the alert, there was no option to continue copying the files that could be copied, and there was no indication of what the remaining (as yet uncopied) files were. The only option was to cancel the copy. This is another instance of bad design in Windows 7.

After a bit of searching I found FastCopy, described by Darwin Sanoy.

I downloaded this and ran it (no install necessary) and it does a much better job.

It fails to copy some files. When it fails it logs the reason for failure (e.g. cannot access file because it is being used by another process, access is denied, etc.) and the full path of the file it failed to copy, then continues copying what it can. This is excellent!! When it is done, I can review the failed files and decide what to do about them and it copies all that it can!

Not only does FastCopy fail more elegantly - it copies far more files. The Windows copy function indicated that there were only 25 files remaining to be copied when it failed, but FastCopy copied far more than 25 additional files. Windows copy indicated about 17GB to copy. FastCopy copied vastly more data than this. The Windows copy function must silently exclude a lot of files.

So much better than Microsoft's copy function.

Wednesday, January 4, 2017

Huawei Y360-U72 USB to Windows 7

Connecting this phone to my laptop didn't work immediately. The problem was the USB mode.

When I connected the phone, drivers were loaded and the device appeared in Windows explorer, but when I tried to open it I got:

Later I stumbled on the solution...

Open settings and select All at the top then scroll down and select Storage:

Tap on the three dots at the bottom to open an additional menu and tap on USB computer connection:

Then select Media device (MTP). The default was USB storage. Maybe that is something my service provider set - the phone came locked to the provider. Huawei default might be different.

With the USB computer connection setting changed, the phone appears in Windows Computer as Y360-U72 and I can access the contents to copy photos, videos, music etc. between the phone and computer: