Tech Stuff: Spaced Repetition vs. Pseudo-randomness

Instead of adding a long comment to my last post, I'll post the update here :)
My original algorithm for working through flashcards was based on a pseudo-random algorithm. Frankly, it wasn't very good (though it did exactly what I wanted, which is probably good) :D

Imagine you have a set of flashcards. The set is a random distribution of cards you:

have completely memorized
know very well
don't know very well
don't know at all

So, you separate your cards into 4 piles, one per level of knowledge, ordered from least to greatest knowledge. According to Leitner's algorithm, every time you guess a card correctly, it moves one pile forward. Every time you guess a card incorrectly, it moves one pile backward. This means the cards in the lowest piles (the ones you know least) are of more immediate import than those in the highest piles.
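The forward/backward rule described above can be sketched in a few lines. This is only an illustration, not the app's actual code: the name `move_card`, the 0-based pile indices, and the choice to keep cards inside the first and last piles are all my assumptions.

```python
NUM_PILES = 4  # pile 0 = "don't know at all", pile 3 = "completely memorized"


def move_card(pile: int, correct: bool, num_piles: int = NUM_PILES) -> int:
    """Return the pile a card lands in after one guess.

    A correct guess moves the card one pile forward, an incorrect guess
    moves it one pile backward. Cards never leave the first or last pile
    (an assumption; the post does not say what happens at the ends).
    """
    if correct:
        return min(pile + 1, num_piles - 1)
    return max(pile - 1, 0)
```

For example, `move_card(2, False)` sends a card from pile 2 back to pile 1, while a correct guess in the top pile leaves the card where it is.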

Now, how do you determine how often to take a card from each pile? Do you just start with the last pile and move up? Do you start with the first pile and move down?

My original implementation provided a pseudo-random distribution in which you were more likely to get cards from the lowest pile than the highest. There were gradations of probability in the piles between.

For example: the odds of picking a card from pile 1 might be 60%, 20% from pile 2, 12% from pile 3, and 8% from pile 4. As piles became exhausted, the probability would adjust, making the next lowest pile the most likely, say 70%, 20%, 10%.
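Here is a rough sketch of that kind of weighted pick. The weights come from the 60/20/12/8 example above; everything else (`BASE_WEIGHTS`, `pick_pile`, 0-based pile indices) is hypothetical. Note one difference: when a pile is empty, this version simply renormalizes the remaining weights, so it will not reproduce the exact 70/20/10 figures mentioned above.

```python
import random

# Hypothetical base weights, lowest-knowledge pile first (60/20/12/8 example).
BASE_WEIGHTS = [60, 20, 12, 8]


def pick_pile(piles, rng=random):
    """Return the index of the pile to draw from, or None if all are empty.

    Empty piles are skipped; random.choices treats the remaining weights
    as relative, so the lowest non-empty pile stays the most likely pick.
    """
    candidates = [i for i, pile in enumerate(piles) if pile]
    if not candidates:
        return None
    weights = [BASE_WEIGHTS[i] for i in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes a session repeatable, which is handy for testing.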

It worked pretty well, but failed to take into account one of the fundamental aspects of learning: time.

In a single session, this system isn't a bad one. It forces you to focus on the cards you don't know, while periodically making you go back over those you are starting to learn. But, if you think of memorization as a function of your capacity to know something combined with its degradation over time, this system is really kinda worthless :)

Spaced Repetition is a system which complements flashcards pretty well. There are a number of apps which utilize it. The theory is this: those cards in the first pile should be displayed more frequently than the others. Note: there is no randomness to picking a card. Instead of focusing on a given session, spaced repetition presents cards at set intervals, which increase as the user begins to learn the card. Thus, a session lasts as long as the system is still presenting cards at intervals. A single learning "session" could, in theory, be considered a lifetime.

For example: a card you are learning for the first time might be presented to you once an hour. As you start to guess the card correctly more often, it is presented less often, but still regularly (say, once a day). This proceeds until you are only reviewing the card at large intervals (perhaps twice a year) -- just enough to keep it retained in memory.
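A minimal sketch of that scheduling idea, assuming a fixed ladder of intervals. The hourly, daily, and roughly twice-a-year steps come from the example above; the weekly step, the names `INTERVALS` and `next_review`, and the rule of dropping one level on a wrong guess are my own assumptions, not a published spaced-repetition algorithm.

```python
from datetime import datetime, timedelta

# Hypothetical interval ladder: hourly, daily, weekly, about twice a year.
INTERVALS = [
    timedelta(hours=1),
    timedelta(days=1),
    timedelta(weeks=1),
    timedelta(days=182),
]


def next_review(level: int, correct: bool, now: datetime):
    """Return (new_level, due_time) for a card after one guess.

    A correct guess climbs one step up the ladder (longer interval),
    an incorrect guess drops one step (shorter interval).
    """
    if correct:
        level = min(level + 1, len(INTERVALS) - 1)
    else:
        level = max(level - 1, 0)
    return level, now + INTERVALS[level]
```

There is no randomness here: the scheduler only has to show whichever cards have a due time at or before the current time.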

It's a pretty decent system, and I really like the idea. It will make me work with schedulers in Windows, Mac, and Linux (if I ever get the lousy GUI finished).

I'd like to work this app out so that it can do several things:

Utilize Leitner so the concepts of advancement and degradation are in place
Allow users to swap in/out various algorithms for dictating the cards:

Spaced Repetition
Pseudo-random (for single-sessions)
...

Allow users to swap in/out their own flashcards easily.
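One way the swappable-algorithm goal could look is a simple registry of selector functions. This is a sketch of the idea only; it is not how the app or PureMVC is wired, and every name in it (`SELECTORS`, `lowest_first`, `uniform_random`, `next_card`) is made up for illustration.

```python
import random
from typing import Callable, Dict, List, Optional

# A selector takes the piles and returns the index of the pile to draw
# from, or None when every pile is empty.
Selector = Callable[[List[list]], Optional[int]]


def lowest_first(piles: List[list]) -> Optional[int]:
    """Deterministic: always draw from the lowest non-empty pile."""
    for index, pile in enumerate(piles):
        if pile:
            return index
    return None


def uniform_random(piles: List[list]) -> Optional[int]:
    """Pick any non-empty pile with equal probability."""
    candidates = [i for i, pile in enumerate(piles) if pile]
    return random.choice(candidates) if candidates else None


# Swapping algorithms is just a matter of choosing a different key.
SELECTORS: Dict[str, Selector] = {
    "lowest-first": lowest_first,
    "uniform-random": uniform_random,
}


def next_card(piles: List[list], algorithm: str = "lowest-first"):
    """Return the next card to show, or None if there are no cards left."""
    index = SELECTORS[algorithm](piles)
    return None if index is None else piles[index][0]
```

A new algorithm then only needs to match the `Selector` signature and be added to the registry.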

The last bullet point merits a lot of thought. Should a user simply provide a comma-separated file with questions and answers? What about pictures? How would you capture those? Should everything be imported into a database for me to pull from?
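For the plain comma-separated option, Python's standard `csv` module would already cover most of it. A sketch, assuming one question and one answer per line with no header row; the function name `load_cards` and that file layout are assumptions, and pictures are not handled at all.

```python
import csv
import io


def load_cards(source):
    """Read (question, answer) pairs from a file-like object of CSV text.

    Rows with fewer than two columns (including blank lines) are skipped.
    """
    cards = []
    for row in csv.reader(source):
        if len(row) < 2:
            continue  # blank or malformed line
        cards.append((row[0].strip(), row[1].strip()))
    return cards


# Usage with an in-memory file; a real file opened with
# open(path, newline="") works the same way.
sample = io.StringIO('hola,hello\n\n"uno, dos","one, two"\n')
cards = load_cards(sample)
```

Quoted fields let a question or answer contain commas, which a naive `split(",")` would break on.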

Many thoughts :) The framework, PureMVC, however, is working beautifully in Python, which is exciting. I'm a big fan of it, and I like to see it implemented well in a given language.

Saturday, April 3, 2010