Joined: May 2007
Posts: 910
Thanks:
0
Okay, a further refinement. Instead of partitioning young/mature cards, a relative delay (RD) is calculated as the card's initial interval divided by the delay. This way, a delay of 1 day for a 10 day card should be the same as a delay of 5 days for a 50 day card.
fail=1 young=1 d=00.04 rd=00.00 i=00.00
fail=1 young=0 d=11.96 rd=00.76 i=09.09
fail=1 young=0 d=11.96 rd=01.49 i=17.84
fail=1 young=0 d=11.96 rd=02.34 i=28.01
fail=1 young=0 d=11.96 rd=03.96 i=47.34
fail=1 young=1 d=00.04 rd=23.60 i=01.00
fail=1 young=0 d=00.08 rd=228.00 i=18.53
fail=1 young=0 d=00.08 rd=302.00 i=24.11
fail=0 young=1 d=37.83 rd=00.08 i=03.21
fail=0 young=1 d=37.28 rd=00.09 i=03.28
fail=0 young=1 d=37.69 rd=00.09 i=03.35
[...]
fail=0 young=0 d=36.53 rd=00.82 i=29.84
fail=0 young=0 d=09.86 rd=00.83 i=08.18
fail=0 young=0 d=29.00 rd=00.84 i=24.51
fail=0 young=0 d=35.54 rd=00.87 i=30.85
fail=0 young=0 d=09.51 rd=00.90 i=08.53
fail=0 young=0 d=09.34 rd=00.93 i=08.71
fail=0 young=0 d=34.02 rd=00.95 i=32.39
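As a rough sketch of the ordering in the log above: this assumes the d column is the delay in days and the i column is the card's interval in days, which is how the rd values work out (rd is roughly i / d, so a smaller rd means a more overdue card). The Card class, the function names and the handling of a zero delay are made up for illustration and are not Anki's actual code.

```python
from dataclasses import dataclass


@dataclass
class Card:
    failed: bool     # the "fail" column
    interval: float  # days, assumed to be the "i" column
    delay: float     # days overdue, assumed to be the "d" column


def relative_delay(card):
    """Interval divided by delay; smaller means more overdue."""
    if card.delay <= 0:
        # Assumption: a card that is not late at all sorts last.
        return float("inf")
    return card.interval / card.delay


def review_order(cards):
    """Failed cards first, then ascending relative delay, as in the log."""
    return sorted(cards, key=lambda c: (not c.failed, relative_delay(c)))


cards = [
    Card(failed=False, interval=32.39, delay=34.02),
    Card(failed=True, interval=18.53, delay=0.08),
    Card(failed=False, interval=3.21, delay=37.83),
    Card(failed=True, interval=9.09, delay=11.96),
]
for card in review_order(cards):
    print(card.failed, round(relative_delay(card), 2))
```

With this definition, a 10 day card that is 1 day late and a 50 day card that is 5 days late get the same relative delay, so neither is favoured over the other.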
Joined: Sep 2006
Posts: 331
Thanks:
0
This sounds really interesting. Do you look at any statistics across users? I would be curious to know if this new optimization (which sounds great to me btw) will actually have any noticeable memory benefit across the user base.
Joined: May 2007
Posts: 910
Thanks:
0
It has two useful properties:
- Failed cards are given priority, even if other cards expire in the interim. You can close Anki with a bunch of failed cards due within 10 minutes, and when you come back to it the next day, they will be displayed first.
- It shows the cards with the largest relative delay first. This means that
a) cards will be displayed in the same order on the server and your local machine, regardless of deck order
b) it improves your chances of answering delayed cards - if you do a little a day, the cards which most need your attention will be displayed first
What statistics do you speak of? I would have thought the optimisation is quite logical. :-)
Joined: May 2007
Posts: 910
Thanks:
0
You're asking "do you have any stats to back up the assertion that cards with a longer delay in answering are less likely to be answered correctly?"
This seems obvious to me, so I'm not exactly keen to spend hours gathering and graphing statistics to try and back this up. Remember that this optimisation is mainly there to improve the handling of scheduling when you don't answer things on time. It's not about "having an effect on memory", it's just trying to ensure that the most urgent cards to review get shown first. In an ideal world, you'll answer all the cards in the same session. All this does is show the "oldest" cards first, where age is relative to the interval of the card.
Joined: May 2007
Posts: 910
Thanks:
0
No need to apologize :-) Anyway, to see the results of this, pay attention to your global correct% after the next Anki release. It should increase a little if you don't always answer cards on time.
Joined: Sep 2006
Posts: 331
Thanks:
0
Of course I know about my own global percent correct and I watch that like a hawk. I was talking about stats across all users. In any case, I think I have cluttered up the thread a bit too much trying to express this idea, so it's time to get back to studying. It really wasn't all that important.
Joined: Apr 2006
Posts: 873
Thanks:
0
Would it be possible to make the prioritising settings optional in one of the preferences windows?
Joined: Aug 2007
Posts: 576
Thanks:
0
Hi Resolve,
I wasn't able to try your advice for a week but it appears to be working now.
I did step 3, but didn't have to do step four to run Anki (and I get version 3.6). Is this OK, or am I gonna run into problems later?
Also, a question: in step four (below), where does cd .. refer to? The anki-0.3.6 folder, or another folder?
4. cd ..; sudo python setup.py install
Thanks for a great product and all the support
Joined: Mar 2007
Posts: 227
Thanks:
0
cd .. takes you to the directory above where you are currently. So if you're in /home/anki-0.3.6/libanki and you cd .. , you'll be in /home/anki-0.3.6
Joined: Apr 2007
Posts: 196
Thanks:
3
The server seems to be down right now, so maybe it's just trying to sync and timing out?
Joined: May 2007
Posts: 910
Thanks:
0
Right, I'm fed up with that router. I'll do something about it today.
(edit: server will be down for a while)
Edited: 2007-10-22, 12:59 am
Joined: May 2007
Posts: 910
Thanks:
0
Using a new router now. Please let me know if any further problems occur.
Joined: May 2006
Posts: 355
Thanks:
0
What was your old router? What did you upgrade to?
Was it just getting swamped under the traffic? Or was it just old?
Joined: May 2007
Posts: 910
Thanks:
0
The phone interface feels quite a bit snappier now, too.
Joined: May 2007
Posts: 910
Thanks:
0
new=not seen before
old=seen before
young=interval < 9 days
mature=interval > 9 days
in initial state=young
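Spelled out as a small sketch, with the caveat that the function name, the constant and the treatment of an interval of exactly 9 days are assumptions (the list above doesn't say which side that case falls on, so it is counted as young here):

```python
YOUNG_LIMIT_DAYS = 9  # threshold quoted in the list above


def card_labels(seen_before, interval_days):
    """Return (new/old, young/mature) labels for a card."""
    age = "old" if seen_before else "new"
    # Assumption: an interval of exactly 9 days counts as young.
    maturity = "mature" if interval_days > YOUNG_LIMIT_DAYS else "young"
    return age, maturity


print(card_labels(False, 0))   # a card in its initial state
print(card_labels(True, 30))   # a card seen before with a long interval
```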
Joined: Apr 2006
Posts: 873
Thanks:
0
Thank you, that's great.
The reason that I asked for some of these settings to be made optional is because I think there may be an argument for not using the prioritising settings that you are outlining above. I realise that you've been thinking about this for a lot longer than I have so please bear with me if there are flaws in my reasoning.
Suppose that after answering a card, there is an ideal time to schedule its next review. Too late and you will have forgotten it, too early and you lose efficiency. Suppose then that Anki does a reasonably good job of scheduling cards at these ideal times. For mature cards especially, it is probably the case that an interval rather than a point in time is ideal. This is captured with your concept of a relative delay.
If you prioritise cards with a high relative delay over ones with a short relative delay then you prioritise ones that have a higher probability of having been forgotten and are the furthest from their ideal review time. In order to maximise efficiency, would it not be better to do the opposite?
There's no way to tell for sure if a card has been forgotten but if you did know for sure then it wouldn't matter when you see it. The relative delay measurement seems to measure the probability that a card will have been forgotten. If you have a list of cards ordered by the probability that they've been forgotten, then the urgent ones are the ones with the smallest probabilities. They're due for review so it's certainly not too early and you'd better review them before it gets too late and you forget. The ones with the higher probabilities are the ones that can be put to one side since you've probably forgotten them anyway.
The argument that is put forward for scheduling old cards over new cards (as I understand it) is that you don't want to waste the learning that you've already done. Prioritising high relative delay cards over low relative delay cards seems that it might contradict this argument for the reasons in the previous paragraph. In addition, I think that failed cards are quite similar to new cards in that you don't know either of them very well. If you prioritise failed cards over correctly answered cards then this also seems that it goes against the argument for prioritising old ones over new ones.
Having said all of the above, I appreciate that the arguments you made for your methods aren't invalid at all. I think that some people may find different prioritising settings more effective than others do, and therefore optional settings might be appropriate.
Joined: May 2007
Posts: 910
Thanks:
0
I don't think that the old/new prioritization is really related to prioritizing high relative delay over low relative delay.
The reason that old cards are shown first is because it's important to review existing material before introducing new material. If you're overwhelmed by your workload, the last thing you want to do is introduce more work.
But prioritizing high relative delays is not introducing more work. You _need_ to answer all of the cards that are pending or you will forget them.
You argue that relatively large delays indicate a high percentage of forgetting, and thus the cards should just be given up on. I don't think giving up on them is appropriate. Say the chance of remembering a long-delayed card is 33%, and the chance of remembering a non-delayed card is 90%. If you prioritize the long-delayed cards, then you have successfully saved all the effort spent on 1/3rd of those cards up until now. The non-delayed cards are still quite strong in your memory, and by the time you get around to answering them, even if the new probability is 75%, you're getting a net gain. This is especially important if you have too much work to do, and can't finish the deck each session (instead, hacking at a little bit of it each time).
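To put rough numbers on that, here is a back-of-the-envelope sketch using the percentages above. The group sizes (100 cards each) and the assumption that giving up on the delayed cards means none of them are recalled are made up for illustration.

```python
# Probabilities quoted in the paragraph above.
P_DELAYED_NOW = 0.33   # long-delayed card, reviewed first
P_ONTIME_NOW = 0.90    # non-delayed card, reviewed straight away
P_ONTIME_LATER = 0.75  # non-delayed card, reviewed after the delayed ones


def expected_recalled(n_delayed, n_ontime, delayed_first):
    """Expected number of cards recalled across both groups."""
    if delayed_first:
        return n_delayed * P_DELAYED_NOW + n_ontime * P_ONTIME_LATER
    # Assumption: the delayed cards are given up on, so none are recalled.
    return n_ontime * P_ONTIME_NOW


with_delayed_first = expected_recalled(100, 100, delayed_first=True)
giving_up = expected_recalled(100, 100, delayed_first=False)
print(with_delayed_first, giving_up)
```

Under those assumptions, doing the delayed cards first comes out at about 108 cards recalled against 90, which is the net gain described above. The comparison would change if the delayed cards kept some chance of being recalled later.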
If you were to do the opposite, and prioritize cards that are near their ideal scheduling time, you may keep the recall ratio for those cards at the ideal 90%, but it becomes impossible to do a little a day. Any missed cards will keep getting pushed further and further away, and the only way to recover from that is to finish the entire deck.
(Remember here that the change in scheduling is to optimise the way we handle partial completions, delays and work overload. If you're regularly finishing the deck, this change won't have a big effect on you either way).
You also suggest that it may not be best to prioritize failed cards. Remember that one of the goals of a session is to finish with every card answered correctly and scheduled for at least a day in the future. This is the reason so many finish practicing and then wait the requisite few minutes for their expired cards to show up. If you have a review schedule which regularly leaves you with more than a few failed cards, you're probably doing something wrong.
"Can you make it configurable?" is usually not something a software author wants to hear. Different people want to tweak different things, and trying to make everyone happy results in a complicated, bloated application which in turn makes other people who like streamlined apps unhappy! It's pretty much impossible to make everyone happy all the time.
I'm all for configurability when it's appropriate, and Anki has a plugin system. With a few lines of Python you could change the scheduling algorithm to whatever you wanted it to be. Any more support than that requires that I understand the reason for making something configurable, and believe that it will actually be useful. At this point I'm not really convinced that keeping the old system or allowing prioritization of on-time cards would be a good idea.
Edited: 2007-10-22, 9:28 am