Learning How To Learn
Why School Didn't Teach You This
The irony is thick: twelve or more years of formal education, and almost none of it included instruction in the science of learning. Students are taught what to learn in exhaustive detail. They are almost never taught how to learn.
This is partly institutional inertia — teaching effective study methods is a layer of complexity on top of an already-complex system. It is partly because most teachers were not taught these methods either. And it is partly because effective methods look counterintuitive from the inside: they feel like they are not working even when they are producing excellent results.
The result is that most educated adults use a small set of study strategies — re-reading, highlighting, reviewing notes, re-doing examples — that feel productive and are substantially ineffective for long-term retention and transfer.
This is not a peripheral concern. If learning efficiency doubles, the effective time budget for skill acquisition doubles without any additional hours. For someone building skills across multiple domains — which describes anyone operating in a complex, changing world — learning how to learn is one of the highest-leverage investments available.
The Memory Architecture You Are Working With
Learning science makes more sense once you understand the basic architecture of human memory.
Working memory is the active workspace where you consciously process information. It is severely limited — roughly 4-7 chunks of information at once. When something is too complex for working memory to hold, learning breaks down. This is why gradual progression matters: you cannot learn calculus before algebra, because calculus requires working memory to process concepts that presuppose algebraic fluency (which has to be automated, not occupying working memory).
Long-term memory is effectively unlimited in capacity but is accessed through retrieval — a process that is not perfectly reliable and that degrades without reinforcement. Information in long-term memory is organized associatively: concepts that are connected to many other concepts are more retrievable than isolated facts.
The forgetting curve: Without reinforcement, memories decay in a predictable pattern. Roughly half of new information is forgotten within 24 hours without review. Within a week, most of it is inaccessible without deliberate retrieval practice. This is why cramming for a test produces a test score but not durable knowledge — the material enters short-term memory in quantity but decays rapidly without spaced reinforcement.
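The shape of the curve can be sketched with a simple exponential-decay model. This is a common simplification, not a formula from the text; the stability constant is illustrative, chosen to match the "roughly half within 24 hours" rule of thumb above.

```python
import math

def retention(hours_since_review, stability=34.6):
    """Fraction of material still retrievable after a delay.

    Simple exponential model of the Ebbinghaus forgetting curve:
    R = e^(-t/S). The default stability S is illustrative, chosen so
    that roughly half survives the first 24 hours. Each successful
    retrieval raises S, flattening the curve.
    """
    return math.exp(-hours_since_review / stability)

print(round(retention(24), 2))      # about half remains after one day
print(round(retention(24 * 7), 2))  # a week later, almost nothing does
```

Raising `stability` after each successful retrieval is what "resetting the curve at a higher level" means: the same delay then costs far less retention.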
The design of effective learning is the design of a process that moves information from working memory into durable, retrievable long-term memory through repeated, spaced retrieval.
The Evidence Hierarchy: What Works and By How Much
John Dunlosky and colleagues published a landmark 2013 meta-analysis in Psychological Science in the Public Interest, reviewing ten common study techniques across hundreds of studies. They rated each for utility. The results:
High utility:
- Practice testing (retrieval practice)
- Distributed practice (spaced repetition)

Moderate utility:
- Elaborative interrogation
- Self-explanation
- Interleaved practice

Low utility:
- Summarization
- Highlighting/underlining
- Keyword mnemonics (for specific narrow uses)
- Imagery for text learning
- Re-reading
The low utility strategies are exactly the ones that students most commonly use and report as their primary methods. This is not random: these strategies feel good. Re-reading produces fluency — the material flows smoothly, recognition is easy. This fluency is misinterpreted as knowledge. The cognitive psychology term for this is "fluency illusion" — the mistaking of processing ease for durable retention.
Retrieval Practice: The Central Tool
The testing effect was documented as early as 1909 (in studies by Edwina Abbott, followed by Arthur Gates's recitation experiments in 1917) and has been replicated hundreds of times in the intervening century. It is one of the most robust findings in cognitive psychology, and it is consistently underused.
The mechanism: each time you attempt to retrieve a memory, you strengthen the memory trace more than any amount of passive review would. The effort of retrieval — the slight struggle to pull something from memory — is itself the mechanism of consolidation. This is why easy review produces little memory gain: it requires no retrieval effort.
Practical applications:
Flashcards: The classic retrieval practice tool. The failure mode is treating them as a passive review mechanism: flipping cards and reading both sides. Used correctly, you look at the prompt, attempt to generate the answer in your head before turning the card, then compare your attempt against the actual answer.
The blank page: After reading a chapter or watching a lecture, close everything and write from memory what you just encountered. Do not look at your notes. Write until you have nothing more to add. Then open your materials and check. The gaps are your actual learning agenda.
The Feynman technique: Explain the concept you are trying to learn as if teaching it to a non-expert. Do this from memory, without notes. When you cannot explain it clearly — when your explanation becomes circular or vague — you have found a gap in your understanding. Go back to the source material and fix it.
Practice problems from varied sources: For quantitative subjects, solving problems you have not seen before is retrieval practice applied to procedures. It forces you to identify which method applies and execute it from memory, which is what mastery actually requires.
Spaced Repetition in Practice
The spacing effect is almost as old and robust as the testing effect. Hermann Ebbinghaus documented it in 1885. He also discovered the forgetting curve — the rate at which memories decay without reinforcement — and noted that each successful retrieval resets the curve at a higher level.
Optimal spacing is not fixed. For very new, fragile memories, short intervals are appropriate. As a memory becomes more robust — demonstrated by successful retrieval — the interval until the next review can lengthen. This graduated spacing is what Anki and similar spaced repetition systems implement algorithmically.
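Graduated spacing can be sketched as a single review step. This is a simplified sketch in the spirit of the SM-2 algorithm that Anki descends from, not Anki's actual scheduler; the constants and the 0-5 grading scale are illustrative.

```python
def next_interval(interval_days, ease, quality):
    """One review step of a simplified SM-2-style scheduler (a sketch,
    not Anki's real algorithm).

    quality: self-graded recall from 0 (blackout) to 5 (perfect).
    Returns (new_interval_days, new_ease).
    """
    if quality < 3:
        # Failed retrieval: the memory is fragile again, so restart
        # with a short interval (ease left unchanged in this sketch).
        return 1, ease
    # Successful retrieval: nudge ease up for effortless recalls,
    # down for hard ones, with a floor so intervals keep growing.
    ease = max(1.3, ease + 0.1 - (5 - quality) * 0.08)
    if interval_days == 0:
        return 1, ease   # first success: see it again tomorrow
    if interval_days == 1:
        return 6, ease   # second success: push it out most of a week
    return round(interval_days * ease), ease  # then grow geometrically

# Repeated perfect recalls space reviews out: 1, 6, 17, ... days,
# while a single failure drops the card back to a one-day interval.
```

The key property is the one described above: successful retrieval lengthens the interval, failure shortens it, so review effort concentrates on exactly the memories that are about to decay.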
For most users, Anki is the tool. The setup cost is real — someone has to create the flashcard deck, and creating good cards is a skill. The return is significant: a well-built Anki deck reviewed daily allows the maintenance of thousands of items in long-term memory with 20-30 minutes of daily review. Medical students routinely maintain 10,000-20,000 item decks. Foreign language learners use Anki to sustain vocabulary across multiple languages.
Card creation principles matter:
- One fact per card: "What is the function of the hippocampus?" rather than a card covering five related facts.
- Cards should require active production: "What year was the Battle of Hastings fought?" works better than a recognition prompt such as "Was the Battle of Hastings fought in 1066?" because you must produce the answer from memory rather than merely confirm it. Fill-in-the-blank phrasing ("The Battle of Hastings was fought in ____") forces the same retrieval.
- Both directions for important facts: if you need to recognize the concept from the name AND name the concept from the description, make both cards.
- Cloze deletion for complex facts: for material that has multiple elements, cloze deletions (fill-in-the-blank style cards) are highly efficient.
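The cloze-deletion idea can be sketched in a few lines. The `{{...}}` marker format here is an illustrative assumption, similar in spirit to Anki's cloze syntax but not identical to it: each marked span becomes one card, blanked on the front while the other spans are shown in full.

```python
import re

def cloze_cards(text):
    """Expand one sentence into fill-in-the-blank (front, back) cards.

    Key elements are wrapped in {{...}} — a hypothetical marker
    format, not Anki's exact syntax. Each marked span yields one
    card: that span is blanked out, the rest are revealed.
    """
    answers = re.findall(r"\{\{(.+?)\}\}", text)
    cards = []
    for answer in answers:
        front = text.replace("{{" + answer + "}}", "____")
        front = re.sub(r"\{\{(.+?)\}\}", r"\1", front)  # reveal the rest
        cards.append((front, answer))
    return cards

for front, back in cloze_cards(
    "The {{hippocampus}} is critical for forming {{declarative}} memories"
):
    print(front, "->", back)
```

One sentence with several marked elements yields several cards, each testing a single fact — which is why cloze deletion is efficient for complex material.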
Interleaving: The Counterintuitive Mixer
Blocked practice — finishing all problems of type A before moving to type B — feels productive and produces good within-session performance. Interleaved practice — mixing types A, B, and C within a session — feels chaotic and produces worse within-session performance. But on delayed tests, interleaved practice produces dramatically better results.
The reason: blocked practice lets you use the same method over and over without having to identify which method is appropriate. Interleaving forces discrimination — each problem requires you to first figure out what kind of problem it is, then solve it. This discrimination skill is exactly what transfer to novel problems requires.
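The difference between the two schedules is easy to make concrete. A minimal sketch, with hypothetical topic labels; real interleaving would also vary problem difficulty and source.

```python
import random

def blocked_schedule(types, per_type):
    """AAAA BBBB CCCC: finish every problem of one type before the next.
    The solution method never has to be chosen, only executed."""
    return [t for t in types for _ in range(per_type)]

def interleaved_schedule(types, per_type, seed=0):
    """The same problems, shuffled, so each item first demands the
    question 'which kind of problem is this?' before it can be solved."""
    schedule = blocked_schedule(types, per_type)
    random.Random(seed).shuffle(schedule)
    return schedule

print(blocked_schedule(["area", "volume", "surface"], 3))
print(interleaved_schedule(["area", "volume", "surface"], 3))
```

Both schedules contain exactly the same work; only the ordering, and therefore the discrimination demand, differs.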
Applied to reading: reading chapters from multiple books in a subject area, rather than completing one book before touching another, often produces better understanding and retention. Applied to language learning: mixing vocabulary, grammar exercises, and listening comprehension within a session rather than blocking them. Applied to physical skills: mixing different shots in tennis practice rather than drilling one shot for 30 minutes.
Metacognition: Knowing What You Know
Effective learning requires accurate self-monitoring — knowing which material you know well and which you only think you know. Most learners are systematically overconfident.
The Dunning-Kruger effect is partly a metacognition failure: people with limited knowledge in a domain lack the knowledge to recognize the limits of their knowledge. But even skilled learners are susceptible to fluency illusion — mistaking ease of reading for depth of knowledge.
Regular low-stakes testing serves as a calibration mechanism. It creates honest feedback about actual knowledge vs. perceived knowledge. The discomfort of discovering you cannot recall something you thought you knew is valuable information, not failure.
A practical calibration practice: before each study session, write down what you believe you know about the material you are about to study. After the session, check this against what you were able to retrieve. The gap between your pre-session belief and your actual retrieval is your calibration gap.
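The pre/post comparison can be turned into a number. The scoring scheme below is an illustrative assumption, not from the text: confidence is self-rated on a 0-1 scale before the session, and retrieval is scored pass/fail afterwards.

```python
def calibration_gap(predicted, recalled):
    """Average gap between believed and demonstrated knowledge.

    predicted: topic -> self-rated confidence in [0.0, 1.0], written
               down before the study session.
    recalled:  topic -> True/False, whether the item could actually
               be retrieved afterwards.
    A positive result means overconfidence: you knew less than you
    thought. The scheme is illustrative, not the author's method.
    """
    gaps = [confidence - (1.0 if recalled.get(topic, False) else 0.0)
            for topic, confidence in predicted.items()]
    return sum(gaps) / len(gaps)

print(calibration_gap(
    {"spacing effect": 0.9, "testing effect": 0.8},
    {"spacing effect": True, "testing effect": False},
))
```

Tracked over weeks, the number itself matters less than its trend: regular retrieval practice should pull it toward zero.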
Deep Learning vs. Surface Learning
There is a qualitative difference between surface learning — recognition and recall of specific facts in familiar formats — and deep learning — understanding principles well enough to apply them in novel contexts.
Deep learning requires:
- Understanding causal mechanisms, not just facts. Not "the answer is X" but "the answer is X because of the following chain of reasoning."
- Multiple representations of the same concept. Mathematical relationships represented algebraically, geometrically, and numerically create richer understanding than any single representation.
- Cross-domain connections. Noticing that the same principle appears in physics, economics, and ecology builds the kind of integrated understanding that transfers across problems.
- Application in novel contexts. Actually using knowledge to solve problems it was not originally presented with. Teaching others is the most demanding form of this.
The difference in outcome between surface and deep learning compounds over time. Surface learners accumulate isolated facts that have limited application. Deep learners build mental models that apply to new problems automatically.
Building the Practice
The minimal effective dose for a serious learner:
1. Whenever encountering new material you want to retain: Use retrieval practice within 24 hours of first exposure. Blank page, flashcards, or explanation from memory.
2. For material that must remain accessible over time: Add to a spaced repetition system. Review daily.
3. For skill acquisition: Interleave problem types. Seek to explain, not just solve.
4. Weekly: Identify one thing you thought you knew but discovered you did not. Find the gap. Fix it.
The meta-skill is available to anyone with the time to understand it and the discipline to practice methods that are uncomfortable precisely because they work.