During covid I would often see on Twitter, or hear in real life, things like “we should replace calculus with statistics”. In the ML world I often see “we should replace calculus with linear algebra”. This got me thinking: what exactly should we teach in high school? What concepts are foundational to other domains, while providing real tools for everyday life, *and* are simple enough to be taught to kids in high school?

There seems to be a lot of agreement among the twitterati to get rid of calculus. Perhaps this has something to do with everyone’s experience trying to learn it. All I can say is, at least you didn’t have to take Real Analysis at Harvey Mudd, using “baby Rudin” as your textbook. (If you know what Harvey Mudd is, or what baby Rudin is, you know this was hard.)

Incidentally, I think calculus is very important (it satisfies the criteria I set out above, and we should keep it), but it is taught entirely incorrectly. I am a concept-maximalist and an equation-minimalist when it comes to math (at least before it gets Serious, and you find yourself in a Harvey Mudd lecture hall), and the core concepts of calculus are fairly simple: what happens when something goes on forever? (limits) How do we translate between quantities (integrals) and rates (derivatives)? Slopes and areas are easy to draw, easy to see, and easy to intuit; but basically no one in their life ever has to actually solve an integral. (The entire world of Bayesian computation is premised on the concept that integrals are too hard.)

If you think that concepts are supreme and that rote formulas are secondary, then I hope you will agree with me that statistics ought not to be taught in high school, and should only be taught *after* probability. But the question laid out in the title was not about the sequence of concepts, but the choice of concepts to teach to all young adults. And I think there would be no better choice than probability.

Why probability before statistics? The concepts of probability inform the math of statistics. Yes, you can learn the math of statistics first, but that’d be a bit like learning the entirety of the Spanish dictionary without learning any of its grammar. Sure, you can say some words, and to someone who doesn’t know the language, it will *sound* like you’re speaking it, but you’re not saying anything coherent.

Most misunderstandings of statistics come from misunderstandings of probability, including everyone’s favorite bugbear, the common mangling of the meaning of a p-value. Thinking a p-value is “the probability the null hypothesis is true” comes from reversing the order of a conditional probability statement: a p-value is actually “if the null hypothesis is true, we’d see data like this x% of the time”, but it gets twisted into “if we see data like this, the null hypothesis is true x% of the time”.
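Written in symbols, the two statements are different conditional probabilities. Here is a rough sketch, where $H_0$ is the null hypothesis and $D$ is my own shorthand for "data at least as extreme as what we saw":

```latex
\[
p\text{-value} = P(D \mid H_0)
\qquad \text{versus} \qquad
P(H_0 \mid D) = \frac{P(D \mid H_0)\, P(H_0)}{P(D)}
\]
```

Getting from the left-hand quantity to the right-hand one requires a prior $P(H_0)$ and the overall probability of the data $P(D)$, and a p-value supplies neither.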

And practically, most people don’t ever need to do a chi-squared test to see if a two-way association is significant (and I mean *ever*), while almost everyone at some point will need to incorporate a base rate into their calculations, even if those calculations are rough and intuitive, or don’t even feel like calculations at all, just part of the subconscious analysis that goes into understanding the world around you.

The base rate fallacy is not the only one that learning probability inoculates against; the gambler’s fallacy (the idea that “black is due” if the roulette wheel landed on red three times in a row), the conjunction fallacy (is it more likely that Russia’s next president is a woman, or that Putin resigns and makes his wife president?), and of course, the “confusion of the inverse”, which leads to “the chance that the null hypothesis is true”, are all slippery habits of the mind that are given firmer footing by a basic understanding of probability.
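The gambler’s fallacy is easy to check by brute force. The sketch below lists every possible run of four spins and asks how often black follows three reds. It assumes a simplified wheel with only red and black, equally likely (a real wheel also has green pockets), and independent spins.

```python
from fractions import Fraction
from itertools import product

# Assumption: a simplified wheel with only two equally likely colors.
COLORS = ("red", "black")

# Every possible sequence of four independent spins; all are equally likely.
sequences = list(product(COLORS, repeat=4))

# Keep only the sequences that open with three reds in a row.
after_three_reds = [s for s in sequences if s[:3] == ("red", "red", "red")]

# Among those, count how often the fourth spin is black.
black_next = [s for s in after_three_reds if s[3] == "black"]
p_black_after_streak = Fraction(len(black_next), len(after_three_reds))

# The chance of black on any single spin, for comparison.
p_black = Fraction(1, len(COLORS))

print(p_black_after_streak, p_black)  # prints: 1/2 1/2
```

The streak changes nothing: the conditional probability of black after three reds equals the marginal probability of black, which is what independence means.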

Three simple concepts, marginal probability (what is the chance that X happens?), conditional probability (if Y happens, what is the chance that X happens?), and joint probability (what is the chance that both X and Y happen?), have an outsized power to give everyday people real tools to understand the world we live in, and to answer basic, relatable questions like “what is the chance that Steph Curry hits this next free throw?” or “why are so many vaccinated people getting covid?”. (Incidentally, the answer to the first is not trivial, as there is a lot of debate about whether the “hot hand” exists in sports.)
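To make the vaccination question concrete, here is a small sketch that computes all three kinds of probability from one two-by-two table. The counts are invented purely for illustration (they are not real data), and the function names are my own.

```python
from fractions import Fraction

# Hypothetical counts for 1,000 people; made up for illustration only.
counts = {
    ("vaccinated", "infected"): 45,
    ("vaccinated", "healthy"): 855,
    ("unvaccinated", "infected"): 30,
    ("unvaccinated", "healthy"): 70,
}
total = sum(counts.values())

def joint(status, outcome):
    """Joint probability: P(status and outcome)."""
    return Fraction(counts[(status, outcome)], total)

def marginal_status(status):
    """Marginal probability: P(status), summed over both outcomes."""
    return sum(joint(s, o) for (s, o) in counts if s == status)

def marginal_outcome(outcome):
    """Marginal probability: P(outcome), summed over both statuses."""
    return sum(joint(s, o) for (s, o) in counts if o == outcome)

def outcome_given_status(outcome, status):
    """Conditional probability: P(outcome | status)."""
    return joint(status, outcome) / marginal_status(status)

def status_given_outcome(status, outcome):
    """Conditional probability: P(status | outcome)."""
    return joint(status, outcome) / marginal_outcome(outcome)

print(marginal_status("vaccinated"))                    # 9/10
print(outcome_given_status("infected", "vaccinated"))   # 1/20
print(outcome_given_status("infected", "unvaccinated")) # 3/10
print(status_given_outcome("vaccinated", "infected"))   # 3/5
```

With these made-up numbers, most infected people are vaccinated (3/5) even though a vaccinated person is six times less likely to be infected (1/20 versus 3/10). The explanation is the marginal: 9 in 10 people are vaccinated to begin with.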

But is probability too advanced for high schoolers? I should think not. I can only speak for those interested in sports, but no one seemed to have any trouble figuring out what a batting average was growing up. And the pedagogical tools for probability need go no further than a few equations (sorry!) manipulated by simple algebra, and a few two-by-two contingency tables, which can easily be drawn by representing probabilities as areas. If you can teach kids derivatives, you can certainly teach them Bayes’ theorem.
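For a sense of how little machinery Bayes’ theorem needs, here is a minimal sketch using a screening test. The function name, its parameters, and the numbers are all hypothetical, chosen only so the arithmetic is easy to follow.

```python
def bayes_posterior(prior, p_pos_given_true, p_pos_given_false):
    """P(condition | positive test) via Bayes' theorem.

    prior             -- P(condition), the base rate
    p_pos_given_true  -- P(positive | condition)
    p_pos_given_false -- P(positive | no condition)
    """
    true_positives = prior * p_pos_given_true
    false_positives = (1 - prior) * p_pos_given_false
    return true_positives / (true_positives + false_positives)

# Hypothetical numbers: 1% base rate, 90% detection rate, 5% false alarms.
posterior = bayes_posterior(prior=0.01, p_pos_given_true=0.90,
                            p_pos_given_false=0.05)
print(f"{posterior:.3f}")  # prints: 0.154
```

The same answer falls out of a two-by-two table. Of 2,000 people, 20 have the condition and 18 of them test positive, while 99 of the other 1,980 also test positive, so a positive result means the condition is present 18 times out of 117.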

In fact, I say we should go further. Since they’re already learning correlation (dependence between variables), it would be natural for them to ask *why* variables are correlated, a great opening to start teaching the difference between correlation and causation.

If you were to permit the instructor to draw arrows on the chalkboard (it was still chalk in my day — is that dating me?), you could teach the elements of causal inference, and why certain variables might be correlated even while neither is the cause of the other.
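Here is what one of those arrow diagrams looks like as a toy simulation. It assumes a hidden variable Z with arrows into both X and Y and no arrow between X and Y. The variable names, noise levels, and sample size are arbitrary choices of mine.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20_000

# Assumed diagram: Z -> X and Z -> Y, with no arrow between X and Y.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

raw = pearson(x, y)

# Strip out Z's contribution. Subtracting it directly works only because
# this toy example was built with a known coefficient of 1 on Z.
x_resid = [xi - zi for xi, zi in zip(x, z)]
y_resid = [yi - zi for yi, zi in zip(y, z)]
adjusted = pearson(x_resid, y_resid)

print(f"corr(X, Y) = {raw:.2f}; after removing Z = {adjusted:.2f}")
```

X and Y come out correlated at roughly 0.5 even though neither causes the other, and the correlation all but vanishes once the common cause is accounted for.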

I didn’t learn probability theory until college, when it was entangled with things, like measure theory, that are wholly unnecessary to understand what actually matters. If you’re going to do serious math, or if you want to make a lot of money understanding how risk-weighting is a translation between two measure spaces of area one, yes, you do need to learn measure theory. But most people do not. Probability is relatable, straightforward to teach, and *important*. We should teach it to all young adults.