What makes a mathematical concept “good”

Today, I’ll discuss a topic that I believe is very important for teaching and explaining math as well as for understanding it: what makes some mathematical concepts “good”, and “better” than others. A bit abstract and dry, don’t you think? Here’s a probability example: what justifies the definitions of the variance (\text{var} = E[(x-\mu)^2]) and the standard deviation (\sigma = \sqrt{\text{var}})? If a student asked you that question, would you be able to give them a good answer?

I believe that the quality of a mathematical concept needs to be measured along two different axes:

  • First of all, our mathematical concept needs to capture some intuitions that are relevant for the problem at hand.
  • Second, our concept needs to be mathematically convenient: it needs to just click in whatever mathematical setting we are working in.

The conjunction of those two is very important because it’s often possible to capture the same intuition with many different concepts, but most of them won’t be mathematically convenient. Something I find really mysterious in math is that, when you set out to prove something, there really aren’t many routes that “flow”.

Note that we sometimes want to capture the same intuition with several different concepts, because a given concept might only work well in a small set of circumstances. Outside of those, a different, competing concept may be a better fit.

Also note that the first criterion can be a bit blurry. As you learn higher and higher level math, you also learn higher and higher level concepts, and the intuition these capture can itself be mostly mathematical. For such concepts, it might be a bit harder to distinguish my two axes. Maybe the best way to separate them is to think of the first axis as a long-term goal that we care about reaching, and of the second axis as the convenient properties that make that goal easy to manipulate.

This was way too dry. Let’s go back to examples.

First example: the variance

Let’s try to answer our hypothetical student’s question: why do we define the variance and the standard deviation the way we do? We’ll answer along our two axes.

First: what intuition do the variance and the standard deviation capture? This one is easy enough. They capture the spread of a random variable. It sounds almost tautological, but if a random variable has a bigger variance than another, it means that it is more variable. The standard deviation measures (very roughly) the width of the region around the mean in which the random variable tends to fall.

At this point, our hypothetical student interjects: there are many other measures which would capture roughly “the width of where the random variable can be around its mean”. For example, the L1 deviation: \delta = E( |x-\mu| ). Why don’t we use this one instead?

The reason \delta isn’t great is that it isn’t mathematically convenient. The key property that the variance offers is that it is additive over independent variables: if x and y are independent, then \text{var}(x+y) = \text{var}(x) + \text{var}(y). Sums of random variables are ubiquitous, so this is a really important property. There are plenty of other ways in which the variance is a convenient concept, but I believe this to be the key one (though if you have more, please tell me). The standard deviation inherits the mathematical convenience of the variance, so it’s a better way of measuring the width of a random variable than \delta.
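To make the contrast concrete, here is a small sketch in Python that checks this on a toy case. The example is my own choice for illustration: two independent fair coins taking the values -1 and +1, with distributions stored as dictionaries mapping values to exact probabilities. The helper names (variance, l1_deviation, independent_sum) are hypothetical and not taken from any library.

```python
from fractions import Fraction
from itertools import product


def mean(dist):
    """Expected value of a finite distribution given as {value: probability}."""
    return sum(p * x for x, p in dist.items())


def variance(dist):
    """E[(x - mu)^2] for a finite distribution."""
    mu = mean(dist)
    return sum(p * (x - mu) ** 2 for x, p in dist.items())


def l1_deviation(dist):
    """E[|x - mu|] for a finite distribution."""
    mu = mean(dist)
    return sum(p * abs(x - mu) for x, p in dist.items())


def independent_sum(dist_x, dist_y):
    """Distribution of x + y when x and y are independent."""
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        out[x + y] = out.get(x + y, 0) + px * py
    return out


half = Fraction(1, 2)
coin = {-1: half, 1: half}            # toy example: a fair +/-1 coin
total = independent_sum(coin, coin)   # sum of two independent coins

print("variance:    ", variance(coin), "->", variance(total))
print("L1 deviation:", l1_deviation(coin), "->", l1_deviation(total))
```

In this toy case the variance of the sum is exactly the sum of the two variances, whereas the L1 deviation of the sum is the same as that of a single coin. So \delta follows no simple additive rule here, which is the inconvenience the paragraph above describes.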

We can now give a short answer to our student. We define the variance this way because it captures something we care about, namely the spread of a random variable, and because it has good mathematical properties, a key one being that it is additive over independent variables.

A context in which variance isn’t the right concept

Variance is not always the best measure of the width of a random variable though, which is why we don’t always want to work with it. For example, when we use Bayes’ formula, we are not working with sums of random variables but with products of density functions. The concept of variance still captures the right intuition but stops being so mathematically convenient. One concept that does work for products of density functions is the Fisher information, but I won’t describe it here because that would take too long.
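Without going into the general theory, here is a minimal sketch of the simplest case I know of, under the assumption that both densities are Gaussian. The product of two Gaussian densities in x is proportional to another Gaussian whose inverse variance (the precision) is the sum of the two inverse variances. For a Gaussian, the inverse variance is also the Fisher information about its mean, which is the link to the concept mentioned above. The function names and the numbers are made up for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def multiply_gaussians(mean1, var1, mean2, var2):
    """Mean and variance of the Gaussian proportional to the product of two
    Gaussian densities in x. Precisions (inverse variances) add."""
    precision = 1 / var1 + 1 / var2
    mean = (mean1 / var1 + mean2 / var2) / precision
    return mean, 1 / precision


# Illustrative numbers: a wide density centered at 0 and a narrow one at 3.
mean, var = multiply_gaussians(0.0, 4.0, 3.0, 1.0)
print("combined mean and variance:", mean, var)

# The product divided by the combined density should be the same constant
# at every x, which confirms the two are proportional.
ratios = [
    gaussian_pdf(x, 0.0, 4.0) * gaussian_pdf(x, 3.0, 1.0) / gaussian_pdf(x, mean, var)
    for x in (-1.0, 0.5, 2.0, 4.0)
]
print("ratios:", ratios)
```

Here the variances themselves combine in an awkward way, but their inverses simply add, which is the kind of convenience we lost when moving from sums of variables to products of densities.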

Why it matters

I believe it’s quite important to keep in mind what makes a good mathematical concept. First of all, when learning about a new subject, it’s really important to learn its structure. If, for every mathematical concept you encounter, you can identify what intuitions it captures and the various ways and contexts in which it’s mathematically convenient, that structures your thoughts and helps you learn more easily and more thoroughly. Math is all about structure, and making sure your understanding of math is well-structured is very important.

The flip side is that it’s also very important, when teaching, to keep in mind why concepts are good. By structuring your presentations and communicating clearly what makes the concepts you use good, you help your audience understand the subject, and everybody benefits.

Finally, it’s very important to keep in mind what makes a good mathematical concept good when doing research. Sometimes the solution to a problem can’t be reached from the currently used concepts of a field. These problems require developing new concepts and having a good idea of what one should be looking for is quite important there.


A good concept captures something we care about while being convenient to work with. Keep this in mind when doing research, and when learning and teaching math: being explicit with structure can only help you.

As always, feel free to correct any inaccuracies, errors or spelling mistakes, and to send me comments by email! I’ll be glad to hear from you.


Why statistics are awesome

I will start this blog by talking about why statistics are pretty much the coolest domain of math. To sum up, it’s because probability is the language of reason in a stochastic world.

You might be surprised by my thesis: isn’t logic the true language of reason? Logic is indeed a language of reason, but one that only applies to deterministic worlds! Logic is perfectly suited to the realms of mathematics, in which everything is determined by absolutely rigid rules from which you can’t escape.

However, if we need to reason in a world in which relationships are not completely deterministic, we need a language which has more expressive power than logic, and this is where probabilities and statistics come into play. These two are closely related, but if we really want to make a distinction, we could say that probabilities deal with how we should act to accomplish a goal efficiently (e.g. what is a good poker strategy), whereas statistics deals with what we should believe (e.g. is this pattern I’m seeing significant or not). Not only is statistics a language that is suited to describing inferences in a stochastic world, one can also show that Bayesian statistics actually contains logic as a special case.

Statistics thus has a very important role to play in the scientific method. Most of science is basically trying to find patterns and checking whether established patterns hold up in new situations. For example, if you wanted to test whether Newton’s theory of gravitation is a better account of reality than Einstein’s, you would design an experiment in which the two theories give different predictions, collect data, put on your statistician hat and check which theory the data agrees with (if it agrees with any).

But statistics actually plays a much wider role in this world: every human on earth (and most animals) is an intuitive statistician, collecting data about the regularities of their environment and trying to act on those. This statistical knowledge is a fundamental component of the behavior of every being on this planet. For example, you know intuitively what normal weather is in the city where you live, and you can probably predict the weather for the next few days because you have observed the regularities of the weather throughout your life. You know intuitively how your friends and family react to a wide variety of situations because you have lived through a wide variety of situations with them. And even for a person you’ve just met, you can guess how they would act, because they are most likely going to act like somebody you already know! Our world is filled with patterns, and it seems like it was particularly beneficial for animals to recognize those in order to increase their survival. A lot of animals thus seem to have “learned” (through evolution) intuitive statistics, through the prism of which they interpret the world.

To conclude, when we study statistics, we are studying the fundamental language of reason, which is not only at the heart of science, but also inside the head of every creature and of every human on this little earth. If that’s not cool, then I don’t know what is.

If you want to read more on this, I can only recommend E.T. Jaynes’s book, which is absolutely awesome, and which makes this point much better than I ever could.