Writing a punchy paper

I’m currently preparing a paper for the (very great) “Advances in Approximate Bayesian Inference” NIPS workshop (here is their website). The constraints are pretty tight, as the paper can only be 4 pages long (we can provide supplementary information in excess of that, thankfully).

I’ve thus spent this morning trying to make that paper punchy, which gives me a good excuse to detail (and thus think about) my writing process. Let’s dive into it.


High-level description

Before I start writing the paper, I like to start with a simple list of high-level questions that give a synthetic description of the paper.

  • What is the context of the article?
  • What is the precise issue that I’m addressing?
  • What does the reader learn from my paper?
  • What background information does he need to understand my paper? (Bonus question: will most readers have that background information? Do I need to expand my paper to reduce the requirements?)
  • Why does the reader care? (The most important question! Even if you know your paper isn’t groundbreaking, give your readers a reason to care.)


Most often, I also think of how I would describe the paper in two or three sentences to a colleague. Conveniently, this can be done by stringing together the answers to all the “big questions” which I just presented. For my article, we get:

In order to do Bayesian machine learning, we most often use approximations, but we don’t have results quantifying whether they are good or not. This article is a theoretical contribution giving a computable measure of the quality of one approximation.

The result refers to slightly exotic measures of distance between probability distributions, but nothing too crazy.

This gives a powerful tool for checking whether our approximations are good or not for little extra cost.

This casual abstract of the paper is valuable because it makes explicit all of the important points of the paper. Hopefully, this will prevent us from forgetting them!


Structuring the paper

The next thing I do is organize my paper.

I always start with an introduction. Its objective is to tell the reader why he should care about reading the rest of the paper. We need to recall the global and local context of the problem we address, and argue that it is an interesting one. We also need to tell him what lessons we can learn from studying it. It is fine to tease in the introduction the results that we are going to derive.

I usually adopt the following structure:

  • Global context (one paragraph): what is the scientific field this article is relevant for? Which sub-theme in that field does my article tackle?
    I like to make this paragraph interesting even for scientists in neighboring fields (for example, if my article is in Bayesian statistics, I’d want frequentist statisticians to still be interested in it, so I’ll make sure my introduction caters to them).
    Ideally, the first sentence of the introduction is a “hook”: a provocative / thought-provoking summary of the global context. However, this is the ideal case, and we do not always get there.
  • Local context (one-two paragraphs): what precise problem are we tackling?
    This is usually where I do my literature review: we recall earlier solutions that exist in the community, and explain how we can improve on them.
  • This article (one paragraph): details of the solution we provide.
    In this paragraph, I give a quick recap of the solutions that the article provides. I want the reader to be interested in what I’m writing, and this seems like a good way: we are telling him what he will gain from reading our article.
  • Structure of the article (one paragraph): I finish my introduction by detailing how the article is organized.


The next step is organizing the body of the article. This is heavily dependent on what the article is about, but I’m usually looking for the following features:

  • Causality: my reader shouldn’t have to jump around my article when reading it. This means that I’ll start with the simplest points, and then continue with the more complicated ones. If I need to provide background information that the reader might be lacking, I’ll do so at the beginning of the article, etc.
  • Flow: as much as possible, ideas presented in the article should (appear to) flow naturally from one to another.
    I don’t want my reader to get stuck on a rough transition, and I want him to understand why a section is finished and why we are moving on to the next one.
  • Transparency: my goal is always for the reader to understand what I’m doing. I thus go overboard with stating my goals within sections and recapping what we’ve learned, etc.
    While the structure is obvious to us, the writers, it might not be so to our reader. My greatest fear is my reader getting lost in my paper.
  • Ease of reading: an over-arching principle of my writing is that I want to make it easy for my reader. If I have a decision to make on my paper, my focus is always on what makes it the easiest to read.

These are general objectives. Since I mostly write about statistical theory, I also have a few ideas which I think are specific to that theme:

  • Intuitions are better than proofs: my objective is to write for most readers so that my paper speaks to as large an audience as possible. Most readers won’t dive into proofs, or won’t gain a lot by reading them (even very technically competent readers) because proofs are boooooorrrinnng.
    In my theoretical papers, my objective is thus to give intuitive derivations of my results. These intuitive derivations aren’t fully rigorous proofs, but they are “lighter” than a full proof. As such, they are much easier for my readers to digest and to commit to memory.
    I delegate fully rigorous proofs to supplementary information.
  • It’s better to have too much background: say my proof relies on some background information, I’d rather present it in my article instead of relying on my reader knowing it.
    Indeed, if the reader already knows it, he might still benefit from a refresher / different perspective on the topic. At worst, he’ll be bored but, if I’ve done a good job describing the structure of my paper, he’ll feel free to skip the sections he already knows and nothing will be lost.
    If the reader doesn’t know it, then it would be catastrophic to not present the background, as it means that he would either not understand my paper, or he’d need to take a break from it (and thus risk never coming back).
    Thus, adding more background has little cost but heavy benefits.
  • Be punchy instead of being fully general: this is related to my first point about intuitions, but I want to focus here more on the statement of our theorems. In many cases, a theoretical statement can be made in many different variants. For example, we might be able to weaken our assumptions to obtain a more general theorem, etc.
    There is a mistake here that I wish to avoid: focusing too much on generality. My worry here is confusing my reader by stating a theorem that is too general, too quickly. I’d rather ease him into the fully general result by starting from a more intuitive and easier-to-understand theorem.
    This is counter-intuitive because, as mathematicians, we are always told that generality is always better. This isn’t the case. A general result can also be harder to understand because there are more “moving pieces” in it. Instead, by going from simple to general, we can help the reader focus on the more important parts of the result first, so that he doesn’t feel lost when facing the general theorem.


Finally, there are the concluding sections. I include in this the discussion section and the true conclusion.

My objective here is to make the contribution of the article clear and memorable, and to highlight interesting potential follow-ups. Since I focus on theory, this is a good place to give an example of an application of the theorem presented in the article.


Writing!

This section is pretty straightforward: you sit down in front of your computer, get into the zone and WRITE!

Here is some useful but minor advice that works well for me:

  • Avoiding the “blank page syndrome”: it’s hard to get started. Usually, my problem is that all my ideas just seem to be really bad.
    My trick is to just power through it: I sit down and start writing whatever ideas pop into mind for the introduction. As I’m writing down, I might feel like my output is really bad, but I ignore those feelings. I’ll then come back to the section later to improve it.
  • Write – discard – repeat: when I decide to rewrite a section (which occurs a lot because my first drafts tend to be poor), my approach is usually to re-read the section, then copy-paste it somewhere safe, and start from scratch. This way, I don’t get lost in the details of modifying a document, which I find to be exhausting. What I have instead is a blank slate, and some ideas of what went well and what went poorly in the preceding version. I find it much easier to write this way.
  • Write it out, then edit: I almost never start editing a section before I finish my first draft of the whole document. I find that you have a much clearer picture of the document once you’ve written it out (at least) once. Thus, I refrain from trying to make the sections better before I have that clearer picture.
  • Write quickly and edit many times: for me, it works better to rewrite one section quickly many times than to think about it for hours, carefully and slowly writing down a masterpiece by weighing each word. I’d rather go quickly many times than carefully once. That’s just how it works for me: find what works for you.
  • Get feedback: try to gather as much feedback from your colleagues as you can. This can mean giving them a draft of the article to read, if they’re very nice, but it can also simply be discussing the structure of the article, or the “punchiness” of an argument, explanation, etc.


Good luck with writing! In the end, it is just another skill that you need to practice to get good at. I hope that this can help you get there faster.