Well, I’m buried up to my head in work trying to finish up a paper. I have been doing this for a while. A project that should have ended with a 20 page paper ballooned on me, and now I am writing a paper on the edge of 50 pages.
Which brings me to the title of the post. Fifty pages is a long paper, and writing one brings an additional problem: by the time I’m writing the end, I’ve somewhat forgotten what I’ve written before. More specifically, not so much the content, but the way in which it was written. And even more specifically, what precise notation was used.
For example, just to give you a feel, was it or
, or was it
? Or was it
, rather than
.
I’m writing stream of consciousness, so it can be fixed later when I’m combing through the paper. Moreover, the longer the paper, the more symbols and fonts one needs, and they can start overlapping. This means that notation degenerates as I’m writing a paper. It stays consistent only over what I can write in one day, which is about five pages. So that is what I will call the correlation length of notation: the number of pages that one writes before the notation mutates and starts getting disordered.
So, on a 50 page paper, there are ten correlation lengths of notation, so the end looks nothing like the beginning in terms of notation. This gets worse with more authors. And don’t even get me started on writing books. To fix this, one has to ‘cool down the system’ so that it becomes ordered (I’m making an analogy with ferromagnetism here). This requires time and many passes. So a paper of length seems to take of order of
time to write it down. Maybe that critical exponent is different.
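As a rough illustration of the analogy, suppose (purely as an assumption, not something measured) that the agreement between the notation on two pages decays exponentially with their separation r, the way a correlation function does away from a critical point. With a correlation length of about five pages, the two ends of a 50-page paper would then be essentially uncorrelated:

```latex
% Illustrative sketch only: the exponential form is an assumption,
% and \xi = 5 pages is the rough estimate quoted in the post.
% Requires amsmath for \text.
\[
  C(r) \sim e^{-r/\xi}, \qquad \xi \approx 5 \text{ pages}
  \quad\Longrightarrow\quad
  C(50) \sim e^{-10} \approx 4.5 \times 10^{-5}.
\]
```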
There is always a plan B: give a guide to notation changes so that the work is piled on the reader. What do you think of this strategy? This seems to be the way for books, because there are many conventions that overlap: they come from different developments by different authors. Fortunately our brains seem to be able to read contextually, and E can be energy and electric field, and e can be the electric charge and the Euler constant, all of them in the same equation, when it becomes obvious how to interpret it. Isn’t it?

If you are writing in LaTeX, another strategy is to use a whole lot of \newcommands to set up the notation, and to make sure that you only refer to things you have already defined rather than writing in raw TeX. For example, this can solve the fonts issue. On the other hand, it doesn’t help with A,B,C vs. X,Y,Z.
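A minimal sketch of what this might look like. The macro names and the fonts assigned to them are hypothetical, chosen only to show the idea that each object gets one macro and the font is decided in exactly one place:

```latex
\documentclass{article}
\usepackage{amsmath,amssymb}

% Hypothetical notation macros: the names and symbol choices are
% illustrative, not taken from any particular paper.
\newcommand{\gaugefield}{\mathcal{A}} % calligraphic A reserved for one object
\newcommand{\matfield}{\mathbf{A}}    % bold A reserved for a different object
\newcommand{\Efield}{\vec{E}}         % electric field
\newcommand{\energy}{E}               % energy

\begin{document}
The field $\Efield$ carries energy $\energy$, while $\gaugefield$ and
$\matfield$ stay distinct everywhere in the text.
\end{document}
```

If a conflict shows up later, changing the font or letter means editing one \newcommand line in the preamble instead of hunting through fifty pages.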
I had a professor who, when we would encounter these overlapping notations in lectures (such as i being the imaginary unit and an index), used to say, “We’re all grown-ups here. We can handle equations like this.”
This professor was Claude Bernard, fwiw.
Dear David, why don’t you give this task to a bored student or a secretary?
Quite generally, I do think that there should exist people in the industry who do help with such things. It’s just bad if the high-IQ man-hours like yours are being wasted in this way even though the LaTeX notation tasks you’re trying to solve are not that infinitely hi-tech – and I hope that you realize that they’re not.
The failure of the top physicists to have people doing such things is a result of egalitarianism – all the manifest crap that everyone in science is the same “Einstein” (produced, not surprisingly, primarily by the people who are no Einsteins at all) – a disgusting artifact of all the left-wing lies penetrating the Academia.
Don’t get me wrong, I have almost never used anyone in this mechanical way – but it was mostly due to the atmosphere which results in many people being bored even though there could be a lot of useful work that could be done at every point but it’s often viewed as impolite to expect someone to do such work.
Best wishes, Lubos
Hi Lubos:
Unfortunately I’m using the same letters with different fonts because they need to be distinguished that way, so it can not be done in such a mechanical way as you suggest. Otherwise I would happily use such services. I’m also writing this with a graduate student who has to sort it out and make sure that they are all correctly notated. I had to change notation a couple of times because of small conflicts already, and because of that I’m sure some extra accidental typos were introduced.
OK, I tend to think that you underestimate grad students (and maybe even secretaries.).
At any rate, good to see attempts for long range correlation – a conformal limit. There are papers where the characteristic length of the coherence of the words – their very content – is just a few words, shorter than a sentence.
Yeah, Lubos doesn’t have a job, and given his postings, it’s clear that he’s got plenty of time on his hands.
I am not sure whether one had to be as specific or personal as you are – but indeed, that’s qualitatively what I was suggesting. Because I am not afraid of work, such a friendly request could even be accepted (and I’ve done a lot of similar work for others, virtually always for free) even though I couldn’t guarantee it and I am not offering David any help here.
It should be natural for people to be paid for helping with such things – and it should be natural for grad students and/or other “junior” occupations to do actual useful work (even though not quite original) as a part of their getting mature.
The job composition of the scientific community should be much more diverse and hierarchical and there should be many people in the process who are not labeled “full-fledged scientists”. It’s just bad that everyone must be considered a “full-fledged original scientist” even though (s)he is not that great in this respect – and there’s no one left to do many things that require more mechanical work.
It’s good to have enough time and freedom, but you have to be kind of relatively materially secured to afford it.
First we decide we like mathematics better than words because ideas become more compact. Then we discover there are not enough letters in our alphabet so we need to use each letter for multiple purposes.
This could be a stupid suggestion but I try to follow something similar and it sort of works for me:
You could keep an auxiliary text file open where you could make a note of the alphabet (along with the font used) you use to denote a particular physical/mathematical quantity. In fact, if you try to stick to these conventions for your subsequent publications (as much as possible of course), you might eventually get used to your conventions over time.
This is irritating of course, and happens to everyone in the few days before you hit the submit button.
When in doubt, just find similar highly cited reference papers and stick with their notation, as it improves the readability for your readers. That’s not always possible for pure ground-paving theory work, but c’est la vie.
I can’t tell you how many times I might be reading a paper and mistake a letter for something else and end up confusing myself for 15 minutes.
This is where my woes really started: the various ‘highly cited papers’ that are needed for this had conflicting notations, and we were borrowing from them freely.
Or you could just shorten the paper. No offence, but I’m not likely to read a 50-page paper. In fact I’m not going to get past the abstract unless it claims that you have computed the mass of the electron from first principles.
Bottom line: the probability that your paper will be read declines exponentially with its length.
Hi Papa:
I don’t expect everyone to read this paper in full detail. This is why we write an introduction with a guide of the results, and a conclusion. You should only read the middle sections that are important for your work if you need to know precisely how the results were gotten at.
The reason why it is so long is that we develop an example in complete detail. So I expect that some students that are working on the area might actually find it useful for a paper to show a technique from start to finish without skipping steps.
“There is always a plan B: give a guide to notation changes so that the work is piled on the reader.”
Unifying your own notation is time consuming and a pain, but I would argue it is the best approach. There are two primary reasons:
1) Whatever you are writing, it is safe to assume that it is sufficiently advanced that there will be a non-negligible percent of readers who simply will not understand your core message(s). Translating between several sets of notation is a non-trivial task, particularly for someone who is on the edge of maybe/maybe-not understanding your writing.
–The Point: leaving the work of translating notation(s) for your readers will noticeably reduce the number of people who actually understand what you are saying.
2) Efficiency: unifying one’s notation for a long paper can be time consuming — perhaps 10-20 hours (?). For the reader, this is less so — say 2 hours (for a 50 page paper). If 1000 people take the time to read your paper in detail (I chose this number arbitrarily), this yields 2000 work-hours to translate your work.
–The Point: the economies of scale make an hour of an author’s time save hundreds for the readers.
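Spelled out with the figures above (all of which are the rough guesses given in the comment, not data), the arithmetic is:

```latex
% Worked arithmetic using only the guessed figures from the comment:
% 1000 readers, 2 hours each, versus 10--20 hours for the author.
% Requires amsmath for \text.
\[
  \underbrace{1000 \times 2\ \text{h}}_{\text{readers}} = 2000\ \text{h}
  \qquad \text{vs.} \qquad
  \underbrace{10\text{--}20\ \text{h}}_{\text{author}},
  \qquad
  \frac{2000\ \text{h}}{10\text{--}20\ \text{h}} \approx 100\text{--}200 .
\]
```

So under these assumptions each hour the author spends unifying notation saves roughly 100 to 200 reader-hours.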
1) Have a list of symbols and their meanings. Only choose, copy, paste from that list.
2) You should write a paper… Is the perceived problem covariant (independent of the writer’s language)?
Pro tip: You can put notations in an index using something like
\index{$\mathbb{C}$}
But if you want to get sophisticated about it you can put the notations index in a second index using multiple indices as described here.
As a reader, I cannot tell you how useful it is for an author to put an index into a large document which references the definitions of their notation throughout.
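One way to set up such a second index, sketched here with the imakeidx package (an assumption on my part; the link above may describe a different package). The index name "notation" and its title are arbitrary choices:

```latex
\documentclass{article}
\usepackage{amssymb}
\usepackage{imakeidx}

% Two indices: a separate one for notation, plus the default subject index.
% The name "notation" and the title are arbitrary, illustrative choices.
\makeindex[name=notation, title={Index of Notation}]
\makeindex

\begin{document}
Let $\mathbb{C}$\index[notation]{C@$\mathbb{C}$} denote the complex
numbers\index{complex numbers}.

% "C@$\mathbb{C}$" sorts the entry under the letter C but prints the symbol.
\printindex[notation]
\printindex
\end{document}
```

The sort key before the @ matters for symbols, since otherwise the entry is alphabetized by its raw TeX source.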
I think that the entropy of notation would be interesting as well
I haven’t thought about that one. Can you be more precise? Is it extensive on a paper?
Yes I think it is extensive. I would think that if your paper is short, there are fewer opportunities to use notation and less opportunity to have highly evolved concepts, so any symbol used in describing mathematical concepts is more likely to be one of the symbols used in the most common math notation.
As the paper grows in size, probability becomes more evenly distributed across the range of math notation, (because of exhaustion of the most common symbols), so entropy necessarily grows.
This brings up an interesting point, that any probability distribution that isn’t uniform is a sign of a low entropy system.
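To make this a bit more precise, here is the standard Shannon entropy, under the assumption (mine, for illustration) that p_i is the fraction of symbol occurrences in the paper that use symbol i, out of N available symbols:

```latex
% Assumed setup: p_i is the fraction of symbol occurrences that use
% symbol i, out of N available symbols, with \sum_i p_i = 1.
\[
  H = -\sum_{i=1}^{N} p_i \ln p_i ,
  \qquad
  0 \le H \le \ln N ,
\]
% The upper bound H = \ln N is reached only for the uniform
% distribution p_i = 1/N; any non-uniform distribution has lower entropy.
```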
I am currently in the same position as you, David. I am in the process of writing up a paper that will end up around 40 pages. I am the sole author, and only a PhD student at that, so you can imagine the turmoil I am going through at the moment. The only thing I’ve found really useful is to always have a recent printout lying next to me while I write. If not, then the notation goes immediately.
Good luck with your paper!
Hi Per:
I’ve found that venting helps to get through with this process. Good luck on your paper.
Thanks.
Hard work and Red Bull got me through the last phase of the project. Now it’s done!
Hi Per:
Congratulations on getting a paper out. It always feels good to be able to let go. Now go and celebrate!