Do as I cite, not as I do
This was initially intended as a set of rules that I expect my students to adhere to when I'm grading their written work. It has evolved into a bit more of a meandering rant. I still expect my students to abide by the rules set out below, but I hope this document is more than that.
1. TL;DR:
- Academic citations are a thing, with (mostly unspoken) rules.
- Citations will help you write the truth in an easily accessible manner.
- Use links.
- Provide enough context for the reader to immediately understand how the citation supports your assertion.
- Have fun.
2. On hypocrisy
Anybody who's read my work (Hi mom![1: Just kidding, even she does not read my work.]) will discover that I don't always fully abide by the rules I set below.
But I try to, and it almost always makes my prose more useful and easier to read.
3. What is a citation
A citation is text about text.
Writing was invented, apparently, in the fourth millennium BCE[2: Depending, of course, on how you define writing, this date can shift quite a bit, and by quite a bit I mean a few tens of millennia. The earliest date I could find was in (Bacon, Bennett, Azadeh Khatiri, James Palmer, Tony Freeth, Paul Pettitt, and Robert Kentridge. 2023. “An Upper Palaeolithic Proto-writing System and Phenological Calendar.” Cambridge Archaeological Journal 33 (3): 371–89. https://doi.org/10.1017/S0959774322000415.). Its last section is a good summary of what it means to write according to the current scholarship, and points to (Schmandt-Besserat 2010) as a good general-public introduction to the history of writing in Mesopotamia.], and at most two millennia after that writers wrote about what other writers wrote[3: The earliest mention of written text in a written text I was able to find is in a Sumerian epic called Enmerkar and the Lord of Aratta (search for "clay" in the page), but bear in mind that I am not a scholar in the field; there may be earlier known written references to other writings.]. And this self-referential property of text has now evolved into a whole thing, where:
- some literary works are only known because somebody else wrote about them, such as Sappho's poetry, a whole Shakespeare play (maybe), or much of Rome's early history (example);
- some works are known only by a twice removed relationship, where the original author (e.g. Sanchuniathon) is quoted (here, translated) by an intermediate author (e.g. Philo of Byblos), himself quoted by a third author (e.g. Eusebius) whose works are the only of the three we have extensive access to;
- how many times your text is cited, and by whom, and where, will define how much money you are paid and how much power you have, see section 16;
- people (e.g. Price (1965)) write about the citation process itself, and their text is itself cited,
- which means that their work becomes part of its own subject,
- with as many orders of citations about citations about citations as you can imagine;
- schools exist whose major purpose is to teach people how to read, write about, store, display, lend, buy, give, study, and cite text,
- the addressability[4: On that and the following point, see the Address chapter of (Bratton 2015), particularly section 44 Deep Address, as well as the blog post which seems to have coined the term.] of text is so massive that text can be written about any subset, superset, statistical property, etc. of another text,
- in our computerized world, text now also means any metadata about the text, such as who read it, and when, and why, and where, and on which device, and what can we sell them that they don't need. Maybe, someday, after we hang the last digital marketing executive with the bowels of the last venture capitalist (Žižek 2008), this metadata (and the metadata about the metadata about the metadata, etc.) will be used for something good.
Note that this self-referential property is not specific to text. You can also, for example, talk (or sing) about what Alice said about what Bob told her, but as scripta manent, speech does not tend to reach as high an order of self-referentiality as text does, except maybe in very gossipy circles.
Other media share this self referential property, and I like for example to watch Every frame a painting or, for the francophones Karim Debbache, speak among other things about how movies reference older movies, or books, or video games.
And in some sense what I've written here is not limited to text but can be applied with slight modifications to any medium. It's just that text has got a head start of a few millennia.
4. Why cite
You should cite to make the life of your reader easier than it would be without the citations.
Of course, this injunction makes the most sense in the initial context of this essay, where you, the reader of this article, are a student about to write something for me, who will become your reader, and whose life being easy has a very strong correlation with you getting a good grade (Edouard Klein, personal communication[5: See what I did there?]).
I got a lot of my current knowledge from reading[6: I'd say most, but then I did not learn how to walk or bike by reading a book. It's hard to quantify.]. And now I write to share this knowledge. Not all writing needs to be about knowledge sharing. Maybe you are writing to reflect, process, influence, alert, entertain, play with language, provide a backdrop to your philological studies, fund your extravagant lifestyle[7: Good luck with that.], or settle the score with the people who did you dirty.
More generally, and less injunctionally, I would say that if you are writing non-fiction and want your readers to get the most out of what you wrote, then proper citations are, in my experience, a great way to:
- avoid falsehoods like half-truths, lieux communs, clichés, imprecise formulations, vague statements, weasel words, mistakes, straight up lies, made up[8: "Use the Force, Harry" – Gandalf] or misattributed[9: "Don't believe everything you read on the Internet" – Abraham Lincoln] quotes, the use of etc. to make the reader believe you have more to say when in fact you ran out of examples half a sentence ago, etc., see section 7;
- point your reader to the contextualized, best-known entry of a rabbit hole too deep to examine in a given piece, but that you enjoyed going down into and suspect your reader would enjoy too, see section 11;
- remove unnecessary information and keep your writing short and to the point. This is a side effect of properly citing being so demanding: when each assertion requires even just a few minutes of research to properly source, you learn to keep your sentences short.
- Give credit where it is due, correct historical mistakes, shed light on an author you like, see section 15.
- Have a laugh, see section 18.
- Let the reader identify your novel contributions, as they should be the only parts with no citations.
I have also found that not only knowing, but knowing how "we"[10: As a civilization, or at least as a group of people sharing the same cultural references, that is to say the same citation space or reference pool or cultural canon or shared cultural capital.] know can give you new ideas for your own research, whether it is applying old techniques to new problems or new techniques to old problems. Finding the proper citation often leads to learning how we came to know that what you assert to be true is true.
Finally, if you can't find any good references to cite, this means you now can write something worthwhile yourself, detailing the exact point you hoped to find a citation for. This is, by the way, the exact way I recently got the idea for one of my articles (Klein 2024).
5. How to cite
As a reader, I like it when citations:
- link to the cited work,
- use consistent and expected[11: No matter how good of a web-designer you think you are, keep your links blue and underlined, and your visited links purple and underlined.] typographic signals identifying them as citations,
- allow me to know what you cited even if the link is broken or if the document is read on a non-interactive medium (e.g. paper),
- provide enough context for me, the reader, to understand, immediately and with no effort, the relationship between your assertion and the cited work.
Rule 4 is the actually important rule. The first three are quality-of-life improvements, but rule 4 is what most people get wrong, and what provides the most insight to the reader.
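The four rules can be made concrete with a small sketch. What follows is only an illustration in Python, under assumptions of my own: the `Citation` class, its field names, and the Markdown-like output are all hypothetical, and the reference used as an example is the one cited in section 18.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """A hypothetical citation record; every field name is illustrative."""

    label: str           # short author-year form shown in the running text
    url: str             # rule 1: where the reader can get the work
    full_reference: str  # rule 3: still identifies the work if the link dies
    context: str         # rule 4: how the work supports the assertion

    def inline(self) -> str:
        """Rules 1 and 2: a link, always in the same author-year form."""
        return f"[{self.label}]({self.url})"

    def note(self) -> str:
        """Rules 3 and 4: the context, then the full reference with its URL."""
        return f"{self.context} ({self.full_reference} {self.url})"


heard = Citation(
    label="Heard, Cull, and White (2022)",
    url="https://doi.org/10.1101/2022.03.18.484880",
    full_reference=(
        "Heard, Stephen B., Chloe A. Cull, and Easton R. White. 2022. "
        '"If This Title Is Funny, Will You Cite Me? Citation Impacts of '
        "Humour and Other Features of Article Titles in Ecology and "
        'Evolution." bioRxiv.'
    ),
    context="Funny titles may earn more citations:",
)

print(heard.inline())
print(heard.note())
```

The `context` field is the part no tool can fill in for you: it is the sentence you have to write yourself after reading the work.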
6. Working through an example
Consider the following sentence, slightly paraphrased from one of my students:
If you are not a computer scientist or a system administrator, this may sound like gibberish. Here is what you need to know to follow this example:
- There is a standards body called the Internet Engineering Task Force (IETF), that publishes documents called RFCs which are like the rules of the road (i.e. standards), but for the Internet.
- There is an RFC that defines a protocol called TLS (this was the topic of the assignment).
- OpenSSL is an implementation (among many) of said TLS standard.
- Almost all Linux distributions offer OpenSSL as the default TLS implementation.
In a student's work, the quoted sentence is worrying because its interpretation hinges on the sense given to the word "standard".
- If the student meant "agreed-upon set of rules to ensure compatibility, interoperability, and consistency, usually shepherded by a designated institution", then the student missed that OpenSSL is merely an implementation of the standard, and not the standard itself. The standard is shepherded by the IETF, which provides no implementation.
- If the student meant "default", then they missed the existence of the standard altogether.
In a didactic work, such an imprecise use of technical terms from a position of authority will just confuse the reader. This kind of sloppy work is not acceptable and technical writers should strive to do better.
7. Citing for truth
Getting into the habit of providing sources along with their assertions will let students avoid writing such a problematic sentence.
One obvious place in which to look for a supporting statement for the assertion that "OpenSSL is the cryptographic standard on Linux" is OpenSSL's homepage.
Lazy students will just add a reference to https://openssl.org with the statement and be done with it. This is not how I expect, as a reader, citations to work.
A student abiding by rule 4 above will look for a supporting statement to quote or link to. They will not find it, but in the process of looking for it, they will read OpenSSL's README, the RFC, some articles, or even ask an LLM, and come to understand their mistake and correct their awkward sentence to something like:
8. Sometimes a simple link is enough
This corrected sentence can be augmented to:
The link here works because:
- The URL[12: The part of the link that tells your computer where to find the content it points to.] is stable:
- the content being pointed to is unlikely to disappear in the foreseeable future;
- the URL is likely to point to the content the author intends to link to, from anywhere, for anybody, anytime.[13: RFC 3986 clearly threw in the towel on this, to my great chagrin: "This specification does not require that a URI persists in identifying the same resource over time, though that is a common goal of all URI schemes."]
- A user examining the link (typically by hovering their mouse pointer over it before clicking) will know which document the author is linking to[14: This will also prove useful when (when, not if: think many hundreds of years ahead, or maybe just tens of years ahead) the content eventually disappears from its location.], which in technical terms means that the URL is also a URN, see RFC 1737.
- The link text ("TLS standard") makes it immediately obvious how the linked document supports the point being made (Rule 4 above): the TLS standard is defined in RFC 5246.
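One way to get links with these properties is to build them from a persistent identifier instead of copying whatever address the browser happens to show. Here is a minimal Python sketch of that idea. It assumes the `https://doi.org/` resolver prefix and the RFC Editor's `https://www.rfc-editor.org/rfc/rfcNNNN` address pattern; the function names are made up for this example, and the DOI check is deliberately crude.

```python
def doi_url(doi: str) -> str:
    """Return a resolver link for a DOI such as '10.1126/science.1212540'."""
    doi = doi.strip()
    # Crude sanity check (an assumption of this sketch): DOIs start with
    # "10." and have a slash between the registrant prefix and the suffix.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    return f"https://doi.org/{doi}"


def rfc_url(number: int) -> str:
    """Return the RFC Editor address for an RFC number such as 5246."""
    if number <= 0:
        raise ValueError(f"not an RFC number: {number!r}")
    return f"https://www.rfc-editor.org/rfc/rfc{number}"


print(doi_url("10.1017/S0959774322000415"))
print(rfc_url(5246))
```

Both addresses carry the identifier in plain sight, so a reader hovering over the link, or holding a printout, can still tell which document was meant.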
9. A full citation will outlive a link target
When the link target does not bear, to the reader, enough information to identify the document without the reader having to visit the link, then a link is not enough and the author should use a citation.
Instead of the usual practice of redirecting the reader to the "References" section, my advice is to put the reference near the citation, like so:
which does not prevent the item from being referenced again in the "References" section.
10. Ease your reader into your references by narrowing it down
Knowledge does not come in self-contained, easy-to-digest, small bites.
Citing or linking to e.g. a 700-page book when you are only trying to make a small point that the book author mentions in passing is, at best, ineffective.
To somebody grading your work, it will be seen as lazy, and make the grader question whether you actually read and understood what you cite.
To somebody trying to learn from what you wrote, it makes life harder and increases the chance the reference will not be read at all.
What I advise instead is to quickly contextualize and sum up the linked/cited work and its authors, and paraphrase or quote the relevant passages.
Thus:
If, in the cited article, a specific short passage will spell out the point you are trying to make, I think it is perfectly OK to quote it (unambiguously identifying the excerpt as a quote, and providing the source, otherwise it is plagiarism):
11. The quest for the best introductory resources
Sometimes you make a point only in passing, based on your experience or deep knowledge of the subject, but can't stray too far into an explanation. I find it best in these cases to avoid paraphrasing or quoting an authoritative source strengthening your point. Instead I go look for the best introductory material I can find, and direct the reader to the entrance of the rabbit hole.
I am very often dismayed by the lack of any proper introductory materials for things I consider common knowledge in my field. So much so that whenever I come across a good introduction to anything, I save the link in a big file of "good resources for beginners" I maintain.
It has been said that, in the US alone, 10 000 people a day learn something that "everyone knows"[17: And this is true for any given fact, so you can multiply this figure by the number of facts that "everyone knows".]. This figure can be extrapolated to 400 000 people a day worldwide. Let's be nice to today's lucky 400 000 and provide links to good introductory material.
12. Links, links everywhere
Hypertext is, as of this writing, nearly a century old as an idea and more than thirty years old as a widespread, mostly working implementation.
Use it:
- citations should be links from the citation point to where the full reference is,
- there should be a link to go back up to the citation point[18: I have not yet found a way to implement that in my publishing workflow.],
- Wikipedia-style page previews are cool[19: But they rely on JavaScript and will therefore stay unimplemented here, for now.],
- but most importantly, every reference should contain a clickable link to the cited work.
This last part is easy when the reference is a web page, less so when it is a book or a scientific article.
Fortunately, archivists around the world are stepping up so that anybody can access information:
- Library Genesis (LibGen)
- LibGen was launched around 2008 in Russia and is run by an anonymous collective. It is a file-sharing database that offers access to millions of academic papers, books, and scientific articles, often behind paywalls elsewhere. See here for a guide on how to access and use it. You can help the project by lending disk space.
- SciHub
- Founded in 2011 by Alexandra Elbakyan, SciHub was created to provide unrestricted access to academic papers and research articles, especially in countries where journal access is cost-prohibitive. It uses LibGen. Enter a DOI, URL, or search term, and SciHub retrieves the article for free. See how to access it.
- Internet Archive
- Founded in 1996 by Brewster Kahle, the Internet Archive was established as a non-profit organization based in San Francisco with the goal of providing universal access to all knowledge. One of its most used features is the Wayback Machine, which preserves old versions of websites.
Using any of those sites, it should be easy enough to provide a link for every work you cite.
As far as copyright is concerned, linking to copyrighted work is not illegal (yet). And, anyway, fuck copyright when it gets in the way of knowledge.
13. Open the gates, let people in
Despite its many flaws, Reddit is a good source of folksonomies. One I particularly like is the difference between:
- Gatekeeping: keeping people out and being a snob,
- Gates Open, Come on in !: the opposite, easing people into whatever it is you are into.
One particularly subtle and often unconscious form of gatekeeping is to make a literary reference without any marker that it is a reference. In sociological terms, this is what Bourdieu (1979)[20: I have read and heard about this book, but I have yet to read it: (Bourdieu, Pierre. 1979. La Distinction: Critique Sociale Du Jugement. Le Sens Commun. Paris: Éditions de Minuit. https://libgen.st/book/index.php?md5=99A0545D2DF1575735076CAB10B19804.). In the meantime I ordered a comic book inspired by it: (Rivière, Tiphaine. 2023. La distinction: librement inspiré du livre de Pierre Bourdieu. La découverte. Paris: La Découverte.). The best entrypoint I have found into these notions is the following book which to my knowledge only exists in French: (Sapiro, Gisèle, François Denord, Julien Duval, Mathieu Hauchecorne, Johan Heilbron, Franck Poupeau, and Hélène Seiler, eds. 2020. Dictionnaire International Bourdieu. Collection “Culture & Société”. Paris: CNRS éditions. https://libgen.st/book/index.php?md5=224D5038EFDFBC83DDD9607541DA7499.)] would call a display of social capital, and a form of symbolic violence.
Restricting access to an erudite reference to those who already know about it helps preserve social stratification, and is the opposite of what a technical writer should, in my opinion, strive to do.
Our (or at least my) cultural background is replete with stories of low-class protagonists rising above their condition, helped on their way by access to books, which, as non-judgmental holders of knowledge, deliver sapience and hope to the protagonist with no regard for their undeserving nature. There's Jane Eyre[21: I have read neither this one,], A Tree Grows in Brooklyn[22: nor this one.], Good Will Hunting and the education he got for himself for "a dollar fifty in late charges at the public library", or Roald Dahl's Matilda, and probably more that I will come back to add here once I think of them. It's when the "Bookworm" trope meets the Hero's journey.
Well, I like to think of technical writers as the unsung heroes of these stories, the unwritten character having written the book that the hero relies upon to escape their humble beginnings.
And you can't be that hero if you just private-joke-wink-wink your knowledge to your peers with no consideration for outsiders trying to come in.
This does not mean that you have to limit yourself to a small (and time- and location-dependent) reference pool (which critics of pop culture may say is dumbing us down[23: The paraphrase is mine. For an intelligent musing on this issue, see the start of chapter 2 of (Postman, Neil. 2006. Amusing Ourselves to Death: Public Discourse in the Age of Show Business. 20. anniversary ed. New York, NY: Penguin Books. https://libgen.st/book/index.php?md5=EEB3AF3EAA4FE2B697C2911D60D90963.). It provides an example of the importance of citations in the academic world, explaining that other media may simply lack such a capability.]). It means that when you reference something, be kind and link to an explanation. Outsiders, and future historians, will thank you.
In a previous draft, the title of the section about quoting was "to quote, or not to quote", a reference to Shakespeare's Hamlet's "To be, or not to be". This reference could be considered a tired cliché by anyone with a formal education in an English speaking country, or fly over the head of e.g. people for whom English is not the first language. So I left it on the road not taken.
14. It is OK to cite something you did not read
As long as you are upfront about it.
Even if you have not read something, linking to it is fine, but manage your reader's expectation. Second-hand knowledge is still knowledge.
A technique that was suggested to me during my PhD was to read how other people would cite some paper, and paraphrase them when citing said paper. No need to actually read the paper.
Now, I get alerts each time somebody cites one of my papers, and oh boy is that method widespread! I can see what most people think my papers say, and it is never exactly what I intended them to say. Nevertheless, what the community thinks of my work is as worthy of being documented as my work itself. So go forth and cite unread documents (but mark them as such).
15. What to cite is a political choice
Nobody can afford to write a full state-of-the-art summary for every assertion. One usually looks for either an authoritative or an introductory source.
Therefore by choosing a source and thus deeming it authoritative or well written, you are making an assertion about its relative worth to whatever it is you did not cite.
For example, if you are talking about the discovery of the structure of DNA, the canonical reference would be Watson and Crick (1953), but I'd rather link to Rosalind Franklin.
There is a quantifiable systemic bias in citations (for an example, see (Chatterjee and Werner 2021)), which you can help correct by, when the option arises, choosing to cite the paper written by a member of an underrepresented group.
Also, apart from the pathological case of the struggle for academic power, citing is free and does not devalue your work, so give credit where credit is due, and point your readers to where the idea originated. This point is humorously made by Simon Peyton Jones in his slides How to write a great research paper, when he says "credit is not like money".
16. Academic citing is broken
Ever since I saw Seth Godin's talk This is broken, this little sentence has lived rent-free in my head. I often think "this is broken" when faced with easily fixable but unfixed annoyances.
Academic citations are one of the most broken dispositifs I have ever had the displeasure to interact with and be subjected to.
This brokenness is, quite amusingly, the topic of a body of academic work far too large to cite here. Good entrypoints seem to be Bourdieu (1984)[24: Bourdieu, Pierre. 1984. Homo Academicus. Collection “Le Sens Commun”. Paris: Editions de Minuit. http://libgen.st/book/index.php?md5=C98523879170884069C5151F4116223F.] (which I have yet to read), the "Champ académique" entry of Sapiro et al. (2020)[25: Sapiro, Gisèle, François Denord, Julien Duval, Mathieu Hauchecorne, Johan Heilbron, Franck Poupeau, and Hélène Seiler, eds. 2020. Dictionnaire International Bourdieu. Collection “Culture & Société”. Paris: CNRS éditions. https://libgen.st/book/index.php?md5=224D5038EFDFBC83DDD9607541DA7499.], and a search for "bibliometry" on Google Scholar. The root of the argument is that citing is one of the weapons in a struggle for power, and that any bibliometric measure falls prey to Goodhart's law[26: Any measure used as a target ceases to be a good measure.], especially because scientists tend to be rational agents with a knack for optimizing quantitative metrics. Add for-profit publishing companies and a distrust of science from the political class, and you have the current disaster. It is quite a feat that Science carries on despite the way it is organized.
From my point of view, having authored a few papers, I was constrained in my use of citations by multiple factors:
- Space constraints prevented me from developing the prerequisites of my argument in a way that would be respectful to the reader.
- Stupid, publisher-imposed styles would mandate the use of numbers (e.g. [1]) as citations, and prevent the use of links.
- Reviewers would not-so-subtly drop hints that a specific, very good, and obviously highly relevant paper was not cited, and that the next submitted revision ought to cite it. I am far from the only one to have had this experience[27: Wilhite, Allen W., and Eric A. Fong. 2012. “Coercive Citation in Academic Publishing.” Science 335 (6068): 542–43. http://libgen.st/scimag/10.1126%2Fscience.1212540.].
- The need to make our specific contribution appear as big as possible would discourage expending too much energy explaining previous contributions (by others or ourselves), and also provide no incentive to look into adjacent literature to discover whether similar ideas had taken root there.
- Literally all of my papers from this period began with the same tired exposition of the problem setting and mathematical notation, but with slight variations because writing the same thing twice is "self-plagiarism"[28: I hated that oxymoron: if I write something, I should be free to use it however, and as many times as, I see fit.]. Nobody ever read that part, as everybody knew the setting and used the same notations: our scientific community consisted of maybe a hundred people, tops, at the time. What I ought to have been able to do was:
- write the introduction to the setting once, and as well as possible, in a single piece,
- link, cite, or quote it as relevant in other papers.
Now that I have gained my independence from the usual research funding sources, I'm free to do as I please. Maybe the advice I'm giving in the rest of the article about citations is actually terrible career advice if you are a scientist.
17. Tools
My setup is tailored to my needs and hard to replicate, but I will give some generic advice about the tools you can use to insert proper citations in your written works.
Use BibTeX as a format. Either maintain the text file yourself, or use dedicated software such as Zotero with its Better BibTeX extension.
Do not insert citations by copy-pasting, but instead use a way to reference the entries in your BibTeX file. I use org-mode and its new (as of this writing) org-cite syntax.
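One benefit of referencing entries by key is that a script can check that every key you cite actually exists. The sketch below is a hypothetical Python illustration, not a description of org-mode's internals: it only recognizes the simple `[cite:@key]` and `[cite/style:@key1;@key2]` forms, it assumes keys are made of letters, digits, `_` and `-`, and it assumes the bibliography has already been loaded into a dictionary by some other means. The keys in the example are made up.

```python
import re

# Simplified patterns (assumptions of this sketch, not the full org-cite grammar).
CITE_BLOCK = re.compile(r"\[cite(?:/[^:\]]*)?:([^\]]*)\]")
CITE_KEY = re.compile(r"@([A-Za-z0-9_-]+)")


def cited_keys(text: str) -> list[str]:
    """Return every citation key found in the text, in order of appearance."""
    keys: list[str] = []
    for block in CITE_BLOCK.finditer(text):
        keys.extend(CITE_KEY.findall(block.group(1)))
    return keys


def missing_keys(text: str, bibliography: dict[str, dict[str, str]]) -> list[str]:
    """Return the cited keys that have no entry in the bibliography, sorted."""
    return sorted({key for key in cited_keys(text) if key not in bibliography})


# Hypothetical draft and bibliography, for illustration only.
draft = (
    "Funny titles may help [cite:@heard2022]; "
    "reviewers sometimes push citations [cite/t:@wilhite2012;@nobody1999]."
)
bibliography = {
    "heard2022": {"url": "https://doi.org/10.1101/2022.03.18.484880"},
    "wilhite2012": {"title": "Coercive Citation in Academic Publishing"},
}

print(cited_keys(draft))
print(missing_keys(draft, bibliography))
```

A check like this catches the typo in a key before your reader finds a dangling citation.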
18. Humor
Citation can be a source of humor. My two favorite mechanisms are:
- self-reference, when you cite yourself, or even your own article, in your article[29: Yo dawg, …],
- breaking rule 4 above, letting the reader work out the deliberately fuzzy connection between the context and the reference.
- A beautiful example of that can be seen in the footnote on the front page of Pudwell (2007)[30: Pudwell, Lara. 2007. “Digit Reversal without Apology.” Mathematics Magazine 80 (2): 129–32. https://www.jstor.org/stable/27643011.].
- A subcategory is making a paper from a completely unrelated field somehow relevant in a technical discussion, such as citing a sociology or economics paper when discussing Privilege in Computer Science.
Relatedly, having a funny title can earn you more citations, apparently: Heard, Cull, and White (2022)[31: Heard, Stephen B., Chloe A. Cull, and Easton R. White. 2022. “If This Title Is Funny, Will You Cite Me? Citation Impacts of Humour and Other Features of Article Titles in Ecology and Evolution.” bioRxiv. https://doi.org/10.1101/2022.03.18.484880.].
19. Changelog
- Put the TL;DR: at the top (thanks Yaroslav :)
- Initial version