Modern reference management using BibLaTeX

BibTeX has become a universal reference management format in the mathematical sciences. This format is used on arXiv and in virtually all journals and conferences which publish papers. As a result, a significant amount of time is spent managing references within a manuscript. This is not something I like to spend my time on, so in this post, I’ll explore some ways of making the process more efficient.

I believe that one should work with modern tools when possible. BibLaTeX is a modern BibTeX replacement (see note 1) which is, in my experience, substantially easier to configure and customize as needed. In this post, I’ll illustrate a commonly encountered BibTeX issue – lower-case proper nouns in paper titles – and show how to avoid it in BibLaTeX. Along the way, I’ll illustrate how to override journal style files to avoid auto-loading BibTeX-based packages such as Natbib, and how to get BibLaTeX to play well with older TeX versions on arXiv.

Capitalization difficulties with BibTeX

Most journals and conferences require standardized citation formats, such as IEEE format. They generally provide .bst files which implement the required style. Unfortunately, virtually every .bst file displays the citation

@book{lifshits12,
	Author = {Lifshits, Mikhail},
	Publisher = {Springer},
	Title = {Lectures on Gaussian processes},
	Year = {2012}
}

as

[1] M. Lifshits. Lectures on gaussian processes. Springer, 2012.

which changes all correctly-capitalized proper nouns in the .bib file to lower-case. The root cause of the issue is the function

FUNCTION {format.title}
{ title empty$
    { "" }
    { title "t" change.case$ }
  if$
}

in the .bst file, which one can modify to disable the behavior – but it is sufficiently non-obvious to journal editors how to do this that no one makes this change in practice. This is because .bst files are written in a highly outdated and esoteric stack-based programming language that few people can read. The default ICML style file, for instance, has this issue. Moreover, it is not possible to disable title capitalization changes by changing a BibTeX or Natbib package option in the LaTeX document preamble. Provided one wants to continue using BibTeX, one is left with two options to fix this issue.

  1. Edit the source code of the .bst file, thereby no longer using the .bst style provided by the journal.
  2. Edit the .bib file manually to replace Lectures on Gaussian processes with Lectures on {G}aussian processes.

It would shock many outside the field to learn how many people in the mathematical sciences choose the latter option, and spend their time fixing .bib files by hand, one entry at a time. This is an utter waste of time that no one should have to put up with. One can automate the latter process, but this is unsatisfying, because it means one can no longer search for the word Gaussian inside the .bib file and must instead search for {G}aussian. It can also cause issues with LaTeX correctly breaking lines in the reference section. Why not transition to modern tools that can produce the same output?

Using BibLaTeX for reference management

BibLaTeX is a modern BibTeX replacement. Much like Natbib, it offers the ability to specify standard and in-line citations via commands such as \cite, \textcite, \parencite, and similar (see note 2). Unlike Natbib or any general BibTeX-based reference package, it offers the ability to change citation format programmatically from within LaTeX. For example, I can disable the title capitalization changes that legacy-mimicking styles apply by default by using the following TeX command.

\DeclareFieldFormat{titlecase}{#1}
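
To show where this command sits in a document, here is a minimal sketch of a complete file. The file name references.bib, the numeric style, and the BibTeX backend are illustrative choices rather than requirements. Note that in the standard styles the titlecase format already leaves titles untouched, so the declaration only changes the output for styles that redefine it to alter case.

```latex
% Write a small .bib file so the example is self-contained.
\begin{filecontents}{references.bib}
@book{lifshits12,
  author    = {Lifshits, Mikhail},
  publisher = {Springer},
  title     = {Lectures on Gaussian processes},
  year      = {2012}
}
\end{filecontents}

\documentclass{article}

% backend=bibtex and style=numeric are illustrative choices.
\usepackage[backend=bibtex,style=numeric]{biblatex}
\addbibresource{references.bib}

% Print titles exactly as they are written in the .bib file.
\DeclareFieldFormat{titlecase}{#1}

\begin{document}
See \textcite{lifshits12}.
\printbibliography
\end{document}
```

With the BibTeX backend, this should compile with the usual sequence of LaTeX, then BibTeX, then LaTeX twice more.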

BibLaTeX also offers backreferences, which list pages on which a paper was cited. These look as follows.

[1] M. Lifshits. Lectures on Gaussian processes. Springer, 2012 (cited on page 1).

I find this feature to be useful, because the link to page 1 is clickable – this helps me easily go back and forth between the main document and references while editing the document. However, I also find the default format a bit unsightly, so I change it to the following.

[1] M. Lifshits. Lectures on Gaussian processes. Springer, 2012. Cited on page 1.

This is easily achieved via the following snippet.

\usepackage{xpatch}
\xpatchbibmacro{pageref}{\printtext[parens]}{\addperiod\space\printtext}{}{}
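
For context, backreferences are not printed unless they are enabled when BibLaTeX is loaded. A preamble sketch along the following lines should produce the format above; the backend choice and the file name references.bib are assumptions made for illustration.

```latex
% backref=true turns on the "cited on page" notes.
\usepackage[backend=bibtex,backref=true]{biblatex}
\addbibresource{references.bib}

% hyperref makes the page numbers in those notes clickable.
\usepackage{hyperref}

% Replace the parenthesized note with a separate sentence.
\usepackage{xpatch}
\xpatchbibmacro{pageref}{\printtext[parens]}{\addperiod\space\printtext}{}{}
```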

In BibTeX, making changes like this would be much more time-consuming.

Overriding journal files which load Natbib or other BibTeX-based packages

Some journals and conferences, such as NeurIPS, offer package options such as nonatbib to prevent Natbib and BibTeX from loading. Others, such as ICML, do not, and auto-load their BibTeX-based style files which include the above irritating capitalization issue. For journals whose styles are implemented as LaTeX packages, one can override this using the following macros.

\makeatletter
\@namedef{ver@natbib.sty}{9999/12/31}
\let\setcitestyle\@gobble
\usepackage{icml2020}
\let\setcitestyle\undefined
\expandafter\let\csname ver@natbib.sty\endcsname\@undefined
\makeatother

This works in a very simple manner.

  1. It tells LaTeX that Natbib is already loaded, so that the package is not imported when icml2020 attempts to load it.
  2. The command \setcitestyle is redefined to \@gobble, which simply ignores its arguments – this prevents icml2020 from raising an error while loading.
  3. Once icml2020 is loaded, it then tells LaTeX that Natbib isn’t loaded – this prevents BibLaTeX from raising an incompatible package error when it is subsequently loaded.
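
After the override, BibLaTeX can be loaded as usual. The fragment below is a sketch of what might follow. The natbib=true option is an assumption about what the manuscript needs: it asks BibLaTeX to define Natbib-style commands such as \citet and \citep, which templates written for Natbib typically use. The file name references.bib is a placeholder.

```latex
% ... override block from above, ending with \makeatother ...

% natbib=true provides \citet and \citep through BibLaTeX,
% so citations written for Natbib keep working.
\usepackage[backend=bibtex,natbib=true]{biblatex}
\addbibresource{references.bib}

% In the document body, \printbibliography takes the place of the
% template's \bibliographystyle and \bibliography commands.
```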

With these tricks, I’ve been using BibLaTeX exclusively for all of my paper submissions in recent years. In particular, my paper on Polya Urn LDA – which is now published in IEEE TPAMI – uses BibLaTeX. At no point during the entire publication, refereeing, and copy-editing process did anyone object to BibLaTeX, even though it is not the official journal-supplied style, because it generates the same bibliographies in print.

Getting BibLaTeX-based documents to compile on arXiv

BibLaTeX is a large package which supports many advanced options. It supports two backends for processing .bib files into a LaTeX-readable format, BibTeX and Biber – I use the BibTeX backend, because this yields good compatibility with arXiv, Overleaf, TeXpad, and other tools.

On a modern system, BibLaTeX-based LaTeX files using the BibTeX backend will fail to compile when submitted to arXiv. The reason for this is that arXiv does not use an up-to-date version of TeX Live – see here – and this causes incompatibilities with .bbl files generated by the most recent version of BibLaTeX, which is still under active development.

The solution is to upload a new version of BibLaTeX to arXiv so that it correctly processes the submission. This is done by also uploading the following library files with the submission.

biblatex.def
biblatex.sty
blx-compat.def

Depending on the particular style file used, others may also be necessary. It should be possible to determine this from arXiv’s error messages.
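
As a complement to arXiv's error messages, one way to see which files a local build actually loads is LaTeX's \listfiles command. The sketch below assumes a placeholder references.bib. After compiling, the log contains a "File List" section giving each loaded file with its date and version. This only shows what was loaded locally, not which of those files arXiv is missing or has in an older version.

```latex
% \listfiles must come before \documentclass to record every file.
\listfiles
\documentclass{article}
\usepackage[backend=bibtex]{biblatex}
\addbibresource{references.bib}
\begin{document}
\nocite{*}
\printbibliography
\end{document}
```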

Concluding remarks

Using BibLaTeX offers advantages over BibTeX in terms of reference style customization. This also fixes reference capitalization issues in BibTeX, which people often instead fix painstakingly by hand – wasting time that could be spent on more worthwhile tasks such as scientific research.

However, switching to BibLaTeX can result in having to deal with journal style files that may wish to load incompatible packages, and with potential arXiv compilation errors. I hope that this article has illustrated some ways in which these issues may be smoothed out.

References

  1. In particular, BibLaTeX is the reference system recommended by Overleaf for new users.

  2. See the BibLaTeX cheat sheet.