Managing academic literature with Zotero and Docear

Over the years, like any research scientist, I have accumulated a decent number of references to the academic literature (research papers, reviews, theses, etc.). Along the way, I’ve tried several methods and tools to manage those references (somewhat) efficiently.

I even went so far as to design my own XML-based format to store them, along with a set of tools to manipulate files in that format. I dropped both the format and the associated tools when I discovered Zotero, circa 2010.

By now I have settled on a method relying on both Zotero and Docear, which I will describe here. In a few words, the core principle of that method is that I use Zotero to retrieve, store, and sort all my references, while I use Docear to work with, analyze and annotate a smaller set of selected references of interest.

Using Zotero

I import most of my references into Zotero using the Firefox plugin, either from PubMed or directly from the publishers’ websites.

I don’t use Collections much, as I prefer to rely on Tags. I have one main Collection named Biology, which hosts all the references from my work as a life sciences researcher. I store newly imported references in the root My Library folder. Since they don’t belong to any Collection yet, those new references also appear in the Unfiled Items special folder. This is why it is useful to have even a single Collection below My Library: it lets me immediately find the references that I have imported but not processed yet.
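The Unfiled Items view boils down to "references that belong to no Collection". Here is a minimal Python sketch of that filter, assuming each reference is a plain dict with a `collections` list. That shape, the `unfiled` helper and the sample titles are all hypothetical, chosen for illustration only; this is not a description of how Zotero stores or computes things internally.

```python
def unfiled(items):
    """Return the items that do not belong to any collection."""
    return [item for item in items if not item.get("collections")]


# Hypothetical sample library: titles are placeholders, not real references.
library = [
    {"title": "Paper A", "collections": ["Biology"]},  # already processed
    {"title": "Paper B", "collections": []},           # freshly imported
    {"title": "Paper C", "collections": []},           # freshly imported
]

to_process = unfiled(library)
print([item["title"] for item in to_process])  # ['Paper B', 'Paper C']
```

If every reference stayed at the root and no Collection existed at all, this filter would return the whole library and would no longer single out the unprocessed references.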

Whenever I know I can spare at least ten minutes, I go to the Unfiled Items folder and start processing the references waiting there. I quickly “skim” through the paper, just enough to understand what it is about and what its main points are. I then add a PDF annotation containing my own summary of what I understood from skimming it.

My “own summary” should not be a mere paraphrase of the abstract! While the abstract may (actually should) adequately convey the main results of the paper, it often lacks some key information that I’d like to find quickly if I come back to this paper (e.g., is it an in vitro or an in vivo result? From a cell model or an animal model?).

Back in Zotero, I import the annotation I’ve just added to the PDF (using the ZotFile extension to Zotero), so that I can read it directly from Zotero without needing to open the PDF itself. I tag the reference with the DGG:Status:Skimmed tag, optionally with some other relevant tags (more on tags below), then move it to my main Biology Collection.

If I believe the paper is interesting but does not warrant, at least for now, a more thorough reading, I leave it like that (of course I may come back to it later and change my mind). Conversely, if I want to spend more time on it, I tag it with DGG:Status:ToRead and import it into Docear.

Using Docear

Docear and Zotero do not share the same bibliographic database. In my method, this is a feature, not a bug: I do not want my Docear database to be filled with the thousands of references that I accumulate in my Zotero database. Docear should only contain the few hand-picked references that I want to use and study in detail.

I manually import a reference from Zotero into Docear by doing a Quick Copy (Ctrl+Shift+C) from Zotero using the BibTeX format, and pasting it into the BibTeX Source tab in Docear’s reference editor. Then I add a corresponding entry to the Literature & Annotations page in the Docear project.

Then I read the paper in detail, either immediately after importing it or some time later if I can’t find the time right away (I know I still have this paper to read since it is tagged with DGG:Status:ToRead in Zotero). I spend as much time as I need on it and take notes along the way. To avoid switching constantly between my PDF reader and Docear, I typically write my notes as PDF annotations on the paper itself, then import the annotations into Docear. This also has the advantage that a copy of the notes exists independently of Docear, which may be useful.

When I’m done reading the paper, I put it in an appropriate place in my Literature & Annotations mind map. I also go quickly back to Zotero to replace the DGG:Status:ToRead tag with the DGG:Status:Read tag.
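The tag swap done at the end of a reading session can be sketched in a few lines of Python. This assumes a reference's tags are held in a plain list of strings; the `replace_tag` helper is hypothetical and only illustrates the bookkeeping, since in practice I do this by hand in Zotero.

```python
def replace_tag(tags, old, new):
    """Return a copy of `tags` where `old` is swapped for `new`.

    All other tags are kept, in their original order.
    """
    if old not in tags:
        raise ValueError(f"{old!r} is not among the tags")
    return [new if tag == old else tag for tag in tags]


# Tags of a reference that has been skimmed and queued for reading.
tags = ["DGG:Status:Skimmed", "DGG:Status:ToRead", "DGG:Lab:RoyouLab"]

tags = replace_tag(tags, "DGG:Status:ToRead", "DGG:Status:Read")
print(tags)
# ['DGG:Status:Skimmed', 'DGG:Status:Read', 'DGG:Lab:RoyouLab']
```

Raising an error when the old tag is absent is a design choice of this sketch: it flags a reference that was read without ever having been marked as “to read”.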

About tags

The following table lists the tags that I use in Zotero. They are all prefixed with DGG: so that I can find them very easily in Zotero’s tag cloud widget.

Table 1. Zotero tags

Tag name                Description and purpose

Lab sorting tags
DGG:Lab:InsermU823      Identify the lab for which this reference is relevant
DGG:Lab:RoyouLab
DGG:Lab:PDCSLab

Reference status tags
DGG:Status:ToSkim       References that I have yet to skim [1]
DGG:Status:Skimmed      References that have been skimmed [2]
DGG:Status:ToRead       References that I have yet to fully read [3]
DGG:Status:Read         References that have been fully read
DGG:Status:ToAnalyze    References that I want to completely dissect [4]
DGG:Status:Analyzed     References that have been completely dissected

Interest classification tags
DGG:Topics:*            A reference relevant for the specified topic
DGG:Methods             A reference interesting for some of its methods [5]

  1. Ideally this tag should not be used; it’s better to take the time to actually skim the paper rather than marking it as “to be skimmed”.
  2. “Skimmed” references should be accompanied by a note summarizing them.
  3. All references tagged with DGG:Status:ToRead or higher should have been imported into Docear.
  4. Typically the case for papers I want to present in a journal club or a similar context.
  5. More precise tags can be added “below”, similar to what is done for the Topics category.
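
The naming scheme and the rule in note 3 lend themselves to a small consistency check. The sketch below is only an illustration built on several assumptions: statuses are ranked in the order the table lists them (my reading of “or higher”), each reference is a dict with a `citekey` and a `tags` list, and the set of keys known to Docear is supplied by hand. The helper names, the citation keys and the sample data are all hypothetical.

```python
# Assumed ranking, taken from the order of the status rows in the table.
STATUS_ORDER = ["ToSkim", "Skimmed", "ToRead", "Read", "ToAnalyze", "Analyzed"]


def parse_tag(tag):
    """Split 'DGG:Category:Value' into (category, value).

    Returns None for tags outside the DGG namespace. The value is None
    when the tag has no third component (e.g. 'DGG:Methods').
    """
    parts = tag.split(":")
    if parts[0] != "DGG" or len(parts) < 2:
        return None
    return parts[1], ":".join(parts[2:]) or None


def highest_status(tags):
    """Return the most advanced status found in `tags`, or None."""
    ranks = []
    for tag in tags:
        parsed = parse_tag(tag)
        if parsed and parsed[0] == "Status" and parsed[1] in STATUS_ORDER:
            ranks.append(STATUS_ORDER.index(parsed[1]))
    return STATUS_ORDER[max(ranks)] if ranks else None


def missing_from_docear(references, docear_keys):
    """List the keys of references at ToRead or beyond that Docear lacks."""
    threshold = STATUS_ORDER.index("ToRead")
    missing = []
    for ref in references:
        status = highest_status(ref["tags"])
        if status is None or STATUS_ORDER.index(status) < threshold:
            continue
        if ref["citekey"] not in docear_keys:
            missing.append(ref["citekey"])
    return missing


# Hypothetical data: keys are placeholders, not real references.
references = [
    {"citekey": "refA", "tags": ["DGG:Status:Skimmed"]},
    {"citekey": "refB", "tags": ["DGG:Status:Skimmed", "DGG:Status:ToRead"]},
    {"citekey": "refC", "tags": ["DGG:Status:Analyzed", "DGG:Methods"]},
]
docear_keys = {"refC"}

print(missing_from_docear(references, docear_keys))  # ['refB']
```

Here refA is ignored because it has only been skimmed, refC is already in Docear, and refB is reported because it is marked as “to read” but was never imported.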