
The Electronic Researcher

A working guide to digital tools for academic research — document capture, OCR, citation management, and the practical question of how to keep a working archive without losing the source.

The Electronic Researcher began in the late 1990s as a small set of pages collecting notes on the digital tools that academic historians were starting to use in research: the first usable OCR packages, the first generation of digital-camera document capture in archives, the early citation-management software (EndNote, ProCite, Bookends), and the question that ran underneath all of it — how to keep a working archive of source material that you could still find and use ten years later.

The project lived for years as a working resource, periodically updated as the tools changed. The 2008 update cycle is the one most often linked from external academic resource pages; see in particular the early-2008 post on the coffee grounds blog at the University of Minnesota that pointed to the project ("Amateur digitization for academics, updated").

Sections

Cameras

Notes on document-capture cameras for academic research in archives: what to look for, what to avoid, lighting and stand setup, and the practical workflow that turns captured images into usable source files.

OCR & document workflow

Notes on running captured page images through optical character recognition and producing a clean text version with intact formatting. Covers the early ABBYY FineReader and Adobe Acrobat OCR workflows, and the perennial problems of accented characters, marginalia, and stamped pagination.

Citation management

Notes on citation-management software in the period before Zotero (EndNote, ProCite, Bookends, Reference Manager) and the longer-running question of how to maintain a citation library that survives software-version transitions.

Working archive

The question that runs through the whole project: how do you keep a working archive of source material — captured images, OCR text, transcribed notes, citation entries — that you can still find and use ten years from now?