collation (automatic)

computer-aided critical editions, i.e., the use of the computer to compare texts or manuscripts and to indicate variants (Gilbert 1973, 139).

Automatic text collation—the use of the computer to locate variant readings in manuscript copies of a text (Gilbert 1974, 106).

There is no longer, I am told, any need to collate manuscripts against a base text. Instead they should be transcribed into a computer by touch typing supported by the use of “hypertext” methods to accommodate any ancillary comment the editor might wish to record. When his transcriptions are complete, or at any appropriate stage in the process of collation, the editor would have the computer compare the transcriptions and print out the results in any way that seemed useful.
Unfortunately, the manuscript traditions of classical texts do not normally lend themselves to this sort of treatment for many reasons, but most obviously because the suggested method involves feeding into a computer a diplomatic edition of every manuscript deemed worthy of collation, as a preliminary to “letting the computer make the comparisons” (Whittaker 1991, 128).

There are two basic strategies for collating texts by computer: fully automated, batch collation and interactive, computer-assisted collation. The first strategy, fully automated collation, has received the lion’s share of attention. Its goal is to find and record all textual variants without human interaction. […]
The second strategy, interactive collation, […] provid[es] the computer with human assistance whenever necessary (Hilton 1992, 139-140).

Automatic collation is based on the idea that each document (transcription of a manuscript or OCR recognition of a printed edition) is a complete instance of the text to reconstruct, with variations (Boschetti 2007, 3).

The core principle behind automated text collation is that, rather than choosing a base text (or reference text) against which all subsequent texts should be compared, the scholar refrains from any selection or comparison at all. She or he will instead produce a full transcription of each witness to be collated, in as much diplomatic detail as is feasible, and leave the work of comparison to the software (Macé et al. 2015, 333).

The principle behind automated collation is that, given a set of texts that resemble each other, the programme will identify and align the matching words and phrases across all text witnesses. Depending upon the collation programme, the scholar might compare all texts to a selected base, or compare each text to every other text without reference to a base. Various tools offer different forms of visualization of the results—side-by-side comparison of two witnesses, a spreadsheet of all witnesses that can be downloaded and used as a more traditional collation table, or even a representation of the text as a graph, with text variants marked as divergent paths. The result can also be used in further analysis, for example in a programme that will compute a hypothesis (partial or full) for the stemma (Macé et al. 2015, 335).
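
The alignment step described in the last quotation can be illustrated with a minimal sketch. It is not the method of any particular collation programme: it assumes that witnesses are plain transcriptions split on whitespace, and it uses the generic sequence matcher from Python's standard library (difflib) in place of a dedicated collation algorithm. The function names, the sigla A, B and C, and the sample readings are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def tokenize(text):
    """Split a transcription into lower-cased word tokens (a simplifying assumption)."""
    return text.lower().split()


def collate_pair(first, second):
    """Align two transcriptions word by word and return only the variant spans.

    Each variant is a tuple (kind, reading_in_first, reading_in_second), where
    kind is one of difflib's opcode tags: 'replace', 'delete' or 'insert'.
    """
    a, b = tokenize(first), tokenize(second)
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    variants = []
    for kind, i1, i2, j1, j2 in matcher.get_opcodes():
        if kind != "equal":
            variants.append((kind, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return variants


def collate_against_base(witnesses, base_siglum):
    """Compare every witness to one selected base text."""
    base = witnesses[base_siglum]
    return {
        siglum: collate_pair(base, text)
        for siglum, text in witnesses.items()
        if siglum != base_siglum
    }


def collate_all_pairs(witnesses):
    """Compare each witness to every other witness, without reference to a base."""
    return {
        (s1, s2): collate_pair(witnesses[s1], witnesses[s2])
        for s1, s2 in combinations(sorted(witnesses), 2)
    }


# Hypothetical witnesses, invented for illustration only.
witnesses = {
    "A": "the quick brown fox jumps",
    "B": "the quick red fox jumps",
    "C": "the quick brown fox leaps high",
}

for siglum, variants in collate_against_base(witnesses, "A").items():
    print(siglum, variants)
```

The two wrapper functions correspond to the two options named in the quotation: comparison against a selected base, and comparison of every text with every other text. A real collation tool would add normalisation of spelling, detection of transpositions, and alignment of more than two witnesses at once, none of which this sketch attempts.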
