There is no longer, I am told, any need to collate manuscripts against a base text. Instead they should be transcribed into a computer by touch typing supported by the use of “hypertext” methods to accommodate any ancillary comment the editor might wish to record. When his transcriptions are complete, or at any appropriate stage in the process of collation, the editor would have the computer compare the transcriptions and print out the results in any way that seemed useful. Unfortunately, the manuscript traditions of classical texts do not normally lend themselves to this sort of treatment for many reasons, but most obviously because the suggested method involves feeding into a computer a diplomatic edition of every manuscript deemed worthy of collation, as a preliminary to “letting the computer make the comparisons.”
There are two basic strategies for collating texts by computer: fully automated, batch collation and interactive, computer-assisted collation. The first strategy, fully automated collation, has received the lion’s share of attention. Its goal is to find and record all textual variants without human interaction. […] The second strategy, interactive collation, […] provid[es] the computer with human assistance whenever necessary.
The core principle behind automated text collation is that, rather than choosing a base text (or reference text) against which all subsequent texts should be compared, the scholar refrains from any selection or comparison at all. She or he will instead produce a full transcription of each witness to be collated, in as much diplomatic detail as is feasible, and leave the work of comparison to the software.
In practice, given a set of texts that resemble each other, the programme identifies and aligns the matching words and phrases across all witnesses. Depending upon the collation programme, the scholar might compare all texts to a selected base, or compare each text to every other text without reference to a base. Various tools offer different forms of visualization of the results: a side-by-side comparison of two witnesses, a spreadsheet of all witnesses that can be downloaded and used as a more traditional collation table, or even a representation of the text as a graph, with variants marked as divergent paths. The result can also be used in further analysis, for example in a programme that computes a hypothesis (partial or full) for the stemma.
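To make the alignment step concrete, the following is a minimal sketch of a pairwise, word-level comparison of two transcriptions. It relies only on `difflib.SequenceMatcher` from Python's standard library, which is a general-purpose sequence comparison utility and not a dedicated collation tool; the sigla `A` and `B`, the sample readings, and the helper names `tokenize` and `align_pair` are invented for illustration. Real collation software handles many witnesses at once, transpositions, and far richer normalization.

```python
from difflib import SequenceMatcher


def tokenize(text):
    """Split a transcription into lowercase word tokens (a deliberately naive normalization)."""
    return text.lower().split()


def align_pair(witness_a, witness_b):
    """Align two witnesses word by word.

    Returns a list of (tag, reading_a, reading_b) rows, where tag is one of
    'equal', 'replace', 'delete' or 'insert' as reported by SequenceMatcher.
    """
    a, b = tokenize(witness_a), tokenize(witness_b)
    # autojunk=False disables the heuristic that ignores very frequent tokens.
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    rows = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        rows.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return rows


# Hypothetical witnesses, keyed by invented sigla.
witnesses = {
    "A": "the quick brown fox jumps over the lazy dog",
    "B": "the quick red fox leaps over the dog",
}

# Print a simple two-column collation table.
for tag, reading_a, reading_b in align_pair(witnesses["A"], witnesses["B"]):
    print(f"{tag:8} | {reading_a:20} | {reading_b}")
```

Each row of the output corresponds to one row of a collation table: rows tagged `equal` are shared readings, while the remaining rows record substitutions, omissions, or additions in `B` relative to `A`. Treating `A` as the first argument amounts to using it as a base; a baseless comparison would require aligning every witness against every other, or a multiple-alignment method of the kind dedicated tools provide.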