Duplicate detection
In large scanning projects, such as those undertaken by Google or the Internet Archive across multiple libraries, the same book or work may be scanned more than once. In this visualization, aligned portions of two scanned books are shown in green and make up most of the text. Divergent portions are marked in red; in this case, they come from different publishers’ catalogues at the end of each book.
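The green/red labelling described here can be approximated with a simple token-level alignment. The sketch below is only an illustration built on Python's standard `difflib` module. It is not the method used to produce the visualization, and the function names `align_regions` and `aligned_fraction` are hypothetical. It also assumes clean, whitespace-separated text, whereas real OCR output would likely need fuzzier matching.

```python
from difflib import SequenceMatcher


def align_regions(text_a, text_b):
    """Label spans of two texts as aligned ("green") or divergent ("red")."""
    tokens_a = text_a.split()
    tokens_b = text_b.split()
    # autojunk=False turns off difflib's heuristic that skips very frequent tokens.
    matcher = SequenceMatcher(None, tokens_a, tokens_b, autojunk=False)
    regions = []
    for tag, a0, a1, b0, b1 in matcher.get_opcodes():
        # "equal" spans are shared by both texts; every other opcode is a divergence.
        color = "green" if tag == "equal" else "red"
        regions.append(
            (color, " ".join(tokens_a[a0:a1]), " ".join(tokens_b[b0:b1]))
        )
    return regions


def aligned_fraction(text_a, text_b):
    """Share of the tokens in text_a that fall inside aligned regions."""
    tokens_a = text_a.split()
    tokens_b = text_b.split()
    matcher = SequenceMatcher(None, tokens_a, tokens_b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(tokens_a) if tokens_a else 0.0
```

A high aligned fraction combined with a red region confined to the final pages would be consistent with two scans of the same work that differ only in the publisher's back matter. Any threshold for calling two scans duplicates would be a project-specific choice.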