Sample texts

This section of our site contains various texts from Shakespeare, from the official war record for the American Civil War, and from the King James Bible.

All of the texts here are complete and unabridged. They've come from sources such as Project Gutenberg, and are being provided here without charge and within the terms of the original source.

Shakespeare plays

The Shakespeare plays on this site are from Project Gutenberg, and are subject to the following copyright statement. They are available without charge on this site, for non-commercial use.

Standard disclaimer and copyright statement from Project Gutenberg: THIS ELECTRONIC VERSION OF THE COMPLETE WORKS OF WILLIAM SHAKESPEARE IS COPYRIGHT 1990-1993 BY WORLD LIBRARY, INC., AND IS PROVIDED BY PROJECT GUTENBERG ETEXT OF ILLINOIS BENEDICTINE COLLEGE WITH PERMISSION. ELECTRONIC AND MACHINE READABLE COPIES MAY BE DISTRIBUTED SO LONG AS SUCH COPIES (1) ARE FOR YOUR OR OTHERS PERSONAL USE ONLY, AND (2) ARE NOT DISTRIBUTED OR USED COMMERCIALLY. PROHIBITED COMMERCIAL DISTRIBUTION INCLUDES BY ANY SERVICE THAT CHARGES FOR DOWNLOAD TIME OR FOR MEMBERSHIP.

Shakespeare hints and tips

In the Shakespeare plays on this site, the stage directions, including names of characters when used as stage directions, are in uppercase (e.g. ROMEO). When a character is referred to by name by another character, the name is in lowercase with the first letter in uppercase (e.g. Romeo). At time of writing, the Search Visualiser is not case-sensitive, so a search for "Romeo", "ROMEO" or "romeo" will show both ROMEO and Romeo as hits.

When searching for short words, it's advisable to check whether you have Search Visualiser set to accept partial matches. If it's set to accept partial matches, then a search for "sing" will also show matches within "singer" and "singing". If it's set for whole word search, then it will only show matches for the word "sing". If you want to compare how often "he" is mentioned with how often "she" is mentioned, you need to use whole word search. Otherwise you may get false positives, because "he" is a partial match for "she", "the" and many other words.
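The difference between the two settings can be illustrated with a short Python sketch. This is not the Search Visualiser's own code; the function name count_matches, its whole_word parameter and the sample sentence are all hypothetical and are used only to show how partial matching inflates counts for short words.

```python
import re

def count_matches(text, term, whole_word=False):
    """Count case-insensitive occurrences of term in text.

    Hypothetical helper for illustration only; it does not
    reproduce how the Search Visualiser is implemented.
    """
    pattern = re.escape(term)
    if whole_word:
        # \b marks a word boundary, so "he" no longer matches inside "she".
        pattern = r"\b" + pattern + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

# Made-up sample sentence, not taken from any of the texts on this site.
sample = "She said the singer would sing while he was singing."

partial_he = count_matches(sample, "he")                  # matches in "She", "the", "he"
whole_he = count_matches(sample, "he", whole_word=True)   # matches "he" only
partial_sing = count_matches(sample, "sing")              # "singer", "sing", "singing"
whole_sing = count_matches(sample, "sing", whole_word=True)

print(partial_he, whole_he, partial_sing, whole_sing)
```

In this sample, partial matching reports three hits for "he" although the word itself appears only once, which is the kind of false positive that whole word search avoids.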

American Civil War documents

Our document "Dealing with very large texts" contains some examples of how you can use Search Visualiser to search and to study large historical documents. Our document "Searching for common names in large documents" contains other examples.

The American Civil War official records documents on this part of the Search Visualiser site are from the archive at Cornell University at:
http://digital.library.cornell.edu/m/moawar/waro.html

That archive contains the complete collected official war records for the armies and navies on both sides.

The records contain significant quantities of text, typically about a thousand pages per volume. The text was scanned using OCR and contains a moderate proportion of typographic errors as a result. We have not edited the text in the files on this site, apart from splitting the files into more manageable sizes, so these errors remain in the text. This will lead to some false negatives, where a word is corrupted by a typographic error and is missed by the Search Visualiser as a result. For instance, one sentence claims that "gnus" were entering a city (presumably intended to be "guns"), so a search for "guns" will not find it. The rate of false negatives for a given search due to typographic errors will probably be around 1% (i.e. the SV will detect about 99% of the words that it would have detected if there were no typos).
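As a rough illustration of how a typographic error produces a false negative, and of what a 1% false negative rate means in practice, here is a minimal Python sketch. The two sample sentences, the function names whole_word_hits and expected_hits, and the figure of 1,000 occurrences are all invented for the example; only the "gnus"/"guns" error and the approximate 1% rate come from the text above.

```python
import re

def whole_word_hits(text, term):
    """Count case-insensitive whole-word occurrences of term (illustrative helper)."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

# Invented sentences: the second simulates an OCR error ("guns" -> "gnus").
clean_text = "The guns were entering the city. More guns followed."
ocr_text = "The gnus were entering the city. More guns followed."

hits_clean = whole_word_hits(clean_text, "guns")
hits_ocr = whole_word_hits(ocr_text, "guns")

# Fraction of genuine occurrences missed because of the typo.
false_negative_rate = 1 - hits_ocr / hits_clean

def expected_hits(true_count, fn_rate=0.01):
    """Expected number of hits if a fraction fn_rate of occurrences is corrupted.

    The default of 0.01 reflects the approximate 1% rate estimated above.
    """
    return true_count * (1 - fn_rate)

print(hits_clean, hits_ocr, false_negative_rate)
print(expected_hits(1000))
```

In this tiny sample one of the two occurrences is lost, so the miss rate is far higher than in the real records; across a full volume the same effect is expected to remove only about one hit in a hundred.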

The copyright statement for Cornell is here:
http://cdl.library.cornell.edu/guidelines.html

The copyright statement for their use on the Search Visualiser site is here:
Pages/ACWcopyright.aspx

We have selected three volumes from the army archives, from different stages in the war – the beginning, the turning point at the Battle of Gettysburg, and the end of the war.

The first volume in this selection is Series 1, Volume 1. We have split this into three files to make it more tractable. The Search Visualiser can handle the full-sized volume, but most readers will want to be able to compare text from before and after key points, so we have split the volume accordingly.

The first file, Volume 1a, runs from the start of the official war records before the war, to the end of the chapter dealing with the siege and surrender of Fort Sumter.

The second file, Volume 1b, contains the remaining chapters of Volume 1, apart from the index. These chapters deal with the secession of several states, and with operations in the South.

The third file, Volume 1 Index, contains only the index for Volume 1. We have separated out the index so that the Search Visualiser results for the main text aren't complicated by hits from the index.

The second volume we have selected deals with the period around the Battle of Gettysburg: Series 1, Volume 27, part 1.

We have divided it into two files.

The first file contains the whole of the body text of this volume, excluding only the index. Unlike the other volumes in our selection, this volume deals with a single central event and a relatively short period of time, with no logical dividing point within it.

The second file contains the index for this volume.

The third volume we have selected deals with the end of the war: Series 1, Volume 49, part 1.

We have divided it into four files as follows.

The first file contains the opening section of this volume, which consists of Union records and ends with the capture of Jefferson Davis.

The second file contains the next section of the volume, consisting of Union records from after the capture of Jefferson Davis, and concluding at the start of the section containing Confederate records.

The third file contains the third section of this volume, which consists of Confederate records.

The fourth file contains the index for this volume.

Bible texts

We have selected the first five books of the King James Bible, plus the four Gospels, also from Project Gutenberg.