Library:Digging into Digital Book Collections
This page was created to support the UBC Library Workshop "Digging into Digital Book Collections" offered in Winter 2014.
Workshop Description
Learn how to further your research by using digital book collections, including Google Books and the Internet Archive. You will leave this workshop knowing how to:
- search within books to locate research material not evident from title or chapter descriptions
- conduct more thorough literature reviews on primary sources
- perform cited reference searches
- conduct historical word searches
- and more!
Copyright and Digital Book Collections
This information should not be construed to be legal advice nor UBC policy. More information about Copyright at UBC can be found at http://copyright.ubc.ca.
Most material in digital book collections is still in copyright and this affects how you can access and use it. See UBC's official copyright website for more information about your rights and responsibilities regarding copyright.
Public Domain
In Canada, the copyright for a work usually expires 50 years after the death of the creator, at the end of the relevant calendar year. At this point it is said to pass into the public domain.
- Example: Mordecai Richler died on 3 July 2001, and his novels will remain copyrighted until 31 December 2051, passing into the public domain on 1 January 2052.
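The date arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the general life-plus-50 rule as described above; the function name and the `term_years` parameter are hypothetical, and the sketch ignores the special cases (authorship, format) that can change the term.

```python
from datetime import date


def public_domain_date(death: date, term_years: int = 50) -> date:
    """Return the first day on which a work is in the public domain.

    Assumes the simple rule described above: copyright lasts until the
    end of the calendar year that falls `term_years` after the
    creator's death.
    """
    last_protected_year = death.year + term_years
    # Copyright ends on 31 December of that year, so the work enters
    # the public domain on 1 January of the following year.
    return date(last_protected_year + 1, 1, 1)


# Mordecai Richler died on 3 July 2001.
print(public_domain_date(date(2001, 7, 3)).isoformat())  # 2052-01-01
```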
Items that are in the public domain are free to use in any way you choose. That means no restrictions on copying and adapting, no need to seek permission, and no uncertainty about your rights as a user! (There is also no legal requirement to attribute works in the public domain to their creators, although doing so is an important part of maintaining academic integrity.)
- Determining whether a work is in the public domain can be complicated as the duration of copyright differs depending on a work's authorship and format. If you are uncertain whether a work is in the public domain in Canada you can contact UBC's copyright help list-serv for more assistance.
US Copyright
Many digital book collections were created in the United States. American copyright law is different from Canadian copyright law and there are many possible periods of copyright. While American copyright law does not apply in Canada, it does often have the effect of restricting access to digital collections held on servers in the United States.
Overview of Some Digital Book Collections
Digital Book Collections on the Internet
Google Books is a large-scale book digitization project by Google. Its goal is to make the large print collections of a number of libraries available for full-text searching on the web. The initial partner libraries included Harvard University, Stanford University, University of Michigan, New York Public Library, and the Bodleian Library at Oxford University. Additional libraries have since joined the project. While the records and text of these library collections are now available for full-text search, many records do not provide a full-text preview due to copyright restrictions. Works that are in the public domain can be downloaded.
Open Library is a project sponsored by the Internet Archive with the goal of creating a web page for every published book. Like Google Books, the Open Library lets you conduct full-text searches as well as view and download public domain books. The Open Library is a Wikipedia-like project and has grown extensively through public contributions, but this also means that its bibliographic data is a mixed bag, which makes searching it complex. It has also become a repository for many terminated or dormant digitization projects, such as those by Microsoft, the University of Toronto, and Cornell University. These projects often have superior OCR to those in Google Books.
Hathi Trust is a searchable repository of books based at the University of Michigan. The Hathi Trust is a partnership of over 60 universities in the United States and is intended to provide secure future access to the results of large-scale book digitization projects. As such, it includes many collections that are also in Google Books and the Open Library. While only members of partner universities can download full-text public domain books from the Hathi Trust, anyone can register a guest account and create searchable collections. The Hathi Trust has partnered with OCLC to create a WorldCat-like search interface for its collections.
The Internet Archive hosts one of the largest collections of freely available digital content on the Web and includes digitized print books, audio files, moving images and, by means of the Wayback Machine, cached copies of websites (including a significant number of the now defunct Geocities sites). Considerable Canadian content is available - click the "Canadian Libraries" link in the top menu bar for quick links and information about the contributors to this collection. Of special note: the Canadiana Collection "of publications dating back to the early 17th century that are about Canada, or written and published by Canadians....(The content) begins with pre-1900 non-serial materials which were originally microfilmed ...gathered and produced by the Canadian Institute for Historical Microreproductions (CIHM)" (About). Coverage ends at 1920.
- "Archive.org supports all metadata about items in just about any language so long as the characters are UTF8 encoded" (FAQs) so you will find materials in a wide variety of scripts.
- Most, but not all, of the content is in the public domain. Please note that the Internet Archive's terms limit use to "scholarship and research purposes only."
- The copyright status for most content is found in the description menu and Creative Commons licensed materials are also clearly identified with the CC logo appearing under the file links.
Gallica: Many national libraries have sites devoted to displaying their country's cultural patrimony. Gallica, the digital library of the Bibliothèque nationale de France, is among the best. It makes available more than two million documents from a number of major French libraries. In addition to 400,000 books, Gallica also includes almost a million magazine and newspaper issues as well as over 550,000 images and a variety of other materials including 2400 sound files. The standard of presentation is uniformly high and the interface admirable.
Wikipedia maintains a growing list of other digital library projects.
UBC Library eBooks
Summon is a service that allows you to search most of UBC Library's books, journal articles, primary sources, newspapers, microforms, CDs/DVDs and other materials from a single search box. To narrow your results down to e-books only, type in your keywords, title words, author names, etc., click Search, and then choose "book/e-book" in the Content Type menu and "items with full-text online" in the Refine Your Search menu.
- As these books come from a wide array of publishers and distributors the permitted uses can vary.
- Some e-book content from the Public Domain is included in Summon - primarily from the Hathi Trust and Project Gutenberg.
- Look for links such as "copyright status," "permitted uses," "terms of use," etc. to determine what you are able to do with the materials you find.
- If you are unsure you can contact a librarian or send an email to the copyright help list-serv.
Many ebooks available via UBC Library are on ProQuest's ebrary platform. Video tutorials are available.
Downloading and reading ebrary books requires that you first download and install Adobe Digital Editions. Adobe Digital Editions is free.
If you prefer to read text instructions, check out this blog entry, which brings together all the steps you need to follow - both on ebrary's and Adobe's sites - to download and read your ebrary book.
Theses and Dissertations
EThOS is a database of over 300,000 UK theses and dissertations, many of which are available without charge. Hosted by the British Library, its intent is to offer 'a single point of access where researchers the world over can access ALL theses produced by UK Higher Education' (About). Registration is required but is easily accomplished.
For other sources of online theses and dissertations:
- Although the ProQuest Dissertations and Theses database offers a vast array of citations and full-text documents, it does not include everything. For example, UBC's recent theses and dissertations are not represented. Furthermore, the database is much stronger on North American material than on material from abroad.
- To find UBC theses, including the full text of older UBC theses and dissertations, start by going to the Library's Guide to Finding Theses & Dissertations.
- International theses may be found in the Institutional Repositories of the granting university, as well as international databases like OAIster and/or the Center for Research Libraries Online Global Resources Network.
Activities Enhanced by Digital Book Collections
Search for terms in books
- Good for fact-finding
- Find terms and subjects not evident in title or chapter descriptions
- Find terms, people (e.g. minor historical figures, film directors, authors and artists) or places that are not the main subjects of works
- Note that the full-text search will also find authors and places in footnotes and bibliographies
Create your own index by searching
- Find terms that are not indexed in a book
- Use the separate search box to search within the text of a single book
Search for references in books
- Find full bibliographic citations of works including journal abbreviations or missing page numbers
- Try to negate the effect of bibliographic styles (e.g. APA, MLA) by using search terms such as author and "title in quotations"
Exercise |
---|
Search for a book or article. If you don't know the full title, try searching for just a few words in quotation marks (e.g. "Gordon Childe" "Most Ancient East"). |
If you searched for a book and found it, click on it and then select "About this book." Scroll down to the very bottom of the record to see the bibliographic information. E.g. Most Ancient East |
If this does not work, see if you can find mention of the book in a bibliography, book review, or footnote to obtain the information you need. e.g. Book review of the Most Ancient East with bibliographic information in the second snippet. |
Conduct more thorough literature reviews
- Search for mention of influential articles or works
- Track mention of the work in books to quickly survey opinions and discussion
- Useful for crowdsourcing scholarly opinion on an influential article
Exercise |
---|
Search for a seminal theoretical article in your discipline (e.g. Flannery "Golden Marshalltown") |
Read previews and snippets to see how the discussion has evolved around the article. |
For older articles, try filtering by date to see if scholarly consensus has changed over time (e.g. the same search as above limited to 1982-1990). |
Conduct cited reference searches
- Search full text for mentions of references
- Not as precise as using Google Scholar "Cited By" but can turn up some hidden results
Exercise |
---|
Find a Book in Google Books e.g. Dyson, Stephen L. The Creation of the Roman Frontier. Princeton, N.J: Princeton University Press, 1985. Print. |
Check to see if the Google Books profile page lists References to this book in other books, Google Scholar, or web pages (e.g. JSTOR) |
Try searching all of Google Books for "Last Name, Year". You may want to add an additional subject-based search term to increase relevancy (e.g. "Dyson 1985" Roman). Do the results match? Did you discover something new? |
Are the results the same in Google Scholar? |
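Because authors cite in different author-date styles, it can help to prepare several phrase queries before you start. The following is a minimal sketch in Python; the function name and the choice of three styles are assumptions made for illustration, and a given search engine may ignore punctuation, in which case some variants will return identical results.

```python
def cited_reference_queries(surname, year, subject=None):
    """Return quoted author-date phrases in a few common citation styles."""
    styles = [
        f"{surname} {year}",    # Dyson 1985
        f"{surname}, {year}",   # Dyson, 1985
        f"{surname} ({year})",  # Dyson (1985)
    ]
    queries = []
    for phrase in styles:
        query = f'"{phrase}"'  # quotation marks force a phrase search
        if subject:
            query += f" {subject}"  # optional subject term for relevancy
        queries.append(query)
    return queries


for q in cited_reference_queries("Dyson", 1985, subject="Roman"):
    print(q)
```

Each printed line can be pasted into the Google Books or Google Scholar search box in turn, so that the result lists can be compared.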
Searching for primary sources and discipline-specific abbreviations
- Very handy for discovering sources that are not mentioned in bibliographies or the book index
- Try to think of all the ways that a source can be cited
Exercise |
---|
Think of a primary source or common abbreviation in your field (e.g. the Gortyn Law Code) |
Brainstorm for all the possible ways that it could be abbreviated (e.g. IC IV 72, IC 4.72, I Cr. 4.72, I. Cret. IV 72) |
Do separate searches for each abbreviated form. You may want to use "" to search as a phrase (e.g. "IC IV 72") |
Did you discover a reference that was previously unknown to you? Did you discover something that was outside of your subject area? In the above example, comparisons between ancient Greek and Jewish traditions became evident. |
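Brainstorming every abbreviated form by hand is error-prone, so the combinations can be generated mechanically. The sketch below is in Python and uses the Gortyn Law Code example from the exercise; the function name, its parameters, and the particular lists of prefixes and separators are illustrative assumptions, not an exhaustive account of how the source is cited.

```python
from itertools import product


def citation_variants(prefixes, volumes, number, separators=(" ", ".")):
    """Return a quoted phrase query for every prefix/volume/separator combination."""
    return [
        f'"{prefix} {volume}{sep}{number}"'
        for prefix, volume, sep in product(prefixes, volumes, separators)
    ]


queries = citation_variants(
    prefixes=["IC", "I Cr.", "I. Cret."],  # ways of abbreviating the corpus
    volumes=["IV", "4"],                   # Roman or Arabic volume number
    number="72",                           # inscription number
)
for q in queries:
    print(q)
```

This produces twelve phrase queries, including the four forms listed in the exercise; each one can then be run as a separate search.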
Create collections of research material
Using:
- Google Books "My Library" feature
- Hathi Trust "Collections" tab - may be private or public
- Search within your own curated book collections
- Keep a record of books you've read and/or referenced
- Organize them into public or private collections
Activities Specific to Google Books
Finding references to the book under About this Book
- Links to other books in Google Books and articles in Google Scholar
- Also contains links to web pages mentioning the books including book reviews and online syllabi
Find popular passages in other books under About this Book
- Incredibly useful for following discussion on influential passages from scholars
- Useful also for primary sources including literary and legal passages that tend to be quoted verbatim
- Still a test feature that does not always work well
Conduct searches for historical frequency of words using the Google N-Gram Viewer
- Create visualizations of the usage of words in the English language
- Useful for determining when a word was employed
- More useful from the 19th Century onwards and less useful after 2008
- Plot out multiple words on a single chart for comparative analysis
- Indirectly allows you to explore how a word was being employed and whether you need to look for synonyms
Exercise |
---|
Go to the Google N-Gram Viewer |
Put a word or words in the search. You can compare words by separating them with a comma. E.g. sustainability, environmentalism |
You can search for phrases and compare them as well. Unlike normal Google searches, the n-gram viewer is case sensitive and doesn't require "" for phrases. E.g. Star Trek, Star Wars |
Make sure to play with the year ranges to make meaningful graphs. If your term doesn't show up in earlier years, try to brainstorm for historical terms. |
Use the hyperlinks below the graph to see actual usage within Google Books Search. |
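To make the idea behind such a chart concrete, here is a toy sketch in Python of a per-year phrase frequency: the number of times a phrase occurs divided by the number of same-length word sequences in that year's text. The function name and the two sample sentences are invented for illustration, and the sketch is not a description of how Google computes its figures. Like the viewer, it is case sensitive.

```python
from collections import Counter


def yearly_frequency(texts_by_year, phrase):
    """Relative frequency of `phrase` among same-length n-grams, per year."""
    target = tuple(phrase.split())
    n = len(target)
    result = {}
    for year, text in sorted(texts_by_year.items()):
        tokens = text.split()
        # Every run of n consecutive words is one n-gram.
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        counts = Counter(ngrams)
        result[year] = counts[target] / len(ngrams) if ngrams else 0.0
    return result


# Invented sample text, standing in for all books published in a year.
sample = {
    1977: "Star Wars opened and Star Trek fans watched Star Wars",
    1979: "Star Trek returned while star wars toys sold",
}
print(yearly_frequency(sample, "Star Wars"))
```

In the 1979 sample the lowercase "star wars" is not counted, which is why capitalization matters when you enter terms.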
Activities Specific to the Internet Archive
- Download books in different formats; file types potentially available include HTML, DAISY, Kindle, EPUB, PDF, text and DjVu
- Search within the book when the "Read Online" file format is available
- Use the power of Google to search the Internet Archive by limiting your search to the domain archive.org
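Limiting a Google search to a domain is done with the `site:` operator, for example `"fur trade" site:archive.org`. If you want to save or share such searches, a minimal Python sketch for building the search URL follows; the function name and the sample search terms are hypothetical.

```python
from urllib.parse import urlencode


def site_search_url(terms, domain="archive.org"):
    """Build a Google search URL restricted to one domain via `site:`."""
    query = f"{terms} site:{domain}"
    # urlencode escapes spaces, quotation marks and the colon safely.
    return "https://www.google.com/search?" + urlencode({"q": query})


print(site_search_url("fur trade"))
# https://www.google.com/search?q=fur+trade+site%3Aarchive.org
```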
Digital Humanities Restrictions
Most digital library collections are not friendly to being scraped or harvested on a large scale. You can contact Google to gain access to public domain datasets for research purposes.