Moving to New Quarters…

Old quarters:

B105 – Manual file and two student workstations
B115 – Lateral files, table, supply cabinets
B117 – Microfilm cabinets
B114 – The front office workstation, copier, fridge and table
B118 – Shelving!
B109 – Three full workstations and four bookcases

New quarters:

M1006, view 1
M1006, view 2
M1010
 

Some of the furniture is still missing from the new place too… Anyone got a shoehorn?

ALA 2012: Text Creation Partnership Update

Text Creation Partnership Update
1:30 pm on Saturday, June 23, 2012
Rebecca Welzenbach
With guest speakers Eric Nebeker and Harriet Green

Welzenbach began the session by having the small group of us introduce ourselves. [I did not note down the participants, though.]

Welzenbach then spoke about EEBO-TCP. ProQuest provides online page images, but they are not text-searchable; TCP provides the searchability. The texts are keyed and marked up by hand, with the goal of creating a faithful transcription rather than a re-creation of the image. The transcription is done manually because OCR simply does not work on these early printed books. Welzenbach showed some slides comparing OCR and manual transcription.

In Phase 1 (2000-2009), TCP transcribed about 25,000 books. In Phase 2 (2009-present), they have transcribed almost 15,000. The cost to process an average book is $200-$250. This averages out to about $2 per book for each of the libraries that joined. About two thirds of the libraries that subscribed to Phase 1 have continued with Phase 2. Additionally, TCP is getting interest from libraries that did not participate in Phase 1.
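The per-library figure follows from sharing the per-book cost among the partners. As a rough check in Python, assuming the cost is split evenly (my notes do not give the number of partner libraries or how costs are actually apportioned), the quoted figures imply somewhere around 100 to 125 partners and a Phase 1 total of roughly $5 million to $6.25 million:

```python
# Figures quoted in the session notes; all are approximate.
COST_PER_BOOK = (200, 250)   # dollars per book, low and high estimate
SHARE_PER_LIBRARY = 2        # dollars per book paid by each library
PHASE1_BOOKS = 25_000        # books transcribed in Phase 1


def implied_partners(cost_per_book, share_per_library):
    """Partners implied by an even split (an assumption, not a TCP figure)."""
    return cost_per_book / share_per_library


def phase_total(books, cost_per_book):
    """Total transcription cost for a phase at a flat per-book cost."""
    return books * cost_per_book


partners = [implied_partners(c, SHARE_PER_LIBRARY) for c in COST_PER_BOOK]
totals = [phase_total(PHASE1_BOOKS, c) for c in COST_PER_BOOK]

print(f"Implied partners: {partners[0]:.0f} to {partners[1]:.0f}")
print(f"Implied Phase 1 cost: ${totals[0]:,} to ${totals[1]:,}")
```

These are back-of-the-envelope numbers derived only from the figures above, not numbers reported in the session.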

The TCP transcription of the texts allows:

  • full-text searching
  • precise searching
  • legible rendering in HTML [in multiple formats, I assume, but I didn’t note them; the underlying markup is SGML/XML-based TEI]
  • accessibility
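To illustrate why keyed, marked-up text makes searching possible where page images do not, here is a minimal Python sketch. The XML fragment is a made-up, heavily simplified TEI-like sample (not an actual TCP file, and without the namespaces and header a real TEI document would carry), and the function name search_paragraphs is my own:

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified TEI-like fragment; the wording is placeholder
# text invented for this sketch, not a real transcription.
SAMPLE = """<TEI>
  <text>
    <body>
      <div type="chapter">
        <head>A sample chapter</head>
        <p>The first paragraph speaks of <hi>ships</hi> and harbours.</p>
        <p>The second paragraph speaks of markets.</p>
      </div>
    </body>
  </text>
</TEI>"""


def search_paragraphs(xml_text, term):
    """Return the text of every <p> element containing term (case-insensitive)."""
    root = ET.fromstring(xml_text)
    hits = []
    for p in root.iter("p"):
        # itertext() also collects text inside nested markup such as <hi>.
        text = " ".join("".join(p.itertext()).split())
        if term.lower() in text.lower():
            hits.append(text)
    return hits


print(search_paragraphs(SAMPLE, "ships"))
```

Because the markup records structure (chapters, headings, paragraphs), a search can be limited to particular elements, which is one plausible reading of the "precise searching" point above.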

She noted that TCP is seeing more consumption of the texts for new tools and technologies and gave some examples:

  • mapping early modern London
  • a project to create XSLT stylesheets to process texts and output them consistently
  • ARC: 18thConnect and others
  • OCR research at Texas A&M
  • MITH [I have a note that this might be a project that Carl spoke of]

After a short break, two guest speakers gave presentations about how TCP is being used at their institutions.

Eric Nebeker from UC Santa Barbara spoke about a project to trace the same story across various works even when it is not attributed to the same author. He also noted that the TCP texts are very useful in teaching his undergraduates.

Harriet Green from the University of Illinois at Urbana-Champaign noted that she often refers students to the TCP collection. She described a student's project to mine the TCP texts.

Finally, more information can be found at the TCP website: http://www.textcreationpartnership.org/.