MHLonArchiveSpark Development for the Digital Humanities
Hosted by one of our member institutions in New York, Boston, New Haven, Philadelphia, or San Francisco, the fellow will develop a user-friendly web interface and author supporting workflows to make MHLonArchiveSpark functionality more broadly accessible to researchers. The interface and workflows should better facilitate: 1) using the MHL’s Advanced Search Tool to identify a set of texts meeting user criteria and retrieving all of them from the Internet Archive, and 2) using ArchiveSpark to extract the full text of a result set (including metadata) for the purpose of performing additional queries against that set. ArchiveSpark is an open-source Apache Spark framework for data processing, extraction, and derivation for Web archives and archival collections, developed by the Internet Archive and L3S Research Center.
Additional products of this project could include a number of canned recipes for searching content using MHLonArchiveSpark and new approaches to making extraction and analysis easier.
For more information about ArchiveSpark, visit the following:
- http://mhl.countway.harvard.edu/search/ (ArchiveSpark utilizes the full-text search)
- http://l3s.de/~holzmann/papers/archivespark2017bigdata.pdf
The fellowship is paid and may be taken for course credit.
DUTIES AND RESPONSIBILITIES:
- Based on the input of MHL members and others, assess user needs, propose possible solutions to enhance MHLonArchiveSpark functionality, and implement the chosen approach.
- Create a number of canned recipes for searching content with ArchiveSpark.
- Create user-friendly documentation for the purposes of increasing the use and reach of MHLonArchiveSpark.
QUALIFICATIONS AND EXPERIENCE:
This position is open to all qualified graduate students with a strong interest in the digital humanities and computer science, including API development, with additional interests in library/information science or education. Strong communication and collaboration skills are a must. Fellows are expected to learn quickly and work independently.
Education and Outreach Fellow
Medical Heritage Library, Inc.
DUTIES AND RESPONSIBILITIES:
- Based on the input of MHL members and others, work on the creation of curated sets of materials drawn from MHL collections.
- Develop educational materials tied to K-12 and/or university-level curricula.
- Enrich MHL metadata to highlight underrepresented topics in our Internet Archive collections.
- Regularly create blog posts and other types of social media content for posting to MHL accounts.
- Other duties as assigned.
QUALIFICATIONS AND EXPERIENCE:
This position is open to all qualified graduate students with a strong interest in medical or health history, with additional interests in library/information science or education. Strong communication and collaboration skills are a must. Fellows are expected to learn quickly and work independently.
The fellowship will take place anytime between the end of May 2019 and mid-August 2019.
The commitment is 20 hours per week over 12 weeks.
To apply, please provide the following:
- Cover letter documenting interest in position
Please submit your application materials by April 1st, 2019 to: Attn: Fellowship committee