
Digital Humanities

An introduction to digital humanities resources both at and beyond Duke.

Schedule a consultation

Hannah Jacobs
she/her/hers
Schedule Appointment
Subjects: Humanities

Don't know where to start? Contact a Subject Specialist.

Acknowledgments

This guide was created by Heidi Madden and Arianne Hartsell-Gundy.

It was updated and revised by Julia Glauberman in spring 2016.

It was updated by Arianne Hartsell-Gundy, Heidi Madden, Liz Milewicz, and Will Shaw in 2021.

It was revised in 2025 by Hannah Jacobs.

How to choose

Digital humanities (dh) methods span a wide range, reflecting the many disciplines engaged in dh research. From text and data to visualizations and sound, there are many possible approaches and tools you might apply.

How do I choose a dh method?

It's most important that you choose a method that helps you answer your research question and/or meet your goal. Your method must also be able to work with the sources you have. For example:

  • You are studying a building's architectural history and want to understand its changes over time. You have access to both original and renovation plans. You choose 3D modeling as your method because it will help you visually analyze the changes, and you can build your model based on the architectural plans.
  • You are interested in literary history, and you want to understand what kinds of themes developed out of a particular movement through distant reading. The literature you are studying is out of copyright, and you have access to the full text of several hundred works. You choose topic modeling as your distant reading method to explore and analyze the texts at scale.
  • You are working with a community to research their history and need a way to share what you've learned. The community wants to create an exhibit at the local library, which has access to a touchscreen. You work with community members to design an interactive storymap that presents your research.

Another way to approach choosing a method is to consider the question words in relation to your visualization. This table offers an overview of which methods might support which question words:

| Methods | Who? | What? / Which? | Where? | When? | How? |
|---|---|---|---|---|---|
| quantitative (e.g. data visualization, distant reading) | x | x | x | x |   |
| temporal (e.g. timelines) |   |   |   | x | x |
| spatial (e.g. mapping) |   |   | x |   | x |
| dimensional (e.g. 3d modeling) |   | x | x |   | x |
| narrative (e.g. storymaps, exhibits, virtual reality) | x | x | x | x | x |
| archival (e.g. data collection & cleaning) | x | x | x | x | x |
| network (e.g. social network analysis) | x | x |   |   | x |

Data Management

Regardless of which method you choose, you will have data that you need to manage. The libraries offer guidance and workshops on data management. If you are working on a grant-funded project, you'll likely need a data management plan. Regardless, you'll want to think early in your project about practices such as file naming and organization, file versioning, storage & backup, documentation, and more.
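A consistent file naming convention is easier to keep up when a small script builds the names for you. Here is a minimal sketch in Python. The pattern (project, description, ISO date, two-digit version) and the function name `build_filename` are assumptions chosen for illustration, not a Duke standard, so adapt them to your own project.

```python
import re
from datetime import date


def build_filename(project: str, description: str, created: date,
                   version: int, ext: str) -> str:
    """Build a name like 'project_description_2025-03-14_v02.tif'.

    The pattern is a hypothetical convention used only for this example.
    """
    def slug(text: str) -> str:
        # Lowercase, then turn every run of non-alphanumeric characters
        # into a single hyphen so names are safe across operating systems.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    extension = ext.lower().lstrip(".")
    return (f"{slug(project)}_{slug(description)}_"
            f"{created.isoformat()}_v{version:02d}.{extension}")


if __name__ == "__main__":
    print(build_filename("Mill Village Letters", "Scan batch #3",
                         date(2025, 3, 14), 2, ".TIF"))
```

Putting the date in ISO format (year-month-day) means that files sort chronologically when sorted by name, and the zero-padded version number keeps v02 ahead of v10.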

Before you begin

You may want to think critically about how you approach working with data. Scholars from a variety of fields are thinking about

These and many other theoretical framings can help you make decisions regarding not only how you gather and structure data but also which methods you engage.

Structuring data

Especially if you are working with text or other unstructured data, you may need to spend some time structuring your data before you can move into your chosen method. Here are some tools, approaches, and resources to help you get started:

 

Preparing text

If you are beginning with printed or analog material, you may need to scan these documents before you can work with them. Scanning will enable you to work with the materials outside of the archive or storage location, share them with others if you are working collaboratively, and perform computational tasks that will support your data structuring and management.

If you have images or PDFs of text that you'd like to make machine readable, you may need to transcribe your sources and/or go through processes called Handwritten Text Recognition (HTR) and Optical Character Recognition (OCR). This step can take longer than all the other parts of your project combined, so take time to figure out the best platform and method for your project. Here are some tools and resources to help you get started:

Handwritten Text Recognition (HTR)

If you're working with handwritten sources, you may be able to use a semi-automated process, handwritten text recognition, to transform the handwriting into machine-readable text. Try some of the following:

  • Basic: 
  • Customizable, no coding, best for historic scripts:
    • eScriptorium: a free platform for HTR, OCR, and manual transcription; self-hosting required; led by a team at the laboratoire AOROC at the École Pratique des Hautes Études – Université Paris Sciences et Lettres, in Paris. Read this quick start guide to learn more. This blog post describes one use case.
    • HandwritingOCR: commercial, paid platform; AI is used to assist with formatting outputs; minimal export formats; API access available.
    • Leo: a freemium LLM-powered platform, great for secretarial or cursive scripts and currently best for Western languages.
    • Transkribus: a freemium collaborative platform for applying HTR, OCR, and manual transcription to historic texts; developed by READ-COOP, a cooperative of 200 academic institutions. Some large language model integration; ability to train models on historic scripts (see, for example, their model for Latin Carolingian Minuscule). Broad range of mostly interoperable export formats & API access available.
  • Customizable, coding in Python required:
    • Kraken: a turn-key OCR system optimized for historical and non-Latin script material. This blog post describes one use case.
    • Handprint: a command-line program that invokes HTR (handwritten text recognition) services on images of document pages. It can produce annotated images showing the results, compare the recognized text to expected text, save the HTR service results as JSON and text files, and more. Here's a tutorial.
    • OpenCV & PyTesseract: Here's a bare-bones example of how to use two Python libraries for basic HTR.
    • PyLaia: a device agnostic, PyTorch based, deep learning toolkit for handwritten document analysis. 
    • Python & Microsoft Azure: this tutorial shows you how to write a Python program to transcribe handwritten documents using Microsoft’s Azure Cognitive Services, a commercially available service that has a cost-free option for low volumes of use. If you have enough volume that you must pay for Azure, contact OIT.
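Whichever HTR tool you try, it helps to check its output against a short passage you have transcribed by hand. One common measure is the character error rate (CER): the number of single-character edits needed to turn the recognized text into the expected text, divided by the length of the expected text. The sketch below uses only the Python standard library. It is an illustration of the general idea, not the comparison method used by Handprint or any other tool listed above, and the function names are assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, and substitutions."""
    # previous[j] holds the distance between the characters of `a`
    # processed so far and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def character_error_rate(recognized: str, expected: str) -> float:
    """Edits needed per character of the hand-checked transcription."""
    if not expected:
        raise ValueError("expected text must not be empty")
    return edit_distance(recognized, expected) / len(expected)


if __name__ == "__main__":
    print(character_error_rate("Deer Sir", "Dear Sir"))
```

A lower rate is better. Comparing the rate across two or three tools on the same sample pages gives you a rough, project-specific basis for choosing a platform.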

Interested in learning more about HTR tools? Check out these papers:

  • AlKendi W, Gechter F, Heyberger L, Guyeux C. Advancements and Challenges in Handwritten Text Recognition: A Comprehensive Survey. Journal of Imaging. 2024 Jan 8;10(1):18. https://doi.org/10.3390/jimaging10010018.
  • Chinthaginjala, R., Dhanamjayulu, C., Kim, Th. et al. Enhancing handwritten text recognition accuracy with gated mechanisms. Sci Rep 14, 16800 (2024). https://doi.org/10.1038/s41598-024-67738-8.
  • Garrido-Munoz, Carlos, Antonio Rios-Vila, and Jorge Calvo-Zaragoza. "Handwritten Text Recognition: A Survey." arXiv. uploaded 12 Feb. 2025. https://arxiv.org/html/2502.08417v1
  • Humphries, M., Leddy, L. C., Downton, Q., Legace, M., McConnell, J., Murray, I., & Spence, E. (2025). Unlocking the archives: Using large language models to transcribe handwritten historical documents. Historical Methods: A Journal of Quantitative and Interdisciplinary History, 58(3), 175–193. https://doi.org/10.1080/01615440.2025.2500309.
  • Romein, C.A., Rabus, A., Leifert, G. et al. Assessing advanced handwritten text recognition engines for digitizing historical documents. International Journal of Digital Humanities 7, 115–134 (2025). https://doi.org/10.1007/s42803-025-00100-0.

Transcription

If you're working with a script that HTR tools have difficulty recognizing, you're concerned about the privacy policies of HTR tools, or you wish to engage in a manual process for epistemological or ethical reasons, there are a number of transcription tools that can help you (in addition to eScriptorium and Transkribus listed above). For manual transcription, you could start with a folder of Microsoft Word documents. If you need to collaborate, you could migrate these into Duke Box or Google Drive. If you're looking to work with a broader community or integrate with a larger project, here are some platforms to try:

  • FromThePage: community-based transcription platform. Great for collaborative and crowdsourced projects.
  • Scripto for Omeka: a tool that integrates with Omeka S & Omeka Classic sites to incorporate transcription with digital archive building.
  • Zooniverse: a crowdsourcing platform that is great for discrete tasks like transcribing image captions or handwritten forms or tagging documents.

Looking for resources to teach students about transcribing historical texts? This Script Tutorial from Brigham Young University and this general introduction to transcription from eLaboratories are two places to start.

Optical Character Recognition (OCR)

OCR focuses specifically on automated transcription of printed text, so it's best used for books, newspapers, and other mass-produced documents. Here are some tools to get you started. Note that some HTR tools listed above can also do OCR.

 

Extracting text from audio & video

If you're working with audio or video and want to generate text for archiving or for analysis, here are some tools that can help:

  • Manual: these tools require a human to do all of the transcribing work. This is helpful for difficult-to-understand audio, uncommon vocabularies, and when privacy is a priority.
  • Automated with manual correction:
    • Descript: a paid platform that can auto-generate transcripts and let you edit them. You can even use transcript editing to (optionally) edit video recordings themselves. Contact Duke OIT regarding possible licensing options or similar platforms.
    • Microsoft Word: upload an audio file and generate a Word document that you can edit.
    • OHMS - Oral History Metadata Synchronizer: a combined transcription and sharing platform; comes with manual and (paid) automated transcription options.
    • Rev: paid human or AI transcription.
    • Sonix: browser-based collaborative transcription platform. Platform starts at $10/hour for transcription. Supports 40+ languages.
    • Trint: another paid, browser-based collaborative platform. Also supports 40+ languages.

 

Adding structure to text

Once you have machine-readable text, you might need to structure it to support your planned analyses. (If you're planning to do natural language processing or basic text analysis, though, this step might not be necessary.)

  • Cleaning errors in HTR/OCR/transcribed text: In addition to manually cleaning text, it's possible to use regular expressions to clean text based on patterns you recognize in the text. This tutorial offers one introduction. Regular expressions (sometimes referred to as regex) can be integrated into Python for advanced document clean-up. Tools like Text Fixer can also help address some of the same issues, without learning the regex syntax.
  • Tag your text with TEI: TEI, or the Text Encoding Initiative, provides a standard for encoding, or tagging, text in a standardized way using XML. This approach can be helpful for text analysis, publishing critical digital editions, and more. Here are a bunch of resources to help you learn more.
  • Create a data set: You might alternatively want to pull out key metadata, or parts of the text, to create a data set. This tutorial, as well as this open educational resource, offer examples of how to do this. Check the standardizing data section below for more tools.
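To give a sense of what regular-expression clean-up can look like in Python, here is a minimal sketch using the built-in `re` module. The three patterns target problems that often appear in OCR output (words hyphenated across line breaks, hard line breaks inside paragraphs, and runs of extra spaces). They are assumptions about what your text might contain, so test them on a sample before running them on a whole corpus.

```python
import re


def clean_ocr_text(raw: str) -> str:
    """Apply a few pattern-based fixes to OCR or transcribed text."""
    # Rejoin words split by a hyphen at a line break: "manu-\nscript".
    # Caution: this also joins true compounds that happened to break
    # at the end of a line (e.g. "well-\nknown" becomes "wellknown").
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Turn single line breaks into spaces, but keep blank lines,
    # which usually mark paragraph boundaries.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tabs into one space.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


if __name__ == "__main__":
    sample = "The manu-\nscript was   found\nin 1850.\n\nIt was sold."
    print(clean_ocr_text(sample))
```

Keeping each fix as its own commented step makes it easier to document what you changed, and to switch off a rule that turns out to damage your particular sources.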

 

Organizing images

The following tools are great not only for organizing all those photos you took in the archive, but also for browsing and analyzing your collected materials:

  • Tropy: a free desktop platform in which you can use folders, metadata, and tags to organize all of your images and search through them. 
  • Recogito: annotate images and documents, and use those annotations to create a dataset for visualization and further analysis. Also check out Recogito Studio, a new paid version of the platform with extended capabilities. 

 

Standardizing data

Once you have your data, you may find that you need to structure it: many dh methods rely on spreadsheets or other simple data structures like JSON. You'll need to think about what kinds of data fields will support your research question and method. Here are some basic and more advanced methods and tools to help you:
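As a small illustration of moving between a spreadsheet and JSON, the sketch below reads CSV text, tidies the column names, trims stray spaces, and marks empty cells explicitly. It uses only the Python standard library. The function name and the sample columns (`Title`, `Place`, `Year`) are hypothetical and stand in for whatever fields support your own research question.

```python
import csv
import io
import json


def standardize_rows(csv_text: str) -> list[dict]:
    """Read CSV text and return records with tidy keys and values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        record = {}
        for field, value in row.items():
            if field is None:
                # Extra cells beyond the header row are skipped here.
                continue
            # "Place " and " place" both become the key "place".
            key = field.strip().lower().replace(" ", "_")
            value = (value or "").strip()
            # Use None (null in JSON) so missing data is explicit.
            record[key] = value if value else None
        records.append(record)
    return records


if __name__ == "__main__":
    sample = "Title, Place ,Year\nLetter to Mary, Durham ,1850\nDiary,,\n"
    print(json.dumps(standardize_rows(sample), indent=2))
```

Recording missing values as null, rather than as an empty string or a guess, keeps gaps in the archive visible in later analysis.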

3D Modeling & Extended Reality

3D modeling can help you analyze spaces and/or objects. 3D modeling combined with virtual or augmented reality (collectively: extended reality) can be ways of sharing scholarship in the classroom, in exhibitions, and beyond.

3D modeling approaches & tools

There are multiple approaches to creating a digital 3D model. You might use a camera to take many photos of a room or object and use software to stitch those images together (photogrammetry); you might use a laser scanner to collect and stitch together spatial data; or you might build a digital model in software from scratch based on historical plans. There are many possibilities.

Virtual & Augmented Reality platforms

There are many ways to approach the creation of virtual and augmented reality (VR & AR). Here are some of them:

  • Gaming engines that support virtual & augmented reality development. Coding may be helpful in these platforms, which have complex interfaces but in some cases offer visual programming options:
  • Web-based VR & AR. Coding required!:
    • A-Frame: a framework for building VR & AR on top of HTML.
    • AR.js: JavaScript library for augmented reality.
    • BabylonJS: JavaScript library for 3D visualization on the web.
    • <model-viewer>: JavaScript library for 3D & AR.
    • ThreeJS: JavaScript library for 3D visualization on the web.
  • Mobile apps:
    • Stqry: pronounced "story," a mobile tour app with augmented reality capabilities built in. Get access through Duke.
    • Vuforia Engine: commercial platform that extends Unity3D's AR capabilities.
    • Zappar: a commercial platform for creating AR experiences for mobile devices.

Looking for help with a VR or AR project? Look no further than Duke's Virtual Reality Studio.

Data Visualization

Data visualization can help you analyze and communicate quantitative information about your research. The Center for Data & Visualization Science in Duke Libraries offers workshops, recordings, guides, and consultations to support data visualization. In addition, here are some resources for creating basic visualizations:

  • Canva: a freemium platform that provides templates you can fill in to create static graphs.
  • Code Libraries: there are numerous data visualization libraries created for the coding languages JavaScript, Python, and R. Try D3.js, Altair, Bokeh, Seaborn, or ggplot2.
  • Excel: build directly in your favorite desktop spreadsheet software. 
  • Flourish: freemium online platform that provides a helpful data visualization interface.
  • Google Sheets: build directly in your favorite online spreadsheet platform.
  • Raw Graphs: a free and open source platform for data visualization.
  • Tableau: a robust desktop software and sharing platform for creating a variety of visualizations. Free versions are available for students and educators. Get started here.

Maps & Timelines

Maps and timelines can be useful tools to analyze and share topics that change over time, move across space, and/or operate at multiple scales. Some mapping and timeline tools are listed below.

Mapping

For support with mapping, reach out to askdata@duke.edu.

  • Web mapping & storytelling: these tools will help you publish and share spatial stories online. 
    • ArcGIS Online & ArcGIS StoryMaps (Free access via Duke): in addition to building maps, apps, and storymaps online, these tools enable you to integrate your GIS analyses.
    • Carto
    • Google MyMaps: a great platform for mapping simple datasets. Handles points, lines, and polygons, with the possibility of export for use in other software. Good for mapping many places at once. It's possible to use colors and icons to categorize those places.
    • Leaflet: an open-source JavaScript library for building custom web maps. Coding required!
    • Neatline: a free add-on to Omeka, a digital archiving platform. Neatline provides a flexible space to map stories in geospatial and non-geospatial contexts. 
    • StoryMapJS: map movement through space from point to point and include text, images, video, and other media to tell your story. This free platform from Knight Lab is great for mapping travel diaries, diasporas, and other spatial stories.
  • GIS & geospatial analysis: these tools will help you map large and complex datasets and conduct geospatial analyses.
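If you keep your places in a spreadsheet or list, one widely supported way to hand them to mapping software is GeoJSON, an open format that Leaflet, among other tools, can read. Here is a minimal sketch that turns a list of places into a GeoJSON FeatureCollection of points. The input field names (`name`, `lat`, `lon`, `category`) are assumptions for this example. Note that GeoJSON lists longitude before latitude.

```python
import json


def places_to_geojson(places: list[dict]) -> dict:
    """Convert place records into a GeoJSON FeatureCollection of points."""
    features = []
    for place in places:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON coordinate order is [longitude, latitude].
                "coordinates": [place["lon"], place["lat"]],
            },
            "properties": {
                "name": place["name"],
                # Optional field; None if the record has no category.
                "category": place.get("category"),
            },
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    sample = [{"name": "Durham", "lat": 35.994, "lon": -78.8986,
               "category": "city"}]
    print(json.dumps(places_to_geojson(sample), indent=2))
```

Swapping latitude and longitude is one of the most common reasons points land in the wrong hemisphere, so it is worth checking one familiar place on the map before loading a full dataset.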

Timelines

The following web-based tools enable timeline creation in a variety of formats (interactive and static, digital and print-ready). When selecting your timeline tool, be sure to think about how your audience will interact with the timeline and what kinds of media you might want to integrate into your timeline.

  • Canva: freemium platform that provides templates and guidance for creating static graphics ready for print and social media.
  • Sutori: freemium platform for creating a vertical, scrollable, and presentation-ready timeline with text and images. Paid version allows for embedding other media and adding in quizzes, among other features.
  • TimelineJS: a popular digital humanities tool, TimelineJS is a free platform from Knight Lab that operates like a slide deck. Events are managed in a Google Sheet that is then connected to TimelineJS for formatting. Embed many types of media, group events, and add larger context in eras to your timeline.
  • TikiToki: freemium platform that lets you create an interactive timeline with images, audio, and video. Events are represented as speech bubbles.

Network Diagrams & Analysis

If you are interested in studying many relationships at once, network diagramming and network analysis might be for you. The difference between diagramming and analysis is in your method:

  • Network diagrams can help you visually and qualitatively interpret your dataset. 
  • Network analysis can help you quantitatively interpret your dataset. This approach often includes the creation of one or more diagrams to help communicate your analysis.

Here are two places to learn more about networks.
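To make the quantitative side of network analysis concrete, here is a minimal sketch of one of the simplest measures, degree centrality: the share of other people (or places, or texts) in the network that each node is directly connected to. It uses only the Python standard library and treats the network as undirected. The names in the sample edge list are invented for illustration.

```python
from collections import defaultdict


def degree_centrality(edges: list[tuple[str, str]]) -> dict[str, float]:
    """Fraction of the other nodes each node is directly connected to."""
    neighbors = defaultdict(set)
    for source, target in edges:
        if source == target:
            continue  # ignore self-loops in this simple sketch
        # Undirected: record the connection in both directions.
        neighbors[source].add(target)
        neighbors[target].add(source)
    node_count = len(neighbors)
    if node_count < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(linked) / (node_count - 1)
            for node, linked in neighbors.items()}


if __name__ == "__main__":
    letters = [("Ada", "Ben"), ("Ada", "Cleo"),
               ("Ben", "Cleo"), ("Cleo", "Dev")]
    for name, score in degree_centrality(letters).items():
        print(name, round(score, 2))
```

In this sample, Cleo is connected to everyone else, so her score is the highest. Dedicated tools compute many more measures, but an edge list like this one is usually the starting data structure for them as well.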

Tools for Network Diagrams

  • Gephi Lite: a free and open-source web application to visualize and explore networks and graphs. It is a web-based, lighter version of Gephi.
  • GraphCommons: freemium proprietary platform. Upload your data or draw by hand. Some analytical tools included.
  • Kumu: freemium proprietary platform. In addition to creating diagrams, Kumu lets you embed them in other sites and create stories that contextualize your diagrams. Free projects must be made public. Some basic analytical tools included.
  • Palladio: a free platform from the Stanford Humanities + Design Lab, Palladio can visualize data in multiple forms, including as both spatial and non-spatial network graphs.
  • More network diagram tools, some of which require coding.

Tools for Network Analysis

The following tools can also be used for creating diagrams but include powerful analytical functionalities.

  • Cytoscape: free, open-source desktop software for network analysis. The Cytoscape community has built a number of apps, add-ons for the core software, that extend its functionality. Originally designed for biological sciences, Cytoscape can be adapted for social and other forms of network analysis.
  • Gephi: free, open-source desktop software for network analysis. Gephi also includes a number of plugins to extend its functionality.

Text Analysis

Text analysis tools can support both interpretive and quantitative analyses of text. They are helpful for examining and visualizing patterns, enumerating word frequencies and collocations, identifying clusters of topics, reading across texts, and more. Here are some tools to get started. If you don't find what you're looking for, try this how-to guide and this directory.

Have sources that you need to make computationally readable first? Check out the preparing text section above, which includes information on Handwritten Text Recognition (HTR) and Optical Character Recognition (OCR).
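If you'd like to see what word frequencies and collocations involve before picking a tool, here is a minimal sketch using only the Python standard library. It counts single words and adjacent word pairs (bigrams, a very simple stand-in for collocations). The tokenizer is deliberately basic: it lowercases the text and splits on anything that is not a letter, digit, or underscore, so contractions and hyphenated words are broken apart. The function names are assumptions for this example.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"\w+", text.lower())


def word_frequencies(text: str, top_n: int = 5) -> list[tuple[str, int]]:
    """Return the most frequent words with their counts."""
    return Counter(tokenize(text)).most_common(top_n)


def bigram_frequencies(text: str, top_n: int = 5):
    """Return the most frequent pairs of adjacent words."""
    tokens = tokenize(text)
    # zip pairs each token with the one that follows it.
    return Counter(zip(tokens, tokens[1:])).most_common(top_n)


if __name__ == "__main__":
    sample = "The sea was calm. The sea was wide, and the sea was cold."
    print(word_frequencies(sample, 3))
    print(bigram_frequencies(sample, 2))
```

Tools such as AntConc and Voyant Tools add stop-word lists, concordance views, and statistical measures on top of counts like these.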

Basic Text Analysis

  • AntConc: a free desktop software great for word frequencies, collocations, and concordancing. 
  • Voyant Tools: a free web platform that provides numerous ways to visualize and examine text at a distance.
  • DiffCheck: online text & table comparison; privacy forward: does not store data beyond your current session. 
  • Concordle: a very basic copy-paste word cloud and concordance tool. 
  • Lexos: a free web platform for text analysis and visualization that includes a web scraping feature.
  • Orange: open source software for data mining that can be used for text analysis.

Advanced Text Analysis

  • Bookworm: a free and open-source web infrastructure for text analysis. Some systems administration required.
  • MALLET: MAchine Learning for LanguagE Toolkit, which provides sophisticated analytical tools. 
  • ProQuest TDM Studio: text and data mining tools for content on ProQuest. Contact askdigital@duke.edu to learn how to access.
  • WordSeer: a text analysis environment that combines visualization, information retrieval, sensemaking and natural language processing to make the contents of text navigable, accessible, and useful. You can run WordSeer on your local machine or on a shared server, and you access it via your modern web browser of choice.

Learn more about text analysis in our how-to guide.

Digital Publishing: Websites & Apps

The digital publishing platform (content management system) you choose should match your project goals and your resources. There are several decisions you'll need to make:

  • How much funding do you have to support your project, and for how long? Hosting costs can vary, and if you need a developer, that could significantly increase your cost.
  • How much time do you have to dedicate to building and maintaining the project, which might include learning some new skills?
  • Do you want to host your project on a Duke-managed platform or on an external service you manage? You'll want to consider both flexibility & security when answering this question:
    • Hosting your project on an external service will offer considerable flexibility in how you customize your project; however, it also opens you up to a range of security threats. You will also be responsible for managing any software updates and other maintenance needs over time.
    • Hosting your project with a Duke service can still require some maintenance, and may limit your flexibility, but it offers far more security protections.
    • Note that all of the options listed below, except for Google Sites, can be hosted either on your own or via a Duke service.
  • Which platform will support your project best? Are you aiming to share blog posts and multimedia? Do you need to create a custom data structure? Are you sharing a community-driven digital archive? Or are you creating a digital publication that will not need to be edited once it's complete?

Duke Hosting Platforms

Whichever platform you choose will need a hosting provider. Check out the Duke-managed hosting options or the external hosting options listed below.

External Hosting Platforms

If you are considering an external hosting platform, check out these services:

  • Reclaim Hosting: made by academics, for academics; includes one-click install for platforms like Omeka.
  • GreenGeeks: an affordable hosting provider with a mission to reduce environmental impact.
  • Namecheap: a cost-effective hosting provider.
  • GitHub Pages: a great option for free hosting if you are building a custom app or a static site.

If you choose to host a website external to Duke, it's highly recommended that you not use your Duke credentials when creating website admin accounts. NEVER use your, or anyone else's, NetID when setting up user accounts.

Blogging & Multimedia

  • WordPress: one of the most commonly used content management systems in the world. Built for blogging, though it can be customized with plugins to fit your data structure and sharing needs. Access WordPress for free via Sites@Duke Express, roll your own with a Duke VM, host with WordPress.com, or host with one of the external providers listed above.
  • Google Sites: hosted by Google, this platform offers simplicity: create a set of pages and embed multimedia content. Access via your own Google account or via a Duke Google Workspace account.

Custom Content Management

  • Drupal: If you will have a complex data structure, need modularity, and have specific interaction needs, then Drupal might be for you. Sites@Duke Pro is built in Drupal, but comes with a price tag. You can also host it externally on many different hosting providers.

Digital Archives & Exhibits

  • Omeka: designed for creating digital collections, Omeka helps you structure your archival data and create webpage-based exhibits. Two versions, Classic and S, offer different levels of data management, extensibility, and sharing. Access a basic version of Omeka Classic for free or with a paid subscription via Omeka.net. You can also roll your own with a Duke VM or host with one of the external providers listed above. Reclaim Hosting has a one-click install for Omeka.
  • CollectionBuilder: an open-source framework for creating digital collection & exhibit websites. Built on Jekyll and GitHub, CollectionBuilder is hosted on GitHub Pages and requires some learning of web technologies to get started. However, once you've built it, you will need to do very little to maintain it.
  • Mukurtu: designed with and for Indigenous communities, Mukurtu is an archiving and community-based platform that empowers community members to document their cultural heritage and manage multiple levels of access based on their cultural norms. Roll your own with a Duke VM or host with one of the external providers listed above.
  • Wax: another open-source framework for creating scholarly exhibitions. Wax can be run on GitHub Pages or a number of other platforms.

Oral Histories

Digital Books & Catalogs

While you could choose any of the above to create a digital book or catalog, the following platforms are designed specifically for this purpose:

  • Lantern: an open-source template designed for open educational resources in multiple formats.
  • Quire: an open-source multiformat publishing tool designed for longevity, discoverability, and scholarship. Developed by The Getty, Quire is optimal for digital catalogs as well as digital books. As a static site, it can be hosted just about anywhere and needs no maintenance once it's published.
  • Manifold: a platform adopted by publishers like the University of Minnesota Press, Manifold is designed for open access digital book publishing. It can be self-hosted, or you can sign up for managed hosting. It is, however, costly.

Digital Publishing Platforms at Duke

Duke offers several publishing options for researchers who want to share their work, create digital supplements to printed scholarship, or experiment with new forms of scholarly communication. The table below lists the most commonly used platforms that are available to most (or all) Duke users, locally supported, and public-facing.

Not sure which platform to choose? Schedule a consultation: askdigital@duke.edu.

Sites@Duke Express 

A free WordPress-based website hosting service. It is ideally used for blogs, course websites, project or group websites, and individual portfolios. (More information...)

Typical User: Any Duke affiliate who wants to publish content in a locally supported platform equipped with common plugins and capabilities. 

Cost: Free. Renewal required every 5 years.

Sites@Duke Pro

A semi-custom Drupal-based website hosting and development service. (More information...)

Typical User: Departments, programs, initiatives, and large campus organizations.

Cost: $3,000 initial cost + $250/month maintenance (More information...)

Duke Web Services custom development

An OIT development service that creates completely custom websites and helps with building content and publication workflows. (More information...)

Typical User: Large projects or organizations in need of a publishing option beyond what’s possible with Sites@Duke Pro.

Cost: Variable, but generally expensive

OIT Virtual Machines (VMs)

Essentially standalone servers that allow complete software customization for setting up research and publishing solutions.

Typical User: Users or groups with a high level of technical expertise and the capacity to maintain both the server and publishing software.

Cost: VCM & RAPID VMs are free (time-limited, experimental, user-maintained). VMware hosting*** is paid (extra storage and RAM, long-term use, optional OIT administration and support). OIT support ranges from $170-585 / yr. Storage and support prices are available at https://oit.duke.edu/help/articles/kb0025194.

***VMware hosting support will be discontinued April 1, 2026. OIT will be migrating to a new provider beginning in Fall 2025. Please reach out to OIT for more information.

Research Data Repository

A service allowing researchers to archive and share important data related to their work. (More information...)

Typical User: Duke affiliates who need to allow long-term access to, citation of, and preservation of their data.

Cost: Free up to 300GB per deposit for Duke researchers (defined as graduate, post-doctoral, research staff, and faculty). Contact datamanagement@duke.edu to inquire about additional storage needs and data preservation for grant applications.

DukeSpace

An open-access repository for publications by Duke authors. (More information...)

Typical User: Duke affiliates who need an openly-available location to share dissertations, theses, and other published scholarship.

Cost: Free.

ArcGIS Online/StoryMaps

Tools for publishing interactive maps and rich multimedia narratives. (More information...)

Typical User: Scholars who want to experiment with new forms of publication, share geospatial analyses or map-based tools, or collaboratively build interactive maps.

Cost: Free.

Tableau

Often used for publishing data visualizations. (More information...)

Typical User: Users who would like to create visualizations based on their data and share them publicly.

Cost: Free one-year renewable license.

Other lists of tools & methods

There are many digital tools available for use in the digital humanities, some made specifically for dh and others that can be repurposed quite effectively for humanities research. The following lists cover a wide variety of tools. Consult with a librarian or with staff at a support center to identify the best tools for your project: