Categories
Pedagogy UVA Collaboration

My Experience Leading a Workshop on Text Analysis at Washington and Lee University

[Enjoy this guest post by Sarah McEleney, doctoral candidate in the Slavic Languages and Literatures Department at the University of Virginia. She came to W&L to give a workshop in Prof. Mackenzie Brooks’s DH 102: Data in the Humanities course through a Mellon-funded collaboration with the Scholars’ Lab at UVA. More information about this initiative can be found here. This post is cross-listed on the Scholars’ Lab blog.]

As a graduate student participating in the digital humanities collaboration between the University of Virginia and Washington and Lee University, I led a guest workshop on text analysis in fall 2017 in Mackenzie Brooks’s course DH 102: Data in the Humanities. The workshop explored approaches to text analysis in the digital humanities while introducing students to basic programming concepts. For humanities students and scholars, the question of how to begin conducting text analysis can be tricky: platforms do exist that allow one to perform basic text analyses without any programming knowledge, but writing one’s own scripts allows for fine-tuning and tailoring one’s work in highly individualized ways that go beyond the capabilities of popular tools like Voyant. Additionally, Python’s multitude of libraries enables numerous approaches to understanding the subtleties of a given text or corpus of texts. Since the possibilities and directions for text analysis that Python enables are countless, the goal of this workshop was to introduce students to basic programming concepts in Python through the completion of simple text analysis tasks.

At the start of the workshop, we discussed how humanities scholars have used text analysis techniques to produce groundbreaking research, such as Matthew Jockers’s work on the language of bestselling novels. We also considered the different ways that text analysis can be approached, briefly looking at the online text analysis tool Voyant.

For this workshop, students downloaded Python 3 and used IDLE, the simple editor that is installed along with it, so we didn’t have to spend time downloading multiple programs. While IDLE is rather barebones, it works fine for learning the basics of Python, especially if one doesn’t want to install other software. From there, using a script provided to the students, we explored variables, lists, functions, loops, and conditional statements, and their syntax in Python. With these concepts, we were able to track the frequency of chosen words across different sections of a story read in by the script.
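The original workshop script and story are not reproduced here, but a minimal sketch of the same idea might look like the following, with a stand-in text and a hypothetical count_word() helper illustrating the concepts covered (variables, lists, functions, loops, and conditional statements):

```python
def count_word(words, target):
    """Count occurrences of target in a list of words."""
    count = 0
    for word in words:          # loop over the list
        if word == target:      # conditional statement
            count += 1
    return count

# A stand-in "story"; the workshop script read one in from a file.
story = ("the fox ran and the hound followed close "
         "then the quick fox met another fox again")
words = story.split()           # a list of word strings

# Split the word list into two equal sections.
n_sections = 2
size = len(words) // n_sections
sections = [words[i * size:(i + 1) * size] for i in range(n_sections)]

# Track how often "fox" appears in each section.
counts = [count_word(section, "fox") for section in sections]
print(counts)  # -> [1, 2]
```

Even this small exercise touches every basic construct at once, which is why word-frequency counting makes a good first programming task.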

The workshop then delved into a discussion of libraries and how specific Python libraries can enhance one’s work and tailor it to one’s needs. Since the focus was text analysis, the library we looked at was NLTK (the Natural Language Toolkit), which offers a wide variety of functions for natural language processing, such as word_tokenize() and sent_tokenize(), which split a text into individual words or sentences, respectively. The NLTK function FreqDist() simplifies the task of counting all the individual words in a text, something we had done with plain Python in the earlier script. The inclusion of NLTK was meant to show students briefly how important and useful libraries can be when working with Python.

While only so much can be covered in a single workshop, its premise was to show students that some very interesting text analysis is possible with basic Python knowledge, and to dive into Python programming headfirst while learning concepts fundamental to programming in general. As digital humanities methods become more and more common in humanities research, Python’s natural language processing capabilities are a useful tool for humanists. In an introductory class, the goal of my workshop was to spark students’ interest and curiosity and provide a stepping stone for learning more; at the end of the workshop, we discussed further resources for students interested in Python and text analysis.

Categories
DH Event on campus

Double the DH: Check Out These Events On Campus

Calling your attention to two speakers:

Laura I. Gómez on “The Problem is Not in the Code: Racism, Sexism and Inequalities in Tech”

Thursday, November 30, 2017
5pm
Stackhouse Theater

Laura I. Gómez is the founder and CEO of the venture-backed startup Atipica, Inc., a talent discovery engine that combines artificial and human intelligence to help companies unlock the lifetime value of their talent data. She is also a founding member of Project Include, a non-profit that uses data and advocacy to promote inclusion in the tech industry. Gómez will talk about the importance of the diversity and inclusion efforts she has helped foster and why we need them to combat racism, sexism, and discrimination in the technology industry.

This event is sponsored by the Roger Mudd Center for Ethics at Washington and Lee University. It is part of the 2017-18 Equality and Difference speaker series. 


Deen Freelon on “Computational Communication Research is a Thing”

Thursday, November 30, 2017
5pm
Northen Auditorium

Deen Freelon is an associate professor in the School of Media and Journalism at the University of North Carolina. He will give an introduction to using computational methods to study online communication. He uses such methods in his own work on political expression in digital and social media, including movements like the Arab Spring and Black Lives Matter.

This event is hosted by the Washington and Lee University Journalism Department. 

 

Both are on Thursday, November 30, 2017 at 5pm, so the choice is yours!

Categories
DH Event on campus Speaker Series

DH Speaker Series: Dr. Sarah Bond

Join us for a talk by Dr. Sarah Bond, Assistant Professor in Classics at the University of Iowa. She will be speaking on “Why We Need to Start Seeing the Classical World in Color.”

Monday, November 13, 2017
5pm
Northen Auditorium


Why We Need to Start Seeing the Classical World in Color

Sarah E. Bond
Department of Classics
University of Iowa

In an essay, Sarah Bond writes, “The equation of white marble with beauty is not an inherent truth of the universe; it’s a dangerous construct that continues to influence white supremacist ideas today.” Bond continues her exploration of color perceptions in the ancient world with a discussion of polychromy and the technology used to restore the colors of statues and other pieces of ancient Roman art. She prompts us to wonder: what is the relationship between color and our cultural values? Why is white marble considered the epitome of beauty? What influenced this perception, and how can we challenge it? What does it say about us? Bond’s talk will raise these questions about cultural values and explore what the absence of color really means.

Dr. Sarah Bond is an assistant professor in Classics at the University of Iowa. She is a digital humanist, who is also interested in late Roman history, epigraphy, late antique law, Roman topography, and the socio-legal experience of ancient marginal peoples. Bond earned her Ph.D. in History from the University of North Carolina at Chapel Hill and her B.A. in Classics and History with a minor in Classical Archaeology from the University of Virginia. During the 2011-2012 academic year, she was a Mellon Junior Faculty Fellow in Classics and History at W&L.

This event is made possible by a grant from the Andrew W. Mellon Foundation and a Dean of the College Cohort Grant. It is co-sponsored by the Washington and Lee History Department.