Sravana Reddy

sravana.reddy at wellesley dot edu

I am a Hess Fellow at Wellesley College Computer Science, working on speech and natural language processing. I received my PhD from the University of Chicago, and spent time at Dartmouth College as a Neukom Fellow and at USC/ISI as an intern.

Research

My interests include unsupervised learning, pronunciation modeling, and applications of NLP to linguistics, the humanities, and the social sciences.

Publications

Toward completely automated vowel extraction: Introducing DARLA. Sravana Reddy and James N. Stanford. Linguistics Vanguard (2015).

preprint paper

Automatic Speech Recognition (ASR) is reaching further and further into everyday life with Apple's Siri, Google voice search, automated telephone information systems, dictation devices, closed captioning, and other applications. Along with such advances in speech technology, sociolinguists have been considering new methods for alignment and vowel formant extraction, including techniques like the Penn Aligner (Yuan and Liberman, 2008) and the FAVE automated vowel extraction program (Evanini et al., 2009, Rosenfelder et al., 2011). With humans transcribing audio recordings into sentences, these semi-automated methods can produce effective vowel formant measurements (Labov et al., 2013). But as the quality of ASR improves, sociolinguistics may be on the brink of another transformative technology: large-scale, completely automated vowel extraction without any need for human transcription. It would then be possible to quickly extract vowels from virtually limitless hours of recordings, such as YouTube, publicly available audio/video archives, and large-scale personal interviews or streaming video. How far away is this transformative moment? In this article, we introduce a fully automated program called DARLA (short for "Dartmouth Linguistic Automation," http://darla.dartmouth.edu), which automatically generates transcriptions with ASR and extracts vowels using FAVE. Users simply upload an audio recording of speech, and DARLA produces vowel plots, a table of vowel formants, and probabilities of the phonetic environments for each token. In this paper, we describe DARLA and explore its sociolinguistic applications. We test the system on a dataset of the US Southern Shift and compare the results with semi-automated methods.

A Web Application for Automated Dialect Analysis. Sravana Reddy and James N. Stanford. In Proceedings of NAACL 2015.

paper poster website

Sociolinguists are regularly faced with the task of measuring phonetic features from speech, which involves manually transcribing audio recordings -- a major bottleneck to analyzing large collections of data. We harness automatic speech recognition to build an online end-to-end web application where users upload untranscribed speech collections and receive formant measurements of the vowels in their data. We demonstrate this tool by using it to automatically analyze President Barack Obama’s vowel pronunciations.

Decoding Running Key Ciphers. Sravana Reddy and Kevin Knight. In Proceedings of ACL 2012.

paper

There has been recent interest in the problem of decoding letter substitution ciphers using techniques inspired by natural language processing. We consider a different type of classical encoding scheme known as the running key cipher, and propose a search solution using Gibbs sampling with a word language model. We evaluate our method on synthetic ciphertexts of different lengths, and find that it outperforms previous work that employs Viterbi decoding with character-based models.
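The running key cipher itself is simple to state: each plaintext letter is shifted by the corresponding letter of a long key drawn from natural-language text (e.g. a book passage). Below is a minimal Python sketch of the encoding scheme only; the paper's decoder, which recovers the plaintext and key jointly via Gibbs sampling over a word language model, is not shown:

```python
import string

A = string.ascii_uppercase

def encrypt(plaintext: str, key: str) -> str:
    """Shift each plaintext letter by the corresponding key letter (mod 26)."""
    return "".join(A[(A.index(p) + A.index(k)) % 26]
                   for p, k in zip(plaintext, key))

def decrypt(ciphertext: str, key: str) -> str:
    """Invert the shift; requires knowing (or having inferred) the running key."""
    return "".join(A[(A.index(c) - A.index(k)) % 26]
                   for c, k in zip(ciphertext, key))

key = "THEQUICKBROWNFOX"   # in practice: a passage of running natural text
ct = encrypt("ATTACKATDAWN", key)
assert decrypt(ct, key) == "ATTACKATDAWN"
```

Because both the plaintext and the key are natural language, the cipher is much harder to attack with character-level models alone, which is what motivates the word-level language model in the paper.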

G2P Conversion of Proper Names Using Word Origin Information. Sonjia Waxmonsky and Sravana Reddy. In Proceedings of NAACL 2012.

paper poster data

Motivated by the fact that the pronunciation of a name may be influenced by its language of origin, we present methods to improve pronunciation prediction of proper names using word origin information. We train grapheme-to-phoneme (G2P) models on language-specific data sets and interpolate the outputs. We perform experiments on US personal surnames, a data set where word origin variation occurs naturally. Our methods can be used with any G2P algorithm that outputs posterior probabilities of phoneme sequences for a given word.
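The interpolation step can be sketched as a weighted mixture of posteriors from origin-specific G2P models. The model names, weights, and toy distributions below are illustrative placeholders, not values from the paper:

```python
def interpolate(posteriors_by_origin, origin_weights):
    """Mix posterior distributions over phoneme sequences, weighting each
    origin-specific G2P model by an estimate of P(origin | name)."""
    mixed = {}
    for origin, dist in posteriors_by_origin.items():
        w = origin_weights.get(origin, 0.0)
        for phonemes, p in dist.items():
            mixed[phonemes] = mixed.get(phonemes, 0.0) + w * p
    return mixed

# Toy example: the surname "Roux" under French- vs English-trained models.
posteriors = {
    "french":  {"R UW": 0.8, "R AW K S": 0.2},
    "english": {"R AW K S": 0.7, "R UW": 0.3},
}
weights = {"french": 0.9, "english": 0.1}   # e.g. from an origin classifier
mixed = interpolate(posteriors, weights)
best = max(mixed, key=mixed.get)   # "R UW"
```

Any G2P system that emits posterior probabilities over phoneme sequences can slot into `posteriors_by_origin`, which is why the method is algorithm-agnostic.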

Learning from Mistakes: Expanding Pronunciation Lexicons Using Word Recognition Errors. Sravana Reddy and Evandro Gouvêa. In Proceedings of Interspeech 2011.

paper slides

We introduce the problem of learning pronunciations of out-of-vocabulary words from word recognition mistakes made by an automatic speech recognition (ASR) system. This question is especially relevant in cases where the ASR engine is a black box -- meaning that the only acoustic cues about the speech data come from the word recognition outputs. This paper presents an expectation maximization approach to inferring pronunciations from ASR word recognition hypotheses, which outperforms pronunciation estimates of a state of the art grapheme-to-phoneme system.

Unsupervised Discovery of Rhyme Schemes. Sravana Reddy and Kevin Knight. In Proceedings of ACL 2011.

paper slides data code

This paper describes an unsupervised, language-independent model for finding rhyme schemes in poetry, using no prior knowledge about rhyme or pronunciation.

What We Know About The Voynich Manuscript. Sravana Reddy and Kevin Knight. In Proceedings of the ACL 2011 Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities.

paper slides code and data press

The Voynich Manuscript is an undeciphered document from medieval Europe. We present current knowledge about the manuscript's text through a series of questions about its linguistic properties.

An MDL-based Approach to Extracting Subword Units for Grapheme-to-Phoneme Conversion. Sravana Reddy and John Goldsmith. In Proceedings of NAACL 2010.

paper

We address a key problem in grapheme-to-phoneme conversion: the ambiguity in mapping grapheme units to phonemes. Rather than using single letters and phonemes as units, we propose learning chunks, or subwords, to reduce ambiguity. This can be interpreted as learning a lexicon of subwords that has minimum description length. We implement an algorithm to build such a lexicon, as well as a simple decoder that uses these subwords.
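The two-part MDL objective can be sketched as the cost of writing down the subword lexicon plus the cost of encoding the corpus with it. The toy scoring function below uses uniform codes over graphemes and lexicon pointers; it is an assumed simplification, not the paper's exact formulation (which learns joint grapheme-phoneme chunks):

```python
import math

def description_length(lexicon, segmentations, alphabet_size):
    """Two-part MDL cost: bits to spell out the lexicon entries, plus bits
    to encode the corpus as pointers into the lexicon (uniform codes)."""
    lexicon_cost = sum(len(sub) * math.log2(alphabet_size) for sub in lexicon)
    pointer_bits = math.log2(len(lexicon)) if len(lexicon) > 1 else 0.0
    data_cost = sum(len(seg) * pointer_bits for seg in segmentations)
    return lexicon_cost + data_cost

# A reusable chunk like "ph" shortens the corpus encoding enough to
# pay for its own lexicon cost:
letters = ["p", "h", "o", "n", "e", "t", "g", "r", "a"]
letter_segs = [list("phone"), list("photo"), list("graph")]
chunks = ["ph", "one", "oto", "gr", "aph"]
chunk_segs = [["ph", "one"], ["ph", "oto"], ["gr", "aph"]]
assert description_length(chunks, chunk_segs, 26) < \
       description_length(letters, letter_segs, 26)
```

The chunk "ph" also illustrates the disambiguation benefit: as a unit it maps to /f/, whereas the letters "p" and "h" individually are ambiguous.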

Substring-based Transliteration with Conditional Random Fields. Sravana Reddy and Sonjia Waxmonsky. In Proceedings of the ACL 2010 Named Entities Workshop.

paper

Motivated by phrase-based translation research, we present a transliteration system where characters are grouped into substrings to be mapped atomically into the target language. We show how this substring representation can be incorporated into a Conditional Random Field model that uses local context and phonemic information. Our training and test data consists of three sets: English to Hindi, English to Kannada, and English to Tamil (Kumaran and Kellner, 2007) from the NEWS 2009 Machine Transliteration Shared Task (Li et al., 2009).

Understanding Eggcorns. Sravana Reddy. In Proceedings of the NAACL 2009 Workshop on Computational Approaches to Linguistic Creativity.

paper

An eggcorn is a type of linguistic error where a word is substituted with one that is semantically plausible -- that is, the substitution is a semantic reanalysis of what may be a rare, archaic, or otherwise opaque term. We build a system that, given the original word and its eggcorn form, finds a semantic path between the two. Based on these paths, we derive a typology that reflects the different classes of semantic reinterpretation underlying eggcorns.
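The path-finding step can be sketched as a breadth-first search over a semantic graph. The tiny hand-built graph below is purely illustrative; it stands in for whatever lexical-semantic resource supplies the edges:

```python
from collections import deque

def semantic_path(graph, start, goal):
    """Breadth-first search for a shortest path between two words
    in a graph of semantic relations."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Toy graph for the classic eggcorn "acorn" -> "eggcorn":
graph = {
    "acorn": ["seed"],
    "seed": ["egg"],
    "egg": ["eggcorn"],
}
path = semantic_path(graph, "acorn", "eggcorn")
```

The edge labels along such a path (synonymy, part-whole, phonetic similarity, and so on) are what feed the typology of semantic reinterpretations.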

Presentations

Note: these are presentations at linguistics and humanities conferences without archival proceedings.

Automatic speech recognition in sociophonetics. Sravana Reddy, James N. Stanford, and Michael Lefkowitz. Workshop (tutorial) in NWAV 2015.

link

Is the Future Almost Here? Large-Scale Completely Automated Vowel Extraction of Free Speech. Sravana Reddy and James N. Stanford. In NWAV 2014.

slides

Automatic Speech Recognition (ASR) is reaching farther into everyday life through applications like Apple’s Siri. Likewise, sociolinguists have been considering new technologies for vowel formant extraction, including semi-automated alignment/extraction techniques like the Penn Aligner and Forced Alignment Vowel Extraction (FAVE). With humans transcribing recordings into sentences, these semi-automated methods produce effective results. But sociolinguistics may be on the brink of another transformative technology: large-scale, completely automated vowel extraction without any need for human transcription. It would then be possible to quickly extract vowels from virtually limitless hours of recordings, such as YouTube, publicly available audio/video archives, and even live-streaming video. How far away is this transformative moment? In the present study, we apply state-of-the-art ASR to a real-world sociolinguistic dataset (U.S. Southern Vowel Shift) as a feasibility test.

A Twitter-Based Study of Newly Formed Clippings in American English. Sravana Reddy, James N. Stanford, and Joy Zhong. In ADS 2014.

slides press

Following Baclawski (2012), this study uses Twitter to examine newly formed clippings among younger speakers, including awks (awkward), adorb (adorable), ridic (ridiculous), hilar (hilarious). We analyzed 94 million tweets from 334,000 U.S. Twitter users who posted during 2013 (cf. Eisenstein et al. 2010; Bamman et al. 2012). We find that while women and men both use truncated forms, women are the leaders of the newer, primarily adjectival forms. These recently coined forms are also more common in tweets from urban locations. We compare our results to classic principles (Labov 2001), illustrating how large-scale Twitter analyses can be valuable in American dialectology.

A Document Recognition System for Early Modern Latin. Sravana Reddy and Gregory Crane. In DHCS 2006.

Large-scale digitization of manuscripts is facilitated by high-accuracy optical character recognition (OCR) engines. The focus of our work is on using these tools to digitize Latin texts. Many texts in the language, especially early modern ones, make heavy use of special characters like ligatures and accented abbreviations. Current OCR engines are inadequate for our purpose: their built-in training sets do not include all these special characters, and further, since post-processing of OCR output relies on data and methods specific to the domain language, most current systems do not implement error-correction tools for Latin. This abstract outlines the development of a document recognition system for medieval and early modern Latin texts. We first evaluate the performance of the open-source OCR framework Gamera on these manuscripts. We then incorporate language modeling functions to sharpen the character recognition output.

Theses

Learning Pronunciations from Unlabeled Evidence. 2012. Doctoral Dissertation, The University of Chicago.

front matter

Part of Speech Induction Using Non-negative Matrix Factorization. 2009. Master's Thesis, The University of Chicago.

Unsupervised part-of-speech induction involves the discovery of syntactic categories in a text, given no additional information other than the text itself. One requirement of an induction system is the ability to handle multiple categories for each word, in order to deal with word sense ambiguity. We construct an algorithm for unsupervised part-of-speech induction, treating the problem as one of soft clustering. The key technical component of the algorithm is the application of the recently developed technique of non-negative matrix factorization to the task of category discovery, using word contexts and morphology as syntactic cues.
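The factorization step can be sketched with Lee and Seung's multiplicative updates on a small word-by-context count matrix. This is a minimal pure-Python illustration of NMF as soft clustering, not the exact algorithm or features from the thesis:

```python
import math
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(row) for row in zip(*A)]

def nmf(V, k, iters=200, eps=1e-9, seed=0):
    """Factor a nonnegative word-by-context matrix V (m x n) into W (m x k)
    and H (k x n); row i of W gives word i's soft membership over the
    k induced categories, which is what allows multiple tags per word."""
    rng = random.Random(seed)
    m, n = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + eps for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(Wt, matmul(W, H))
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(matmul(W, H), Ht)
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(m)]
    return W, H

# Toy counts with two clear context "categories" (rows 0-1 vs rows 2-3).
V = [[2, 2, 0, 0], [3, 3, 0, 0], [0, 0, 1, 1], [0, 0, 2, 2]]
W, H = nmf(V, k=2)
R = matmul(W, H)
err = math.sqrt(sum((V[i][j] - R[i][j]) ** 2
                    for i in range(4) for j in range(4)))
```

Because W is nonnegative rather than one-hot, a syntactically ambiguous word can carry weight in several columns at once, which is the soft-clustering property the thesis relies on.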

Teaching/Service

Courses

I am currently co-teaching Computer Programming and Problem Solving at Wellesley in Fall 2016.

I taught a previous iteration of this course in Fall 2015, and a new class on Natural Language Processing in Spring 2016.

I will be teaching a new course on Machine Learning in the spring.

Previously:

  • A two week seminar on Language Variation through the Lens of Web Data at the LSA 2015 Linguistic Summer Institute.
  • At Dartmouth, three iterations of Computational Linguistics, cross-listed with Linguistics and Computer Science, in Winter 2013, Fall 2013, and Fall 2014.
  • TA for several courses at Chicago, running the gamut from AI to systems to labs for introductory programming.

Students

I've worked with some great undergraduate students: James Brofos, Irene Feng, Steven Nugent, Crystal Ye, Joy Zhong, Daniela Kreimerman, Alexander Welton, Ian Stewart, and Emily Ahn. The last three wrote senior theses on Automated Stylistic Analysis, African American English Syntax on Twitter, and Foreign Accent Classification -- go check them out!

Admin

I was the local organizer for the North American Computational Linguistics Olympiad (NACLO) at Dartmouth, and one of the co-chairs of the demo session at NAACL 2016. I've also reviewed for *ACL conferences and workshops, and various journals.

Tools and Data

A collection of resources I created or compiled for my research that may be useful to others.

Python Autograder with HTML Output (under development). Sravana Reddy and Daniela Kreimerman. 2016.

code

Autograder for the Introductory CS class at Wellesley.

Transcriptions for the CSLU Foreign-Accented English Corpus. Emily Ahn and Sravana Reddy. 2016.

data

The CSLU Foreign-Accented Speech Corpus is a great source of speech data from non-native English speakers. We crowdsourced transcriptions for 7 of the 23 native languages on Mechanical Turk, and are making them available here.

DARLA (Dartmouth Linguistic Automation). Sravana Reddy and James Stanford. 2015-2016.

website

DARLA is a suite of automated analysis programs tailored to research questions in sociophonetics.

Chicago Rhyming Poetry Corpus. Morgan Sonderegger and Sravana Reddy. 2011.

data

A collection of rhyming poetry in English and French, manually annotated with rhyme schemes.