

  • Welcome
  • IEEE SPS Best Paper Award
  • OCWIC Best Poster

The Speech and Language Technologies Lab, part of the Laboratory for Artificial Intelligence Research at OSU, conducts research in automatic speech recognition and computational linguistics.

Use the tabs above to learn about recent news in the lab.

Drs. Jeremy Morris and Eric Fosler-Lussier have been honored with a 2010 IEEE Signal Processing Society Best Paper Award for their 2008 paper, "Conditional Random Fields for Integrating Local Discriminative Classifiers," published in the IEEE Transactions on Audio, Speech, and Language Processing. In this work, Morris and Fosler-Lussier explore the novel use of the Conditional Random Field (CRF) paradigm in an Automatic Speech Recognition (ASR) system. CRFs are a statistical framework that allows for the combination of correlated sources of evidence in a time sequence; the article examines how this framework can be used to incorporate short-term estimates of speech sounds in determining what was said in a speech utterance. These estimates can express probabilities over sound classes (e.g., is this snippet of sound a "t" or "ah"?) or phonological classes (e.g., is this snippet a vowel? a nasal consonant?). The authors compare phonetic recognition using CRFs to a standard Hidden Markov Model (HMM) ASR system, and show comparable or better performance with their system while minimizing its number of free parameters.

The Signal Processing Society Best Paper Award is presented to up to six journal papers annually across all of the Society's Transactions; it "honors the authors of a paper of exceptional merit dealing with a subject related to the Society's technical scope." Papers published within the last five years in one of the Society's Transactions are eligible; nominations arise from one of the Society's technical committees or editorial boards.

The award will be presented in May 2011 in a ceremony at the International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011) in Prague, Czech Republic.

Preethi Raghavan, along with advisors Dr. Albert Lai and Dr. Chris Brew, was awarded Best Poster at the 2011 Ohio Celebration of Women in Computing (OCWIC) for her poster, "Leveraging Natural Language Processing of Clinical Narratives for Clinical Phenotype Modeling." Preethi will be able to present her poster at the 2011 Grace Hopper Celebration of Women in Computing.

Poster abstract
We explore the application of state-of-the-art natural language processing techniques to clinical narratives, such as medical admission notes, discharge summaries, progress notes, and radiology reports, to extract information of interest. The objective of this knowledge discovery task is the ability to generate a chronology of events for a given patient, also enabling identification of groups of patients with common defining characteristics. This in turn facilitates efficient information retrieval and enables patient-specific question answering. Clinical narratives exhibit a unique medical sub-language with characteristics such as semantic categorization of words, co-occurrence of patterns and constraints, domain-specific terminology, incomplete phrases, and omission of information. They are also replete with events and temporal expressions. Significant linguistic analysis in conjunction with medical domain knowledge is required to represent and reason with temporal information effectively. We approach this problem by primarily focusing on the following aspects: 1) representing temporal expressions and events found in clinical narratives using the formalism of temporal constraint networks, and identifying tractable approaches to temporal constraint reasoning; 2) event co-reference resolution across narratives using a probabilistic discourse grammar based approach; 3) a language modeling approach to model the inherent uncertainty of the temporal content in clinical narratives to ensure retrieval effectiveness; 4) information fusion that includes integrating the list of events obtained from across all unstructured clinical narratives with structured data such as lab values.
We plan on empirically evaluating and demonstrating our approach on patient data obtained from the Ohio State University Medical Center (OSUMC) with the help of two use case scenarios: 1) automatically identifying groups of patients who satisfy specific constraints and are hence eligible for particular research studies that try to answer scientific questions and to find better ways to prevent, diagnose, or treat a disease (i.e., clinical trials); 2) augmenting the limited description labels on biological specimens, stored in repositories for research purposes, with the extracted patient-specific information. This significantly improves the overall search experience, as the accuracy when searching for specimens matching multiple criteria increases.