Talk: Barna Saha: Language Edit Distance and Connection to Fundamental Graph Problems

Thursday, October 8, 2015 - 4:45pm to 5:45pm
Barna Saha
UMAss Amherst

Abstract: In this presentation we will discuss two fundamental problems that generalize string edit distance computation and parsing of context free grammars.

(1)  Language Edit Distance: Given a string s and a context free language L(G), the language edit distance problem determines the minimum number of insertions, deletions and substitutions required to convert s to a valid member of the language. In 1972 Aho and Peterson gave a dynamic programming algorithm that solves this problem in O(n^3) time (ignoring grammar size) where is the string length. Despite its vast number of applications, in forty years there has been no significant improvement over this cubic dynamic programming algorithm.

(2) Stochastic Parsing: This problem asks for the maximum likelihood parsing of a stochastic context free grammar. Stochastic context free grammars are at the heart of statistical natural language processing. They generalize hidden Markov models, and have found widespread applications. Despite many efforts, no subcubic exact or approximate algorithm exists for this problem.

In this talk, we will present the first subcubic nearly optimal algorithms for these problems, and discuss the consequences of exact subcubic algorithms for them (to appear in FOCS’15). Further, we will discuss our recent progress on developing exact subcubic algorithms for the language edit distance problem, and how our techniques lead to better algorithms for several versions of the all-pairs shortest path and the 3-SUM problem.