Venkat Guruswami: Improved Bounds for Perfect Hashing

Thursday, May 17, 2018 - 4:30pm to 5:30pm
Venkat Guruswami

A code C over alphabet {1,2,...,k} of length n is said to be a k-hash code if every k distinct elements of C have a coordinate where they all differ. Understanding the largest possible rate (in bits), defined as (log_2 |C|)/n, of a k-hash code is a classical problem. It arises in two equivalent contexts: (i)  the smallest size possible for a perfect hash family that maps a universe of prescribed size into {1,2,...,k}, and (ii) the zero-error capacity for decoding with lists of size less than k for a certain combinatorial channel.

A general upper bound of k!/k^{k-1} on the rate of a k-hash code (in the limit of large n) was obtained by Fredman and Komlos in 1984 for any k \ge 4. Arikan (1994) improved the bound for k=4 (from 0.375) to 0.3512. For k > 4, however, the original Fredman-Komlos bound has remained the best known. 

In this talk, I’ll describe two recent works making progress on these bounds. The first (joint with Marco Dalai and Jaikumar Radhakrishnan) improves the rate upper bound to 6/19 < 0.3158 for k=4. The second (with Andrii Riazanov) gives the first improvement over the Fredman-Komlos bound for every k \ge 5.

Our approach is based on a careful combination of the Plotkin-type bounds in coding theory and Hansel's lemma for covering the complete graph by bipartite graphs.