Friday, March 9, 2018 - 10:30am to 12:00pm

Location:

Hewlett, G882

Speaker:

Hayim Shaul

Seminar group:

Abstract:

Given a set S of n d-dimensional points, the k-nearest neighbors (kNN) is the problem of quickly finding k points in S that are nearest to a query point q.

The k-nearest neighbors problem has applications in machine learning for classifications and regression and also in searching.

The secure version of kNN where either q or S are encrypted, has applications such as providing services over sensitive (such as medical or localization) data.

In this work we present a scalable and efficient algorithm for solving kNN with Fully Homomorphic Encryption (FHE).

The efficiency stems from the fact that it is realized by a polynomial whose degree is independent of n, the number of points, or d the dimension.

We implemented our algorithm based on HELib implementation for the Brakerski-Gentry-Vaikuntanthan’s FHE scheme, and ran experiments on a standard server.

Our algorithm computes the set of k=13 nearest points to a query point q for databases with more than a thousand points in close to half an hour, compared to several hours via existing approaches.

Our result introduces a statistical coreset, which is a data summarization technique that allows sums such as moments to be efficiently and scalably approximated. As a central tool, we design a new coin toss technique which we use to build the coreset. This coin toss technique and computation of moments may be of independent interest.

Given a set S of n d-dimensional points, the k-nearest neighbors (kNN) is the problem of quickly finding k points in S that are nearest to a query point q.

The k-nearest neighbors problem has applications in machine learning for classifications and regression and also in searching.

The secure version of kNN where either q or S are encrypted, has applications such as providing services over sensitive (such as medical or localization) data.

In this work we present a scalable and efficient algorithm for solving kNN with Fully Homomorphic Encryption (FHE).

The efficiency stems from the fact that it is realized by a polynomial whose degree is independent of n, the number of points, or d the dimension.

We implemented our algorithm based on HELib implementation for the Brakerski-Gentry-

Our algorithm computes the set of k=13 nearest points to a query point q for databases with more than a thousand points in close to half an hour, compared to several hours via existing approaches.

Our result introduces a statistical coreset, which is a data summarization technique that allows sums such as moments to be efficiently and scalably approximated. As a central tool, we design a new coin toss technique which we use to build the coreset. This coin toss technique and computation of moments may be of independent interest.

Bio: Hayim Shaul is a postdoc in MIT working with Daniela Rus. His research interests are in building efficient secure systems. Prior to being a postdoc Hayim did his PhD in computational geometry in Tel-Aviv University under the supervision of Micha Sharir.