Yupeng Zhang: Verifiable Databases and RAM Programs

Friday, February 9, 2018 - 10:30am to 12:00pm
Location: 
Hewlett, G882
Speaker: 
Yupeng Zhang, University of Maryland, College Park
Abstract: The talk covers the following two papers:
 
vSQL: Verifying Arbitrary SQL Queries over Dynamic Outsourced Databases
 
Cloud database systems such as Amazon RDS or Google Cloud SQL enable the outsourcing of a large database to a server who then responds to SQL queries. A natural problem here is to efficiently verify the correctness of responses returned by the (untrusted) server. 
 
In this paper we present vSQL, a novel cryptographic protocol for publicly verifiable SQL queries on dynamic databases. At a high level, our construction relies on two extensions of the CMT interactive-proof protocol [Cormode et al., 2012]: (i) supporting outsourced input via the use of a polynomial-delegation protocol with succinct proofs, and (ii) supporting auxiliary input (i.e., non-deterministic computation) efficiently. Compared to previous verifiable-computation systems based on interactive proofs, our construction has verification cost polylogarithmic in the auxiliary input (which for SQL queries can be as large as the database) rather than linear. 
 
In order to evaluate the performance and expressiveness of our scheme, we tested it on SQL queries based on the TPC-H benchmark on a database with 6 x 106 rows and 13 columns. The server overhead in our scheme (which is typically the main bottleneck) is up to 120 x lower than previous approaches based on succinct arguments of knowledge (SNARKs), and moreover we avoid the need for query-dependent pre-processing which is required by optimized SNARK-based schemes. In our construction, the server/client time and the communication cost are comparable to, and sometimes smaller than, those of existing customized solutions which only support specific queries.
 
vRAM: Faster Verifiable RAM With Program-Independent Preprocessing
 
We study the problem of verifiable computation (VC) for RAM programs, where a computationally weak verifier outsources the execution of a program to a powerful (but untrusted) prover. Existing efficient implementations of VC protocols require an expensive preprocessing phase that binds the parties to a single circuit. (While there are schemes that avoid preprocessing entirely, their performance remains significantly worse than constructions with preprocessing.) Thus, a prover and verifier are forced to choose between two approaches: (1) Allow verification of arbitrary RAM programs, at the expense of efficiency, by preprocessing a universal circuit which can handle all possible instructions during each CPU cycle; or (2) Sacrifice expressiveness by preprocessing an efficient circuit which is tailored to the verification of a single specific RAM program. 
 
We present vRAM, a VC system for RAM programs that avoids both the above drawbacks by having a preprocessing phase that is entirely circuit-independent (other than an upper bound on the circuit size). During the proving phase, once the program to be verified and its inputs are chosen, the circuit-independence of our construction allows the parties to use a smaller circuit tailored to verifying the specific program on the chosen inputs, i.e., without needing to encode all possible instructions in each cycle. Moreover, our construction is the first with asymptotically optimal prover overhead; i.e., the work of the prover is a constant multiplicative factor of the time to execute the program. 
 
Our experimental evaluation demonstrates that vRAM reduces the prover's memory consumption by 55-110x and its running time by 9-30x compared to existing schemes with universal preprocessing. This allows us to scale to RAM computations with more than 2 million CPU cycles, a 65x improvement compared to the state of the art. Finally, vRAM has performance comparable to (and sometimes better than) the best existing scheme with program-specific preprocessing despite the fact that the latter can deploy program-specific optimizations (and has to pay a separate preprocessing cost for every new program).