Cryptography & Security | May 1996

Software Protection and Simulation on Oblivious RAMs

Authors: Oded Goldreich & Rafail Ostrovsky (Summary by Blog IT)

This seminal paper establishes the theoretical foundations for software protection through oblivious computation, introducing the concept of Oblivious RAM (ORAM) and demonstrating how to achieve polylogarithmic overhead in hiding memory access patterns during program execution.

The Software Protection Problem

Software piracy represents a fundamental challenge: software is expensive to create but trivial to copy. Traditional approaches lack theoretical foundations. This paper provides rigorous treatment by identifying a key problem—preventing adversaries from learning about programs through execution observation.

The authors distinguish two protection types:

Protection against illegitimate duplication: Ensuring no efficient method exists for creating executable copies
Protection against redistribution: Ensuring only the producer can prove authorship

The Hardware-Software Hybrid Approach

Pure software solutions are impossible—any program is ultimately a copyable binary sequence. The paper adopts a Software-Hardware package (SH-package) consisting of:

A physically shielded CPU with limited protected storage (constant number of registers)
An encrypted program stored in unprotected memory
A small ROM in the CPU containing decryption keys

While encryption protects instruction contents, it leaves the memory access pattern visible—a critical vulnerability this paper addresses.

What Can Adversaries Learn?

The paper investigates what adversaries can learn by experimenting with SH-packages. Adversaries can:

Select inputs and initiate executions
Observe memory access patterns
Read and potentially modify CPU-memory communication
Inspect and alter memory contents (in tampering scenarios)

However, adversaries cannot inspect CPU internal registers or access its random oracle.

Oblivious RAMs: The Core Concept

An Oblivious RAM is a probabilistic RAM where the probability distribution of memory addresses accessed during execution depends only on running time, revealing nothing about actual input beyond execution time.

Key Insight: If two inputs have identical running times, their access patterns must be identically distributed.

The Reduction: Software Protection to Oblivious Simulation

Protection Against Nontampering Adversaries

The solution combines two techniques:

Oblivious simulation: Hides the access pattern (which addresses are accessed)
Encryption: Hides communication message values using one-time pad encryption

The CPU maintains a counter and encrypts values as: (v ⊕ f(counter), counter). Since each counter value is used once, this provides perfect security.

Protection Against Tampering Adversaries

For adversaries who can modify messages, the solution adds authentication tags using version numbers. To store encrypted value v at address a:

send (store, a, (v, f(a, version(a), v)))

Upon retrieval, the CPU verifies the tag matches. This prevents value modification, address substitution, and replay attacks.

The "Square Root" Solution

This preliminary approach provides intuition for the final hierarchical solution:

Data Structure:

First m words: virtual memory content
Next √m words: "dummy" words
Last √m words: "shelter" (auxiliary storage)

Algorithm (per epoch of √m virtual accesses):

Step 1: Randomly permute memory using oblivious sorting
Step 2: For each virtual access, scan shelter first; if not found, access permuted location; store updated value in shelter
Step 3: Update permuted memory from shelter contents

Overhead: O(√m · log² m) per virtual access—a significant improvement over trivial O(m) but still not optimal.

The Hierarchical Solution: Main Result

Theorem: Any t ≤ 2^k steps of arbitrary RAM_k can be obliviously simulated by probabilistic-RAM with overhead O((log t)³). Furthermore, simulation is on-line and time-labeled.

Key Innovation: Hierarchy of Buffers

Instead of a single hash table, use multiple buffers with different sizes and reshuffling frequencies:

Level i buffer: 2^i buckets, each holding O(log t) words
Epoch length: 2^(i-1) virtual accesses
Total buffers: N = 1 + ⌈log₂ t⌉

Design Principle: Balance buffer size and reshuffling frequency. Smaller buffers reshuffled frequently (low per-reshuffling cost), larger buffers reshuffled rarely (amortizing higher costs).

Simulation Algorithm

Virtual Access to Address V at Time t:

Step 1: Scan both buckets of 1st buffer entirely
Step 2: For each buffer i (from 2 to N):
- If not found: scan appropriate bucket of i-th buffer
- If found: scan dummy bucket
Step 3: Update V and store in 1st buffer
Step 4: Periodically rehash buffers (when epoch ends)

Oblivious Hashing: The 12-Step Procedure

Merging buffer contents obliviously requires careful protocol:

H1-H3: Copy buffers, sort by virtual address, eliminate duplicates (keeping newer versions)
H4-H7: Add dummies, assign hash tags, check for overflow
H8-H12: Distribute into buckets maintaining exactly m words per bucket

Key Property: Access pattern is oblivious and independent of actual data values. Overflow probability ≤ n · exp{-m}.

Complexity Analysis

Per Virtual Access Costs:

Steps 1, 3: 2m accesses (scanning 1st buffer)
Step 2: N·m ≈ (log t) · (log t) accesses
Rehashing (amortized): O(m · log m · log t + (log t)² · m)

Total Amortized Overhead: O((log t)³)

Using Batcher's Sorting Network yields O((log t)⁴) in practice, while AKS Sorting Network achieves the theoretical O((log t)³).

Lower Bound: Optimality

Theorem: Any oblivious simulation must make at least Ω(t·log t) accesses to simulate t steps.

Proof via Game Theory: Model simulation as a game where a player (CPU) with limited memory must serve requests while an observer sees only which "cells" are accessed. Counting argument shows q ≥ t · log m accesses required.

Significance: The O(log³ t) upper bound is within a quadratic factor of optimal Ω(log t) lower bound.

Main Theorems

Information-Theoretic Result

There exists a software-protecting compiler C such that for any probabilistic-RAM, C protects software against tampering adversaries with overhead O((log t)³).

Practical Result

Assuming one-way functions exist: There exists a compiler with overhead poly(k) that is secure against all poly(k)-time tampering adversaries.

The practical version replaces random oracles with pseudorandom functions, maintaining security against polynomial-time adversaries.

Extensions and Applications

Secure Distributed Databases

Apply ORAM to databases distributed across trusted sites connected by insecure channels. Sites model memory cells, requesting site plays CPU role, ensuring access patterns reveal nothing about queried data.

Data Structure Checking

Maintain data structures with small reliable memory while most data resides in potentially tampered unreliable memory. Oblivious time-labeled simulation provides both content authentication and access pattern hiding.

Paging-Level Security

All results apply at any execution granularity. For virtual memory systems, replace single page access with oblivious sequence of page accesses.

Implementation Considerations

Protected Hardware Requirements:

CPU with k bits internal memory (k = 128-256 bits in practice)
Ability to evaluate pseudorandom functions
Technology exists: ATM bank machine chips provide such protection

Performance Characteristics:

Theoretical: O((log t)³) with AKS sorting
Practical: O((log t)⁴) with Batcher sorting (better constants ≈10)
Typical slowdown: 10-1000× depending on t

Space Overhead:

External memory: O(m·(log m)²) factor
Protected memory: O(k) bits (constant)

Theoretical Significance

Fundamental Contributions:

Formalization: First rigorous treatment of software protection as computational problem
Oblivious Computation Model: Introduced ORAM as fundamental abstraction
Efficient Construction: Proved polylogarithmic overhead possible (vs trivial O(t) overhead)
Lower Bound: Established Ω(t·log t) necessary, showing construction near-optimal
Security Reduction: Reduced software protection to standard cryptographic assumptions

Complexity-Theoretic Insights:

Deterministic simulation requires Ω(m) overhead (optimal trivial solution)
Randomization enables exponential improvement to polylog overhead
Time-space tradeoffs in oblivious computation

Impact and Open Problems

Follow-up Research:

Improved ORAM constructions achieving O(log t) overhead
Practical ORAM implementations for cloud storage
Hardware-supported ORAM for secure processors
Applications to multiparty computation and secure outsourcing

Remaining Open Problems:

Close gap between O(log³ t) upper and Ω(log t) lower bound
Extend to support concurrent memory accesses efficiently
Consider adversaries that adapt to side-channel information
Extend oblivious simulation to quantum computation model

Limitations and Assumptions

Model Assumptions:

CPU perfectly shielded (no side channels)
Random oracle model (justified by pseudorandom functions)
Sequential execution (no speculation, pipelining)

Practical Concerns:

Side-channel attacks (timing, power, electromagnetic)
Cache behavior may leak information
Speculative execution vulnerabilities (Spectre/Meltdown)
Performance overhead significant for some applications

Conclusions

This seminal paper establishes theoretical foundations for software protection through oblivious computation. Main achievements:

Theoretical Framework: Rigorous definitions of software protection, oblivious simulation, and their relationship
Efficient Construction: Hierarchical ORAM achieving O(log³ t) overhead, exponentially better than previous approaches
Fundamental Limits: Lower bound of Ω(log t) showing construction near-optimal
Security Reduction: Connection to standard cryptographic assumptions enables practical implementation
Broad Applicability: Technique extends to distributed databases, data structure checking, and secure outsourcing

The work demonstrates that strong software protection is theoretically achievable with moderate overhead, transforming an ad-hoc engineering problem into rigorous computer science theory with practical applications.

"Complete software protection requires both content hiding (via encryption) and access pattern hiding (via oblivious simulation). The hierarchical ORAM construction shows that efficient access pattern hiding is possible."