Cryptography & Security | May 1996
Software Protection and Simulation on Oblivious RAMs
This seminal paper establishes the theoretical foundations for software protection through oblivious computation, introducing the concept of Oblivious RAM (ORAM) and demonstrating how to achieve polylogarithmic overhead in hiding memory access patterns during program execution.
The Software Protection Problem
Software piracy represents a fundamental challenge: software is expensive to create but trivial to copy. Traditional approaches lack theoretical foundations. This paper provides rigorous treatment by identifying a key problem—preventing adversaries from learning about programs through execution observation.
The authors distinguish two protection types:
- Protection against illegitimate duplication: Ensuring no efficient method exists for creating executable copies
- Protection against redistribution: Ensuring only the producer can prove authorship
The Hardware-Software Hybrid Approach
Pure software solutions are impossible—any program is ultimately a copyable binary sequence. The paper adopts a Software-Hardware package (SH-package) consisting of:
- A physically shielded CPU with limited protected storage (constant number of registers)
- An encrypted program stored in unprotected memory
- A small ROM in the CPU containing decryption keys
While encryption protects instruction contents, it leaves the memory access pattern visible—a critical vulnerability this paper addresses.
What Can Adversaries Learn?
The paper investigates what adversaries can learn by experimenting with SH-packages. Adversaries can:
- Select inputs and initiate executions
- Observe memory access patterns
- Read and potentially modify CPU-memory communication
- Inspect and alter memory contents (in tampering scenarios)
However, adversaries cannot inspect CPU internal registers or access its random oracle.
Oblivious RAMs: The Core Concept
An Oblivious RAM is a probabilistic RAM where the probability distribution of memory addresses accessed during execution depends only on running time, revealing nothing about actual input beyond execution time.
Key Insight: If two inputs have identical running times, their access patterns must be identically distributed.
The Reduction: Software Protection to Oblivious Simulation
Protection Against Nontampering Adversaries
The solution combines two techniques:
- Oblivious simulation: Hides the access pattern (which addresses are accessed)
- Encryption: Hides communication message values using one-time pad encryption
The CPU maintains a counter and encrypts values as: (v ⊕ f(counter), counter). Since each counter value is used once, this provides perfect security.
Protection Against Tampering Adversaries
For adversaries who can modify messages, the solution adds authentication tags using version numbers. To store encrypted value v at address a:
send (store, a, (v, f(a, version(a), v)))
Upon retrieval, the CPU verifies the tag matches. This prevents value modification, address substitution, and replay attacks.
The "Square Root" Solution
This preliminary approach provides intuition for the final hierarchical solution:
Data Structure:
- First m words: virtual memory content
- Next √m words: "dummy" words
- Last √m words: "shelter" (auxiliary storage)
Algorithm (per epoch of √m virtual accesses):
- Step 1: Randomly permute memory using oblivious sorting
- Step 2: For each virtual access, scan shelter first; if not found, access permuted location; store updated value in shelter
- Step 3: Update permuted memory from shelter contents
Overhead: O(√m · log² m) per virtual access—a significant improvement over trivial O(m) but still not optimal.
The Hierarchical Solution: Main Result
Theorem: Any t ≤ 2^k steps of arbitrary RAM_k can be obliviously simulated by probabilistic-RAM with overhead O((log t)³). Furthermore, simulation is on-line and time-labeled.
Key Innovation: Hierarchy of Buffers
Instead of a single hash table, use multiple buffers with different sizes and reshuffling frequencies:
- Level i buffer: 2^i buckets, each holding O(log t) words
- Epoch length: 2^(i-1) virtual accesses
- Total buffers: N = 1 + ⌈log₂ t⌉
Design Principle: Balance buffer size and reshuffling frequency. Smaller buffers reshuffled frequently (low per-reshuffling cost), larger buffers reshuffled rarely (amortizing higher costs).
Simulation Algorithm
Virtual Access to Address V at Time t:
- Step 1: Scan both buckets of 1st buffer entirely
- Step 2: For each buffer i (from 2 to N):
- If not found: scan appropriate bucket of i-th buffer
- If found: scan dummy bucket
- Step 3: Update V and store in 1st buffer
- Step 4: Periodically rehash buffers (when epoch ends)
Oblivious Hashing: The 12-Step Procedure
Merging buffer contents obliviously requires careful protocol:
- H1-H3: Copy buffers, sort by virtual address, eliminate duplicates (keeping newer versions)
- H4-H7: Add dummies, assign hash tags, check for overflow
- H8-H12: Distribute into buckets maintaining exactly m words per bucket
Key Property: Access pattern is oblivious and independent of actual data values. Overflow probability ≤ n · exp{-m}.
Complexity Analysis
Per Virtual Access Costs:
- Steps 1, 3: 2m accesses (scanning 1st buffer)
- Step 2: N·m ≈ (log t) · (log t) accesses
- Rehashing (amortized): O(m · log m · log t + (log t)² · m)
Total Amortized Overhead: O((log t)³)
Using Batcher's Sorting Network yields O((log t)⁴) in practice, while AKS Sorting Network achieves the theoretical O((log t)³).
Lower Bound: Optimality
Theorem: Any oblivious simulation must make at least Ω(t·log t) accesses to simulate t steps.
Proof via Game Theory: Model simulation as a game where a player (CPU) with limited memory must serve requests while an observer sees only which "cells" are accessed. Counting argument shows q ≥ t · log m accesses required.
Significance: The O(log³ t) upper bound is within a quadratic factor of optimal Ω(log t) lower bound.
Main Theorems
Information-Theoretic Result
There exists a software-protecting compiler C such that for any probabilistic-RAM, C protects software against tampering adversaries with overhead O((log t)³).
Practical Result
Assuming one-way functions exist: There exists a compiler with overhead poly(k) that is secure against all poly(k)-time tampering adversaries.
The practical version replaces random oracles with pseudorandom functions, maintaining security against polynomial-time adversaries.
Extensions and Applications
Secure Distributed Databases
Apply ORAM to databases distributed across trusted sites connected by insecure channels. Sites model memory cells, requesting site plays CPU role, ensuring access patterns reveal nothing about queried data.
Data Structure Checking
Maintain data structures with small reliable memory while most data resides in potentially tampered unreliable memory. Oblivious time-labeled simulation provides both content authentication and access pattern hiding.
Paging-Level Security
All results apply at any execution granularity. For virtual memory systems, replace single page access with oblivious sequence of page accesses.
Implementation Considerations
Protected Hardware Requirements:
- CPU with k bits internal memory (k = 128-256 bits in practice)
- Ability to evaluate pseudorandom functions
- Technology exists: ATM bank machine chips provide such protection
Performance Characteristics:
- Theoretical: O((log t)³) with AKS sorting
- Practical: O((log t)⁴) with Batcher sorting (better constants ≈10)
- Typical slowdown: 10-1000× depending on t
Space Overhead:
- External memory: O(m·(log m)²) factor
- Protected memory: O(k) bits (constant)
Theoretical Significance
Fundamental Contributions:
- Formalization: First rigorous treatment of software protection as computational problem
- Oblivious Computation Model: Introduced ORAM as fundamental abstraction
- Efficient Construction: Proved polylogarithmic overhead possible (vs trivial O(t) overhead)
- Lower Bound: Established Ω(t·log t) necessary, showing construction near-optimal
- Security Reduction: Reduced software protection to standard cryptographic assumptions
Complexity-Theoretic Insights:
- Deterministic simulation requires Ω(m) overhead (optimal trivial solution)
- Randomization enables exponential improvement to polylog overhead
- Time-space tradeoffs in oblivious computation
Impact and Open Problems
Follow-up Research:
- Improved ORAM constructions achieving O(log t) overhead
- Practical ORAM implementations for cloud storage
- Hardware-supported ORAM for secure processors
- Applications to multiparty computation and secure outsourcing
Remaining Open Problems:
- Close gap between O(log³ t) upper and Ω(log t) lower bound
- Extend to support concurrent memory accesses efficiently
- Consider adversaries that adapt to side-channel information
- Extend oblivious simulation to quantum computation model
Limitations and Assumptions
Model Assumptions:
- CPU perfectly shielded (no side channels)
- Random oracle model (justified by pseudorandom functions)
- Sequential execution (no speculation, pipelining)
Practical Concerns:
- Side-channel attacks (timing, power, electromagnetic)
- Cache behavior may leak information
- Speculative execution vulnerabilities (Spectre/Meltdown)
- Performance overhead significant for some applications
Conclusions
This seminal paper establishes theoretical foundations for software protection through oblivious computation. Main achievements:
- Theoretical Framework: Rigorous definitions of software protection, oblivious simulation, and their relationship
- Efficient Construction: Hierarchical ORAM achieving O(log³ t) overhead, exponentially better than previous approaches
- Fundamental Limits: Lower bound of Ω(log t) showing construction near-optimal
- Security Reduction: Connection to standard cryptographic assumptions enables practical implementation
- Broad Applicability: Technique extends to distributed databases, data structure checking, and secure outsourcing
The work demonstrates that strong software protection is theoretically achievable with moderate overhead, transforming an ad-hoc engineering problem into rigorous computer science theory with practical applications.
"Complete software protection requires both content hiding (via encryption) and access pattern hiding (via oblivious simulation). The hierarchical ORAM construction shows that efficient access pattern hiding is possible."
Blog IT