Sunteți pe pagina 1din 1

A Compilation Target for Probabilistic Programming Languages

#include "probabilistic.h" completely independently from the point where fork was
#define N 10
called. While copying program execution state may naı̈vely
// Observed data
static double data[N] = { 1.0, 1.1, 1.2,
sound like a costly operation, this actually can be rather
-1.0, -1.5, -2.0, efficient: when fork is called, a lazy copy-on-write pro-
0.001, 0.01, 0.005, 0.0 };
cedure is used to avoid deep copying the entire program
// Struct holding mean and variance parameters for each cluster
typedef struct theta {
memory. Instead, initially only the pagetable is copied to
double mu; the new process; when an existing variable is modified in
double var;
} theta; the new program copy, then and only then are memory con-
// Draws a sample of theta from a normal-gamma prior tents duplicated. The overall cost of forking a program is
theta draw_theta() {
double variance = 1.0 / gamma_rng(1, 1);
proportional to the fraction of memory which is rewritten
return (theta) { normal_rng(0, variance), variance }; by the child process (Smith & Maguire, 1988).
}

// Get the class id for a given observation index Using fork we can branch a single program execution state
static polya_urn_state urn;
void get_class(int *index, int *class_id) {
and explore many possible downstream execution paths.
*class_id = polya_urn_draw(&urn); Each of these paths runs as its own process, and will run
}
in parallel with other processes. In general, multiple pro-
int main(int argc, char **argv) {
double alpha = 1.0;
cesses run in their own memory address space, and do not
polya_urn_new(&urn, alpha); communicate or share state. We handle inter-process com-
mem_func mem_get_class; munication via a small shared memory segment; the details
memoize(&mem_get_class, get_class, sizeof(int), sizeof(int));
of what global data must be stored are provided later.
theta params[N];
bool known_params[N] = { false }; Synchronization between processes is handled via mutual
int class; exclusion locks (mutex objects). Mutexes become partic-
for (int n=0; n<N; n++) {
mem_invoke(&mem_get_class, &n, &class); ularly useful for us when used in conjunction with a syn-
if (!known_params[class]) {
params[class] = draw_theta();
chronized counter to create a barrier, a high-level blocking
known_params[class] = true; construct which prevents any process proceeding in exe-
}
o b s e r v e (normal_lnp(data[n], params[class].mu, cution state beyond the barrier until some fixed number of
}
params[class].var));
processes have arrived.
// Predict number of classes
p r e d i c t ("num_classes,%2d\n", urn.len_buckets);
3. Inference
// Release memory; exit
polya_urn_free(&urn);
return 0;
3.1. Probability of a program execution trace
}
To notate the probability of a program execution trace,
we enumerate all N observe statements, and the asso-
Figure 3. A infinite mixture of Gaussians on the real line. Class ciated observed data points y1 , . . . , yN . During a single
assignment variables for each of the 10 data points are drawn fol-
run of the program, some total number N 0 random choices
lowing a Blackwell-MacQueen urn scheme to sequentially sample
x01 , . . . , x0N 0 are made. While N 0 may vary between indi-
from a Dirichlet process.
vidual executions of the program, we require that the num-
ber of observe directive calls N is constant.
2.1. Operating system primitives
The observations yn can appear at any point in the program
Inference proceeds by drawing posterior samples from the source code and define a partition of the random choices
space of program execution traces. We define an execution x01:N 0 into N subsequences x1:N , where each xn contains
trace as the sequence of memory states (the entire virtual all random choices made up to observing yn but excluding
memory address space) that arises during the sequential any random choices prior to observation yn 1 . We can then
step execution of machine instructions. define the probability of any single program execution trace
The algorithms we propose for inference in probabilistic N
Y
programs map directly onto standard computer operating p(y1:N , x1:N ) = g(yn |x1:n )f (xn |x1:n 1) (9)
system constructs, exposed in POSIX-compliant operating n=1
systems including Linux, BSD, and Mac OS X. The cor-
In this manner, any model with a generative process that
nerstone of our approach is POSIX fork (Open Group,
can be written in C code with stochastic choices can be
2004b). When a process forks, it clones itself, creating a
represented in this sequential form in the space of program
new process with an identical copy of the execution state
execution traces.
of the original process, and identical source code; both
processes then continue with normal program execution Each observe statement takes as its argument

S-ar putea să vă placă și