Figure 1: Overview of how current website backends and Noria process frontend reads and writes. (a) Classic database operation with compute on reads. (b) Two-tier stack with demand-filled cache [54, §2]. (c) Noria: stateful data-flow operators pre-compute data for reads incrementally; data-flow change supports new queries.
out care, such races could produce permanently incorrect state, and therefore incorrect cached query results.

The state that Noria keeps is similar to a materialized view, and its data-flow processing is akin to view maintenance [2, 37]. Noria demonstrates that, contrary to conventional wisdom, maintaining materialized views for all application queries is feasible. This is possible because partially-stateful operators can evict rarely-used state, and discard writes for that state, which reduces state size and write load. Noria further avoids redundant computation and state by jointly optimizing its queries to merge overlapping data-flow subgraphs.

Few existing streaming data-flow systems can change their queries and input schemas without downtime. For example, Naiad must re-start to accommodate changes, and Spark’s Structured Streaming must restart from a checkpoint [18]. Noria, by contrast, adapts its data-flow to new queries without interrupting existing clients. It applies changes while retaining existing state and while remaining live for reads throughout. Writes from current clients see sub-second interruptions in the common case.

Noria’s techniques remain compatible with traditional parallel and distributed data-flow, and allow Noria to parallelize and scale fine-grained, partially materialized view maintenance over multiple cores and machines.

In summary, Noria makes four principal contributions:
1. the partially-stateful data-flow model, its correctness invariants, and a conforming system design;
2. automatic merge-and-reuse techniques for data-flow subgraphs in joint data-flows over many queries, which reduce processing cost and state size;
3. near-instantaneous, dynamic transitions for data-flow graphs in response to changes to queries or schema without loss of existing state; and
4. a prototype implementation and an evaluation that demonstrates that practical web applications benefit from Noria’s approach.

Our Noria prototype exposes a backwards-compatible MySQL protocol interface and can serve real web applications with minimal changes, although its benefits increase for Noria-optimized applications. When serving the Lobsters web application on a single Amazon EC2 VM, our prototype outperforms the default MySQL-based backend by 5× while simultaneously simplifying the application (§8.1). For a representative query, our prototype outperforms the widely-used MySQL/memcached stack and the materialized views of a commercial database by 2–10× (§8.2). It also scales the query to millions of writes and tens of millions of reads per second on a cluster of EC2 VMs, outperforming a state-of-the-art data-flow system, differential dataflow [46, 51] (§8.3). Finally, our prototype adapts the data-flow without any perceptible downtime for reads or writes when transitioning the same query to a modified version (§8.5).

Nevertheless, our current prototype has some limitations. It only guarantees eventual consistency; its eviction from partial state is randomized; it is inefficient for sharded queries that require shuffles in the data-flow; and it lacks support for some SQL keywords. We plan to address these limitations in future work.

2 Background

We now explain how current website backends and Noria process data. Figure 1 shows an overview.

Many web applications use a relational database to store and query data (Figure 1a). Page views generate database queries that frequently require complex computation, and the query load tends to be read-heavy. Across one month of traffic data from a HotCRP site and the production deployment of Lobsters [32], 88% to 97% of queries are reads (SELECT queries), and these reads consume 88% of total query execution time in HotCRP. Since read performance is important, application developers often manually optimize it. For example, Lobsters stores individual votes for stories in a votes table, but also stores per-story vote counts as a column in the stories table. This speeds up read queries of vote counts, but “de-normalizes” the schema and complicates vote writes, which must update the derived counts.
Websites often deploy an in-memory key-value cache (like Redis, memcached, or TAO [8]) to speed up common-case read queries (Figure 1b). Such a cache avoids re-evaluating the query when the underlying records are unchanged. However, the application must invalidate or replace cache entries as the records change. This process is error-prone and requires complex application-side logic [37, 48, 57, 64]. For example, developers must carefully avoid performance collapse due to “thundering herds” (viz., many database queries issued just after an invalidation) [54, 57]. Since the cache can return stale records, reads are eventually-consistent.

Some sites use stream-processing systems [13, 39] to maintain results for queries whose re-execution over all past data is infeasible. One major problem for these systems is that they must maintain state at some operators, such as aggregations. To avoid unbounded growth, existing systems “window” this state by limiting it to the most recent records. This makes it difficult for a stream processor to serve the general queries needed for websites, which need to access older as well as recent state. Moreover, stream processors are less flexible than a database that can execute any relational query on its schema: introducing a new query often requires a restart.

Noria, as shown in Figure 1c, combines the best of these worlds. It supports the fast reads of key-value caches, the efficient updates and parallelism of streaming data-flow, and, like a classic database, supports changing queries and base table schemas without downtime.

3 Noria design

Noria is a stateful, dynamic, parallel, and distributed data-flow system designed for the storage, query processing, and caching needs of typical web applications.

3.1 Target applications and deployment

Noria targets read-heavy applications that tolerate eventual consistency. Many web applications fit this model: they accept the eventual consistency imposed by caches that make common-case reads fast [15, 19, 54, 72]. Noria’s current design primarily targets relational operators, rather than the iterative or graph computations that are the focus of other data-flow systems [46, 51], and processes structured records in tabular form [12, 16]. Large blobs (e.g., videos, PDF files) are best stored in external blob stores [7, 24, 50] and referenced by Noria’s records.

Noria runs on one or more multicore servers that communicate with clients and with one another using RPCs. A Noria deployment stores both base tables and derived views. Roughly, base tables contain the data typically stored persistently, and derived views hold data an application might choose to cache. Compared to conventional database use, Noria base tables might be smaller, as Noria derives data that an application may otherwise pre-compute and store in base tables for performance. Views, by contrast, will likely be larger than a typical cache footprint, because Noria derives more data, including some intermediate results. Noria stores base tables persistently on disk, either on one server or sharded across multiple servers, but stores views in server memory. The application’s working set in these views should fit in memory for good performance, but Noria reduces memory use by only materializing records that are actually read, and by evicting infrequently-accessed data.

3.2 Programming interface

Applications interact with Noria via an interface that resembles parameterized SQL queries. The application supplies a Noria program, which registers base tables and views with parameters supplied by the application when it retrieves data. Figure 2 shows an example Noria program for a Lobsters-like news aggregator application (? is a parameter). The Noria program includes base table definitions, internal views used as shorthands in other expressions, and external views that the application later queries. Internally, Noria instantiates a data-flow to continuously process the application’s writes through this program, which in turn maintains the external views.

    /* base tables */
    CREATE TABLE stories
      (id int, author int, title text, url text);
    CREATE TABLE votes (user int, story_id int);
    CREATE TABLE users (id int, username text);
    /* internal view: vote count per story */
    CREATE INTERNAL VIEW VoteCount AS
      SELECT story_id, COUNT(*) AS vcount
      FROM votes GROUP BY story_id;
    /* external view: story details */
    CREATE VIEW StoriesWithVC AS
      SELECT id, author, title, url, vcount
      FROM stories
      JOIN VoteCount ON VoteCount.story_id = stories.id
      WHERE stories.id = ?;

Figure 2: Noria program for a key subset of the Lobsters news aggregator [43] that counts users’ votes for stories.

To retrieve data, the application supplies Noria with an external view identifier (e.g., StoriesWithVC) and one or more sets of parameter values. Noria then responds with the records in the view that match those values.

To modify records in base tables, the application performs insertions, updates, and deletions, similar to a SQL database. Noria applies these changes to the appropriate base tables and updates dependent views.

The application may change its Noria program to add new views, to modify or remove existing views, and to adapt base table schemas. Noria expects such changes to be common and aims to complete them quickly. This contrasts with most previous data-flow systems, which lack support for efficient changes without downtime.
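To make the read and write paths of this interface concrete, the following is a toy, single-threaded model of the Figure 2 program. It is only a sketch under stated assumptions: the class ToyNoria and its method names are hypothetical and are not Noria's API, and the model ignores persistence, concurrency, partial state, and program changes. It shows writes to base tables propagating as +1/−1 deltas into VoteCount and then into the external view StoriesWithVC, which reads consult by key without re-running the query.

```python
from collections import defaultdict

class ToyNoria:
    """Illustrative model of the Figure 2 program; not Noria's real interface."""

    def __init__(self):
        self.stories = {}                   # base table: id -> (author, title, url)
        self.votes = []                     # base table: list of (user, story_id)
        self.vote_count = defaultdict(int)  # internal view VoteCount: story_id -> vcount
        self.stories_with_vc = {}           # external view StoriesWithVC, keyed by id

    def _refresh(self, story_id):
        # Re-derive the joined row for one key after an upstream change.
        story = self.stories.get(story_id)
        vcount = self.vote_count.get(story_id, 0)
        if story is None or vcount == 0:
            # Inner join: without a matching row on both sides there is no result.
            self.stories_with_vc.pop(story_id, None)
        else:
            self.stories_with_vc[story_id] = (story_id, *story, vcount)

    def insert_story(self, story_id, author, title, url):
        self.stories[story_id] = (author, title, url)
        self._refresh(story_id)

    def insert_vote(self, user, story_id):
        self.votes.append((user, story_id))
        self.vote_count[story_id] += 1      # positive delta
        self._refresh(story_id)

    def delete_vote(self, user, story_id):
        self.votes.remove((user, story_id)) # raises ValueError if the vote is absent
        self.vote_count[story_id] -= 1      # negative delta revokes the earlier one
        self._refresh(story_id)

    def read(self, view, key):
        # Reads are lookups into pre-computed view state.
        if view != "StoriesWithVC":
            raise KeyError(view)
        row = self.stories_with_vc.get(key)
        return [row] if row is not None else []

noria = ToyNoria()
noria.insert_story(1, 7, "A title", "http://example.com")
noria.insert_vote(10, 1)
noria.insert_vote(11, 1)
rows = noria.read("StoriesWithVC", 1)
```

In this model the cost of maintaining the view is paid on each write, so a read is a dictionary lookup; the real system additionally spreads this work over a data-flow graph of operators.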
In addition to its native SQL-based query interface, Noria provides an implementation of the MySQL binary protocol, which allows existing applications that use prepared statements against a MySQL database to interact with Noria without further changes. The adapter turns ad-hoc queries and prepared SQL statements into writes to base tables, reads from external views, and incrementally effects Noria program changes. Noria supports much, but not all, SQL syntax. We discuss the experience of building and porting applications in §7.

3.3 Data-flow execution

Noria’s data-flow is a directed acyclic graph of relational operators such as aggregations, joins, and filters. Base tables are the roots of this graph, and external views form the leaves. Noria extends the graph with new base tables, operators, and views as the application adds new queries.

When an application write arrives, Noria applies it to a durable base table and injects it into the data-flow as an update. Operators process the update and emit derived updates to their children; eventually updates reach and modify the external views. Updates are deltas [46, 60] that can add to, modify, and remove from downstream state. For example, a count operator emits deltas that indicate how the count for a key has changed; a join may emit an update that installs new rows in downstream state; and a deletion from a base table generates a “negative” update that revokes derived records. Negative updates remove entries when Noria applies them to state, and retain their negative “sign” when combined with other records (e.g., through joins). Negative updates hold exactly the same values as the positives they revoke and thus follow the same data-flow paths.

Noria supports stateless and stateful operators. Stateless operators, such as filters and projections, need no context to process updates; stateful operators, such as count, min/max, and top-k, maintain state to avoid inefficient re-computation of aggregate values from scratch. Stateful operators, like external views, keep one or more indexes to speed up operation. Noria adds indexes based on indexing obligations imposed by operator semantics; for example, an operator that aggregates votes by user ID requires a user ID index to process new votes efficiently.

In most stream processors, join operators keep a windowed cache of their inputs [3, 76], allowing an update arriving at one input to join with all relevant state from the other. In Noria, joins instead perform upqueries, which are requests for matching records from stateful ancestors (Figure 3): when an update arrives at one join input, the join looks up the relevant state by querying its other inputs. This reduces Noria’s space overhead, since joins often need not store duplicate state, but requires care in the presence of concurrent updates, an issue further discussed in §4. Upqueries also impose indexing obligations that Noria detects and satisfies.

Figure 3: Noria’s data-flow operators can query into upstream state: a join issues an upquery (I) to retrieve a record from upstream state to produce a join result (II). Diagram labels: (1) incoming record triggers upquery; (2) upquery into upstream state; (3) upquery response at join.

3.4 Consistency semantics

To achieve high parallel processing performance, Noria’s data-flow avoids global progress tracking or coordination. An update injected by a base table takes time to propagate through the data-flow, and the update may appear in different views at different times. Noria operators and the contents of its external views are eventually-consistent. Eventual consistency is attractive for performance and scalability, and is sufficient for many web applications [15, 54, 72].

Noria does ensure that if writes quiesce, all external views eventually hold results that are the same as if the queries had been executed directly against the base table data. Making this work correctly requires some care. Like most data-flow systems, Noria requires that operators are deterministic functions over their own state and the inputs from their ancestors. In addition, Noria must avoid races between updates and upqueries; avoid re-ordering updates on the same data-flow path; and resolve races between related updates that arrive independently at multi-ancestor operators via different data-flow paths. Consider an OR that combines filters using a union operator, or a join between data-flow paths connected to the same base table: such operators’ final output (and state) must be commutative over the order in which updates arrive at their inputs. The standard relational operators Noria supports have this property.

Web applications sometimes rely on database transactions, e.g., to atomically update pre-computed values. Noria’s approach is compatible with basic, optimistically-concurrent multi-statement transactions, but Noria also often obviates the need for them. For example, Lobsters uses transactions only to avoid write-write conflicts on vote counts and stories’ “hotness” scores. A multi-statement transaction is required only because baseline Lobsters pre-computes hotness for performance. Noria instead computes hotness in the data-flow, which avoids write-write conflicts without a transaction, albeit at the cost of eventual consistency for reads. We
omit further discussion of transactions with Noria in this paper; we plan to describe them in future work.

3.5 Challenges

An efficient Noria design faces two key challenges: first, it must limit the size of its state and views (§4); and second, changes to the Noria program must adapt the data-flow without downtime in serving clients (§5).

4 Partially-stateful data-flow

Noria must limit the size of its views, as the state for an application with many queries could exceed available memory and become too expensive to maintain.

The partially-stateful data-flow model lets operators maintain only a subset of their state. This concept of partial materialization is well-known for materialized views in databases [79, 80], but novel to data-flow systems. Partial state reduces memory use, allows eviction of rarely-used state, and relieves operators from maintaining state that is never read. Partially-stateful data-flow generalizes beyond Noria, but we highlight specific design choices that help Noria achieve its goals.

Partial state introduces new data-flow messages to Noria. Eviction notices flow forward along the update data-flow path; they indicate that some state entries will no longer be updated. Operators drop updates that would affect these evicted state entries without further processing or forwarding. When Noria needs to read from evicted state (for instance, when the application reads state evicted from an external view), Noria re-computes that state. This process sends recursive upqueries to the relevant ancestors in the graph (Figure 4). An ancestor that handles such an upquery computes the desired value (possibly after sending its own upqueries), then forwards a response that follows the data-flow path to the querying operator. When the upquery response eventually arrives, Noria uses it to populate the evicted entry. After the evicted entry has been filled, subsequent updates through the data-flow keep it up-to-date until it is evicted again.

Figure 4: A partially-stateful view sends a recursive upquery to derive evicted state (⊥) for key k from upstream state (I); the response fills the missing state (II). Diagram labels: (1) read misses; (2) recursive upquery misses, recurses; (3) recursive upquery hits; (4) upquery response fills missing record.

For correctness, upqueries must produce eventually-consistent results. For performance, Noria should continue to process updates, including updates to the waiting operator, while (possibly slow) upqueries are in flight. These requirements complicate the design.

4.1 Data-flow model and invariants

We first describe high-level correctness invariants of Noria’s partially-stateful data-flow. These invariants ensure that Noria remains eventually-consistent and never returns results contaminated by duplicate, missing, or spurious updates. Since Noria allows operators to execute in parallel to take advantage of multicore processors, these invariants must hold in the presence of concurrent updates and eviction notices. The invariants concern state entries, where a state entry models one record in one operator or view. Data-flow implementations derive state entry values from input records, possibly after multiple steps. For ease of expression, we model a state entry as the multiset of input records that produced that entry’s value. Noria’s eventual consistency requires that each state entry’s contents approach the ideal set of input records that would produce the most up-to-date value.

Given some state entry e, we define:
• Te is the set of all input records received so far that, in a correct implementation of the data-flow graph, would be used to compute e.
• Se is either the multiset of input records actually used to compute e, or ⊥, which represents an evicted entry. We use a multiset so the model can represent potential bugs such as duplicate updates.
• De is the set of key-descendant entries of e. These are entries of operators downstream of e in the data-flow that depend on e through key lookup.

Te and Se are time-dependent, whereas the dependencies represented in De can be determined from the data-flow graph. If e is the VoteCount entry for some story in Figure 5, then Te contains all input votes ever received for that story; Se contains the updates represented in its vcount; and De includes its StoriesWithVC entry.

Figure 5: Definitions for partial state entry e (yellow) in VoteCount: an in-flight update from votes (blue) is in Te, but not yet in Se; the entry in StoriesWithVC is key-descendant from e via story_id (green). In the example shown, Te = {(u1, 1), (u3, 1)}, Se = {(u1, 1)}, and De = {(1, u1, b, 2)}.

Correctness of partially-stateful data-flow relies on ensuring these invariants:
1. Update completeness: if Se ≠ ⊥, then either all updates in Te − Se are in flight toward e, or an eviction notice for e is in flight toward e.
2. No spurious or duplicate updates: Se ⊆ Te.
3. Descendant eviction: if Se = ⊥, then for all d ∈ De, either Sd = ⊥, or an eviction notice for d is in flight toward d’s operator.
4. Eventual consistency: if Te stops growing, then eventually either Se = Te or Se = ⊥.

We now explain the mechanisms that Noria uses to realize this data-flow model and maintain the invariants.

4.2 Update ordering

Noria uses update ordering to ensure eventual consistency without global data-flow coordination. Each operator totally orders all updates and upquery requests it receives for an entry; and, critically, the downstream data-flow ensures that all updates and upquery responses from that entry are processed by all consumers in that order. Thus, if the operator orders update u1 before u2, then every downstream consumer likewise processes updates derived from u1 before those derived from u2. Noria data-flows can split and merge (e.g., at joins), but update ordering and operator commutativity ensure that the eventual result is correct independent of processing order.

4.3 Join upqueries

Join operators use upqueries (§3.3): when an update arrives at one input, the join upqueries its other input for the corresponding records, and combines them with the update. Join upqueries reach the next upstream stateful operator, which computes a snapshot of the requested state entry and forwards it along the data-flow to the querying join. Intermediate operators process the response as appropriate. Unlike normal updates, upquery responses follow the single path back to the querying operator without forking. Upquery responses also commute neither with each other nor with previous updates. This introduces a problem for join update processing, since every such update requires an upquery that produces non-commutative results, yet must produce an update that does commute.

Noria achieves this by ensuring that no updates are in flight between the upstream stateful operator and the join when a join upquery occurs. To do so, Noria limits the scope of each join upquery to an operator chain processed by a single thread. Noria executes updates on other operator chains in parallel with join upqueries.

This introduces a trade-off between parallelism and state duplication: join processing must stay within a single operator chain, so copies of upstream state may be required in each operator chain that contains a join.

4.4 Eviction and recursive upqueries

Evicted state introduces new challenges for Noria’s data-flow. If the application requests evicted state, Noria must use recursive upqueries to fill it in. Moreover, operators now encounter evicted state when they handle updates. These factors influence the Noria design in several ways.

First and simplest, Noria operators drop updates that encounter evicted entries. This reduces the time spent processing updates downstream, but necessitates the descendant eviction invariant: operators downstream of an evicted entry never see updates for that entry, so they must evict their own dependent entries lest they remain permanently out of date.

Second, recursive upqueries now occasionally cascade up in the data-flow until they encounter the necessary state; in the worst case, up to base tables. Responses then flow forward to the querying operator. Upquery results are snapshots of operator state, and do not commute with updates. For unbranched chains, update ordering (§4.2) and the fact that updates to evicted state are dropped ensure that the requested upquery response is processed before any update for the evicted state.

Recursive upqueries of branching subgraphs, such as joins, are more complex. A join operator must emit a single correct response for each upquery it receives, even if it must make one or more recursive upqueries of its own to produce the needed state. Combining the upqueries’ results directly would be incorrect: those upqueries execute independently, and updates can arrive between their responses. Joins thus issue recursive upqueries, but compute the final result exclusively with join upqueries once the recursive upqueries complete (multiple rounds of recursive upqueries may be required). These join upqueries execute within a single operator chain and exclude concurrent updates. Noria supports other branching operators, such as unions, which obey the same rules as joins.

Finally, a join upquery performed during update processing may encounter evicted state. In this case, Noria chooses to drop the update and evict dependent entries downstream; Noria statically analyzes the graph to compute the required eviction notices. There is a trade-off here: computing the missing entry could avoid future upqueries. Noria chooses to evict to avoid blocking the write path while filling in the missing state.

Such evictions are rare, but they can occur. For example, imagine a version of Figure 2 that adds AuthorVotes, which aggregates VoteCount by stories.author, and the following system state:
• stories[id=1] has author=Elena.
• VoteCount[story_id=1] has vcount=8.
• AuthorVotes[author=Elena] has vcount=8.
• stories[id=2] has author=Bob.
• VoteCount[story_id=2] is evicted.
Now imagine that an update changes story 2’s author to Elena. When this update arrives at the join for AuthorVotes, that join operator upqueries for VoteCount[story_id=2], which is evicted. As a result,
Noria sends an eviction notice for Elena, whose number of votes has changed, to AuthorVotes.

4.5 Partial and full state

Noria makes state partial whenever it can service upqueries using efficient index lookups. If Noria would have to scan the full state of an upstream operator to satisfy upqueries, Noria disables partial state for that operator. This may happen because every downstream record depends on all upstream ones; consider, e.g., the top 20 stories by vote count. In addition, the descendant eviction invariant implies that partial-state operators cannot have full-state descendants.

Partial-state operators in Noria start out fully evicted and are gradually and lazily populated by upqueries. As we show next, this choice has important consequences for Noria’s ability to transition the data-flow efficiently.

5 Dynamic data-flow

Application queries evolve over time, so Noria’s dynamic data-flow represents a continuously-changing set of SQL expressions. Existing data-flow systems run separate data-flows for each expression, initialize new operators with empty state and reflect only new writes, or require restarting from a checkpoint. Changes to the Noria program instead adapt the data-flow dynamically.

Given new or removed expressions, Noria transitions the data-flow to reflect the changes. Noria first plans the transition, reusing operators and state of existing expressions where possible (§5.1). It then incrementally applies these changes to the data-flow, taking care to maintain its correctness invariants (§5.2). Once both steps complete, the application can use new tables and queries.

The key challenges for transitions are to avoid unnecessary state duplication and to continue processing reads and writes throughout. Operator reuse and partial state help Noria address these challenges.

5.1 Determining data-flow changes

To initiate a transition, the application provides Noria with sets of added and removed expressions. Noria then computes required changes to the currently-running data-flow. This process resembles traditional database query planning, but produces a long-term joint data-flow across all expressions in the Noria program. This allows Noria to reuse existing operators for efficiency: if two queries include the same join, the data-flow contains it only once.

To plan a transition, Noria first translates each new expression into an extended query graph [21]. The query graph contains a node for each table or view in the expression, and an edge for every join or group-by clause. Noria uses query graphs to inexpensively reject many expressions from consideration [21, §3.4, 78, §3] and to quickly establish a set of sharing candidates for each new expression. The sharing candidates are existing expressions that likely overlap with the new expression.

Next, Noria generates a verbose intermediate representation (IR), which splits the new expression into more fine-grained operators. This simplifies common subexpression detection, and allows Noria to efficiently merge the new IR with the cached IR of the sharing candidates.

For each sharing candidate, Noria reorders joins in the new IR to match the candidate when possible to maximize re-use opportunities. It then traverses the candidate’s IR in topological order from the base tables. For each operator, Noria searches for a matching operator (or clique of operators) in the new IR. A match represents a reusable subexpression, and Noria splices the two IRs together at the deepest matches.

This process continues until Noria has considered all identified reuse candidates, producing a final, merged IR.

5.2 Data-flow transition

The combined final IRs of all current expressions represent the transition’s target data-flow. Noria must add any operator in the final IR that does not already exist in the data-flow. To do so, Noria first informs existing operators of index obligations (§3.3) incurred by new operators that they must construct indexes for. Noria then walks the target data-flow in topological order and inserts each new operator into the running data-flow and bootstraps its state. Finally, after installing new operators and deleting removed queries’ external views, Noria removes obsolete operators and state from the data-flow.

Bootstrapping operator state. When Noria adds a new stateful operator, it must ensure that the operator starts with the correct state. Partially-stateful operators and views start processing immediately. They are initially empty and bootstrap via upqueries in response to application reads during normal operation, amortizing the bootstrapping work over time. Fully-stateful operators are initially marked as “inactive”, which causes them to ignore all incoming updates. Noria then executes a special, large upquery for all keys on behalf of the fully-stateful operator. Once the last upquery response has arrived, Noria activates the operator for update processing and moves on to the next new operator.

Base table changes. As applications evolve, developers often add or remove base table columns [17]. This affects existing operators in the data-flow: new updates from the base table may now lack values that existing operators expect. Noria could rebuild the data-flow or transform the existing base table state to effect such a change, but this would be inefficient for large base tables. Instead, Noria base tables internally track all columns that have existed in the table’s schema, including those that have been deleted. When a base table processes an application write, it automatically injects default values for missing
columns (but does not store them). This permits queries for different base table schemas to coexist in the data-flow graph, and makes most base table changes cheap.

6 Implementation

Our Noria prototype implementation consists of 45k lines of Rust and can operate both on a single server and across a cluster of servers. Applications interface with Noria either through native Rust bindings, using JSON over HTTP, or through a MySQL protocol adapter.

6.1 Persistent data storage

Noria persists base tables in RocksDB [66], a high-performance key-value store based on log-structured merge (LSM) trees. Batches of application updates are synchronously flushed into RocksDB’s log before Noria acknowledges them and admits them into the data-flow; a background thread asynchronously merges log entries into the LSM trees. Each base table index forms a RocksDB “column family”. For base tables with non-unique indexes, Noria uses RocksDB’s ordered iterators to efficiently retrieve all rows for an index key [14, 67].

Persistence reduces Noria’s write throughput by about 5% over in-memory base tables. Reads are not greatly impacted when an application’s working set fits in memory: only occasional upqueries access RocksDB, and these add < 1 ms of additional latency on a fast SSD.

6.2 Parallel processing

Noria shards the data-flow and allows concurrent reads and writes with minimal synchronization for parallelism.

Sharding. Noria processes updates in parallel on a cluster by hash-partitioning each operator on a key and assigning shards to different servers. Each machine runs a Noria instance, a process that contains a complete copy of the data-flow graph, but holds state only for its shards of each operator. When an operator with one hash partitioning links to an operator with a different partitioning, Noria inserts “shuffle” operators that perform inter-shard transfers over TCP connections. Upqueries across shuffle operators are expensive since they must contact all ancestor shards. This limits scalability, but allows operators below a shuffle to maintain partial state.

Multicore parallelism. Noria achieves multicore parallelism within each server in two ways: a server can handle multiple shards by running multiple Noria instances, and each instance runs multiple threads to process its shard. Each instance has two thread pools: data-flow workers process updates within the data-flow graph, and read handlers handle reads from external views.

At most one data-flow worker executes updates for each data-flow operator at a time. This arrangement yields CPU parallelism among different operators, and also allows lock-free processing within each operator.

There are typically fewer data-flow workers than operators in the data-flow graph, so Noria multiplexes operator work across the worker threads. Within one instance, Noria schedules chains of operators with the same key as a unit. This reduces queueing and inter-core data movement at operator boundaries. It also allows Noria to optimize some upqueries: an upquery within a chain can simply access the ancestor’s data synchronously, without worry of contamination from in-flight updates (§4.3).

Read handlers process clients’ RPCs to read from external views. They must access the view with low latency and high concurrency, even while a data-flow worker is applying updates to the view. To minimize synchronization, Noria uses double-buffered hash tables for external views [27]: the data-flow worker updates one table while read handlers read the other, and an atomic pointer swap exposes new writes. This trades space and timeliness for performance: with skewed key popularity distributions, it can improve read throughput by 10× over a single-buffered hash table with bucket-level locks.

6.3 Distributed operation

A Noria controller process manages distributed instances on a cluster of servers, and informs them of changes to the data-flow graph and of shard assignments. Noria elects the controller and persists its state via ZooKeeper [34]. Clients discover the controller via ZooKeeper, and obtain long-lived read and write handles to send requests directly to instances.

Noria handles failures by rebuilding the data-flow. If the controller fails, Noria elects a new controller that restores the data-flow graph. It then streams the persistent base table data from RocksDB to rebuild fully-stateful operators and views. Partial operators are instead populated through on-demand upqueries. If individual instances fail, Noria rebuilds only the affected operators.

6.4 MySQL adapter

Our prototype includes an implementation of the MySQL binary protocol in a dedicated stateless adapter that appears as a standard MySQL server to the application. This adapter allows developers to easily run existing applications on Noria. The adapter transparently translates prepared statements and ad-hoc queries into transitions on Noria’s data-flow, and applies reads and writes using Noria’s API behind the scenes. Its SQL support is sufficiently complete to run some unmodified web applications (e.g., JConf [74] written in Django [22]), and to run Lobsters with minimal syntax adaptation.

6.5 Limitations

Our current prototype has some limitations that we plan to address in future work; none of them are fundamental. First, it only shards by hash partitioning on a single column, and resharding requires sending updates through
a single instance, which limits scalability. Second, it re-computes data-flow state on failure; recovering from snapshots or data-flow replicas would be more efficient (e.g., using selective rollback [35]). And third, it does not currently support range indexes or multi-column joins.

7 Applications

This section discusses our experiences with developing Noria applications. Noria aims to simplify the development of high-performance web applications; several aspects of our implementation help it achieve that goal.

First, applications written for a MySQL database can use Noria directly via its MySQL adapter, provided they generate parameterized SQL queries (for instance, via libraries like PHP Data Objects [69] or Python’s MySQL connector [55, §10.6.8]). Porting typically proceeds in three steps. First, the developer points the application at the Noria MySQL adapter instead of a MySQL server and imports existing data into Noria from database dumps. The application will immediately see performance improvements for read queries that formerly ran substantial in-line compute. Though the MySQL adapter even supports ad-hoc read queries (it transitions the data-flow as required to support each query), the most benefit will be seen for frequently-reused queries. Second, the developer creates views for computations that the MySQL application manually materialized, such as the per-story vote count in Lobsters. These views co-exist with the manual materializations, and allow existing queries to continue to work as the developer updates the write path so that it no longer manually updates derived views and caches. Third, the developer incrementally rewrites their application to rely on natural views and remove manual write optimizations. These changes gradually increase application performance as the developer removes now-unnecessary complexity from the application’s read and write paths.

The porting process is not burdensome. We ported a PHP web application for college room ballots (developed by one of the authors and used in production for a decade) to Noria; the process took two evenings, and required changes to four queries. We also used the MySQL adapter to port the Lobsters application’s queries to Noria; the result is a focus of our evaluation.

Developing native Noria applications can be even easier. We developed a simple web application to show the results of our continuous integration (CI) tests for Noria. The CI system stores its results in Noria, and the web application displays performance results and aggregate statistics. Since we developed directly for Noria, we were not tempted to cache intermediate results or apply other manual optimizations, and could use aggregations and joins in queries without fear that performance would suffer as a result (e.g., due to aggregations over the long commit history). Most application updates reduced to single-table inserts, deletes, or updates.

Limitations. Though applications traditionally use parameterized queries to avoid SQL injection attacks and cache query plans, Noria parameterized queries also build materialized views. An application with many distinct parameterized queries can thus end up with more views than necessary. The developer can correct this by adding shared views. Our prototype does not yet support update and delete operations conditioned on non-primary key columns, and lacks support for parameterized range queries (e.g., age > ?), which some applications need. Planned support for range indexes and an extended base table implementation will address these limitations.

8 Evaluation

We evaluated our Noria prototype using backend workloads generated from the production Lobsters web application, as well as using individual queries. Our experiments seek to answer the following questions:

1. What performance gains does Noria deliver for a typical database-backed web application? (§8.1)
2. How does Noria perform compared to a MySQL/memcached stack, the materialized views of a commercial database, and an idealized cache-only deployment? (§8.2)
3. Given a scalable workload, how does our prototype utilize multiple servers, and how does it compare to a state-of-the-art data-flow system? (§8.3)
4. What space overhead does Noria’s data-flow state impose, and how does Noria perform with limited memory and partial state? (§8.4)
5. Can Noria data-flows adapt to new queries and input schema changes without downtime? (§8.5)

Setup. In all experiments, Noria and other storage backends run on an Amazon EC2 c5.4xlarge instance with 16 vCPUs; clients run on separate c5.4xlarge instances unless stated otherwise. Our setup is “partially open-loop”: clients generate load according to a Poisson distribution of interarrival-times and have a limited number of backend requests outstanding, queueing additional requests. This ensures that clients maintain the measurement frequency even during periods of high latency [45]. Our test harness measures offered request throughput and “sojourn time” [62], which is the delay from request generation until a response returns from the backend.

8.1 Application performance: Lobsters

We first evaluate Noria’s performance on a realistic web application workload to answer two questions:

1. Do Noria’s fast reads help it outperform a conventional database on a real application workload, even on a hand-optimized application?
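The “partially open-loop” client described under Setup can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper’s test harness: the function name `simulate_sojourn_times`, its parameters, and the fixed per-request service time are all hypothetical. Requests are generated with exponentially distributed gaps (which yields Poisson arrivals), at most `max_outstanding` requests are in flight at once, later requests wait in FIFO order, and sojourn time is measured from request generation to response.

```python
import heapq
import random


def simulate_sojourn_times(rate, service_time, max_outstanding, num_requests, seed=0):
    """Simulate a partially open-loop client against a toy backend.

    Assumptions (hypothetical, for illustration only): the backend takes a
    fixed `service_time` per request, and queued requests are issued in
    FIFO order as soon as an outstanding-request slot frees up.
    """
    rng = random.Random(seed)
    # Min-heap holding the time at which each outstanding-request slot is free.
    free_at = [0.0] * max_outstanding
    heapq.heapify(free_at)

    generated_at = 0.0
    sojourn_times = []
    for _ in range(num_requests):
        # Exponential gaps between requests give Poisson arrivals.
        generated_at += rng.expovariate(rate)
        # The request is issued once it exists AND a slot is available.
        slot_free = heapq.heappop(free_at)
        issued_at = max(generated_at, slot_free)
        done_at = issued_at + service_time
        heapq.heappush(free_at, done_at)
        # Sojourn time includes any time spent queued in the client.
        sojourn_times.append(done_at - generated_at)
    return sojourn_times
```

Because generation times do not depend on backend latency, a slow backend shows up as growing sojourn times rather than as a reduced request rate, which is the property the Setup paragraph relies on.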
The baseline queries manually pre-compute aggregates. MariaDB requires this for performance: without the pre-computation, it supports just 20 pages/sec. Noria instead maintains pre-computed aggregates in its data-

[Plot legend: MariaDB, baseline queries; Noria, baseline queries; Noria, natural queries.]

[Figure 7 panels: latency [ms] against offered load [requests/sec], 0 to 14M.]
(a) Read-heavy workload (95%/5%): Noria outperforms all other systems (all but memcached at 100–200k requests/sec).
(b) Mixed read-write workload (50%/50%): Noria outperforms all systems but memcached (others are at 20k requests/sec).
Figure 7: A Lobsters subset (Figure 2) benchmarked on Noria, hand-optimized MariaDB, System Z’s materialized views, a MariaDB/memcached setup, and on memcached only, all with Zipf-distributed (s = 1.08) reads and votes.

[Plot legend: MariaDB (hand-opt.), System Z, MariaDB+memcached, memcached-only, Noria (4 shards).]

heavy workload (50% writes) well (Figure 7b): although absolute performance has dropped, Noria still outperforms all other systems apart from the cache-only setup. This is because sharding allows data-parallel write processing, which helps Noria scale to 2M requests/second.
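To make the link between sharding and data-parallel write processing concrete, the following is a minimal sketch of hash partitioning on a single key column, using a toy per-story vote count. It is not Noria’s implementation: `shard_for`, `ShardedVoteCount`, and the choice of SHA-256 as the hash are assumptions made for illustration. The point it shows is that a batch of writes splits into per-shard sub-batches that touch disjoint state, so each sub-batch could be applied by a different worker or server without coordination.

```python
import hashlib
from collections import defaultdict


def shard_for(key, num_shards):
    """Map a key to a shard index with a hash that is stable across processes."""
    digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedVoteCount:
    """Toy vote-count state, hash-partitioned by story id (hypothetical)."""

    def __init__(self, num_shards):
        self.num_shards = num_shards
        # One independent state table per shard.
        self.shards = [dict() for _ in range(num_shards)]

    def apply_batch(self, votes):
        """Apply (story_id, delta) updates, grouped by destination shard."""
        by_shard = defaultdict(list)
        for story_id, delta in votes:
            by_shard[shard_for(story_id, self.num_shards)].append((story_id, delta))
        # Sub-batches touch disjoint state; here they run sequentially, but
        # nothing prevents applying them in parallel on separate workers.
        for shard_id, sub_batch in by_shard.items():
            state = self.shards[shard_id]
            for story_id, delta in sub_batch:
                state[story_id] = state.get(story_id, 0) + delta

    def read(self, story_id):
        """Read a count by consulting only the shard that owns the key."""
        shard = self.shards[shard_for(story_id, self.num_shards)]
        return shard.get(story_id, 0)
```

A read or write for one key contacts exactly one shard, which is why single-key operations scale with the number of shards; operations that need a different partitioning key would require the inter-shard shuffles described in §6.2.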