Sunteți pe pagina 1din 15

24/02/2020 The Oracle Problem - Victor Hogrefe - Medium

The Oracle Problem


Why decentralizing everything is more di cult than it sounds

Victor Hogrefe
Sep 8, 2018 · 17 min read

More and more companies, financial institutions, start-ups and news outlets have been
making big claims about blockchain technology and its future applications. Large
institutions want to be “with it” and tick their innovation checkboxes by setting up
internal blockchain teams, or by partnering with crypto companies like Ripple, etc.
Business savvy entrepreneurs are vying for the attention of investors and ICO money by
claiming to solve every problem imaginable with blockchains.

Many of these efforts fail to grasp two fundamental principles:

1) Blockchains are very inefficient databases, that only lend themselves to specific
industries like financial products, insurance markets, gambling, collectables, etc.

2) Blockchains exist in virtual space, and are only tenuously connected to the physical
world

https://medium.com/@victorhogrefe/the-oracle-problem-5611bed763ba 1/15
24/02/2020 The Oracle Problem - Victor Hogrefe - Medium

In this paper, we will focus on the second principle, since the first has been covered
many times and hardly needs repeating.

The Chasm

A fact that few people appreciate is the vast divide between the virtual and physical
world. That is, for information about the physical world to enter the digital space,
someone or something has to supply it first. No algorithm can come up with the
temperature in Paris today, or the weight of a blue whale, without that information
being entered into the digital space by a person or machine. Conceptually, one can
think of the digital and physical as two distinct entities that are only loosely connected
by a system of links, bridging the gap between worlds.

Oracles are essentially algorithms that attempt to learn some fact, or access some piece
of information from elsewhere. In the context of this paper, oracles refer to programs or
smart contracts that obtain information from outside the blockchain, and feed it into
the blockchain. In the case of decentralized flight insurance, for instance, the smart
contract needs to know whether a given flight was actually on time or not. However,
since the blockchain platform is decentralized, with no ultimate authority, who is to say
which information is true, or where to get information from.
https://medium.com/@victorhogrefe/the-oracle-problem-5611bed763ba 2/15
24/02/2020 The Oracle Problem - Victor Hogrefe - Medium

Especially in times of universal uncertainty and mistrust in the media, we need to be


extremely careful how to obtain information, and how to verify that information,
without resorting ultimately to trust in any one person or source. This problem grows
with the stakes, which, in the insurance industry, or prediction markets can be huge.
The question then becomes how to create an oracle that is reliable, and resistant to
tampering or attempts to game it in order to extract some advantage or spread
misinformation.

Simple Oracles

The easiest way to find out the current temperature in Paris, the outcome of an election,
the price of an asset, or the status of a flight, is to consult a trusted source. For instance,
a smart contract could make an API call to a meteorological website, or scour the New
York Times for election result headlines, etc.

The problem with this approach is that a central source is required. While this may be
relatively uncontentious for some topics, such as election results, it can be much more
difficult for others, where the truth of the matter is not easily verifiable, or where a lot
of money is at stake. Prediction markets are notoriously hampered by corruption and
would therefore not benefit from centralized oracles. Concerns are not limited to the
trustworthiness of the source, but also to the security of the transfer of information.
Since a single source presents a clear target, it can be attacked more easily. A fake
source might pose as the real deal, or a malicious party may bribe or corrupt the
information, or launch a denial of service attack on the source.

There are several examples of bad news sources causing havoc in the stock and crypto
markets, and increasingly the resulting chaos is due to algorithmic trading bots that
cannot distinguish real from fake news based on their use of simple oracles. In 2013 a
hacker infiltrated the Associated Press Twitter account and tweeted that President
Obama was injured in an explosion at the White House. Even though it was quickly
deleted, this single tweet caused a $130 billion loss in the stock market. Similarly, in
2017, Ethereum lost $4 billion because of a fake news story about Vitalik Buterin’s
death in a car accident, showing that faulty sources can be very expensive.

A way of avoiding these problems is to consult several sources and trust that the median
result is correct. If, for example, 50 news sites are monitored for election headlines, it is
unlikely that the majority of them will run with a false story, be misinformed, or
hacked. Another solution is to enlist services such as Oraclize and RealityKeys to

https://medium.com/@victorhogrefe/the-oracle-problem-5611bed763ba 3/15
24/02/2020 The Oracle Problem - Victor Hogrefe - Medium

provide DApps with API data, and other real-world facts about flights times, sports
results and exchange rates, using cryptographic authenticity proofs.

Schelling Points

In the 1960s Thomas Schelling proposed a game-theoretic concept he called a focal


point. Focal points are relevant in coordination games where the participants are
unable to communicate. A famous example is a story of four university students who
skipped out on a Monday class to spend an additional day skiing. Unfortunately, they
missed an exam during this time, and subsequently concocted a story about how they
attempted to attend the exam, when one of the tires blew, rendering them unable to
make it in time. Since they were good students the professor gave them a chance, and
let them retake the exam. On the day of the test, they were seated in separate rooms
and the exam consisted of only one question: Which tire blew?

Since all tires are roughly equally likely to fail, all four answers are equilibrium
solutions, which puts the students in a very difficult spot, because in truth, the tire
failure was a made-up story. Criminal investigators have been using these and other
techniques for decades, separating the suspects and asking detailed questions about
their accounts. The idea is that there is simply no way to create a lie consistent enough
to survive scrutiny if no communication between subjects is allowed.

What Schelling noticed is that there seem to be certain focal points dependant on
context which people will gravitate towards. In the case of tire failure, the front right
tire will be chosen more often than the others and is thus the “correct” choice. Another
example is that of two strangers trying to meet up in a given city, without knowing
exactly when and where, and without the ability to coordinate. Most people familiar
with New York tend to suggest 12 noon, in front of Grand Central Terminal, while most
Tokyo denizens will likely choose the Hachiko statue in front of Shibuya Station. While
technically, all times and locations are equilibrium solutions regarding meeting places,
these special places stand out as focal points.

Beauty Contests

Keynes proposed an experiment in the 1930s that let newspaper readers choose the six
most attractive faces in a lineup of 100 pictures. The winners of the contest would be
those that chose the faces that most other readers also believed to be the most
attractive. While a simple strategy is for a reader to pick the faces she finds most
appealing, a better strategy might be to disregard her own feelings, and focus on what
https://medium.com/@victorhogrefe/the-oracle-problem-5611bed763ba 4/15
24/02/2020 The Oracle Problem - Victor Hogrefe - Medium

she thinks everyone else will think. Thus, perhaps the faces with the most stand-out
features could be focal points, or maybe readers will argue that blonde hair is typically
considered most attractive. The observation is that there can be significant differences
in outcome when participants chose based on what they think other participants will
chose.

Several experiments have been conducted to show such differences. For instance, when
asked to choose between a puppy, a kitten and several other baby animals based on
cuteness, participants mostly decided on the kitten and puppy. When asked to chose the
animal they thought most other participants would choose, the kitten overwhelmingly
won, because most people are convinced that other people prefer kittens over other
baby animals.

The German newspaper Spectrum der Wissenschaft gave out a $1000 prize in 1997 to
the person who could guess a number between 0 and 100 that was closest to 2/3 of the
average of all numbers submitted. From a game theory point of view, the equilibrium
solution to this game is 0 for the following reasons.

If every participant chooses more or less random numbers, we should expect the
average of all numbers to be 50. Thus, two thirds of 50 is 33, and we should choose 33 if
we want to win. This is a choice of the first degree.

https://medium.com/@victorhogrefe/the-oracle-problem-5611bed763ba 5/15
24/02/2020 The Oracle Problem - Victor Hogrefe - Medium

However, once we consider that everyone else is thinking along these lines, the actual
average will be around 33, which means that the winning number will be 22. This is a
choice of the second degree.

If everyone takes a second-degree choice into account, the third-degree choice will be
14. This repeats until everyone should rationally choose 0. In reality, the average
person is not game theory savvy enough to fully appreciate the recursive structure of
this scenario, and therefore will likely choose numbers at one of these focal points. If
we assume the fallibility of people’s reasoning, we can predict the resulting distribution
to be an amalgamation of nth degree choices. There are sophisticated ways of doing
this, and behavioural game theory has come far in determining how many recursive
steps people tend to take. One can model the likelihood of the nth step by a poisson
distribution dependant on experimental data, for instance. However, we can achieve a
rough picture by simply overlaying the frequencies of nth degree choices, as follows.

https://medium.com/@victorhogrefe/the-oracle-problem-5611bed763ba 6/15
24/02/2020 The Oracle Problem - Victor Hogrefe - Medium

In fact, when the German publication disclosed both its own findings, as well as those
of an identical Financial Times experiment, we see that our merged frequency
prediction is not too bad. Of course, several participants chose more or less random
numbers, or considered 99 and 100 to be of particular interest, but a large number of
choices center around the aforementioned focal points of various degrees of choice.

The Truth as a Focal Point

As discussed above, we can see that people tend towards various focal points,
depending on the context of the situation. Instead of posing a mathematical question,
we can turn our attention to practical matters, and incentivize those who are closest to
the median of overall choices.

https://medium.com/@victorhogrefe/the-oracle-problem-5611bed763ba 7/15
24/02/2020 The Oracle Problem - Victor Hogrefe - Medium

Let us assume that someone has placed a large bet on the USD/Bitcoin exchange rate
exceeding 1 million by the end of 2019, and placed their funds in a smart contract,
which will pay him out 100 to 1, if the price of Bitcoin actually reaches or exceeds that
amount. An oracle is necessary to determine if and when this happens, such that the
smart contract can act. Since the price of Bitcoin is a large prediction market, let us
assume that the distribution of votes is roughly normal (this is not a necessary
assumption, but makes illustration easier). Every prediction within the interquartile
range (IQR), that is, between the 25th and 75th percentile of sorted predictions is
rewarded, creating an incentive to be as close to the median as possible. The most
reliable way to achieve this is to vote in accordance to the true exchange rate, because
that is the most obvious focal point around which everyone will likely gather.

Weaknesses of Focal Point Oracles


Simple Attack: Median Manipulation

As long as the majority of participants are being honest, we can see how the oracle will
produce an accurate account of reality. Conversely, if the majority of participants are
colluding and attempt to game the process, the oracle is thrown off. A way in which
dishonest participants could game the system is by manipulating the median of results.

https://medium.com/@victorhogrefe/the-oracle-problem-5611bed763ba 8/15
24/02/2020 The Oracle Problem - Victor Hogrefe - Medium

If a smart contract needs to know the price of a certain good, or cryptocurrency, it can
use a decentralized oracle which relies on votes in the above way. If we imagine a price
of $100 being voted on for several votes in a row, and we reasonably expect the price to
remain the same in the next election, then the variance of choices will likely be small.
This is ideal, since the median will only have to be moved a little.

If a group of voters were colluding to push the median down it can help achieve their
nefarious purpose. This could be either a buying opportunity on a decentralized
exchange, or related to the terms of the relevant smart contract they are trying to trick.

https://medium.com/@victorhogrefe/the-oracle-problem-5611bed763ba 9/15
24/02/2020 The Oracle Problem - Victor Hogrefe - Medium

While the median is quite resilient to change, something interesting happens to the IQR
in this case. Now, all values between 0 and 100.19, are within 50% of the median,
which means that this kind of manipulation can create downward vote incentives on a
subsequent vote, especially if it is combined with an announcement attack described
below.

Public Announcement Attack

Even in the absence of collusion, there are ways in which the oracle can be gamed. For
instance, if a group of participants holding 5% of votes, publicly announces their vote
ahead of time, others face a difficult choice, especially if it is not known how much
voting power the announcers have. In fact, it could be in the interest of the announcers
to spread misinformation about the size of their vote, or even lie about their
announcement, making everything very messy. These considerations are salient when
the subject matter is the price of a token, or contract.

Let us assume that there is an announcement claiming that 10% of votes will chose a
price X in the next election. Since focal points are not about what the participant
believes, but rather about what the participant believes other participants to believe, a
conservative bet might be to vote for X as well, or a middle ground between X and
previous prices, just to be safe. Even if the announcers were lying about their voting
power, and in fact only control 1% of votes, price X could be within the new range of
accepted results based on everyone else hedging their bets. In this particular example,
this kind of strategy could exert upward or downward pressure on price, regardless of
the forces of supply and demand.

Trying to censor these kinds of announcements could be a solution, but might prove to
be very difficult. Censorship on the internet is not generally a proven policy, because it
is almost impossible to enforce. Of course, any kind of censorship enforcement will
bring back worries of centralizing the process, and the concern is that the censors
themselves are corrupt. On the other hand, if voting announcements are heard all the
time, and are all over the place, we can reasonably expect other voters to build up a
tolerance, and ignore these dubious claims.

Incentivized Announcement Attack

https://medium.com/@victorhogrefe/the-oracle-problem-5611bed763ba 10/15
24/02/2020 The Oracle Problem - Victor Hogrefe - Medium

A subset of announcement attacks are bribed announcements. If a villainous party


decides to incentivize others by rewarding them for voting according to the
announcement, we may end up with two focal points, in which case the median of
results is affected. As Vitalik Buterin points out in his SchellingCoin article the
incentives could be construed such that the announced focal point is in fact superior
and will be chosen over the unaffected one. This kind of bribery would undoubtedly be
very expensive, but could, in some cases pay off. It is conceivable that a malicious party
need only affect the outcome of a single vote in order to trigger a smart contract to act
in a desired way, such as prematurely, or incorrectly releasing billions of dollars to the
cheaters. This is a very serious problem.

Strategy

If the previous price was voted at $100, and an announcement attack claims that 25%
of the vote will choose a price of zero on the next round, then a relatively safe bet is to
choose a middle value, between 50 and 100. 25% of the vote is enough to move the first
quartile by half the distance between median and announced value. If, however, we do
not know how much of the vote is controlled by the announcers, it becomes more
complicated.

Since no one knows how anyone else will vote, and everyone votes the way they think
everyone else will vote, we may end up in a beauty contest described above, with n
degrees of choice, each pushing the IQR towards the announced vote. This depends on
the incentive structure of the system and risk taking behaviour of voters. In order to
make these choices we must assign probabilities to whether the announcement
attackers have at least 25% of the vote, whether the rest of voters will choose to vote for
a middle ground, or whether the rest of voters will choose to join the announcement
attackers.

Below, we can see IQR ranges that consider every scenario in 5% increments, from no
attackers, to 95% attackers vote. Assuming that attackers and other voters only have
two choices, namely to vote with the attackers, or vote for the previous value, we can
see that in 25% of cases, voting for the previous value is safe, in 20% of cases, voting for
the attackers value is safe, and in 55% of cases, voting for a middle ground is safe. In
this simplified model we do not know the probability associated with each choice, and
thus must assume a uniform distribution of probabilities. Given these assumptions,
voting for the middle value is likely the safest option. This is a choice of the first degree.

https://medium.com/@victorhogrefe/the-oracle-problem-5611bed763ba 11/15
24/02/2020 The Oracle Problem - Victor Hogrefe - Medium

The problem does not stop here. Since everyone else is thinking along the same lines, a
considerable number of voters will choose the middle value, creating choices of the
second degree, in which a new middle value between the middle value and the
attackers’ value is chosen. This, of course, leads to a recursion, with the equilibrium
being the attackers value.

Whether or not this equilibrium is realistic depends mainly on two factors:

1. How many steps voters can think ahead

2. How likely voters are to believe announcers

3. The incentive structure of the system

When considering incentives, punishing bad votes and rewarding good ones could
drastically change the optimal strategy of this voting game. More importantly, we have
only considered a scenario in which everyone within the IQR is rewarded equally. If we
change the rules such that reward decreases as a function of distance from the mean,
behaviour would likely be more conservative. In this case, the middle value is much less
attractive.

Honesty

Worries regarding announcement attacks fail to factor in an important aspect of human


behaviour: Most people are relatively honest, and cannot tolerate cheating and

https://medium.com/@victorhogrefe/the-oracle-problem-5611bed763ba 12/15
24/02/2020 The Oracle Problem - Victor Hogrefe - Medium

injustice. Some of the above assumptions are probably not accurate if we factor in
human tendencies. For instance, it might not occur to many people to report on
anything other than the truth, even if they would gain a slight advantage by not doing
so. In that case, the probabilities of the above distribution might not be uniform.

Augur, the Ethereum based prediction market is a large game theory experiment and
seems to confirm exactly this. On the Augur platform, participants stake their
Reputation tokens (REP) on the outcome of certain events, betting on anything from
sports results to elections to weather patterns. All participants in a given market are
incentivized to report on the relevant outcomes, for doing so accurately will entitle
them to a share in network fees. Falsely reporting on outcomes risks losing up to 20% of
staked tokens. Even if there is a false consensus, there are some conflict resolution
backstops that the injured parties can utilize in order to get justice.

Instead of re-centralizing the problem, Augur attempts to resolve its disputes in a


decentralized way by allowing disputing users to stake REP against the markets
tentative outcome. There are two possibilities: If enough REP is staked against a
tentative outcome to meet the required dispute bond size (calculated based on stake in
that particular market), the new result becomes the new tentative outcome. If the
market is large enough such that 2.5% of all REP is staked on a dispute bond, the entire
network goes into a fork stage in which it might split into two distinct universes; one in
which the outcome is considered true, and another in which it is considered false.

The fact that Augur needs to resort to different universes shows how difficult the oracle
problem can be. We can imagine one universe, according to the platform, in which
Hillary won the election, and one in which Trump did, even though there is a clear fact
of the matter.

The Double Spend Problem

While the price of gold or the outcomes of prediction markets can be considered real-
world facts that pass into the blockchain space, there is set of items which will prove
incredibly difficult to resolve in a decentralized way. DigixDAO, for instance, is putting
gold on the blockchain. What this means is that they secure a certain amount of gold,
go through audits and verifications and a host of other regulatory processes, and then
create gold tokens on Ethereum for every ounce they actually hold in their vaults,
similar to what Tether did with the USD.

https://medium.com/@victorhogrefe/the-oracle-problem-5611bed763ba 13/15
24/02/2020 The Oracle Problem - Victor Hogrefe - Medium

The double spend problem arises because there is no permanent connection between
an ERC-20 gold token and the actual gold in the DigixDAO vaults. That gold could be
stolen, or sold secretly, in which case DigixDAO would have sold twice as much gold; a
100% profit. Clearly, there is a large incentive to do so, and this is precisely why they
are motivated to centralize the oracle process, in order for regulators and auditors to
anchor ERC-20 gold to the authority of such institutions.

There have been many efforts to bring real estate markets into the blockchain space, but
they face similar problems. If a house is tokenized, and sold on the blockchain, the
actual building is still in the real world and can be sold again, burned down for the
insurance money, rented out, lived in, or be abandoned without this information
passing onto the token, unless there is some authority that keeps tabs on real estate and
updates the state of affairs regularly. How this problem can be solved in a decentralized
way is largely unexplored territory.

Decentralized Autonomous Organizations (DAOs) could be a solution. We have already


seen how a DAO, like Dash, can have real effects in the world by hiring contractors,
sponsoring MMA fighters and investing in advertisements. It is foreseeable that an
Oracle DAO could hire contractors to check on buildings, or even run its own real estate
office, in exchange for network fees or mining rewards of the relevant blockchain real
estate market.

This is a somewhat centralizing solution, because those contractors are points of


weakness and can be bribed or corrupted to report falsely. If millions of dollars of real
estate are at stake, this is not unlikely. Neither robotic contractors, nor smart houses are
a good solution either, because both can be hacked or tricked, leaving us with an
uneasy sense that this problem is insurmountable.

Conclusion

This paper has gone over some aspects of the oracle problem, and hopefully has
illustrated why “decentralizing everything” may be more difficult and more dependant
on compromises than many people expect. This is especially true for tokenized real-
world assets, which suffer from the double-spend vulnerability. For decentralized
oracles that are used in prediction markets we have discussed how announcement
attacks may cause game-theoretically undesirable outcomes, in which the dominant
strategy is not the optimal one people are hoping for. It will depend on the specifics of
the incentive systems, and the ingenuity of the oracle’s design.

https://medium.com/@victorhogrefe/the-oracle-problem-5611bed763ba 14/15
24/02/2020 The Oracle Problem - Victor Hogrefe - Medium

Blockchain Augur Ethereum Prediction Markets Decentralized

About Help Legal

https://medium.com/@victorhogrefe/the-oracle-problem-5611bed763ba 15/15