
Multiagent Interactions

Wooldridge (2002), Chapter 6



Lecture Outline
1. Multi-agent Systems
2. Utility and Preferences
3. Game Theory and Payoff Matrices
4. Strategies
5. Summary

Interactions

The world functions through interacting
agents. Each person pursues his/her own
goals through encounters with other
people or machines.
(Rules of Encounter, Rosenschein and Zlotkin, 1994)

Example 1
Two students decide to work together on their
exercises. They have to decide upon a time. One
prefers to work on Thursday afternoons after the
lecture while the other prefers to work on Friday
morning. How do they decide upon a time to do
the work?

Example 2
A friend invites you out for a drink and the cinema tonight.
But your favourite TV program is on tonight. You think:

It would be nice to go out with my friend, but it's cheaper
to watch TV.

If you stay at home and watch TV, you might not have a
chance to go out with your friend for a long time.

I can always record the program and watch it afterwards.

I can invite my friend home.

Let's play a game: Cooperate or Defect


I pick two students.
They are not allowed to communicate.
Each writes on a piece of paper, one word, either
Cooperate or Defect.
Reveal the answers.
If both cooperate, each receives 1 class interaction
mark.
If both defect, each receives 0 class interaction
marks.
If the answers differ, the defector receives 2
class interaction marks.

Selling mussels
You sell mussels at a stall in the local wet
market.
You sell them to your friends for 90 PHP/kg.
You sell them to your regular customers for 100
PHP/kg.
A stranger asks for the price.
What would you reply?

Multi-agent Systems (MAS)

A multi-agent system contains a number of agents that:
interact with one another through communication
are able to act in an environment
have different spheres of influence
may be linked by other relationships, e.g. organisational
It is important to understand the type of interaction.
Each agent can be assumed to be self-interested: it
has its own preferences and desires about how the world
should be.

Typical structure of a multi-agent system

[Figure: a multi-agent system, showing agents, the interactions and organisational relationships between them, each agent's sphere of influence, and the shared environment.]

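Utilities and Preferences

We have 2 agents, i and j.
$\Omega = \{\omega_1, \omega_2, \ldots\}$ is the set of outcomes (states) that agents
have preferences over.
We capture preferences by utility functions:
$u_i : \Omega \to \mathbb{R}$
$u_j : \Omega \to \mathbb{R}$
Utility functions lead to preference orderings over
outcomes:
$\omega \succeq_i \omega'$ means $u_i(\omega) \ge u_i(\omega')$
$\omega \succ_i \omega'$ means $u_i(\omega) > u_i(\omega')$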

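The link between utility functions and preference orderings can be sketched in a few lines of Python. This is only an illustration: the outcome labels `w1` to `w4` stand in for the outcomes in the set above, the utility values are the ones used in the agents' preferences example later in these notes, and the helper names are hypothetical.

```python
# Utility functions as dictionaries: outcome label -> real number.
# The labels w1..w4 are illustrative stand-ins for the outcomes.
u_i = {"w1": 1, "w2": 1, "w3": 4, "w4": 4}
u_j = {"w1": 1, "w2": 4, "w3": 1, "w4": 4}


def weakly_prefers(u, a, b):
    """True if the agent with utility u likes outcome a at least as much as b."""
    return u[a] >= u[b]


def strictly_prefers(u, a, b):
    """True if the agent with utility u likes outcome a strictly more than b."""
    return u[a] > u[b]


def ranking(u):
    """Outcomes from most to least preferred (ties keep their listing order)."""
    return sorted(u, key=lambda w: -u[w])


print(ranking(u_i))
print(ranking(u_j))
```

Note that agent i is indifferent between `w1` and `w2`: each is weakly preferred to the other, and neither is strictly preferred.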
Utility and money

Utility is not money, but money is a useful analogy.
Initially, utility grows linearly with an increase in money.
After a certain point, an increase in money does not increase utility
(happiness) much.

[Figure: utility (vertical axis) plotted against money (horizontal axis).]
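One way to picture the shape described above is a saturating curve. The function below is purely an assumption chosen for illustration (the slides do not give a formula): it behaves almost linearly for small amounts of money and flattens out near a hypothetical ceiling `cap`.

```python
import math


def utility(money, cap=100.0):
    """Hypothetical saturating utility.

    Roughly equal to `money` when money is much smaller than `cap`,
    and levelling off towards `cap` as money grows.
    """
    return cap * (1.0 - math.exp(-money / cap))


def marginal_utility(money, step=1.0, cap=100.0):
    """Extra utility gained from one more `step` of money."""
    return utility(money + step, cap) - utility(money, cap)


for m in (0, 50, 100, 500):
    print(m, round(utility(m), 2), round(marginal_utility(m), 4))
```

The marginal utility printed in the last column shrinks as money grows, which is the flattening the slide describes.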

State transformer function

Two agents, i and j.
Each agent has just two possible actions:
$Ac = \{C, D\}$, where C stands for cooperate and D for defect.

Agents simultaneously choose an action and, as a result,
an outcome in $\Omega$ will result.
The actual outcome depends on the combination of actions.
Environment behaviour is given by the state transformer
function:

$\tau : Ac \times Ac \to \Omega$
where the first argument is agent i's action and the second is agent j's action.

3 kinds of environments
1. Environment controlled by only one agent, e.g. j
$\tau(D,D)=\omega_1$, $\tau(D,C)=\omega_2$, $\tau(C,D)=\omega_1$, $\tau(C,C)=\omega_2$

2. Environment sensitive to the actions of both agents
$\tau(D,D)=\omega_1$, $\tau(D,C)=\omega_2$, $\tau(C,D)=\omega_3$, $\tau(C,C)=\omega_4$

3. Environment where neither agent has any influence
$\tau(D,D)=\omega_1$, $\tau(D,C)=\omega_1$, $\tau(C,D)=\omega_1$, $\tau(C,C)=\omega_1$
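The three environments can be written down as lookup tables and checked mechanically. The sketch below assumes each state transformer is a dictionary keyed by (i's action, j's action); the labels `w1` to `w4` and the helper names are illustrative, not part of the lecture.

```python
from itertools import product

ACTIONS = ("C", "D")

# Each table maps (i's action, j's action) to an outcome label.
tau_j_controls = {("D", "D"): "w1", ("D", "C"): "w2",
                  ("C", "D"): "w1", ("C", "C"): "w2"}
tau_both = {("D", "D"): "w1", ("D", "C"): "w2",
            ("C", "D"): "w3", ("C", "C"): "w4"}
tau_neither = {pair: "w1" for pair in product(ACTIONS, repeat=2)}


def i_has_influence(tau):
    """True if, for some fixed action of j, i's choice changes the outcome."""
    return any(tau[("C", aj)] != tau[("D", aj)] for aj in ACTIONS)


def j_has_influence(tau):
    """True if, for some fixed action of i, j's choice changes the outcome."""
    return any(tau[(ai, "C")] != tau[(ai, "D")] for ai in ACTIONS)


for name, tau in [("j controls", tau_j_controls),
                  ("both", tau_both),
                  ("neither", tau_neither)]:
    print(name, i_has_influence(tau), j_has_influence(tau))
```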

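Agents' Preferences
Consider the case where both agents influence the outcome
and they have the following utility functions:
$u_i(\omega_1)=1$, $u_i(\omega_2)=1$, $u_i(\omega_3)=4$, $u_i(\omega_4)=4$
$u_j(\omega_1)=1$, $u_j(\omega_2)=4$, $u_j(\omega_3)=1$, $u_j(\omega_4)=4$

Because each outcome is determined by a pair of actions (agent i's action first, agent j's second), this can be rephrased as:
$u_i(D,D)=1$, $u_i(D,C)=1$, $u_i(C,D)=4$, $u_i(C,C)=4$
$u_j(D,D)=1$, $u_j(D,C)=4$, $u_j(C,D)=1$, $u_j(C,C)=4$

What should agent i choose to do, cooperate or defect?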

Payoff table
Each cell lists (payoff to i, payoff to j).

                 i defects    i cooperates
j defects        (1, 1)       (4, 1)
j cooperates     (1, 4)       (4, 4)

Agent i prefers to cooperate: cooperating pays i a utility of 4 whatever j does, whereas defecting pays i only 1.
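The reasoning behind agent i's choice can be checked by comparing i's payoff for each action against every action of j. The sketch below is a minimal illustration using the utilities listed above; the function names are hypothetical.

```python
ACTIONS = ("C", "D")

# Utilities keyed by (i's action, j's action), as listed in the notes.
u_i = {("D", "D"): 1, ("D", "C"): 1, ("C", "D"): 4, ("C", "C"): 4}
u_j = {("D", "D"): 1, ("D", "C"): 4, ("C", "D"): 1, ("C", "C"): 4}


def dominant_actions_for_i(u):
    """Actions of i that do at least as well as any alternative, whatever j plays."""
    return [a for a in ACTIONS
            if all(u[(a, aj)] >= u[(b, aj)] for b in ACTIONS for aj in ACTIONS)]


def dominant_actions_for_j(u):
    """Actions of j that do at least as well as any alternative, whatever i plays."""
    return [a for a in ACTIONS
            if all(u[(ai, a)] >= u[(ai, b)] for b in ACTIONS for ai in ACTIONS)]


print(dominant_actions_for_i(u_i))
print(dominant_actions_for_j(u_j))
```

In this example both agents end up with cooperation as their only dominant action, so there is no conflict between them.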

Game Theory

A mathematical theory that studies interactions among
self-interested agents.

Essential elements of a game are:

Players (2 or more)

Some choice of action (strategy)

One or more outcomes (someone wins, someone loses)

Information

Suitable for situations where the other agents' (players')
behaviour matters.

The Prisoner's Dilemma

Two men are collectively charged with a crime and held in
separate cells. They have no way of communicating with
each other or making an agreement. They are told:

If one confesses and the other does not, the confessor will be freed
and the other jailed for 3 years.

If both confess to the crime, then each will be jailed for 2 years.

If neither confesses, then each will be jailed for 1 year.

Confessing => defecting, D
Not confessing => cooperating, C
If you were one of the prisoners, what would you do?
Discuss your answer with your neighbour.

Punishment matrix

                            Prisoner B stays silent    Prisoner B betrays
                            (cooperates)               (defects)
Prisoner A stays silent     Each serves 1 year         Prisoner A: 3 years
(cooperates)                                           Prisoner B: goes free
Prisoner A betrays          Prisoner A: goes free      Each serves 2 years
(defects)                   Prisoner B: 3 years

Payoff matrix
Each cell lists (payoff to i, payoff to j).

                 i defects    i cooperates
j defects        (2, 2)       (1, 4)
j cooperates     (4, 1)       (3, 3)

Top left: if both defect, each gets 2, the punishment for mutual defection.
Top right: if i cooperates and j defects, i gets the sucker's payoff of 1 while j gets 4.
Bottom left: if j cooperates and i defects, j gets the sucker's payoff of 1 while i gets 4.
Bottom right: if both cooperate, each gets 3, the reward for mutual cooperation.

Numbers in the payoff matrix reflect how good an outcome is for the agent (agent i's action first, agent j's second), e.g.
$u_i(D,D)=2$, $u_i(D,C)=4$, $u_i(C,D)=1$, $u_i(C,C)=3$
$u_j(D,D)=2$, $u_j(D,C)=1$, $u_j(C,D)=4$, $u_j(C,C)=3$

The Prisoner's Dilemma

The individually rational agent will defect!
Defecting guarantees a payoff of no worse than 2.
Cooperating guarantees a payoff of no worse than 1 (the sucker's payoff).
Defection is the best response to every possible strategy of the other agent.
Both agents defect and each gets a payoff of 2.
If both agents had cooperated, each would have got a payoff of 3.

                 i defects    i cooperates
j defects        (2, 2)       (1, 4)
j cooperates     (4, 1)       (3, 3)
Each cell lists (payoff to i, payoff to j).
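The argument for defection can be reproduced numerically. The sketch below uses agent i's utilities from the prisoner's dilemma payoff matrix (2 for mutual defection, 3 for mutual cooperation, 4 for defecting against a cooperator, 1 for the sucker's payoff); the function names are illustrative.

```python
ACTIONS = ("C", "D")

# Agent i's utility, keyed by (i's action, j's action).
u_i = {("D", "D"): 2, ("D", "C"): 4, ("C", "D"): 1, ("C", "C"): 3}


def guaranteed_payoff(action):
    """Worst payoff i can receive after choosing `action`, over all of j's actions."""
    return min(u_i[(action, aj)] for aj in ACTIONS)


def best_response(aj):
    """Action of i that maximises i's payoff when j plays `aj`."""
    return max(ACTIONS, key=lambda ai: u_i[(ai, aj)])


for a in ACTIONS:
    print("guaranteed payoff of", a, "=", guaranteed_payoff(a))
for aj in ACTIONS:
    print("best response to", aj, "=", best_response(aj))
```

Defecting is the best response to both of j's actions, which is why mutual defection results even though mutual cooperation would pay each agent more.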

Iterated prisoner's dilemma


Play forever.
Each agent can see what the opponent did on the
previous round.
Which strategy is best?
1. Always defect
2. Always cooperate
3. Random: The strategy cooperates 50% of the time.
4. TIT FOR TAT strategy. Cooperates on the first move, and
then does whatever its opponent has done on the previous
move.
5. TIT FOR 2 TATS is a forgiving strategy that defects only
when the opponent has defected twice in a row.
6. 2 TITS FOR TAT punishes every defection with two of its
own.
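The strategies listed above can be tried against one another in a small simulation. This is a sketch under stated assumptions: the game is cut off after a fixed number of rounds (the slide says play forever), payoffs come from the prisoner's dilemma payoff matrix in these notes, and every function name is hypothetical.

```python
import random

# PAYOFF[(my move, opponent's move)] is my payoff for that round.
PAYOFF = {("C", "C"): 3, ("C", "D"): 1, ("D", "C"): 4, ("D", "D"): 2}


def always_defect(mine, theirs):
    return "D"


def always_cooperate(mine, theirs):
    return "C"


def tit_for_tat(mine, theirs):
    # Cooperate first, then copy the opponent's previous move.
    return theirs[-1] if theirs else "C"


def tit_for_two_tats(mine, theirs):
    # Defect only after two consecutive defections by the opponent.
    return "D" if theirs[-2:] == ["D", "D"] else "C"


def two_tits_for_tat(mine, theirs):
    # Defect if the opponent defected in either of the last two rounds.
    return "D" if "D" in theirs[-2:] else "C"


def make_random(seed=0):
    rng = random.Random(seed)  # seeded so that runs are repeatable

    def random_strategy(mine, theirs):
        return rng.choice("CD")

    return random_strategy


def play(strat_a, strat_b, rounds=10):
    """Play a fixed number of rounds; return the total score of each player."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strat_a(hist_a, hist_b)
        b = strat_b(hist_b, hist_a)
        hist_a.append(a)
        hist_b.append(b)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
    return score_a, score_b


print(play(tit_for_tat, always_defect))
print(play(tit_for_tat, tit_for_tat))
print(play(two_tits_for_tat, make_random(seed=1)))
```

Which strategy scores best depends on the mix of opponents and the number of rounds, so a single pairing should not be read as a ranking.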

Game of chicken

Rebel Without a Cause

A long straight road with a white line down the middle.
Two fast cars race towards each other from
opposite ends.
Each car is expected to keep the wheels on
the same side of the white line.
Whoever swerves first is the chicken.

Nash equilibrium

                      Drive on the Left    Drive on the Right
Drive on the Left     10, 10               0, 0
Drive on the Right    0, 0                 10, 10

If each player has chosen a strategy and no player can
benefit by changing strategies while the other players
keep theirs unchanged, then the current set of strategy
choices and the corresponding payoffs constitutes a Nash
equilibrium.
There are 2 Nash equilibria in this payoff table: both drive on the left, or both drive on the right.
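The definition above translates directly into a brute-force check over all pairs of pure strategies. The sketch below is a minimal illustration for two-player games given as a payoff table; the function name and the table layout are assumptions made for this example.

```python
from itertools import product

ACTIONS = ("Left", "Right")

# payoffs[(row action, column action)] = (row player's payoff, column player's payoff)
payoffs = {
    ("Left", "Left"): (10, 10),
    ("Left", "Right"): (0, 0),
    ("Right", "Left"): (0, 0),
    ("Right", "Right"): (10, 10),
}


def pure_nash_equilibria(payoffs, actions):
    """Return every action pair from which neither player gains by deviating alone."""
    equilibria = []
    for r, c in product(actions, repeat=2):
        # Row player cannot do better by switching rows while the column stays fixed.
        row_ok = all(payoffs[(r, c)][0] >= payoffs[(alt, c)][0] for alt in actions)
        # Column player cannot do better by switching columns while the row stays fixed.
        col_ok = all(payoffs[(r, c)][1] >= payoffs[(r, alt)][1] for alt in actions)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


print(pure_nash_equilibria(payoffs, ACTIONS))
```

The same function can be applied to any two-player payoff table of this shape, for example the prisoner's dilemma matrix, by swapping in its actions and payoffs.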

Another prisoner's dilemma

If both confess (tell the truth), each gets an eight-year sentence.
If both lie, each gets one year.
If only one confesses, the other (the liar) gets a
ten-year sentence.
What should Prisoner 1 do?

For Prisoner 2, lying leads to a worse result than
confessing, so he should choose to confess.
Then Prisoner 1 should choose to confess as well.
Both prisoners confessing is the Nash equilibrium.

Nash Equilibrium
Two strategies s1 and s2 are in Nash equilibrium if:

under the assumption that agent i plays s1, agent j can do no
better than play s2; and

under the assumption that agent j plays s2, agent i can do no
better than play s1.

Neither agent has any incentive to deviate from a Nash
equilibrium.

Unfortunately:

Not every interaction scenario has a Nash equilibrium in pure strategies.

Some interaction scenarios have more than one Nash
equilibrium.

Multi-agent Interaction: Summary

MAS: a number of agents which interact with one another through
communication.
An agent's action results in an outcome in the environment.
Utility functions are used to define preference orderings over outcomes.
Game theory is a mathematical theory that studies interactions among
self-interested agents.
An agent's choice of action is a strategy. Key concepts:
Dominant strategy
Nash equilibrium
