cs402 Multiagent Interactions

Multiagent Interactions
Wooldridge (2002), Chapter 6

Multi-agent Interactions
Lecture Outline
1. Multi-agent Systems
2. Utility and Preferences
3. Game Theory and Payoff Matrices
4. Strategies
5. Summary
Interactions
The world functions through interacting agents. Each person pursues his or her own
goals through encounters with other people or machines.
Rules of Encounter, Rosenschein and Zlotkin, 1994
Example 1
Two students decide to work together on their
exercises. They have to decide upon a time. One
prefers to work on Thursday afternoons after the
lecture while the other prefers to work on Friday
morning. How do they decide upon a time to do
the work?
Example 2
A friend invites you out for a drink and the cinema tonight.
But your favourite TV program is on tonight. You think:
It would be nice to go out with my friend, but it's cheaper to watch TV.
If you stay at home and watch TV, you might not have a
chance to go out with your friend for a long time.
I can always record the program and watch it afterwards.
I can invite my friend home.
Let's play a game: Cooperate or Defect

I pick two students.
They are not allowed to communicate.
Each writes on a piece of paper, one word, either
Cooperate or Defect.
Reveal the answers.
If both cooperate, each receives 1 class interaction mark.
If both defect, each receives 0 class interaction marks.
If the answers are different, the defector receives 2 class interaction marks.
Selling mussels
You sell mussels at a stall in the local wet
market.
You sell them to your friends for 90 PHP/kg.
You sell them to your regular customers for 100 PHP/kg.
A stranger is asking for the price.
What would you reply?
Multi-agent Systems (MAS)

A multi-agent system contains a number of agents which:
interact with one another through communication
are able to act in an environment
have different spheres of influence
may be linked by other relationships, e.g. organisational
It is important to understand the type of interaction.
Each agent can be assumed to be self-interested:
it has its own preferences and desires about how the world should be.
Typical structure of a multiagent system

[Figure: a multi-agent system, showing the agents, their interactions, organisational relationships, spheres of influence, and the environment.]
Utilities and Preferences

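We have 2 agents, $i$ and $j$.
$\Omega = \{\omega_1, \omega_2, \ldots\}$ is the set of outcomes (states) that agents have preferences over.
We capture preferences by utility functions:
$u_i : \Omega \to \mathbb{R}$
$u_j : \Omega \to \mathbb{R}$
Utility functions lead to preference orderings over outcomes:
$\omega \succeq_i \omega'$ means $u_i(\omega) \geq u_i(\omega')$
$\omega \succ_i \omega'$ means $u_i(\omega) > u_i(\omega')$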
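The link between a utility function and a preference ordering can be sketched in a few lines of Python. This is only an illustration: the outcome labels `w1` to `w4` and the utility values are assumptions chosen for the example, and the helper names are hypothetical.

```python
# Hypothetical outcome labels and utility values, chosen only for illustration.
u_i = {"w1": 1, "w2": 1, "w3": 4, "w4": 4}

def weakly_prefers(u, a, b):
    """True if the agent with utility function u likes a at least as much as b."""
    return u[a] >= u[b]

def strictly_prefers(u, a, b):
    """True if the agent with utility function u likes a strictly more than b."""
    return u[a] > u[b]

def ranking(u):
    """Outcomes sorted from most to least preferred (ties keep dictionary order)."""
    return sorted(u, key=lambda w: -u[w])
```

With these values the agent is indifferent between `w1` and `w2` (each is weakly preferred to the other) and strictly prefers `w3` to `w1`.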
Utility and money

Utility is not money, but it behaves somewhat like it.
Initially, utility grows linearly with an increase in money.
After a certain point, a further increase in money does not increase utility (happiness) much.
[Figure: utility plotted against money.]
State transformer function

Two agents, $i$ and $j$.
Each agent has just two possible actions:
$Ac = \{C, D\}$, $C$ for cooperate and $D$ for defect.
Agents simultaneously choose an action and, as a result, an outcome in $\Omega$ will result.
The actual outcome depends on the combination of actions.
Environment behaviour is given by the state transformer function:
$\tau : Ac \times Ac \to \Omega$
where the first argument is agent $i$'s action and the second is agent $j$'s action.
3 kinds of environments
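1. Environment controlled only by 1 agent, e.g., $j$
$\tau(D, D) = \omega_1$  $\tau(D, C) = \omega_2$  $\tau(C, D) = \omega_1$  $\tau(C, C) = \omega_2$
2. Environment sensitive to actions of both agents
$\tau(D, D) = \omega_1$  $\tau(D, C) = \omega_2$  $\tau(C, D) = \omega_3$  $\tau(C, C) = \omega_4$
3. Environment where neither agent has any influence
$\tau(D, D) = \omega_1$  $\tau(D, C) = \omega_1$  $\tau(C, D) = \omega_1$  $\tau(C, C) = \omega_1$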
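The three environments can be written down as lookup tables and checked mechanically. The sketch below assumes the first action in each pair belongs to agent i and the second to agent j; the labels `w1` to `w4` stand for the four outcomes, and the function names are hypothetical.

```python
from itertools import product

ACTIONS = ("C", "D")

# Each environment maps (agent i's action, agent j's action) to an outcome label.
controlled_by_j = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w1", ("C", "C"): "w2"}
sensitive_to_both = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w3", ("C", "C"): "w4"}
no_influence = {(a, b): "w1" for a, b in product(ACTIONS, repeat=2)}

def i_has_influence(tau):
    """Agent i matters if, for some fixed action of j, changing i's action changes the outcome."""
    return any(tau[("C", b)] != tau[("D", b)] for b in ACTIONS)

def j_has_influence(tau):
    """Agent j matters if, for some fixed action of i, changing j's action changes the outcome."""
    return any(tau[(a, "C")] != tau[(a, "D")] for a in ACTIONS)
```

In the first environment only `j_has_influence` holds, in the second both checks hold, and in the third neither does.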
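Agent Preferences
Consider the case where both agents influence the outcome
and they have the following utility functions:
$u_i(\omega_1) = 1$  $u_i(\omega_2) = 1$  $u_i(\omega_3) = 4$  $u_i(\omega_4) = 4$
$u_j(\omega_1) = 1$  $u_j(\omega_2) = 4$  $u_j(\omega_3) = 1$  $u_j(\omega_4) = 4$
Rephrasing the above in terms of actions (agent $i$'s action first, agent $j$'s second):
$u_i(D,D) = 1$  $u_i(D,C) = 1$  $u_i(C,D) = 4$  $u_i(C,C) = 4$
$u_j(D,D) = 1$  $u_j(D,C) = 4$  $u_j(C,D) = 1$  $u_j(C,C) = 4$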
What should agent i choose to do, cooperate or defect?
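Payoff table
Columns are agent i's action, rows are agent j's action; each cell lists the payoffs to i and j.

                 i: Defect     i: Coop
j: Defect        i=1, j=1      i=4, j=1
j: Coop          i=1, j=4      i=4, j=4

Agent i prefers to cooperate: whichever action j chooses, i receives 4 by cooperating and only 1 by defecting.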
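The reasoning for agent i can be checked by comparing payoffs action by action. The following is a minimal sketch using the utilities from this slide, keyed by (agent i's action, agent j's action); the function names are hypothetical, and "dominant" here means at least as good as every alternative whatever the other agent does.

```python
ACTIONS = ("C", "D")

# Utilities keyed by (agent i's action, agent j's action), copied from the slide.
u_i = {("D", "D"): 1, ("D", "C"): 1, ("C", "D"): 4, ("C", "C"): 4}
u_j = {("D", "D"): 1, ("D", "C"): 4, ("C", "D"): 1, ("C", "C"): 4}

def dominant_action_for_i(u):
    """Return an action of i that is never worse than any alternative, or None."""
    for a in ACTIONS:
        if all(u[(a, b)] >= u[(other, b)] for other in ACTIONS for b in ACTIONS):
            return a
    return None

def dominant_action_for_j(u):
    """Return an action of j that is never worse than any alternative, or None."""
    for b in ACTIONS:
        if all(u[(a, b)] >= u[(a, other)] for other in ACTIONS for a in ACTIONS):
            return b
    return None
```

Calling `dominant_action_for_i(u_i)` and `dominant_action_for_j(u_j)` both give `"C"`, so in this particular game cooperation is the obvious choice for each agent.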
Game Theory
A mathematical theory that studies interactions among self-interested agents.
Essential elements of a game are:
Players (2 or more)
Some choice of action (strategy)
One or more outcomes (someone wins, someone loses)
Information
Suitable for situations where the other agents' (players') behaviour matters.
The Prisoner's Dilemma
2 men are collectively charged with a crime and held in separate cells. They have no way of communicating with
each other or making an agreement. They are told:
if one confesses and the other does not, the confessor will be freed and the other jailed for 3 years.
if both confess to the crime, then each will be jailed for 2 years.
if neither confesses, then each will be jailed for 1 year.
Confessing => defecting, D
Not confessing => cooperating, C
If you were one of the prisoners, what would you do?
Discuss your answer with your neighbour.
Punishment matrix

                              Prisoner B stays silent     Prisoner B betrays
                              (cooperates)                (defects)
Prisoner A stays silent       Each serves 1 year          Prisoner A: 3 years
(cooperates)                                              Prisoner B: goes free
Prisoner A betrays            Prisoner A: goes free       Each serves 2 years
(defects)                     Prisoner B: 3 years
Payoff matrix
Columns are agent i's action, rows are agent j's action; each cell lists the payoffs to i and j.

                 i: Defect     i: Coop
j: Defect        i=2, j=2      i=1, j=4
j: Coop          i=4, j=1      i=3, j=3

Top left: if both defect, each gets 2, the punishment for mutual defection.
Top right: if i cooperates and j defects, i gets the sucker's payoff of 1 while j gets 4.
Bottom left: if j cooperates and i defects, j gets the sucker's payoff of 1 while i gets 4.
Bottom right: if both cooperate, each gets 3, the reward for mutual cooperation.
Numbers in the payoff matrix reflect how good an outcome is for the agent, e.g.
$u_i(D,D) = 2$  $u_i(D,C) = 4$  $u_i(C,D) = 1$  $u_i(C,C) = 3$
$u_j(D,D) = 2$  $u_j(D,C) = 1$  $u_j(C,D) = 4$  $u_j(C,C) = 3$
The Prisoner's Dilemma
The individually rational agent will defect!
Defecting guarantees a payoff of no worse than 2.
Cooperating guarantees only a payoff of no worse than 1.
Defection is the best response to every possible strategy of the other agent.
So both agents defect and each gets a payoff of 2.
If both agents had cooperated, each would have got a payoff of 3.

                 i: Defect     i: Coop
j: Defect        i=2, j=2      i=1, j=4
j: Coop          i=4, j=1      i=3, j=3
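The argument for defection can be replayed numerically. This is a sketch under the assumption that payoffs are keyed by (agent i's action, agent j's action), with values 2 for mutual defection, 3 for mutual cooperation, 4 for defecting against a cooperator and 1 for the sucker's payoff; the helper names are hypothetical.

```python
ACTIONS = ("C", "D")

# Prisoner's dilemma payoffs keyed by (i's action, j's action).
u_i = {("D", "D"): 2, ("D", "C"): 4, ("C", "D"): 1, ("C", "C"): 3}
u_j = {("D", "D"): 2, ("D", "C"): 1, ("C", "D"): 4, ("C", "C"): 3}

def worst_case_for_i(action):
    """Lowest payoff i can receive after committing to `action`."""
    return min(u_i[(action, b)] for b in ACTIONS)

def best_response_of_i(j_action):
    """Action of i that maximises i's payoff against a fixed action of j."""
    return max(ACTIONS, key=lambda a: u_i[(a, j_action)])

def both_better_off(x, y):
    """True if both agents strictly prefer joint action x to joint action y."""
    return u_i[x] > u_i[y] and u_j[x] > u_j[y]
```

Defecting has a worst case of 2 and cooperating a worst case of 1, and `best_response_of_i` returns `"D"` against either action of j. Yet `both_better_off(("C", "C"), ("D", "D"))` is true, which is exactly the dilemma.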
Iterated prisoner's dilemma

Play forever.
Each agent can see what the opponent did on the previous round.
Which strategy is best?
1. ALWAYS DEFECT.
2. ALWAYS COOPERATE.
3. RANDOM cooperates 50% of the time.
4. TIT FOR TAT cooperates on the first move, and then does whatever its opponent did on the previous move.
5. TIT FOR 2 TATS is a forgiving strategy that defects only when the opponent has defected twice in a row.
6. 2 TITS FOR TAT punishes every defection with two of its own.
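These strategies are easy to try against each other in a small simulation. The sketch below is one possible reading of the six descriptions, not a reference implementation: it assumes the prisoner's dilemma payoffs 2, 4, 1 and 3 for (D,D), (D,C), (C,D) and (C,C), and it plays a finite number of rounds as a stand-in for "play forever". All function names are hypothetical.

```python
import random

# (my move, opponent's move) -> my payoff, using the prisoner's dilemma values.
PAYOFF = {("D", "D"): 2, ("D", "C"): 4, ("C", "D"): 1, ("C", "C"): 3}

# Each strategy receives its own and the opponent's move history and returns "C" or "D".
def always_defect(mine, theirs):
    return "D"

def always_cooperate(mine, theirs):
    return "C"

def random_strategy(mine, theirs):
    return random.choice("CD")

def tit_for_tat(mine, theirs):
    # Cooperate first, then copy the opponent's previous move.
    return theirs[-1] if theirs else "C"

def tit_for_two_tats(mine, theirs):
    # Defect only after two consecutive defections by the opponent.
    return "D" if theirs[-2:] == ["D", "D"] else "C"

def two_tits_for_tat(mine, theirs):
    # Defect if the opponent defected in either of the last two rounds.
    return "D" if "D" in theirs[-2:] else "C"

def play(strategy_a, strategy_b, rounds=10):
    """Play a finite number of rounds and return the total payoffs (a, b)."""
    hist_a, hist_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        hist_a.append(move_a)
        hist_b.append(move_b)
        total_a += PAYOFF[(move_a, move_b)]
        total_b += PAYOFF[(move_b, move_a)]
    return total_a, total_b
```

For example, `play(tit_for_tat, tit_for_tat)` gives each player 30 over ten rounds, while `play(tit_for_tat, always_defect)` gives 19 and 22: TIT FOR TAT loses only the first round and then matches the defector.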
Game of chicken
Rebel Without a Cause
A long straight road with a white line down the middle.
2 fast cars race towards each other from opposite ends.
Each car is expected to keep the wheels on
the same side of the white line.
Whoever swerves first is the chicken.
Nash equilibrium
                      Drive on the Left    Drive on the Right
Drive on the Left     10, 10               0, 0
Drive on the Right    0, 0                 10, 10

If each player has chosen a strategy and no player can
benefit by changing strategies while the other players
keep theirs unchanged, then the current set of strategy
choices and the corresponding payoffs constitute a Nash
equilibrium.
There are 2 Nash equilibria in this payoff table: both drive on the left, or both drive on the right.
Another prisoner's dilemma

If both confess (tell the truth), each gets an eight-year sentence.
If both lie, each gets one year.
If only one confesses, the other (the liar) gets a ten-year sentence.
What should Prisoner 1 do?
For Prisoner 2, the results of lying are worse than those of confessing. He should choose to confess.
Then Prisoner 1 should choose to confess also.
This is the Nash equilibrium.
Nash Equilibrium
2 strategies s1 and s2 are in Nash Equilibrium if:
Under the assumption that agent i plays s1, agent j can do no better than play s2;
Under the assumption that agent j plays s2, agent i can do no better than play s1.
Neither agent has any incentive to deviate from a Nash Equilibrium.
Unfortunately:
Not every interaction scenario has a Nash Equilibrium in pure strategies.
Some interaction scenarios have more than one Nash Equilibrium.
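The definition translates directly into a brute-force check over all pairs of strategies. The sketch below is illustrative only: `pure_nash_equilibria` is a hypothetical helper, the driving and prisoner's dilemma payoffs are taken from the lecture, and the matching pennies game is an outside example (not from these slides) added to show a scenario with no pure-strategy equilibrium.

```python
from itertools import product

def pure_nash_equilibria(payoffs, rows, cols):
    """Return every (row, col) pair from which neither player gains by deviating alone.

    payoffs[(r, c)] is a pair (row player's payoff, column player's payoff).
    """
    equilibria = []
    for r, c in product(rows, cols):
        row_payoff, col_payoff = payoffs[(r, c)]
        row_is_best = all(payoffs[(r2, c)][0] <= row_payoff for r2 in rows)
        col_is_best = all(payoffs[(r, c2)][1] <= col_payoff for c2 in cols)
        if row_is_best and col_is_best:
            equilibria.append((r, c))
    return equilibria

# Driving game: both players get 10 if they pick the same side, 0 otherwise.
driving = {("L", "L"): (10, 10), ("L", "R"): (0, 0),
           ("R", "L"): (0, 0), ("R", "R"): (10, 10)}

# Prisoner's dilemma: (i's action, j's action) -> (u_i, u_j).
prisoners = {("D", "D"): (2, 2), ("D", "C"): (4, 1),
             ("C", "D"): (1, 4), ("C", "C"): (3, 3)}

# Matching pennies (assumption: not in the slides): the row player wins on a match.
pennies = {("H", "H"): (1, -1), ("H", "T"): (-1, 1),
           ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}
```

The driving game yields two equilibria, the prisoner's dilemma yields only mutual defection, and matching pennies yields an empty list, which covers both of the "unfortunately" cases.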
Multi-agent Interaction: Summary

MAS: a number of agents which interact with one another through communication.
An agent's action results in an outcome in the environment.
Utility functions are used for preference orderings.
Game theory is a mathematical theory that studies interactions among agents.
An agent's choice of action is a strategy; the solution concepts covered were:
Dominant strategy
Nash Equilibrium
