
How happy is your web browsing? A probabilistic model to describe user satisfaction.

arXiv:0902.1104v1 [cs.HC] 6 Feb 2009

Anirban Banerji
Bioinformatics Centre, University of Pune
Pune-411007
Maharashtra, India
anirban@bioinfo.ernet.in, anirbanab@gmail.com

Date : 17th December, 2008

Abstract
We all go through scores of web-pages every day, in search
of required information. At times we become satisfied with
the content of some pages; at other times we do not. An objective
framework that attempts to model user satisfaction while the
user searches for some desired piece of information is essential
for Human-Computer Interaction. In the present work, a simple
probabilistic model is constructed to achieve precisely that.
This realistic yet strictly mathematical study proposes
a marker, the 'satisfaction retentivity quotient', to model the
complex realm of user psychology as the user attempts to find the
required information and, simultaneously, forgets some bits of it
here and there.

We read hundreds of web-pages every day to find information of interest to us.
The startling extent of interconnectivity [1] of web-documents has made life
easier for us when we attempt to retrieve a certain piece of information from the
Internet. We all know how happy we become on finding some bits of relevant
information, and how irritated we become on failing to find them in
scores of pages. While significant progress has been made in many realms of
'Human-Computer Interaction' (HCI), a fundamental aspect of the field, namely
an objective model that attempts to capture the states of satisfaction of a user
as he traverses the web in search of a particular set of information, still
eludes students of HCI and, in general, people who use the Internet. Although
some efforts have been made to quantify certain aspects of 'user satisfaction'
([2] studied it as a function of search engines; [3] attempted to describe it
with respect to traffic demand and capacity of each link in the network;
differently still, [4] attempted to describe it as a function of information
retrieval), a realistic picture of the changing satisfaction level of an Internet
user does not generally emerge from these works. Here, a simple probabilistic model
is proposed, which tries to model the user psychology as he wanders around
unknown web-sites in search of a particular set of desired information. An honest
attempt is made to describe the process as it is, instead of assuming that the
user performs unrealistic operations.

Let us attempt to model a situation in which a user is searching for a chunk of
information of a somewhat fuzzy nature (because in most cases the user himself
does not possess categorical knowledge about what exactly he is searching
for, but becomes aware of his precise needs once he starts the search).
Similarly, it is not exactly commonplace for the user to hit upon a
particular web-site where the entire bulk of information is available to
him in one magical attempt. Instead, he/she gathers bits and pieces of
relevant information from various web-sites. We assume that the required set of
information can be broken down into an arbitrarily large number of small bits of
information, which the user accumulates and arranges suitably in his mind to
gather the desired knowledge. Let us assume that a flux of tiny bits of pertinent
information on the very first page of any web-site is what captures the
imagination of the user (say, U) and motivates him to read the content of the entire
site in sufficient detail, in the hope of collecting the maximum amount of
information about the question in his (or her) mind. We assume further
that the satisfaction level of U is purely a function of the level of information
he is receiving from the web-site (in this work, design features of the
web-site that do not contribute to the information content acquired by U are
not considered as parameters that might influence the satisfaction
level of U). We assume further that U's coming across these tiny bits of
pertinent information forms an elementary flow of intensity λ (a scalar rate parameter).

It is realistic to assume that the very moment U touches upon a piece of
information that he thinks might take him closer to the desired piece of information
he is searching for, his satisfaction level grows. To make the calculation simpler,
we assume that this growth of satisfaction in U takes place as a jump of constant,
unit magnitude. However, as generally happens in real life, U very soon
realizes that a typically encountered tiny hint of information is not
taking him closer to the set of information he wishes to possess, but is aimed
at something else that is not exactly related to the question he is
interested in. His satisfaction level therefore starts to decay. To represent the
situation reasonably, we assume this decay function to be exponential in
nature (rather than having two-state (or some other) characteristics), with a
parameter µ. Thus, while λ is influenced predominantly by the content of the
site, the origin of µ is complex, because of its dependence on various parameters.
Nevertheless, those tiny sources of information satisfy U to some extent, and a
cumulative effect of the acquired information starts to build up in his mind.
We assume that the gradual growth of the residual satisfaction of U (attained with
the bits and pieces of information acquired from the web-site) over the
traversing time t through the web-page can be captured by summing these
contributions. We designate the user-satisfaction level by X(t). In the present
work, a simple mathematical model is proposed which attempts to find the
characteristics of this growth of user-satisfaction upon browsing through a web-site.

At this point we segment our study into two cases.


Case 1) :

Let us assume that U touches upon pertinent information at random moments
$T_1, T_2, \ldots, T_i, \ldots$, which form an elementary flow of events. The user
satisfaction level at any arbitrarily chosen moment t, due to his interaction with
a particular bit of information (say, the ith bit) encountered at the
moment $T_i$, is given by :

$$S_i(t) = 0, \qquad (t < T_i) \tag{1}$$

$$S_i(t) = e^{-\mu(t - T_i)}, \qquad (t > T_i) \tag{2}$$

Hence, compositely, we can write $S_i(t) = \mathbf{1}(t - T_i)\,e^{-\mu(t - T_i)}$,
where $\mathbf{1}(t)$ is the unit (step) function; $T_i > 0$, $t > 0$.
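As a minimal sketch of eqns 1 and 2, the contribution of a single bit of information can be written as a small Python function. The name `satisfaction_kernel` and the convention that the contribution equals 1 exactly at the instant $t = T_i$ are assumptions made here for illustration; the text leaves that instant unspecified.

```python
import math


def satisfaction_kernel(t: float, t_i: float, mu: float) -> float:
    """Contribution S_i(t) of a bit of information met at time t_i (eqns 1-2)."""
    if t < t_i:
        # The bit has not been encountered yet, so it contributes nothing.
        return 0.0
    # Unit jump at t_i (assumed convention), followed by exponential decay.
    return math.exp(-mu * (t - t_i))


# Hypothetical values: a bit met at t_i = 1.0, decay parameter mu = 0.5.
for moment in (0.5, 1.0, 2.0, 5.0):
    print(moment, satisfaction_kernel(moment, t_i=1.0, mu=0.5))
```

The contribution is zero before the encounter and shrinks steadily afterwards, which is the "grow, then forget" behavior described in the text.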

Let us now define a random variable Ω, which describes the number of such
tiny bits of information that influence the satisfaction level of U. To be
realistic, this variable has a Poisson distribution with parameter λt ([5-8]).
Further, to describe the real-life situation properly, we represent the
user-satisfaction level X(t) as the sum of a random number of random terms :

$$X(t) = \sum_{i=1}^{\Omega} e^{-\mu(t - T_i)}\,\mathbf{1}(t - T_i) \tag{3}$$
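The sum in eqn 3 can be sampled directly. The sketch below is only an illustration under stated assumptions: the Poisson flow is generated from exponentially distributed gaps between successive encounters (a standard construction), and the function names and numerical values are hypothetical, not taken from the paper.

```python
import math
import random


def arrival_times(lam, t, rng):
    """Moments T_1 < T_2 < ... of a Poisson flow of intensity lam inside (0, t)."""
    times = []
    current = rng.expovariate(lam)  # waiting time until the first encounter
    while current < t:
        times.append(current)
        current += rng.expovariate(lam)  # gap until the next encounter
    return times


def satisfaction_level(lam, mu, t, rng):
    """One random draw of X(t) = sum_i exp(-mu * (t - T_i)), as in eqn 3."""
    return sum(math.exp(-mu * (t - t_i)) for t_i in arrival_times(lam, t, rng))


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [satisfaction_level(lam=2.0, mu=0.5, t=4.0, rng=rng) for _ in range(5)]
print(draws)
```

Each call returns a different value, which reflects the fact that X(t) is a random quantity rather than a deterministic curve.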

A Poisson flow of events on any interval (0, t) can be represented, with
sufficient accuracy, as a collection of points on that interval (described in the
'Supplementary Material', available on request), where the coordinate
$\alpha_i \in (0, t)$ of each point is uniformly distributed on that interval and
does not depend on the coordinates of the other points. This is natural to expect
even from a non-mathematical, intuitive understanding of the situation, because
the user U comes across all these points (each representing the exact instant of
finding a bit of relevant information) during the interval (0, t), and this
description exhaustively represents the entire event space of favorable
encounters for the user during (0, t).

Therefore, eqn 3 can be re-written in the form :

$$X(t) = \sum_{i=1}^{\Omega} e^{-\mu(t - \alpha_i)} \tag{4}$$

where the random variables $\alpha_i$ are independent and uniformly distributed in
the interval (0, t).

Since the satisfaction of U is a function of the interplay of λ and µ, and
practical experience suggests it to be cumulative in nature, we can attempt
to model user satisfaction as the resultant of the individual events of favorable
information gathering and unfavorable decay. We designate
$X_i(t) = e^{-\mu(t - \alpha_i)} = e^{-\mu t} e^{\mu \alpha_i}$,
where $X_i(t)$ represents each of these tiny events. Hence we have :

$$X(t) = \sum_{i=1}^{\Omega} X_i(t) = e^{-\mu t} \sum_{i=1}^{\Omega} e^{\mu \alpha_i} \tag{5}$$

where the $X_i(t)$ are independent, identically distributed random variables, and the
random variable Ω does not depend upon the random variables $X_i(t)$ either.
Here we note that although X(t) is cumulative in nature, it is essentially a
stochastic process.

At this moment, we invoke the known formula for the mean value and
variance of the sum of a random number of random variables [9],[10]. If a random
variable Z is a sum $Z = \sum_{i=1}^{\Omega} X_i$, where the random variables $X_i$
are independent and have the same distribution with mean value $m_x$ and variance
$\mathrm{Var}_x$, and the number of terms Ω is an integer-valued random variable
which does not depend upon the terms $X_i$ and has mean value $m_\Omega$ and
variance $\mathrm{Var}_\Omega$, then the mean value $m_z$ and variance
$\mathrm{Var}_z$ of Z are given by $m_z = m_x m_\Omega$
and $\mathrm{Var}_z = \mathrm{Var}_x\, m_\Omega + m_x^2\, \mathrm{Var}_\Omega$.
Hence, in the present case, we have :

$$m_x(t) = m_\Omega(t)\, m_{x_i}(t) \tag{6}$$

and

$$\mathrm{Var}_x(t) = m_\Omega(t)\, \mathrm{Var}_{x_i}(t) + \mathrm{Var}_\Omega(t)\, m_{x_i}^2(t) \tag{7}$$

Since the random variable Ω has a Poisson distribution with parameter λt, it
follows that $m_\Omega(t) = \mathrm{Var}_\Omega(t) = \lambda t$.
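The random-sum formula quoted above is easy to mechanize. The following is a small sketch, with hypothetical function names and input values, of the general rule and of its Poisson special case in which the mean and the variance of the number of terms both equal λt.

```python
def random_sum_moments(m_x, var_x, m_n, var_n):
    """Mean and variance of Z = X_1 + ... + X_N for i.i.d. X_i independent of N."""
    m_z = m_x * m_n
    var_z = var_x * m_n + m_x ** 2 * var_n
    return m_z, var_z


def poisson_sum_moments(m_x, var_x, lam, t):
    """Special case of eqns 6-7: N is Poisson, so its mean and variance are lam * t."""
    return random_sum_moments(m_x, var_x, lam * t, lam * t)


# Hypothetical per-term moments, chosen only to exercise the formula.
print(poisson_sum_moments(m_x=0.5, var_x=0.1, lam=2.0, t=3.0))
```

In the Poisson case the variance collapses to λt times the second moment of a single term, since the variance of a term plus its squared mean is its second moment about the origin.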

To find $m_{x_i}(t)$ :

$$m_{x_i}(t) = E[X_i(t)] = \frac{1}{t} \int_0^t e^{-\mu(t - x)}\,dx = \frac{1 - e^{-\mu t}}{\mu t}.$$

However, it is pragmatic to assume that the process of estimation of $m_{x_i}(t)$
in the user's mind is less than smooth and therefore, to describe the situation
realistically, we need to calculate a quantity analogous to the moment of
inertia of $m_{x_i}(t)$, if in the mental space $m_{x_i}(t)$ is described as a
line-shaped object. Hence we determine the second moment about the origin of the
random variable $X_i(t)$ :

$$E[X_i^2(t)] = \frac{1}{t} \int_0^t \left[e^{-\mu(t - x)}\right]^2 dx = \frac{1 - e^{-2\mu t}}{2\mu t}.$$
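The two closed forms for the first and second moments of $X_i(t)$ can be checked numerically. The sketch below compares them with a plain midpoint-rule approximation of the corresponding averages over (0, t); the helper names and the values of µ and t are assumptions chosen only for illustration.

```python
import math


def midpoint_average(f, t, steps=10_000):
    """Approximate (1/t) * integral_0^t f(x) dx with the midpoint rule."""
    h = t / steps
    return sum(f((k + 0.5) * h) for k in range(steps)) * h / t


def first_moment(mu, t):
    """Closed form of E[X_i(t)] = (1 - exp(-mu t)) / (mu t)."""
    return (1.0 - math.exp(-mu * t)) / (mu * t)


def second_moment(mu, t):
    """Closed form of E[X_i(t)^2] = (1 - exp(-2 mu t)) / (2 mu t)."""
    return (1.0 - math.exp(-2.0 * mu * t)) / (2.0 * mu * t)


mu, t = 0.7, 3.0  # illustrative values only
numeric_m1 = midpoint_average(lambda x: math.exp(-mu * (t - x)), t)
numeric_m2 = midpoint_average(lambda x: math.exp(-mu * (t - x)) ** 2, t)
print(numeric_m1, first_moment(mu, t))
print(numeric_m2, second_moment(mu, t))
```

Each printed pair should agree to many decimal places, which is a quick sanity check on the integration steps.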

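Hence,

$$m_x(t) = \lambda\, \frac{1 - e^{-\mu t}}{\mu} \tag{8}$$

$$\mathrm{Var}_x(t) = \lambda t\,[\mathrm{Var}_{x_i}(t) + m_{x_i}^2(t)] = \lambda t\, E[X_i^2(t)] = \lambda\, \frac{1 - e^{-2\mu t}}{2\mu} \tag{9}$$

It is interesting to notice that as t → ∞, the mean value and variance of the
process X(t) do not depend on time, since

$$\lim_{t \to \infty} m_x(t) = m_x = \frac{\lambda}{\mu} \tag{10}$$

and

$$\lim_{t \to \infty} \mathrm{Var}_x(t) = \mathrm{Var}_x = \frac{\lambda}{2\mu} \tag{11}$$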

This is expected purely from an intuitive perspective also. After traversing
the web-site(s) for a sufficiently long time, the user is expected to
gather a finite amount of desirable information. However, since he fails to
remember all of it, only a fraction of the amassed information will be retained
by him. Hence the fraction $\lambda/\mu$ can be named the 'satisfaction
retentivity quotient'.
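To make the saturation concrete, the closed forms of eqns 8 and 9 can be evaluated for growing browsing times and compared with the limits of eqns 10 and 11. This is a sketch only; the function names and the values of λ and µ are hypothetical and are not taken from the paper.

```python
import math


def mean_satisfaction(lam, mu, t):
    """m_x(t) of eqn 8."""
    return lam * (1.0 - math.exp(-mu * t)) / mu


def variance_satisfaction(lam, mu, t):
    """Var_x(t) of eqn 9."""
    return lam * (1.0 - math.exp(-2.0 * mu * t)) / (2.0 * mu)


def retentivity_quotient(lam, mu):
    """Long-time mean lam / mu (eqn 10), the 'satisfaction retentivity quotient'."""
    return lam / mu


lam, mu = 3.0, 0.5  # illustrative values only
for t in (1.0, 5.0, 20.0, 50.0):
    print(t, mean_satisfaction(lam, mu, t), variance_satisfaction(lam, mu, t))
print("limits:", retentivity_quotient(lam, mu), lam / (2.0 * mu))
```

A larger µ (faster forgetting) lowers the plateau, while a larger λ (richer content) raises it, which is the trade-off the quotient summarizes.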

If we represent a sufficiently large number by L, then the distribution
profile of the section of the stochastic process X(t) for
$m_x = \lambda/\mu > L$ can be interesting. For this we consider a finite but
sufficiently large interval (0, t) and assume that the user's interaction
with the desired pieces of information takes place a sufficiently large number of
times Ω on that interval. In such a situation the process X(t) (eqn 3) is a sum of
a large number of independent, identically distributed random variables, and
therefore has an approximately normal distribution (since in this case the
conditions of the central limit theorem are in fact fulfilled). Henceforth, the
section of the stochastic process can be considered to have a normal distribution
with characteristics $m_x = \lambda/\mu$ and $\mathrm{Var}_x = \lambda/(2\mu)$.

To understand the characteristics of user satisfaction, it is imperative to form
an idea about its evolution over the time interval of browsing through the
web-site(s). Hence we proceed to find the correlation function between user
satisfaction profiles at different instants of the browsing operation. This can be
done by considering two sections of the stochastic process in question, at the
moments t and t′ (t′ > t). By virtue of the assumptions made, we can assert that
the user satisfaction X(t′) at the moment t′ is equal to the satisfaction
X(t) at the moment t multiplied by the exponential factor $e^{-\mu(t' - t)}$, plus the
satisfaction Ω(t′ − t) which comes into being due to the user's coming across some
interesting bits of information during the time interval (t, t′). Hence X(t′) is
given by :

$$X(t') = X(t)\,e^{-\mu(t' - t)} + \Omega(t' - t) \tag{12}$$

The stochastic processes X(t) and Ω(t′ − t) are evidently independent, since they
are generated by the user's interaction with desired pieces of information during
different, non-overlapping time intervals (0, t) and (t, t′) respectively.

The same can be said about the centered stochastic processes $\dot{X}(t)$ and
$\dot{\Omega}(t' - t)$, where we define $\dot{X}(t) = X(t) - m_x(t)$ and
$\dot{\Omega}(t' - t) = \Omega(t' - t) - m_\Omega(t' - t)$ as the
centered random functions of the aforementioned stochastic processes.
Hence, using eqn 12, we have :

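$$
\begin{aligned}
C_x(t, t') &= E\left[\dot{X}(t)\,\dot{X}(t')\right] \\
&= E\left[\dot{X}(t)\left\{\dot{X}(t)\,e^{-\mu(t' - t)} + \dot{\Omega}(t' - t)\right\}\right] \\
&= E\left[\dot{X}(t)^2\right] e^{-\mu(t' - t)} \quad \text{if } (t' > t) \\
&= E\left[\dot{X}(t')^2\right] e^{-\mu(t - t')} \quad \text{if } (t' < t)
\end{aligned}
$$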

Thus the correlation function can be compositely expressed as :

$$C_x(t, t') = \mathrm{Var}_x(\min(t, t'))\, e^{-\mu|t' - t|} = \frac{\lambda}{2\mu}\left[1 - e^{-2\mu \min(t, t')}\right] e^{-\mu|t' - t|} \tag{13}$$

Let us consider the limiting behavior of the stochastic process when t → ∞ and
t′ → ∞, but their difference τ = t′ − t remains finite. In this case,

$$C_x(\tau) = \mathrm{Var}_x\, e^{-\mu|\tau|} = \frac{\lambda}{2\mu}\, e^{-\mu|\tau|}.$$
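The correlation function and its long-time limit can be sketched as follows. The code reads eqn 13 as the variance of eqn 9 evaluated at the earlier of the two moments, multiplied by the decay factor over the gap between them; this reading, the function names and the numerical values are assumptions made for illustration.

```python
import math


def correlation(lam, mu, t, t_prime):
    """C_x(t, t'): variance at the earlier moment, damped over the gap |t' - t|."""
    earlier = min(t, t_prime)
    variance_at_earlier = lam * (1.0 - math.exp(-2.0 * mu * earlier)) / (2.0 * mu)
    return variance_at_earlier * math.exp(-mu * abs(t_prime - t))


def stationary_correlation(lam, mu, tau):
    """Long-time limit C_x(tau) = lam / (2 mu) * exp(-mu * |tau|)."""
    return lam / (2.0 * mu) * math.exp(-mu * abs(tau))


lam, mu, tau = 2.0, 0.5, 1.5  # illustrative values only
for t in (0.5, 2.0, 10.0, 100.0):
    print(t, correlation(lam, mu, t, t + tau), stationary_correlation(lam, mu, tau))
```

As t grows with the gap τ held fixed, the first printed column approaches the second, which depends on τ alone.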

Hence the stochastic process X(t) representing user satisfaction practically
attains stationarity (its mean, variance and correlation function no longer depend
on t) when the user spends a long time searching for some desired bulk of
information, which conforms to our experience. Furthermore, its distribution
approaches a normal distribution when the user searches for long (in other words,
t → ∞, t′ → ∞) and $\lambda/\mu > L$.

Of course the user can hit upon a web-site where all the information of
interest to him is kept in one place. In such an (unlikely) case, naturally µ → 0.
Here the user satisfaction will be a Poisson process, since every new
piece of information that the user encounters will exactly match
the desired set of information he wanted to collate. Consequently, the decay in
the user's interest will be minimal. In such a case, the expressions for $m_x(t)$,
$\mathrm{Var}_x(t)$ and $C_x(t, t')$ assume the form :

$$\lim_{\mu \to 0} m_x(t) = \lim_{\mu \to 0} \lambda\,\frac{1 - e^{-\mu t}}{\mu} = \lim_{\mu \to 0} \mathrm{Var}_x(t) = \lim_{\mu \to 0} \lambda\,\frac{1 - e^{-2\mu t}}{2\mu} = \lambda t$$

$$\lim_{\mu \to 0} C_x(t, t') = \lim_{\mu \to 0} \frac{\lambda}{2\mu}\left[1 - e^{-2\mu \min(t, t')}\right] e^{-\mu|t' - t|} = \lambda \min(t, t')$$
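The µ → 0 limit can be illustrated numerically. In the sketch below, `math.expm1` is used so that 1 − e^(−x) remains accurate when x is tiny; the function names and the values of λ and t are hypothetical.

```python
import math


def mean_satisfaction(lam, mu, t):
    """m_x(t) = lam * (1 - exp(-mu t)) / mu, written with expm1 for small mu."""
    return -lam * math.expm1(-mu * t) / mu


def variance_satisfaction(lam, mu, t):
    """Var_x(t) = lam * (1 - exp(-2 mu t)) / (2 mu), written with expm1."""
    return -lam * math.expm1(-2.0 * mu * t) / (2.0 * mu)


lam, t = 2.0, 5.0  # illustrative values only
for mu in (1.0, 1e-2, 1e-4, 1e-8):
    print(mu, mean_satisfaction(lam, mu, t), variance_satisfaction(lam, mu, t))
# Both columns approach lam * t as mu shrinks, the mean and variance
# of a Poisson count with parameter lam * t.
```

Equal mean and variance is the signature of a Poisson count, consistent with the statement that satisfaction simply accumulates when nothing is forgotten.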

Case 2) :

In some other real-life cases another situation is frequently encountered:
certain related pieces of information from a web-site conform to the user's desired
set of information, and the user comes across these related pieces of information
in a somewhat quantized form. Although this case is similar to the one discussed
already, there are certain subtle differences. To find the characteristics of the
user satisfaction level in this situation, we retain the assumptions
made earlier and assume further that the user's coming across
such quanta of desired information forms an elementary flow with intensity λ.
The exact number of pieces of information that constitute the ith quantum of desired
information is assumed to be a random variable $R_i$ which, in keeping with the
real-life situation, is independent of the number of pieces of information that
constitute any other quantum. The random variable $R_i$ has a distribution f(R)
with mean $m_R$ and variance $\mathrm{Var}_R$.

Just as in the case where the user was encountering the desired information in bits
and pieces (eqn 3), here too we can represent the user-satisfaction level by :

$$X(t) = \sum_{i=1}^{\Omega} R_i\, e^{-\mu(t - \alpha_i)} \tag{14}$$

where the random variables Ω, $R_i$ and $\alpha_i$ are mutually independent.

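In keeping with the Case 1 approach, we designate
$X_i(t) = R_i\, e^{-\mu(t - \alpha_i)}$, and then

$$E[X_i(t)] = m_R\, \frac{1 - e^{-\mu t}}{\mu t} \quad \text{and} \quad E[X_i^2(t)] = (\mathrm{Var}_R + m_R^2)\, \frac{1 - e^{-2\mu t}}{2\mu t}$$

Hence,

$$m_x(t) = \lambda m_R\, \frac{1 - e^{-\mu t}}{\mu} \tag{15}$$

and

$$\mathrm{Var}_x(t) = \lambda\,(\mathrm{Var}_R + m_R^2)\, \frac{1 - e^{-2\mu t}}{2\mu} \tag{16}$$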

Since $m_R > 0$ and $\mathrm{Var}_R > 0$, and a quantum carries more than one piece
of information on average ($m_R > 1$), eqn 15 grows faster than eqn 8; similarly,
eqn 16 grows faster than eqn 9. This is completely in agreement with
practical experience: since the user comes across the desired bulk of information
in a coherent, quantized form, his acquisition of knowledge becomes faster.
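The quantized case can be explored with a short simulation of eqn 14, compared against the mean of eqn 15. Everything specific in the sketch is an assumption made for illustration: the quantum size $R_i$ is drawn uniformly from {1, ..., 5} (so that $m_R = 3$), the Poisson flow is generated from exponential gaps, and the names and parameter values are hypothetical.

```python
import math
import random


def quantized_satisfaction(lam, mu, t, draw_size, rng):
    """One draw of X(t) = sum_i R_i * exp(-mu * (t - T_i)), as in eqn 14."""
    total = 0.0
    arrival = rng.expovariate(lam)  # moment of the first quantum
    while arrival < t:
        total += draw_size(rng) * math.exp(-mu * (t - arrival))
        arrival += rng.expovariate(lam)  # gap until the next quantum
    return total


def quantized_mean(lam, mu, t, m_r):
    """m_x(t) of eqn 15."""
    return lam * m_r * (1.0 - math.exp(-mu * t)) / mu


def sample_quantum_size(rng):
    # Assumed for illustration: R_i uniform on {1, ..., 5}, hence m_R = 3.
    return rng.randint(1, 5)


rng = random.Random(42)  # fixed seed so the run is repeatable
lam, mu, t = 2.0, 0.5, 4.0
draws = [quantized_satisfaction(lam, mu, t, sample_quantum_size, rng)
         for _ in range(20000)]
empirical_mean = sum(draws) / len(draws)
print(empirical_mean, quantized_mean(lam, mu, t, m_r=3.0))
```

The empirical average should sit close to the value from eqn 15, and setting `m_r=1.0` in `quantized_mean` recovers the Case 1 mean of eqn 8 for comparison.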

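This basic scheme of swiftness of knowledge gathering (obviously) does not change
when the user browses for a long time, and in the limiting case t → ∞ we have :

$$\lim_{t \to \infty} m_x(t) = m_x = \frac{\lambda m_R}{\mu}$$

$$\lim_{t \to \infty} \mathrm{Var}_x(t) = \mathrm{Var}_x = \frac{\lambda\,(\mathrm{Var}_R + m_R^2)}{2\mu}$$

$$C_x(\tau) = \mathrm{Var}_x\, e^{-\mu|\tau|}$$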

Conclusion :

A probabilistic model is proposed here that describes the satisfaction profile of
a user when he browses through web-site(s) in search of a desired set of
information. Since the results obtained from theoretical considerations seem to
agree well with our routine experience, the model can be considered reasonably
reliable. The model points to a stationarity in the user satisfaction
profile when the browsing operation continues for a long time. Most
importantly, it suggests a marker, the 'satisfaction retentivity quotient', that
captures the essence of the entire process and can help in the objective
description of many of the processes that the rapidly emerging field of HCI
attempts to model.

Acknowledgment : This work was supported by the COE-DBT (Department of
Biotechnology, Govt. of India) Scheme.

The author would like to thank the present and previous Directors of the
Bioinformatics Centre, University of Pune, Dr. Urmila Kulkarni-Kale and Professor
Indira Ghosh, for supporting him in performing this work, although it has
nothing to do with his PhD research.

References :

[1] D. Cohn and T. Hofmann, The Missing Link - A Probabilistic Model of Document
Content and Hypertext Connectivity, in T. Leen et al., eds., Advances in
Neural Information Processing Systems 13; MIT Press, Cambridge, MA, 430-436, 2001.
[2] S. Kohli and E. Kumar, Development of a Fuzzy Based User Satisfaction
Model of a Search Session with Search Engine; 2nd National Conference
Mathematical Techniques: Emerging Paradigms for Electronics and IT Industries,
202-210, 2008.
[3] M. Liu and D. M. Frangopol, Probability-Based Bridge Network Performance
Evaluation; J. Bridge Engrg. 11(5), 633-641, 2006.
[4] B. Wu, K.Y.M. Wong, and D. Bodoff, Mean Field Approach to a Probabilistic
Model in Information Retrieval; Advances in Neural Information Processing
Systems 15: Proceedings of the 2002 Conference; eds. Becker S., et al., MIT
Press, Cambridge, MA, 513-520, 2003.
[5] D. Cahoy, Fractional Poisson process in terms of alpha-stable densities, PhD
Thesis, Case Western Reserve University, Cleveland, Ohio, United States, 2007.
[6] T. Bonald and A. Proutiere, Insensitive Bandwidth Sharing in Data Networks,
Queuing Sys.: Theory and Appl., 44(1), 69-100, 2003.
[7] E. Wentzel and L. Ovcharov, Applied Problems in Probability Theory,
English translation, Mir, Moscow, pp. 200, 1986.
[8] I. Kovalenko, N. Kuznetsov, and V. Shurenkov, Models of random processes:
a handbook for mathematicians and engineers, CRC Press, Boca Raton, pp. 75-78, 1996.
[9] A. Mayerson, D. Jones, and N. Bowers, On the Credibility of the Pure Premium,
Proc. of the Casualty Actuarial Society, 55, 175-185, 1968.
[10] B. Gnedenko, Theory of Probability, 6th ed., CRC Press, Boca Raton, pp.
303-308, 1998.
