PReCinCt: A Scheme for Cooperative Caching in Mobile Peer-to-Peer Systems ∗

Huaping Shen, Mary Suchitha Joseph, Mohan Kumar, and Sajal K. Das
Center for Research in Wireless Mobility and Networking (CReWMaN)
Department of Computer Science and Engineering, University of Texas at Arlington
Arlington, TX 76019, USA
{hpshen, mjoseph, kumar, das}@cse.uta.edu

Abstract

Mobile Peer-to-Peer (MP2P) systems consist of mobile peers that collaborate with each other to complete application tasks. Information sharing in such environments is a challenging problem due to the fundamental limitations of battery power, wireless bandwidth, and users' frequent mobility. We previously proposed a novel scheme, called Proximity Regions for Caching in Cooperative MP2P Networks (PReCinCt), to efficiently support scalable data retrieval in large-scale MP2P networks. In the PReCinCt scheme, the network topology is divided into geographical regions, where each region is responsible for a set of keys representing the data. In this paper, we extend the PReCinCt scheme to facilitate consistent cooperative caching in MP2P systems. The caching scheme considers data popularity, data size and region-distance during replacement to optimize the cache content of peers. PReCinCt employs a hybrid push/pull mechanism to maintain data consistency among replicas in the network. Simulation results show that the cost of consistency maintenance, in terms of latency and energy consumption, is significantly improved in the PReCinCt scheme.

1. Introduction and Background

In Peer-to-Peer (P2P) systems, peers with equivalent functionality communicate and share resources. With the rapid advancements in mobile wireless communication technology, P2P computing has also been introduced into mobile and wireless networks, thus leading to mobile peer-to-peer (MP2P) networks. MP2P systems consist of peers that are user applications executing on mobile devices. Mobile peers interact during (brief) physical encounters in the real world, thereby engaging in short-haul wireless exchanges of data [10]. Typically, P2P systems enable direct real-time sharing of services and information among distributed peers. In contrast to P2P systems on wired networks that comprise static peers, MP2P systems are subject to the limitations of battery power, wireless bandwidth, and a highly dynamic network topology. Thus MP2P systems give rise to new challenges for research on routing, resource discovery, data retrieval, data consistency maintenance, and security and privacy management.

In existing unstructured P2P networks [17], flooding is the most popular data retrieval mechanism. Flooding entails message processing at every node, which is expensive in terms of bandwidth, battery power and computational resources. In [12], the expanding ring scheme is proposed to reduce the cost of each data retrieval: a node starts a flood for a request with a small Time-to-Live (TTL) and continuously increases the TTL until the data is found. Due to resource constraints (e.g., battery and wireless bandwidth), neither flooding nor expanding ring schemes are expected to adapt well to large-scale MP2P networks. In structured P2P networks [15], a distributed hash table (DHT) is used to deploy and retrieve data in controlled network topologies. In DHT-based data retrieval schemes, given a key, the query is routed to a specific node (location) that is responsible for storing the value associated with the key. However, due to the nodes' frequent mobility, the data retrieval schemes of structured P2P networks are also not expected to perform well in MP2P networks.

In this paper, we investigate cooperative caching and consistency issues in a novel scheme, called Proximity Regions for Caching in Cooperative MP2P Networks (PReCinCt) [11]. In this scheme, the entire network topology is divided into geographical regions, each being responsible for a set of keys representing the data of interest. A hash function h(ki) = Lj is used at each peer to map a key ki to a region Rj's location Lj. We call the region Rj the home region of key ki. When a peer needs a data item represented by a key ki, the peer obtains the location information of home region Rj by using the hash function, and then a request message is sent to the home region Rj by using a geographic-aided routing protocol, such as GPSR [8].

∗ This work was supported by the National Science Foundation under Grant No. 0129682 and the Texas Advanced Research Program under Grant Number 14-771032.

Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05)
1530-2075/05 $20.00 IEEE
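The region lookup just described — hash a key to a location, then pick the region whose center is closest — can be sketched as follows. The hash function, coordinate bounds and region list here are illustrative assumptions, not the paper's actual parameters.

```python
import hashlib

def hash_to_location(key, width, height):
    """Map a key to an (x, y) location in the service area (assumed geographic hash)."""
    digest = hashlib.sha1(key.encode()).digest()
    x = int.from_bytes(digest[:4], "big") % width
    y = int.from_bytes(digest[4:8], "big") % height
    return (x, y)

def home_region(key, regions, width=1000, height=1000):
    """Return the region whose center is closest to the key's hashed location."""
    loc = hash_to_location(key, width, height)
    return min(regions, key=lambda r: (r["center"][0] - loc[0]) ** 2
                                      + (r["center"][1] - loc[1]) ** 2)

# Four illustrative square regions on a 1000 x 1000 plane.
regions = [{"id": i, "center": (250 + 500 * (i % 2), 250 + 500 * (i // 2))}
           for i in range(4)]
r = home_region("song.mp3", regions)
```

Because any peer can evaluate the same hash locally, no directory lookup is needed before routing the request towards the home region.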
After reaching the home region, localized flooding is used to locate the peer holding the requested data. By routing to regions rather than to specific points, PReCinCt requires only approximate location information for each region, thus making it robust to errors in location measurement and to the frequent mobility of peers.

Caching is a key technique for improving the data retrieval performance of mobile devices [14]. Cooperative caching is a mechanism in which peers can access data items from the caches of their neighboring peers. In the literature, cooperative caching schemes have been studied to improve the performance of web proxies in wired networks [5][13]. In order to further save bandwidth on each data retrieval, PReCinCt incorporates a novel cooperative caching scheme that caches relevant data among a set of peers in a region. In contrast to existing caching schemes, our cooperative caching scheme determines cache replacement by considering not only each peer's own access frequency but also the importance of data items to other peers in the same region. The proposed caching strategy, Greedy-Dual Least-Distance (GD-LD), uses a utility function to evaluate the importance of each data item based on a combination of three factors: the popularity of the item in the region, the size of the item, and the region distance between the requesting and responding peers. In this paper, the words peer and node are used interchangeably.

Most existing P2P systems are specifically designed to share static content such as music and video files, and thus assume that data is static and updates occur very infrequently [4]. To handle dynamic data, the existing techniques must advance from a primarily read-only system to one in which data can be read as well as updated frequently. Handling dynamic data necessitates effective consistency maintenance schemes to ensure that all replicas of a data item are consistent with each other. These observations have motivated us to consider cache consistency issues in MP2P systems.

In this paper, we propose a hybrid push/pull algorithm called Push with Adaptive Pull that uses minimal message overhead to maintain data consistency. We investigate and compare the proposed scheme with two existing cache consistency algorithms: Plain-Push [3], a push-based scheme initiated by the peer making the update, and Pull-Every-time [7], initiated by the peer requesting the data. The proposed Push with Adaptive Pull algorithm enhances the original PReCinCt scheme.

In Plain-Push, the new update for a data item is pushed by the initiator to the entire MP2P network using flooding. When a peer receives an invalidation message, it checks its cache and invalidates the relevant data. The advantage of the push scheme is its simplicity and stateless nature. However, this scheme substantially increases the control message overhead due to subsequent broadcasts of the invalidation messages, and entails message processing at every peer. This is undesirable in mobile environments where bandwidth is scarce and peers have very limited battery power. In addition, if a peer is unreachable, it will not receive the invalidations, thus introducing the risk of violating consistency guarantees.

In the Pull-Every-time scheme, individual peers are responsible for maintaining the consistency of their cached items. In this scheme, each time a peer P requests a data item d and there is a local cached copy, the peer polls the owner of the data to validate its copy of d. The advantage of this scheme is its simplicity and strong consistency guarantees. The disadvantage is that it generates a large amount of network traffic, thus consuming a considerable amount of bandwidth. Also, the latency increases, as the response time incurs an additional round-trip delay to poll the owner of the data item.

Push with Adaptive Pull combines the positive features of the Plain-Push and Pull-Every-time schemes in order to address the problems of the original schemes. During the push phase of our proposed scheme, the peer updating a data item sends the update to the home region for that data item. During the pull phase, peers poll the home regions to check if their cache contents are obsolete. We propose an adaptive polling scheme that determines the frequency at which peers should poll the home regions. The home region assigns a Time-to-Refresh (TTR) value to each of its data items. This TTR value is an estimate of the duration of time during which the data item is unlikely to change. Requests for a data item before the expiration of the TTR are served by the peer's local cache. Upon expiration of the TTR, peers poll the home region to check the validity of the data item.

The rest of the paper is organized as follows: Section 2 presents an overview of the PReCinCt scheme, including a discussion of region management, fault tolerance and data replication. Section 3 discusses the caching strategy employed in PReCinCt, and Section 4 discusses data consistency issues in the PReCinCt scheme. Section 5 presents an analytical performance evaluation of the PReCinCt scheme. Section 6 presents detailed simulation results, and finally Section 7 concludes the paper and discusses future work.

2. Description of PReCinCt Scheme

This section describes our PReCinCt scheme for data retrieval in MP2P networks. In this scheme, the network topology is divided into geographical regions, each responsible for a set of keys of data.
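Each peer's region table (whose maintenance is detailed in Section 2.1) can be modeled minimally as below, with the four region-modification operations. Region shapes are reduced to center points for brevity, which is an illustrative simplification; the scheme also stores perimeter vertices.

```python
class RegionTable:
    """Minimal region table: region id -> center location (perimeter omitted)."""
    def __init__(self):
        self.regions = {}

    def add(self, rid, center):
        """Add: record a new region, expanding the network topology."""
        self.regions[rid] = center

    def delete(self, rid):
        """Delete: drop the entry of a region no longer in the network."""
        del self.regions[rid]

    def merge(self, rid_a, rid_b, new_rid, new_center):
        """Merge: replace two neighboring regions with one new region entry."""
        del self.regions[rid_a]
        del self.regions[rid_b]
        self.regions[new_rid] = new_center

    def separate(self, rid, new_a, center_a, new_b, center_b):
        """Separate: divide one region into two new regions."""
        del self.regions[rid]
        self.regions[new_a] = center_a
        self.regions[new_b] = center_b
```

After any of these operations, the update must still be disseminated to all peers and the affected keys relocated, as described in Section 2.1.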
The keys for the data are distributed among the regions such that each key ki maps to a region Rj. These key-set-to-region mappings are known to all peers through a geographic hash function. Every data item is associated with a home region, where the data item is initially stored.

2.1. Region Management

In PReCinCt, the whole network topology is divided into multiple geographical regions. Each region is represented by the location information of its center point and all vertices on its perimeter. Each peer uses a region table to keep the location information of all regions in the whole network. When a peer joins the network for the first time, it can retrieve the region table from its neighboring peers. The size and shape of the regions can be changed by updating the region table of all peers in the network. At the same time, each key in the network also needs to be relocated according to the region table changes.

There are four operations that can be applied to modify regions: Add, Delete, Merge, and Separate. In the Add operation, a new entry containing the location information of the new region is added to the region table, to indicate the expansion of the whole network topology. When a region is no longer in the network, the Delete operation removes the entry of the corresponding region from the region table. Merge replaces the entries of two existing regions in the region table with the location information entry of the new region, indicating the merging of two neighboring regions. Separate is used to divide one region into two new regions. All four operations are executed by updating the region location information in the region table. After each execution, the peer needs to disseminate the update to all other peers in the whole network to guarantee the consistency of the region tables of all peers.

2.2. Data Search Process

When a peer needs a data item, it first floods the request in the region in which it currently resides, to determine whether any of its neighboring peers have a local cached copy of the item. If so, it is a local hit: the request succeeds and the response is sent back to the requesting peer. If not, the requesting peer uses the geographic hash function to obtain a location according to the query. The home region is determined by searching the region table for the region whose center location is closest to that location. The requesting peer then generates a request message containing the following three fields: i) the identity of the peer making the request; ii) the location of the home (destination) region for the requested item; and iii) the key of the data item being requested.

Nodes routing the request message towards the destination region check the destination region location in the header to determine whether they are within that region. The first node inside the destination region to receive the request message, identified as the point of broadcast, floods the message within the region to locate the peer holding the data item. Each peer in the home region processes the request message to determine if it has the requested data. Peers located outside the home region drop the request message without further processing. When the data is located, the response is sent back to the original requesting peer and the search process terminates. When the requesting peer receives the response, the cache manager decides if this data item should be cached. (Cache management will be discussed in the next section.) Thus, using the PReCinCt scheme, flooding is limited to within a region, leading to savings in network bandwidth and energy consumption at the nodes. The detailed algorithm of the search process of PReCinCt is given in Figure 1.

2.3. Peer Mobility Handling

In PReCinCt, mobility of peers is classified into two categories: intra-region and inter-region. In intra-region mobility, a peer moves within the same region, so the overheads are minimal. In inter-region mobility, on the other hand, a peer moves out of its initial region to a neighboring region. Peers check their positions periodically to detect inter-region mobility. If inter-region mobility occurs, the peer has to distribute its keys to other peers in its original region. To reduce the overhead of inter-region mobility, the peer moving out of the region sends its keys to other peers that: 1) have low mobility rates; 2) are located near the center of the region; and 3) have cache space to store this data. Peers with low mobility and those located near the center of the region have a low probability of leaving the region in the near future.

2.4. Fault Tolerance and Replication

There is one major obstacle that all peer-to-peer systems must overcome: node disconnection or sudden crash. Due to the high peer mobility and low reliability of wireless links, the problem is more severe in MP2P systems. We must therefore make fault tolerance a priority.

A general solution to the fault tolerance problem is very hard to construct. We make three assumptions based on our target application and expected users' behavior in MP2P systems. i) Node failures are independent: Since nodes are owned by different users and nodes use peer-to-peer wireless links, we assume that a set of nodes with adjacent geographic locations is highly unlikely to have dependent failures. ii) Most users quit the network gracefully: If a moving node retains some keys, such keys will be transferred to other nodes in the same region prior to the disconnection of the node from the MP2P system. iii) Messages will be routed to the correct node: Despite node mobility and inhomogeneous node distribution in MP2P networks, it is reasonable to assume that messages eventually reach the correct node.
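The hand-off criteria of Section 2.3 (low mobility, proximity to the region center, available cache space) can be combined into a simple candidate-scoring sketch. The peer attributes and the weighting of mobility against distance are illustrative assumptions; the paper does not specify an exact formula.

```python
import math

def pick_handoff_peer(peers, region_center, key_size):
    """Choose the peer most likely to keep the keys in-region:
    it must have cache space; prefer low mobility and closeness to the center."""
    candidates = [p for p in peers if p["free_cache"] >= key_size]
    if not candidates:
        return None
    def score(p):
        # Lower is better; 0.01 is an illustrative weight on center distance.
        return p["mobility_rate"] + 0.01 * math.dist(p["position"], region_center)
    return min(candidates, key=score)
```

A departing peer would call this once per key batch and transfer its keys to the selected peer before crossing the region boundary.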
Most underlying ad-hoc routing protocols provide limited fault tolerance to make routing resilient to user mobility and node failures, but data items in MP2P systems still need to be replicated to improve their availability. Furthermore, the replicas must be kept consistent with the original after each update. Since PReCinCt keeps keys at their home regions, we define a home region failure as the failure in which there is no copy of a key in its home region. A home region failure happens when there is no peer in a region, when a peer did not send the key back to the home region after an inter-region move, or when sudden death occurs to the peer holding the key. Based on the above three assumptions, we design a lightweight data replication mechanism in PReCinCt to tolerate network, node and home region failures. The algorithm tries to maintain at least one replica of each key in a region other than the home region under dynamic circumstances, to improve the availability of each data item. The algorithm can easily be extended to multiple replicas for coping with higher failure frequencies; multiple copies result in increased bandwidth usage for replica messages and more cache space.

As mentioned in the Data Search Process section, the home region of each key is determined by selecting the region whose center location is closest to the key's hash value. Given a data item with key K whose hashed value is location L, the home region Rh is the region whose center location Lh is closest to L. We similarly select the next closest region Rr, whose center location is Lr, as a replica region of K, i.e., ∀Ri: dist(L, Lh) ≤ dist(L, Lr) ≤ dist(L, Li). This means that a request message for key K will always be routed to the region Rh. If there is a network failure, node failure or region failure of Rh, the request message will instead be rerouted to the replica region Rr. After reaching Rr, localized flooding is used to locate the requested data item. In order to maintain consistency between the original key and the replica, an update message needs to be sent from the home region to the replica region whenever the home region receives a data update for key K. We discuss the details in the next section.

Notations:
d: requested data item;
Preq: peer requesting d;
Rreq: region in which Preq resides;
Presp: peer which responds with data d;
Rresp: region in which Presp resides;
U(i): utility value of data item i;

Procedure Search(d)
Begin
  if (d is cached in Preq)
    Update utility value of d in Preq;
    return d;
  else
    Broadcast the request for d in Rreq
    for each (peer Pi in Rreq)
      if (d cached in some peer Presp)
        Update utility value of d in Presp;
        Send response to Preq
      else
        Obtain the home region Rresp for data d;
        Route the request for d to Rresp
        Broadcast the request in Rresp
        if (d cached in some peer Presp)
          Update utility value of d in Presp;
          Send response to Preq
End

Procedure CacheAdmissionControl(d)
Begin
  if (Rresp ≠ Rreq)
    Calculate the utility of data d;
    if (cache space available for d)
      cache d;
    else
      call CacheReplacementPolicy(d);
  else
    do not cache d;
End

Procedure CacheReplacementPolicy(d)
Begin
  L = min(U(q)), ∀q in the cache
  Evict q which satisfies U(q) = L and cache d;
  U(d) = L + U(d);
End

Figure 1. Algorithms of the PReCinCt Scheme

3. Cooperative Caching in PReCinCt

The cache space of each peer is divided into two parts: static and dynamic. The static space contains the values of keys that belong to the region where the peer currently resides. The dynamic space, on the other hand, contains data items that are placed opportunistically at a peer to reduce the latency of subsequent retrievals. The dynamic cache is managed by a greedy cache mechanism. In this section we present the caching scheme employed in PReCinCt, which includes a cumulative cache, a cache admission control policy and a cache replacement policy.

3.1. Cumulative Cache

Caching data items in the local caches of peers helps reduce latency and increase accessibility. When a peer Pi requests data item d, it first attempts to obtain the data locally from its own region by broadcasting the request to its regional members. A local hit occurs if the request can be satisfied within the same region.
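Returning to the replica selection of Section 2.4, the ordering dist(L, Lh) ≤ dist(L, Lr) ≤ dist(L, Li) amounts to sorting the regions by the distance of their centers from the key's hashed location: the closest is the home region, the next closest is the replica region. A minimal sketch (coordinates and region list are illustrative):

```python
import math

def home_and_replica(hashed_loc, regions):
    """Return (home, replica): the two regions whose centers are closest
    to the key's hashed location L."""
    ordered = sorted(regions, key=lambda r: math.dist(hashed_loc, r["center"]))
    return ordered[0], ordered[1]

def route_target(hashed_loc, regions, home_reachable):
    """Route to the home region; fall back to the replica region on home failure."""
    home, replica = home_and_replica(hashed_loc, regions)
    return home if home_reachable else replica
```

Because every peer computes the same ordering from the shared region table, requests fail over to the replica region without any coordination messages.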
If not, the request is sent to the home region of that data. If a peer along the path to the home region has the requested data item d, it serves the request without forwarding it further towards the home region. Otherwise, the request is forwarded to the home region of item d. Since the local caches of the peers virtually form a cumulative cache, decisions regarding the caching of a data item and its eviction from the cache depend not only on the peer itself but also on the neighboring peers. Therefore, PReCinCt includes a cache admission control policy and a cache replacement algorithm.

3.2. Cache Admission Control

When a peer Pi receives the response for the requested data item, cache admission control is triggered at Pi to decide whether the data item should be cached. If the origin of the data resides in the same region as the requesting peer Pi, then the item is not cached; otherwise it is cached at peer Pi. Peers cooperatively cache data, and thus it is unnecessary to replicate data within the same region, as such data can be obtained locally for subsequent requests.

3.3. Cache Replacement Policy

We have developed a greedy replacement policy which considers three factors while selecting a victim: i) the access count of a data item, which reflects the popularity of the item in the region; ii) the size of the data item; and iii) the region-distance, which is the distance between the regions of the requesting and the responding peers. This algorithm incorporates the region-distance as an important parameter in selecting a victim for replacement. The greater the region-distance, the greater the utility of the data, because caching data items that are farther away saves bandwidth and reduces latency for subsequent requests. Thus we call our replacement algorithm Greedy-Dual Least-Distance (GD-LD).

The cache replacement policy employs a utility function to assign a utility value to each data item, based on the three factors mentioned above. Since the nodes in the region cooperatively cache data, the utility value reflects the importance of the data item to the entire region. The utility value is calculated using the following expression:

Utility = wr × aci + wd × reg_dst + ws × (1 / szi)        (1)

where aci is the number of times the item di has been accessed in the region; reg_dst is the distance between the requesting region and the home region for the data; szi is the size of the item di; and wr, wd, ws are the corresponding weight factors. The utility value of a data item is updated whenever there is a hit. This value is used by the greedy replacement algorithm at the peer to find a suitable victim and thereby manage the cache efficiently.

4. Data Consistency in PReCinCt

In this section we discuss our proposed Push with Adaptive Pull scheme to maintain data consistency among replicas in the MP2P network. This scheme combines the push-based invalidation initiated by the peer updating a data item with the pull-based invalidation initiated by the individual peers that maintain the cache. During the push phase, the peer updating a data item sends the update to the home region for that data item. During the pull phase, peers poll the home regions to check if their cache contents are obsolete. The proposed adaptive polling scheme determines the frequency at which peers should poll the home regions.

Push Phase: When a peer Pupdate initiates an update request for a data item d at time t, Pupdate updates d and pushes the update message to the home region Rh of d as well as the replica region Rr of d. Within the home and replica regions, flooding is used to locate the peer Pi that holds the data item d, and Pi updates its copy of d. Thus, by pushing the update only to the home and replica regions instead of to the entire network, our scheme saves wireless network bandwidth and incurs less message overhead to maintain data consistency. After receiving the update message, Pi updates the Time-to-Refresh (TTR) value for d. The TTR values are dynamically maintained to reflect the update rates of the data items. The algorithm of the Push phase is given in Figure 2.

Notations:
d: updated data item;
t: time at which the update is initiated;
Pupdate: peer initiating the update of d;
Rh: home region for data d;
Rr: replica region for data d;
M(d, t): update message for data d, which is updated at time t;

Procedure Push
Begin
  When Pupdate initiates an update message M(d, t)
  Obtain the home region Rh and replica region Rr for d
  Route the message M(d, t) to regions Rh and Rr
  Within Rh or Rr, locate the peer Pi which has d
  Update d and store the update time t in Pi
  Pi calculates the TTR of data item d
End

Figure 2. Push Phase

Pull Phase: The pull phase of the algorithm is initiated by the individual peers interested in checking the validity of their cached items. During this phase, peers poll the home regions to determine the freshness of the data items in their cache.
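The TTR-gated read path just described — serve from the local cache until the TTR expires, then poll the home region — can be sketched as follows. The class and helper names are illustrative, not from the paper.

```python
import time

class CachedItem:
    """A locally cached data item with a Time-to-Refresh (TTR) deadline."""
    def __init__(self, value, ttr_seconds, now=None):
        self.value = value
        self.expires_at = (now if now is not None else time.time()) + ttr_seconds

    def fresh(self, now=None):
        return (now if now is not None else time.time()) < self.expires_at

def read(cache, key, poll_home_region, now=None):
    """Serve from the cache while the TTR is valid; otherwise poll the home region."""
    item = cache.get(key)
    if item is not None and item.fresh(now):
        return item.value                    # served locally, no network traffic
    value, ttr = poll_home_region(key)       # pull: validate/refresh from home region
    cache[key] = CachedItem(value, ttr, now)
    return value
```

The `poll_home_region` callback stands in for the routed poll message; in the scheme it would also return any missed updates along with the new TTR.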
We propose an adaptive pull mechanism.

Adaptive Pull: The notion of adaptive polling has been used in the context of web cache consistency [7], and we adopt a similar idea here. In our proposed adaptive polling mechanism, peers vary the frequency with which they poll the home region for a data item, depending on the update rate of that data item. The polling frequency is dynamically varied so that data items that are frequently updated are polled more often than data items that are relatively static.

The home regions assign a TTR value to each of their data items. This TTR value is varied dynamically by the home region to reflect the update rate of the associated data item. When the home region for data item d receives an update for d, it locates the peer Pi which holds d. If tupd_intvl is the interval between successive updates of d, and α (a constant factor between 0 and 1 that weighs the relative importance of recent and past updates), then the TTR value is updated as:

TTR = α × TTR + (1 − α) × tupd_intvl        (2)

This TTR value is sent by the home region to other peers on subsequent requests for data item d. When a peer needs d and there is a local cached copy, it first determines whether the TTR of d has expired. The peer polls the home region only if the TTR of d has expired. Thus we avoid too many polls to the home regions while still ensuring good consistency. The detailed algorithm of the pull phase is given in Figure 3.

Notations:
d: requested data item;
Preq: peer requesting d;
Rh: home region for data item d;

Procedure Pull
Begin
  When Preq requests data item d
  if (d is cached in Preq) then
    if (TTR for d has expired) then
      Obtain the home region Rh for d
      Poll the home region Rh
      Inquire for missed updates based on the last update times
      Update the TTR and local cached copy
    else
      Use the cached copy of d to satisfy the request
End

Figure 3. Pull Phase

The main advantages of our Push with Adaptive Pull mechanism are: 1) the push messages need to be sent only to the home and replica regions, and not to the entire network, thus saving bandwidth and power; 2) as the frequency of polling is determined by the update rate of the data item, redundant polls to the home region are avoided, which saves bandwidth as well as reduces latency.

5. Performance Analysis of PReCinCt

In this section we describe the energy model used in our performance evaluation, and determine the energy consumption of the flooding scheme and the proposed PReCinCt scheme.

Energy as a distributed network resource has unique properties that distinguish it from other resources [6]. Energy is non-renewable: a mobile node has a finite, monotonically decreasing energy store. Unlike bandwidth, the energy cost requires separate calculations with respect to the sender, the intended receiver, and other nodes which overhear the message. This makes it necessary to distinguish between broadcast traffic, which is processed by all nearby nodes, and point-to-point traffic, which is processed by the intended receiver(s) and discarded by all other nearby nodes.

5.1. Energy Model

According to [6], the energy consumed by a mobile peer for sending, receiving or discarding a message is given by the following linear equation:

Cost = m × size + b        (3)

where size is the message size, m denotes the incremental energy cost associated with a message, and b is the fixed energy cost of the message overhead. The parameters m and b are different for sending and receiving. In the majority of mobile devices, such as laptops, palmtops and cellular phones, communication is one of the major sources of energy consumption that reduces battery life. Therefore, we only consider the energy cost of data communication in the following analysis. In particular, we analyze the energy consumption of broadcast traffic and point-to-point traffic, and then derive equations for the energy consumption of the flooding scheme and our proposed PReCinCt scheme.

5.1.1. Broadcast Traffic. Before sending a broadcast packet, the sender listens briefly to the channel. If the channel is clear, the message is sent and received by all nodes in the wireless (radio) range. Otherwise, the sender must back off and try later. The cost associated with sending a broadcast packet is given by:

E_bd_sd = m_bd_sd × size + b_bd_sd        (4)

The cost associated with receiving a broadcast packet is:

E_bd_rv = m_bd_rv × size + b_bd_rv        (5)
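The linear cost model of Eqs. (3)-(5) is straightforward to evaluate directly. The m and b coefficients below are placeholders, since the per-device values come from the measurements in [6]:

```python
def msg_cost(m, b, size):
    """Eq. (3): Cost = m * size + b."""
    return m * size + b

# Placeholder coefficients (per-byte cost m, fixed overhead b) for broadcast traffic.
M_BD_SD, B_BD_SD = 1.9, 266   # broadcast send, Eq. (4)
M_BD_RV, B_BD_RV = 0.5, 56    # broadcast receive, Eq. (5)

def broadcast_send_cost(size):
    return msg_cost(M_BD_SD, B_BD_SD, size)

def broadcast_recv_cost(size):
    return msg_cost(M_BD_RV, B_BD_RV, size)
```

With any realistic coefficients, sending costs more than receiving, which is why the per-broadcast total in the following derivation separates the single send from the ζ receives.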
When a packet is broadcast, all nodes within the transmission range of the sender will receive it. Let N be the number of nodes in the network, and A be the service area. The number of nodes per unit area is given by the node density δ,

δ = N/A    (6)

Let r be the transmission range. The area within the transmission range of the sender is π × r². Thus, the number of nodes (within the transmission range of the sender) receiving the broadcast packet is,

ζ = δ × π × r²    (7)

Hence, the total cost associated with a broadcast send and receive is,

E_total_bd = E_bd_sd + ζ × E_bd_rv    (8)

5.1.2. Point-to-Point Traffic In this case, the fixed costs include both the channel access cost and the cost due to medium access control (MAC) layer negotiations [6]. Therefore, the cost of a point-to-point send is given by,

E_p2p_sd(size) = m_p2p_sd × size + b_p2p_sd    (9)

The cost of a point-to-point receive is,

E_p2p_rv(size) = m_p2p_rv × size + b_p2p_rv    (10)

5.2. Energy Consumption Analysis of the PReCinCt and Flooding Schemes

In light of the discussion above, we now derive expressions for the energy consumption of the flooding and PReCinCt schemes.

5.2.1. Energy Cost of the Flooding Scheme In the flooding scheme, as discussed in Section 1, when a node receives a request for data, it broadcasts the request to all its neighbors, which in turn broadcast to each of their neighbors, and so on. The request is thus processed by all the nodes in the network. When the data is found, it is sent back to the requesting node via a point-to-point link through one or more intermediate nodes. If I is the number of intermediate nodes between the requesting and the responding node, and N is the number of nodes in the network, the cost of the flooding scheme is given by,

E_Flooding = N × E_total_bd + I × (E_p2p_sd + E_p2p_rv)    (11)

5.2.2. Energy Cost of the PReCinCt Scheme In PReCinCt, we assume there is no dynamic cache space at each node. When a node requests a data item, it uses its hash function to obtain the home region for the requested data item. The node then sends the request to that region via intermediate nodes. After reaching the home region, localized flooding is used to locate the node with that data. The response is sent back to the requesting node via one or more intermediate nodes. If I is the number of intermediate nodes between the requesting and responding nodes and n is the average number of nodes in a region, the energy cost of each request in PReCinCt is given by,

E_PReCinCt = I × (E_p2p_sd + E_p2p_rv) + E_Flooding_in_Region + I × (E_p2p_sd + E_p2p_rv)    (12)

The term E_Flooding_in_Region is the cost of flooding in a region and depends on the number of nodes n in the region. Thus,

E_PReCinCt = I × (E_p2p_sd + E_p2p_rv) + n × E_total_bd + I × (E_p2p_sd + E_p2p_rv)    (13)

6. Simulation Results

To measure the performance of our data retrieval and caching schemes, we simulated the algorithms on a variety of static and mobile network topologies. We focus mainly on the mobile simulation results in this paper, as mobility is more demanding and challenging in MP2P networks. GPSR [8] is used as the wireless routing protocol. We have modified it to route to regions instead of specific destinations by forwarding the packet towards the center of the region and using broadcast inside the region. The performance metrics measured were average latency and energy consumption. We compare our cache replacement algorithm GD-LD with the well-known GD-Size algorithm [2]. Likewise, we evaluate our cache consistency scheme, namely Push with Adaptive Pull, against the Plain-Push and Pull-Every-time schemes.

6.1. Simulation Environment

We simulated our experimental setup in NS-2 [18]. The NS-2 simulation model simulates nodes moving in an unobstructed plane. Motion follows the random waypoint model [1]. In our simulations, the nodes are initially placed in a rectangular region of 1200m × 1200m divided into equal-sized regions. We conducted simulations for a network of up to 160 nodes with a nominal 250m transmission range and a wireless bandwidth of 11 Mbps. The time interval between two consecutive requests and updates generated from

a peer follows a Poisson distribution with a mean of 30 seconds. Each peer generates accesses to data items following a Zipf distribution with a skewness parameter Θ. Nodes pause for a period of 5 s between motion steps. We simulate maximum velocities of 2, 8, 12, 16 and 20 m/s. The default number of regions is 9.

6.2. Simulation Experiments

The PReCinCt scheme is compared with the flooding and the expanding ring search schemes for energy consumption under varying node densities and moving speeds in [11]. Three sets of experiments are reported in this paper for evaluating the cooperative caching and consistency mechanisms of PReCinCt. In the first set of experiments, our GD-LD cache replacement algorithm is compared with the GD-Size algorithm; the performance metrics measured were byte hit ratio and latency. The second set of experiments compares different cache consistency algorithms and shows that our proposed Push with Adaptive Pull scheme performs better. Finally, the third set of experiments validates the theoretical studies by comparing the theoretical and simulation results.

6.2.1. Cache Replacement Experiments In this set of experiments, the developed cache replacement algorithm GD-LD, discussed in Section 2, is compared with GD-Size [2]. The simulations were conducted on a topology with 80 nodes moving at a speed of 6 m/s.

Figure 4 shows the variation of the latency to fetch data items with varying cache sizes. GD-LD by far outperforms the GD-Size algorithm for all cache sizes. This is due to two reasons: (1) the developed algorithm includes a new parameter, region distance, while calculating the utility value for an item. Region distance is the distance between the regions of the requesting and responding peers. The algorithm favors items that are farther away, as caching these items reduces latency and saves bandwidth for subsequent requests for the same item; and (2) the GD-Size algorithm penalizes a large data item without considering its popularity or the cost of fetching it again for a subsequent request.

[Figure 4. Variation of latency with cache size: latency per request (s) versus cache size (% of database size), for GD-Size and GD-LD.]

Figure 5 shows the byte hit ratio of the two caching algorithms. We observe that GD-LD achieves much higher byte hit ratios than GD-Size. This is because GD-Size favors small data items independent of their popularity; thus a large popular data item stands less chance of being cached under GD-Size.

[Figure 5. Variation of byte hit ratio with cache size: byte hit ratio versus cache size (% of database size), for GD-Size and GD-LD.]

6.2.2. Data Consistency Experiments In this section we compare our Push with Adaptive Pull cache consistency algorithm with the Plain-Push and Pull-Every-time schemes. The performance metrics measured are: 1) control message overhead, measured as the total number of messages generated in the network to maintain data consistency among the replicas; 2) false hit ratio (FHR), measured as the ratio of the number of stale hits to the number of hits that are shown as valid; and 3) the average latency to retrieve a data item.

In our model, the time interval between two consecutive requests, Trequest, and the time between two consecutive updates, Tupdate, generated from a peer follow a Poisson distribution. We measure the effects of the time between successive updates, Tupdate, on the different cache consistency schemes. We fix Trequest to 30 seconds, and vary Tupdate so that the ratio Tupdate/Trequest varies from 1 to 5. A ratio of 1 indicates the highest update rate.

Figure 6 shows the control message overhead of the three consistency schemes. The message overhead of all the schemes decreases as the update rate decreases, due to the fewer invalidation messages generated. The overhead of Plain-Push is extremely high, as the invalidation messages are flooded to the entire network. In our Push with Adaptive Pull scheme, the invalidation messages are pushed only to the corresponding home region, thus incurring almost 89%

less message overhead than the Plain-Push scheme. The Pull-Every-time scheme also incurs higher overhead than our scheme, since in Pull-Every-time the peers are required to poll the home regions on every data request to check the validity of their cached item. In contrast, our scheme polls the home region only when the TTR of that data item has expired, thus achieving 24%-57% less overhead than Pull-Every-time. As the update rate increases, our adaptive scheme tends to produce smaller TTR values, resulting in slightly more polls.

[Figure 6. Effect of update rate on the control message overhead: control message overhead (log scale) versus Tupdate/Trequest, for Pull-Every-time, Push with Adaptive Pull, and Plain-Push.]

The FHR of the three schemes is shown in Figure 7. We observe that our scheme incurs the highest FHR, as the peers poll the home regions for data only when the TTR has expired, thus increasing the probability of a false hit. However, this ratio is very small, only 0.01 even at the highest update rate. The Plain-Push scheme also incurs some false hits, as it is possible that the invalidation messages do not reach all the peers due to network congestion, network partition, or peer mobility.

[Figure 7. Effect of update rate on the false hit ratio: false hit ratio versus Tupdate/Trequest, for Pull-Every-time, Push with Adaptive Pull, and Plain-Push.]

Figure 8 shows the average latency to retrieve a data item with varying update rates. The Pull-Every-time scheme has the highest average latency, as the peers are required to poll the home regions for every request, thus incurring an extra round-trip delay in obtaining the requested data.

[Figure 8. Effect of update rate on the latency per request: latency per request (s) versus Tupdate/Trequest, for Pull-Every-time, Push with Adaptive Pull, and Plain-Push.]

6.2.3. Validation of Theoretical Analysis In this section we validate the theoretical studies by comparing the results obtained from the theoretical analysis with those obtained from the simulation experiments. The results are for a static topology with an area of 600m × 600m.

[Figure 9. Comparison of theoretical and simulation results: (a) energy per request (mJ) versus number of nodes, for the flooding and PReCinCt schemes (theoretical and simulation); (b) energy per request (mJ) versus number of regions, for PReCinCt (theoretical and simulation).]

Figure 9(a) shows the theoretical and simulation results for the energy consumed per request for the flooding and PReCinCt schemes. We see that the simulation and theoretical results match when the node densities are small, but the difference between the results increases at larger node densities. This difference is due to the well-known 'edge effects' in flooding. When nodes at the edges of the topology broadcast a message, the message is received by fewer than the expected number of nodes. However, the edge effects

in our scheme are not as pronounced as those in the flooding scheme.

Figure 9(b) shows the performance of our scheme with a varying number of regions. In this experiment the number of nodes is set to 20. The number of regions is an important design parameter, as it determines the region size. Ideally, we would like regions small enough to reduce the flooding overhead within a region, and at the same time large enough to reduce the effects of mobility. We observe that the scheme performs better and consumes less energy with a larger number of regions, because the flooding takes place in smaller regions. Flooding within smaller regions reduces the number of nodes processing each query. Thus the energy consumed decreases with an increasing number of regions.

7. Conclusions and Future Work

This paper presents a cooperative caching and data consistency scheme for PReCinCt in MP2P networks. PReCinCt incorporates a cooperative caching scheme to improve the communication performance of MP2P networks. This cooperative caching scheme enables peers in a region to share their data, thus providing a unified view of the cache. This helps alleviate the message latency and limited accessibility problems in MP2P networks. The caching scheme includes a cache admission control policy and a cache replacement algorithm. In order to handle dynamic data, PReCinCt also includes an effective cache consistency scheme called Push with Adaptive Pull that incurs less control message overhead than existing schemes. PReCinCt employs replication to achieve fault tolerance. We also discuss the management of regions in PReCinCt.

Future work includes an exhaustive analytical and experimental investigation of the impact of region size on our scheme. A dynamic region management scheme needs to be investigated to make PReCinCt adaptive to real network environments, thereby optimizing its performance. In future work, various experiments need to be conducted to verify the robust performance of the PReCinCt scheme under different mobility models and node disconnection rates.

References

[1] J. Broch, D. Maltz, D. Johnson, Y. Hu, and J. Jetcheva, A performance comparison of multi-hop wireless ad hoc network routing protocols, in Proc. of the ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom'98), August 1998.
[2] P. Cao and S. Irani, Cost-aware WWW proxy caching algorithms, in Proceedings of the USENIX Symposium on Internet Technologies and Systems, December 1997.
[3] P. Cao and C. Liu, Maintaining strong cache consistency in the World Wide Web, IEEE Transactions on Computers, vol. 47, no. 4, pp. 445-457, April 1998.
[4] A. Datta, M. Hauswirth, and K. Aberer, Updates in highly unreliable, replicated peer-to-peer systems, in Proceedings of ICDCS 2003, 23rd International Conference on Distributed Computing Systems, Providence, Rhode Island, May 2003, pp. 76-85.
[5] L. Fan, Summary cache: a scalable wide-area Web cache sharing protocol, in Proceedings of the ACM SIGCOMM Conference, ACM Press, 1998.
[6] L. M. Feeney, An energy consumption model for performance analysis of routing protocols for mobile ad hoc networks, ACM Journal on Mobile Networks and Applications, vol. 6, 2001.
[7] J. Gwertzman and M. Seltzer, World-Wide Web cache consistency, in Proceedings of the 1996 USENIX Technical Conference, January 1996.
[8] B. Karp and H. T. Kung, GPSR: greedy perimeter stateless routing for wireless networks, in Proc. of the ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom 2000), August 2000.
[9] A. Kahol, S. Khurana, S. K. S. Gupta, and P. K. Srimani, A strategy to manage cache consistency in a distributed mobile wireless environment, IEEE Transactions on Parallel and Distributed Systems, vol. 12, no. 7, pp. 686-700, 2001.
[10] G. Kortuem, J. Schneider, D. Preuitt, T. G. C. Thompson, S. Fickas, and Z. Segall, When peer-to-peer comes face-to-face: collaborative peer-to-peer computing in mobile ad-hoc networks, in Proceedings of the First International Conference on Peer-to-Peer Computing, August 2001.
[11] M. S. Joseph, M. Kumar, H. Shen, and S. K. Das, Energy efficient data retrieval and caching in mobile peer-to-peer networks, in Workshop on Mobile Peer-to-Peer Computing, Third International Conference on Pervasive Computing and Communications, Kauai, Hawaii, March 8-12, 2005.
[12] Q. Lv, P. Cao, E. Cohen, K. Li, and S. Shenker, Search and replication in unstructured peer-to-peer networks, in Proceedings of the 16th Annual ACM International Conference on Supercomputing, 2002.
[13] P. Sarkar and J. H. Hartman, Hint-based cooperative caching, ACM Transactions on Computer Systems, vol. 18, no. 4, pp. 387-419, 2000.
[14] H. Shen, M. Kumar, S. K. Das, and Z. Wang, Energy-efficient caching and prefetching with data consistency in mobile distributed systems, in Proc. of the IEEE International Parallel and Distributed Processing Symposium (IPDPS), Santa Fe, NM, April 2004.
[15] I. Stoica, R. Morris, D. Karger, F. Kaashoek, and H. Balakrishnan, Chord: a scalable peer-to-peer lookup service for Internet applications, in Proc. of ACM SIGCOMM, 2001.
[16] K. L. Wu, P. S. Yu, and M. S. Chen, Energy-efficient caching for wireless mobile computing, in 20th International Conference on Data Engineering, pp. 336-345, 1996.
[17] The Gnutella homepage. http://gnutella.wego.com/.
[18] The VINT Project. The UCB/LBNL/VINT Network Simulator - ns (version 2). http://mash.cs.berkeley.edu/ns.
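The energy analysis of Section 5 (equations (6)-(13)) can be exercised numerically as a worked check of the scaling argument: flooding pays a broadcast at every one of the N nodes, while PReCinCt floods only among the n nodes of the home region. The cost coefficients below are illustrative placeholders, not the measured values of the energy model in [6]:

```python
import math

# Illustrative cost coefficients (mJ/byte, mJ): placeholders, not the
# measured values of the energy model in [6].
m_bd_sd, b_bd_sd = 0.002, 0.5     # broadcast send
m_bd_rv, b_bd_rv = 0.001, 0.2     # broadcast receive
m_p2p_sd, b_p2p_sd = 0.002, 0.8   # point-to-point send (incl. MAC overhead)
m_p2p_rv, b_p2p_rv = 0.001, 0.4   # point-to-point receive

def e_total_bd(size, N, A, r):
    """Eqs. (6)-(8): one broadcast send plus zeta receptions,
    where zeta = (N/A) * pi * r^2 nodes lie within range r."""
    zeta = (N / A) * math.pi * r ** 2
    return (m_bd_sd * size + b_bd_sd) + zeta * (m_bd_rv * size + b_bd_rv)

def e_p2p(size):
    """Eqs. (9)-(10): one point-to-point send plus its receive."""
    return (m_p2p_sd * size + b_p2p_sd) + (m_p2p_rv * size + b_p2p_rv)

def e_flooding(size, N, A, r, I):
    """Eq. (11): all N nodes broadcast the request; the reply travels
    back over I intermediate point-to-point hops."""
    return N * e_total_bd(size, N, A, r) + I * e_p2p(size)

def e_precinct(size, N, A, r, I, n):
    """Eq. (13): request routed over I hops to the home region, flooded
    among its n nodes, and the response routed back over I hops."""
    return I * e_p2p(size) + n * e_total_bd(size, N, A, r) + I * e_p2p(size)
```

Because the flooding term is multiplied by n rather than N, E_PReCinCt grows with the region size instead of the network size; for any n much smaller than N this keeps the per-request energy well below E_Flooding, consistent with the comparison in Figure 9(a).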

