[Figure residue: threaded-server diagram (tasks dispatch( ) or create( ) one of T concurrent threads in the server, each calling thread.sleep(L secs); the closed loop implies S = A) and a plot of server throughput S (tasks/sec), 0–600, versus the number of threads executing in the server (T), 1–10000.]

[Figure 4 diagram labels: task arrival rate A tasks/sec; completion rate S tasks/sec; the closed loop through the timer queue implies S = A, with latency L seconds.]

Figure 4: Event-driven server: Each task that arrives at the server is placed in a main event queue. The dedicated thread serving this queue sets an L second timer per task; the timer is implemented as a queue which is processed by another thread. When a timer fires, a timer event is placed in the main event queue, causing the main server thread to generate a response.

[Figure 5 plot: server throughput S (tasks/sec), 0–1600, versus the number of tasks in the client-server pipeline, 1–10000 (log scale).]

Figure 5: Event-driven server throughput: Using the same benchmark setup as in Figure 3, this figure shows the event-driven server's throughput as a function of the number of tasks in the pipeline. The event-driven server has one thread receiving all tasks, and another thread handling timer events. The throughput flattens in excess of that of the threaded server as the system saturates, and the throughput does not degrade with increased concurrent load.
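The structure in Figure 4 can be sketched in Java, the language the authors say their current work is based in. This is a minimal illustration, not the paper's benchmark code: all class names, the latency value, and the use of a `CountDownLatch` to stand in for sending responses are invented for the sketch. It mirrors the figure directly: one thread drains the main event queue and sets an L-second timer per new task; a second thread drains the timer queue and posts a timer event back to the main queue, which triggers the response.

```java
import java.util.concurrent.*;

public class EventDrivenServer {
    static final long L_MILLIS = 50;  // per-task latency L (illustrative value)

    // An event is either a newly arrived task or a fired timer.
    static class Event {
        final int taskId; final boolean timerFired;
        Event(int taskId, boolean timerFired) { this.taskId = taskId; this.timerFired = timerFired; }
    }

    // Entries in the timer queue become "ready" L seconds after creation.
    static class TimerEntry implements Delayed {
        final int taskId; final long deadlineNanos;
        TimerEntry(int taskId) {
            this.taskId = taskId;
            this.deadlineNanos = System.nanoTime() + L_MILLIS * 1_000_000L;
        }
        public long getDelay(TimeUnit unit) {
            return unit.convert(deadlineNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    final BlockingQueue<Event> mainQueue = new LinkedBlockingQueue<>();
    final DelayQueue<TimerEntry> timerQueue = new DelayQueue<>();
    final CountDownLatch responses;  // stands in for sending real responses

    EventDrivenServer(int expectedTasks) { responses = new CountDownLatch(expectedTasks); }

    void start() {
        // Main server thread: sets a timer for each new task, and
        // generates a response when a timer event comes back.
        Thread main = new Thread(() -> {
            try {
                while (true) {
                    Event e = mainQueue.take();
                    if (e.timerFired) responses.countDown();
                    else timerQueue.put(new TimerEntry(e.taskId));
                }
            } catch (InterruptedException ignored) { }
        });
        // Timer thread: when a timer fires, place a timer event on the main queue.
        Thread timer = new Thread(() -> {
            try {
                while (true) {
                    TimerEntry t = timerQueue.take();
                    mainQueue.put(new Event(t.taskId, true));
                }
            } catch (InterruptedException ignored) { }
        });
        main.setDaemon(true); timer.setDaemon(true);
        main.start(); timer.start();
    }

    public static void main(String[] args) throws InterruptedException {
        EventDrivenServer server = new EventDrivenServer(100);
        server.start();
        for (int i = 0; i < 100; i++) server.mainQueue.put(new Event(i, false));
        System.out.println(server.responses.await(5, TimeUnit.SECONDS)
                ? "all 100 responses generated" : "timed out");
    }
}
```

Because the timers all sit in the `DelayQueue` concurrently, 100 tasks complete in roughly one L interval rather than 100 of them, which is why the event-driven server's throughput does not collapse under load the way the one-thread-per-task server does.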
[Figure 7 plots: four panels (a)–(d) of throughput S (tasks/sec) versus the number of simultaneous threads (T) in (a)–(c) and the number of tasks in the client-server pipeline in (d); (a) and (b) mark the operating points A = S and T′.]
Figure 7: Throughput of the hybrid event and thread system: (a) and (b) illustrate the theoretical performance of the hybrid server, where T′ is larger or smaller than the concurrency demand A × L. (c) shows measurements of the benchmark presented in Figure 3, augmented by placing a queue in front of the thread pool, for different values of L and T. (d) shows the throughput of the hybrid server when T = T′, which is the optimal operating point of the server. Here, L = 50 ms. The middle plateau in (d) corresponds to the point where the pipeline has filled and convoys are beginning to form in the server. The right-hand plateau in (d) signifies that the convoys in all stages of the pipeline have merged. Note that the x-axes of (a) and (b) are on a linear scale, while those of (c) and (d) are on logarithmic scales.
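The idealized behavior in panels (a) and (b) reduces to a one-line model: with per-task latency L seconds, a pool capped at T′ threads can sustain at most T′ / L tasks/sec, and the closed loop pins throughput S to the arrival rate A below that cap, so S = min(A, min(T, T′) / L). The following sketch is a back-of-the-envelope rendering of that relation, not the paper's benchmark; the value T′ = 40 is hypothetical.

```java
public class HybridModel {
    // Ideal hybrid throughput: S = min(A, T / L), never exceeding T' / L
    // once the queue in front of the thread pool caps concurrency at T'.
    static double throughput(double arrivalRate, int threads, int tPrime, double latencySecs) {
        double poolCap = Math.min(threads, tPrime) / latencySecs;
        return Math.min(arrivalRate, poolCap);
    }

    public static void main(String[] args) {
        double L = 0.050;  // L = 50 ms, as in Figure 7(d)
        int tPrime = 40;   // hypothetical saturation point T'
        // Below saturation the closed loop gives S = A...
        System.out.println(throughput(400.0, 40, tPrime, L));   // 400.0
        // ...and past it the queue holds S flat at T'/L = 800 instead of degrading.
        System.out.println(throughput(2000.0, 100, tPrime, L)); // 800.0
    }
}
```

The flat cap at T′ / L is exactly the difference from the purely threaded server of Figure 3, whose throughput degrades once T exceeds T′ rather than holding level.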
they also share their fate: that is, if the physical resources fail (e.g., the node crashes), both task handlers will also fail. Clearly, replicas of the same task handler should be kept on separate physical nodes, or at least in separate address spaces, to avoid this fate sharing. Note that task handlers can be linked in terms of

[Figure residue, panels (a) and (c): a disk I/O core layered over the file system / raw disk, and a network I/O core layered over the network stack, both atop the operating system.]
6 Future Work and Conclusions

We believe that our design framework will enable the construction of a new class of highly concurrent applications. We are in the process of building several systems based upon this design. These include novel Internet services [30, 8], a new segment-based database storage manager [12], and a secure, consistent, highly available global filesystem [26]. All of these applications share the concurrency and availability requirements targeted by our framework.

In addition, we continue to explore the design trade-offs within this framework. Many approaches to load-balancing and resource management within a cluster have yet to be investigated in depth. In particular, we are interested in using economic models to manage arbitration for "competing" applications sharing the same physical resources [24, 28].

Building event-driven systems could be supported by better high-level language features for managing state, consistency, and scheduling. While much of our current work is based in Java [9], new language abstractions supporting our framework reveal several avenues for future research.

Our goal has been to map out the design space of highly concurrent systems, and to present a framework which provides a way to reason about their performance, fault isolation, and software engineering characteristics. Our framework is based on three simple components — tasks, queues, and thread pools — which capture the benefits of both threaded and event-driven

References

[1] T. Anderson, B. Bershad, E. Lazowska, and H. Levy. Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism. ACM Transactions on Computer Systems, 10(1):53–79, February 1992.

[2] Gaurav Banga, Jeffrey C. Mogul, and Peter Druschel. A scalable and explicit event delivery mechanism for UNIX. In Proceedings of the USENIX 1999 Annual Technical Conference, Monterey, CA, June 1999.

[3] BEA Systems. BEA WebLogic. http://www.beasys.com/products/weblogic/.

[4] A. Chankhunthod, P. B. Danzig, C. Neerdaels, M. F. Schwartz, and K. J. Worrell. A hierarchical internet object cache. In Proceedings of the 1996 Usenix Annual Technical Conference, pages 153–163, January 1996.

[5] Yatin Chawathe and Eric Brewer. System support for scalable and fault tolerant internet services. In Proceedings of Middleware '98, September 1998.

[6] Inktomi Corporation. Inktomi search engine. http://www.inktomi.com/products/portal/search/.

[7] Armando Fox, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer, and Paul Gauthier. Cluster-Based Scalable Network Services. In Proceedings of the 16th ACM Symposium on Operating Systems Principles, St.-Malo, France, October 1997.

[8] Ian Goldberg, Steven D. Gribble, David Wagner, and Eric A. Brewer. The Ninja Jukebox. In Proceedings of the 2nd USENIX Symposium on Internet Technologies and Systems, Boulder, CO, October 1999.
[9] J. Gosling, B. Joy, and G. Steele. The Java Language Specification. Addison-Wesley, Reading, MA, 1996.

[10] Steven D. Gribble, Eric A. Brewer, Joseph M. Hellerstein, and David Culler. Scalable, Distributed Data Structures for Internet Service Construction. Submitted to the Fourth Symposium on Operating System Design and Implementation (OSDI 2000), October 2000.

[11] Steven D. Gribble, Matt Welsh, David Culler, and Eric Brewer. MultiSpace: An evolutionary platform for infrastructural services. In Proceedings of the 16th USENIX Annual Technical Conference, Monterey, California, 1999.

[12] Joe Hellerstein, Eric Brewer, and Mike Franklin. Telegraph: A Universal System for Information. http://db.cs.berkeley.edu/telegraph/.

[13] James C. Hu, Irfan Pyarali, and Douglas C. Schmidt. High Performance Web Servers on Windows NT: Design and Performance. In Proceedings of the USENIX Windows NT Workshop 1997, August 1997.

[14] James C. Hu, Irfan Pyarali, and Douglas C. Schmidt. Applying the Proactor Pattern to High-Performance Web Servers. In Proceedings of the 10th International Conference on Parallel and Distributed Computing and Systems, October 1998.

[15] IBM Corporation. IBM WebSphere Application Server. http://www-4.ibm.com/software/webservers/.

[16] Sun Microsystems Inc. Java Servlet API. http://java.sun.com/products/servlet/index.html.

[17] M. Frans Kaashoek, Dawson R. Engler, Gregory R. Ganger, and Deborah A. Wallach. Server Operating Systems. In Proceedings of the 1996 SIGOPS European Workshop, September 1996.

[18] P. Keleher, S. Dwarkadas, A. L. Cox, and W. Zwaenepoel. TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems. In Proceedings of the Winter 94 Usenix Conference, pages 115–131, January 1994.

[19] Robert Morris, Eddie Kohler, John Jannotti, and M. Frans Kaashoek. The Click modular router. In Proceedings of the 17th ACM Symposium on Operating Systems Principles (SOSP '99), pages 217–231, Kiawah Island, South Carolina, December 1999.

[20] L. E. Moser, Y. Amir, P. M. Melliar-Smith, and D. A. Agarwal. Extended virtual synchrony. In Proceedings of the Fourteenth International Conference on Distributed Computing Systems, pages 56–65, Poznan, Poland, June 1994.

[21] National Laboratory for Applied Network Research. The Squid Internet Object Cache. http://www.squid-cache.org.

[22] ObjectSpace Inc. ObjectSpace Voyager. http://www.objectspace.com/Products/voyager1.htm.

[23] Vivek S. Pai, Peter Druschel, and Willy Zwaenepoel. Flash: An efficient and portable Web server. In Proceedings of the 1999 Annual Usenix Technical Conference, June 1999.

[24] M. Stonebraker, R. Devine, M. Kornacker, W. Litwin, A. Pfeffer, A. Sah, and C. Staelin. An economic paradigm for query processing and data migration in Mariposa. In Proceedings of the 3rd International Conference on Parallel and Distributed Information Systems, September 1994.

[25] Sun Microsystems Inc. Enterprise Java Beans Technology. http://java.sun.com/products/ejb/.

[26] UC Berkeley OceanStore Project. http://oceanstore.cs.berkeley.edu.

[27] Robbert van Renesse, Kenneth P. Birman, and Silvano Maffeis. Horus: A flexible group communication system. Communications of the ACM, 39(4):76–83, April 1996.

[28] C. A. Waldspurger, T. Hogg, B. A. Huberman, J. O. Kephart, and S. Stornetta. Spawn: A Distributed Computational Economy. IEEE Transactions on Software Engineering, 18(2):103–117, February 1992.

[29] Deborah A. Wallach, Dawson R. Engler, and M. Frans Kaashoek. ASHs: Application-specific handlers for high-performance messaging. In Proceedings of the ACM SIGCOMM '96 Conference: Applications, Technologies, Architectures, and Protocols for Computer Communication, pages 40–52, Stanford, California, August 1996.

[30] Matt Welsh, Nikita Borisov, Jason Hill, Rob von Behren, and Alec Woo. Querying large collections of music for similarity. Technical Report UCB/CSD-00-1096, U.C. Berkeley Computer Science Division, November 1999.