Web Cluster Laboratory
Project Description
Our Web Cluster Laboratory is built from commodity PC hardware and equipped with additional measurement infrastructure for performance evaluation. PCs with inexpensive off-the-shelf components are interconnected through a Fast/Gigabit Ethernet switch. We use open source solutions for the operating systems and all other software. The system serves as a distributed web server for both static and dynamic content. Dynamic content includes on-demand generation of HTML pages from XML documents using XSL, and a full-featured e-commerce system consisting of an Enterprise JavaBeans implementation of the bookshop specified by the TPC-W benchmark.
The measurement infrastructure is composed of several GPS receiver cards connected to a roof-mounted GPS antenna. For time synchronization, we use a combination of NTP and the PPS API, where our GPS receivers generate precise pulses marking the start of every second. We use additional hardware to connect the PPS pulse to every node of the cluster. A kernel buffer is used to record time stamps for relevant events inside the kernel during the measurement process. Buffering the kernel time stamps enables us to conduct measurements with high data volume and microsecond precision.
Additionally, we record time stamps for important events in user-space processes and measure the resource utilization.
We have evaluated the use of an external clock to measure the interrupt latency of every PPS pulse and to correct the time stamps used by NTP afterwards. An FPGA-based external clock device for every node of the cluster is currently being developed.
As an additional method for time stamping in a distributed system, we have implemented an offline synchronization mechanism in which the unsynchronized local clocks of the cluster nodes are used to time stamp both the events relevant for our analyses and the arrival of PPS pulses. Since the precise frequency and offset from UTC are known for the PPS pulses, the time stamps for the events can be related to UTC after the measurement in an offline synchronization process.
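The idea of offline synchronization can be illustrated with a minimal Python sketch. It is not the mechanism implemented in the laboratory; it assumes that every PPS pulse was recorded (none missed), that the local clock drifts only linearly between two consecutive pulses, and that the UTC second of the first recorded pulse is known. The names `local_to_utc`, `pps_local`, and `first_pulse_utc` are hypothetical.

```python
from bisect import bisect_right


def local_to_utc(event_local, pps_local, first_pulse_utc):
    """Map a local-clock time stamp to UTC by interpolating between PPS pulses.

    event_local     -- local-clock time stamp of the event (seconds)
    pps_local       -- sorted local-clock time stamps of consecutive PPS
                       pulses, one per UTC second (assumption: none missed)
    first_pulse_utc -- UTC second marked by pps_local[0]
    """
    if len(pps_local) < 2:
        raise ValueError("need at least two PPS pulses")
    # Index of the last pulse at or before the event. Clamp it so that events
    # outside the recorded range are extrapolated from the nearest pulse pair.
    i = bisect_right(pps_local, event_local) - 1
    i = max(0, min(i, len(pps_local) - 2))
    t0, t1 = pps_local[i], pps_local[i + 1]
    # Consecutive pulses are exactly one UTC second apart, so the fraction of
    # the local interval equals the fraction of the UTC second.
    fraction = (event_local - t0) / (t1 - t0)
    return first_pulse_utc + i + fraction


# Example: a local clock that runs 200 ppm fast relative to UTC.
pulses = [100.0, 101.0002, 102.0004]
utc = local_to_utc(100.5001, pulses, first_pulse_utc=1000)
```

Piecewise-linear interpolation between neighboring pulses keeps the sketch independent of any long-term drift model; a real implementation would additionally have to handle missed pulses and the interrupt latency of the pulse time stamps themselves.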
We use both httperf and SURGE as load generators for our web server. SURGE generates traffic patterns similar to real user behaviour with heavy-tailed distributions and self-similar traffic as observed empirically in recent studies of the internet.
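To illustrate what a heavy-tailed distribution means for load generation, the following Python sketch draws Pareto-distributed values by inverse-transform sampling. It is not SURGE code and does not reproduce SURGE's actual distributions or parameters; the function names and the values `alpha=1.5`, `x_min=1.0`, and `seed=42` are assumptions chosen only for illustration.

```python
import random


def pareto_sample(rng, alpha, x_min):
    """Draw one Pareto(alpha, x_min) variate by inverse-transform sampling."""
    u = rng.random()  # uniform in [0, 1), so 1 - u lies in (0, 1]
    return x_min * (1.0 - u) ** (-1.0 / alpha)


def heavy_tailed_values(n, alpha=1.5, x_min=1.0, seed=42):
    """Return n Pareto-distributed values, e.g. think times or file sizes.

    For 1 < alpha < 2 the distribution has a finite mean but infinite
    variance, which is the regime usually called heavy-tailed.
    """
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    return [pareto_sample(rng, alpha, x_min) for _ in range(n)]


samples = heavy_tailed_values(10_000)
```

Most of the generated values lie close to `x_min`, while a small number are orders of magnitude larger; it is these rare large values that dominate the load in heavy-tailed workloads.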
We built a simulation model of the Web cluster that includes the relevant aspects of the hardware, the operating and communication systems, and the application layer, as required for a realistic performance evaluation of such systems. We formulated the model in UML, taking into account both architectural and behavioral aspects, and used detailed system-level measurements at low loads to determine the basic parameters. These are mainly one-way delays, which show random behavior and implicitly include other system effects such as interrupt latencies. We applied advanced input modeling techniques, including multi-modal distributions, multiple phases, and Bézier curves for unconventional shapes, to represent all quantities adequately. We also proposed a new model, based on differences between successive delays, for the autocorrelations observed in the measurements.
The UML simulation model explicitly represents higher-level dynamics that significantly affect the behavior at higher loads, including buffering, resource contention, and transport control. The model predicts the quantiles of the overall delay for HTTP responses well.
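The two quantities mentioned above, the autocorrelation of a delay series and the differences between successive delays, can be computed as in the following Python sketch. It only shows these basic statistics and is not the proposed model itself; the function names and the sample values are hypothetical.

```python
def autocorrelation(xs, lag=1):
    """Sample autocorrelation of the series xs at the given lag."""
    n = len(xs)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(xs)")
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    if var == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return cov / var


def successive_differences(xs):
    """Differences between each value and its predecessor."""
    return [b - a for a, b in zip(xs, xs[1:])]


# Hypothetical one-way delays in microseconds.
delays = [120.0, 125.0, 123.0, 131.0, 128.0]
rho1 = autocorrelation(delays, lag=1)
diffs = successive_differences(delays)
```

A lag-1 autocorrelation clearly different from zero indicates that successive delays are not independent, which is the situation in which treating each delay as an independent draw from a fitted distribution would be insufficient.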
Project Period
- 2001-10-01 – 2008-05-06
Project Members
Related Publications
- “A Measurement-Based Simulation Model of a Web Cluster,” 3rd Int. Industrial Simulation Conference, Ghent, Belgium, Berlin, Germany, pp. 88-92, June 2005
- “Towards Cost Efficient Mobile Services,” 2nd Int. Conf. on Advances in Mobile Multimedia, Bali, Bali, Indonesia, pp. 179-188, September 2004
- “A Low-Cost Infrastructure for High Precision High Volume Performance Measurements of Web Clusters,” Proc. 13th Conf. on Computer Performance Evaluations, Modelling Techniques and Tools, Heidelberg, Urbana, IL, USA, pp. 11-28, September 2003
- “Aufbau eines clusterbasierten Webservers zur Leistungsanalyse,” Erlangen, pp. 96, 2001