
Introduction to Multi-tenant Web-based Applications

Patrick Nicolas October 2006

Contents
Overview
Web Servers Configuration
    Introduction
    Traffic management
    Web Server Performance Issues
Application Servers Deployment
    Introduction
    Deployment Models
    Performance Improvement Tips
Databases Deployment
    Introduction
    Deployment Models
    Basic Trade-off
    Key Considerations
        Data Protection
        Scalability
        Customization
        High Availability


Overview
This paper introduces the different ways to deploy web servers, application servers and database servers to support a multi-tenant environment, the model known as Software as a Service (SaaS). Traffic management solutions that are external to the applications and databases, such as load balancing, are not the focus of this article.

Web Servers Configuration


Introduction
The performance of a web server farm depends on the load balancing appliances that route the traffic as well as on the tuning of the web servers themselves. Successful web sites combine IT experience, to architect an efficient network infrastructure, with programming skills, to customize and tune the web servers.

Traffic management
A combination of load balancing and web acceleration software or appliances goes a long way toward improving response time for web users. When most of the traffic is outbound (HTTP response content), a triangular flow (Client -> Load balancer -> HTTP server -> Client) is the best approach. All software load balancers and some load balancing appliances support this option.

Some load balancing solutions choke on large binary content. Binary content such as video or audio files larger than 4 KB is delivered as multiple TCP segments. The load balancer then has to reassemble those segments using the TCP sequence numbers, which is processor intensive. A load balancer or web accelerator that offloads the execution of the TCP/IP stack from the main processor generates a performance improvement of 20 to 45%.

Caching, which is now available in an increasing number of load balancers and web accelerators, should be turned on and regularly tuned. Whenever possible, HTML pages should be converted to JavaScript/DOM to reduce the amount of HTML data (tags and content) sent to the client browser.

Processing and analyzing HTTP traffic at Layer 7 (application) is a powerful way to route traffic to the most appropriate server. However, this capability comes with a significant performance cost. It is advisable to restrict the load balancer to managing traffic at the TCP layer. If the application relies on sticky sessions, it is critical to tune the timeout parameter for the persistent connection between the load balancer (LB) and the web server. A small timeout requires the LB to re-initiate TCP connections, and a large timeout requires it to maintain a large hash table of those connections (sockets): both options are computing intensive.

For very large web sites, it is sometimes necessary to split the domain/VIP across two or more load balancers and rely on the native DNS round-robin mechanism to distribute the traffic between them, although the behavior is not always predictable because of cookies and session persistence.
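
As an illustration of sticky routing at the TCP layer, the sketch below hashes the client source address to pick a web server, so the load balancer neither inspects HTTP content nor keeps a per-connection table. Source-address hashing is a common technique, but it is not one the paper itself describes; the backend addresses and the function name pick_backend are hypothetical.

```python
import hashlib

# Hypothetical pool of web servers behind the load balancer.
BACKENDS = ["10.0.0.11:80", "10.0.0.12:80", "10.0.0.13:80"]


def pick_backend(client_ip, backends=BACKENDS):
    """Map a client address to one backend using a stable hash.

    Only the source address is used (no HTTP parsing), and no table of
    open connections is needed to send a client back to the same server.
    """
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


first = pick_backend("192.0.2.7")
second = pick_backend("192.0.2.7")
```

The same client address always maps to the same server as long as the pool is unchanged. Adding or removing a server remaps many clients, which is the usual cost of this simple scheme.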


Web Server Performance Issues


A web site can be made more responsive to client requests by tuning and programming the web server for performance. Here is a sample list of performance improvement tips:
- Use Apache/IIS to serve static pages, images and video. Servlets and JSP tags should be restricted to highly dynamic content.
- Segregate HTTPS (secure) and HTTP traffic.
- Configure the web server to cache SSL credentials (certificates) with an appropriate timeout. A round trip to a central certificate server should be avoided.
- Disable DNS lookups if they are enabled by default.
- Use non-blocking sockets to serialize requests (e.g. AcceptMutex in Apache).
- Use worker threads (IIS, Apache 2.x).
- Specify a timeout for FIN_WAIT to close HTTP sockets.
- Keep GET requests short (< 256 bytes).
- Cache SSL session state information (shared memory hash table).
- Tune the Keep-Alive timeout (to avoid creating a new TCP connection).
- Disable requests from localhost (127.0.0.1).
- Use compression for large pages and images (.jar, .zip).
- Force the connection to close on the HTTP client whenever possible.
- Dedicate a thread to each HTTP client connection (session). The initialization and management of lightweight or standard processes are processor intensive and have a significant negative impact on the performance of short HTTP sessions.
- Stream requests/responses instead of buffering them (Apache, Servlets).
- Disable cookie processing whenever it is not needed.
- Use the 'Expect: 100-continue' handshake for servers that require HTTP authentication.
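
The compression tip can be illustrated with a short sketch. It compresses a response body only when the body is large enough and the result is actually smaller. The threshold of 1024 bytes and the function name compress_body are assumptions made for the example, not values taken from the paper or from any web server.

```python
import gzip
from typing import Tuple


def compress_body(body: bytes, min_size: int = 1024) -> Tuple[bytes, bool]:
    """Return (payload, was_compressed) for an HTTP response body.

    Small bodies are sent as-is, and so are bodies that do not shrink,
    so CPU time is spent only where it saves bandwidth.
    """
    if len(body) < min_size:
        return body, False
    packed = gzip.compress(body, compresslevel=6)
    if len(packed) >= len(body):
        return body, False
    return packed, True


# A repetitive HTML fragment, typical of generated table markup.
page = b"<tr><td>tenant</td><td>value</td></tr>\n" * 500
payload, compressed = compress_body(page)
```

Content that is already compressed, such as archives, usually does not shrink further, which is why the sketch falls back to the original bytes in that case.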

Application Servers Deployment


Introduction
There are multiple options for deploying application servers to serve multiple tenants, chosen according to criteria such as security, scalability, performance and high availability. There are four approaches to deploying SaaS solutions within an IT infrastructure:
- Dedicated servers
- Shared virtualized hosts
- Dedicated application servers
- Shared application servers

Deployment Models
The following breakdown of application server deployments by host, virtual machine, application server and transaction/session can be implemented in both J2EE (JBoss, WebLogic, WebSphere) and .NET environments.


Fully Isolated Application Server: Each tenant accesses an application server running on a dedicated server.

Virtualized Application Server: Each tenant accesses a dedicated application running on a separate virtual machine.

Shared Virtual Server: Each tenant accesses a dedicated application server running on a shared virtual machine.

Shared Application Server: The tenants share the application server and access application resources through separate sessions or threads.

Fig. 1 Basic Categories of Multi-tenant Deployment Modes for Application Servers
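
The shared application server mode can be sketched as follows: one process serves all tenants, each request thread is bound to its tenant, and application state is kept in per-tenant compartments. This is a minimal illustration in Python rather than J2EE or .NET; the class SharedAppServer and the tenant names are hypothetical.

```python
import threading


class SharedAppServer:
    """One application server instance shared by all tenants."""

    def __init__(self):
        self._context = threading.local()  # tenant bound to the current thread
        self._lock = threading.Lock()      # guards the shared dictionary
        self._state = {}                   # tenant_id -> that tenant's items

    def handle(self, tenant_id, item):
        """Entry point for one request: bind the tenant, then do the work."""
        self._context.tenant_id = tenant_id
        return self._add_item(item)

    def _add_item(self, item):
        # Business code reads the tenant from the thread context, so it
        # only ever reaches the compartment of the tenant being served.
        tenant_id = self._context.tenant_id
        with self._lock:  # keep the critical section short
            items = self._state.setdefault(tenant_id, [])
            items.append(item)
            return list(items)

    def items_for(self, tenant_id):
        with self._lock:
            return list(self._state.get(tenant_id, []))


server = SharedAppServer()
requests = [("acme", "a1"), ("globex", "g1"), ("acme", "a2")]
threads = [threading.Thread(target=server.handle, args=req) for req in requests]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The other three modes move the same separation outward, to the application server process, the virtual machine or the host, trading resource sharing for isolation.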

Performance Improvement Tips


Although the architecture and the specific deployment mode have a significant impact on the overall performance of application servers, performance can be further improved by tuning configuration parameters. For instance, J2EE-based application servers benefit from the following changes:
- Use Java Caching System (JCS) or Commons Collections.
- Increase the default heap size.
- Avoid entity EJBs.
- Do not use static variables in servlets.
- Use stateless session beans to avoid synchronization.
- Make sure log messages are generated asynchronously.
- Reduce the size of critical sections that guard against race conditions between sessions or threads.
- Use callable statements (stored procedures) or prepared statements (pre-compiled) instead of plain statements.
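
The asynchronous logging tip can be sketched with a queue between the request threads and the slow log destination. The example uses Python's standard logging module rather than a J2EE logging framework; the logger name and the in-memory ListHandler, which stands in for a file or network destination, are assumptions made for the illustration.

```python
import logging
import logging.handlers
import queue


class ListHandler(logging.Handler):
    """Stand-in for a slow destination (file, socket); keeps messages in memory."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


log_queue = queue.Queue()
slow_handler = ListHandler()

# The listener drains the queue on its own thread and calls the slow handler.
listener = logging.handlers.QueueListener(log_queue, slow_handler)
listener.start()

logger = logging.getLogger("tenant.app")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False
# Request threads only enqueue the record, which is cheap and non-blocking.
logger.addHandler(logging.handlers.QueueHandler(log_queue))

logger.info("order %s accepted for tenant %s", 42, "acme")

listener.stop()  # waits until the queued records have been handled
```

The request thread pays only for putting the record on the queue; the cost of writing it is carried by the listener thread.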

Databases Deployment
Introduction
There are multiple options for an Application Service Provider (ASP) to deploy SaaS solutions and distribute customer data across servers, virtual machines, databases, schemas and tables according to criteria such as security, scalability and maintainability. There are five approaches to deploying SaaS solutions within an IT infrastructure:
- Dedicated servers
- Shared virtualized hosts
- Dedicated databases in shared servers
- Dedicated schemas within a shared database
- Shared tables

Deployment Models
The different deployment modes rely on breaking the infrastructure down into servers (or hosts), databases and schemas. Other types of componentization or classification include server groups segregated by subnet mask (switches), Network Attached Storage (NAS) and table partitions. The following table (Fig. 2) describes the multi-tenant deployment modes most commonly used by Application Service Providers.

Fully Isolated Data Center: The tenants do not share any data center resources.

Virtualized Servers: The tenants share the same host but access different databases running on separate virtual machines.

Shared Server: The tenants share the same server (hostname or IP) but access different databases.

Shared Database: The tenants share the same server and database (shared or different ports) but access different schemas (tables).

Shared Schema: The tenants share the same server, database and schema (tables). Each tenant's data is segregated by key and rows.

Fig. 2 Basic Categories of Multi-tenant Deployment Modes in Data Centers
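
The shared schema mode, in which tenants are separated only by a key on each row, can be sketched with an in-memory SQLite database. The table invoices, the column tenant_id and the sample rows are hypothetical; the point is that every query carries the tenant key as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE invoices (
        tenant_id  TEXT    NOT NULL,   -- key that segregates the tenants
        invoice_id INTEGER NOT NULL,
        amount     REAL    NOT NULL,
        PRIMARY KEY (tenant_id, invoice_id)
    )
    """
)
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [("acme", 1, 120.0), ("acme", 2, 80.0), ("globex", 1, 999.0)],
)


def invoices_for(connection, tenant_id):
    """Return only the rows of one tenant; the key is always a bound parameter."""
    cursor = connection.execute(
        "SELECT invoice_id, amount FROM invoices "
        "WHERE tenant_id = ? ORDER BY invoice_id",
        (tenant_id,),
    )
    return cursor.fetchall()
```

Both tenants reuse invoice number 1 without conflict because the tenant key is part of the primary key. Forgetting the tenant filter in a single query is enough to expose another tenant's rows, which is why the other modes push the separation into schemas, databases or servers.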

Basic Trade-off
As described in the next section, the selection of the optimal deployment mode depends on multiple conflicting criteria such as:
- Regulatory requirements
- Perceived data isolation
- Development and maintenance costs
- Extensibility of the solution by customers
- Business continuity
- Liability regarding breaches
One simple trade-off is balancing the tenants' security and authentication concerns against the operational costs incurred by the service provider, as illustrated in Fig. 3.


Fig. 3 Comparing Multi-tenant Deployment Modes using Costs and Security

Greater isolation of data, in both logical and physical terms, reduces the potential for security breaches and inadvertent sharing of data. However, a fully isolated approach increases the costs related to equipment and maintenance. The service provider has to estimate and monitor the average cost per tenant and/or per user to guarantee profitability. As discussed in the next section, softer approaches, such as data encryption to improve security and data compression to reduce costs, can be used as a more elaborate alternative.
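
The cost side of this trade-off can be made concrete with simple arithmetic. The sketch below assumes a very coarse model, a fixed cost per server spread over the tenants it hosts plus a variable cost per tenant. The model and every figure in it are hypothetical and serve only to show the shape of the calculation.

```python
def cost_per_tenant(fixed_cost_per_server, tenants_per_server, variable_cost_per_tenant):
    """Average monthly cost of one tenant under a simple two-part cost model."""
    return fixed_cost_per_server / tenants_per_server + variable_cost_per_tenant


# Hypothetical figures: 1000 per server per month, 50 per tenant per month.
isolated = cost_per_tenant(1000.0, 1, 50.0)   # one tenant per dedicated server
shared = cost_per_tenant(1000.0, 50, 50.0)    # fifty tenants on a shared server
```

Under these assumed numbers the dedicated server costs 1050 per tenant and the shared server 70 per tenant. A provider would replace the inputs with measured values and track the result over time.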

Key Considerations
The requirements to build a truly robust SaaS solution can be broken down into four categories:
- Security and data protection
- Scalability and costs
- Customization or extensibility
- Business continuity (high availability)

Data Protection
A SaaS architect is responsible for building adequate data protection as well as multiple levels of defense that complement each other to counter both internal and external threats. Data protection can be implemented through filters or firewalls, access control lists and encryption.

Proxy Filters: The objective is to create a filter or cache between the tenants and users and the actual data source. One common approach to managing access to data is to monitor TCP packets for tampering, impersonation and incorrect source IP addresses. The filtering mechanism can be implemented in a device outside the databases or within a view. A proxy can be set up to convert authentication information such as user name, password, database and roles into non-descriptive credentials before a connection is made to the actual database. A proxy also has the advantage of hiding the actual IP/hostname and port the database engine is listening on.

Access Control Lists: This commonly used technique relies on a trusted third-party subsystem which is already authenticated to access a database. Systems such as Kerberos and Active Directory monitor the actual credentials of the principal or trusted client to the database. The combination of the user and tenant credentials with the trusted principal constitutes the security context.

Data Encryption: Cryptography can be applied both to the tenant's credentials, through the protocol used to access the database, and to the stored data. Symmetric encryption/decryption, which relies on a single shared key (e.g. a common licensing mechanism), is harder to keep safe than asymmetric algorithms because the key has to be shared, but it requires a lot less computing power. Asymmetric algorithms rely on a public/private key pair, as used in SSL.

Scalability
For a SaaS application, scalability is important because the provider has to support data belonging to all of its customers. Databases can be scaled up (by moving to a larger server with more powerful processors, more memory and faster disk drives) and scaled out (by partitioning a database onto multiple servers). Different strategies are appropriate when scaling a shared database versus scaling dedicated databases. The most common techniques to scale databases are dynamic provisioning, partitioning and a combination of the two.

Dynamic Provisioning relies on single-master replication, in which only the original can be written to. It is much easier to manage than multi-master replication, in which some or all of the copies can be written to and some kind of synchronization mechanism is used to reconcile changes between the different copies of the data.

Partitioning involves pruning subsets of the data from a database and moving the pruned data to other databases or to other tables in the same database. The database is divided into several smaller databases using the same schema and structure, but with fewer rows in each table. An alternative is to divide a table into smaller tables with the same number of rows, each containing a subset of the columns of the original. Not all data should be partitioned.
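
The two forms of partitioning described above can be sketched as follows. The first function assigns all rows of a tenant to one of several databases with the same schema (fewer rows each); the second splits a row into two narrower rows that share a key (same rows, fewer columns each). The hash-based assignment, the function names and the sample rows are assumptions made for the illustration; the paper does not prescribe how rows are assigned.

```python
import zlib


def shard_for(tenant_id, shard_count):
    """Partition by rows: pick the database that holds a tenant's rows."""
    return zlib.crc32(tenant_id.encode("utf-8")) % shard_count


def split_columns(row, key, first_table_columns):
    """Partition by columns: split one row into two narrower rows sharing a key."""
    first = {name: value for name, value in row.items()
             if name == key or name in first_table_columns}
    second = {name: value for name, value in row.items()
              if name == key or name not in first_table_columns}
    return first, second


rows = [
    {"id": 1, "tenant": "acme", "name": "Ada", "notes": "prefers email"},
    {"id": 2, "tenant": "globex", "name": "Bob", "notes": "call after 5pm"},
    {"id": 3, "tenant": "acme", "name": "Cy", "notes": ""},
]

# By rows: every shard keeps the full schema but only some of the rows.
shards = {n: [] for n in range(2)}
for row in rows:
    shards[shard_for(row["tenant"], 2)].append(row)

# By columns: both tables keep every row but only some of the columns.
narrow, wide = split_columns(rows[0], "id", {"tenant", "name"})
```

Because the shard is derived from the tenant, all rows of one tenant stay in the same database, and a tenant-scoped query touches a single shard.
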
Identity data and index tables, for instance, should be preserved as single database objects.

A mixed model is usually implemented when the conditions that trigger the provisioning of the database cluster are dynamic or when the traffic patterns of the different tenants are not consistent.

Customization
It is fair to assume that each tenant may have a different set of requirements and data structures. One size is unlikely to fit all. It is critical to deploy database instances and design schemas so that fields, types and constraints can be created, removed or modified without interrupting access to the databases. There are several known techniques to extend existing tables.

Customized predefined fields: The records from different tenants are intermingled within the same set of tables. The fields (or columns) have to be defined as generically as possible to cover the most common scenarios. The fields are divided into standard fields and extensions. The standard fields have strict data types while the extensions allow the type to be overridden. This technique can be applied to any SaaS deployment model but is most relevant to the shared schema.

Customized predefined tables: This approach allows the tenant to create new fields and store specific data in a separate table which already has some predefined labels and data types. This design has the advantage of allowing the number of fields to be customized at run time. This technique can be applied to any SaaS deployment model but is most relevant to the shared schema.

Dynamic fields: This technique, which makes sense when the schema is not shared, allows the tenant to dynamically add new columns to an existing table.

High Availability
One last critical element of database deployment in a SaaS multi-tenant environment is the guarantee of availability of the data. Each database object should be replicated in a timely fashion, and primary servers and databases should have a fail-over mechanism. Replication can be event-driven or scheduled. A significant drawback of the separate-schema approach (dedicated schemas within a shared database) is that tenant data is harder to restore after an outage: restoring the entire database would overwrite the data of every tenant on that database with backup data for all the tenants.
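
Returning to the customization techniques, the customized predefined tables approach can be sketched as a pair of extension tables next to a standard table: one holds the label and data type of each tenant-defined field, the other holds the values. The table and column names, the two supported data types and the sample data are hypothetical; this is one possible layout under those assumptions, not the paper's own schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (
        tenant_id   TEXT    NOT NULL,
        customer_id INTEGER NOT NULL,
        name        TEXT    NOT NULL,
        PRIMARY KEY (tenant_id, customer_id)
    );
    -- Label and data type of each field a tenant has added.
    CREATE TABLE extension_fields (
        field_id  INTEGER PRIMARY KEY,
        tenant_id TEXT NOT NULL,
        label     TEXT NOT NULL,
        data_type TEXT NOT NULL
    );
    -- One row per (customer, custom field) value, stored as text.
    CREATE TABLE extension_values (
        tenant_id   TEXT    NOT NULL,
        customer_id INTEGER NOT NULL,
        field_id    INTEGER NOT NULL REFERENCES extension_fields(field_id),
        value       TEXT
    );
    """
)

CASTS = {"int": int, "text": str}  # data types supported by this sketch


def add_field(connection, tenant_id, label, data_type):
    """Define a new custom field at run time, without altering any table."""
    cursor = connection.execute(
        "INSERT INTO extension_fields (tenant_id, label, data_type) VALUES (?, ?, ?)",
        (tenant_id, label, data_type),
    )
    return cursor.lastrowid


def set_value(connection, tenant_id, customer_id, field_id, value):
    connection.execute(
        "INSERT INTO extension_values VALUES (?, ?, ?, ?)",
        (tenant_id, customer_id, field_id, str(value)),
    )


def custom_fields(connection, tenant_id, customer_id):
    """Return a customer's custom fields, converted back to their declared types."""
    rows = connection.execute(
        "SELECT f.label, f.data_type, v.value "
        "FROM extension_values v JOIN extension_fields f "
        "ON f.field_id = v.field_id AND f.tenant_id = v.tenant_id "
        "WHERE v.tenant_id = ? AND v.customer_id = ? ORDER BY f.field_id",
        (tenant_id, customer_id),
    )
    return {label: CASTS[data_type](value) for label, data_type, value in rows}


conn.execute("INSERT INTO customers VALUES ('acme', 1, 'Ada')")
tier = add_field(conn, "acme", "loyalty_tier", "int")
set_value(conn, "acme", 1, tier, 3)
```

Adding a field is an ordinary insert, so the number of fields can change at run time without interrupting access to the standard tables. The price is an extra join and a type conversion on every read.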
