Sunteți pe pagina 1din 94

Lessons Learned in

Building a Highly
Scalable MySQL
Database
Presented by,
MySQL AB® & O’Reilly Media, Inc.

Mariella Di Giacomo
Daniel Chote
Geoff Harrison

The Hive
Outline

 Introduction/Background
 Motivations, Project Requirements
 Architectural Design
 Proposed Solutions
 Performance Analysis Results
 MySQL and Other Optimizations
The Hive
 Started in 1998 (under a different name)
 Operations in 4 countries
 Developer-centric organization
The Hive
 The Hive’s collective vision is coding and
collaborating from anywhere.
 The Hive has a number of large scale projects
that include video conferencing, VOIP, IM, and
virtual whiteboard all connected through smart
self-managing project tools.
MySQL Installations at The Hive

 70+ MySQL Server Installations


 Linux Operating System
 Mostly Commodity Hardware
 Used for production, batch processing,
replication and development
The Next Generation
 “What we need now is a brilliant idea …”
Scalable Architecture

 High Volume of users and access traffic


 Exponential growth of data access and storage
 Use of relatively commodity hardware
Architectural Requirements

 Store data related to millions of users


 Store some of the data in an XML format
 Store and Retrieve the data through an Open
Source Relational Database Management
System (RDBMS)
Architectural Requirements

 Build a scalable architecture capable of


handling content updates and data addition at
fastest rate reasonably possible
 A system that has as little service disruption as
possible
 Robustness, Flexibility, Scalability and Speed
A Scalable Architecture
 Scalable, robust, fast and flexible.
Capable of handling data related to millions of users.
Based on a data layout that maps all the data for
each user to its individual table, organizing the users
across multiple systems, databases and filesystems.
Consists of homogeneous systems.
Each system consists of two physical computers that
synchronize the data through Master Master MySQL
Replication.
Hardware Architecture
 The hardware architecture consists of:
Pairs of AMD nodes (Opteron). Each node has 2
CPUs 2GHz with 2cores; it has 24 GB of RAM + 8 GB
of swap
Each computer system has two types of data storage
devices: Serial Advanced Technology Attachment
(SATA) and Solid State Drive (SSD).
Each computer System is composed of Debian Linux
Operating System (OS)
Software Architecture
 We benchmarked 4 filesystems (Ext3, JFS, ReiserFS
and XFS).
 We chose MySQL as the RDBMS for the following
reasons:
Open Source
Speed
Data Storage Capabilities
Database Design
Fault Tolerance
Security
MySQL Database Design
 MySQL supports several storage engines.
 We chose MyISAM, for the following reasons:
Each table data is stored in a MYD file
Each table indices are stored in a MYI file
Each table file requires less disk space compared to
other storage engines.
Memory key buffer for the indexed data
 Downsides of MyISAM
Good for high volume of writes and/or read, but not
both in high concurrency.
MySQL Database Design
 Our design takes advantage of the fact that
there will not be high level of concurrency on
the same table, while high level of concurrency
will be present at database and server level.
 Table Structure Design:
Reduce and or combine indices on some fields, to
take fewer resources when adding and modifying data
Use of NOT NULL fields where possible
Use the most appropriate, efficient field type for a
given field
Architecture
Why that architecture ?
Why that architecture ?
 We had an idea …..
 We validated the idea through benchmark
results and analysis
 We analyzed several MySQL Storage engines
(MyISAM, InnoDB, Maria, PBXT, Falcon) on 4
types of filesystems (Ext3, JFS, XFS, ReiserFS)
using two system architectures (AMD and Intel),
two types of data device technologies and
integrated the benchmarks with some specific
low-level optimizations.
Benchmarks

 Benchmark Configuration
 Benchmark Hardware Architecture
 Benchmark Topics
Benchmark Configuration
 The benchmark program is written in Perl. The
same program is used for table creation, data
insertion, update and selection.
 The benchmark program spawns 100 parallel
instances (clients) that execute Data Definition
Language (DDL) and Data Manipulation
Language (DML) statements against the
MySQL server.
 The statement response time has been taken
on the server side and on the client side.
Benchmark Configuration
 All benchmarks have been run several times for
the same configuration.
 All the computer systems benchmarked have
been set on an isolated network to lower noise
as much as possible.
 All computer systems are composed entirely of
Debian Linux (Lenny) running 2.6.23 and 2.6.24
 All systems have been compiled with latest
MySQL 5.1.x
Benchmark Hardware Architecture
 The hardware architecture used for the
benchmarks consists of:
Pairs of AMD nodes (Opteron). Each node has 2
CPUs 2GHz with 2cores; 24 GB of RAM + 8 GB of
swap
2 Intel Xeon nodes. Each node has 2 CPUs 3GHz
with 2cores and hyper-threading enabled (4 virtual
cores each CPU); 24GB of RAM + 8 GB of swap
Each system has identical SATA storage devices. The
AMD nodes contain also two identical SSD drives that
are raided via an add-on raid controller.
Benchmark Hardware Architecture
Each system always uses two dedicated storage
devices for database benchmarks.
Benchmark Topics
 Data Layout
Organize each user’s data in a separate table
Partition a table to contain the data associated to a
set of users, using MySQL horizontal partitioning.
 Database Organization
20 and 100 databases with 2 million tables
50 databases with 1 million tables
 MySQL Storage Engines: MyISAM, InnoDB,
Maria, Falcon, PBXT
Benchmark Topics

 Filesystem Organization
One data storage device for filesystem. Each
filesystem contains 1 million tables
Ext3, JFS, ReiserFS and XFS
 Data Storage Device Technology
SATA
SSD
Data Layout Scenarios
 Two data layout scenarios are evaluated:
Organize each user’s data in a separate table
Partition a table to contain the data associated to a
set of users, using MySQL horizontal partitioning. In this
case the total number of tables required is the total
number of users divided by the maximum number of
partitions allowed on a partitioned table.
Data Layout Scenarios
 Each Data Layout scenario has been
benchmarked for:
Dynamic Creation of 1 million to 2 million tables on
one system. In the case of 1 million tables they would
be physically located on only one filesystem (one hard
drive). In the case of 2 million tables they would be
spread across 100 or 20 databases. Each dedicated
filesystem would be mapped to a physical storage
device.
Data Insertion and Update on the created tables
Data Selection
Table Creation
Table Creation Configuration
 Table creation has been executed using a
stored procedure.
 Each benchmark has been executed on an
AMD and Intel system
 Each benchmark has been evaluated with 4
distinct filesystem types.
 The table creation benchmarks have been
evaluated with write cache enabled, write back
(WB) and write cache disabled, write through
(WT) disk policies.
Data Insert, Update and Query
Inserts and Updates Configuration
 The benchmark configuration for the data insert
and update is equivalent to the one used for
table creation.
 Each benchmark has been executed only on
AMD systems
 Each benchmark has been evaluated with 4
distinct filesystem types.
 The benchmarks have been executed using
write cache enabled and write cache disabled.
Inserts and Updates Configuration
 While table creation would perform better using
write cache disabled, initial benchmarks with
DML operations using write cache disabled
would perform worst compared to the
equivalent with write cache enabled.
Data Layout Scenarios
 Two data layout scenarios are evaluated:
Organize each user’s data in a separate table
SATA Device Technology
•MyISAM
•InnoDB
•Falcon
•Maria
•PBXT
SSD Device Technology
MyISAM
Data Layout Scenarios
 Data layout scenarios:
Organize each user’s data in a separate table
SATA Device Technology
MyISAM and SATA Benchmarks
 Benchmarks shown in the coming graphs have
been executed using the MyISAM Storage
Engine
 The MyISAM key_buffer is set to 4GB
 In each benchmark the system containing
MySQL server has two storage devices for the
databases
Table Creation
 These graphs show the benchmark results
derived from processing 2 million tables.
 Table creation of 2 million tables was spread
across 20 databases. The databases were split
across two filesystems. Each filesystem was on
a unique SATA drive. (The drives were set up as
single volumes through an add-on raid
controller)
MyISAM using WB
MyISAM using WT
MyISAM using JFS
MyISAM with XFS
MyISAM with ReiserFS
MyISAM with Ext3
Table Creation with MyISAM
 The benchmark results showed:
 When creating 2 million tables, approximately
36GB of space is used
 The highest processing rate was obtained using
the JFS filesystem on a volume (SATA via raid
controller) with write cache disabled
 The MyISAM key buffer did not play a major role
 The number of volumes would influence the
processing rate.
Table Creation with MyISAM
 Between 1 million tables on one filesystem
versus 2 million tables on two filesystems, we
did not see any major impact in number of
tables created per second (provided that we
would not reduce the number of storage
devices).
 1 million tables would generate a total of 3
million files, 2 million tables would double the
number.
MyISAM and SATA Benchmarks

 Inserts and Updates on 1 million tables (50


databases)
 Inserts and Updates on 2 million tables (20 and
100 databases)
Inserts and Updates
 The next graphs show the results of DML
operations on 1 million tables created using
MyISAM storage engine with 4 types of
filesystems and using write cache enabled.
 While table creation would perform better using
write cache disabled, initial benchmarks with
DML operations using write cache disabled
result in generating response times 10% higher
compared to the equivalent with write cache
enabled.
MyISAM and Ext3 and WB
MyISAM and JFS and WB
MyISAM and XFS and WB
MyISAM and ReiserFS and WB
MyISAM and SATA Benchmarks

 Inserts and Updates on 1 million tables


 Inserts and Updates on 2 million tables
Benchmarks on DML operations on 2 million tables
were stopped after a few tests due to their high
response time, caused by heavy I/O operations on the
volumes.
MyISAM and SATA Results

 The benchmarks run using MySQL MyISAM


storage engine on SATA device technology
showed, after several optimizations, that the
average response time of a DML operation
would be higher than one second on an
environment with 100 concurrent clients.
InnoDB and SATA Benchmarks

 Table creation with 1 million tables


(50 databases)
 Table creation with 2 million tables
(20 databases)
 Inserts and Updates on 1 million tables
 Inserts and Updates on 2 million tables
Table creation
 These next graphs show the benchmark results
derived from processing 2 million InnoDB
tables.
 The benchmarks have shown:
There is a memory overhead with the table dictionary
(approximately 6GB of memory for all of the table
dictionaries).
There is an overhead on the disk. Approximately 4
times the space (in comparison to MyISAM) is used to
store a table, whether using single or multiple table
spaces, or a single file per table.
InnoDB on JFS
Inserts and Updates with InnoDB
 The next graph shows the results obtained with
1 million tables using InnoDB
 The number of databases is 50
 The filesystem considered is JFS
 The data storage devices are SATA
 The InnoDB buffer is set to 6GB, but not limited
to it
InnoDB on JFS with WB
InnoDB and SATA Results

 The benchmarks were evaluated only with one


filesystem type, JFS. They were not extended to
other types because of the InnoDB overheads
(memory required for table dictionary and disk
space).
Falcon and SATA Benchmarks

 We evaluated a few benchmarks related to


table creation and DML operations using Falcon
as MySQL Storage engine.
 The benchmarks were executed using 1 million
tables.
 Falcon storage engine was discarded because
it is a relatively new storage engine and is not
yet robust as MyISAM or InnoDB.
Maria and SATA Benchmarks

 We evaluated a few benchmarks related to


table creation and DML operations using Maria
Storage engine.
 The benchmarks were executed using 1 million
tables.
 Maria storage engine was discarded because it
is a new storage engine and is not yet
optimized as MyISAM.
Maria and SATA Benchmarks

 One of the interesting features of Maria is the


ability to be generate crash safe tables. Maria is
a crash safe version of MyISAM.
 It has all the main functionality of MyISAM
engine, but includes recovery support, full
logging, etc.
PBXT and SATA Benchmarks

 We evaluated a few benchmarks related to


table creation using PBXT Storage engine.
 The benchmarks were executed using 1 million
tables.
 PBXT storage engine was discarded because it
showed some limitations on the number of
tables that could be allocated.
Data Layout Scenarios
 Data layout scenarios:
Organize each user’s data in a separate table
SSD Device Technology
MyISAM and SSD Benchmarks

 1 million tables using 50 databases


 2 million tables using 100 and 20 databases
Table Creation (WB)
MyISAM and Ext3 and WB
MyISAM and JFS and WB
MyISAM and ReiserFS and WB
MyISAM and XFS and WB
MyISAM and SSD Benchmarks

 After having run the benchmarks for 1 million tables, we


noticed that Ext3 provided the best performance. We
expect this is due to the fact that it does less IO
optimization when compared to other filesystems.
 Consequently the following benchmarks of 2 million
tables have been executed only using the Ext3
filesystem and two SSD storage devices.
MyISAM and SSD Benchmarks

 1 million tables
 2 million tables using 100 and 20 databases
Table Creation: Ext3 and WB (100)
MyISAM and Ext3 and WB (100)
MyISAM and Ext3 and WB (100)
MyISAM and Ext3 and WB (20)
MyISAM and Ext3 and WB (20)
MyISAM and SSD Results
 We have evaluated 50 databases with 1 million
of tables on a single SSD device, 20 and 100
databases with 2000000 of tables spread on
two SSD devices.
 The benchmarks showed that Ext3 with write
cache enabled would perform better while doing
data inserts, updates and selects with 100
concurrent clients.
MyISAM and SSD Results
 The benchmarks showed also a quite
interesting improvement in response time in the
chosen architecture.
 The noop I/O scheduler performed better
compared to the others available on Linux I/O.
MyISAM SSD versus MyISAM SATA
 The graph below compares DML statements
with MyISAM using 100 databases with 1
million tables on SATA and SSD
Data Layout Scenarios
 Initially two data layout scenarios are evaluated:
Organize each individual user’s data in a separate
table
Partition a table to contain the data associated to a
set of users, using MySQL horizontal partitioning. In this
case the total number of tables required is the total
number of users divided by the maximum number of
partitions allowed on a partitioned table.
SATA Device Technology
•MyISAM
•InnoDB
MySQL and Horizontal Partitioning

 The next slides discuss the results of 2 million


tables with horizontal partitioning and both the
MyISAM and InnoDB storage engines.
 The type of partitioning used was “Range”
partitioning based on the primary key, unsigned
integer field.
 The number of partitions investigated was 10
and 1024.
MyISAM with Horizontal Partitioning
 The next graph shows the creation of 2 million
tables using the MyISAM storage engine and
horizontal partitioning
 The initial number of partitions was set to 10.
This implies that for each table will be created 2
extra files per partition, with a total of 22 files
per table. In the case of 2 million users this
would generate a total of 4.4 million files and
200,000 tables.
MyISAM with Horizontal Partitioning

 We investigated also the scenario with a


number of partitions of 1024.
 With 1024 partitions per table, one table would
generate 2050 files, so in the case of 2 millions
users approximately 4 millions of files and
approximately 1954 tables.
 The graph shows the results of table creation
using a number of partitions of 10.
Table Creation with WB
MyISAM with Horizontal Partitioning
 The drawback of using MyISAM versus
MyISAM with partitioning was:
Pros
The results for using a number of 10 partitions were slightly
lower compared to those of the first data layout scenario.
Cons
Higher level of management compared to the first data
layout scenario and still a high number of files to be managed
Complexity and overhead when using partitioning with a
number of partitions set to 1024
Some limitations on the possibility of using sub partitioning
to distribute segments of data for a specified user.
InnoDB with Horizontal Partitioning

 InnoDB with partitioning was considered but


was excluded for the following reasons:
 Memory overhead due to table dictionary
 Disk Space overhead compared to MyISAM as
discussed earlier (4 times that of MyISAM).
Optimizations
 Why Optimize ?
Get more Performance with same Hardware. As
your data grows, Performance may degrade.
 What Optimize ?
Hardware Architecture, Operating System
Components and MySQL.
 Where Optimize ?
Monitor your system and applications and
watch for possible bottlenecks.
Optimizations

 MySQL Optimizations

 Other Optimizations
MySQL Optimization

 Server Compilation (query cache disabled, fast


mutexes enabled, etc.)
 Server Configuration and Tuning (memory
buffers, number of open tables, number of open
files, concurrency, etc.)
 Table structure (Database Design) and
allocation
 Query handling
MySQL Server Compilation

 Compilation using appropriate optimizations for


Linux (Debian) and the specific hardware
architecture made improvements of 10% of
speed over equivalent standard MySQL server
binary.
MySQL Configuration

 MySQL assumes little about your hardware


system configuration.
 Optimal size of the MyISAM key_buffer used to
allocate indices in memory.
 Number of open tables at a given point (it plays
an important role when dealing with millions of
tables).
 Max number of temporary tables
MySQL Configuration

 Number of connections per second, number of


threads created per second, max number of
connections
 Thread concurrency
 Thread memory allocation
 Long query time
MySQL Query Handling
 Use EXPLAIN SELECT to improve queries
 Use Indexes
 Simplify some of the WHERE clauses
 Use UNION of SELECT instead of only one
SELECT with several conditions in the WHERE
clause
 Disabled query cache, provided that the queries
would not take advantage of that
 Avoid Table Scans
Other Optimizations

 File System tuning


 File System “mount” options tuning
 OS Linux Scheduler tuning
 OS Linux Kernel Swap tuning
 Tuning for Write-Heavy Loads
Q&A

 Thanks for your attention!


 We would like to thank The Hive for making this
talk possible.

S-ar putea să vă placă și