Sunteți pe pagina 1din 44

Benchmarking FreeBSD

Ivan Voras <ivoras@freebsd.org>

What and why?

Everyone likes a nice benchmark graph :)

And it's nice to keep track of these things

The previous major run comparing FreeBSD to Linux was done by Kris Kennaway in 2008. There are occasional benchmarks appearing on blogs and mailing lists which show interesting results...

Target audience

Mostly developers

The results are decent, but not rosy Significant space for improvement What to do, what not to do Tuning? Do you need it?

Some system administrator material


Avoid benchmarking?

Purpose of benchmarking...

To check your system for configuration bugs and obvious problems (hw/sw) To check your capacity for running an application (web server, db server, etc.) To compare your hardware to others' To compare your favorite application / operating system to others...

Repeatability

The best benchmarks are performed in a way anyone can repeat them Both result verification (error checking) and a way to compare their own setup to the one tested Repeatability is a good thing...

Test hardware

2x IBM x3250 M3 Xeon E3-1220 3.1 GHz, 4-core (no HTT) 32 GB RAM 4x SATA drive RAID0 1 Gbps NIC (2 ports)
Directly connected NICs

Test software

FreeBSD 9.1 and 10-CURRENT (HEAD) CentOS 6.3 PostgreSQL 9.2.3 blogbench 1.1 bonnie++ 1.97 filebench 1.4.8 Mdcached 1.0.7

Accuracy and precision

Accurate = if it measures what we think it measures (or is it off the mark) Precise = is the measured value good approximation of the real one (or is it affected by noise)

Systematic and random errors

Systematic = we're not measuring what we thing we're measuring (the method of benchmarking is wrong, even though the results may be repeatable and look correct). Random = affected by noise outside our control, results in non-repeatable measurements.

Expressing precision

Find meaningful precision

(or: avoid false precision)

412.567 MB/s 412.5 MB/s 412 MB/s 410 MB/s

Standard deviation

The error bars 1234 +/- 5 of something Expresses confidence in your measurement how precise they are (clustered around the measured value)

Example: hard drive performance

(Systematic: you are not measuring what you think are measuring) File systems vs (spinning) drives
350 300 250

MB/s

200 150 100 50 0 outside

diskinfo -vt /dev/da0

middle

inside

File systems are hard to measure

Data location-dependant performance Data structure (re)use-dependant performance (newfs vs old system)

Double for flash drives


400 350 300

Whole drive

Noise
MB/s

250 200 150 100 50 0 1 2 3 4 5 6 7 8

outside middle inside

Speaking of noise...

Reducing the difference

Create a partition spanning the minimal required disk space (e.g. 3x-4x the RAM size)
350 300 250

MB/s

200

150

outside middle inside

100

50

partition

whole drive

bonnie++

File system bandwidth & file op latency

600

500

400

MB/s
300

200

100

0 Write CentOS 6.3 Rewrite FreeBSD 9.1 FreeBSD 10 Read

File systems are complex beasts

Think about it it's a database... Data layout issues (esp. on rotating rust) Concurrent access issues miscellaneous features such as attributes, security, reliability, TRIM NFS is also a horror but for different reasons

Blogbench

Stresses file system parallelism and random disk IO

Creates a tree of small-ish files (2 KiB 64 KiB) Reads and writes them Atomic renames for (some) writes Multithreaded

Blogbench results
2000000 1800000 1600000 1400000 1200000 1000000

700,000

1,900,000

800000 600000 400000 200000 0 Linux read 4500 4000 3500 3000 2500 2000 1500 1000 500 0 Linux write FreeBSD 9.1 write FreeBSD 10 write FreeBSD 9.1 read FreeBSD 10 read

Blogbench FreeBSD 9.1 vs 10


x read-91 + read-10 +----------------------------------------------------------------------+ |xx x * x x*x + * x +* + + + + +| | |__________AM|________|______M__A_________________| | +----------------------------------------------------------------------+ N Min Max Median Avg Stddev x 11 509450 695303 600394 593580.45 60234.934 + 11 577355 893889 695303 708560.45 102923.93 Difference at 95.0% confidence 114980 +/- 75005.3 19.3706% +/- 12.6361% (Student's t, pooled s = 84325.5) x write-91 + write-10 +----------------------------------------------------------------------+ | + | |+ + + ++* * x+ x x x x * x x+ x| | |__________M_______A|_________________|M_________________| | +----------------------------------------------------------------------+ N Min Max Median Avg Stddev x 11 1299 2173 1734 1716.6364 287.0642 + 11 1111 2104 1299 1416 296.7814 Difference at 95.0% confidence -300.636 +/- 259.694 -17.5131% +/- 15.128% (Student's t, pooled s = 291.963)

Why is Blogbench slow on FreeBSD?

A multithreaded mix of file operations On FreeBSD (UFS):


open(O_WRITE) write() rename() etc ... block each other (exclusive lock) + writers block readers

So called write bias of FreeBSD

Why is Blogbench slow on FreeBSD?


884 884 884 884 884 884 884 884 884 884 884 884 884 884 884 884 884 884 884 884 884 884 884 884 884 884 884 884 884 884 884 100130 100190 100162 100141 100190 100162 100130 100141 100162 100130 100190 100162 100130 100141 100190 100141 100162 100190 100130 100190 100141 100162 100190 100141 100130 100190 100162 100190 100162 100141 100130 blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench blogbench 1.472372 1.472381 1.472382 1.472386 1.472387 1.472395 1.472398 1.472402 1.472407 1.472412 1.472403 1.472418 1.472421 1.472415 1.472423 1.472430 1.472433 1.472438 1.472442 1.472443 1.472451 1.472451 1.472450 1.472460 1.472464 1.472465 1.472461 1.472475 1.472477 1.472480 1.472478 CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL CALL open(0x7ffffebf5770,0x601<O_WRONLY|O_CREAT|O_TRUNC>,0x180<S_IRUSR|S_IWUSR>) close(0x2a) read(0x31,0x602f60,0x10000) read(0x32,0x602f60,0x10000) open(0x7ffff73b9ba0,0<O_RDONLY>,<unused>0) read(0x31,0x602f60,0x10000) write(0x2a,0x612f60,0x1df2) read(0x32,0x602f60,0x10000) close(0x31) close(0x2a) open(0x7ffff73b9ba0,0<O_RDONLY>,<unused>0) open(0x7ffffabd5ba0,0<O_RDONLY>,<unused>0) rename(0x7ffffebf5770,0x7ffffebf5ba0) close(0x32) read(0x2a,0x602f60,0x10000) open(0x7ffffd5eaba0,0<O_RDONLY>,<unused>0) read(0x31,0x602f60,0x10000) read(0x2a,0x602f60,0x10000) open(0x7ffffebf5770,0x601<O_WRONLY|O_CREAT|O_TRUNC>,0x180<S_IRUSR|S_IWUSR>) close(0x2a) open(0x7ffffd5eaba0,0<O_RDONLY>,<unused>0) read(0x31,0x602f60,0x10000) open(0x7ffff73b9ba0,0<O_RDONLY>,<unused>0) read(0x2a,0x602f60,0x10000) write(0x32,0x612f60,0x13f2) open(0x7ffff73b9ba0,0<O_RDONLY>,<unused>0) close(0x31) read(0x31,0x602f60,0x10000) open(0x7ffffabd5ba0,0<O_RDONLY>,<unused>0) read(0x2a,0x602f60,0x10000) close(0x32)

Why is Blogbench slow on FreeBSD?


800000

close() read()

700000

(not distinguishing between open(O_RD) and open(O_WR) )

600000

500000

open()

400000

300000

write()
200000 100000

PostgreSQL pgbench

Initialized with -s 1000 100,000,000 records, 16 GB size

(fits in RAM, for RO) 8 GB shared buffers 32 MB work memory autovacuum off 30 checkpoint segments

PostgreSQL configuration:

PostgreSQL / pgbench caveats

WAL logging means:

data is written to checkpoint segments, restarts are needed between WRITE benchmarks data is transferred to proper storage almost unpredictably, long-ish runs are needed for repeatable benchmarks

Autovacuum can run almost unpredictably

PostgreSQL results
60000

50000

40000

30000

20000

10000

0 1 2 4 8 12 16 24 32 48 64 128

Linux disk RO avg Linux ramfs RW avg FreeBSD disk RW avg FreeBSD remote tmpfs RO avg

Linux disk RW avg Linux remote ramfs RO avg FreeBSD tmpfs RO avg

Linux ramfs RO avg FreeBSD disk RO avg FreeBSD tmpfs RW avg

PostgreSQL result notes

On Linux, there's almost no difference between ext4 when data is fully cached (warmed) and ramfs On FreeBSD, tmpfs is faster
60000 50000 40000 30000 20000 10000 0 1 2 4 8 12 16 24 32 48 64 128

Linux disk RO avg FreeBSD disk RO avg

Linux ramfs RO avg FreeBSD tmpfs RO avg

PostgreSQL result notes

Linux is consistently 5%-10% better


1200

1000

800

600

400

200

0 1 2 4 8 12 Linux disk RW avg 16 24 32 48 64 128

FreeBSD disk RW avg

PostgreSQL result notes

Remote access doesn't help much

60000

50000

40000

30000

20000

Scheduler issue

10000

0 1 2 4 8 12 16 24 FreeBSD tmpfs RO avg 32 48 64 128

Linux ramfs RO avg

Linux remote ramfs RO avg

FreeBSD remote tmpfs RO avg

However...

Different benchmark, by Florian Smeets 40-core, 80-thread system, 256 GB RAM scale 100 10,000,000 records

200000 180000 160000 140000 120000 100000 80000 60000 40000 20000 0 Threads 2

Linux wins after 32 threads


Threads 4 Threads 8 Threads 10 Threads 16 Threads 20 Threads 24 Threads 32

FreeBSD-VMC-233854-postgres-9.2-devel

Linux-kernel-3.3.0-glibc-2.15-postgres-9.2-devel

FreeBSD-head-r233892

Filebench

Dubious correctness of the port... fileserver and webproxy profiles

fileserver: Emulates simple file-server I/O activity. This workload performs a sequence of creates, deletes, appends, reads, writes and attribute operations on a directory tree. 50 threads are used by default. The workload generated is somewhat similar to SPECsfs. webproxy: Emulates I/O activity of a simple web proxy server. A mix of create-write-close, open-read-close, and delete operations of multiple files in a directory tree and a file append to simulate proxy log. 100 threads are used by default.

Local drive, NFSv3 and NFSv4

fileserver profile

900 800 700 600 500 400 300 200 100 0 Linux local MB/s Linux NFSv3 MB/s Linux NFSv4 MB/s FreeBSD local MB/s FreeBSD NFSv3 MB/s FreeBSD NFSv4 MB/s

NFSv4 is slow even on Linux

Should not be possible

webproxy profile
800

700

600

500

Possible, but improbable

400

300

200

100

0 Linux local MB/s Linux NFSv3 MB/s Linux NFSv4 MB/s FreeBSD local MB/s FreeBSD NFSv3 MB/s FreeBSD NFSv4 MB/s

Cross-benchmarking

FreeBSD server, Linux client Linux does caching...


250 200

150

100

50

0 Linux NFSv3 MB/s Linux client, FreeBSD 10 server, NFSv3 MB/s

Bullet Cache

My own cache server, presented at BSDCan 2012 :) Lots of small (32 byte 128 byte) TCP command / response transactions Multithreaded + non-blocking IO (nearly 2,000,000 TPS over Unix sockets on medium-end hardware)

Linux support incomplete, lacks epoll() Within measurement error, 9.1 == 10

600000

500000

400000

300000

200000

100000

0 CentOS 6.3 100 conn 200 conn FreeBSD 9.1 300 conn 400 conn 500 conn FreeBSD 10

Result notes

TCP is fairly concurrent in FreeBSD, multiple TCP streams do not block each other > 470,000 PPS per direction Over Unix sockets (local): 1,020,000 TPS

On tuning...

The year is 2013 and FreeBSD actually auto-tunes reasonably well Example #1: vfs.read_max Example #2: hw.em.txd/rxd Example #3: kern.maxusers Example #4: vm.pmap.shpgperproc
+ hi/lobufspace

Why no CPU benchmarks?

You are not going to influence raw CPU performance from the OS (except in edge cases / misconfiguration) You can test the quality of the libc and libm implementations... but be sure that is what you want to You can also test the quality of compiler optimizations... if you want to.

(LLVM vs GCC)

(do not draw conclusions based on this) (Phoronix benchmark, Botan/KASUMI)


Botan v1.10.3
Test: KASUMI
Mbytes/s, More Is Better OpenBenchmarking.org

GCC 4.7.2
SE +/- 0.00

39.35

GCC 4.8.0
SE +/- 0.00

37.91

LLVM Clang 3.2


SE +/- 0.00

39.34

LLVM Clang 3.3 SVN


SE +/- 0.00

65.75 15 30 45 60 75

Powered By Phoronix Test Suite 4.4.1


1. (CXX) g++ options: -m64 -ldl -lpthread -lrt

Why NOT benchmark?

When is benchmarking important?

Is performance more important than features? Is it cheaper to buy another machine than to reconfigure / develop a better OS? During 7.x timeframe: 8 CPU scalability During 10.x timeframe: 32 CPU scalability

The future is actually quite good


When you do need to benchmark?

As a sysadmin, you just want to know Planning / budgeting for a project When your boss tells you to :) Advocacy...

Continuous benchmarking?

Some talks were had... It would probably be easier to set up now after the developer cluster has been updated... Performance regression monitoring

The End

Benchmarking FreeBSD
Ivan Voras <ivoras@freebsd.org>

S-ar putea să vă placă și