
Mongrel2 Manual Installing, Deploying, Managing, Hacking

Zed A. Shaw
Guillermo O. Tordek Freschi

July 2010


Contents

Preface

1 Introduction
  1.1 Language Agnostic
  1.2 Asynchronous
  1.3 Message Protocol
  1.4 Application Oriented
  1.5 Automated Management
  1.6 Using This Manual

2 Installing
  2.1 Install Dependencies
  2.2 Building Mongrel2
    2.2.1 Using the .tar.bz2 File
    2.2.2 Using git
  2.3 Building And Installing
    2.3.1 Other platforms than Linux
  2.4 Testing The Installation
  2.5 Upgrading from trunk
  2.6 Up Next

3 Managing
  3.1 Model-View-Controller
  3.2 Trying m2sh
    3.2.1 What The Hell Just Happened?
  3.3 A Simple Configuration File
  3.4 How A Config Is Structured
    3.4.1 Server
    3.4.2 Host
    3.4.3 Route
    3.4.4 Dir
    3.4.5 Proxy
    3.4.6 Handler
    3.4.7 Others
  3.5 A More Complex Example
  3.6 Routing And Host Patterns
    3.6.1 How Routing Works
    3.6.2 JSON/XML Message Routing Syntax
  3.7 Deployment Logs And Commits
  3.8 Control Port
  3.9 Multiple Servers
  3.10 Tweakable Expert Settings
  3.11 SSL Configuration
    3.11.1 Experimental SSL Caching
  3.12 Configuring Filters (BETA)

4 Deploying
  4.1 Mongrel2 Deployment Requirements
    4.1.1 Introducing procer
    4.1.2 Installing procer
  4.2 The Plan
  4.3 Step 1: The Deployment Area
  4.4 Step 2: The mongrel2.org Configuration
  4.5 Step 3: Setup procer
    4.5.1 The Python Examples
    4.5.2 Testing The New Setup
    4.5.3 Nice Features of Procer
  4.6 Step 4: Static Content
  4.7 Step 5: Testing And Troubleshooting
  4.8 Further Improvements
  4.9 Deployment Tips

5 Hacking
  5.1 Front-end Goodies
    5.1.1 HTTP
    5.1.2 Proxying
    5.1.3 WebSockets
    5.1.4 JSSocket
    5.1.5 Long Poll
    5.1.6 Streaming
    5.1.7 N:M Responses
    5.1.8 Async Uploads
  5.2 Introduction to ZeroMQ
  5.3 Handler ZeroMQ Format
    5.3.1 Socket Types Used
    5.3.2 UUID Addressing
    5.3.3 Numbers Identify Listeners
    5.3.4 Paths Identify Targets
    5.3.5 Request Headers And Body
    5.3.6 Complete Message Examples
    5.3.7 TNetStrings Alternative Protocol
    5.3.8 Python Handler API
  5.4 Basic Handler Demo
  5.5 Async File Upload Demo
  5.6 MP3 Streaming Demo
  5.7 Chat Demo
  5.8 Writing A Filter (BETA)
  5.9 Other Language APIs
  5.10 Writing Your Own m2sh
  5.11 Config From Anything: Experimental

6 Contributing

Preface
This manual will tell you about the most awesome webserver on the planet: Mongrel2. It is written for people with a sense of humor who want to get things done with Mongrel2. That means, if you're an operations professional, software developer, hacker or just curious, it's for you. However, if you're too serious and think flowery language (A.K.A. good, entertaining writing) does not belong in your software manuals, then you should just go read the source code and save everyone a huge headache dealing with you. In case you haven't figured it out, this book will be fun and slightly obnoxious. That's not intended to insult you, but just to keep you interested so that you want to read it.

Typography
Usually the people running the web can be divided into three types of people: Steves, Edsgers, and Knuths.

The Steves think that the entire internet should be a wonderful user experience where all pages are crafted with pixel-perfect fonts with high gloss visuals and coated with the most happy happy joy joy of all possible experiences. To them, design is paramount and actual stability isn't important unless it interferes with design. The Steves of the internet think the Edsgers of the internet are destroying the universe with things like functionality, security, and stability. Just like the real Steve Jobs, they would rather everything look fantastic and then use awesome marketing to cover up any technical flaws.

The Edsgers feel that the internet is completely unsafe, and until it is a fully curated and crafted set of academic, peer reviewed papers, it will be a festering pile of dung. To the Edsgers, the world is dangerous and only a truly paranoid attitude toward security and stability will ensure that it becomes safe. They want every single piece of software to reject all reality and be crafted from nothing but pure mathematics, and hate the fact that the Steves want to run around painting the world with useless frivolous colors and words and things that lead to ambiguity and happiness.

The typography in this book, and the entire project, is for the Knuths of the world. I like to think of the Knuths as the practical yet professional types with a light sense of humor. They are the ones who are getting things done while still balancing between great typography and solid bug-free functionality. They aren't zealots, but practical, straightforward types of people. That is why this book is written in TeX, and why it uses whatever fonts TeX uses.

Chapter 1

Introduction
Mongrel2 is a web server. HTTP requests come in, HTTP responses go out. Request, response. There is nothing revolutionary or extravagant in what Mongrel2 does with a browser, apart from supporting fancy asynchronous socket protocols. To the browser, Mongrel2 is just this nice web server that has WebSockets and Flash Sockets in it. That's it.

What makes Mongrel2 special is how it satisfies these requests in a language agnostic and asynchronous way, using a simple messaging protocol to talk to applications, not just serve files. Mongrel2 is also designed to be incredibly easy to manage automatically as part of your infrastructure.

Other web servers do some of these things, but they either do them in a bastardized way or not all of them at once. Plenty of language specific web servers like Node.js and Jetty have asynchronous operation, but they're not language agnostic [1]. Other web servers will let you talk to any language as a backend, but they insist on using HTTP proxying or FastCGI, which is not friendly to asynchronous operations. Mongrel2 is the only web server I know of that actively tries to focus on these features as a cohesive whole.

Note 1: TL;DR!

Don't want to read the manual? You can read the Getting Started page, available in many languages even. It's a fast crash course in getting Mongrel2 up and running.

[1] Who the hell wants to code Javascript all day? Yuck.


1.1 Language Agnostic


The term language agnostic came from people who read about Mongrel2 in the early days, and it means that Mongrel2 does not try to promote any one language over any others. Mongrel2 does not care if you run a Python shop, or if you're a die-hard PHP fan, or if you hate PHP and love only Ruby on Rails. Mongrel2 only knows about HTTP requests, HTTP responses, async messages, and getting them to your gear to meet those requirements.

Language Agnosticism is the most important feature of Mongrel2, and its entire purpose stems from the desire to reduce the amount of programming language religion in the world. Real people want to get things done, not wanker on which technology is the best or force other people to use their favorite toys. Instead, Mongrel2 works to just be great for every language and make it easy to use what works best for a given problem.

1.2 Asynchronous
Many web servers are asynchronous internally, and some force you to know way too much about how they work internally to get anything done. What makes Mongrel2's version of asynchronous messaging different is that it extends outside the Mongrel2 server. This is a powerful concept: even your backends can operate asynchronously using simple identification of connected clients.

Other servers assume that every request is received from a browser, then sent to a backend, and then directly sent out to the client and that's it. Mongrel2 assumes that there is a connected client, and it sends requests to backends, but it makes no assumptions about how those backends respond to the clients. All it requires is that the backend application send messages addressed to the client and it will write them on the socket. Because of this design, Mongrel2 can easily house classic HTTP clients, keep-alive style HTTP clients, chunked encoding responses, JSSockets, or WebSockets using the same code.

1.3 Message Protocol


In order to properly do asynchronous messaging in a language agnostic way, Mongrel2 needed a good base protocol that allowed for different messaging styles and worked with many different languages. HTTP proxying already does this, although it's not asynchronous at all. What gives Mongrel2 its special powers is ZeroMQ, a language- and transport-mechanism-agnostic messaging system that does not require a centralized messaging server to operate. Using ZeroMQ lets Mongrel2 talk to a huge number of languages, operate within any kind of network architecture, and do it with a very simple communication model and API that most programmers can understand.

1.4 Application Oriented


Web servers today are written as if it was still 1995 and all anyone needs to do is serve files, maybe some graphics. Today's web applications are not about serving files; they're about serving application logic and doing it asynchronously. The advent of the bewildering numbers of ways to hack HTTP into an async messaging protocol [3] is proof enough that the pressure is on for web servers to be for applications with highly interactive interfaces.

Mongrel2 can still serve files just fine. In fact, it's got very accurate and easy-to-understand file serving code. However, Mongrel2 will always be about applications. Fast, scalable, awesome, asynchronous or synchronous applications that need to use languages that mere mortals can work with, like PHP. If there's ever a choice, apps win.

1.5 Automated Management


The language agnostic philosophy even extends to the configuration system, where you can use any language you need to configure it and manage it, as long as the result is a SQLite3 database that Mongrel2 can read and work with. There are great tools for managing this database already written in Python and C, but if you hate Python or C then you can write anything you want. This pattern is established with servers like Postfix, Exim, Sendmail, qmail, and others, that convert configuration files to half-assed SQL databases.

Mongrel2 effectively adopts a Model-View-Controller design for its configuration system, the same way every web application is designed today. The Model is a SQLite3 database file, which any programming language can access. The Controller is a Mongrel2 process that reads this file and sets itself up accordingly. The View is a C binary (with no dependencies other than SQLite3 and ZeroMQ) called m2sh that gives you a command line UI to configure and set up the Mongrel2 SQLite model. It gives you commands for managing it, crafting configurations, looking at them; the works.
[3] Comet, long poll, Juggernaut, etc.

But, most importantly, you can write your own. You don't have to wait for a Mongrel2 developer to craft a configuration file parser for your favorite language, or use some hack job Nagios Perl junk to automate or scan it. It's SQLite3 with a solid, simple schema and even a well written Python and/or C code example showing you how it works. Nothing stops you from automating the hell out of Mongrel2 with that.
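Since the Model is just a SQLite3 file, a few lines in any language with SQLite bindings are enough to poke at it. Here is a minimal sketch using Python's standard sqlite3 module. It deliberately assumes nothing about Mongrel2's schema: it asks SQLite which tables exist and returns their rows. The function name dump_config is made up for this illustration and is not part of Mongrel2 or m2sh.

```python
import sqlite3


def dump_config(db_path):
    """Return {table_name: [row_dict, ...]} for every table in the database.

    Hypothetical helper for illustration; it knows nothing about the
    Mongrel2 schema and simply reports whatever tables it finds.
    """
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row  # lets us turn rows into dicts by column name
    try:
        tables = [row["name"] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        result = {}
        for table in tables:
            # Table names come straight from sqlite_master, so we only quote them.
            rows = conn.execute('SELECT * FROM "%s"' % table).fetchall()
            result[table] = [dict(row) for row in rows]
        return result
    finally:
        conn.close()
```

Point it at a database built by m2sh, for example dump_config("tests/config.sqlite"), and you get plain dictionaries you can feed into whatever deployment or monitoring script you like.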

1.6 Using This Manual


This manual is intended to be fun to read, so probably the best way to use it is to actually read it. I know! Revolutionary, right? I mean, who has time to read and learn about something these days. You just want to get in there and get whatever problem you have done, now! No time for words. You just want a straight dump right into your brain so that you are able to solve all your problems instantly and screw all this talking.

You ever ask yourself if this attitude about not wanting to read and learn is possibly the reason you always get stuck in emergencies with no time to read and learn? Something to think about.

My recommendation is that you go through every page of this manual and do the stuff in it. Even if you think you won't need something, because you're not a programmer, or you're not in operations, you should learn it. Doing so will make the parts you do need clearer and give you better ideas for later.

Chapter 2

Installing
Mongrel2 is designed to build on most modern Unix systems, specifically Linux and Mac OSX. It is written in C (not Ruby) and uses fairly vanilla C and standard libraries, except for one piece that implements the internal coroutines. Other than this, you should be able to compile and install Mongrel2 with nothing more than make all install after you've installed all the dependencies.

Now, if when I said dependencies you started to groan at having to install software to use my software, well my friend, welcome to the future. You said you don't want people reinventing the wheel, right? Great, that means you need to install software for my software to work. It's either that or wait 10 years for me to build everything from scratch like some arrogant jackass. We good now? Great, let's get started.

2.1 Install Dependencies


To get everything working you will need the following dependencies:

- GNU make (gmake).
- ZeroMQ 2.1.4 for the messaging.
- SQLite3.

If you install these things in this order, then everything should be good. Since every system is different, it is difficult to tell you exactly how to install required packages for your OS, but here's how I did it on my computer:

Source 1: Installing Dependencies on ArchLinux

    # install ZeroMQ
    wget http://download.zeromq.org/zeromq-2.1.4.tar.gz
    tar -xzvf zeromq-2.1.4.tar.gz
    cd zeromq-2.1.4/
    ./configure
    make
    sudo make install

    # install sqlite3
    sudo pacman -S sqlite3

If you run into parts that your OS is missing, which is likely on Debian and SuSE systems, then you'll have to go and figure out how to install them.

Some distributions (like Ubuntu) split dev and runtime packages. In order to build Mongrel2 on these distros, you must install libsqlite3-dev: this package contains sqlite3.h, which Mongrel2 needs during compilation. For the lazy, the command is:

    sudo apt-get install libsqlite3-dev

Other pieces known to be missing on Ubuntu-like systems:

- uuid-runtime: Needed by the m2sh uuid command.

2.2 Building Mongrel2


If everything went well you should be able to grab the Mongrel2 source and try building it. There are two ways you can get the source code to Mongrel2:

1. Install Git and check out the source.
2. Grab the source .zip or .tar.bz2 release and install it from there.

2.2.1 Using the .tar.bz2 File

The easiest way to build Mongrel2 is to use the .tar.bz2 file from the main page downloads section. Simply download it and you're done.

2.2.2 Using git

If you like living on the edge then here's how to follow the development source tree while we work on it:

First, get git running on your system, either through your package manager or by fetching the proper sources or binaries from the Git Download page. Once you have git you can then get the Mongrel2 source and open it up:

Source 2: Cloning the Mongrel2 Source

    git clone https://github.com/zedshaw/mongrel2.git
    cd mongrel2

Make sure you do this in order (just like with every set of instructions you follow) or else you'll get errors.

2.3 Building And Installing


Once you have the source ready to go you can build it and then install it with one command:

    make all install

There is no ./configure for Mongrel2 since we avoid too many OS specific differences or shield those away with good feature checks in the code. If you want to install to a different location than the default /usr/local, use this instead:

    PREFIX=/path/to/somewhere make all install

The end result of this should be:

1. Mongrel2 builds and compiles without errors.
2. All the unit tests run. [1]
3. The m2sh binary gets installed.
4. The mongrel2 binary gets installed.

If any of these stages fail, then you can simply try to fix them and then run:

    make clean all && sudo make install

which will do everything all over again.

[1] Please tell us about failures.

2.3.1 Other platforms than Linux

If you aren't running Linux, chances are good this standard procedure will not work for you. The Makefile lists several targets for various platforms. As of this writing they are:

- freebsd
- netbsd
- openbsd
- macports

So, for example, on FreeBSD you would probably install zeromq and sqlite3 as ports and then compile Mongrel2 like so:

Source 3: Installing Mongrel2 on FreeBSD

    # install ZeroMQ
    cd /usr/ports/devel/zmq
    make
    make install

    # install sqlite3
    cd /usr/ports/databases/sqlite3
    make
    make install

    # install mongrel2
    cd /where/you/extracted
    gmake freebsd install

2.4 Testing The Installation


When you are done, you probably want to make sure that it installed correctly. There's a test configuration file in tests/config.sqlite that you can use to try it out:

Source 4: First Test Run

    mkdir run
    mkdir logs
    mkdir tmp
    m2sh servers -db tests/config.sqlite
    m2sh start -db tests/config.sqlite -host localhost

That's it. Just hit CTRL-c for now and we'll get into playing with this setup later.
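If you'd rather have a script confirm that something is actually answering before you hit CTRL-c, here is a small sketch to run from a second terminal. It uses only Python's standard library. The probe function is a made-up helper, and port 6767 is taken from the test configuration used elsewhere in this manual; adjust it if your server listens somewhere else.

```python
import urllib.error
import urllib.request


def probe(url, timeout=2.0):
    """Return the HTTP status code the server answers with, or None if unreachable.

    Hypothetical helper for illustration, not part of Mongrel2.
    """
    # An empty ProxyHandler ignores system proxy settings: we want to talk
    # to the local server directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, just not with a 2xx (a 404, say): still alive.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, and friends: nothing is listening.
        return None
```

Calling probe("http://localhost:6767/") while the server is up should give you a status code; once you stop the server it should give you None.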

2.5 Upgrading from trunk

Source 5: Update your checkout

    cd mongrel2
    git pull
    # make sure you get a clean build
    make clean all
    # install it once again
    sudo make install

2.6 Up Next
You now should have a working Mongrel2 system installed and the m2sh configuration interface ready to go. In the rest of this manual we'll be simply learning how to do more with Mongrel2, like making our own configs, writing handlers, and other fun stuff.


Chapter 3

Managing
Mongrel2 is designed to be easy to deploy and to make that deployment easy to automate. This is why it uses SQLite to store the configuration, but m2sh as an interface for creating the configuration. Doing this lets you access the configuration using any language that works for you, augment it, alter it, migrate it and automate it.

In this chapter, I'm going to show you how to make a basic configuration using m2sh and all the commands that are available. You'll learn how the configuration system is structured so that you know what goes where, but in the end it's just a simple storage mechanism.

3.1 Model-View-Controller
When you hear Model-View-Controller, you think about web applications. This is a design pattern where you place different concerns into different parts of your system and try not to mix them too much. For an interactive application, if you keep the part that stores data (Model) separated from the logic (Controller) and use another piece to display and interact with the user (View), then it's easier to change the system and adapt it over time to new features.

The power of MVC is simply that these things really are separate orthogonal pieces that get ugly if they're mixed together. There's no math or theory that says why; just lots of experience has told us it's usually a bad idea. When you start mixing them, you find out that it's hard to change for new requirements later, because you've sprinkled logic all over your web pages. Or you can't update your database because there's all these stored procedures that assume the tables are a certain way.

Note 2: Apparently SQL Inspires FUD

When I first started talking about Mongrel2, I said I'd store the configuration in SQLite and do a Model-View-Controller kind of design. Immediately, people who can't read flipped out and thought this meant they'd be back in Windows registry hell, but with SQL as their only way to access it. They thought that they'd be stuck writing configurations with SQL; that SQL couldn't possibly configure a web server.

They were wrong on many levels. Nobody was ever going to make anyone use SQL. That was repeated over and over but, again, people don't read and love spreading FUD. The SQLite config database is nothing like the Windows Registry. No other web server really uses a true hierarchy; they just cram a relational model into a weirdo configuration format. The real goal was to make a web server that was easy to manage from any language, and then give people a nice tool to get their job done without having to ever touch SQL. EVER!

In the end, what we got despite all this fear mongering is a bad ass configuration tool and a design that is simple, elegant, and works fantastically. If you read that Mongrel2 uses SQLite and thought this was weird, well, welcome to the future. Sometimes it's weird out here (even though Postfix has been doing this for a decade or more).

Mongrel2 needed a way to allow you to use various languages and tools to automate its configuration. Letting you automate your deployments is the entire point of the server. The idea was that if we gave you the Controller and the Model, then you can craft any View you wanted, and there's no better Model than a SQL database like SQLite: it's embeddable, easily accessed from C or any language, portable, small, fast enough and full of all the features you need and then some.

What you are doing when you use m2sh (from tools/m2sh) to create a configuration for Mongrel2 is working with a View we've given you to create a Model for the Mongrel2 server to work with. That's it, and you can create your own View if you want. It could be automated deployment scripts, a web interface, monitoring scripts, anything you need.

The point is, if you just want to get Mongrel2 up and running, then use m2sh. If you want to do more advanced stuff, then get into the configuration database schema and see what you can do. The structure of the database very closely matches Mongrel2's internal structure, so understanding that means you understand how Mongrel2 works. This is a vast improvement over other web servers like Apache where you've got no idea why one stanza has to go in a particular place, or why information has to be duplicated. With Mongrel2, it's all right there.

3.2 Trying m2sh


To give this configuration system a try, you just need to run the test configuration used in the unit tests. Let's try doing a few of the most basic commands with this configuration. First, make sure you are in the mongrel2 source and you've run the build so that you get the tests/config.sqlite file primed. This is our base test case that we use in unit testing. After you have that, do this:

Source 6: Sample m2sh Commands

    # get a list of the available servers to run
    m2sh servers -db tests/config.sqlite

    # see what hosts a server has
    m2sh hosts -db tests/config.sqlite -server test

    # find out if a server named test is running
    m2sh running -db tests/config.sqlite -name test

    # start a server whose default host is localhost
    m2sh start -db tests/config.sqlite -host localhost

At this point, you should have seen lists of servers and hosts, seen that mongrel2 is not running, and then started it. You can find out about all the commands and get help for them with m2sh help or m2sh help command. You can now try doing some simple starting, stopping and reloading using sudo (make sure you CTRL-c to exit from the previous start command):

Source 7: Starting, Stopping, Reloading

    # start it so it runs in the background via sudo
    m2sh start -db tests/config.sqlite -host localhost -sudo
    tail logs/error.log

    # reload it
    m2sh reload -db tests/config.sqlite -host localhost
    tail logs/error.log

    # hit it with curl to see it do the reload
    curl http://localhost:6767/
    tail logs/error.log

    # see if it's running then stop it
    m2sh running -db tests/config.sqlite -host localhost
    m2sh stop -db tests/config.sqlite -host localhost

Awesome, right? Using just this one little management tool you are able to completely manage a Mongrel2 instance without having to hack on a config file at all. But you probably need to know how this is all working anyway.

3.2.1 What The Hell Just Happened?

You now have done nearly everything you can to a configuration, but you might not know exactly what's going on. Here's an explanation of what's going on behind the scenes:

1. When you did m2sh start with the -sudo option, it actually runs sudo mongrel2 tests/config.sqlite localhost to start the server.
2. Mongrel2 is now running in the background as a daemon process, just like a regular server. However, what it did was chroot to the current directory and then drop privileges so that they match the owner of that directory (you). Use ps aux to take a look.
3. With Mongrel2 running, you can look in the logs/error.log file to see what it said. It should be a bunch of debug logging, but check out the messages: nice and detailed.
4. Next you did a soft reload with m2sh reload and you should notice that your mongrel2 process was able to load the new config without restarting.
5. However, there's a slight bug that doesn't do the reload until the next request is served. That's what the curl http://localhost:6767/ was for.
6. Now that you can see this reload work in logs/error.log, you used m2sh running to see if it's running. This command is just reading the config database to find out where the PID file is (run/mongrel2.pid) and then checking if that process is running.
7. Finally, you tell mongrel2 to stop, and since it dropped privileges to be owned by you, you can do that without having to use sudo.

All of this is happening by reading the tests/config.sqlite file and not reading any configuration files. You can now try building your own configuration that matches this one or some others.
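The check in step 6 is easy to reproduce yourself. Here is a sketch of the usual Unix idiom: read a number out of the PID file and send it signal 0, which tests for the process without disturbing it. This is an assumption about the general technique, not a copy of what m2sh does internally, and pid_is_running is a made-up name.

```python
import os


def pid_is_running(pid_file):
    """Return True if pid_file names a live process, False otherwise.

    Unix only: on other platforms os.kill behaves differently.
    Hypothetical helper for illustration, not m2sh's actual code.
    """
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        return False  # no PID file, or it does not contain a number
    if pid <= 0:
        return False  # 0 and negatives address process groups, not one process
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return False  # stale PID file: nobody home
    except PermissionError:
        return True  # it exists, it just belongs to another user
    return True
```

From the directory you started the server in, pid_is_running("run/mongrel2.pid") should flip from True to False once you run m2sh stop.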


3.3 A Simple Configuration File


To configure a new config database you'll write a file that looks a lot like a configuration file. It looks like a Python file, because it comes from the first m2sh we wrote in Python (living in examples/python), but now it's written in C. Even though it was rewritten, we managed to keep the same format and even make it a little easier by making commas optional in most places.

First you load your configuration into a fresh database using m2sh load. For our example, we'll use the example configuration from examples/configs/sample.conf to make a simple one:

Source 8: Simple Little Config Example

    main = Server(
        uuid="f400bf85-4538-4f7a-8908-67e313d515c2",
        access_log="/logs/access.log",
        error_log="/logs/error.log",
        chroot="./",
        default_host="localhost",
        name="test",
        pid_file="/run/mongrel2.pid",
        port=6767,
        hosts = [
            Host(name="localhost", routes={
                '/tests/': Dir(base='tests/', index_file='index.html',
                               default_ctype='text/plain')
            })
        ]
    )

    servers = [main]

If you aren't familiar with Python, then this code might look freaky, but it's really simple. We'll get into how it's structured in a second, but to load this file we would just do this:

Source 9: Loading The Simple Config

    m2sh load -config examples/configs/sample.conf
    ls -l config.sqlite
    m2sh servers
    m2sh hosts -server test
    m2sh start -name test

Notice that we didn't have to tell m2sh that the database was config.sqlite. It assumes that is the default, as well as that mongrel2.conf is the config file you want. If you use those two files, then you never have to type those parameters again.

With this sequence of commands you:

1. Create a fresh config database named config.sqlite and load sample.conf into it.
2. List the servers it has configured.
3. List the hosts that server has, with what routes it has.
4. Start this server to try it out.

By now you should be getting the hang of the pattern here, which is to use m2sh and a configuration script to generate .sqlite files that Mongrel2 understands.

3.4 How A Config Is Structured


The base structure of a Mongrel2 configuration is:

  Server: This is the root of a config, and you can have multiples of these in one database, even though each start command only runs one at a time.
  Host: Servers have Hosts inside them, which say what DNS hostname Mongrel2 should answer for. You can have multiples of these in each Server.
  Route: Hosts have Routes in them, which tell Mongrel2 what to do with URL paths and patterns that match them. Routes then have Dir, Handler or Proxy items in them.
  Dir: A Dir serves files out of a directory, complete with 304 and ETag support, default content types, and most of the things you need to serve them.
  Proxy: A Proxy takes requests matching the Route it's attached to and sends them to another HTTP server somewhere else. Mongrel2 will then act as a full proxy and also try to keep connections open in keep-alive mode if the browser supports it.
  Handler: A Handler is the best part of Mongrel2. It takes HTTP requests, and turns them into nicely packed and processed ZeroMQ messages for your asynchronous handlers.

Each of these nested objects then has a set of attributes you can use to configure them, and most of them have reasonable defaults.
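To make the nesting concrete, here is a sketch that models it with plain Python dataclasses and walks it from Server down to each target. These classes are hypothetical stand-ins written for this illustration: they are not what m2sh uses, they carry only a handful of attributes, and the /legacy/ route with its backend address is invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Dir:
    base: str


@dataclass
class Proxy:
    addr: str
    port: int


@dataclass
class Handler:
    send_spec: str
    recv_spec: str


# A route points at exactly one of these three kinds of target.
Target = Union[Dir, Proxy, Handler]


@dataclass
class Host:
    name: str
    routes: Dict[str, Target] = field(default_factory=dict)


@dataclass
class Server:
    name: str
    port: int
    hosts: List[Host] = field(default_factory=list)


def describe(server):
    """Yield one 'server -> host -> route -> target' line per route."""
    for host in server.hosts:
        for path, target in sorted(host.routes.items()):
            yield "%s:%d -> %s -> %s -> %s" % (
                server.name, server.port, host.name, path, type(target).__name__)


main = Server(name="test", port=6767, hosts=[
    Host(name="localhost", routes={
        "/tests/": Dir(base="tests/"),
        "/legacy/": Proxy(addr="127.0.0.1", port=8080),  # invented backend
    }),
])

for line in describe(main):
    print(line)
```

The shape is the whole point: one Server owns Hosts, each Host owns Routes, and each Route ends in a single Dir, Proxy or Handler.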


3.4.1 Server

The server is all about telling Mongrel2 where to listen on its port, where to chroot, and general server specific deployment gear.

  uuid: A UUID is used to make sure that each deployed server is unique in your infrastructure. You could easily use any string that's letters, numbers, or - characters.
  chroot: This is the directory that Mongrel2 should chroot to before it drops privileges.
  access_log: The access log file relative to the chroot. Usually starts with a /. Make sure you configure your server so that this and other files aren't accessible, or make this owned by root.
  error_log: The error log file, just like access_log.
  pid_file: Like access_log, the location within the chroot directory where the PID file is stored.
  default_host: The server has a bunch of hosts listed, but it needs to know what the default host is. This is also used as a convenient way to refer to this Server.
  bind_addr: The IP address to bind to, default is 0.0.0.0.
  port: The port the server should listen on for new connections.
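If you script your configs, you will want to mint a uuid per server. Here is a small sketch using Python's standard uuid module, plus a check that mirrors the "letters, numbers, or - characters" rule described for the uuid attribute. Both function names are made up, and whether Mongrel2 itself enforces that rule is an assumption, not something this section states.

```python
import re
import uuid

# Letters, digits and "-" only, following the description of the uuid attribute.
_SAFE_ID = re.compile(r"[A-Za-z0-9-]+")


def new_server_uuid():
    """Return a random UUID string to use as a Server's uuid attribute."""
    return str(uuid.uuid4())


def is_valid_server_id(value):
    """True if value is non-empty and uses only letters, digits and '-'."""
    return bool(_SAFE_ID.fullmatch(value))
```

Generate one value per deployed server and keep it stable across reloads, since the whole point is that it identifies that server in your infrastructure.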

3.4.2 Host

A host is matched using a kind of inverse route that matches the ending of Host: headers against a pattern. You'll see how this works when we talk about routes, but for now you just need to know that requests to the Server.port are routed based on the Host configurations the Server contains.

  name: The name that you use to talk about this Host in the server configuration.
  matching: This is a pattern that's used to match incoming Host headers for routing purposes.
  server: If you want to set the server separately you can use this attribute.
  maintenance: This will be a setting in the future that will let you have Mongrel2 throw up a maintenance page for this host.
  routes: This is a dict (hashmap) of the URL patterns mapped to the targets that should be run.
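The "matches the ending" idea can be sketched in a few lines. This is a simplification under stated assumptions: it compares plain string suffixes instead of using Mongrel2's pattern language, it picks the longest matching suffix, and it falls back to a default host when nothing matches. None of that is taken from Mongrel2's source, and pick_host is a made-up name.

```python
def pick_host(host_header, matchers, default_host=None):
    """Pick the configured host whose suffix matches the Host: header.

    matchers maps a literal suffix (e.g. "example.com") to a host name.
    Hypothetical sketch: plain suffixes stand in for real patterns, and
    IPv6 literals such as "[::1]:6767" are not handled.
    """
    # Drop an optional ":port" and ignore case, as hostnames are case-insensitive.
    hostname = host_header.split(":", 1)[0].strip().lower()
    best_suffix, best_name = "", default_host
    for suffix, name in matchers.items():
        if hostname.endswith(suffix.lower()) and len(suffix) > len(best_suffix):
            best_suffix, best_name = suffix, name
    return best_name
```

With matchers for "example.com" and "api.example.com", a header of "api.example.com" lands on the more specific entry, while "www.example.com:6767" lands on the general one.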


3.4.3 Route

The Route is the workhorse of the whole system. It uses some very fancy but still simple code in Mongrel2 to translate Host: headers to Hosts and URL paths to Handlers, Dirs, and Proxies.

  path: This is the path pattern that matches a route. The pattern uses the Mongrel2 pattern language, which is a reduced version of the Lua pattern matching system.
  reversed: Determines if this pattern is reversed, which is useful for matching file extensions, hostnames, and other naming systems where the ending is really the prefix. Usually you don't set this.
  host: You can use this attribute to set the host manually.
  target: This is the target that should handle the request, either a Dir, Handler or Proxy.

Later on, you'll learn about the pattern matching that's used. It's basically a stripped down version of your normal regular expressions, with a few convenient syntaxes for doing simple string matching. When you configure a route, you write something like /images/(.*.jpg) and the part before the ( is used as a fast matched prefix, while the part after it is considered a pattern to match. When a request comes in, Mongrel2 quickly finds the longest prefix that matches the URL, and then tests its pattern if there is one. If the pattern matches, the request goes through. If not, 404.
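The prefix-then-pattern lookup described above can be sketched like this. Big caveats: Python's re module stands in for the Mongrel2 pattern language (they are not the same thing), the pattern is tested against whatever follows the prefix, and a linear scan stands in for whatever fast prefix lookup Mongrel2 really uses. The function names are made up.

```python
import re


def split_route(route):
    """Split '/images/(.*.jpg)' into the prefix '/images/' and pattern '(.*.jpg)'."""
    index = route.find("(")
    if index == -1:
        return route, None  # plain prefix route, no pattern part
    return route[:index], route[index:]


def match_route(url_path, routes):
    """Return the target for url_path, or None (which would mean a 404).

    routes maps route strings to targets. Hypothetical sketch only.
    """
    candidates = []
    for route, target in routes.items():
        prefix, pattern = split_route(route)
        if url_path.startswith(prefix):
            candidates.append((len(prefix), prefix, pattern, target))
    if not candidates:
        return None
    # The longest prefix wins, and only that route's pattern gets tested.
    _, prefix, pattern, target = max(candidates, key=lambda c: c[0])
    if pattern is None:
        return target
    rest = url_path[len(prefix):]
    return target if re.fullmatch(pattern, rest) else None
```

Note that a failed pattern does not fall back to a shorter prefix: with routes for /images/(.*.jpg) and /, a request for /images/cat.png gets nothing, which matches the "If not, 404" behavior described above.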

3.4.4 Dir

A Dir is a simple directory-serving route target that serves files out of a directory. It has caching built in and handles If-Modified-Since, ETags, and all the various bizarre HTTP caching mechanisms as RFC-accurately as possible. It also has default content types and index files.

base: This is the base directory from the chroot that is served. Files should not be served outside of this base directory, even if they're in the chroot.
index_file: This is the default index file to use if a request doesn't give one. The Dir will also do redirects if a request for a directory doesn't end in a slash.
default_ctype: The default Content-Type to use if none matches the MIMEType table.

Currently, we don't offer more parameters for configuration, but eventually you'll be able to tweak more and more of the settings to control how Dirs work.


3.4.5 Proxy

A Proxy is used so that you can use Mongrel2 but not have to throw out your existing infrastructure. Mongrel2 goes to great pains to make sure that it implements a fast and dead-accurate proxy system internally, but no matter how good it is, it can't compete with ZeroMQ handlers. The idea with giving Proxy functionality is that you can point Mongrel2 at existing servers, and then slowly carve out pieces that will work as handlers.

addr: The DNS address of the server.
port: The port to connect to.

Requests that match a Proxy route are still parsed by Mongrel2's incredibly accurate HTTP parser, so your backend servers should not be receiving badly formatted HTTP requests. Responses from a Proxy server, however, are sent unaltered to the browser directly.

3.4.6 Handler

Now we get to the best part: the ZeroMQ Handlers that will receive asynchronous requests from Mongrel2. You need to use the ZeroMQ syntax for configuring them, but this means that with one configuration format you can use handlers that are using UDP, TCP, Unix, or PGM transports. Most testing has been done with TCP transports.

send_spec: This is the 0MQ sender specification. Something like tcp://127.0.0.1:9999 will use TCP to connect to a server on 127.0.0.1 at port 9999. The type of socket used is a PUSH socket, so that handlers receive messages in round-robin style.
send_ident: This is an identifier (usually a UUID) that will be used to register the send socket. This makes it so that messages are persisted between crashes.
recv_spec: Same as the send_spec, but it's for receiving responses from Handlers. The type of socket used is a SUB socket, so that a cluster of Mongrel2 servers will receive handler responses but only the one with the right recv_ident will process them.
recv_ident: This is another UUID if you want the receive socket to subscribe to its messages. Handlers properly mention the send_ident on all returned messages, so you should either set this to nothing and not subscribe, or set it to the same value as send_ident.

The interesting thing about the Handler configuration is that you don't have to say where the actual backend handlers live. Did you notice you aren't declaring large clusters of proxies, proxy selection methods, or anything else, other than two 0MQ endpoints and some identifiers? This is because Mongrel2 is binding these sockets and listening. Mongrel2 doesn't actively connect to backends; they connect to Mongrel2. This means, if you want to fire up 10 more handlers, you just start them; no need to restart or reconfigure Mongrel2 to make them active.

3.4.7 Others

There's also Log, MIMEType, and Setting objects/tables you can work with, but we'll get into those later, since you don't need to know about them to understand the Mongrel2 structure.

3.5 A More Complex Example


All of this knowledge about the Mongrel2 configuration structure can now be used to take a look at a more complex example. I'll just say what's going on, and you try to match what I'm saying to the code. Here's the examples/configs/mongrel2.conf file (Source 10). If you haven't guessed yet, this configuration is what's used on http://mongrel2.org to configure the main test system. In it we've got the following things to check out:

1. Our basic server, with a default host of mongrel2.org.
2. The route targets are separated out into their own variables, unlike the sample conf.py file where they're just tossed into one big structure.
3. First target is a Dir that serves up files out of the tests directory and uses index.html as its default file.
4. Next we set up a Proxy pointing at the main website's server for testing the proxy.
5. Then there's a Dir target for the http://mongrel2.org:6767/chatdemo/ that we'll look at later. You MUST have Flash for this to work!
6. And you have the Handler for the same chat demo that does the actual logic of a chat system.
7. After that's a little Handler for testing out doing HTTP requests to a handler. Notice how, even though the chat demo and this handler use different protocols (the chat demo is using JSSockets), you don't have to tell Mongrel2 that? It figures it out based on how they're being used rather than by configuration.

Source 10: Mongrel2.org Config Script

    # here's a sample directory
    test_directory = Dir(base='tests/',
                         index_file='index.html',
                         default_ctype='text/plain')

    # a sample proxy route
    web_app_proxy = Proxy(addr='127.0.0.1', port=8080)

    chat_demo_dir = Dir(base='examples/chat/static/',
                        index_file='index.html',
                        default_ctype='text/plain')

    # a sample of doing some handlers
    chat_demo = Handler(send_spec='tcp://127.0.0.1:9999',
                        send_ident='54c6755b-9628-40a4-9a2d-cc82a816345e',
                        recv_spec='tcp://127.0.0.1:9998', recv_ident='')

    handler_test = Handler(send_spec='tcp://127.0.0.1:9997',
                           send_ident='34f9ceee-cd52-4b7f-b197-88bf2f0ec378',
                           recv_spec='tcp://127.0.0.1:9996', recv_ident='')

    profiler = Filter(
        name="/home/tordek/src/C/mongrel2/tools/filters/profiler.so",
        settings={}
    )

    # your main host
    mongrel2 = Host(name="mongrel2.org", routes={
        '@chat': chat_demo,
        '/handlertest': handler_test,
        '/chat/': web_app_proxy,
        '/': web_app_proxy,
        '/tests/': test_directory,
        '/testsmulti/(.*.json)': test_directory,
        '/chatdemo/': chat_demo_dir,
        '/static/': chat_demo_dir,
        '/mp3stream': Handler(
            send_spec='tcp://127.0.0.1:9995',
            send_ident='53f9f1d1-1116-4751-b6ff-4fbe3e43d142',
            recv_spec='tcp://127.0.0.1:9994', recv_ident='')
    })

    # the server to run them all
    main = Server(
        uuid="2f62bd5-9e59-49cd-993c-3b6013c28f05",
        access_log="/logs/access.log",
        error_log="/logs/error.log",
        chroot="./",
        pid_file="/run/mongrel2.pid",
        default_host="mongrel2.org",
        name="main",
        port=6767,
        filters=[profiler],
        hosts=[mongrel2]
    )

8. With all those handler targets, we can now make the mongrel2 Host with all the routes assigned at once, nice and clean. However, look how I was lazy and just tossed the mp3stream demo right into the routes dict? You can totally do this and m2sh will figure it out. Remember also that you can use the blah string format to not have to double up on your \ chars in the patterns.
9. We then assign this mongrel2 variable as the hosts for the main server.
10. There is also a settings feature, which is just a dict of global settings you can tweak. In this case, we're upping the number of threads that 0MQ is using for its operations.
11. Finally, we commit the whole thing to the database by passing in the servers to save and the settings to use.

And that, my friends, is the most complex configuration we have so far.

3.6 Routing And Host Patterns


The pattern code was taken from Lua and is some of the simplest code for doing fast pattern matches. It is very much like regular expressions, except it removes a lot of features you don't need for routes. Also, unlike regular expressions, URL patterns always match from the start. Mongrel2 uses them by breaking routes up into a prefix and pattern part. It then uses routes to find the longest matching prefix and then tests the pattern. If the pattern matches, then the route works. If the route doesn't have a pattern, then it's assumed to match, and you're done.

The only caveat is that you have to wrap your pattern parts in parentheses, but these don't mean anything other than to delimit where a pattern starts. So instead of /images/.*.jpg, write /images/(.*.jpg) for it to work.

Here's the list of characters you can use in your patterns:

. (period): All characters.
\a: Letters.
\c: Control characters.
\d: Digits.
\l: Lowercase letters.
\p: Punctuation characters.
\s: Space characters.
\u: Uppercase letters.
\w: Alphanumeric characters.
\x: Hexadecimal digits.
\z: The 0 character (null terminator).
[set]: Just like a regex's [] where set is a set of chars, like [0-9] for all digits.
[^set]: Inverse character set, so [^0-9] is anything but digits.
*: Longest match of 0 or more of the preceding character.
+: Longest match of 1 or more of the preceding character.
-: Shortest match of 0 or more of the preceding character.
?: 0 or 1 match of the preceding character.
\bxy: Balanced match of a substring starting with x and ending in y. So \b() will match balanced parentheses.
$: End of the string.

Using the uppercase version of an escaped character makes it work the opposite way (e.g., \A matches any character that isn't a letter). The backslash can be used to escape the following character, disabling its special abilities (e.g., \\ will match a backslash). Anything that's not listed here is matched literally.

Here are some example routes you can try to get a feel for the system:

"/images/": This will just match any path that begins with /images/, without any patterns.
"/": The fastest possible route you can have.
"/images/(.*.jpg)": Match only requests for jpg images in the images directory. Keep in mind that this isn't actually looking in the directory, it's just matching the (.*.jpg) pattern.
"/images/(\a-\-\d+\.jpg)": A more complex example that matches the shortest sequence of 0 or more letters (remember -), then a dash (\- escapes the -), then the longest sequence of 1 or more digits, and finally .jpg, with the \. escaping the period.

That should give the idea of how you can use them. Notice also that I'm using the Python "blah" string syntax which is interchangeable with the blah syntax so I don't have to double escape everything.
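If you already think in regular expressions, it can help to see how the pattern characters listed above line up with them. The following Python 3 sketch translates the pattern part of a route (the bit inside the parentheses) into a Python regex. It is a rough illustration under stated assumptions, not how Mongrel2 works internally: the names CLASSES and to_regex are made up, \b and \z are not handled, and [set] blocks are copied verbatim, so class escapes inside a set are not translated.

```python
import re

# Character classes from the list above, written as the inside of a regex set.
# ASCII only, matching the spirit of the pattern language.
CLASSES = {
    "a": "A-Za-z",
    "c": r"\x00-\x1f\x7f",
    "d": "0-9",
    "l": "a-z",
    "p": r"!-/:-@\[-`{-~",
    "s": r" \t\n\r\f\v",
    "u": "A-Z",
    "w": "A-Za-z0-9",
    "x": "0-9A-Fa-f",
}


def to_regex(pattern):
    """Translate a route pattern into Python regex source (illustrative only)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 1
            if i == len(pattern):
                raise ValueError("pattern ends with a lone backslash")
            nxt = pattern[i]
            if nxt.lower() in CLASSES:
                # Uppercase escapes mean "anything but this class".
                negate = "^" if nxt.isupper() else ""
                out.append("[" + negate + CLASSES[nxt.lower()] + "]")
            elif nxt.lower() in "bz":
                raise ValueError("\\%s is not handled by this sketch" % nxt)
            else:
                out.append(re.escape(nxt))  # escaped literal, e.g. \- or \.
        elif ch == "[":
            end = pattern.find("]", i + 1)
            if end == -1:
                raise ValueError("unterminated [set]")
            out.append(pattern[i:end + 1])  # copied as-is
            i = end
        elif ch == "-":
            out.append("*?")  # shortest match of 0 or more
        elif ch in ".*+?$()":
            out.append(ch)  # same meaning in both syntaxes
        else:
            out.append(re.escape(ch))
        i += 1
    return "".join(out)


# The "more complex example" from the route list above.
checker = re.compile(to_regex(r"\a-\-\d+\.jpg"))
print(bool(checker.fullmatch("photo-123.jpg")))
```

The main thing to notice is that - is a quantifier here, the lazy cousin of *, which is why a literal dash has to be written as \-.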

Note 3: Sorry, Unicodians, It's All ASCII

Yep, I get it. You think that everyone should use UTF-8 or some Unicode encoding for everything. You despise the dominance of the A in ASCII and hate that you can't put your spoken language right in a URL. Well, I hate to say it, but tough. Protocols are hard enough without having to worry about the bewildering mess that is Unicode. When you sit down to write a network protocol, the last thing you need is a format that's inconsistent, has multiple interpretations, can't be properly capitalized or lowercased, and requires extra translation steps for every operation. With ASCII, every computer just knows what it is, and it's the fastest for creating wire protocol formats.

This is why, on the Internet, you have to do things to URLs to make them ASCII, like encoding them with % signs. It's in the standard, and it's the smart thing to do. I don't want to have to know the difference between the various accents in your spoken language to route a URL around. I just want to deal with a fixed set of characters and be done with it. Don't blame me or Mongrel2 for this, it's just the way the standard is and the way to get a server that is stable and works. Protocols work better when there's less politics in their design.

This means you can't put Unicode into your URL patterns. I mean, you can try; but the behavior is completely undefined.

3.6.1 How Routing Works

The routing algorithm is actually kind of simple, but it's an unfamiliar algorithm to most programmers. I won't go into the details of how a Ternary Search Tree works, but basically it lets you match one prefix against a bunch of other strings very fast. This data structure lets Mongrel2 very quickly determine the target for a route, and also know if it has a route at all. Typically, it can match a route in just a few characters, and reject a route in even fewer.

For practical usage, it's better to just read how it works, rather than how it's implemented. Here's how Mongrel2 matches an incoming URL against routes you've given it:

1. Your configuration has a route for "/images/(.*.jpg)" and "/users".
2. Mongrel2 loads these and converts them to PREFIX/PATTERN pairs. For the first one, PREFIX=/images/ and PATTERN=(.*.jpg). For the second one, PREFIX=/users and PATTERN=None.
3. It stores these in the URL routes by their PREFIX, and there can be only one PREFIX at a time. This means you can't put "/foo/(.*)" and "/foo/" in at the same time (that's always redundant anyway).
4. A request comes in for /images/hello.jpg, so Mongrel2 takes the whole URL and searches for the longest first route that can possibly match. In this case, that's the /images route.
5. It checks if the route it found has a pattern, and if it does then it runs the pattern match code for the whole thing. If they match, then this is the target and it's good. If not, it returns a 404. In this case the /images URL and pattern match, so it's good.
6. Next, a request comes in for /users/johndoe/1234.
7. Mongrel2 does the PREFIX search again, and the longest matching prefix is the route for "/users", so it gets that from the routing table.
8. Since the /users route doesn't have a PATTERN, this is the route and it passes by default. No pattern matching code is run.
9. Now for a slightly confusing result: a request comes in for /us. Since a PREFIX for "/users" exists, and it's the longest first match, it will match that route. If you wanted this condition to fail, you'd need to be explicit and add on a pattern like "/users()$" to say you need an exact match. Another option is to give a "/" route for a default location (which usually happens).
10. Finally, a request comes in for /XRAY. This will match no prefix at all, so it gets a 404.

That example should show you how routes work, and the important thing to realize is that they'll try to match the longest first route as what we call the best route. If you get unexpected routing behavior, then you'll want to just make the routes explicit by putting a pattern at the end. Finally, here are some examples directly from the unit test that we have for the routing system.
Imagine we have these routes:

    "/" == handler0
    "/users/([0-9]+)" == handler1
    "/users" == handler2
    "/users/people/([0-9]+)$" == handler3
    "/cars-fast/([a-z]-)$" == handler4

Then this is how a set of example requests would match:

    /users/1234/testing - handler1
    /users - handler2
    /users/people/1234 - handler3
    /cars-fast/cadillac - handler4
    /users/1234 - handler1
    / - handler0
    /usersBLAHAHAHAHA - handler2
    /us - handler2

Work through those in your head so you make sure you understand them.
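If working through them in your head isn't enough, here's a small Python 3 sketch that reproduces the table above. It is only a model of the behavior described in this section, using plain string comparisons instead of a Ternary Search Tree, and the names ROUTES and find_route are made up. Two assumptions to be loud about: the patterns are stored already translated to Python regex (so [a-z]- becomes [a-z]*?), and when a short URL like /us could grow into several stored prefixes, the sketch picks the shortest one, which is a guess that happens to agree with the unit test results.

```python
import re

# (prefix, pattern-as-Python-regex or None, target), from the routes above.
ROUTES = [
    ("/", None, "handler0"),
    ("/users/", r"([0-9]+)", "handler1"),
    ("/users", None, "handler2"),
    ("/users/people/", r"([0-9]+)$", "handler3"),
    ("/cars-fast/", r"([a-z]*?)$", "handler4"),
]


def find_route(routes, url):
    """Return the target for url, or None for a 404 (illustrative model only)."""
    table = {prefix: (pattern, target) for prefix, pattern, target in routes}

    # Case 1: the URL runs out while it is still inside a stored prefix,
    # like /us inside /users. Assumption: take the shortest such prefix.
    extending = [prefix for prefix in table if prefix.startswith(url)]
    if extending:
        prefix = min(extending, key=len)
    else:
        # Case 2: the usual one, the longest stored prefix the URL starts with.
        covering = [prefix for prefix in table if url.startswith(prefix)]
        if not covering:
            return None
        prefix = max(covering, key=len)

    pattern, target = table[prefix]
    if pattern is None:
        return target  # no pattern, passes by default
    # Patterns always match from the start of the URL.
    if re.match(re.escape(prefix) + pattern, url):
        return target
    return None


for path in ("/users/1234/testing", "/users", "/usersBLAHAHAHAHA", "/us"):
    print(path, "-", find_route(ROUTES, path))
```

Try deleting the "/" route from ROUTES and asking for /XRAY to see the 404 case from the walkthrough.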

3.6.2 JSON/XML Message Routing Syntax

Mongrel2 works with Flash sockets out of the box (with WebSockets coming soon) and can handle either XML messages or special JSON messages. It does this by modifying the parser it has internally to parse out HTTP or (exclusive) XML and JSON messages. This feature can be used by any TCP client, not just Flash; it just happens to be a simple way to send simple async messages without using HTTP.

To make it work, there's a slight modification to the routes used by JSON or XML messages. Basically, JSON routes start with a @ and XML routes start with a <, and both kinds of message must be terminated with a NUL byte \0. When the parser sees these at the beginning of a request, it parses that message and sends it as-is to your target handler. Let's look at two examples from the chat demo and from some test suites:

    "@chat": chat_demo
    "<test": xml_demo

The first one will take any Flash (or just TCP) connection that sends lines like @chat {"msg": "hello"}\0 and route those to the chat_demo handler. You can connect, and then just stream these JSON messages all you want, and handlers can send back the same responses. In fact, as long as you don't include a \0 character, you could probably send anything you want.

The second route will take any XML that is wrapped in a <test> tag and send that to your handlers. That means you can send <test name="joe"><age>21</age></test> and it will send it to xml_demo.

This is powerful because Mongrel2 now becomes a generic XML or JSON messaging server very easily. For example, I wrote a simple little BBS demo with Mongrel2 and wrote a very basic terminal client in Python for people to use instead of the browser. Look at examples/bbs/client.py to see how that works in full, but the meat of it is:

Source 11: BBS Client JSON Socket Handling

    # Python 2 code; host and port are set elsewhere in client.py
    CONN = socket.socket()
    CONN.connect((host, port))

    def read_msg():
        # read one byte at a time until the NUL terminator
        reply = ""
        ch = CONN.recv(1)
        while ch != '\0':
            reply += ch
            ch = CONN.recv(1)
        return json.loads(b64decode(reply))

    def post_msg(data):
        # "@bbs" is the route, followed by the JSON body and a NUL byte
        msg = '@bbs %s\x00' % (json.dumps({'type': 'msg', 'msg': data}))
        CONN.send(msg)

In that code, notice how (for historical reasons due to Flash sucking) the response is base64 encoded, but your handler doesn't have to do that. You can just adopt the same protocol back. Other than that, the BBS example client is just opening a socket and sending messages, but Mongrel2 is converting them to messages to backend handlers for processing. Finally, here are the grammar rules in the parser for handling these messages:

Source 12: JSON/XML Message Grammar

    rel_path = ( path? (";" params)? ) ("?" query)?;
    SocketJSONStart = ("@" rel_path);
    SocketJSONData = "{" any* "}" :>> "\0";
    SocketXMLData = ("<" [a-z0-9A-Z\-.]+) ("/" | space | ">") any* ">" :>> "\0";
    SocketJSON = SocketJSONStart " " SocketJSONData;
    SocketXML = SocketXMLData;

    SocketRequest = (SocketXML | SocketJSON);

If you read that carefully, you'll see you can actually pass query strings and path parameters to your JSON socket handlers. That's currently not used, but in the future it might be.

One caveat to this whole feature is that these targets can only be routed to the Server.default_host of the server. There's not enough information in these routes to determine a target host (like the Host: header in HTTP), so you can only send them to the default target host.
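The framing described in this section (a route starting with @, a space, a JSON body, and a terminating NUL byte) is easy to get subtly wrong on the receiving side, because one recv() can return half a message or several at once. Here's a minimal Python 3 sketch of both directions. The function names frame_json and split_frames are made up for this example, and it only deals with the bytes on the wire, not with any socket or with Mongrel2 itself.

```python
import json


def frame_json(route, payload):
    """Build one JSON socket message: '@route {json}' plus a NUL terminator.

    json.dumps() escapes non-ASCII and control characters by default, so the
    body is plain ASCII and never contains a raw NUL byte.
    """
    body = json.dumps(payload)
    return ("@%s %s" % (route, body)).encode("ascii") + b"\0"


def split_frames(buffer):
    """Split received bytes into complete NUL-terminated frames.

    Returns (frames, rest), where rest is the unfinished tail that should be
    kept and prepended to whatever arrives next.
    """
    *frames, rest = buffer.split(b"\0")
    return frames, rest


message = frame_json("chat", {"msg": "hello"})
print(message)
print(split_frames(message + b'@chat {"msg": "par'))
```

Keeping the leftover bytes around between reads is the design choice that matters: it lets you read in big chunks instead of one byte at a time.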

3.7 Deployment Logs And Commits


A very nice feature for people doing operations work is that m2sh keeps track of all the commands you run on it while you work, and lets you add little commit messages to the log for documentation later. These commit logs are then maintained even across m2sh load commands so you can see what's going on. They track who did something, what server they did it on, what time they did it, and what they did.

To see the logs for your own tests, just do m2sh log -db simple.sqlite and then, if you want to add a commit log message, use the m2sh commit command. Here's an example from mongrel2.org:

Source 13: Example Commit Log

    > m2sh log
    [2010-07-18T04:14:53, mongrel2@zedshaw, init_command] /usr/bin/m2sh init
    [2010-07-18T04:15:06, mongrel2@zedshaw, load_command] /usr/bin/m2sh load
    [2010-07-18T04:22:06, mongrel2@zedshaw, load_command] /usr/bin/m2sh load
    [2010-07-18T04:23:32, mongrel2@zedshaw, load_command] /usr/bin/m2sh load
    [2010-07-18T04:26:16, mongrel2@zedshaw, upgrade] Latest code for Mongrel2.
    [2010-07-18T18:05:59, mongrel2@zedshaw, load_command] /usr/bin/m2sh load
    [2010-07-18T20:09:01, mongrel2@zedshaw, init_command] /usr/bin/m2sh config
    [2010-07-18T20:09:02, mongrel2@zedshaw, load_command] /usr/bin/m2sh config
    > m2sh commit -what mongrel2.org -why "Testing things out."

The motivation for this feature is the trend of ops storing server configurations in revision control systems like git or etckeeper. This works great for holding the configuration files, but it doesn't tell you what happened on each server. In many cases, the configuration files also need to be reworked or altered for each deployment. With the m2sh log and commit system, you can augment your revision control with deployment action tracking.

Later versions of Mongrel2 will keep small amounts of statistics which will link these actions to changes in Mongrel2 behavior like frequent crashing, failures, slowness, or other problems. Basically, there's nowhere to hide. Mongrel2 will help operations figure out who needs to get fired the next time Twitter goes down.

3.8 Control Port


Just before the release of 1.0, we added a feature called the Control Port, which lets you connect to a running Mongrel2 server over a unix (domain) socket and give it control commands. These commands let you get the status of running tasks, lists of currently connected sockets and how long they've been connected, and the server's current time, and they let you kill a connection. Using this control port, you can then implement any monitoring and timeout policies you want, and provide better status.

By default, the control port is in your chroot at run/control, but you can set the control_port setting to change this. You can actually change it to any valid ZeroMQ spec you want, although you're advised to use IPC for security.

Once Mongrel2 starts, you can then use m2sh to connect to Mongrel2 and control it using the simple command language. Currently, what you get back is very raw, but it will improve as we work on the control port and what it does. The commands you can issue are:

stop: Stops the server using a SIGINT.
reload: Reloads the server using a SIGHUP.
terminate: Terminates the server with SIGTERM.
help: Prints out a simple help message.
uuid: Gives you the server's UUID.
info: More information about the server.
status what=tasks: Dumps a JSON formatted dict (object) of all the currently running tasks and what they're doing. Think of it like an internal ps command.
status what=net: Dumps a JSON dict that matches connection IDs (same ones your handlers get) to the seconds since their last ping. In the case of an HTTP connection this is how long they've been connected. In the case of a JSON socket this is the last time a ping message was received.
time: Prints the unix time the server thinks it's using. Useful for synching.
kill id=ID: Does a forced close on the socket that is at this ID from the status net command. This is a rather violent way to kill a connection so don't do it that often, but if you're overloaded then this is where to go.
control_stop: Shuts down the control port permanently in case you want to keep it from being accessed for some reason.

You then use the control port by running m2sh:

    m2sh control -every
    m2 [test]> help
    name          help
    stop          stop the server (SIGINT)
    reload        reload the server
    help          this command
    control_stop  stop control port
    kill          kill a connection
    status        status, what=[net|tasks]
    terminate     terminate the server (SIGTERM)
    time          the server's time
    uuid          the server's uuid
    info          information about this server

    m2 [test]> info
    port: 6767
    bind_addr: 0.0.0.0
    uuid: f400bf85-4538-4f7a-8908-67e313d515c2
    chroot: ./
    access_log: .//logs/access.log
    error_log: /logs/error.log
    pid_file: ./run/mongrel2.pid
    default_hostname: localhost
    m2 [test]>

The protocol to and from the control socket is a simple tnetstring in and out that any language can read. Here's a nearly complete Python client that is using the control port (Source 14). You obviously don't need to do this, but should you want to do something special like a management interface, this is your start.

Source 14: Python Control Port Example

    # Python 2 code, using pyzmq and the mongrel2 Python library
    import zmq
    from mongrel2 import tnetstrings
    from pprint import pprint

    CTX = zmq.Context()

    addr = "ipc://run/control"

    ctl = CTX.socket(zmq.REQ)

    print "CONNECTING"
    ctl.connect(addr)

    while True:
        cmd = raw_input("> ")
        # will only work with simple commands that have no arguments
        ctl.send(tnetstrings.dump([cmd, {}]))

        resp = ctl.recv()

        pprint(tnetstrings.parse(resp))

    ctl.close()
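To give a feel for what tnetstrings.dump([cmd, {}]) puts on the wire, here's a tiny Python 3 encoder for the tnetstring format as I understand it: every value is written as LENGTH:PAYLOAD followed by a one-character type tag. This is a from-scratch sketch, not the mongrel2 library's code, and the dump function below only shares its name with the library call for readability. Floats are left out on purpose.

```python
def dump(value):
    """Encode a Python value as tnetstring bytes (illustrative sketch).

    Type tags used: ',' string, '#' integer, '!' boolean, '~' null,
    ']' list, '}' dict.
    """
    if value is None:
        payload, tag = b"", b"~"
    elif isinstance(value, bool):  # must come before int: bool is an int
        payload, tag = (b"true" if value else b"false"), b"!"
    elif isinstance(value, int):
        payload, tag = str(value).encode("ascii"), b"#"
    elif isinstance(value, bytes):
        payload, tag = value, b","
    elif isinstance(value, str):
        payload, tag = value.encode("utf-8"), b","
    elif isinstance(value, (list, tuple)):
        payload, tag = b"".join(dump(item) for item in value), b"]"
    elif isinstance(value, dict):
        # a dict is just its keys and values encoded one after another
        payload = b"".join(dump(k) + dump(v) for k, v in value.items())
        tag = b"}"
    else:
        raise TypeError("cannot encode %r" % type(value))
    return str(len(payload)).encode("ascii") + b":" + payload + tag


# The shape of message the client above sends for the "help" command.
print(dump(["help", {}]))
```

Because every element carries its own length, a reader never has to scan for delimiters or worry about escaping, which is a big part of why the format is easy to implement in any language.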

3.9 Multiple Servers


A Mongrel2 process itself does not have any support for running multiple servers; instead, it takes two simple parameters: a sqlite config database and a server uuid that names the server to be launched. This is done to keep the mongrel2 code simple and workable.

However, Mongrel2's m2sh does support launching multiple servers from a single configuration database. By passing -every to many m2sh commands, you are able to perform actions on all configured servers at once. You can also perform actions on single servers by specifying their uuid, name or host. If any parameter given is ambiguous (for example, if you search with -host localhost and your config contains two servers which attempt to bind to localhost), m2sh will list the matching servers and ask you to clarify your selection. For example:

    > m2sh start -db config.sqlite -every
    Launching server localhost XXX on port 6768
    ...
    Launching server localhost XXX on port 6767
    ...
    > m2sh start -db config.sqlite -host localhost
    Not sure which server to run, what I found:
    NAME HOST UUID
    --------------
    localhost localhost XXX
    localhost localhost XXX
    * Use -every to run them all.
    > m2sh start -db config.sqlite -uuid XXX
    Launching server localhost XXX on port 6767
    ...
    > m2sh running -db config.sqlite -every
    Found server localhost XXX RUNNING at PID 28525
    PID file run/mongrel2.pid not found for server localhost XXX
    > m2sh stop -db config.sqlite -every

3.10 Tweakable Expert Settings


Many of Mongrel2's internal settings are configurable using the settings system. Some of these are dangerous to mess with, so make sure you test any changes before you try to run them. Setting them to 0 or negative numbers isn't checked, so if you make a setting and things go crazy, you need to not make that setting. All of these have good defaults so you can leave them alone unless you need to change them.

To configure your settings, you set the variable settings and you're done:

Source 15: Changing Settings

    settings = {"zeromq.threads": 1, "limits.url_path": 1024}
    servers = [main]

Mongrel2 will read these on the fly and write INFO log messages telling you what the settings are so you can debug them if they cause problems. The available settings are:

control_port=ipc://run/control: This is where Mongrel2 will listen with 0MQ for control messages. You should use ipc:// for the spec, so that only a local user with file access can get at it.
limits.buffer_size=2 * 1024: Internal IO buffers, used for things like proxying and handling requests. This is a very conservative setting, so if you get HTTP headers greater than this, you'll want to increase this setting. You'll also want to shoot whoever is sending you those requests, because the average is 400-600 bytes.
limits.client_read_retries=5: How many times it will attempt to read a complete HTTP header from a client. This prevents attacks where a client trickles an incomplete request at you until you run out of resources.
limits.connection_stack_size=32 * 1024: Size of the stack used for connection coroutines. If you're trying to cram a ton of connections into very little RAM, see how low this can go.
limits.content_length=20 * 1024: Maximum allowed content length on submitted requests. This is, right now, a hard limit so requests that go over it are rejected. Later versions of Mongrel2 will use an upload mechanism that will allow any size upload.
limits.dir_max_path=256: Max path length you can set for Dir handlers.
limits.dir_send_buffer=16 * 1024: Maximum buffer used for file sending when we need to use one.
limits.fdtask_stack=100 * 1024: Stack frame size for the main IO reactor task. There's only one, so set it high if you can, but it could possibly go lower.
limits.handler_stack=100 * 1024: The stack frame size for any Handler tasks. You probably want this high, since there aren't many of these, but adjust and see what your system can handle.
limits.handler_targets=128: The maximum number of connection IDs a message from a Handler may target. It's not smart to set this really high.
limits.header_count=128 * 10: Maximum number of allowed headers from a client connection.
limits.host_name=256: Maximum hostname length for Host specifiers and other DNS related settings.
limits.mime_ext_len=128: Maximum length of MIME type extensions.
limits.proxy_read_retries=100: The number of read attempts Mongrel2 should make when reading from a backend proxy. Many backend servers don't buffer their I/O properly and Mongrel2 will ditch their HTTP response if it doesn't get a header after this many attempts.
limits.proxy_read_retry_warn=10: This is the threshold where you get a warning that a particular backend is having performance problems, useful for spotting potential errors before they become a problem.
limits.url_path=256: Max length of URL paths. Does not include query string, just path.
superpoll.hot_dividend=4: Ratio of the total (like 1/4th, 1/8th) that should be in the hot selection. Set this higher if you have lots of idle connections; set it lower if you have more active connections.
superpoll.max_fd=10 * 1024: Maximum possible open files. Do not set this above 64 * 1024, and expect it to take a bit while Mongrel2 sets up constant structures.
upload.temp_store=None: This is not set by default. If you want large requests to reach your handlers, then set this to a directory they can access, and make sure they can handle it. Read about it in the Hacking section under Uploads. The file name has to end in XXXXXX chars to work (read man mkstemp).
upload.temp_store_mode=0666: The mode to chmod any files uploaded to upload.temp_store.
zeromq.threads=1: Number of 0MQ IO threads to run. Careful, we've experienced thread bugs in 0MQ sometimes with high numbers of these.
limits.tick_timer=10: Mongrel2 keeps an internal clock for efficiency and to run the timeouts. This is how often that clock updates, and defaults to 10 seconds.
limits.min_ping=120: Minimum time since last activity before considering closing a socket. Set to 0 to disable it.
limits.min_write_rate=300: Minimum bytes/second written before considering closing a socket. Set to 0 to disable it.
limits.min_read_rate=300: Minimum bytes/second read before considering closing a socket. Set to 0 to disable it.
limits.kill_limit=2: How many of min_ping, min_write_rate, and min_read_rate have to trigger before a socket is killed.

You can also update your mimetypes in the same way, just set a variable with them:

Source 16: Changing Mimetypes

    settings = {"zeromq.threads": 1, "limits.url_path": 1024}
    mimetypes = {".txt": "text/superawesome"}
    servers = [main]
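Since this section warns that 0 and negative values aren't checked, you may want a quick sanity pass over your settings dict before you load it. The Python 3 sketch below is one hypothetical way to do that; check_settings and ZERO_DISABLES are names invented here, not part of m2sh or Mongrel2. It treats 0 as acceptable only for the three settings documented with "Set to 0 to disable it", and writes setting names with underscores, as in limits.url_path.

```python
# Settings documented in this section as accepting 0 to mean "disabled".
ZERO_DISABLES = {
    "limits.min_ping",
    "limits.min_write_rate",
    "limits.min_read_rate",
}


def check_settings(settings):
    """Return a list of human-readable problems found in a settings dict.

    Hypothetical pre-flight check: it only looks at integer values and knows
    nothing about sensible upper bounds.
    """
    problems = []
    for name, value in sorted(settings.items()):
        if isinstance(value, bool) or not isinstance(value, int):
            continue  # strings such as control_port are not range-checked
        if value < 0:
            problems.append("%s is negative (%d)" % (name, value))
        elif value == 0 and name not in ZERO_DISABLES:
            problems.append("%s is 0" % name)
    return problems


settings = {"zeromq.threads": 1, "limits.url_path": 1024}
print(check_settings(settings))
```

An empty list means nothing obviously dangerous was found, which is not the same thing as the values being good ones, so still test your changes.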

3.11 SSL Configuration


Mongrel2 now supports SSL, with preliminary support for SSL session caching. As of v1.8.0 (actually earlier) you can enable SSL very easily for your Mongrel2 server. Mongrel2 configures SSL certs with two options in settings and then a directory of .crt and .key files named after the UUID of the servers that need them.

To get started, you can make a simple self-signed certificate with some weak encryption and set up your certs directory:

Source 17: Making A Self-Signed Certificate

    # First make a certs directory
    mkdir certs

    # list out your servers so you can get the UUID
    m2sh servers

    # go into the certs directory
    cd certs

    # make a self-signed weak cert to play with
    openssl genrsa -des3 -out server.key 512
    openssl req -new -key server.key -out server.csr
    cp server.key server.key.org
    openssl rsa -in server.key.org -out server.key
    openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt

    # finally, copy the server.crt and server.key files over to the UUID for that
    # server configuration in your mongrel2.conf
    mv server.crt 2f62bd5-9e59-49cd-993c-3b6013c28f05.crt
    mv server.key 2f62bd5-9e59-49cd-993c-3b6013c28f05.key

I actually have a shell script kind of like this since I can never remember how to set this stuff up with openssl. Also, you should really adjust the RSA key strength from 512 to something you're comfortable with. I'm using a weak key here so you can do performance testing and thrashing and then compare with your real key later.

Once you have that done, you just have to add three little settings to your mongrel2 conf:

1. Add the setting certdir pointed at ./certs/, and make sure it has the trailing slash.
2. Add the Server.use_ssl = 1 value to the Server that has the UUID you just created a cert for.
3. Optionally, set the setting ssl_ciphers to SSL_RSA_RC4_128_SHA so you can play with the performance of a weak cipher. If you unset this then Mongrel2 will use the best one a browser wants.

After you have those changes your config should look something like this:

Source 18: Minimal SSL Configuration

    main = Server(
        uuid="2f62bd5-9e59-49cd-993c-3b6013c28f05",
        use_ssl=1,
        access_log="/logs/access.log",
        error_log="/logs/error.log",
        chroot="./",
        pid_file="/run/mongrel2.pid",
        default_host="mongrel2.org",
        name="main",
        port=6767,
        hosts=[mysite]
    )

    settings = {
        "certdir": "./certs/",
        "ssl_ciphers": "SSL_RSA_RC4_128_SHA"
    }

    servers = [main]

Get that written, rerun m2sh config to make the new config, restart Mongrel2 (you can't reload to enable SSL), and it should be working. After you get this working you just have to get your own certificate, put it in the certs directory with the right filename, and you should be good to go.
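
Since the cert lookup depends entirely on filenames matching the server UUID, a quick check before restarting can save some head scratching. This is only a sketch: the function name missing_cert_files is made up for illustration, and it assumes nothing more than the naming rule described above (UUID plus .crt and .key inside the certs directory).

```python
import os


def missing_cert_files(certdir, server_uuid):
    """Return the expected cert/key paths that do not exist yet.

    Assumes the naming rule from the text: <certdir>/<uuid>.crt and
    <certdir>/<uuid>.key. This helper is hypothetical, not part of m2sh.
    """
    expected = [os.path.join(certdir, server_uuid + ext)
                for ext in (".crt", ".key")]
    return [path for path in expected if not os.path.isfile(path)]
```

Calling missing_cert_files("./certs/", "2f62bd5-9e59-49cd-993c-3b6013c28f05") should return an empty list once both files from Source 17 are in place.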

3.11.1 Experimental SSL Caching

We've got experimental SSL caching working, which will try to reuse the browser's SSL session if it's there. This is meant to be a trade-off between memory and performance, so it can chew a bunch of RAM if you have a lot of SSL traffic over a short period of time. We'll be making the caching more configurable, but for now, it's working and does speed up SSL clients that do it properly.

3.12 Configuring Filters (BETA)


The Mongrel2 v1.8.0 release also included working filters that you can configure and load dynamically. The filters are very fresh, and the only one available is the null filter found in tools/filters/null.c, but it does work and you can configure it. It's also currently not hooked into the reload gear that we've recently done, so don't expect it to work if you do frequent hot reloading.

Configuring a filter is fairly easy; take a look at the example in Source 19.

First you can see that we set up the null filter with some arbitrary settings and point to where the .so file is. Filters can be configured with any arbitrarily nested data structure that can fit into a tnetstring, so you can pass them pretty much anything that matters. Lists, dicts, numbers, and strings are the main ones. You can also use variables in the config file, so you could create different servers and share config options for Filters and other parts of the files.

After that, there's simply a Server.filters which takes a list of filters to load. If you don't set this variable, then the filter gear isn't even loaded and your server behaves as normal. If you do set this variable, then the filters are installed and will work.

If you run this config, you'll see the filter printing out its config as a tnetstring, and then closing the connection, but only if you go to /nulltest/. If you go to /tests/sample.html to get at a directory, it won't even run.

We'll have more documentation on actually writing filters in the Hacking section.


Source 19: Minimal Filter Configuration

    null = Filter(name="/usr/local/lib/mongrel2/filters/null.so", settings={
        "extensions": ["*.html", "*.txt"],
        "min_size": 1000
    })

    main = Server(
        uuid="f400bf85-4538-4f7a-8908-67e313d515c2",
        access_log="/logs/access.log",
        error_log="/logs/error.log",
        chroot="./",
        default_host="localhost",
        name="test",
        pid_file="/run/mongrel2.pid",
        port=6767,
        hosts = [
            Host(name="localhost", routes={
                '/tests/': Dir(base='tests/', index_file='index.html',
                               default_ctype='text/plain'),
                '/nulltest/': Proxy(addr='127.0.0.1', port=8080)
            })
        ],
        filters = [null]
    )

    servers = [main]
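
To get a feel for what "fits into a tnetstring" means for filter settings, here is a rough Python encoder for the basic types. It is a sketch under assumptions: the name tnet_dumps is made up, it only follows the general tnetstring layout (length, colon, payload, one type character), and the exact bytes Mongrel2 prints for your filter may differ.

```python
def tnet_dumps(value):
    """Encode a Python value as a tnetstring and return bytes.

    Illustrative only: covers None, bool, int, str, bytes, list and dict.
    """
    if value is None:
        payload, tag = b"", b"~"
    elif isinstance(value, bool):
        # bool must be tested before int, because bool is an int subclass
        payload, tag = (b"true" if value else b"false"), b"!"
    elif isinstance(value, int):
        payload, tag = str(value).encode("ascii"), b"#"
    elif isinstance(value, bytes):
        payload, tag = value, b","
    elif isinstance(value, str):
        payload, tag = value.encode("utf-8"), b","
    elif isinstance(value, (list, tuple)):
        payload, tag = b"".join(tnet_dumps(v) for v in value), b"]"
    elif isinstance(value, dict):
        # a dict is just its keys and values encoded back to back
        payload = b"".join(tnet_dumps(k) + tnet_dumps(v)
                           for k, v in value.items())
        tag = b"}"
    else:
        raise TypeError("unsupported type: %r" % type(value))
    return str(len(payload)).encode("ascii") + b":" + payload + tag
```

Feeding it the settings dict from Source 19 shows how lists and numbers nest inside the outer dict, which is roughly the shape the null filter reports back.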

Chapter 4

Deploying
I am now going to try to get you to set up a small, tiny, little version of a good deployment that matches the configuration of the site at http://mongrel2.org, with all the examples running. This configuration will give you all the tools you need to make automated and managed deployments, but it uses small-scale tools. The idea is that you learn what is involved in a nice, easy-to-manage setup, using simple things first, and then you can extrapolate that out into your own setup or something better.

4.1 Mongrel2 Deployment Requirements


It may seem obvious, but I'll go over the things you need in order to continue on in this section:

Mongrel2: I know, hard to believe, but you actually need to have Mongrel2 installed.
m2sh: Again, not sure why, but some folks think they don't need this. Unless you've written your own, you need m2sh.
Python: Some systems (like Debian) don't install all of Python. Make sure your Python setup is good.
root: You'll need root access on your box, either through sudo or some other means.
Basic Python coding: Right now, you should be able to do some basic Python.

That will get you going at first and, as we go, we'll do various other setups to get our application working.

Note 4: Learning Python

Why should you learn programming? The trend is that if you are a system administrator who can't code, you are on your way out. Eventually, you'll be in charge of automating systems, not manually managing them, and if you don't believe me then what do you think all those managed service companies are doing?

Alright, so you need to learn to code, but most of the books suck for really learning if you know nothing. This is why I started my own book, Learn Python The Hard Way, for people who know nothing about programming but need or want to learn. It teaches Python, but it mostly teaches all the things programmers actually learn before they learn programming.

When you're done with my book, you'll have your programming brown belt. That means you can then move on to one of many other free online books and really learn programming, and have a higher chance of actually learning it.

If you can't code Python then you can probably muddle through this and you may learn something, but learning Python will be important later. But don't read Dive Into Python. It is a horrible introduction.

4.1.1 Introducing procer

When I started working on this little manual, I wanted to get you into setting up a well-managed and automated deployment system. The m2sh program does much of the automation you need, but Mongrel2 also has to talk to quite a few separate little pieces that run as separate processes. Trying to juggle all these processes without a tool to help is a nightmare. You end up writing init scripts and merging them into your boot process and all sorts of crazy antics just so you can run a stupid hello world demo.

What I needed was a user space process manager. These are programs that run other programs, but, more importantly, try to keep those other programs running without much human intervention. When you need to deploy a ton of processes that all have to be running, these USPMs are fantastic. They usually read some startup profile describing what needs to start and what each thing depends on, and then they kick everything into gear and watch it. If any of the processes crash, they try to restart them. Very simple.

There's just one catch: all of them suck. There's daemontools, which barely builds (if at all) and then assumes that daemons don't fork. Stupid. There's minit, which bafflingly required dietlibc to even compile and assumed it was going to be the one true init (not user space at all). There's cinit, which got through a compile, then barfed on its documentation, and the end result is some huge number of weird shell scripts to make it work, and, again, it wants to be the one true init. Finally, runit is some of the worst C code I've seen in years and has the same weird design as daemontools.

After trying every single one, I just gave up. They either didn't build, were too complex, expected to be the one true init, were poorly documented, or weren't maintained, and they were definitely not going to work for this manual. My only choice was to shave a yak and write my own.

The end result is procer, which lives in tools/procer and does most of what you need in a USPM. It works a lot like daemontools or minit, with these differences:

1. It is much simpler, with only a single command to start all your stuff and keep it running.
2. It will build anywhere Mongrel2 builds, because it reuses the libm2.a library from the Mongrel2 project.
3. It doesn't want to be the one true init, or even expect to be running constantly. You can start it and stop it and it will only run what's not already running.
4. It assumes that programs will always daemonize and create a PID file. This turns out to be way easier to manage than what daemontools does, so I'm sort of baffled why daemontools is how it is.
5. It has dependency management so that you can have processes start only after others have finished.
6. It still uses simple files to configure itself that are in separate directories.
7. It can be run as root and, like Mongrel2, it will drop privileges to the owner of the profile directory before it runs the command. This is incredibly useful because it lets you set up scripts that run as other users without much configuration or fuss.
8. It is dinky, tiny and well written so you can understand it, even though it's written in C.
9. Best of all, I can use it in this book and you won't go insane trying to install it or use it like the others.

Of course, if you have something else you like then, please, use it. Anything that automates process management will be your friend. In this manual, to keep things simple and easily understood, I'll be using procer to tell you how to set up everything.
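
The core trick behind any PID-file based supervisor is tiny, and it helps to see it once. The sketch below is not procer's code (procer is written in C); it is a hypothetical Python illustration, assuming a POSIX system, of how you can tell whether the process named in a PID file is still alive.

```python
import os


def read_pid(pid_file):
    """Return the integer PID stored in pid_file, or None if unusable."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def is_running(pid_file):
    """True if the PID file names a process that currently exists."""
    pid = read_pid(pid_file)
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 sends nothing, it only checks existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

A supervisor loop is then little more than "for each profile, if not is_running(pid_file), run its start command", which is why the PID file convention makes things so easy to manage.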

Note 5: Alternatives to procer

I wrote procer mostly for this book, but I also use it for my Mongrel2 deployments. It works for me but you can try other solutions. By default, Mongrel2 will work with either daemontools/runit style or init.d style launchers. If Mongrel2 runs as a regular user, it assumes that you want runit style (don't fork, write to stdout/stderr). If you run as root, it assumes you want init.d style like what procer uses (fork, drop priv, chroot, etc.).

You should check out proclaunch as another alternative that is similar to procer, and inspired by procer, but written in Perl with a few more features.

Either way, Mongrel2 is practical, and generally does the right thing with today's tools. Want to use daemontools? Fine, just run it as mongrel2 config.sqlite server_uuid and it'll work right. Want to put it in init.d or use procer or similar? Fine, run it as root.

4.1.2 Installing procer

Installing procer is very easy. It's a single little binary and it lives in tools/procer in the Mongrel2 source. Here's how you'd install it totally from scratch, as if you hadn't even built Mongrel2 yet:

Source 20: Install procer

    cd projects/mongrel2
    make clean all && sudo make install

That's the entire install process, and now procer is in /usr/local/bin so you can use it. In the rest of this chapter you'll learn how to use procer by just setting up the Mongrel2 demo completely and messing around with it.

4.2 The Plan


We need to plan this deployment to make sure we get the end result correct:

1. Create a deployment area where everything will live.
2. Create a config.sqlite that will work with the demos in examples.
3. Set up procer to run Mongrel2 and the three demo Python scripts for chat, handlertest, and mp3stream, and have it run the fake backend web.py project so we have something to proxy to.
4. Get all the static file content working.
5. Test that procer is keeping things running, play with taking things down and up, and use m2sh to work with the deployment.

Once you have this setup working, you can then start to make your own deployments and tweak things as you need for your own applications. Remember that the goal is to get you to automate everything as much as possible, so if you can go further than this, then do it.

4.3 Step 1: The Deployment Area


We'll need a place to put all this stuff and run it so that Mongrel2 can chroot there, procer knows where its profiles are, and it's all nice and clean. For these instructions, we're just going to make some directories in your home directory, but feel free to change this up later if you find a better way.

Source 21: Make Deployment Directories

    # go home first
    cd ~/
    # create the deployment dir
    mkdir deployment
    cd deployment/
    # fill it with the directories we need
    mkdir run tmp logs static profiles
    # create the procer profile dirs for each thing
    cd profiles/
    mkdir chat mp3stream handlertest web mongrel2
    cd ..
    # copy the mongrel2.conf sample from the source to here
    # (adjust the path if your source lives elsewhere, e.g. ~/projects/mongrel2)
    cp ~/mongrel2/examples/configs/mongrel2.conf mongrel2.conf
    # setup the mongrel2 database initially
    m2sh load
    # see our end results
    ls

Hopefully, you're starting to see how you could easily automate this so that you don't have to do this all the time. I'm just showing you how to make the sausage so that you know where everything goes. Future versions of m2sh will most likely create deployment directories like this automatically. What we've done here is the following:

1. Set up a ~/deployment directory we'll put everything in.
2. Created run, tmp, logs, static, and profiles, which Mongrel2 and procer need to run.
3. In profiles we started dirs for chat, mp3stream, handlertest, web and mongrel2, which procer will read files out of to get all our gear up and running.
4. Copied the mongrel2.conf example file over to our deployment so we can modify it.
5. Initialized the config.sqlite file we'll be filling in with our modified mongrel2.conf.
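
As a first taste of automating this, the directory part of the steps above can be sketched in a few lines of Python. The function name make_deployment is hypothetical, and the directory names are simply the ones used in this chapter; copying mongrel2.conf and running m2sh load are left out on purpose.

```python
import os

# Directory names taken from the walkthrough above.
TOP_LEVEL = ("run", "tmp", "logs", "static", "profiles")
PROFILES = ("chat", "mp3stream", "handlertest", "web", "mongrel2")


def make_deployment(root):
    """Create the deployment skeleton under root and return root.

    Safe to call more than once: existing directories are left alone.
    """
    for name in TOP_LEVEL:
        os.makedirs(os.path.join(root, name), exist_ok=True)
    for name in PROFILES:
        os.makedirs(os.path.join(root, "profiles", name), exist_ok=True)
    return root
```

You would call it as make_deployment(os.path.expanduser("~/deployment")), then copy the config and run m2sh load as shown in Source 21.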

4.4 Step 2: The mongrel2.org Configuration


Now we're ready to get the configuration working. Here's the thing, though: you should try to alter the configuration yourself. I've already given you the file and you are going to have to make the changes to meet the requirements for this deployment directory. Here's what you have to change in mongrel2.conf to make everything work right:

1. Get rid of the test directory handler, since we won't need it, and any routes that mention it.
2. Change the base of the chat demo dir to static/chatdemo/, which we'll set up at the end.
3. Modify the server chroot so that it's /home/YOU/deployment/.
4. Use the m2sh uuid command to make some new UUIDs for all the existing ones. This is optional, but it's probably a good idea to get in the habit now.
5. Change the port for the web app proxy so it points to 8080 instead of 80.
6. Finally, change any mention of mongrel2.org into localhost so that you can run it locally.

Once you have that all edited, you should be able to run m2sh load -db config.sqlite -config mongrel2.conf and it'll just load it up. Try using m2sh servers and m2sh hosts to take a peek.

To test it out at this stage you can just run the config.sqlite that you made with the commands in Source 22. That's enough to make sure it runs, but you've got nothing running behind it, so it mostly won't work at all. Just start it up and then kill it right after.

Source 22: Testing The Initial Configuration

    m2sh start -db config.sqlite -host localhost
    # hit CTRL-C to exit out
    m2sh start -db config.sqlite -host localhost -sudo
    less logs/error.log
    m2sh stop -db config.sqlite -host localhost -murder

4.5 Step 3: Setup procer


Now we want to make procer start everything for us and keep it running. The way procer works is that you put a few special files into a directory in profiles. This directory (say chat) is the profile for that app. When you start procer, you point it at the main profiles directory and it tries to run everything in it. It's dead simple and very easy to automate, so we'll do it by hand and then you can do some automation later.

Let's first set up a basic config that gets our skeleton profiles in place and make sure procer can run everything:

Source 23: Skeleton procer Setup

    cd profiles/
    ls
    # should see: chat  handlertest  mongrel2  mp3stream  web
    # make all the restart settings
    for i in *; do touch $i/restart; done
    # make all the empty dependencies
    for i in *; do touch $i/depends; done
    # setup the pid_files to some sort of default
    for i in *; do echo $PWD/$i/$i.pid > $i/pid_file; done
    cat chat/pid_file
    # get the run script setup to do nothing
    # (the quotes matter: an unquoted # would start a shell comment)
    for i in *; do echo '#!/bin/sh' > $i/run; done
    for i in *; do chmod u+x $i/run; done
    # check out what we did
    ls -lR
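
The same skeleton can be produced for a single profile from Python, which is handy once you start adding processes of your own. This is a sketch only: make_profile is a made-up helper, and the file names (restart, depends, pid_file, run) are just the ones created by the shell loops above.

```python
import os
import stat


def make_profile(profiles_dir, name):
    """Create a do-nothing procer-style profile and return its path."""
    profile = os.path.join(os.path.abspath(profiles_dir), name)
    os.makedirs(profile, exist_ok=True)

    # empty marker files, the equivalent of `touch`
    for empty in ("restart", "depends"):
        open(os.path.join(profile, empty), "a").close()

    # default PID file location: <profile>/<name>.pid
    with open(os.path.join(profile, "pid_file"), "w") as f:
        f.write(os.path.join(profile, name + ".pid") + "\n")

    # a run script that does nothing yet, made executable for its owner
    run = os.path.join(profile, "run")
    with open(run, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(run, os.stat(run).st_mode | stat.S_IXUSR)
    return profile
```

Looping over the five profile names and calling make_profile("profiles", name) should leave you with the same tree that ls -lR shows after the manual steps.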

With all of that, you can then try to run procer to watch it fail but still try to run everything:

    sudo procer $PWD $PWD/../run/procer.pid
    less error.log

This is assuming that you are still in the profiles directory. You should see the file error.log get created and probably some messages printed to the screen. Just ignore any mention of Mongrel2 since that's probably just cruft from the libm2.a we haven't removed. Take a look in the error.log and you'll see it's not necessarily errors but information on how things were run. You should see something like this for each profile:

Source 24: First Dummy Run Of procer

    DEBUG procer.c:232: Loading 5 actions.
    DEBUG procer.c:83: STARTED chat
    ERROR Failed to open PID file /home/zedshaw/deployment/profiles/chat/chat.pid for reading.
    ERROR Failed to open PID file /home/zedshaw/deployment/profiles/chat/chat.pid for reading.
    INFO No previous Mongrel2 running, continuing on.
    DEBUG procer.c:37: ACTION: command=/home/zedshaw/deployment/profiles/chat/run, pid_file=/home/zedshaw/deployment/profiles/chat/chat.pid, restart=1, depends=(null)
    DEBUG procer.c:56: WAITING FOR CHILD.
    INFO Now running as UID:1000, GID:1000
    DEBUG procer.c:60: Command ran and exited successfully, now looking for the PID file.
    ERROR chat didn't make pidfile /home/zedshaw/deployment/profiles/chat/chat.pid.

I've cleaned this up a bit and, again, ignore that it's saying Mongrel2; that's just cruft from the library since it was originally designed for Mongrel2. What you can see here is the following:

1. It starts up and says it found 5 profiles.
2. It starts chat, and says there's no PID file so it's good to continue.
3. It reports what ACTION it's running, so you can see the config.
4. It spawns off your run script, drops privilege and says it's WAITING for your script to exit.
5. After your script runs, it looks for the PID file you gave in pid_file and, if it's not there, it exits that action.
6. It does this for all of them and, since none of them run right, procer exits.
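
If it helps to picture what that ACTION line is built from, here is a guess at the lookup in Python. It is only inferred from the log output and the files we created, not from procer's C source: load_action is a hypothetical name, "restart" is treated as on when the file exists, and an empty depends file is reported as None.

```python
import os


def load_action(profile_dir):
    """Collect the pieces of a profile into a dict, ACTION-line style."""
    def read(name):
        # return the stripped file contents, or None if it can't be read
        try:
            with open(os.path.join(profile_dir, name)) as f:
                return f.read().strip()
        except OSError:
            return None

    return {
        "command": os.path.join(profile_dir, "run"),
        "pid_file": read("pid_file"),
        "restart": os.path.exists(os.path.join(profile_dir, "restart")),
        "depends": read("depends") or None,
    }
```

For the skeleton chat profile this gives you a command ending in chat/run, the chat.pid path, restart switched on, and no dependencies, which lines up with what the log reports.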

Next up, let's get Mongrel2 running inside procer:

Source 25: procer Config For Mongrel2

    cd ~/deployment
    # make mongrel2 run as root
    sudo chown root:root profiles/mongrel2
    # tell procer where mongrel2 puts its pid_file
    # notice the > not >> on this
    echo "$PWD/run/mongrel2.pid" > profiles/mongrel2/pid_file
    # make the run script start mongrel2 (notice the >> on this)
    echo "cd $PWD" >> profiles/mongrel2/run
    echo "m2sh start -db config.sqlite -host localhost" >> profiles/mongrel2/run
    # check out the results
    cat profiles/mongrel2/run
    # should print:
    #!/bin/sh
    cd /home/YOU/deployment
    m2sh start -db config.sqlite -host localhost

Obviously, you don't have to use a series of echo commands to make these scripts. You can edit them just fine; we're just doing it this way so that you can follow along more easily. Now, make sure you don't have any other Mongrel2 processes running, and then start procer again to see if it starts this configuration correctly.

Source 26: Using procer To Run Mongrel2

    cd ~/deployment
    # clear out the error.log for testing
    rm profiles/error.log
    # start procer
    sudo procer $PWD/profiles $PWD/procer.pid
    # see if procer is running
    ps ax | grep procer
    # should see:
    # 17934 ?        Ss     0:00 procer /home/zedshaw/deployment/profiles /home/zedshaw/deployment
    # see if mongrel2 is running
    ps ax | grep mongrel2
    # should see:
    # 17944 ?        Ssl    0:00 mongrel2 config.sqlite ba0019c0-9140-4f82-80ca-0f4f2e81def7

To watch procer in action, try doing m2sh stop -db config.sqlite -host localhost -murder and then look at profiles/error.log and watch Mongrel2 come right back.

4.5.1 The Python Examples

We've got a good setup of procer going and it keeps Mongrel2 running, so let's set up a similar thing for each of our little Python demos that we'll need. In order to do this, though, we sort of have to hack in making them daemonize and create PID files with a little shell script help. Let's start with the chat demo and, assuming your mongrel2 source is in ~/projects/mongrel2, you will change profiles/chat/run to be like this:

Source 27: Run Script For Chat Demo

    #!/bin/sh
    set -e
    DEPLOY=/home/YOU/deployment
    SOURCE=/home/YOU/projects/mongrel2
    cd $SOURCE/examples/chat
    # WARNING: on some systems the nohup doesn't work, like OSX
    # try running it without
    nohup python -u chat.py 2>&1 > chat.log &
    echo $! > $DEPLOY/profiles/chat/chat.pid

This little script uses some funky features you might not be familiar with, but which are nice to learn, so let's take a look:

1. The first trick is set -e, which tells the shell to bail if there are any errors in your script. This is a huge life saver in system scripts.
2. Next, you point some variables at where the deployment and Mongrel2 source live, remembering to type your username rather than YOU.
3. After that, you run chat.py using a program called nohup. This basically daemonizes your script by redirecting output and keeping the program from exiting when your terminal goes away, and then you background it with &.
4. The final thing we do is echo the magic variable $! (the PID of the last process started in the background) to the chat.pid file in the profile directory.

When you run this manually, you should see something like this:

    ./profiles/chat/run
    nohup: redirecting stderr to stdout
    # you'll only see the above if you needed nohup
    ps ax | grep chat
    # should see: 19305 pts/1    Sl     0:00 python chat.py
    kill -TERM 19305
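
If you would rather do the "background it and record the PID" dance from Python instead of sh, the standard library covers it. This is a sketch, not what the manual's scripts do: spawn_with_pid_file is a hypothetical helper, it assumes a POSIX system, and start_new_session is used as a rough stand-in for what nohup buys you.

```python
import subprocess


def spawn_with_pid_file(argv, log_path, pid_path):
    """Start argv in the background, log its output, record its PID.

    Returns the Popen object so the caller can still wait on or stop it.
    """
    log = open(log_path, "ab")
    try:
        proc = subprocess.Popen(
            argv,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,   # both streams end up in the log
            start_new_session=True,     # detach from the controlling terminal
        )
    finally:
        log.close()                     # the child keeps its own copy
    with open(pid_path, "w") as f:
        f.write("%d\n" % proc.pid)      # same job as `echo $! > chat.pid`
    return proc
```

For the chat demo that would be something like spawn_with_pid_file(["python", "-u", "chat.py"], "chat.log", "/home/YOU/deployment/profiles/chat/chat.pid"), run from the examples/chat directory.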


After all that, you can then try out procer again to see if it properly runs the chat demo as well as mongrel2:

Source 28: Running procer With Chat Demo

    # run procer to get stuff started
    sudo procer $PWD/profiles $PWD/run/procer.pid
    # see if it's all running
    ps ax | grep procer
    # should see:
    # 19607 ?        Ss     0:00 procer /home/zedshaw/deployment/profiles /home/zedshaw/deployment
    ps ax | grep mongrel2
    # should see:
    # 19621 ?        Ssl    0:00 mongrel2 config.sqlite ba0019c0-9140-4f82-80ca-0f4f2e81def7
    ps ax | grep chat
    # should see:
    # 19609 ?        Sl     0:00 python chat.py
    # try killing chat to see if it comes back
    kill -TERM `cat profiles/chat/chat.pid`
    ps ax | grep chat
    # should see:
    # 19669 ?        Sl     0:00 python chat.py

If you go look at profiles/error.log, you'll see that procer is also running each of them as the right user, with chat being run as you, but Mongrel2 being run as root so it can chroot and drop privileges properly. Rather than walk you through each of these setups, here are the run scripts for the remaining profiles:

Source 29: Remaining Run Scripts

profiles/handlertest/run:

    #!/bin/sh
    set -e
    DEPLOY=/home/YOU/deployment
    SOURCE=/home/YOU/projects/mongrel2
    cd $SOURCE/examples/http_0mq
    # WARNING: on some systems the nohup doesn't work, like OSX
    # try running it without
    nohup python -u http.py 2>&1 > http.log &
    echo $! > $DEPLOY/profiles/handlertest/handlertest.pid

profiles/mp3stream/run:

    #!/bin/sh
    set -e
    DEPLOY=/home/YOU/deployment
    SOURCE=/home/YOU/projects/mongrel2
    cd $SOURCE/examples/mp3stream
    # WARNING: on some systems the nohup doesn't work, like OSX
    # try running it without
    nohup python -u handler.py 2>&1 > mp3stream.log &
    echo $! > $DEPLOY/profiles/mp3stream/mp3stream.pid

profiles/web/run:

    #!/bin/sh
    set -e
    DEPLOY=/home/YOU/deployment
    SOURCE=/home/YOU/projects/mongrel2
    cd $SOURCE/examples/chat
    # WARNING: on some systems the nohup doesn't work, like OSX
    # try running it without
    nohup python -u www.py 2>&1 > www.log &
    echo $! > $DEPLOY/profiles/web/web.pid
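
Those scripts differ only in four values, so they are an obvious candidate for a template. The sketch below is one possible way to generate them; render_run_script and RUN_TEMPLATE are made-up names, and the template simply mirrors the shell lines shown above.

```python
RUN_TEMPLATE = """#!/bin/sh
set -e
DEPLOY={deploy}
SOURCE={source}
cd $SOURCE/{example_dir}
nohup python -u {script} 2>&1 > {log} &
echo $! > $DEPLOY/profiles/{name}/{name}.pid
"""


def render_run_script(name, example_dir, script, log,
                      deploy="/home/YOU/deployment",
                      source="/home/YOU/projects/mongrel2"):
    """Return the text of a run script for one profile."""
    return RUN_TEMPLATE.format(name=name, example_dir=example_dir,
                               script=script, log=log,
                               deploy=deploy, source=source)
```

For example, render_run_script("mp3stream", "examples/mp3stream", "handler.py", "mp3stream.log") produces the mp3stream script; write the result to profiles/mp3stream/run and keep it executable.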

4.5.2 Testing The New Setup

Once everything is running and procer is maintaining it, you just need to see if things work. Here are some curl commands to try:

Source 30: Testing With Curl

    curl http://localhost:6767/
    # Hello, World!
    curl http://localhost:6767/handlertest

4.5.3 Nice Features of Procer

There are some nice subtle features you get from using procer to run your stuff:

Faster Development: A great thing about procer is that once you get all of this set up, it cuts down on a lot of your setup time and development time because it will properly restart things for you. This means you can simply make changes to code or configs, and then just kill the process and procer will kick it back over automatically.
Easy Automation: You should start to see how you could automate creating profiles for new processes since the setup is consistent.
profiles/run.log: All your commands will have their output sent to this file so you can see how they might be blowing up in your scripts.
Restart State Maintained: Since procer is just tracking PID files and processes, if you shut it down, it won't kill the world. When you start it back up, it just starts new stuff or stuff it needs, then goes back to supervising. This means you can change the configs for procer, then just kick it over and it'll do the right thing.

The key thing, though, is that you now have the whole application for the mongrel2.org demo up and running, including automated process management, configuration, and managing everything.

4.6 Step 4: Static Content


The final thing we have to do is get the static content we need to try out the chat demo:

Source 31: Setting Up Static Content

    cp -r ~/projects/mongrel2/examples/chat/static static/chatdemo
    m2sh stop -db config.sqlite -host localhost -murder
    curl -I http://localhost:6767/chatdemo/

If you get a good response then you should be able to go to http://localhost:6767/chatdemo/ and the chat should work. Notice also that you just killed mongrel2 with m2sh and it came back because of procer. If you do your curl check too fast, you might miss it, so just wait a bit.

4.7 Step 5: Testing And Troubleshooting


You should have been testing the configuration as you went, but the main things to test are:

1. The /chatdemo/ works and you can send messages. Try a few different browsers.
2. You can get a simple message from the /handlertest/ and that's about it.
3. See if you can get the mp3streamer to stream some mp3s. Put a few in its directory, then kill it so procer brings it back. Then, point mplayer at http://localhost:6767/mp3stream and it should work.
4. Check that you can make the proxy go to the web.py app you start in the chat demo's directory.
5. See if you can stop things and have procer bring them back.
6. Stop procer and then start it again to see if it properly doesn't step on things.

If you run into problems, make sure that you can run each little piece and that the files you were supposed to make are correct. The best tool to use is diff.

4.8 Further Improvements


That ends this chapter, and at this point you should know how to set up nearly everything Mongrel2 has to offer right now. You should have a good idea of whether procer will work for your real deployments, and how I use it for my own deployments. A major improvement that we may eventually make is automating the setup of procer profiles, along with better overall management of the profiles with m2sh. If you feel like hacking on that, just go ahead and try, and let us know.

Other than that, automate, automate, automate.

4.9 Deployment Tips


Mongrel2 enforces the correct behavior when you run as root, which is to drop priv and chroot. This makes the server more secure, and it also simplifies your deployments. Since everything you do always runs in a chroot, you now just need to rsync that chroot directory, or put it into a git or hg repo, and you're set. You're literally forced to make your deployments portable to different directories and systems.

As of the 1.0 push for Mongrel2, we haven't done much work on how you deploy all the different languages. They sort of sprung up during development and our plan is to expand that out in the 2.0 version so that deployment is very well documented for all the different languages we support. That means you'll probably run into some snags and things we didn't anticipate.

The following are some general points we've come up with while deploying our own apps, with more to come as we work on the 2.0 version:

1. Don't run things as root if you can avoid it. It's a bad habit that everyone tries to do their sysadmin work completely as root, so Mongrel2 is designed to be run very easily under a regular user account. The only time you really should be running as root is when you do a quick -sudo to m2sh to start mongrel2 up so it can chroot.
2. Use the chroot to keep your deployment simpler. I literally do all my work locally and then just rsync my changes up to my remote staging server. Everything has to live in the chroot anyway, and the chroot enforces that it is completely self-contained.
3. Use Python's virtualenv or anything similar to get yourself a totally local environment. Too many systems, such as OSX, have very outdated packages and will change versions on you without telling you. The best way to make sure your software keeps working (and works as one cohesive deployment) is to use a virtualenv inside your chroot. It should even work cross-platform if you don't have compiled packages in there.
4. Create a user for your application and live in there. I don't have any root access on my stuff. Everything is run as a user named after the website, and is deployed right in the /home/USER directory. I login as that user, manage as that user, and I don't give them sudo access. For the times when I need to sudo to restart or run Mongrel2, I then use a separate login that I have open (with screen) and do it there. This reduces your risk of hacks, but also just simplifies things. It's no problem for me to move my configuration over to new machines with this setup, or deploy clusters. I know that as long as there's the right user on the target, I'm set.
5. Use GNU screen or die.
6. Keep your config.sqlite and the .conf file in your chroot, and keep your content and everything else under that. This makes sure that the config isn't accessible outside your content directories. Mongrel2 helps you get this right by not allowing certain Dir configurations that would expose your chroot to the world.

There are a few additional tips for people who want to use alternative process supervision like daemontools, runit, or init.d setups. No matter what you use, you should probably follow this advice:

1. Whatever you use for process management, make sure it can run stuff as not root and can do chroot for you. If you're running your Mongrel2 as root, you're doing it wrong. Actually, if you're running any services as root that don't absolutely need to be, you're doing it wrong.
2. Mongrel2 is happy to run as a regular user, and assumes that if you do not run as root, then you probably want to run under daemontools or similar. It won't chroot or drop priv, and it logs to stdout/stderr.
3. If you need to bind to port 80 but run under daemontools as a regular user, then use privbind to do it. This tool will run any command, like mongrel2, but it does it in a way that lets the executable grab ports below 1024. This restriction on ports is actually really stupid so don't worry about doing this.
4. Make sure your process monitor is not a single point of failure. Some of them out there will take your whole world down if they crash. Try doing a harsh kill on your process manager and see how it behaves. As much as they like to tell you not to worry about this because they run forever, everything has bugs and stupid people tend to kill things they don't get. If taking one process down nukes your whole server, then that's a bad design.

As we work on the next phase of Mongrel2 development, this will improve, so watch for news about deployment and real applications.

Chapter 5

Hacking
This chapter is all about making cool things with Mongrel2. It covers all the non-deployment features that you get from the browser's side and the handler/backend side of your application. I'll show you how the chat demo works for the async web sockets. I'll get into writing your own handlers using a few other demos. I'll cover some of the interesting things you can do with Mongrel2 that you can't do with other servers. Finally, I'll get into practical things, like when to do proxying and when to use a 0MQ handler.

For the majority of this chapter, I'll be using Python, but the demos should translate to the other languages that are implemented. I'll periodically show how another language does one of the demos, so you can get the idea that Mongrel2 is language agnostic. In no way should you take me using Python in this chapter to mean you can't use something else for your handlers. Currently supported languages are:

Python: The directory examples/python contains the Mongrel2 Python library m2py.
Ruby: Probably the most extensively supported language, with good Rack support, by perplexes on github.
C++: C++ support by akrennmair on github.
PHP: PHP support by winks on github.
C: You can also write handlers in C using the Mongrel2 library, but it's really rough, and not recommended yet. A C library will come, though.
Others?: ZeroMQ supports Ada, Basic, C, C++, Common Lisp, Erlang, Go, Haskell, Java, Lua, .NET, Objective-C, ooc, Perl, PHP, Python, and Ruby, so after reading this chapter you can easily write handlers in any of those languages too.

However, no matter how many languages Mongrel2 supports, you will still have applications that can't fit into 0MQ handlers and just work better as classic web apps, either because you've already written them and have existing infrastructure, or because of some architectural issues that require them to run traditionally. Because of that, Mongrel2 supports HTTP proxying, which allows you to route requests to basic web server backends that don't support 0MQ.

Note 6 What About FastCGI/AJP/CGI/SCGI/WSGI/Rack?

Nothing prevents you from writing your own connector between Mongrel2 and your deployment protocol of choice. If you need to run FastCGI or AJP in your environment, then your best bet is to just make a handler that translates Mongrel2 requests to the protocol you need and back. The Mongrel2 format is very easy to parse and translate, so you should be able to do it with no problem. The Ruby library already supports Rack as an example, and Python will support WSGI soon.

However, Mongrel2 itself doesn't support any of these directly. Doing so would bring back the language-specific infections that cause other web servers to go south. Most of these protocols were either designed before the modern web or are specific to one particular language. Instead of trying to cater to all the possible languages out there, Mongrel2 just gives you the tools to connect to it yourself.

5.1 Front-end Goodies


Mongrel2 supports your standard web server features like serving files, routing requests to another HTTP server, multiple host matching, good 304 support, and just generally being able to interact with a browser like normal. You've seen most of these features as you set up and deployed a Mongrel2 configuration, but let's go through some of them in more detail so you know what's possible.

5.1.1 HTTP

Mongrel2 uses the original Mongrel parser that powers quite a few other web servers and large, successful websites. This parser is rock solid, dead accurate, and by design blocks a lot of security attacks. For the most part you don't have to worry about this and just need to know Mongrel2 is using the same stable HTTP processing that has been working great for many years.

Another way to put this is if Mongrel2 says your request is invalid, it most definitely is.

5.1.2 Proxying

You've already seen configurations that have the Proxy routes working, so it should be easy to understand what's going on. You just create routes to backends that are HTTP servers and Mongrel2 shuttles requests to them, then proxies responses back.

The Proxying support in Mongrel2 is accurate, but it's not very capable right now. For example, there's no round-robin backend selection, or page caching, or other things you might need for more serious deployments. Those features will come eventually, though.

What you do get with Mongrel2's proxying, though, is a dead accurate way of slicing up your application by routes. Other web servers make you go through great pain in order to have some URLs go to a proxy and others go to handlers or directories. They make you use odd file syntax, weird pseudo-turing logic if-statements, and other odd hacks to get flexible route selection. They also tend to not maintain keep-alives properly between proxy requests and other requests.

Mongrel2 uses the exact same routing syntax for all backends and has no distinction between them. It also properly does keep-alives for as long as it is efficient to do so.

5.1.3 WebSockets

Mongrel2 does not support WebSockets because the original protocol was a complete ugly hack with security holes galore. They've since fixed the entire protocol, and we'll be implementing the hybi-07 version of the protocol in the 1.7 or 1.8 release.

5.1.4 JSSocket

The Mongrel2 chat demo uses JSSocket to do its magic, and it works great, but it requires Flash and, oh, man, do I absolutely hate Flash. However, it works, and works now, and works in every browser, even really old, busted ones. That means it's the first thing we implemented and the one we'll keep for a while until it proves itself not useful. The chat demo we'll cover will show you how to hook this up for fast async messaging and presence detection.

Note 7 Idiots and RFC Implementers

I don't know why, but people who implement RFCs pick up very weird cargo cult beliefs peddled by the people who write the standards. In HTTP it was two things which the creators of HTTP have actually backpedaled on: accept everything, and keep-alives with pipe-lines.

The truth is, if you want a secure server of any kind, blindly accepting every single thing any idiot sends you is going to open your server up to a huge number of attacks. If you look at every attack on existing HTTP servers you'll find that about 80% of them are exploiting ambiguous parts of the HTTP grammar to pass through malicious content or overflow buffers. In Mongrel2 we use a parser that rejects invalid requests from first basic principles using technology that's 30 years old and backed by solid mathematics. Not only does Mongrel2 reject bad requests, but it can tell you why the request was bad, just like a compiler. This doesn't mean Mongrel2 is ruthless, but it definitely doesn't tolerate ambiguity or stupidity.

Mongrel2 completely supports keep-alives because now, since it's not using Ruby at all, it can scale up beyond 1024 file descriptors. Ruby was limited in the number of open files a process could have, so the original Mongrel had to break keep-alive and kill connections in order to save itself from greedy browsers that never close them. Mongrel2 doesn't have this limitation, so it uses full keep-alives and has a dead accurate state machine to manage them correctly.

Where problems come in is with pipe-lined requests, meaning a browser sends a bunch of requests in a big blast, then hangs out for all the responses. This was such a horrible stupid idea that pretty much everyone gets it wrong and doesn't support it fully, if at all. The reason is it's much too easy to blast a server with a ton of requests, wait a bit so they hit proxied backends, and then close the socket. The web server and the backends are now screwed having to handle these requests which will go nowhere.

Mongrel2 does not support pipe-lined requests. It sends one, and waits for the response, and if you want more, then tough. Screw you, because it has no advantage for Mongrel2 and dubious advantages to you. It is simply one more attack vector for the server and is rejected outright.

These two things are rejected outright by Mongrel2 simply because they are stupid ideas and in 2010 nobody should be writing clients so badly that they need these features.

Note 8 Proxying And 0MQ Handlers Are Like mod_*

A quick note for people coming from other web servers. If you use nginx then you are probably familiar with the concept of proxying to a backend like Ruby on Rails or Django. If you use PHP or another language, you may be used to a system like mod_php which manages your code for you and reloads when you make changes. If you use Apache, then you probably think in terms of virtual hosts and mod_rewrite rules.

In Mongrel2 all the same concepts are there, it's just cleaned up. If you want Mongrel2 to talk to another backend web server, nginx/mod_rewrite style, then that's Proxying. If you want to have fast backend handlers, then that's 0MQ Handlers. We really don't have anything like mod_php because the whole idea of embedding a programming language runtime inside Mongrel2 would defeat the point of making it language agnostic.

5.1.5 Long Poll

Mongrel2 just works as if everything is an HTTP long poll; it's just that normal request/responses are super fast long polls. For the most part you don't even need to know this exists; it's just how things are and they make perfect sense. You get requests from a certain server with a certain connected identity, and then you send stuff to that target. That's it. If you send it one response, or a stream of them, or set up a long poll configuration, then that's up to you.

5.1.6 Streaming

Because everything in Mongrel2 is asynchronous, and it allows you to target any connected listeners from your handlers, even with partial messages, you can easily do efficient streaming applications. ZeroMQ is an incredibly efficient transport mechanism, and with it you can send tons of information to many browsers or clients at once. This means streaming video and MP3 streams to listeners is trivial. We'll cover the mp3stream example where you get to see a simple implementation of the ICY MP3 streaming protocol.

5.1.7 N:M Responses

What makes streaming, async messaging, and long poll designs so efficient in Mongrel2 is that you can send one message and target up to 128 clients with that one message. This means sending large scale replies to many browsers requires less copying of the message and fewer transports.

In addition to this, you can set up Mongrel2 with the help of some 0MQ to send one request from a browser to as many target handlers as you like. You can even send them messages using OpenPGM for sending UDP messages reliably to clusters of computers. This means that Mongrel2 is the only web server capable of sending one request from a browser to N backends at once, and then returning the replies from these handlers to M browsers. Not exactly sure what you could write with that, but it's probably something really damn cool.

5.1.8 Async Uploads

Mongrel2 also solves the problem of large uploads choking your server because you can't stop them before they're complete. Mongrel2 will stream large requests to temporary files, but it sends your handlers an initial "upload started" message. When the upload is done, you get a final "upload finished" message. If, at any time, you want to kill the upload, you just send a 0-length reply (the official KILL MESSAGE) and the whole thing is aborted and cleaned up.

5.2 Introduction to ZeroMQ


The ZeroMQ folks have finally written a decent manual for ZeroMQ which you should probably read. I recommend you read "0MQ - The Guide" as your introduction to 0MQ.

5.3 Handler ZeroMQ Format


You've read the 0MQ Guide and now you're ready to see how Mongrel2 talks to your handlers with it. I won't really call this a protocol, since ZeroMQ is really doing the protocol, and we just pull fully baked messages out of it. Instead, this is just a format, as if you got strings out of a file or something similar. This message format is designed to accomplish a few things in the simplest way possible:

1. Be usable from languages that are statically compiled or scripting languages.
2. Be safe from buffer overflows if done right, or easy to do right.
3. Be easy to understand and require very little code.
4. Be language agnostic and use a data format everyone can accept without complaining that it should be done with their favorite.[1]
5. Be easy to parse and generate inside Mongrel2 without having to parse the entire message to do routing or analysis.
6. Be useful within ZeroMQ so that you can do subscriptions and routing.

To satisfy these features we use different types of ZeroMQ sockets (soon to be configurable), a request format that Mongrel2 sends, and a response format that the handlers send back. Most importantly, there is nothing about the request and response that must be connected. In most cases they will be connected, but you can receive a request from one browser and send a response to a totally different one.

5.3.1 Socket Types Used

First, the types of ZeroMQ sockets used are a ZMQ_PUSH socket for messages from Mongrel2 to Handlers, which means your Handler's receive socket should be a ZMQ_PULL. Mongrel2 then uses a ZMQ_SUB socket for receiving responses, which means your Handlers should send on a ZMQ_PUB socket. This setup allows multiple handlers to connect to a Mongrel2 server, but only one Handler will get each message, in a round-robin style. The PUB/SUB reply sockets, though, will let Handlers send back replies to a cluster of Mongrel2 servers, but only the one with the right subscription will process the reply.[2]

In the various APIs we've implemented, you don't need to care about this. They provide an abstraction on top of this, but it does help to know it so that you understand why the message format is the way it is. This leads to rule number 1:

Rule 1: Handlers receive with PULL and send with PUB sockets.

5.3.2 UUID Addressing

Do you remember all those UUIDs all over the place in the configuration files? They may have seemed odd, but they identify specific server deployments and processes in a cluster. This will let you identify exactly which member of a cluster sent a message, so that you can return the right reply. This is the first part of our protocol format and it results in rule 2:

Rule 2: Every message to and from Mongrel2 has that Mongrel2 instance's UUID as the very first thing.

[1] Except Erlang guys, 'cause they'll always complain that everything's not in Erlang.
[2] The types of sockets used will be configurable in a later version.

5.3.3 Numbers Identify Listeners

You then need a way to identify a particular listener (browser, client, etc.) that your message should target, and Mongrel2 needs to tell you who is sending your handler the request. This means Mongrel2 sends you just one identifier, but you can send Mongrel2 a list of them. This leads to rule 3:

Rule 3: Mongrel2 sends requests with one number right after the server's UUID, separated by a space. Handlers return a netstring with a list of numbers separated by spaces. The numbers indicate the connected browser the message is to/from.

In case you don't know what a netstring is, it is a very simple way to encode a block of data such that any language can read the block and know how big it is. A netstring is, simply, SIZE:DATA, with the trailing comma included. So, to send "HI", you would send "2:HI,". It is incredibly easy to parse in every language, even C. It is also a fast format and you can read it even if you're a human.
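The netstring format described above can be sketched in a few lines. This is a minimal illustration written for Python 3 and working on bytes; the function names encode_netstring and parse_netstring are chosen for the example and are not part of any Mongrel2 library.

```python
def encode_netstring(data: bytes) -> bytes:
    """Wrap data as SIZE:DATA, where SIZE is the byte length in decimal."""
    return str(len(data)).encode("ascii") + b":" + data + b","


def parse_netstring(buf: bytes):
    """Return (payload, remainder) for the first netstring in buf."""
    size, sep, rest = buf.partition(b":")
    if not sep or not size.isdigit():
        raise ValueError("netstring must start with SIZE:")
    length = int(size)
    # The byte right after the payload has to be the closing comma.
    if rest[length:length + 1] != b",":
        raise ValueError("netstring did not end in ','")
    return rest[:length], rest[length + 1:]


encoded = encode_netstring(b"HI")
payload, remainder = parse_netstring(encoded + b"5:HELLO,")
print(encoded, payload, remainder)
```

Because the size comes first, the parser never has to scan the payload for a terminator, which is why two netstrings can simply be placed back to back.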

5.3.4 Paths Identify Targets

In order to make it possible to route or analyze a request in your handlers without having to parse a full request, every request has the path that was matched in the server as the next piece. That gives us:

Rule 4: Requests have the path as a single string followed by a space and no paths may have spaces in them.
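Because the UUID, connection number, and path are plain space-separated tokens at the front of the message, a handler can look at the path without touching the netstrings that follow. A minimal Python 3 sketch; the helper name peek_path and the sample message are made up for illustration.

```python
def peek_path(msg: bytes) -> bytes:
    """Return the PATH field of a request without parsing headers or body."""
    # Split on the first three spaces only: UUID, ID, PATH, then everything else.
    _uuid, _conn_id, path, _rest = msg.split(b" ", 3)
    return path


# Hypothetical request in the UUID ID PATH SIZE:HEADERS,SIZE:BODY, layout.
sample = b"54c6755b-9628-40a4-9a2d-cc82a816345e 7 /chat 2:{},0:,"
print(peek_path(sample))
```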

5.3.5 Request Headers And Body

We only have two more rules to complete the message format.

Rule 5: Mongrel2 sends requests with a netstring that contains a JSON hash (dict) of the request headers, and then another netstring with the body of the request.

Then there's a similar rule for responses:

Rule 6: Handlers return just the body after a space character. It can be any data that Mongrel2 is supposed to send to the listeners: HTTP headers, image data, HTML pages, streaming video... You can also send as many as you like to complete the request, and any handler can send it.

5.3.6 Complete Message Examples

Now, even though we laid out all of this as a series of rules, the actual code to implement these is very simple. First, here's a simple grammar for how a request that gets sent to your handlers is formatted:

    UUID ID PATH SIZE:HEADERS,SIZE:BODY,

That's obviously a much simpler way to specify the request than all those rules, but it also doesn't tell you why. The above description, while boring as hell, tells you why each of these pieces exists. Also remember that this is a strict format, so to be more precise it's:

    Identifier = digit+ ' '?;
    IdentList = (Identifier)**;
    Length = digit+;
    UUID = (alpha | digit | '-')+;
    Targets = Length ':' IdentList ",";
    Request = UUID ' ' Targets ' ';

Mongrel2 will strictly enforce this grammar and reject any 0mq messages that don't follow it. To parse this in Python we simply do this:

Source 32 Parsing Mongrel2 Requests In Python

    import json

    def parse_netstring(ns):
        # Split off the decimal length, then check the trailing comma.
        length, rest = ns.split(':', 1)
        length = int(length)
        assert rest[length] == ',', "Netstring did not end in ','"
        return rest[:length], rest[length+1:]

    def parse(msg):
        # UUID, connection id, and path are space separated; the rest is netstrings.
        sender, conn_id, path, rest = msg.split(' ', 3)
        headers, rest = parse_netstring(rest)
        body, _ = parse_netstring(rest)
        headers = json.loads(headers)
        return sender, conn_id, path, headers, body

This is actually all of the code needed to parse a request, and it is much the same in many other languages. If you look at the file examples/python/mongrel2/request.py, you'll see a more complete example of making a full request object.

A response is then just as simple and involves crafting a similar setup like this:

    UUID SIZE:ID ID ID, BODY

Notice I've got three IDs here, but you can do anywhere from 1 up to 128. Generating this is very easy in Python:

Source 33 Generating Responses

    # Both of these are methods on the Connection class, hence self.
    def send(self, uuid, conn_id, msg):
        header = "%s %d:%s," % (uuid, len(str(conn_id)), str(conn_id))
        self.resp.send(header + ' ' + msg)

    def deliver(self, uuid, idents, data):
        self.send(uuid, ' '.join(idents), data)

That, again, is all there is to it. The send method is the one doing the real work of crafting the response, and the deliver method is just calling send with all the target idents joined with a space.
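To make both formats concrete, here is a standalone sketch that parses a hand-written request and builds a reply to it without needing a Connection object. It is written for Python 3 and works on bytes; the UUID, connection numbers, header contents, and the names parse_request, format_response, and MAX_IDENTS are all invented for the example, with the 128 limit taken from the text above.

```python
import json

MAX_IDENTS = 128  # upper bound on targets per message, per the text above


def parse_netstring(ns: bytes):
    """Split the leading SIZE:DATA, off ns and return (data, remainder)."""
    size, rest = ns.split(b":", 1)
    length = int(size)
    assert rest[length:length + 1] == b",", "Netstring did not end in ','"
    return rest[:length], rest[length + 1:]


def parse_request(msg: bytes):
    """Parse UUID ID PATH SIZE:HEADERS,SIZE:BODY, into its five parts."""
    sender, conn_id, path, rest = msg.split(b" ", 3)
    headers, rest = parse_netstring(rest)
    body, _ = parse_netstring(rest)
    return sender, conn_id, path, json.loads(headers), body


def format_response(uuid: bytes, idents, body: bytes) -> bytes:
    """Build UUID SIZE:ID ID ID, BODY for one or more connection ids."""
    if not 1 <= len(idents) <= MAX_IDENTS:
        raise ValueError("need between 1 and %d idents" % MAX_IDENTS)
    targets = b" ".join(idents)
    return b"%s %d:%s, %s" % (uuid, len(targets), targets, body)


# Hypothetical request as it might arrive from Mongrel2.
request = b'54c6755b-9628-40a4-9a2d-cc82a816345e 12 /handlertest 16:{"METHOD":"GET"},0:,'
sender, conn_id, path, headers, body = parse_request(request)

# Reply to the requester and to two other made-up connections at once.
reply = format_response(sender, [conn_id, b"3", b"9"], b"you asked for " + path)
print(reply)
```

Passing an empty body to format_response produces the 0-length reply that the Async Uploads section describes as the kill message.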

5.3.7 TNetStrings Alternative Protocol

During the 1.6 development, it became clear that we needed a sort of internal protocol for some new Mongrel2 features. This internal protocol should be able to store all the same things that JSON can, but also store exact binary data. This came about because we want to send raw data to handlers and other parts of the system like the control port, but JSON involved too much work to parse and deal with that. We also did various analyses and found that much of our time was spent just generating JSON.

What we did, then, is create a small modification to netstrings that tags each element with its type. We did this by changing the (fairly useless) trailing "," character so that it signified the type of what it contained. Types can be any of the main data types that JSON has (dicts, lists, integers, etc.), except that strings are now entirely raw binary strings, with no definition about whether they hold anything other than 8-bit octets.

We also made the design so it was backward compatible with netstrings. This lets us use it to directly parse a zeromq message from anyone, and it will work whether it's a TNetString-style nested structure, or just a string with JSON in it.

The end result is a simple specification at http://tnetstrings.org which encodes a naive parser that anyone can copy to other languages easily. Many other people implemented the protocol and it looks like you can do it in every language in about 100 lines of code. Implementing a version with more performance (since every language needs tricks) seems to take about 500-1000 lines of code.
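To give a feel for the format, here is a minimal sketch of a tagged netstring encoder and parser in Python 3. It is not the implementation shipped with Mongrel2, and the type tags it uses (',' string, '#' integer, '!' boolean, '~' null, ']' list, '}' dict) follow my reading of the published specification, so check them against tnetstrings.org before relying on this. Floats are left out for brevity.

```python
def dump(value) -> bytes:
    """Encode a Python value as SIZE:DATA followed by a one-byte type tag."""
    if value is None:
        data, tag = b"", b"~"
    elif isinstance(value, bool):          # checked before int: bool is an int
        data, tag = (b"true" if value else b"false"), b"!"
    elif isinstance(value, int):
        data, tag = str(value).encode("ascii"), b"#"
    elif isinstance(value, bytes):
        data, tag = value, b","
    elif isinstance(value, list):
        data, tag = b"".join(dump(v) for v in value), b"]"
    elif isinstance(value, dict):
        data, tag = b"".join(dump(k) + dump(v) for k, v in value.items()), b"}"
    else:
        raise TypeError("cannot encode %r" % type(value))
    return str(len(data)).encode("ascii") + b":" + data + tag


def parse(buf: bytes):
    """Decode the first element in buf and return (value, remainder)."""
    size, _, rest = buf.partition(b":")
    length = int(size)
    data, tag, remainder = rest[:length], rest[length:length + 1], rest[length + 1:]
    if tag == b",":
        return data, remainder
    if tag == b"#":
        return int(data), remainder
    if tag == b"!":
        return data == b"true", remainder
    if tag == b"~":
        return None, remainder
    if tag == b"]":
        items = []
        while data:                        # elements are packed back to back
            item, data = parse(data)
            items.append(item)
        return items, remainder
    if tag == b"}":
        result = {}
        while data:                        # alternating key, value elements
            key, data = parse(data)
            result[key], data = parse(data)
        return result, remainder
    raise ValueError("unknown type tag %r" % tag)


encoded = dump({b"PATH": b"/chat", b"ids": [1, 2]})
print(encoded)
print(parse(encoded))
```

A plain netstring such as 2:HI, goes through the same parse function and comes back as a byte string, which is the backward compatibility the text describes.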

Mongrel2 now supports either TNetStrings or JSON as defined above, on the fly, and without any modification to existing handlers. Internally, Mongrel2 uses TNetStrings to create its internal control port protocol, which makes working with Mongrel2 programmatically even easier. To demonstrate this, here's the new code for parsing a request in Python:

Source 34 Parsing TNetStrings Requests In Python

    import json
    from mongrel2 import tnetstrings

    def parse(msg):
        sender, conn_id, path, rest = msg.split(' ', 3)
        headers, rest = tnetstrings.parse(rest)
        body, _ = tnetstrings.parse(rest)

        # A plain netstring comes back as a str, so it still holds JSON.
        if type(headers) is str:
            headers = json.loads(headers)

        # Request is the class defined in examples/python/mongrel2/request.py.
        return Request(sender, conn_id, path, headers, body)

Our tests also show that TNetStrings are a good compromise between speed and ease of parsing. They're hard to get wrong in parsing, easy to write out, and faster than many other protocols out there. The few that are faster are also much, much harder to parse and more error prone. In our tests, we've found that TNetStrings in Python can be faster than Python's own pickle format when we use a C extension.

The most important point about TNetStrings, though, is how it opens up Mongrel2 for even more control and automation.

5.3.8 Python Handler API

Instead of building all of this yourself, I've created a Python library that wraps all this up and makes it easy to use. Each of the other libraries is designed around the same idea and should have a similar design.

To check out how to use the Python API, we'll take a look at each of the demos that are available. These are the same demos you ran in the previous section to create a sample deployment. For the Python API, you may want to start by looking at two very small files that you should be able to understand quickly: examples/python/mongrel2/request.py and examples/python/mongrel2/handler.py.

5.4 Basic Handler Demo


The most basic handler you can write is in the examples/http_0mq/http.py file and it's just the simplest thing possible:[3]

Source 35 http.py example

    from mongrel2 import handler
    import json
    from uuid import uuid4

    # ZMQ 2.1.x broke how PUSH/PULL round-robin works so each process
    # needs its own id for it to work
    sender_id = uuid4().hex

    conn = handler.Connection(sender_id, "tcp://127.0.0.1:9997",
                              "tcp://127.0.0.1:9996")

    while True:
        print "WAITING FOR REQUEST"

        req = conn.recv()

        if req.is_disconnect():
            print "DISCONNECT"
            continue

        if req.headers.get("killme", None):
            print "They want to be killed."
            response = ""
        else:
            response = "<pre>\nSENDER: %r\nIDENT: %r\nPATH: %r\nHEADERS: %r\nBODY: %r</pre>" % (
                req.sender, req.conn_id, req.path,
                json.dumps(req.headers), req.body)

        print response

        conn.reply_http(req, response)

[3] This is the same code as the original file, but with extraneous prints removed for simplicity.

All this code does is print back a simple little dump of what it received, and it's not even a valid HTML document. Let's walk through everything that's going on:

1. Import the handler module from mongrel2 and json. The json module is really only used for logging.
2. Establish the UUID for our handler, and create a connection. It's not really a connection but more of a virtual circuit that you can just pretend is a connection. It's using all ZeroMQ and the protocol we just described to create a simple API to use.
3. Go into a while loop forever and recv request objects off the connection.
4. One type of special message we can get from Mongrel2 is a disconnect message, which tells you that one of the listeners you tried to talk to was closed. You should either ignore those and read another, or update any internal state you may have. They can come asynchronously, and for the most part you can ignore them unless you need to keep track of open connections as in, say, a chat application or streaming.
5. Craft the reply you're going to send back, which is just a dump of what you received.
6. Send this reply back to Mongrel2. Notice the subtle difference where you include the req object as part of how you reply? This is the major difference between this API and more traditional request/response APIs: you need the request you are responding to so that it knows where to send things. In a normal socket-based server this is just assumed to be the socket you're talking about.

This is all you need at first to do simple HTTP handlers. In reality, the reply_http method is just syntactic sugar on crafting a decent HTTP response. Here's the actual method that is crafting these replies:

Source 36 HTTP Response Python Code

    def http_response(body, code, status, headers):
        payload = {'code': code, 'status': status, 'body': body}
        headers['Content-Length'] = len(body)
        payload['headers'] = "\r\n".join('%s: %s' % (k, v) for k, v in
                                         headers.items())

        return HTTP_FORMAT % payload

Which is then used by Connection.reply_http and Connection.deliver_http to send an actual HTTP response. That means all this is doing is creating the raw bytes you want to go to the real browser, and how it's delivered is irrelevant. For example, the deliver_http method means that, yes, you can have one handler send a single response to target multiple browsers at once.
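The http_response function depends on an HTTP_FORMAT template that is not shown here. As a runnable sketch for Python 3, the version below supplies one; the template string is an assumption about what such a format would look like, not a copy of the library's constant, and the default arguments are added for convenience.

```python
# Assumed template: status line, header lines, blank line, body.
HTTP_FORMAT = "HTTP/1.1 %(code)s %(status)s\r\n%(headers)s\r\n\r\n%(body)s"


def http_response(body, code=200, status="OK", headers=None):
    """Render a complete HTTP response as one string."""
    headers = dict(headers or {})          # copy so the caller's dict is untouched
    headers["Content-Length"] = len(body)  # character count; fine for ASCII bodies
    payload = {
        "code": code,
        "status": status,
        "body": body,
        "headers": "\r\n".join("%s: %s" % (k, v) for k, v in headers.items()),
    }
    return HTTP_FORMAT % payload


print(repr(http_response("hi", headers={"Content-Type": "text/plain"})))
```

The result is just a string of bytes-to-be, so the same value can be handed to a reply aimed at one connection or at many.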

5.5 Async File Upload Demo


Mongrel2 uses an asynchronous method of doing uploads that helps you avoid receiving files you either can't accept or shouldn't accept. It does this by sending your handler an initial message with just the headers, streaming the file to disk, and then sending a final message so you can read the resulting file. If you don't want the upload, then you can send a kill message (a 0-length message) and the connection closes, and the file never lands.

The upload mechanism works entirely on content length, and whether the file is larger than limits.content_length. This means if you don't want to deal with this for most form uploads, then just set limits.content_length high enough and you won't have to. However, if you want to handle file uploads or large requests, then you add the setting upload.temp_store with a mkstemp-compatible path like /tmp/mongrel2.upload.XXXXXX, with the XXXXXX chars being replaced with random characters. It doesn't have to be /tmp either, and can be any store you want: network disk, anything.

Here's an example handler in examples/http_0mq/upload.py that shows you how to do it:

Source 37 Async Upload Example

    from mongrel2 import handler
    try:
        import json
    except:
        import simplejson as json
    import hashlib

    sender_id = "82209006-86FF-4982-B5EA-D1E29E55D481"

    conn = handler.Connection(sender_id, "tcp://127.0.0.1:9997",
                              "tcp://127.0.0.1:9996")

    while True:
        print "WAITING FOR REQUEST"

        req = conn.recv()

        if req.is_disconnect():
            print "DISCONNECT"
            continue

        elif req.headers.get('x-mongrel2-upload-done', None):
            # Final message: both headers must name the same temp file.
            expected = req.headers.get('x-mongrel2-upload-start', "BAD")
            upload = req.headers.get('x-mongrel2-upload-done', None)

            if expected != upload:
                print "GOT THE WRONG TARGET FILE: ", expected, upload
                continue

            body = open(upload, 'r').read()
            print "UPLOAD DONE: BODY IS %d long, content length is %s" % (
                len(body), req.headers['content-length'])

            response = "UPLOAD GOOD: %s" % hashlib.md5(body).hexdigest()

        elif req.headers.get('x-mongrel2-upload-start', None):
            # Initial message: the file is still streaming to disk.
            print "UPLOAD starting, don't reply yet."
            print "Will read file from %s." % req.headers.get('x-mongrel2-upload-start', None)
            continue

        else:
            response = "<pre>\nSENDER: %r\nIDENT: %r\nPATH: %r\nHEADERS: %r\nBODY: %r</pre>" % (
                req.sender, req.conn_id, req.path,
                json.dumps(req.headers), req.body)

        print response

        conn.reply_http(req, response)

You can test this with something like the following to upload a big file:

    curl -T tests/config.sqlite http://localhost:6767

What's happening is the following process:

1. Mongrel2 receives a request from a browser (or curl in this case) that is greater than limits.content_length in size. It actually doesn't read all of it yet, only about 2k.
2. Mongrel2 looks up the upload.temp_store setting and makes a temp file there to write the contents. If you don't have this setting then it aborts and returns an error to the browser.
3. Mongrel2 sees that the request is for a Handler, so it crafts an initial request message. This request message has all the original headers, plus an X-Mongrel2-Upload-Start header with the path of the expected tmpfile you will read later.
4. Your handler receives this message, which has no actual content, but has the original content length, all the headers, and this new header to indicate an upload is starting.
5. At this point, your handler can decide to kill the connection by simply responding with a kill message, or even with a valid HTTP error response and then a kill message.
6. Otherwise your handler does nothing, and Mongrel2 is already streaming the file into the designated tmpfile for this upload.
7. When the upload is finally saved to the file, Mongrel2 adds a new header of X-Mongrel2-Upload-Done set to the same file as the first header. Remember that both headers are in this final request.
8. Your handler then gets this final request message that has both the X-Mongrel2-Upload-Start and X-Mongrel2-Upload-Done headers, which you can then use to read the upload contents. You should also make sure the headers match to prevent someone forging completed uploads.
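The decision a handler makes at each of these steps can be isolated into a small function. This sketch is for Python 3 and only inspects a headers dict; the function name upload_stage and its return labels are invented here, while the header names are the lowercase forms that upload.py checks.

```python
def upload_stage(headers: dict) -> str:
    """Classify a request by its Mongrel2 upload headers.

    Returns one of: "normal", "start", "done", "forged".
    """
    start = headers.get("x-mongrel2-upload-start")
    done = headers.get("x-mongrel2-upload-done")
    if done:
        # The final message carries both headers, and they must name the same file.
        return "done" if start == done else "forged"
    if start:
        return "start"   # upload in progress: stay quiet or send a kill message
    return "normal"


tmp = "/tmp/mongrel2.upload.Ab12Cd"   # hypothetical temp file name
print(upload_stage({}))
print(upload_stage({"x-mongrel2-upload-start": tmp}))
print(upload_stage({"x-mongrel2-upload-start": tmp, "x-mongrel2-upload-done": tmp}))
print(upload_stage({"x-mongrel2-upload-done": tmp}))
```

Keeping the check in one place makes it harder to forget the header comparison from step 8 when the handler grows.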

Note 9 Watch The chroot Too

Remember, when you run Mongrel2 it will store the file relative to its chroot setting. In testing you probably aren't running Mongrel2 as root so it works fine. You just then have to make sure that your handler knows to look for the file in the same place. So if you have /var/www/mongrel2.org for your chroot and /uploads/file.XXXXXX for the upload path, then the actual file will be in /var/www/mongrel2.org/uploads/file.XXXXXX. The good thing is you can read the config database in your handlers and find out all this information as well.
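Working out where the file really lives is a one-line path join. A sketch for Python 3, where resolve_upload is a hypothetical helper and the chroot and upload paths are the ones from the note above.

```python
import posixpath


def resolve_upload(chroot: str, upload_path: str) -> str:
    """Map a path as seen inside the chroot to the real path outside it."""
    # Strip the leading slash so join() does not throw away the chroot prefix.
    return posixpath.join(chroot, upload_path.lstrip("/"))


print(resolve_upload("/var/www/mongrel2.org", "/uploads/file.XXXXXX"))
```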

5.6 MP3 Streaming Demo


The next example is a very simple and, well, kind of poorly implemented MP3 streaming demo that uses the ICY protocol. ICY is a really lame protocol that was obviously designed before HTTP was totally baked and probably by people who don't really get HTTP. It works in an odd way of having meta-data sent at specific sized intervals so the client can display an update to the meta-data.

The mp3streamer demo creates a streaming system by having a thread that receives requests for connections, and then another thread that sends the current data to all currently connected clients. Rather than go through all the code, you can take a look at the main file and see how simple it is once you get the streaming thread right:

Source 38 Base mp3stream Code

    from mp3stream import ConnectState, Streamer
    from mongrel2 import handler
    import glob

    sender_id = "9703b4dd-227a-45c4-b7a1-ef62d97962b2"

    CONN = handler.Connection(sender_id, "tcp://127.0.0.1:9995",
                              "tcp://127.0.0.1:9994")

    STREAM_NAME = "Mongrel2 Radio"

    MP3_FILES = glob.glob("*.mp3")
    print "PLAYING:", MP3_FILES

    CHUNK_SIZE = 8 * 1024

    STATE = ConnectState()

    STREAMER = Streamer(MP3_FILES, STATE, CONN, CHUNK_SIZE, sender_id)
    STREAMER.start()

    HEADERS = {'icy-metaint': CHUNK_SIZE, 'icy-name': STREAM_NAME}

    while True:
        req = CONN.recv()

        if req.is_disconnect():
            print "DISCONNECT", req.headers, req.body, req.conn_id
            STATE.remove(req)
        else:
            print "REQUEST", req.headers, req.body

            if STATE.count() > 20:
                print "TOO MANY", STATE.count()
                CONN.reply_http(req, "Too Many Connected. Try Later.")
            else:
                STATE.add(req)
                CONN.reply_http(req, "", headers=HEADERS)

Walking through this example is fairly easy, assuming you just trust that the streaming thread stuff works:

1. It starts off just like the handler test.
2. We figure out what .mp3 files are in the current directory.
3. Establish a data chunk size of 8k for the ICY protocol and make a ConnectState and Streamer from that. These are the streaming thread things found in mp3stream.py in the same directory.
4. We then loop forever, accepting requests.
5. Unlike the handler, we want to remove disconnected clients, so we take them out of the STATE when we are notified.
6. If we have too many connected clients (more than 20), we reply with a failure.
7. Otherwise, we add them to the STATE and then send the initial ICY protocol header to get things going.

That is the base of it, and if you point mplayer at it (which is the only player that works, really) you should hear it play:

    mplayer http://localhost:6767/mp3stream

That is, assuming you put some mp3 files into the directory and started the handler again.

For more on how the actual state and the protocol work, go look at mp3stream.py. Explaining it is far outside the scope of this manual, but the key point to realize is that this is one thread that's targeting randomly connected clients with a single message to the Mongrel2 server and streaming it.
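For a rough idea of what the streaming thread has to produce, the sketch below interleaves a metadata block after each chunk of audio, which is how the ICY convention is commonly described: one length byte counting 16-byte units, followed by a NUL-padded string such as StreamTitle='...';. This is written for Python 3 from that general description, not taken from mp3stream.py, and the function names are made up.

```python
def icy_metadata(title: str) -> bytes:
    """Build one metadata block: a length byte, then text padded to 16-byte units."""
    text = ("StreamTitle='%s';" % title).encode("utf-8")
    units = (len(text) + 15) // 16            # round up to whole 16-byte units
    if units > 255:
        raise ValueError("metadata too long for a single length byte")
    return bytes([units]) + text.ljust(units * 16, b"\0")


def icy_frames(audio: bytes, chunk_size: int, title: str):
    """Yield chunk_size bytes of audio followed by a metadata block, repeatedly."""
    # Only whole chunks are emitted; a trailing partial chunk is dropped here.
    for start in range(0, len(audio) - chunk_size + 1, chunk_size):
        yield audio[start:start + chunk_size] + icy_metadata(title)


block = icy_metadata("Mongrel2 Radio")
print(len(block), block[0])
```

The chunk_size passed to icy_frames would need to match the icy-metaint value sent in the initial response headers, since that is how the client knows where each metadata block begins.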

5.7 Chat Demo


The chat demo is the most involved demonstration, and I'm kind of getting tired of leading you by the hand, so you go read the code. Here's where to look:

JavaScript
    Look at /examples/chat/static/*.js for the goodies. The key is to see how chat.js works with the JSSocket stuff, and then look at how I did app.js using fsm.js.
Python
    Look at the /examples/chat/chat.py file to see how the chat states are maintained and how messages are sent around.
config
    The configuration you created in the last chapter actually works with the demo, and if you've been following along you should have tested it.

Hopefully, you can figure it out from the code, but if not, let me know.

5.8 Writing A Filter (BETA)


Mongrel2 v1.8.0 added the Filter system, which lets you intercept the Mongrel2 state machine and fully control how it operates. It's still a very new feature, but there's a simple piece of demo code you can look at to see how filters work. You should also check out how to configure them in the Managing section. Let's just take a look at the code for the tools/filters/null.c filter.

Source 39: The Basic null Filter

    #include <filter.h>
    #include <dbg.h>
    #include <tnetstrings.h>

    StateEvent filter_transition(StateEvent state, Connection *conn, tns_value_t *config)
    {
        size_t len = 0;
        char *data = tns_render(config, &len);

        if(data != NULL) {
            log_info("CONFIG: %.*s", (int)len, data);
        }

        free(data);

        return CLOSE;
    }

    StateEvent *filter_init(Server *srv, bstring load_path, int *out_nstates)
    {
        StateEvent states[] = {HANDLER, PROXY};
        *out_nstates = Filter_states_length(states);
        check(*out_nstates == 2, "Wrong state array length.");

        return Filter_state_list(states, *out_nstates);

    error:
        return NULL;
    }

In this code you are basically creating a .so file that Mongrel2 will load on the fly when told to. How it works is you make two functions, always named filter_init and filter_transition. The filter_init function sets up a simple array that lists all of the events (found in src/events.h) that you want to have your filter triggered on. It's important that you use the Filter_state_list function to return the actual list or else you'll get the memory allocation wrong.

Mongrel2 will load this null.so, call the filter_init function, and wire it up for each of the events you indicate. Next, when a request comes in, the server will go through each event that triggers, and call your filter_transition function. This function will get the StateEvent that is about to happen, the Connection it's happening on, and finally, the config that the user set in their config.sqlite database.


All your filter_transition function has to do is use the Mongrel2 APIs to do what it needs, alter the Connection, and work with the config to get its work done. When it's done, it can then return the next state event that Mongrel2 should work with instead of what you were handed (or just return the same one if you aren't changing how Mongrel2 works). That's all there is to it for now. Later releases will include more filters that you can load, along with example code to study.
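If the init/transition split is hard to picture, here is a toy model of it in Python. It is only a conceptual sketch: real filters are C shared objects, and nothing here is a Mongrel2 API. FilterRegistry, load, and run are hypothetical names, the event constants are plain strings standing in for the values in src/events.h, and the rule that dispatch stops at the first filter that changes the event is an assumption of this sketch, not something taken from the Mongrel2 source.

```python
from collections import defaultdict

# Plain-string stand-ins for a few of the events in src/events.h.
HANDLER, PROXY, CLOSE = "HANDLER", "PROXY", "CLOSE"


class FilterRegistry(object):
    """Toy model of wiring a filter to events; not Mongrel2 code."""

    def __init__(self):
        self._by_event = defaultdict(list)

    def load(self, init, transition, config):
        # Like filter_init: ask the filter which events it wants,
        # then wire its transition function to each of them.
        for event in init():
            self._by_event[event].append((transition, config))

    def run(self, event, conn):
        # Like calling filter_transition: each wired filter sees the
        # event about to happen and returns the event to use instead.
        for transition, config in self._by_event.get(event, []):
            result = transition(event, conn, config)
            if result != event:
                return result  # assumption: first change wins
        return event


def null_filter_init():
    return [HANDLER, PROXY]


def null_filter_transition(state, conn, config):
    # The null filter ignores its input and always asks for a close.
    return CLOSE
```

Loading the null filter into this model and running a HANDLER or PROXY event through it gives back CLOSE, while any event the filter did not register for passes through unchanged.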

5.9 Other Language APIs


There are at least 10 languages available for Mongrel2, so check out the main mongrel2.org site for the full list. If you want to implement another language, it should be fairly trivial. Just base your design on the Python API so that it is consistent, but, please, don't be a slave to the Python design if it doesn't fit the chosen language; creating a direct translation of the Python is fine at first, but try to make it idiomatic after that so people who use that language feel at home and it's easy for them.

5.10 Writing Your Own m2sh


The very last thing I will cover in the section on hacking Mongrel2 is how to write your own m2sh script in your favorite language. Obviously, if you're doing this you should probably have a good reason.[4] What writing your own m2sh, or at least understanding what m2sh is doing, will do for you is help when you start to think about automating Mongrel2 for your deployments.

Hopefully, I have motivated you to automate, automate, automate. This is why we write software. If I wanted to do stuff manually I'd go play guitars or juggle. I write software because I want a computer to do things for me, and nothing needs this more than managing your systems. This is why Mongrel2 is designed the way it is, using the MVC model. It lets you create your own View like m2sh, web interfaces, automation scripts, and anything else you need to make it easier to manage more.

If you want to write your own m2sh then first go have a look at the Python code in examples/python/config and the m2shpy script that it installs. This is where each command lives, where the argument parsing is and, most importantly, the ORM model that works the raw SQLite database.

[4] Like if you're a Ruby weenie and C is banned at your company because they like dogma more than money.


The next thing to do is to make your tool craft databases and compare the results to what m2sh does for a similar configuration. I recommend you make a database that's correct with m2sh, and then dump it via sqlite3. After that, use your tool to make your own database, dump it, and then use diff to compare your results to mine.

You can also look at how the C version of m2sh that is installed by default is written. It lives in tools/m2sh and has a completely different design but does nearly the same things. If you know C, then comparing the two is also educational.

Finally, you'll need to look at two base schema files: src/config/config.sql and src/config/mimetypes.sql, where the database schema is created and the large list of mimetypes that Mongrel2 knows is stored.[5] Your tool should be able to use this SQL to make its database, or at least know what it does. If you do something cool with all of this, let us know.
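The dump-and-diff comparison described above is easy to script once you get tired of doing it by hand. Here is a minimal sketch using only Python's standard sqlite3 and difflib modules. The function names are mine, and the one-table schema in the demo is a toy made up for illustration, not the real schema from src/config/config.sql; point it at your actual database files instead.

```python
import difflib
import sqlite3


def dump_sql(conn):
    """Return the SQL text dump of a database, one statement per line."""
    return list(conn.iterdump())


def diff_databases(reference, candidate):
    """Unified diff between two database dumps; empty list if identical."""
    return list(difflib.unified_diff(
        dump_sql(reference), dump_sql(candidate),
        fromfile="m2sh", tofile="mytool", lineterm=""))


def make_toy_db(port):
    # Toy schema for the demo only; the real one is in config.sql.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE server (id INTEGER PRIMARY KEY, name TEXT, port INTEGER)")
    conn.execute("INSERT INTO server (name, port) VALUES (?, ?)", ("main", port))
    conn.commit()
    return conn


if __name__ == "__main__":
    # With real files you would use sqlite3.connect("config.sqlite") etc.
    for line in diff_databases(make_toy_db(6767), make_toy_db(6768)):
        print(line)
```

An empty diff means your tool produced the same rows as m2sh. Anything else shows you exactly which INSERT statements differ.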

5.11 Config From Anything: Experimental


As of v1.7 Mongrel2 has the ability to configure itself directly from a loadable module that you can define. The feature is very new and probably not safe to use quite yet, but I'm documenting it here so that people can start playing with it and then give me feedback on how to use it.

The first thing to look at is the null.so module in tools/config_modules/null.c, which lays out a bare config module that automatically fails. This module was used in unit testing to make sure that Mongrel2 handles some simple invalid inputs to the configuration system. Here's the code to the module (Source 40).

You can then get Mongrel2 to load this module directly by passing it as a fourth parameter to the mongrel2 executable (Source 41).

In this run, Mongrel2 detected that you gave it a fourth option and loaded that as the module to use for configuring itself. Normally it just assumes a sqlite3 database, but now it's going to defer everything to the null.c code. It also passes the 2nd parameter (the path) and 3rd (the UUID) to the module for the operations it needs to do. Mongrel2 also doesn't enforce anything for these strings other than that they were arguments, so you don't have to use any real paths or UUIDs so long as your module can return the right data.

What you then have to do to make your own config module is:

1. Copy the null.c file to a new file in tools/config_modules.
[5] Incidentally, if you want to add one, that's the table to put it in.

Source 40: The null Config Module

    /**
     *
     * Copyright (c) 2010, Zed A. Shaw and Mongrel2 Project Contributors.
     * All rights reserved.
     *
     * Redistribution and use in source and binary forms, with or without
     * modification, are permitted provided that the following conditions are
     * met:
     *
     *     * Redistributions of source code must retain the above copyright
     *       notice, this list of conditions and the following disclaimer.
     *
     *     * Redistributions in binary form must reproduce the above copyright
     *       notice, this list of conditions and the following disclaimer in the
     *       documentation and/or other materials provided with the distribution.
     *
     *     * Neither the name of the Mongrel2 Project, Zed A. Shaw, nor the names
     *       of its contributors may be used to endorse or promote products
     *       derived from this software without specific prior written
     *       permission.
     *
     * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
     * IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
     * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
     * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
     * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
     * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
     * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
     * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
     * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
     * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
     * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
     */

    #include <filter.h>
    #include <dbg.h>
    #include <config/module.h>
    #include <config/db.h>

    struct tagbstring GOODPATH = bsStatic("goodpath");

    int config_init(const char *path)
    {
        if(biseqcstr(&GOODPATH, path)) {
            log_info("Got the good path.");
            return 0;
        } else {
            log_info("Got the bad path: %s", path);
            return -1;
        }
    }

    void config_close()
    {
    }

    tns_value_t *config_load_handler(int handler_id)
    {
        return NULL;
    }

Source 41: Loading The null Config

    mongrel2 goodpath 2f62bd5-9e59-49cd-993c-3b6013c28f05 /usr/local/lib/mongrel2/config_modules/n

    # OUTPUT:
    #[INFO] (src/mongrel2.c:320) Using configuration module /usr/local/lib/mongrel2/config_modules
    #[INFO] (null.c:11) Got the good path.
    #[ERROR] (src/config/config.c:366: errno: None) Wrong type, expected valid rows.
    #[ERROR] (src/mongrel2.c:124: errno: None) Failed to load global settings.
    #[ERROR] (src/mongrel2.c:326: errno: None) Aborting since can't load server.
    #[ERROR] (src/mongrel2.c:362: errno: None) Exiting due to error.

    mongrel2 badpath 2f62bd5-9e59-49cd-993c-3b6013c28f05 /usr/local/lib/mongrel2/config_modules/nu

    #[INFO] (src/mongrel2.c:320) Using configuration module /usr/local/lib/mongrel2/config_modules
    #[INFO] (null.c:14) Got the bad path: badpath
    #[ERROR] (src/mongrel2.c:121: errno: None) Failed to load config database at badpath
    #[ERROR] (src/mongrel2.c:326: errno: None) Aborting since can't load server.
    #[ERROR] (src/mongrel2.c:362: errno: None) Exiting due to error.

2. Add your .so to the list of ones to build in tools/config_modules/Makefile.
3. Run make to confirm that it builds, then sudo make install to make sure it shows up in $PREFIX/lib/mongrel2/config_modules.
4. Start making each function return the right tns_value_t * results that it needs. Look at src/config/module.c for what is currently being used.
5. Look at tests/config_tests.c:test_Config_load_module and write a similar unit test to make sure it works right.

Finally, the protocol that's being used is basically a translation of the sqlite3 tables defined in the src/config/config.sql schema into a TNetString data type that Mongrel2 can understand. The queries are checked for every error I could think up, and you should get meaningful error messages about column types. When in doubt, just look at src/config/module.c to see how it's being done and then replicate it exactly.
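To get a feel for what "sqlite3 tables as a TNetString" might look like on the wire, here is a small encoder sketch in Python. The encoding rules (length, colon, payload, one-character type tag) follow the published TNetstrings format; tns_dump is my own name for illustration, and the shape of the rows in the example (a list of rows, each row a list of column values) is an assumption. The exact layout each function must return is whatever src/config/module.c checks for. Floats are left out to keep it short.

```python
def tns_dump(value):
    """Encode a Python value as a TNetString (bytes). Floats not handled."""
    if value is None:
        payload, tag = b"", b"~"
    elif isinstance(value, bool):  # must come before int: bool is an int
        payload, tag = (b"true" if value else b"false"), b"!"
    elif isinstance(value, int):
        payload, tag = str(value).encode("ascii"), b"#"
    elif isinstance(value, bytes):
        payload, tag = value, b","
    elif isinstance(value, str):
        payload, tag = value.encode("utf-8"), b","
    elif isinstance(value, (list, tuple)):
        payload, tag = b"".join(tns_dump(v) for v in value), b"]"
    elif isinstance(value, dict):
        payload = b"".join(tns_dump(k) + tns_dump(v) for k, v in value.items())
        tag = b"}"
    else:
        raise TypeError("cannot encode %r as a TNetString" % (value,))
    # The length prefix counts payload bytes only, not the type tag.
    return str(len(payload)).encode("ascii") + b":" + payload + tag


if __name__ == "__main__":
    # Hypothetical result set: one row of (id, name, port).
    rows = [[1, "main", 6767]]
    print(tns_dump(rows))
```

The column types matter: an integer column has to go out with the # tag and a text column with the comma tag, which is the kind of mismatch the "Wrong type" errors in the run above are complaining about.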


Note 10: m2sh configuration run

You're On Your Own

There's also a way to run the same command using m2sh, but it's mostly a convenience to get you started. If you're doing your own configuration system it's assumed that you probably aren't using m2sh and have written your own. In order to make m2sh work with your config, we'd have to alter m2sh quite a lot and turn it into a generic "query the config" tool. That might happen, but it's not there yet. Rather than confuse the issue, I'll skip documenting it until a later release when it's more robust.

Chapter 6

Contributing
You have gone through a complete description of all the features that Mongrel2 has right now, but not all the features it will have. I tend to write small software that does exactly what it needs to do, and Mongrel2 is no different. What you see here is the majority of what you need right now, and we'll be slowly adding things people need. If you'd like to help, then join the mongrel2@librelist.com mailing list and then read the Contributor Instructions.

Thanks,
Zed
