
Raspberry Pi Cluster

One popular use of Raspberry Pi computers is building clusters. Raspberry Pis are small and inexpensive, so it's easier to build a cluster with them than it would be with PCs. A cluster of Raspberry Pis would have to be quite large to compete with a single PC; you'd probably need about 20 Pis to produce a cluster with as much computing power as one PC. Although a Pi cluster may not be that powerful, it's a great opportunity to learn about distributed computing.

There are several different types of distributed computing systems, each suited to different purposes. Supercomputers are used for solving mathematical problems like modelling weather conditions or simulating chemical reactions, and these systems often use the Message Passing Interface (MPI). A team at the University of Southampton built a 64-node MPI-based supercomputer from Raspberry Pis, which is used for teaching students about supercomputing.

Another technology that's often used in distributed computing is Hadoop, which distributes data across many nodes. Hadoop is often used for processing large datasets and data mining. An engineer at Nvidia built a small Hadoop cluster using Raspberry Pis. He uses his cluster to experiment and test ideas before deploying them on more powerful systems.

Using a Raspberry Pi cluster as a web server

Clusters can be used as web servers. Many web sites get too much traffic to run on a single server, so several servers have to be used. Requests from web browsers are received by a node called a load balancer, which forwards requests to worker servers. The load balancer then forwards responses from servers back to the clients.

This site is now hosted on a Raspberry Pi cluster. The worker nodes are standard web servers that contain identical content. I just installed Apache on them and copied my site to each node.

I use an extra Raspberry Pi to host a development copy of this site, and to control the cluster. This Pi is connected to my local network via wifi, so I can access the development copy of my site from my laptop.

The extra Pi also has an ethernet connection to the Pi cluster. When I want to update my site, I can transfer changes from the development site to the live site on the cluster. Site updates are put into .tar.gz files which the worker nodes automatically download from the development site. Once downloaded, updates are then unpacked into the local file system.
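
Here's a minimal sketch of the kind of script each worker might run; the development Pi's address, the bundle name and the paths are illustrative, not the real ones:

#!/bin/bash
# update-site.sh - fetch the latest site bundle from the development Pi
# and unpack it into Apache's document root. All names are examples.
set -e
DEV_PI="192.168.1.250"               # hypothetical address of the development Pi
BUNDLE="site-latest.tar.gz"          # hypothetical name of the update bundle
wget -q "http://$DEV_PI/updates/$BUNDLE" -O "/tmp/$BUNDLE"
tar -xzf "/tmp/$BUNDLE" -C /var/www  # unpack over the live document root
rm "/tmp/$BUNDLE"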

Configuring the Raspberry Pi servers

All of the Pis in this system are headless. I can log into the Pi with the development site using the Remote Desktop Protocol, and from that Pi I can log into the worker Pis using SSH.

All the Pis in the cluster use a static IP address. In a larger cluster it would probably be better to set up a DHCP server on the load balancer. The IP addresses used in the cluster are on the 192.168.1.xxx subnet.
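
On Raspbian, a static address is set in /etc/network/interfaces. A worker node's entry would look something like this (the address shown is just an example; the gateway is the load balancer's cluster-side interface):

auto eth0
iface eth0 inet static
    address 192.168.1.2
    netmask 255.255.255.0
    gateway 192.168.1.1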

For each worker Pi, I set up a 4GB SD card using the latest version of Raspbian. In raspi-config I set the following options:

  • expand fs
  • set the hostname
  • set the password
  • set memory split to 16MB for the GPU
  • overclock the CPU to 800MHz
  • enable ssh

On each card I installed Apache and some libraries required by my CMS: libxml2 and python-libxml2. I used this command to enable mod_rewrite, which is also required by my CMS:

$ sudo a2enmod rewrite
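
For reference, Apache and the libraries can be installed in one step with Raspbian's package manager:

$ sudo apt-get install apache2 libxml2 python-libxml2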

Finally, I copied some scripts onto each SD card which allow each Pi to synchronize its contents with the development Pi. In a larger cluster it would be worth creating an SD card image with all of these modifications made in advance.

Building a load balancer

The load balancer must have two network interfaces: one to receive requests from a router, and another to forward requests to the server cluster. The nodes in the cluster are on a different subnet than the rest of the network, so the IP address of the load balancer's second interface must be on the same subnet as the rest of the cluster. The load balancer's first interface has IP address 192.168.0.3, while the second interface's IP address is 192.168.1.1. All the Pis in the cluster have IP addresses on the 192.168.1.xxx subnet.

I built my load balancer using an old PC with 512MB of RAM and a 2.7GHz x86 CPU. I added a second PCI ethernet card and installed Lubuntu, a lightweight version of Ubuntu. I was going to install Ubuntu, but this PC is pretty old, so Lubuntu is a better fit. I used a PC because I wasn't sure if a single Pi would be powerful enough to act as a load balancer, and a Pi only has one ethernet connection. I want both of the load balancer's network connections to be ethernet for improved performance and stability.

Note that IP forwarding is not enabled. The load balancer isn't a router; it should only forward HTTP requests, not every IP packet that it receives.
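
You can check this with sysctl; a value of 0 means the kernel won't forward packets between the two interfaces:

$ sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 0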

Setting up the load balancer software

There are many different software implementations of load balancing. I used Apache's load balancer module because it's easy to set up. First I made sure my PC's OS was up to date:

sudo apt-get update
sudo apt-get upgrade

Then I installed Apache:

sudo apt-get install apache2

These Apache modules need to be enabled:

sudo a2enmod proxy
sudo a2enmod proxy_http
sudo a2enmod proxy_balancer

The next step is to edit /etc/apache2/sites-available/default in order to configure the load balancer. The proxy module is needed for HTTP forwarding, but it's best not to allow your server to behave as a proxy. Spammers and hackers often use other people's proxy servers to hide their IP address, so it's important to disable this feature by adding this line:

ProxyRequests off

Although proxy requests are disabled, the proxy module is still enabled and acts as a reverse proxy. Next, define the cluster and its members by adding this code:

<Proxy balancer://rpicluster>
    BalancerMember http://192.168.1.2:80
    BalancerMember http://192.168.1.3:80
    BalancerMember http://192.168.1.4:80
    BalancerMember http://192.168.1.5:80
    AllowOverride None
    Order allow,deny
    allow from all
    ProxySet lbmethod=byrequests
</Proxy>

Balancer manager interface

The balancer module has a web interface that makes it possible to monitor the status of the back-end servers and configure their settings. You can enable the web interface by adding this code to /etc/apache2/sites-available/default:

<Location /balancer-manager>
    SetHandler balancer-manager
    Order allow,deny
    allow from 192.168.0
</Location>

It's also necessary to instruct Apache to handle requests to the /balancer-manager page locally instead of forwarding these requests to a worker server. All other requests are forwarded to the cluster defined above.

ProxyPass /balancer-manager !
ProxyPass / balancer://rpicluster/

Once these changes have been saved, Apache should be restarted with this command:

$ sudo /etc/init.d/apache2 restart

When I open a browser and go to http://192.168.0.3, I see the front page of my web site. If I go to http://192.168.0.3/balancer-manager, I see the balancer manager status page.

The last step in getting the cluster online is adjusting the port forwarding settings in my router. I just needed to set up a rule forwarding HTTP traffic (port 80) to 192.168.0.3.

Here's the complete /etc/apache2/sites-available/default for the load balancer:

<VirtualHost *:80>
    ServerAdmin webmaster@localhost

    DocumentRoot /var/www
    <Directory />
        Options FollowSymLinks
        AllowOverride All
    </Directory>
    <Directory /var/www/>
        Options Indexes FollowSymLinks MultiViews
        AllowOverride All
        Order allow,deny
        allow from all
    </Directory>

    ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
    <Directory "/usr/lib/cgi-bin">
        AllowOverride None
        Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
        AddHandler cgi-script .py
        Order allow,deny
        Allow from all
    </Directory>

    ProxyRequests Off
    <Proxy balancer://rpicluster>
        BalancerMember http://192.168.1.2:80
        BalancerMember http://192.168.1.3:80
        BalancerMember http://192.168.1.4:80
        BalancerMember http://192.168.1.5:80
        AllowOverride None
        Order allow,deny
        allow from all
        ProxySet lbmethod=byrequests
    </Proxy>

    <Location /balancer-manager>
        SetHandler balancer-manager
        Order allow,deny
        allow from 192.168.0
    </Location>

    ProxyPass /balancer-manager !
    ProxyPass / balancer://rpicluster/

    ErrorLog ${APACHE_LOG_DIR}/error.log

    # Possible values include: debug, info, notice, warn, error, crit,
    # alert, emerg.
    LogLevel warn

    CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>


Raspberry Pi Server Cluster Tests

Now that I've built a simple Raspberry Pi server cluster, it would be interesting to see how much traffic it can handle compared to a single Pi. There are many different server benchmarking tools available. I'm going to use siege.

Using Siege to test server response times

Siege is a program which can be used to send large numbers of requests to a web server. The following command sends HTTP requests to a server on my local network:

siege -d10 -c10 -t1m http://192.168.0.10/spec.html

The -c option specifies that there should be 10 concurrent users at a time. The -d option specifies the maximum delay between requests; the actual delay is randomized, but will not exceed the value given with -d. The -t option tells siege how long the test should last - 1 minute in this case.

In each of the tests that I conducted, I used a maximum delay of 10 seconds and a total test time of 1 minute. All tests used siege to make requests across my local network, not via the internet.

Testing the load balancer with a single Raspberry Pi

I wanted to see how a single Raspberry Pi handles traffic with and without the load balancer. I suspected that the load balancer would introduce a small delay, but actually a single Pi seems to operate more efficiently when it's behind a load balancer. It seems that requests are buffered before they reach the Pi. This graph shows the average response times for varying numbers of concurrent users, with and without the load balancer:

Seeing how performance improves with more nodes

The next test that I conducted was to see how performance changed as more nodes were added to the cluster. I ran tests with varying numbers of concurrent users, but for simplicity I'll just show results for tests with 10 concurrent users.

This graph shows the average maximum and minimum response times for an increasing number of worker nodes:

As you can see, the minimum response time doesn't really improve as more nodes are added, which makes sense. The minimum response time can only be as fast as the fastest individual node, and adding more nodes won't change that.

The maximum response time improved dramatically as more nodes were added. Using a cluster has definitely increased my server's capacity.

Read more about siege.

Improving cluster performance by tuning Apache

In my last article about cluster performance, I found that a cluster performs better than a single Pi, but there was still a lot of room for improvement. I've made changes to Apache's configuration files, and I've modified the way page caching works in my CMS.

Moving the page cache

The CMS that I've written can generate pages dynamically, and it can cache pages so that they can be served instantly without having to be assembled. Only pages that consist of static HTML can be cached. Pages that contain dynamic content generated by executable scripts aren't cached.

The page cache used to be in /usr/share/cms/cache, and the Python interpreter had to be loaded to serve cached pages from there. Now the root directory of the page cache is /var/www, so Apache can serve cached pages without invoking Python to run the CMS script.

One downside is that the CMS can no longer track traffic. When the CMS executed every time a page was requested, a function was called to increment a counter in a data file. This doesn't work now that pages can be served without the CMS executing.
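
The traffic data isn't lost, though; Apache's access log records every request, so a count can be recovered from there. For example, this counts the GET requests in the current log:

$ grep -c 'GET ' /var/log/apache2/access.log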

Unload unused modules

One of the best ways to improve Apache's performance is by unloading modules that aren't needed. First you need to list all the modules that are currently loaded using this command:

apache2ctl -M

It can be difficult to determine which modules are in use. It really depends on which directives are used in .htaccess files and virtual host files. For example, if you disable authz_host_module, then Allow, Order and Deny directives won't work. Each time you disable a module, restart Apache with these commands:

$ sudo service apache2 stop
$ sudo service apache2 start

You can use 'restart' instead of 'start' and 'stop', but there are some variables that require Apache to be stopped before they can be updated. It's a good idea to thoroughly test your site before you disable any more modules. I disabled these modules:

$ sudo a2dismod autoindex
$ sudo a2dismod auth_basic
$ sudo a2dismod status
$ sudo a2dismod deflate
$ sudo a2dismod ssl
$ sudo a2dismod authz_default

If you find that a module is required, you can re-enable it with the a2enmod command like this:

$ sudo a2enmod authz_host

Lower the timeout

I set the Timeout directive in /etc/apache2/apache2.conf to 30 seconds. This prevents concurrent requests from occupying memory for long periods, which reduces memory usage.
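
The relevant line in apache2.conf is a single directive:

Timeout 30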

Tune Apache processes

Apache has several different Multi-Processing Modules (MPMs). They each use a number of server processes and child threads to handle HTTP requests. MPM worker is the most modern configuration; MPM prefork used to be the standard, but MPM worker gives better performance and memory usage. Use this command to check which MPM Apache is using:

$ sudo apache2 -V

Part of the output will be something like this:

Server MPM:     Worker
  threaded:     yes (fixed thread count)
  forked:       yes (variable process count)

These are the default settings for MPM Worker:

<IfModule mpm_worker_module>
    StartServers          5
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadLimit          64
    ThreadsPerChild      25
    MaxClients          150
    MaxRequestsPerChild   0
</IfModule>

The size of the Apache processes varies depending on the content being served and any scripts that might be running. This command shows the number of Apache processes, and their size:

ps aux | grep 'apache2'

The 6th column (RSS) contains the amount of memory used by each process, in kilobytes. Dividing the amount of spare memory by the size of the average Apache process gives a rough indication of the maximum number of server processes that you can run. Each Pi in my cluster has about 280MB of free RAM, and the average size of an Apache process is about 7MB. 280 divided by 7 gives 40.
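
This arithmetic can be scripted. A rough sketch using awk (the RSS column is in kilobytes; the [a]pache2 pattern stops grep from matching itself):

$ ps aux | grep '[a]pache2' | awk '{n++; s+=$6} END {printf "%d processes, average %.1f MB each\n", n, s/n/1024}'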

StartServers is the number of server processes that Apache creates when it starts up. Creating new server processes can be time consuming, so I want Apache to start a lot of server processes when it starts. This means it won't have to spend time creating more processes while it's busy handling a lot of traffic. I've set StartServers to 40.

I don't want Apache to be able to create too many processes, as my Pi might run out of memory, so I've set the ServerLimit to 40.

Each server process can have a varying number of threads. It's the threads that actually process requests. I've set the default number of threads per child to 8. I didn't calculate this, I just tried a lot of different numbers and ran a lot of tests with siege until I found the optimum value.

The total number of threads is the number of server processes multiplied by ThreadsPerChild, which is 320 with my settings. I set MaxClients to 320 to prevent Apache from creating extra threads.

These settings will cause Apache to create a lot of processes and threads that don't get used immediately. In order to prevent Apache from deleting them, I set MaxSpareThreads to 320.

MaxRequestsPerChild is the number of requests a process should handle before it is killed and a replacement process is started. This is done to prevent memory leaks from accumulating large amounts of memory. It should be set to roughly the number of hits the server gets in a day, so that processes are recycled about once a day.

The MPM worker settings are now:

<IfModule mpm_worker_module>
    StartServers         40
    ServerLimit          40
    MinSpareThreads      25
    MaxSpareThreads     320
    ThreadLimit          64
    ThreadsPerChild       8
    MaxClients          320
    MaxRequestsPerChild 2000
</IfModule>

Before I made changes to the caching system, a siege test with only 25 concurrent users yielded these results:

Lifting the server siege...      done.
Transactions:                     30 hits
Availability:                 100.00 %
Elapsed time:                  59.79 secs
Data transferred:               0.08 MB
Response time:                 26.07 secs
Transaction rate:               0.50 trans/sec
Throughput:                     0.00 MB/sec
Concurrency:                   13.08
Successful transactions:          30
Failed transactions:               0
Longest transaction:           29.32
Shortest transaction:          14.03

After I improved the caching system I tested a single node with 200 concurrent users using this command:

$ siege -d1 -c200 -t1m http://192.168.0.4/specs.html

The results were:

Lifting the server siege...      done.
Transactions:                   6492 hits
Availability:                 100.00 %
Elapsed time:                  59.28 secs
Data transferred:              38.86 MB
Response time:                  1.29 secs
Transaction rate:             109.51 trans/sec
Throughput:                     0.66 MB/sec
Concurrency:                  141.10
Successful transactions:        6492
Failed transactions:               0
Longest transaction:           11.23
Shortest transaction:           0.32

After restarting Apache with the configuration changes and running the same test again, I got these results:

Lifting the server siege...      done.
Transactions:                   6449 hits
Availability:                 100.00 %
Elapsed time:                  59.53 secs
Data transferred:              38.60 MB
Response time:                  1.31 secs
Transaction rate:             108.33 trans/sec
Throughput:                     0.65 MB/sec
Concurrency:                  142.32
Successful transactions:        6449
Failed transactions:               0
Longest transaction:            4.16
Shortest transaction:           0.01

After optimizing page caching, removing unused modules from Apache and tuning the server processes, the number of transactions per second for a single node has gone from 0.5 to over a hundred. The number of concurrent requests that can be handled has increased by a factor of 8.

Tuning Apache processes has resulted in a very small decrease in the number of transactions per second, but the longest transaction time has decreased considerably.

Testing the whole cluster

Once I was happy with my new settings, I rolled them out to the entire cluster. I ran more tests with siege, first with 200 concurrent users:

Lifting the server siege...      done.
Transactions:                  23218 hits
Availability:                 100.00 %
Elapsed time:                  59.26 secs
Data transferred:              48.44 MB
Response time:                  0.01 secs
Transaction rate:             391.80 trans/sec
Throughput:                     0.82 MB/sec
Concurrency:                    3.90
Successful transactions:       23218
Failed transactions:               0
Longest transaction:            0.66
Shortest transaction:           0.00

...and then with 800 concurrent users:

Lifting the server siege...      done.
Transactions:                  56899 hits
Availability:                 100.00 %
Elapsed time:                  60.05 secs
Data transferred:             118.59 MB
Response time:                  0.34 secs
Transaction rate:             947.53 trans/sec
Throughput:                     1.97 MB/sec
Concurrency:                  317.44
Successful transactions:       56899
Failed transactions:               0
Longest transaction:            9.71
Shortest transaction:           0.00

Before tuning Apache, the cluster could handle 350 concurrent requests, and the maximum transaction rate was 460 transactions per second. Now the maximum number of concurrent users with 100% success rate is 800. The maximum number of transactions per second is now 947.

I will carefully watch the amount of spare memory over the next few days. If it starts to get too low, I'll reduce some of these settings.

Using siege isn't completely realistic. A request from a browser puts a much larger load on a server than a request from siege, and the timing of siege's tests is different from the ebb and flow of real traffic. Tests done using siege don't predict the number of requests a server can handle, but they do give a basis for comparing different server configurations. I don't think my cluster can handle 947 actual visitors per second, but I'm confident that my server's performance is better than it was.



On Site Optimization

Page load times are important for a good user experience. People are more engaged with sites that load quickly, and usually visit more pages.

Making pages load quickly isn't just about making web servers faster. When browsers load pages, first they have to download the page from a web server. If the page references other files, like CSS, Javascript and image files, the browser also needs to fetch those files. Once all the files have been downloaded, the browser has to render the page. Pages can be optimized so that browsers can load them quickly and efficiently.

This kind of optimization isn't about tuning the server, it's about optimizing your site's pages so that browsers can load them easily. This involves spending time tuning a site's HTML template.

A good place to get ideas on how to optimize your site is Google PageSpeed Insights. You can use PageSpeed Insights to analyse pages on your site and determine ways to improve page load times. PageSpeed Insights gives detailed suggestions on how you can improve performance, and assigns your site a speed score which improves as you work through the list of issues.

Inline CSS

I used to keep the CSS code for my site in a separate file in /var/www/default_theme.css. It was referenced in the head section of my HTML template with this line:

<link type="text/css" rel='stylesheet' href='/default_theme.css' >

When pages from my site were loaded by browsers, they wouldn't be able to carry on rendering that page until they had made another request to my server and downloaded the CSS file. I have eliminated that delay by including the CSS code directly in the head section of my HTML template. I added the following tags to my template's head section, and pasted the contents of my CSS file between them:

<style> </style>

This makes each web page larger, but reduces the number of requests that my server has to handle. It also means that browsers can render the page more efficiently.
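
This step can also be automated as part of a build or deployment script. A rough sketch using sed, assuming the template contains a hypothetical <!-- INLINE_CSS --> placeholder between the style tags:

$ sed '/<!-- INLINE_CSS -->/r /var/www/default_theme.css' template.html > page.html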

Deferred Javascript Loading

It's quite common to reference Javascript source files in the head section of a web page. I used to have the following tag in my site's head section so that images could be viewed using the Lightbox Javascript plugin:

<script src="/js/lightbox.js"></script>

The problem with this is that when a page is downloaded, the browser has to pause the processing of the page to download the Javascript file. This interrupts the browser before it can begin rendering the page.

The solution is to move the references to Javascript files 'below the fold'. I updated my HTML template to include this line just after the footer instead of in the head section. When pages on my site are loaded in a browser, most of the page renders before the browser has to get the Javascript file.
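
After the change, the skeleton of the template looks something like this (a simplified sketch; the real template has much more in it):

<html>
  <head>
    <style> /* inlined CSS goes here */ </style>
  </head>
  <body>
    <!-- page content and footer -->
    <script src="/js/lightbox.js"></script>
  </body>
</html>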

Asynchronous Javascript Sharing Buttons

Sharing buttons are a great way to boost traffic to your site. Visitors click the sharing buttons to share your site with their friends and followers on social media, thus increasing the amount of traffic that your site gets from social networks. There are many companies that provide plugins that can be added to a site. The downside with some sharing plugins is that they can reduce page load speed because of Javascript issues.

Instead of using a single plugin to display sharing buttons, you can use buttons from different sites. Each social network provides its own sharing button that you can embed in your site, and most of them are asynchronous (but not Reddit's at the time of writing). Asynchronous buttons are loaded after the page has rendered, so the page loads faster.

Since I removed the sharing plugin and replaced it with individual sharing buttons, page load times displayed in Google Analytics have become much more consistent, and a lot lower.

Minify CSS and Javascript

CSS and Javascript files often contain a lot of white space, such as spaces and new line characters. White space helps to make code more readable, but it also adds to file sizes. Removing white space characters makes code harder to read, but it can significantly reduce the amount of data your server has to send. I decided not to fully minify my CSS code as I want to be able to read it, but I have deleted a lot of white space while still preserving some basic formatting. This is how my CSS code used to look:

.navbar li
{
    display: inline;
    border-left: solid 1px black;
    padding-left: 6px;
}

first
{
    border-left: none;
}

.navbar li:first-child
{
    border: none;
}

This is how it looks with some white space removed:

.navbar li { display:inline; border-left: solid 1px black; padding-left:6px; }
first { border-left: none; }
.navbar li:first-child { border: none; }

Javascript code can also be minified. I don't plan to modify my Javascript code, so I have minified it completely. There are two Javascript files that browsers download from my site every time they load a page: /js/lightbox.js, for displaying images, and /js/jquery-1.7.2.min.js. The jQuery file is already minified.

The Lightbox code isn't minified by default, so I went to this Javascript minifier tool, pasted the code from lightbox.js into the input text area, and hit the submit button. I created a new file in /var/www/js called lightbox.min.js and pasted the output from the minifier tool into it. I changed my site's HTML template to reference this new file instead of the original unminified version. The unminified version of this file was 11.6KB; the minified version was 6.2KB.
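
If you'd rather not paste code into a web page, minification can also be done from the command line. For example, with Node's UglifyJS installed (an assumption on my part; any minifier will do):

$ uglifyjs lightbox.js -o lightbox.min.js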

Leverage browser caching

Web browsers can cache pages so that they don't need to be downloaded again if a user goes back to a page that they have already visited. Browsers can be told to cache pages by sending a cache control header before the page is sent by the server. This requires a couple of modifications to Apache's configuration. First, the headers module needs to be enabled:

$ sudo a2enmod headers

Next, the following code needs to be pasted somewhere in /etc/apache2/apache2.conf:

<FilesMatch "\.(ico|png|gif|jpg|js)$">
    Header set Cache-control "public, max-age=2592000"
</FilesMatch>

<FilesMatch "\.html$">
    Header set Cache-control "public, max-age=604800"
</FilesMatch>

This tells Apache that any file with a .ico, .png, .gif, .jpg or .js ending should be cached for 2592000 seconds (30 days), and files with a .html ending should be cached for 604800 seconds (7 days). The final step is to restart Apache:

$ sudo service apache2 restart

I executed this command on another computer to ensure that caching worked properly:

$ wget --save-headers http://raspberrywebserver.com/feed-icon-14x14.png

When I opened the downloaded file in a text editor, these HTTP headers were at the top:

HTTP/1.1 200 OK
Date: Sun, 13 Oct 2013 01:24:25 GMT
Server: Apache/2.2.22 (Debian)
Last-Modified: Tue, 01 Oct 2013 03:40:50 GMT
ETag: "226f0-2b1-4e7a5b80cb907"
Accept-Ranges: bytes
Content-Length: 689
Cache-control: public, max-age=259200
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: image/png

The cache control header can now be seen amongst the other HTTP headers.

Google Analytics Page Timings View

The Google Analytics page timings view shows how page load times have improved. Page load times fluctuated a lot while I was using a sharing plugin, but levelled out as soon as I got rid of it on the 10th of October. Over the next few days page load times reduced even more as I minified CSS and Javascript.



Adding more nodes to the cluster

My Raspberry Pi server cluster has been running for three months, and is now serving 45,000 page views per month. The amount of traffic reaching my site is increasing all the time, and occasionally there are large spikes in traffic from social networking sites.

As the load on the server increases, it's important to make sure it has enough capacity, so I decided to increase its computing power by adding four more Raspberry Pi server nodes to the cluster.

I built two new racks, each holding four Raspberry Pi servers. I daisy-chained two ethernet switches together, with four Pi servers connected to each switch.

I cloned an SD card from one of the Pi nodes and set up four almost identical SD cards; the only difference between them is the IP address in /etc/network/interfaces. I did a backup of my site from the dashboard of my CMS, and used a script to synchronize the content to all the worker nodes.
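
Cloning an SD card is a job for dd: read the existing card into an image file, then write that image to each new card (replace /dev/sdX with the card reader's actual device name):

$ sudo dd if=/dev/sdX of=rpi-node.img bs=4M
$ sudo dd if=rpi-node.img of=/dev/sdX bs=4M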

The next step was to modify the load balancer settings to start using the new nodes. On the load balancer, /etc/apache2/sites-available/default needed to be updated to include the new nodes in the cluster declaration:

<Proxy balancer://rpicluster>
    BalancerMember http://192.168.1.2:80
    BalancerMember http://192.168.1.3:80
    BalancerMember http://192.168.1.4:80
    BalancerMember http://192.168.1.5:80
    BalancerMember http://192.168.1.6:80
    BalancerMember http://192.168.1.7:80
    BalancerMember http://192.168.1.8:80
    BalancerMember http://192.168.1.9:80
    AllowOverride None
    Order allow,deny
    allow from all
    ProxySet lbmethod=byrequests
</Proxy>

When I finished making changes, I ran this command to load them into Apache:

$ sudo /etc/init.d/apache2 reload

This tells Apache to reload its configuration files without restarting. I went to the balancer manager interface at 192.168.0.3/balancer-manager to make sure the new nodes had been added.

Testing

I tested the new eight-Pi cluster using the same tests as when I was only using four Pi servers. First, I used siege to generate 200 concurrent requests over a minute:

$ siege -d1 -c200 -t1m http://192.168.0.3/specs.html

Lifting the server siege...      done.
Transactions:                  23492 hits
Availability:                 100.00 %
Elapsed time:                  59.81 secs
Data transferred:              48.93 MB
Response time:                  0.01 secs
Transaction rate:             392.78 trans/sec
Throughput:                     0.82 MB/sec
Concurrency:                    3.81
Successful transactions:       23492
Failed transactions:               0
Longest transaction:            0.63
Shortest transaction:           0.00

The result is very similar to the result for the same test on the cluster with just four nodes (see 'Testing the whole cluster' in Improving cluster performance by tuning Apache).

Next, I ran siege with 800 concurrent requests:

Lifting the server siege...      done.
Transactions:                  76510 hits
Availability:                 100.00 %
Elapsed time:                  59.76 secs
Data transferred:             159.39 MB
Response time:                  0.12 secs
Transaction rate:            1280.29 trans/sec
Throughput:                     2.67 MB/sec
Concurrency:                  148.45
Successful transactions:       76510
Failed transactions:               0
Longest transaction:           13.04
Shortest transaction:           0.00

The longest transaction time has increased, but the throughput and the number of transactions per second have also increased.

Performance with a low number of concurrent requests hasn't really changed, but performance has improved for larger numbers of concurrent requests. This is to be expected, as adding more nodes doesn't make a cluster faster; it increases the cluster's capacity.

I was surprised that the time for the longest transaction increased. Most requests completed in 20ms or less, and unfortunately siege doesn't record the average response time.

Tests with Apache Bench, ab, show that the longest transaction time was 5.111 seconds, but the mean transaction time was 0.214 seconds, so an increase in the longest transaction time doesn't mean that overall performance is worse, but it is cause for concern. I logged into the load balancer using ssh and re-ran the tests on my cluster. I ran the uptime command followed by free on the load balancer:

$ uptime
 13:30:04 up 1 day, 5:20, 1 user, load average: 9.27, 3.52, 1.33
$ free
             total       used       free     shared    buffers     cached
Mem:        498268     487188      11080          0      37328     299752
-/+ buffers/cache:     150108     348160
Swap:       513020         20     513000

The load average figures for the load balancer are much higher than for the servers. The load balancer only has 11MB of RAM left and has started to use swap. When web servers start to use swap space, they slow down dramatically, so I need to look into the performance and memory usage of my load balancer. Using siege to test with 800 concurrent users is testing for the worst case. At the moment my site isn't getting that much traffic, so the performance issues with the load balancer aren't an immediate problem, but it's something I need to look at.

I still don't know how much traffic this system can actually handle because serving real traffic is not the same as testing with siege. I do know that my server can handle at least 45,000 hits a month, and probably a lot more now that I have added more nodes.



Comparing the performance of Nginx and Apache web servers

Ever since I built my cluster people have been asking me why I used Apache and not Nginx. I started using Apache because I was just used to it. People say it's slow and takes up too much memory, but I find that with a little tuning it can perform quite well.

Still, Nginx does have a reputation for being fast. I wanted to see which web server would be best for my cluster, so I installed Apache on one Pi and Nginx on another.

I used two Raspberry Pi model B servers, each with identical Sandisk 4GB class 4 SD cards. In raspi-config, I overclocked each Pi to 800MHz, and allocated 16MB of memory to the GPU. I used the same version of Raspbian (released on 2013-09-25) on both servers. I used exactly the same test data and scripts on each Pi.

Follow this link to see how I set up Nginx and uWSGI.

I tuned Apache by removing modules and increasing the number of server processes. These tuning techniques don't apply to Nginx.

I tested each server with three different types of request: a static HTML file, a simple CGI script, and a complex CGI script. The HTML file is a cached page in my Content Management System (the CMS doesn't need to execute for cached pages to be served; they can be served by Apache as normal HTML files). The simple script just prints an HTTP header, prints "Hello World!" and exits.
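
For reference, the simple script is something like this minimal sketch (assuming Python 2, which Raspbian shipped at the time):

#!/usr/bin/env python
# hello.py - minimal CGI script used in the benchmark.
# Print an HTTP header, a blank line to end the headers, then the body.
print "Content-type: text/html"
print
print "Hello World!"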

The complex script used in these tests was the CMS that this site is built on. I disabled page caching on both Pi servers, so that pages had to be generated dynamically by the CMS. When a page is served, the CMS script has to parse two XML files to get meta data, read several snippets of HTML from the file system, and print them to a socket.

Requests were generated with Apache Bench using a command like this:

ab -n 1000 -c 250 http://192.168.0.21/spec.html

where 1000 is the number of requests to issue, and 250 is the number of concurrent users.

The Raspberry Pi running Nginx had IP address 192.168.0.21, and the Pi running Apache had 192.168.0.22. I tested each server over a range of concurrent users for each type of request.

Static files

Static files are easy to serve, so I used a range of 50 to 250 concurrent users in these tests. Apache handled 220 connections per second, while Nginx handled around 300 connections per second.

Nginx came out ahead on these tests.

Dynamic content tests

Simple script

In these tests I used ab to request this URL: http://192.168.0.21/cgi-bin/hello.py. I set the number of requests to 100, and tested over a range of 10 to 50 concurrent users.

Apache handled 4.78 connections per second, and Nginx handled 4.65 connections per second, but the results showed that the mean transaction time was lower for Nginx than for Apache, so Apache was slower in this test. The difference was not very pronounced under a low load, but it increased as the load increased.

Complex script

The URL used in these tests was http://192.168.0.21/spec.html. This test is the most CPU intensive, so I used from 5 to 25 concurrent users.

Under a low load, Apache's performance was slightly better than Nginx's, but only by a negligible margin. With 25 concurrent users, Nginx was noticeably faster than Apache.

Conclusions

There are many variables involved in server performance. These tests don't definitively determine which server is 'better', they just helped me decide which one is best for my specific needs.

Most of the pages on my site are static, and Nginx is faster when it comes to static pages. It looks like Nginx is a better choice for me.

