Quantcast
submit to reddit       
       

Raspberry Pi Cluster

One popular use of Raspberry Pi computers is building clusters. Raspberry Pies are small and inexpensive so it's easier to use them to build a cluster than it would be with PCs. A cluster of Raspberry Pies would have to be quite large to compete with a single PC; you'd probably need about 20 Pies to produce a cluster with as much computing power as a PC. Although a Pi cluster may not be that powerful, it's a great opportunity to learn about distributed computing.

There are several different types of distributed computer which can be used for different purposes. There are super computers that are used for solving mathematical problems like modelling weather conditions or simulating chemical reactions. These systems often use the Message Passing Interface (MPI). A team at the University of Southampton built a 64 node MPI based super computer. This system is used for teaching students about supercomputing.

Another technology that's often used in distributed computing is Hadoop, which distributes data across many nodes. Hadoop is often used for processing large datasets and data mining. An engineer at Nvidia built a small Hadoop cluster using Raspberry Pies. He uses his cluster to experiment and test ideas before deploying them on more powerful systems.

Using a Raspberry Pi cluster as a web server

Clusters can be used as web servers. Many web sites get too much traffic to run on a single server, so several servers have to be used. Requests from web browsers are received by a node called a load balancer, which forwards requests to worker servers. The load balancer then forwards responses from servers back to the clients.

This site is now hosted on a Raspberry Pi cluster. The worker nodes are standard web servers that contain identical content. I just installed Apache on them and copied my site to each node.

I use an extra Raspberry Pi to host a development copy of this site, and to control the cluster. This Pi is connected to my local network via wifi, so I can access the development copy of my site from my laptop.

The extra Pi also has an ethernet connection to the Pi cluster. When I want to update my site, I can transfer changes from the development site to the live site on the cluster. Site updates are put into .tar.gz files which the worker nodes automatically download from the development site. Once downloaded, updates are then unpacked into the local file system.

Configuring the Raspberry Pi servers

All of the Pies in this system are headless. I can log into the Pi with the development site using the Remote Desktop Protocol, and from that Pi I can log into the worker Pies using SSH.

All the Pies in the cluster use a static IP address. In a larger cluster it would probably be better to set up a DHCP server on the load balancer. The IP addresses used in the cluster are on the 192.168.1.xxx subnet.

For each worker Pi, I set up a 4GB SD card using the latest version of Raspbian. In raspi-config I set the following options:

  • expand fs
  • set the hostname
  • set the password
  • set memory split to 16MB for the GPU
  • overclock the CPU to 800MHz
  • enable ssh

On each card I installed Apache and some libraries required by my CMS, libxml2 and python-libxml2. I used this command to enable mod rewrite, which is also required by my CMS:

$ sudo a2enmod rewrite

Finally, I copied some scripts onto each SD card which allow each Pi to synchronize its contents with the development Pi. In a larger cluster it would be worth creating an SD card image with all of these modifications made in advance.

Building a load balancer

The load balancer must have two network interfaces, one to receive requests from a router, and another network interface to forward requests to the server cluster. The nodes in the cluster are a on a different subnet than the rest of the network, so the IP address of the load balancer's second interface must be on the same subnet as the rest of the cluster. The load balancer's first interface has IP address 192.168.0.3 while the second interface's IP address is 192.168.1.1. All the Pies in the cluster have IP addresses on the 192.168.1.xxx subnet.

I built my load balancer using an old PC with 512MB of RAM and a 2.7GHz x86 CPU. I added a second PCI ethernet card and installed Lubuntu, a lightweight version of Ubuntu. I was going to install Ubuntu, but this PC is pretty old, so Lubuntu is probably a better choice. I used a PC becasue I wasn't sure if a single Pi would be powerful enough to act as a load balancer, and a Pi only has one ethernet connection. I want both of my load balancer's network connections to be ethernet for improved performance and stability.

Note that IP forwarding is not enabled. The load balancer isn't a router, it should only forward HTTP requests and not every IP packet that it receives.

Setting up the load balancer software

There are many different software implementations of load balancing. I used Apache's load balancer module because it's easy to set up. First I made sure my PC's OS was up to date:

sudo apt-get update
sudo apt-get upgrade

Then I installed Apache:

sudo apt-get install apache2

These Apache modules need to be enabled:

sudo a2enmod proxy
sudo a2enmod proxy_http
sudo a2enmod proxy_balancer

The next step is to edit /etc/apache2/sites-available/default in order to configure the load balancer. The proxy module is needed for HTTP forwarding, but it's best not to allow your server to behave as a proxy. Spammers and hackers often use other people's proxy servers to hide their IP address, so it's important to disable this feature by adding this line:

ProxyRequests off

Although proxy requests are disabled, the proxy module is still enabled and and acts as a reverse proxy. Next, define the cluster and its members by adding this code:

<Proxy balancer://rpicluster> BalancerMember http://192.168.1.2:80 BalancerMember http://192.168.1.3:80 BalancerMember http://192.168.1.4:80 BalancerMember http://192.168.1.5:80 AllowOverride None Order allow,deny allow from all ProxySet lbmethod=byrequests </Proxy>

Balancer manager interface

The balancer module has a web interface that makes it possible to monitor the status of the back end servers, and configure their settings. You can enable the web interface by adding this code to /etc/apache2/sites-available/default:

<Location /balancer-manager> SetHandler balancer-manager Order allow,deny allow from 192.168.0 </Location>

It's also necessary to instruct Apache to handle requests to the /balancer-manager page locally instead of forwarding these requests to a worker server. All other requests are forwarded to the cluster defined above.

ProxyPass /balancer-manager ! ProxyPass / balancer://rpicluster/

Once these changes have been saved, Apache should be restarted with this command:

$ sudo /etc/init.d/apache2 restart

when I open a browser and go to http://192.168.0.3 I see the front page of my web site. If I go to http://192.168.0.3/balancer-manager, I see this page in the image on the right.

The last step in getting the cluster online is adjusting the port forwarding settings in my router. I just needed to set up a rule for forwarding HTTP packets to http://192.168.0.3.

Here's the complete /etc/apache2/sites-available/default for the load balancer:

<VirtualHost *:80> ServerAdmin webmaster@localhost DocumentRoot /var/www <Directory /> Options FollowSymLinks AllowOverride All </Directory> <Directory /var/www/> Options Indexes FollowSymLinks MultiViews AllowOverride All Order allow,deny allow from all </Directory> ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/ <Directory "/usr/lib/cgi-bin"> AllowOverride None Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch AddHandler cgi-script .py Order allow,deny Allow from all </Directory> ProxyRequests Off <Proxy balancer://rpicluster> BalancerMember http://192.168.1.2:80 BalancerMember http://192.168.1.3:80 BalancerMember http://192.168.1.4:80 BalancerMember http://192.168.1.5:80 AllowOverride None Order allow,deny allow from all ProxySet lbmethod=byrequests </Proxy> <Location /balancer-manager> SetHandler balancer-manager Order allow,deny allow from 192.168.0 </Location> ProxyPass /balancer-manager ! ProxyPass / balancer://rpicluster/ ErrorLog ${APACHE_LOG_DIR}/error.log # Possible values include: debug, info, notice, warn, error, crit, # alert, emerg. LogLevel warn CustomLog ${APACHE_LOG_DIR}/access.log combined </VirtualHost>

Comments

Comments

comments powered by Disqus



Follow me


This site is powered by Pyplate, a lightweight Python CMS for the Raspberry Pi.