September 2, 2014
 
 

Case Study: Clusters and Image Processing, Part II - page 4

The ImageLink Case, Reviewed

  • March 25, 2002
  • By Dee-Ann LeBlanc

Setting up a cluster is significantly easier for Red Hat Linux users--this is the distribution ImageLinks uses. Red Hat uses the Linux Virtual Server (LVS) cluster, discussed at www.linuxvirtualserver.org, rather than Beowulf. LVS is built specifically to provide high-end Web and FTP services; it is not meant for as broad a range of uses as Beowulf. To set up your LVS, do the following:

  1. Obtain a group of machines to make into the cluster. It is always nice to have identical hardware, but if you can't manage that, don't worry.
  2. Choose the one or two machines you want to use as the routers. In this case, the routers are the machines that interact with the outside world and manage the machines behind them.

NOTE: A second router is a backup that can take over if the first router goes down. These two routers do not share work.

  3. Ensure that you have two Ethernet cards in each of the routers.
  4. Follow Steps 5 through 10 for all the machines.
  5. Can your computer boot from the CD-ROM? If so, edit the BIOS and set it to do so. Otherwise, place a boot disk in the floppy drive.
  6. Boot the machine. This automatically begins the installation process. If you have to boot with a floppy, press Enter at the initial prompt.
  7. Work through the installation until you reach the Install Type section. Choose Custom.
  8. Continue through the installation until you reach the Package Selection section. Make sure Clustering is among the package groups you choose.

TIP: Already have the machines installed? With Red Hat, you can type rpm -ivh piranha* to install the clustering tools all at once.

  9. Finish your package selections and then continue with the installation.
  10. Once the install is complete, interrupt the reboot process and set the BIOS back to booting from the hard drive first. This will save you some headaches if you leave a CD-ROM in the drive later while the machine is rebooting.
  11. Physically network these machines together, as shown in Figure 3.
  12. On the router machine(s), examine the contents of /etc/sysconfig/network for the following text:
      FORWARD_IPV4=yes    
      DEFRAG_IPV4=yes
  13. On the router machine(s), edit the file /etc/rc.d/rc.local and add the following:
  14. On the server machine(s)--also called nodes in cluster-speak--you have some choices to make. I'll walk you through each one and the implementations. First, you have to choose whether you want to use the traditional rsh tool to configure the remote logins on each of the nodes or use the much more secure ssh. I won't try to sway you in either direction. Your ultimate choice depends on what level of security you need. Because ssh is such an involved issue, I'll continue here with the rsh discussion. See the section "Implementing LVS Clusters with ssh" for more information on how to utilize the secure shell. If there is no security between your cluster and the Internet, you'll probably want to use ssh.
  15. If you are using rsh, log onto all the cluster machines as root, one by one. In each case, create the file /root/.rhosts. In each file, create a list of the hostnames of the other computers in the cluster, one per line. Save and exit the file on each. Now root can remotely log in on all the cluster machines.

WARNING! Don't do this in any other situation! This is a major security hole, because someone would only need to break into one of the root accounts and then could use rlogin to access all the other root accounts without needing a password.
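
As a rough illustration of the /root/.rhosts step above, the following Python sketch builds the text each machine's file would hold: every other hostname in the cluster, one per line. The hostnames and the function name are hypothetical, and the sketch only prints the contents rather than writing any files.

```python
def rhosts_contents(cluster_hosts):
    """Map each host to the text of its /root/.rhosts file:
    the names of all the *other* cluster hosts, one per line."""
    contents = {}
    for host in cluster_hosts:
        others = [h for h in cluster_hosts if h != host]
        contents[host] = "\n".join(others) + "\n"
    return contents


# Hypothetical hostnames, for illustration only.
hosts = ["router1", "node1", "node2"]
files = rhosts_contents(hosts)

for host in hosts:
    print(f"--- /root/.rhosts on {host} ---")
    print(files[host], end="")
```

Generating the lists from one master list of hostnames makes it less likely that a machine is left out of one of the files.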

  16. The router needs to be able to monitor what the servers are up to. Therefore, you need to choose one of the three tools available for this purpose: rup, ruptime, or uptime. These three tools are virtually identical. All three offer information on how long a machine has been up and running and what its load averages are. The ruptime and uptime tools also give information on how many users are logged on. One of the main differences is in how often the information comes:
    • rup sends it once
    • ruptime sends it once a minute as a broadcast
    • uptime sends it once
  17. If you decide to use uptime, you have no further setup to do for this service.
  18. If you decide to use ruptime, log onto all the machines as root and ensure that the rwhod daemon starts at boot time. To do so, add an entry either in your startup daemon directory (in Red Hat, for booting into command-line mode, this is /etc/rc.d/rc3.d) or in a system startup file, such as /etc/rc.d/rc.local. In Red Hat, this daemon is in /usr/sbin.
  19. If you choose to use rup, on each of the server machines ensure that rpc.rstatd starts at boot time, using the same methods as discussed in Step 18.
  20. On the main router machine, use your favorite text editor to open the file /etc/lvs.cf.
  21. This file has three sections. The first is the global section, where you set the defaults for everything. The options that you must fill in are detailed in Table 5.

Table 5 Required global options for LVS.

| Option | Purpose | Value | Example |
|---|---|---|---|
| deadtime | How long the cluster waits before giving up on an individual node and handing the service it was offering over to another. | Number of seconds to wait after the node goes quiet. | deadtime = 90 |
| network | Which networking method to use when directing incoming and outgoing TCP/IP traffic. | Use direct for "standard" routing, nat for Network Address Translation, or tunnel for IP Tunneling. | network = direct |
| primary | The IP address for the main LVS router. | IP address. | primary = 192.168.10.5 |
| rsh_command | The command to use to ensure that all the nodes have the same information. | rsh or ssh. | rsh_command = rsh |
| service | Used to set how the cluster should behave. (I am specifically covering one aspect of the LVS cluster.) | lvs or fos (however, I'm only covering lvs). | service = lvs |

NOTE: The direct network method is what's covered here.

    For example, you might have this:

      deadtime = 30    
      network = direct    
      primary = 202.164.14.9    
      rsh_command = rsh    
      service = lvs
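
To make the global section concrete, here is a minimal Python sketch that reads "option = value" lines like the ones above and reports any of the five required options from Table 5 that are missing. The function names are hypothetical, and this is not how the LVS tools themselves read /etc/lvs.cf; it is only a quick sanity check you could run on a draft.

```python
# The five required global options listed in Table 5.
REQUIRED_GLOBAL = ("deadtime", "network", "primary", "rsh_command", "service")


def parse_options(text):
    """Parse 'name = value' lines into a dict, skipping blank lines."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not an option line: {line!r}")
        options[name.strip()] = value.strip()
    return options


def missing_required(options, required=REQUIRED_GLOBAL):
    """Return the required option names that are absent, in table order."""
    return [name for name in required if name not in options]


sample = """
deadtime = 30
network = direct
primary = 202.164.14.9
rsh_command = rsh
service = lvs
"""
opts = parse_options(sample)
print(opts)
print("missing:", missing_required(opts))
```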
  22. If you are using a pair of routers, you also need to set the options shown in Table 6.

Table 6 Required global options for LVS with backup router.

| Option | Purpose | Value | Example |
|---|---|---|---|
| backup | The IP address for the backup router. Only needed if you have one. | IP address | backup = 192.168.10.6 |
| heartbeat | Utilizes a method called heartbeat, where the primary LVS router sends a regular notification to the backup that it is alive. | 0 for no, 1 for yes | heartbeat = 1 |
| keepalive | If you choose to use the heartbeat (recommended), this option specifies how many seconds to wait between heartbeats. | Number of seconds | keepalive = 10 |

    For example, you might add this:

      backup = 202.164.14.10    
      heartbeat = 1    
      keepalive = 15
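
The backup-router options can be checked in the same spirit. The Python sketch below takes the three options from Table 6 as a dictionary of strings and lists anything that looks wrong. The function name is hypothetical, and the rule that keepalive must be greater than zero is an assumption made for this sketch rather than something the table states.

```python
import ipaddress


def check_backup_options(options):
    """Return a list of problems found in the backup-router options."""
    problems = []

    # heartbeat is a yes/no switch: 0 for no, 1 for yes.
    if options.get("heartbeat") not in ("0", "1"):
        problems.append("heartbeat must be 0 or 1")

    # backup must be an IP address.
    try:
        ipaddress.ip_address(options.get("backup", ""))
    except ValueError:
        problems.append("backup must be an IP address")

    # keepalive only matters when the heartbeat is on.
    # Assumption: it should be a whole number of seconds above zero.
    keepalive = options.get("keepalive", "")
    if options.get("heartbeat") == "1":
        if not (keepalive.isdigit() and int(keepalive) > 0):
            problems.append("keepalive must be a positive number of seconds")

    return problems


good = {"backup": "202.164.14.10", "heartbeat": "1", "keepalive": "15"}
bad = {"backup": "not-an-address", "heartbeat": "yes"}
print(check_backup_options(good))
print(check_backup_options(bad))
```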
  23. The second section contains settings for the virtual server(s). A virtual server in an LVS cluster refers to the address and machine that clients from the outside are trying to reach. The clients have no idea that they're talking to a cluster, and they really don't care. They just want an answer to their request. You start each virtual server section with this:
      virtual server hostname {
  24. You must now fill in the pertinent information for this virtual server. The required values are listed in Table 7.

Table 7 Required options for LVS virtual server setup.

| Option | Purpose | Value | Example |
|---|---|---|---|
| active | Defines whether this server is considered up and running. | 0 for no, 1 for yes. | active = 1 |
| address | The IP address used to access this virtual server. | IP address. | address = 201.14.1.5 |
| load_monitor | Assigns the method used to get information concerning how much load each node is under. | uptime, ruptime, or rup. | load_monitor = ruptime |
| reentry | Specifies how long a node has to be alive after going down before the LVS routers will trust it enough to return it to their routing tables. | Seconds to wait. | reentry = 250 |
| scheduler | Selects which of the four available methods the LVS cluster uses to determine where to send each request to the virtual server. | rr for round robin, where the request is sent to the next node; lc for least connections, where the request is sent to the node handling the least requests; wlc (default) for weighted least connections, where the request is sent to the node handling the least requests, but the decision is adjusted by the weight assigned to the node; wrr for weighted round robin, where the request is sent to the next node with the most promising weight. | scheduler = wrr |
| timeout | Specifies how long the routers will wait before giving up on a node and removing it from their routing tables. | Number of seconds (the default is 10). | timeout = 30 |

    For example, you might have something like the following:

      active = 1
      address = 202.164.14.15
      load_monitor = rup
      reentry = 200
      scheduler = wrr
      timeout = 20
  25. Close off each virtual server's section with this:
      }
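
To show what the two unweighted scheduler choices in Table 7 mean in practice, here is a small Python sketch of rr (round robin) and lc (least connections). It is a simplified model of the idea, not the code LVS itself runs, and the node names and connection counts are made up for illustration.

```python
from itertools import cycle


def round_robin(nodes):
    """rr: hand each new request to the next node in turn, forever."""
    return cycle(nodes)


def least_connections(active):
    """lc: pick the node currently handling the fewest requests.

    'active' maps node name -> number of requests in progress.
    """
    return min(active, key=active.get)


# Hypothetical nodes and loads.
nodes = ["node1", "node2", "node3"]
rr = round_robin(nodes)
first_four = [next(rr) for _ in range(4)]
print(first_four)  # wraps back to node1 after node3

active = {"node1": 12, "node2": 7, "node3": 9}
print(least_connections(active))
```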
  26. Once you have your virtual server(s) created, you have to set up your actual servers/nodes. Start each section with the following line:
      server hostname {
  27. Fill in the required values for this section. The options you must use are listed in Table 8.

Table 8 Required options for LVS node setup.

| Option | Purpose | Value | Example |
|---|---|---|---|
| active | Specifies whether this node is considered up or down. | 0 for down, 1 for up. | active = 1 |
| address | The IP address assigned to the node. | IP address. | address = 192.168.15.10 |
| weight | If you are using a weight-based scheduler, you need to set this option. Its value is used to tell the routers how much processing power this node has. The higher the number, the more jobs are sent to the node. | A positive nonzero integer. The default is 1. | weight = 100 |

    For example, you might use these values:

      active = 1
      address = 202.164.14.11
      weight = 5
  28. Close off each server's section with this:
      }
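
The weight option only matters for the two weighted schedulers, wlc and wrr. The Python sketch below models both in a deliberately simplified way: wlc divides each node's current request count by its weight, and wrr builds one cycle in which each node appears as many times as its weight. This illustrates the idea described in the tables and is not the algorithm LVS actually implements; the node names, weights, and loads are hypothetical.

```python
def weighted_least_connections(active, weights):
    """wlc: pick the node with the fewest requests relative to its weight."""
    return min(active, key=lambda node: active[node] / weights[node])


def weighted_round_robin_cycle(weights):
    """wrr (simplified): one full cycle of turns in which each node
    appears as many times as its weight, interleaved across nodes."""
    order = []
    remaining = dict(weights)
    while any(remaining.values()):
        for node in weights:
            if remaining[node] > 0:
                order.append(node)
                remaining[node] -= 1
    return order


# Hypothetical weights: node1 is rated five times as capable as node2.
weights = {"node1": 5, "node2": 1}
active = {"node1": 10, "node2": 4}

# node1 scores 10 / 5 = 2.0, node2 scores 4 / 1 = 4.0, so node1 is chosen
# even though it is handling more requests in absolute terms.
chosen = weighted_least_connections(active, weights)
print(chosen)

turns = weighted_round_robin_cycle({"node1": 2, "node2": 1})
print(turns)
```

With plain lc the same loads would send the request to node2; the weight is what tips the decision toward the more capable machine.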
  29. Save and close this file.
  30. If you're using a primary and backup router, copy this configuration file to the backup.
  31. If you're using a primary and backup router and are utilizing a heartbeat, you need to start the pulse daemon on both machines. Open the file /etc/rc.d/rc.local on both machines, go to the end, and add the following entry:
      /etc/rc.d/init.d/pulse start

    If you don't want to wait until a reboot to test this, type the same text you see above on the command line on both machines.

  32. Install the Web server software--Apache or otherwise--on all of the nodes. Clone the configuration across each of them.
