Diskless cluster nodes with Debian

This is a description of how to set up diskless cluster nodes for a small-scale Beowulf cluster using Debian GNU/Linux.

Although this was done in the context of using the nodes for an HPC cluster, the technique described here should work for diskless nodes for any purpose (although things like setting up X11 for the nodes, which would be required for diskless workstations, are not described).

Contents

  1. Overview
  2. Setting up the netboot server
  3. Creating an NFS root for the nodes
  4. Creating an initrd image for the nodes
  5. Cluster maintenance tools
  6. Building node kernel from source

Overview

Brief description of the boot-up process

Requirements

There is essentially only one hardware requirement for a diskless system: a network interface capable of PXE netboot. PXE is the de facto standard protocol for network booting. Although they aren't ubiquitous, most modern NICs support PXE, and all modern motherboards that I've come across will recognize the network adapter as a valid boot device.

Setting up the netboot server

Three services are required for a diskless system:

  1. DHCP server - to supply the diskless nodes with IP addresses and other boot information
  2. TFTP server - to serve the boot image
  3. NFS server - to supply the diskless root and other network filesystems that might be needed

These three services are independent, so there's no reason that they need to be on the same machine, but there's really no reason not to put them on the same machine, and that's what's done in the system described here. Since all of these services need information about the nodes that are going to be served, keeping track of things is much easier if they all run on the same machine anyway.

TFTP and the bootloader

DHCP

NFS

Creating an NFS root for the nodes

  1. Make debootstrap root that contains needed packages. This will be mounted as root via NFS.
    debootstrap --verbose --resolve-deps $RELEASE $NFSROOTDIR $MIRROR
    where
    $RELEASE="sarge"
    $MIRROR="http://debian.csail.mit.edu/debian-amd64/debian/"
    
  2. Make initrd with necessary modules/scripts in new chroot. This is done using mkinitrd. It runs scripts in /etc/mkinitrd/scripts.
  3. Link initrd and vmlinuz into the tftpboot directory.

Install the packages: lessdisks dhcp3-server atftpd syslinux

cp /usr/share/doc/lessdisks-doc/pxe/default /var/lib/lessdisks/boot/pxelinux.cfg/

setup atftpd to serve $TFTPBOOT

ln -s /usr/lib/syslinux/pxelinux.0 $TFTPBOOT/
create $TFTPBOOT/pxelinux.cfg/default

optionally, configure serial console during boot (modify pxelinux.cfg/default):
first line: serial 0 115200 0x303
add to initrd line: console=tty0 console=ttyS0,115200n8
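
As a rough sketch, the pxelinux.cfg/default file described above could be generated like this. Only the serial line and the console= arguments come from these notes; the label name linux, the function name and the default file names vmlinuz and initrd.img (relative to the TFTP root) are assumptions made for illustration.

```shell
#!/bin/sh
# Print a minimal pxelinux.cfg/default with serial console support.
# Arguments (both optional, illustrative): kernel file and initrd file,
# given relative to the TFTP root.
pxelinux_default() {
    kernel=${1:-vmlinuz}
    initrd=${2:-initrd.img}
    cat <<EOF
serial 0 115200 0x303
default linux
label linux
  kernel $kernel
  append initrd=$initrd console=tty0 console=ttyS0,115200n8
EOF
}

# Show the result on stdout so it can be reviewed before use.
pxelinux_default
```

Once the output looks right, it can be redirected to $TFTPBOOT/pxelinux.cfg/default. Any further kernel arguments the nodes need would go on the append line.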

configure dhcp3-server to point filename to /$TFTPBOOT/pxelinux.0
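
A per-node entry for the DHCP server configuration might look like the output of the sketch below. The node name, MAC address and IP addresses are made-up example values, and the function name is hypothetical. Whether the boot file should be given as pxelinux.0 or with a leading path depends on how the TFTP server maps its root directory, so treat the default here as an assumption.

```shell
#!/bin/sh
# Print a dhcpd.conf host entry for one diskless node.
# usage: dhcp_host_entry NAME MAC IP TFTP_SERVER [BOOTFILE]
dhcp_host_entry() {
    bootfile=${5:-pxelinux.0}
    cat <<EOF
host $1 {
  hardware ethernet $2;
  fixed-address $3;
  next-server $4;
  filename "$bootfile";
}
EOF
}

# Example values only; replace them with the real node data.
dhcp_host_entry node01 00:11:22:33:44:55 192.168.1.101 192.168.1.1
```

The printed block would be pasted (or appended) into the dhcp3-server configuration, one entry per node.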

once lessdisks is configured:
ln -s $LESSDISKS/boot/vmlinuz $TFTPBOOT/
ln -s $LESSDISKS/boot/initrd.img $TFTPBOOT/
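
The two links above can be wrapped in a small helper that refuses to create a dangling link. This is only a sketch; the function name is hypothetical, and it assumes the source directory is given as an absolute path so that the links resolve from anywhere.

```shell
#!/bin/sh
# Link the kernel and initrd from SRC into DST.
# usage: link_boot_files SRC DST   (SRC should be an absolute path)
link_boot_files() {
    src=$1
    dst=$2
    for f in vmlinuz initrd.img; do
        if [ ! -e "$src/$f" ]; then
            echo "missing $src/$f" >&2
            return 1
        fi
        # -f replaces a link left over from an earlier run
        ln -sf "$src/$f" "$dst/$f"
    done
}

# usage: link_boot_files "$LESSDISKS/boot" "$TFTPBOOT"
```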

Creating an initrd image for the nodes

mkinitrd setup

  1. bring in necessary executables to get everything loaded. Needs to include:
    /sbin/ifconfig
    /sbin/dhclient
    /sbin/route
      
    This should also obviously include their dependencies.
  2. bring in necessary files/scripts. Needs to include:
    /etc/dhclient-script
    $CONFDIR/config-file
      
  3. bring in necessary modules, specifically the nic modules, and nfs modules:
    # network card modules
    nic_modules="via-velocity"
    
    # dependencies of network card modules
    nic_dep_modules="crc_ccitt"
                                                                                      
    # nfs and related dependencies
    net_modules="nfs af_packet sunrpc unix lockd"
      
  4. bring scripts into $INITRD/scripts. These are used to create necessary tmpfs mount points for directories that need them, among other things.
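
For step 1 above, the shared libraries each executable depends on can be collected from ldd output. The sketch below is one way to do that; the function names are hypothetical. It keeps only absolute paths, so the virtual vdso entry and "not found" lines are dropped, and the dynamic loader line (which has no "=>") is still picked up.

```shell
#!/bin/sh
# Read ldd output on stdin and print each library path once.
ldd_paths() {
    awk '
        $2 == "=>" && $3 ~ /^\// { print $3; next }  # libfoo.so => /path (0x...)
        $1 ~ /^\//               { print $1 }        # /lib/ld-linux.so.2 (0x...)
    ' | LC_ALL=C sort -u
}

# Print the libraries needed by all executables given as arguments.
exe_deps() {
    for exe in "$@"; do
        ldd "$exe" 2>/dev/null
    done | ldd_paths
}

# usage: exe_deps /sbin/ifconfig /sbin/dhclient /sbin/route
```

Each printed path would then be copied into the initrd tree at the same location.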

general files needed

/etc/hosts

node-specific files

The nodes will require some host-specific files:

should probably have some host-specific file that includes hostname/ip/mac/etc.
/etc/hostname
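
One possible shape for such a host-specific file is a single table with one node per line, from which the other files are derived. The table format (NAME IP MAC) and both function names below are assumptions for illustration, not something the setup described here already provides.

```shell
#!/bin/sh
# Hypothetical node table: "NAME IP MAC", one node per line.
# Blank lines and lines starting with '#' are ignored.

# Print /etc/hosts lines ("IP<TAB>NAME") for every node in the table.
# usage: hosts_from_table TABLE
hosts_from_table() {
    awk 'NF == 0 || $1 ~ /^#/ { next } { printf "%s\t%s\n", $2, $1 }' "$1"
}

# Print the name of the node that owns the given MAC address.
# usage: hostname_for_mac TABLE MAC
hostname_for_mac() {
    awk -v mac="$2" '
        NF >= 3 && $1 !~ /^#/ && tolower($3) == tolower(mac) { print $1 }
    ' "$1"
}
```

The output of hosts_from_table could be appended to the /etc/hosts shared by all nodes, while hostname_for_mac gives the value to write into a node's /etc/hostname.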

Each node will also require its own NFS mount for /var. To

other things to do

Cluster maintenance tools

update-cluster

To use, run update-cluster-regenerate to update various config files (using /usr/lib/update-cluster/[configuration].updatelist).

update-cluster-regenerate all

dsh

A way to issue commands to multiple nodes (updated using update-cluster).
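
dsh reads the nodes it should contact from a plain list with one machine per line. As a small sketch (the function name and the node names are made up), such a list can be produced like this:

```shell
#!/bin/sh
# Print a dsh machine list: one node name per line.
machines_list() {
    for node in "$@"; do
        printf '%s\n' "$node"
    done
}

# Example node names only.
machines_list node01 node02 node03
```

With the output saved as /etc/dsh/machines.list, a command such as "dsh -a -M -c -- uptime" should run uptime on every listed node concurrently and prefix each output line with the node name.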


Building node kernel from source

cd /srv/cluster/nodes/kernel
tar xjf /usr/src/[kernel-source].tar.bz2
cd [kernel-source]
make-kpkg --rootcmd fakeroot --config menuconfig --append-to-version +abit-kv8-cluster --initrd --revision 0.1 kernel_image

Lessdisks

Installed and configured package "lessdisks". This places a diskless root in /var/lib/lessdisks.

Creates the basic Debian system root for the nodes using debootstrap:

debootstrap --verbose --resolve-deps $RELEASE $DISKLESSROOT $MIRROR
where
$RELEASE="sarge"
$MIRROR="http://debian.csail.mit.edu/debian-amd64/debian/"
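
Written out as a script, the call above might look like the following sketch. The default for DISKLESSROOT is an assumption based on where lessdisks places its root, and the RUN switch and function name are hypothetical additions so the command can be reviewed before it is actually executed.

```shell
#!/bin/sh
# Release and mirror as given in the text.
RELEASE="sarge"
MIRROR="http://debian.csail.mit.edu/debian-amd64/debian/"
# Assumed location of the diskless root; override it in the environment.
DISKLESSROOT="${DISKLESSROOT:-/var/lib/lessdisks}"

# Print the debootstrap command, or run it when RUN=1 is set.
bootstrap_root() {
    set -- debootstrap --verbose --resolve-deps "$RELEASE" "$DISKLESSROOT" "$MIRROR"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

bootstrap_root
```

Running the script with RUN=1 (as root) performs the actual bootstrap; without it, only the command line is printed.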

lessdisks-install (configures lessdisks and installs diskless root in /var/lib/lessdisks/)

debug:
$LDROOT/etc/mkinitrd/scripts/netboot
sources
$LDROOT/etc/lessdisks/mkinitrd/initrd-netboot.conf
script_dir="$CONFDIR/install_scripts/"

scripts in $INITRD/scripts/ get run by....

**************************
initrd needs /var/lib/dhcp for leases!
**************************

added to $LDROOT/etc/lessdisks/mkinitrd/install_scripts/
60_mount_var
#!/bin/sh

# give dhclient a place for its leases
echo " Creating tmpfs /var..."
mount -nt tmpfs tmpfs /var || mount -nt ramfs ramfs /var
mkdir -p /var/lib/dhcp

chown root:lessdisks $LDROOT/etc/lessdisks/mkinitrd/install_scripts/60_mount_var 
chmod 755 $LDROOT/etc/lessdisks/mkinitrd/install_scripts/60_mount_var 

modified /etc/inittab in nfs root:
setup serial console in initrd
single virtual console

TODO:
what about "diskless"?
add "condor" client packages

How lessdisks works

during mkinitrd:
---------------
run /etc/mkinitrd/scripts/netboot
sources /etc/lessdisks/mkinitrd/initrd-netboot.conf (variables)
 cp all /etc/lessdisks/mkinitrd/install_scripts/ to $INITRDDIR/scripts/
 bring in extra executables specified in initrd-netboot.conf (initrd_exe)
  /usr/bin/cut
  /sbin/ifconfig
  /bin/grep
  /usr/bin/head
  /bin/hostname
  /usr/bin/tr
  /sbin/route
  /sbin/udhcpc
 bring in extra files specified in initrd-netboot.conf (initrd_files)
  /dev/ttyS0
  /dev/console
  /etc/lessdisks/mkinitrd/network_script
  /etc/lessdisks/mkinitrd/initrd-netboot.conf
 bring in executable dependencies (ldd $initrd_exe | sort -u | awk '{print $3}')
 copy in modules:
***************************
if [ -z "$MODULEDIR" ]; then
  echo "MODULEDIR not defined, exiting..."
  exit 1
fi

if [ -r "$MODULEDIR/modules.dep" ]; then
  modules_dep_file="$MODULEDIR/modules.dep"
fi

if [ -z "$modules_dep_file" ]; then
  modules_dep_file="/lib/modules/$(uname -r)/modules.dep"
fi

# TODO use "modprobe -nv $module" instead of module dependencies in variables

for m in $nic_modules $nic_dep_modules $net_modules ; do
  module=$(egrep "/$m.o:|/$m.ko:" $modules_dep_file | cut -d : -f 1)
  if [ -n "$module" ]; then
    install_modules="$install_modules $module"
  fi
done
# copy found modules into initrd dir
for m in $install_modules ; do
  if [ -r "$m" ]; then
    mdir="$(dirname $m)"   
    mkdir -p "$INITRDDIR/$mdir"
    cp -a $m $INITRDDIR/$m
  fi
done
***************************
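
As an alternative to keeping dependencies in variables (see the TODO in the script above), the dependency list can be read straight from modules.dep, where each line has the form "module: dep1 dep2 ...". The sketch below is not part of lessdisks; the function name is hypothetical, and it only returns whatever dependencies depmod recorded on the matching line.

```shell
#!/bin/sh
# Print the path of module NAME followed by the dependencies listed
# for it in the given modules.dep file, one path per line.
# usage: module_with_deps NAME MODULES_DEP
module_with_deps() {
    awk -v name="$1" '
        {
            path = $1
            sub(/:$/, "", path)           # strip the trailing colon
            n = split(path, parts, "/")
            base = parts[n]               # file name without directory
            if (base == name ".ko" || base == name ".o") {
                print path
                for (i = 2; i <= NF; i++) print $i
            }
        }
    ' "$2"
}

# usage: module_with_deps via-velocity "$MODULEDIR/modules.dep"
```

Because the file name is compared exactly, a module whose name merely contains the search string is not matched.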

add /etc/init.d/lessdisk /etc/init.d/lessdisk-session to nfsroot

loading initrd /scripts:
---------------


Last modified: Sat Sep 10 13:49:17 EDT 2005