
Booting DHCP for the Big Cluster

To be the DHCP server, you must be on an x86 computer.

To set up the DHCP server, follow this sequence:

  • Boot up to the BCCD splash screen
  • Hit F3, then type framebuffer_mode_number startdhcp, where framebuffer_mode_number selects the screen resolution (for example, 4 is 1024x768).
  • Enter the password we decide on the day of the event.
  • Follow directions for trivial-net-setup. Hit Enter to select the highlighted answer and the arrow keys to change the selection.
    • Say NO when it asks if it should autoconfigure with DHCP and YES/OK for everything else.
    • When it asks for IP addresses, configure as in the examples. You can just type in the addresses they use in the dialogs, which are
      • IP address 192.168.1.1
      • netmask 255.255.255.0
      • router address 192.168.1.254
      • DNS server 192.168.1.1

  • When you get to the option of logging in:
    • Login as root, using the password listed at the login prompt
    • Change the password. If you are helping the owner, let the owner set the password.
    • Copy the example code from wherever to ~bccd/src
    • chown -R bccd ~bccd/src (so that the bccd user owns the copied files)
    • df to get a list of the mounted partitions
    • Run umount partition for each of your local drive partitions (e.g. umount /mnt/rw/discs/disc0/part3/home/fred). Macs don't seem to mount all of your local drives.
    • exit
  • Sign in as bccd, with the password given earlier.
  • Answer yes when it asks if you want to run a heartbeat.
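
The df and umount steps in the root session can be scripted. The following is a minimal sketch, not part of BCCD: list_local_mounts is a hypothetical helper name, and the /mnt/rw/discs prefix is an assumption inferred from the example path above. It reads df -P output and prints the mount points under that prefix, so you can review the list before unmounting anything.

```shell
# Hypothetical helper (not a BCCD command): read `df -P` style output on
# stdin and print every mount point that starts with the given prefix.
# Assumes mount points contain no spaces, so the mount point is field 6.
list_local_mounts() {
    awk -v prefix="$1" 'NR > 1 && index($6, prefix) == 1 { print $6 }'
}

# Example (run as root); the prefix is an assumption based on the sample path:
#   df -P | list_local_mounts /mnt/rw/discs
#   df -P | list_local_mounts /mnt/rw/discs | while read -r m; do umount "$m"; done
```

Running the first example line on its own prints the candidates without touching them, which is a reasonable check on someone else's machine.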

Switch to LAM

To switch to LAM/MPI:
  • edit ~/.bashrc
    • edit the PATH line so that the line reads export PATH=/lam-mpi/bin:$PATH
    • write file and quit
  • source ~/.bashrc (or log out and log back in)
  • For each node, rebuild the library cache:
    • su - root (using the root password given)
    • ldconfig -v | less
    • exit (back to bccd)
  • bccd-allowall (Answer yes.)
  • bccd-snarfhosts
  • recon -v ~/machines (This might take a few tries to succeed; the cause is unknown.)
  • lamboot -v ~/machines
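
If ~/.bashrc is sourced more than once in the same session (for example, after further edits), a plain export PATH=/lam-mpi/bin:$PATH prepends the directory again each time. A minimal sketch of one way to avoid that, assuming a POSIX-style shell; prepend_path is a hypothetical helper name, not something BCCD or LAM provides:

```shell
# Hypothetical helper: print the PATH-style string $2 with directory $1 in
# front, unless $1 is already the first entry.
prepend_path() {
    case "$2" in
        "$1"|"$1":*) printf '%s\n' "$2" ;;      # already first: leave unchanged
        "")          printf '%s\n' "$1" ;;      # empty list: just the directory
        *)           printf '%s\n' "$1:$2" ;;   # otherwise put it in front
    esac
}

# Possible use in ~/.bashrc, in place of the plain export line:
#   export PATH="$(prepend_path /lam-mpi/bin "$PATH")"
```

The helper only checks the first entry, which matches the goal here: the LAM binaries should be found before any other MPI installation on the PATH.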

Optionally, you can start the X Window System:

  • startx

Compile the target code

Everybody needs to compile the target code.
  • bccd-syncdir ~bccd/src ~/machines
  • cp -r dirname/cs521.arch ~bccd/cs521 (where dirname is the tmp dir name and arch is either x86 or ppc, e.g. cp -r /tmp/6g2w98s/cs521.x86 ~bccd/cs521)
  • cd ~bccd/cs521
  • make
  • run the program
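
The arch part of the cp command above can be derived from the machine type instead of being typed by hand. This is a sketch under stated assumptions: arch_suffix is a hypothetical helper name, and the mapping from uname -m strings to x86 or ppc is a guess at what the cluster machines report, so adjust it if yours print something else. The temporary directory name still has to be filled in by you.

```shell
# Hypothetical helper: map a `uname -m` machine string to the suffix used by
# the cs521.<arch> directories. The list of machine strings is an assumption.
arch_suffix() {
    case "$1" in
        i?86|x86_64)                          echo x86 ;;
        ppc|ppc64|powerpc|"Power Macintosh")  echo ppc ;;
        *) echo "unknown machine type: $1" >&2; return 1 ;;
    esac
}

# Possible use, with dirname set to the tmp dir name as in the step above:
#   arch=$(arch_suffix "$(uname -m)") && cp -r "$dirname/cs521.$arch" ~bccd/cs521
```

The helper returns a non-zero status for unrecognized machines, so the cp in the usage line is skipped rather than run with a wrong directory name.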

As more machines come online, you might need to refresh your system state:

  • bccd-allowall
  • bccd-snarfhosts
  • recon -v ~/machines

There is some order dependency that Ducky hasn't quite figured out yet; keep repeating these commands and eventually everything settles.
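
The "keep repeating" advice can be wrapped in a small loop. The following is a minimal sketch, not a BCCD tool: retry and refresh_once are hypothetical names, and the attempt count and delay in the usage line are arbitrary assumptions. One pass runs the three refresh commands in order and stops at the first failure; the loop repeats the pass until it succeeds or the limit is reached.

```shell
# One pass over the refresh commands, stopping at the first failure.
# bccd-allowall may still prompt; answer yes as before.
refresh_once() {
    bccd-allowall && bccd-snarfhosts && recon -v "$HOME/machines"
}

# Hypothetical helper: run a command until it succeeds.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
    max="$1"; delay="$2"; shift 2
    n=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$n" -ge "$max" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Possible use (5 attempts, 2 seconds apart; both numbers are assumptions):
#   retry 5 2 refresh_once
```

If the loop gives up, the last error printed by recon is usually the most useful place to start looking.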

Topic revision: r1 - 2006-04-07 - TWikiGuest
 