Difference: BigClusterNodeInstructions (1 vs. 5)

Revision 5 - 2006-04-11 - TWikiGuest

Line: 1 to 1
 
META TOPICPARENT name="BigClusterProject20060411"

Big Cluster Node Instructions

If you are the DHCP master, see BigClusterDHCPInstructions.
Line: 21 to 21
 
  • Follow directions for trivial-net-setup.
    • Say YES/OK (Enter to accept) to everything.
    • When it tells you what your IP address is, write the last number down in big letters (easy to read from six feet away). So if it tells you that you are 192.168.1.105, write down 105 on the card.
Changed:
<
<
    • Write your name (legibly!) on the card. If you are not the owner, write the owner's name down also. First name is probably enough.
>
>
    • Write your name (legibly!) on the index card. If you are not the owner, write the owner's name down also. First name is probably enough.
 
  • When you get to the option of logging in:
    • Login as root, using the password listed at the login prompt
Line: 31 to 31
 
    • exit
  • Sign in as bccd, with the password given earlier.
  • Answer yes when it asks if you want to run a heartbeat.
Deleted:
<
<

Switch to LAM

Everybody continue on to switching to LAM:
  • edit ~/.bashrc
    • edit the PATH line so that the line reads export PATH=/lam-mpi/bin:$PATH
    • write file and quit
  • source ~/.bashrc (or log out and log back in)
  • For each node, rebuild the library cache:
    • su - root (using the root password given)
    • ldconfig -v | less
    • exit (back to bccd)
 
  • bccd-allowall (Answer yes.)
  • bccd-snarfhosts
Changed:
<
<
  • recon -v ~/machines
  • lamboot -v ~/machines
>
>
  Windows users only have the option of
  • startx
Line: 54 to 42
 

Compile the target code

Everybody needs to compile the target code.
Changed:
<
<
  • The DHCP server needs to push out the demo code.
>
>
  • Ducky needs to push out the demo code.
 
  • The code will show up in a directory named something like /tmp/6g2w98s.
Changed:
<
<
  • cp -r dirname/name.arch ~bccd/cs521 (where dirname is the tmp dir name, name is the name of the code we're going to run, and arch is either x86 or ppc, e.g. cp -r /tmp/6g2w98s/findPi.x86 ~bccd/cs521)
  • cd ~bccd/cs521
>
>
  • cp -r dirname/povray31 ~bccd/ (where dirname is something like /tmp/6g2w98s)

For x86 machines, the code should just run. PPC users (and those who feel inclined to do more) recompile by doing:

  • cd ~bccd/povray31/source/libpng
  • make -f makefile.lnx clean
  • make -f makefile.lnx
  • cd ~bccd/povray31/source/mpi-unix
  • change the makefile to use the Generic Linux CFLAGS: comment out line 71 and uncomment 65.
  • make clean
 
  • make
Added:
>
>
  • cd ~bccd/povray31
 
  • When you are done with all your setup, stick the index card in your keyboard (so that we can easily find a node if we need to).
Added:
>
>
Other code that we might run:
  • hello world
    • cd ~bccd
    • mpicc -o hello hello_*.c (NOTE THE UNDERSCORE!)
    • mpirun -np 8 -machinefile ~/machines hello
  • various code in ~bccd/lam-mpi. Note that it requires a fair amount of gymnastics to switch to LAM.
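
The hello world run above hard-codes 8 processes. As a rough sketch, the process count could instead be read from the machine file. The helper name count_hosts is made up for illustration, and it assumes ~/machines lists one host per line, which these instructions do not state.

```shell
# Count usable entries in a machine file.
# ASSUMPTION: one host per line; blank lines and lines starting with '#' are skipped.
count_hosts() {
    awk 'NF && $1 !~ /^#/ { n++ } END { print n + 0 }' "$1"
}

# Possible use (not run here, needs the cluster tools):
#   mpirun -np "$(count_hosts ~/machines)" -machinefile ~/machines hello
```

This starts one process per listed host; whether that is the right count for a given demo is a judgment call.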


To switch to LAM

To switch to LAM, you need to do this:

  • edit ~/.bashrc
    • edit the PATH line so that the line reads export PATH=/lam-mpi/bin:$PATH
    • write file and quit
  • source ~/.bashrc (or log out and log back in)
  • For each node, rebuild the library cache:
    • su - root (using the root password given)
    • ldconfig -v | less
    • exit (back to bccd)
  • bccd-allowall (Answer yes.)
  • bccd-snarfhosts
  • recon -v ~/machines
  • lamboot -v ~/machines
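
If ~/.bashrc gets sourced more than once, a plain export PATH=/lam-mpi/bin:$PATH keeps growing the PATH. A minimal sketch of a guard is below; prepend_path is a hypothetical helper, not something BCCD provides.

```shell
# Print a PATH value that has directory $1 in front of the existing value $2.
# If $1 is already the first entry, the value is printed unchanged.
prepend_path() {
    case "$2" in
        "$1"|"$1:"*) printf '%s\n' "$2" ;;
        *)           printf '%s\n' "$1:$2" ;;
    esac
}

# Possible line for ~/.bashrc:
#   export PATH="$(prepend_path /lam-mpi/bin "$PATH")"
```

The effect is the same as the PATH line given in the steps above, except that repeating it does not stack duplicate entries.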
 

As more machines come online, you might need to refresh your system state:

Revision 4 - 2006-04-11 - TWikiGuest

Line: 1 to 1
 
META TOPICPARENT name="BigClusterProject20060411"

Big Cluster Node Instructions

If you are the DHCP master, see BigClusterDHCPInstructions.
Line: 13 to 13
 

On the Mac, it will boot you straight through the boot sequence. On x86 machines:

Changed:
<
<
  • Boot up to the BCCD splash screen.
>
>
  • Boot up to the BCCD splash screen. (When it says boot>, ignore it. After a second, it will continue booting.)
 
  • Hit Enter

Everybody (x86 and PPC both) then:

Revision 3 - 2006-04-11 - TWikiGuest

Line: 1 to 1
 
META TOPICPARENT name="BigClusterProject20060411"

Big Cluster Node Instructions

If you are the DHCP master, see BigClusterDHCPInstructions.
Added:
>
>
Please read and follow these instructions!

You've been given this sheet and an index card. The index card is to identify your computer; instructions are later on.

 

Boot

  • Follow directions for trivial-net-setup. Hit Enter to select the highlighted answer and the arrow keys to change the selection.
Line: 16 to 20
 
  • Enter the password we give you on the day of the event.
  • Follow directions for trivial-net-setup.
    • Say YES/OK (Enter to accept) to everything.
Added:
>
>
    • When it tells you what your IP address is, write the last number down in big letters (easy to read from six feet away). So if it tells you that you are 192.168.1.105, write down 105 on the card.
    • Write your name (legibly!) on the card. If you are not the owner, write the owner's name down also. First name is probably enough.
 
  • When you get to the option of logging in:
    • Login as root, using the password listed at the login prompt
Line: 54 to 60
 
  • cd ~bccd/cs521
  • make
Added:
>
>
  • When you are done with all your setup, stick the index card in your keyboard (so that we can easily find a node if we need to).
 

Revision 2 - 2006-04-11 - TWikiGuest

Line: 1 to 1
 
META TOPICPARENT name="BigClusterProject20060411"

Big Cluster Node Instructions

If you are the DHCP master, see BigClusterDHCPInstructions.
Line: 8 to 8
 
  • Follow directions for trivial-net-setup. Hit Enter to select the highlighted answer and the arrow keys to change the selection.

Changed:
<
<
All the other machines need to be clients. On the Mac, it will boot you straight through the boot sequence. On x86 machines:
>
>
On the Mac, it will boot you straight through the boot sequence. On x86 machines:
 
  • Boot up to the BCCD splash screen.
Changed:
<
<
  • Hit F3
  • Type framebuffer_mode_number nodemode (framebuffer_mode_number just refers to what screen resolution to use; 4 is 1024x768.)
>
>
  • Hit Enter
 
Changed:
<
<

Everybody (x86 and PPT both) then:

>
>
Everybody (x86 and PPC both) then:
 
  • Enter the password we give you on the day of the event.
  • Follow directions for trivial-net-setup.
Changed:
<
<
    • Say NO when it asks if it should autoconfigure with DHCP and YES/OK for everything else.
    • When it asks for IP addresses, configure as in the examples. You can just type in the addresses they use in the dialogs, which are
      • IP address 192.168.1.hostnumber (We will give you a hostnumber.)
      • netmask 255.255.255.0
      • router address 192.168.1.254
      • DNS server 192.168.1.1
>
>
    • Say YES/OK (Enter to accept) to everything.
 
  • When you get to the option of logging in:
    • Login as root, using the password listed at the login prompt
    • Change the password. If you are helping the owner, let the owner set the password.
    • df to get a list of the mounted partitions
Changed:
<
<
    • umount partition for all of your local drive partitions, (e.g. umount /mnt/rw/discs/disc0/part3/home/fred) Macs don't seem to mount all your local drives.
>
>
    • umount partition for all of your local drive partitions, (e.g. umount /mnt/rw/discs/disc0/part3/home/fred) Macs don't seem to mount any of your local drives.
 
    • exit
  • Sign in as bccd, with the password given earlier.
  • Answer yes when it asks if you want to run a heartbeat.

Revision 1 - 2006-04-07 - TWikiGuest

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="BigClusterProject20060411"

Big Cluster Node Instructions

If you are the DHCP master, see BigClusterDHCPInstructions.

Boot

  • Follow directions for trivial-net-setup. Hit Enter to select the highlighted answer and the arrow keys to change the selection.

All the other machines need to be clients. On the Mac, it will boot you straight through the boot sequence. On x86 machines:

  • Boot up to the BCCD splash screen.
  • Hit F3
  • Type framebuffer_mode_number nodemode (framebuffer_mode_number just refers to what screen resolution to use; 4 is 1024x768.)

Everybody (x86 and PPT both) then:

  • Enter the password we give you on the day of the event.
  • Follow directions for trivial-net-setup.
    • Say NO when it asks if it should autoconfigure with DHCP and YES/OK for everything else.
    • When it asks for IP addresses, configure as in the examples. You can just type in the addresses they use in the dialogs, which are
      • IP address 192.168.1.hostnumber (We will give you a hostnumber.)
      • netmask 255.255.255.0
      • router address 192.168.1.254
      • DNS server 192.168.1.1

  • When you get to the option of logging in:
    • Login as root, using the password listed at the login prompt
    • Change the password. If you are helping the owner, let the owner set the password.
    • df to get a list of the mounted partitions
    • umount partition for all of your local drive partitions, (e.g. umount /mnt/rw/discs/disc0/part3/home/fred) Macs don't seem to mount all your local drives.
    • exit
  • Sign in as bccd, with the password given earlier.
  • Answer yes when it asks if you want to run a heartbeat.

Switch to LAM

Everybody continue on to switching to LAM:
  • edit ~/.bashrc
    • edit the PATH line so that the line reads export PATH=/lam-mpi/bin:$PATH
    • write file and quit
  • source ~/.bashrc (or log out and log back in)
  • For each node, rebuild the library cache:
    • su - root (using the root password given)
    • ldconfig -v | less
    • exit (back to bccd)
  • bccd-allowall (Answer yes.)
  • bccd-snarfhosts
  • recon -v ~/machines
  • lamboot -v ~/machines

Windows users only have the option of

  • startx
but Mac users, your trackpad might not work; you might get stuck and hosed. Don't startx.

Compile the target code

Everybody needs to compile the target code.
  • The DHCP server needs to push out the demo code.
  • The code will show up in a directory named something like /tmp/6g2w98s.
  • cp -r dirname/name.arch ~bccd/cs521 (where dirname is the tmp dir name, name is the name of the code we're going to run, and arch is either x86 or ppc, e.g. cp -r /tmp/6g2w98s/findPi.x86 ~bccd/cs521)
  • cd ~bccd/cs521
  • make
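
The directory to copy is made up of the tmp dir name, the code name, and the architecture suffix. A small sketch of putting that together follows. Both helper names are invented here, and the uname -m patterns are assumptions about what the nodes report, not something taken from these instructions.

```shell
# Map a `uname -m` value to the suffix used on the pushed-out directories.
# ASSUMPTION: the patterns below are guesses at what x86 and PPC nodes report.
arch_suffix() {
    case "$1" in
        i?86|x86_64)   echo x86 ;;
        ppc*|powerpc*) echo ppc ;;
        *)             return 1 ;;
    esac
}

# Build the source directory: dirname/name.arch
demo_source() {
    printf '%s/%s.%s\n' "$1" "$2" "$3"
}

# Possible use (not run here):
#   cp -r "$(demo_source /tmp/6g2w98s findPi "$(arch_suffix "$(uname -m)")")" ~bccd/cs521
```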


As more machines come online, you might need to refresh your system state:

  • bccd-allowall
  • bccd-snarfhosts
  • recon -v ~/machines
There is some order dependency that Ducky hasn't quite figured out yet; keep doing those and eventually it will all get settled out.
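
Since the advice is to keep repeating the refresh commands until things settle, a retry wrapper is one way to do that. This is only a sketch: retry and RETRY_DELAY are names invented here, and it assumes the commands exit non-zero when they fail, which has not been checked for the bccd-* tools.

```shell
# Run a command up to $1 times, stopping at the first success.
# RETRY_DELAY (seconds between attempts, default 1) is a made-up knob.
retry() {
    tries=$1
    shift
    i=1
    while [ "$i" -le "$tries" ]; do
        "$@" && return 0
        i=$((i + 1))
        sleep "${RETRY_DELAY:-1}"
    done
    return 1
}

# Possible use (not run here, needs the cluster tools):
#   refresh() { bccd-allowall && bccd-snarfhosts && recon -v ~/machines; }
#   retry 5 refresh
```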

Mac users, if you have trouble rebooting into OS X immediately after booting into BCCD, try holding Control-Command-Power after a non-starting boot.

 