sfCluster installation
---------------------------------------------------------
If you are installing sfCluster from the tar.gz archive, install the following first:
R
Perl
SSH (OpenSSH or a replacement)
LAM (with header files; most likely a package with a "-dev" suffix)
ncurses (with header files; most likely a package with a "-dev" suffix)
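A quick way to see which of these prerequisites are already present is to look for their commands in the PATH. The following is only a minimal sketch: the command names checked (R, perl, ssh, lamboot) are assumptions about how the tools are installed on your system, and the header files for LAM and ncurses cannot be detected this way.

```shell
#!/bin/sh
# Print the name of every given command that is NOT found in PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed command names; adjust them to your distribution.
missing=$(missing_tools R perl ssh lamboot)

if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing prerequisites:" $missing
else
    echo "All checked prerequisites found."
fi
```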
Required CRAN packages (R) will be installed if they are not already present.
Required CPAN packages (Perl) will be installed if they are not already present.
Both the CRAN and CPAN installers may prompt you for installation options. For CRAN, just
choose a mirror. For CPAN, you can leave all variables at their defaults (just press enter
at each question) and only need to choose a specific mirror, as with CRAN.
Installation
---------------------------------------------------------
The installation follows the common Unix style, just run
./configure
make
make install
for a normal installation (using the default directories and the default R).
Interesting options:
--disable-rwrapper
Do not install the R wrappers for sfCluster calls, i.e. the commands
R CMD x
where x is one of PAR/PARBATCH/PARMON/SEQ, covering all sfCluster running modes.
--disable-rpackages
Do not install the required R packages (Rmpi, snow, snowfall) from CRAN.
--disable-perlpackages
Do not install the required Perl packages.
--prefix=
Install sfCluster to a different location. The default prefix depends on your system
(most likely /usr or /usr/local).
--localstatedir=
This is an important directory: the slave log files are saved here. Its
path needs to be identical on ALL nodes (even if the nodes have different
sfCluster or R installations), but the directory itself MUST be unique to
each machine (so it must not be located on a shared drive such as NFS)!
You may want to create a dedicated or symlinked directory on each machine
that points to a temporary directory.
(Only temporary data is saved in this directory.)
For all configuration options, call
./configure --help
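As an illustration of how the options above combine, the following sketch assembles a configure call with a custom prefix and a machine-local state directory. Both paths are hypothetical examples, not defaults of sfCluster. The command is only printed so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Hypothetical example locations; choose paths that suit your site.
PREFIX=/opt/sfcluster
# Same path on every node, but on a local (non-NFS) disk of each machine.
STATE_DIR=/var/local/sfcluster

# Only flags documented in this file are used.
configure_cmd="./configure --prefix=$PREFIX --localstatedir=$STATE_DIR --disable-rwrapper"

# Print the command for review; run it from the unpacked source directory.
echo "$configure_cmd"
```

After checking the printed line, run it, followed by "make" and "make install" as described above.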
Host-Configuration
---------------------------------------------------------
After a first installation, only "localhost" is configured for cluster usage.
If you are running a single-machine cluster (your local machine), you are done.
Otherwise you have to configure every machine that should be usable in the cluster.
The hostfile is usually located at
/usr/local/etc/sfCluster/hosts.cfg
but this depends on the prefix chosen during configuration.
If you are running a single machine and never get as many CPUs as are physically
present in your machine, you probably want to increase the option "maxload" for
localhost, too (maxload is set to the number of your CPUs during installation, and
depending on your load, you may never get any CPU if you have only one machine).
SSH access
---------------------------------------------------------
You need to have passwordless SSH access to all machines (nodes) in your cluster.
If you do not have this yet, search the web or take a look at, for example, this
page:
http://ariadne.mse.uiuc.edu/Cluster/ssh_log_through.html
If you are running "lsh" instead of "OpenSSH", this can be quite difficult and a lot
of work (although the result is safer). We recommend OpenSSH, as it is easier to
handle and secure enough.
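To verify that passwordless login really works before using a node, you can attempt a non-interactive connection. The following is a minimal sketch assuming OpenSSH. The SSH variable is a hypothetical override added here for illustration, and the node names are passed as script arguments. With OpenSSH, a key pair is typically created with "ssh-keygen" and copied to a node with "ssh-copy-id".

```shell
#!/bin/sh
# Return success if the node accepts a login without any password prompt.
# BatchMode=yes makes OpenSSH fail instead of asking for a password.
# SSH is a hypothetical override variable; it defaults to plain "ssh".
check_node() {
    "${SSH:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$1" true >/dev/null 2>&1
}

# Check every node name given on the command line.
for node in "$@"; do
    if check_node "$node"; then
        echo "$node: passwordless login ok"
    else
        echo "$node: needs key setup (ssh-keygen, then ssh-copy-id $node)"
    fi
done
```

Saved under a name of your choice (say, check_ssh.sh), it could be called as "sh check_ssh.sh node1 node2".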
Slave installation
---------------------------------------------------------
You need to install sfCluster on every machine that should be used as a node
in your cluster.
Don't be afraid: you don't have to install it manually on each node, as sfCluster
builds its own installer during its installation.
After editing your hostfile you can call
sfNodeInstall nodename
for any node included in the hostfile. The installation will be identical to the
current installation unless you override parameters on the command line. Call
sfNodeInstall --help
for more information.
As installation requires root (administrator) rights in most cases (see the Notes
section), but most machines are configured to disallow SSH root logins, sfNodeInstall
offers two methods for gaining root rights on the nodes:
--su - change to root via 'su' (Debian, Suse, Fedora etc.)
--sudo - change to root via 'sudo' (Ubuntu etc.)
If root access is permitted this way, call sfNodeInstall from an ordinary user account, of course.
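For more than a handful of nodes, the calls can be generated in a loop. The sketch below only prints the commands (a dry run). The node names are hypothetical, and placing the method flag before the node name is an assumption; check "sfNodeInstall --help" for the actual argument order.

```shell
#!/bin/sh
# Print one sfNodeInstall call per node, without executing anything.
# $1 is the root method (--su or --sudo); the remaining arguments are nodes.
node_install_cmds() {
    method=$1
    shift
    for node in "$@"; do
        # Assumed argument order: method flag first, then the node name.
        printf 'sfNodeInstall %s %s\n' "$method" "$node"
    done
}

# Hypothetical node names taken from your hostfile.
node_install_cmds --sudo node1 node2
```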
Running / getting started
---------------------------------------------------------
If you want to run sfCluster on a single machine only, you are done.
For real cluster usage, you have to edit the hostfile "hosts.cfg" AFTER the
installation.
To run sfCluster, you can either call
sfCluster -i
sfCluster -b
sfCluster -m
sfCluster -s
or the wrappers (if installed):
R CMD PAR
R CMD PARBATCH
R CMD PARMON
R CMD SEQ
To get started, just call "sfCluster --help" or read the vignette of the snowfall
package.
Notes
---------------------------------------------------------
Currently, the installation only works correctly when run as administrator (root).
This is only because of the CPAN packages, which themselves require a recent version
of the CPAN module in order to install into user directories. Sadly, this new version
is not available through common Linux distributions (like Ubuntu/Debian/Suse...).
A user install is possible if you first update that module manually (as root), using
perl -MCPAN -e 'install Bundle::CPAN'
Internal notes
---------------------------------------------------------
snow can be used as normal even after the sfCluster installation (although snow's
RunSnowNode and RMPInode.sh are replaced). Both also work with the default paths of R,
but run the slaves in normal mode rather than in "vanilla" mode.
Multiple R versions
---------------------------------------------------------
By default, sfCluster is installed only for the default R (the one that runs when
you call "R").
For multiple R versions you need to configure the base paths manually in the
slave starting script "RunSnowNode" (located in a path that depends on $exec_prefix
at configure time).
Documentation is inside the script.
If you run 32- and 64-bit environments together, it should work out of the box,
as R reports its "R CONF LIBnn" value, which is queried and set during installation.
Known Bugs
---------------------------------------------------------
On Ubuntu 8 (Hardy Heron), Rmpi does not work with LAM/MPI.
On Ubuntu 7, make sure NOT to install Rmpi via the Ubuntu package tools (Aptitude, apt-get),
but to install it from within R (with "install.packages( 'Rmpi' )").
Future
---------------------------------------------------------
- Possibility to use OpenMPI instead of LAM.
- Installation packages for the most common Linux distributions (Ubuntu, Fedora, Suse).
- Extended documentation.
Deinstallation
---------------------------------------------------------
Just call the following in the installation folder:
make deinstall