sfCluster installation
---------------------------------------------------------

If you are installing sfCluster from the tar.gz archive, install the
following first:

- R
- Perl
- SSH (OpenSSH or a replacement)
- LAM (with header files; most likely the package with the "-dev" suffix)
- ncurses (with header files; most likely the package with the "-dev" suffix)

Required CRAN packages (R) are installed if they are missing.
Required CPAN packages (Perl) are installed if they are missing.

Both the CRAN and CPAN installers may prompt you for installation options.
For CRAN, just choose a mirror. For CPAN, you can leave all settings at
their defaults (just press Enter through the questions) and only need to
choose a mirror, as with CRAN.

Installation
---------------------------------------------------------

The installation follows the common Unix style. Just run

  ./configure
  make
  make install

for a normal installation (using default directories and the default R).

Interesting options:

  --disable-rwrapper
      Do not install the R wrappers for sfCluster calls. The wrappers
      provide the commands
        R CMD x
      where x can be PAR/PARBATCH/PARMON/SEQ, covering all sfCluster
      running modes.

  --disable-rpackages
      Do not install the required R packages (Rmpi, snow, snowfall)
      from CRAN.

  --disable-perlpackages
      Do not install the required Perl packages.

  --prefix=
      Install sfCluster to a different location. The default prefix
      depends on your system (most likely /usr or /usr/local).

  --localstatedir=
      This is an important directory: the slave log files are saved here.
      The location needs to be identical on ALL nodes (even if the nodes
      have different sfCluster or R installations), but MUST be unique to
      each machine, so it must not be located on a shared drive (NFS or
      similar)! You may want to create an artificial or linked directory
      on each machine that points to a temporary directory. (Only
      temporary data is saved in this directory.)
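
As an illustration of combining the options above, the following sketch
configures sfCluster with a custom prefix and a node-local state directory
and skips the R wrappers. Both paths are hypothetical examples, not sfCluster
defaults, and the DRY_RUN switch and the "run" helper exist only in this
sketch. By default the commands are only printed; set DRY_RUN=0 to execute
them from the unpacked source directory.

```shell
#!/bin/sh
# Hypothetical locations, chosen only for illustration.
PREFIX="/opt/sfCluster"
# Same path on every node, but on a local (non-shared) disk of each machine.
LOCALSTATEDIR="/var/tmp/sfCluster"

# DRY_RUN=1 (default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Stop at the first failing step.
run ./configure --prefix="$PREFIX" --localstatedir="$LOCALSTATEDIR" --disable-rwrapper &&
run make &&
run make install
```

Printing the commands first makes it easy to check the chosen directories
before anything is written to the system.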

For all configuration options call

  ./configure --help

Host-Configuration
---------------------------------------------------------

After a first installation, only "localhost" is configured for cluster
usage. If you are running a single-machine cluster (your local machine),
you are fine. Otherwise you have to configure every machine that should be
usable in the cluster.

The hostfile is usually located at

  /usr/local/etc/sfCluster/hosts.cfg

but this depends on the prefix chosen during configuration.

If you are running a single machine and never get as many CPUs as are
physically present in it, you probably want to increase the "maxload"
option for localhost as well. maxload is set to the number of your CPUs
during installation, so depending on your load you may never get any CPU
if you have only one machine.

SSH access
---------------------------------------------------------

You need passwordless SSH access to all machines (nodes) in your cluster.
If you do not have this yet, search the web or take a look at this page,
for example:

  http://ariadne.mse.uiuc.edu/Cluster/ssh_log_through.html

If you are running "lsh" instead of "OpenSSH", this can be quite difficult
and a lot of work (although the result is safer). We recommend OpenSSH for
easier handling and sufficient security.

Slave installation
---------------------------------------------------------

You need to install sfCluster on every machine that should be used as a
node in your cluster.

Don't be afraid: you don't have to install it manually on each node.
sfCluster builds its own installer during its installation. After editing
your hostfile you can call

  sfNodeInstall nodename

for any node included in the hostfile.

The installation will be identical to the current installation unless you
override parameters on the command line. Call

  sfNodeInstall --help

for more information.
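
For several nodes, the call above can be wrapped in a small loop. The
following is only a sketch: the node names are hypothetical placeholders
for the hosts in your hostfile, and the DRY_RUN switch and the
"install_node" helper exist only in this example. It assumes OpenSSH, whose
BatchMode option makes ssh fail instead of asking for a password, so nodes
without working passwordless SSH are skipped. By default the sfNodeInstall
calls are only printed; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Hypothetical node names; replace with the hosts from your hostfile.
NODES="node1 node2 node3"

# DRY_RUN=1 (default) prints the installer calls instead of running them.
DRY_RUN="${DRY_RUN:-1}"

install_node() {
    node="$1"
    if [ "$DRY_RUN" = "1" ]; then
        echo "sfNodeInstall $node"
    elif ssh -o BatchMode=yes "$node" true; then
        # Passwordless login works, so run the node installer.
        sfNodeInstall "$node"
    else
        echo "skipping $node: passwordless SSH is not working" >&2
        return 1
    fi
}

for node in $NODES; do
    install_node "$node"
done
```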

Installation requires root (administrator) rights in most cases (see
Notes), but most machines are configured to disallow SSH root logins.
sfNodeInstall therefore offers two methods for installing nodes:

  --su   - change to root via 'su' (Debian, Suse, Fedora etc.)
  --sudo - change to root via 'sudo' (Ubuntu etc.)

If root access is permitted, call sfNodeInstall from an ordinary user
account, of course.

Running / getting started
---------------------------------------------------------

If you want to run sfCluster on a single machine only, you are finished.
For real cluster usage, you have to edit the hostfile "hosts.cfg" AFTER
the installation.

To run it you can either call

  sfCluster -i
  sfCluster -b
  sfCluster -m
  sfCluster -s

or the wrappers (if installed):

  R CMD PAR
  R CMD PARBATCH
  R CMD PARMON
  R CMD SEQ

For a start, just call "sfCluster --help" or read the vignette of the
snowfall package.

Notes
---------------------------------------------------------

Currently, the installation only works correctly when run as
administrator (root). This is only because of the CPAN packages, which
themselves require a newer CPAN module in order to install into user
directories. Sadly, the new version of this package is not available
through common Linux distributions (such as Ubuntu, Debian, Suse...).

A user install is possible if you first update that package manually
(as root), using

  perl -MCPAN -e 'install Bundle::CPAN'

Internal notes
---------------------------------------------------------

snow can be used as normal even after the sfCluster installation (although
snow's RunSnowNode and RMPInode.sh are replaced). Both also work with the
default paths of R, but they run slaves in normal mode rather than in
"vanilla" mode.

Multiple R versions
---------------------------------------------------------

By default, sfCluster is installed only for the default R (the one that is
run when you call "R").
For multiple R versions you need to configure the base paths manually in
the slave starting script "RunSnowNode" (its location depends on
$exec_prefix at configure time). The documentation is inside the script.

If you run 32- and 64-bit environments together, it should work out of the
box, as R delivers its "R CONF LIBnn" value, which is queried and set
during installation.

Known Bugs
---------------------------------------------------------

On Ubuntu 8 (Hardy Heron), Rmpi does not work with LAM/MPI.

On Ubuntu 7, make sure NOT to install Rmpi through the Ubuntu package
tools (Aptitude, apt-get); install it from within R instead
(with "install.packages( 'Rmpi' )").

Future
---------------------------------------------------------

- Possibility to use OpenMPI instead of LAM.
- Installation packages for the most common Linux distributions
  (Ubuntu, Fedora, Suse).
- Extended documentation.

Deinstallation
---------------------------------------------------------

Just call, in the installation folder:

  make deinstall