- Socket connection: the easiest type, where everything is managed inside R. Socket connections run over a direct TCP/IP connection and so can be used on virtually any machine. If you just want to use parallelization on one computer (laptop or workstation) or on very few machines, this is all you need. The biggest advantage is that you do not have to install additional software to use this kind of cluster.
- MPI: Message Passing Interface. Basically a definition of a networking protocol. There are several different implementations today, of which Open MPI is the most common and widely used. sfCluster uses the somewhat older LAM, but will support Open MPI in the future as well. Open MPI home.
- PVM: Parallel Virtual Machine. Most Unix distributions offer packages for PVM. PVM home.
- NetWorkSpaces/NWS is a framework for coordinating programs written in scripting languages. It supports parallel computing with its Sleight mode. NetWorkSpaces for R.
For most of these cluster techniques, secure shell (SSH) connections are needed. As these normally require passwords to be typed in, you should have passwordless access to the machines you want to use (even if it is only your local machine).
A description of how to set this up can be found here.
First, install snow and snowfall on any machine you want to use in your cluster:
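The installation command itself is not shown above. As a minimal sketch, assuming `Rscript` is on the PATH and that the CRAN cloud mirror is acceptable (both are assumptions, as is the helper name `install_snowfall`), the two packages could be installed from the shell like this:

```shell
# Install snow and snowfall from CRAN into the default R library.
# Assumption: Rscript is available; the mirror URL is one possible choice.
install_snowfall() {
    Rscript -e 'install.packages(c("snow", "snowfall"), repos = "https://cloud.r-project.org")'
}

# Run this on every machine that should take part in the cluster:
# install_snowfall
```

Inside a running R session, the same `install.packages()` call can be typed directly.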
If you want to use more than one machine, or an existing cluster installation at your institute (like MPI or PVM), the required R packages are most likely already installed (if you are not sure which cluster techniques are available, please consult your administrators). If the required R packages are not present, here is the list of packages to install as well:
- Rmpi for MPI clusters (OpenMPI, LAM and other variants)
- rpvm for PVM clusters
- nws for NetWorkSpaces
Additionally, you probably want to install rlecuyer and/or rsprng for network-enabled random number generators.
Now you are ready for a first test. We use the Socket cluster type, as this should run anywhere:
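The test program is not reproduced here, so the following is only a sketch of what such a test might look like. It writes a small R script that runs the same `sfLapply` call first in sequential mode and then on two CPUs with the Socket cluster type. The helper name `write_snowfall_test`, the file name and the squared-number function are illustrative assumptions; the snowfall calls (`sfInit`, `sfLapply`, `sfStop`) are the ones named in this document.

```shell
# Write a small snowfall test script (R code) to the path given as $1.
write_snowfall_test() {
    cat > "$1" <<'EOF'
library(snowfall)

# First call: sequential mode, the program runs on one CPU only.
sfInit(parallel = FALSE)
print(unlist(sfLapply(1:4, function(x) x^2)))
sfStop()

# Second call: parallel mode on 2 CPUs using the Socket cluster type.
sfInit(parallel = TRUE, cpus = 2, type = "SOCK")
print(unlist(sfLapply(1:4, function(x) x^2)))
sfStop()
EOF
}

# Example use (requires R with snow and snowfall installed):
# write_snowfall_test first_test.R && Rscript first_test.R
```

Both runs should print the same values, since only the place of execution changes.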
On the first call, we are running in sequential mode, which means the program is just running on one CPU (like any "normal" R program before). This shows that you do not have to change your snowfall-using programs even if they are run on a machine without parallel computing capabilities. The call sfLapply is equivalent to the R function lapply.
We stop snowfall afterwards (sfStop()) and reinitialise it for running in parallel on 2 CPUs using the Socket cluster type (sfInit( parallel=TRUE, cpus=2, type="SOCK" )). Afterwards, the list function runs on two CPUs, where CPU 1 calculates list indices 1 and 3 and CPU 2 calculates list indices 2 and 4.
The complete list of functions and options is given in the snowfall help files (help( snowfall )) and in the more detailed help files, e.g. for initialisation (help( sfInit )), parallel calculations (help( sfLapply )) or tools (help( sfLibrary )).
- I just want to run parallel programs on my multicore laptop/workstation/PC. What do I need? You only need to install the R packages snow and snowfall. No further software is needed. sfCluster or any other workload/management solution is not needed.
- Using Socket or MPI clusters, I am asked for SSH passwords (on Unix). How do I get rid of these? Secure Shell (SSH) can be configured not to ask for passwords in most installations. This document describes how to get rid of the passwords on login.
- I am using snow. Is there a reason to change to snowfall? As snowfall is not a technical abstraction above snow and mostly adds convenience functions, you gain no new functionality. But thanks to better error control, some convenience functions and a cleaner API, snowfall may be easier for people without deeper knowledge of computer networks.
- Does snowfall run on Microsoft Windows? Yes. But there was an error in snowfall before 1.70 that prevented it from working correctly (shame on me!). If you want to run snowfall on Windows, be sure to use version 1.70 or later.
- Are the results from sequential and parallel runs absolutely identical? Yes. But beware if your parallel functions use random numbers. Use sfClusterSetupPRNG() for reproducible results when running on the same number of CPUs. If you really want identical results for sequential executions and for parallel runs on different numbers of CPUs, you have to use some tricks (more documentation to come).
- Are worker/slave processes spawned on every sfClusterApply/sfLapply etc. call? No. Worker R instances are only created on the sfInit() call and are reused on every parallel call. New workers are spawned only if you call sfStop() and afterwards reinitialise with sfInit().
- sfCluster just runs on LAM/MPI, why? Well, this is a "historical" decision. When we started, we only thought about an internal solution, but it grew so fast that we released the program. We decided in favour of LAM, as it features very handy resource management, which is possible with Open MPI only with additional programs. Support for Open MPI is coming in the near future.
- sfCluster requires root (administrator) rights to install. Why? This is true, but it is possible to install it without root rights. The reason is not sfCluster itself, but the installation of the required Perl libraries and the R additions (like R CMD par), as these need root rights.
How to install sfCluster without root rights:
The latter can be avoided with the --disable-rwrapper option on the configure call; the former is a bit harder to manage. Sadly, CPAN (the Perl counterpart of CRAN) cannot easily be configured to install packages in local directories. You can work around that by changing CPAN's installation path: Perlmonks article. Once that is done, sfCluster can be installed without root rights. But since the Perl problem remains, the installation needs root rights by default.
- R package snowfall (1.70, 2008-12-23): Unix package (tar.gz), Windows package (Zip)
  snowfall is also available through CRAN: snowfall on CRAN. From that page you also have direct access to the help files and the vignette of snowfall.
  Information about the latest changes.
- sfCluster: sfCluster.tar.gz (0.4beta, 2008-06-23)
  Small installation documentation: README.txt
- sfCluster installation packages for the Debian distribution, additional documentation and tutorials are coming soon...
sfCluster --help.
A tutorial will be available soon.
- The package peperr by Christine Porzelius is designed for prediction error estimation through resampling techniques and uses snowfall for automatically enhancing speed if a working cluster is accessible.
- Jochen Knaus and Christine Porzelius: Tutorial: Parallel computing using package snowfall (PDF)
- Materials, documents and solutions can be downloaded here (PDF + R Code)
- Jochen Knaus: The sfCluster/snowfall system: convenient parallel computing in R (PDF)
- Christine Porzelius: Practical experience with the sfCluster/snowfall system while developing the peperr package for prediction error estimation (PDF)
- Jochen Knaus: Creating R code and managing R programs for parallel execution with snowfall/sfCluster (PDF).
- Christine Porzelius: Developing a module for model evaluation based on the parallelization techniques of snowfall (PDF).
Login using Secure Shell (SSH) is common on *nix systems. Normally you need to type your password on login, which is quite inconvenient for cluster computing. This section describes how to get rid of the password prompt on SSH logins.
Please note: we only cover OpenSSH version 2 in this description. The very old version 1 is not described. You can find some brief information e.g. on this site.
SSH keeps its data in the directory ~/.ssh
("~" means: your home directory, most likely something like /home/yourlogin).
For login without a password, the SSH program needs something to identify you. This is done via a pair of key files: a private and a public key. First you have to create this pair of files with the ssh-keygen command.
The suggested location is most likely correct: the command should offer your SSH data directory (i.e. the directory ~/.ssh, see above).
If asked for a pass phrase, do not enter one! (If you did, you would have to type the pass phrase on each login, which would most likely not be a real improvement over typing your password.)
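The key files named in this section (id_rsa and id_rsa.pub) are what `ssh-keygen -t rsa` produces when its suggested location is accepted. The sketch below is a non-interactive variant under the assumption that OpenSSH's `ssh-keygen` is installed; the helper name `generate_key_pair` is made up for illustration. The `-N ""` option sets the empty pass phrase and `-f` names the private key file, so no questions are asked.

```shell
# Create an RSA key pair without a pass phrase.
# $1: SSH data directory (defaults to ~/.ssh)
generate_key_pair() {
    ssh_dir="${1:-$HOME/.ssh}"
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    # Never overwrite an existing key: other logins may depend on it.
    if [ -e "$ssh_dir/id_rsa" ]; then
        echo "$ssh_dir/id_rsa already exists, nothing done" >&2
        return 1
    fi
    # -t rsa: key type, -N "": empty pass phrase, -f: private key file
    ssh-keygen -t rsa -N "" -f "$ssh_dir/id_rsa"
}

# Example use:
# generate_key_pair
```

Running plain `ssh-keygen -t rsa` and pressing Enter at each prompt should give the same two files.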
The command will create two files (id_rsa and id_rsa.pub) in the SSH data directory. The first file contains your private key, the second file contains your public key. These files identify you even without a password. You should change the permissions for these two files using the following commands.
As your private key is the "opener" for the public key, it has to be accessible to you only:
The public key should now be appended to the file authorized_keys, which lists the keys that are allowed to connect without a password:
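The permission commands are not shown here. A common choice, given as a sketch rather than as the original commands, is mode 600 for the private key, 644 for the public key and 700 for the directory itself. The helper name `restrict_key_permissions` is an assumption.

```shell
# Restrict access to the key files.
# $1: SSH data directory (defaults to ~/.ssh)
restrict_key_permissions() {
    ssh_dir="${1:-$HOME/.ssh}"
    chmod 700 "$ssh_dir"             # directory: owner only
    chmod 600 "$ssh_dir/id_rsa"      # private key: owner read/write only
    chmod 644 "$ssh_dir/id_rsa.pub"  # public key: readable by everyone
}

# Example use:
# restrict_key_permissions
```

OpenSSH typically refuses to use a private key that other users can read, so the 600 mode is the important one.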
Please use two ">", as this will append the content of the file id_rsa.pub
to the file authorized_keys. One ">" would overwrite it.
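As a sketch of the append step described above (the helper name `authorize_public_key` is an assumption), the redirection with two ">" keeps any keys already listed in authorized_keys:

```shell
# Append the public key to authorized_keys without overwriting it.
# $1: SSH data directory (defaults to ~/.ssh)
authorize_public_key() {
    ssh_dir="${1:-$HOME/.ssh}"
    # ">>" appends; a single ">" would replace the existing file.
    cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"
}

# Example use:
# authorize_public_key
```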
Now, you can connect without password to your local machine:
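The simplest check is to type `ssh localhost` and see whether a password prompt appears. The sketch below automates that check, assuming an SSH server is running on the target machine; the helper name `check_passwordless_login` is made up for illustration. The `BatchMode=yes` option makes ssh fail instead of asking for a password.

```shell
# Check whether a login works without any password prompt.
# $1: host to test (defaults to localhost)
check_passwordless_login() {
    host="${1:-localhost}"
    # BatchMode=yes disables password prompts; "true" is a no-op remote command.
    if ssh -o BatchMode=yes "$host" true; then
        echo "passwordless login to $host works"
    else
        echo "passwordless login to $host failed" >&2
        return 1
    fi
}

# Example use:
# check_passwordless_login localhost
```

On the very first connection to a host, the check may fail until the host key has been accepted once in an interactive `ssh` session.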
The generated public key has to be appended to the file authorized_keys on every machine you want to
connect to without a password.
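One way to do this, sketched here under the assumption that you can still log in to the remote machine with your password, is to pipe the public key through ssh and append it on the other side. The helper name `push_public_key` and the login `user@host` are placeholders, not names from this document.

```shell
# Append the local public key to authorized_keys on a remote machine.
# $1: remote login such as user@host (placeholder)
# $2: public key file (defaults to ~/.ssh/id_rsa.pub)
push_public_key() {
    remote="$1"
    pubkey="${2:-$HOME/.ssh/id_rsa.pub}"
    # The remote shell creates ~/.ssh if needed and appends the key from stdin.
    ssh "$remote" 'mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys' < "$pubkey"
}

# Example use (asks for the password one last time):
# push_public_key user@host
```

Many OpenSSH installations also ship a tool called `ssh-copy-id` that performs the same step.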
- snowfall 1.70 (2008-12-23):
  Due to many bugfixes an update is strongly recommended!
  - Windows initialisation error fixed: snowfall now works correctly and creates any temporary directories it needs in the user's home (for example on Windows Server 2003: Documents and Settings/USER/My Documents/.sfCluster).
  - NWS startup works correctly now (thanks to M. Schmidtberger for the little patch).
  - sfSapply is now working as intended.
  - New default is no logging of slave/worker output. This can be changed via the new argument slaveOutfile at the sfInit call or the command line argument --tmpdir. If using sfCluster, everything is set correctly at all times.
  - The restore function of sfClusterApplySR can now be set globally using the new argument restore on the sfInit call. This is equivalent to the command line argument --restoreSR, which can now also be written simply as --restore.
  - Bugfix for sfSource() in sequential mode.
  - Changing the number of CPUs during runtime (with multiple sfInit() calls with different settings in a single program) is now possible using socket and NWS clusters.
  - Many messages reworked, with changed behaviour (many sfCluster-dependent warnings are no longer displayed if snowfall is used without it). Some typos are corrected.
  - sfCat() is now using the argument master, too.
  - Package vignette is slightly corrected and extended.
  - Several code improvements without user relevance.
  Sadly, the reworking of the internal snowfall configuration (moving from global to namespace) is not finished yet. This causes several warnings (in this case: no errors!) during package check.
- snowfall 1.60:
  - snowfall now supports all cluster types featured by snow: Socket, MPI, PVM and nws. The type can be changed during initialisation (on the sfInit() call).
    Please note: in parallel mode, the default is now the Socket cluster (previously MPI)!
  - Larger extension to the help and vignette.
  - snowfall's settings can be changed on the command line, even without sfCluster. This worked in the previous version as well, but is extended now:
      jo@biom9:~$ R --no-save --args --cpus=4 --parallel --type=SOCK
      > library(snowfall)
      Loading required package: snow
      > sfInit()
      [...]
      snowfall 1.60 initialized (parallel=TRUE, CPUs=4)
      > sfType()
      [1] "SOCK"
      > sfCpus()
      [1] 4
      > sfParallel()
      [1] TRUE
- snowfall 1.53 (2008-08-22):
  - Only one change: if a snowfall function is called without a prior call to sfInit(), sfInit() will be called without parameters. Useful with sfCluster, where sfInit() is always called without parameters:
      jo@biom9:~$ sfCluster -i --cpus=4
      Session-ID : 0T3NyYGN_R
      [...]
      ASSIGNED 4 cpus on 1 machines (4 requested).
      -- /usr/local/bin/sfCluster: START R-interactive session --
      Rechner: biom9
      > library(snowfall)
      > sum(unlist(sfLapply(1:10, exp)))
      4 slaves are spawned successfully. 0 failed.
      [...]
      R Version: R version 2.5.1 (2007-06-27)
      snowfall 1.53 initialized (parallel=TRUE, CPUs=4)
      Warning message:
      Calling to snowfall function without calling sfInit. Calling sfInit() now. in: sfCheck()
      [1] 34843.77
      > q()
jo[at]imbi.uni-freiburg.de
IMBI Freiburg, Germany