The Laboratory's MPI Programming Environment

Last Update : 22nd June 2001

All MPI users MUST read this page regularly.

Breaking MPI News

... will appear here. Watch this space!

Please report any problems to suporte@ic.uff.br.

MPI Page Index

The LAM MPI Programming Environment installed on the Solaris-based Sun Ultra machines in Room 1 of the Laboratory.
Help with writing MPI programs.
Running MPI programs in the Solaris Lab.

The MPI Programming Environment on Solaris Machines

The currently supported version of MPI 2 is the University of Notre Dame's LAM-MPI version 6.3.2 (soon to be 6.5.2).

LAM (Local Area Multicomputer) is an MPI programming environment and development system for heterogeneous computers on a network. With LAM, a dedicated cluster or an existing network computing infrastructure can act as one parallel computer solving one problem.

LAM features a full implementation of the MPI communication standard (with the exception that canceling sent messages is not supported) as well as extensive debugging support in the application development cycle and peak performance for production applications.

LAM is a daemon-based implementation of MPI. This means that a daemon process is launched on each machine that will be in the parallel environment. Once the daemons have been launched, LAM is ready to be used. A typical usage scenario is as follows:

Boot LAM on all the nodes;
Run MPI programs;
Shut down LAM.
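
The three steps above can be sketched as a small shell function. This is only an illustration: the name lam_session, the process count and the program path are hypothetical, while lamboot -v, mpirun -np and wipe -v are the invocations described further down this page.

```shell
#!/bin/sh
# Sketch of one LAM session: boot, run, shut down.
# lam_session is a hypothetical helper name, not part of LAM.
lam_session() {
    np=$1          # number of MPI processes to start
    prog=$2        # path to an MPI executable built with mpicc

    lamboot -v || return 1        # 1. boot LAM on all the nodes
    mpirun -np "$np" "$prog"      # 2. run the MPI program
    status=$?
    wipe -v                       # 3. shut down LAM, even if the run failed
    return "$status"
}
```

A call such as `lam_session 4 ./hello` would then run a hypothetical program hello on four processes and always finish with wipe.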

LAM does not need to be booted in order to compile MPI programs.

LAM is a user-based MPI environment; each user who wishes to use LAM must boot their own LAM environment. LAM is not a client-server environment where a single LAM daemon can service all LAM users on a given machine.

MPI Tutorials and other Aids

The University of Notre Dame offers a "Getting Started with LAM" tutorial that, although somewhat biased towards the Notre Dame computing environment, is a good starting point for getting familiar with LAM.

Using MPI Version lam6.3.2 in the Laboratory

This section describes the steps for compiling MPI applications, preparing the environment, and running them on the Solaris network of the IC graduate laboratory (Lab. de Pós do IC).

IMPORTANT: Read ALL the steps before you start testing, because misuse of the environment can cause problems on the network.

Step 1 - Configuring the UNIX Online Manual

(This step is not needed if you only use the web-based UNIX manual. Otherwise it needs to be done only once, before you first use MPI.)

For the online manual to work, add the path /export/Solaris_2.5.1/apps/mpi2-lam6.3.2/lam-6.3.2/man to the MANPATH environment variable in your shell's start-up file (.cshrc for csh, .bashrc for bash).

To do this, open the text editor, choose "Open" from the "File" menu, select "Show" for hidden files, and click on the ".cshrc" file. Your MANPATH entry should look something like the following:

setenv MANPATH ":/usr/man:/export/local/man:/usr/openwin/share/man:/usr/openwin/man:/usr/share/man:/export/Solaris_2.5.1/apps/mpi2-lam6.3.2/lam-6.3.2/man"
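
For bash users, an equivalent entry might look like the sketch below. It assumes the same LAM man directory as the csh entry above and that the lines are placed in your bash start-up file; LAM_MAN is a hypothetical variable name used only for readability.

```shell
# Sketch for a bash start-up file; LAM_MAN is a hypothetical helper variable.
LAM_MAN=/export/Solaris_2.5.1/apps/mpi2-lam6.3.2/lam-6.3.2/man

# Append the LAM man directory unless it is already listed.
case ":${MANPATH}:" in
    *":${LAM_MAN}:"*) ;;                               # already present
    *) MANPATH="${MANPATH:+${MANPATH}:}${LAM_MAN}" ;;  # append, keeping existing entries
esac
export MANPATH
```

If MANPATH was previously unset, the system directories (for example /usr/man) also have to be listed, as in the csh entry above.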

Step 2 - Compilation

To compile an MPI program, use the mpicc command.

Most common syntax:

prompt> mpicc -o

For more information, see the online manual (man mpicc).
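
As an illustration of the compile step, the sketch below writes a minimal MPI program and compiles it. The file names hello.c and hello are hypothetical, and the check for mpicc is only there so that the snippet does nothing harmful on a machine without LAM.

```shell
#!/bin/sh
# Write a minimal MPI program (hello.c is a hypothetical file name).
cat > hello.c <<'EOF'
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start MPI */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* rank of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    printf("Hello from process %d of %d\n", rank, size);
    MPI_Finalize();                         /* shut MPI down */
    return 0;
}
EOF

# Compile only where mpicc is installed; LAM need not be booted for this.
if command -v mpicc >/dev/null 2>&1; then
    mpicc -o hello hello.c
else
    echo "mpicc not found; skipping compilation" >&2
fi
```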

Step 3 - Starting the Environment

To run programs that use the MPI 2 libraries, you must first run the lamboot command, which starts your MPI environment on the network.

Note: Editing and compiling do not require the environment to be started.

Most common syntax:

prompt> lamboot -v

This command starts MPI on the machines listed in the default boot schema and reports each important step as it is carried out. For more information, see the online manual (man lamboot).

If lamboot reports an error, use the recon command, which checks whether the network is able to run MPI according to the default boot schema.

Most common syntax:

prompt> recon -v -a

This checks whether MPI can be started on all the UNIX machines listed in the default boot schema and reports each important step as it is carried out.
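
The lamboot-then-recon sequence can be sketched as follows. The helper name boot_lam is hypothetical, and the sketch assumes that lamboot exits with a non-zero status when booting fails.

```shell
#!/bin/sh
# boot_lam is a hypothetical helper name, not part of LAM.
# Assumption: lamboot exits with a non-zero status when booting fails.
boot_lam() {
    if lamboot -v; then
        return 0                  # LAM is up; nothing else to do
    fi
    echo "lamboot failed; running recon to check the boot schema" >&2
    recon -v -a                   # report which machines cannot run LAM
    return 1
}
```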

The recon(1) tool checks if LAM can be started on the given boot schema. There are several prerequisites that enable LAM to be started on a remote machine:

The machine must be reachable and operational.
The user must have an account on the machine.
The user must be able to rsh(1) to the machine (permissions must be set in either the /etc/hosts.equiv file or the user's .rhosts file on the machine).
The LAM executables must be locatable on that machine, using the shell's search path and possibly the LAMHOME environment variable.
The shell's start-up script must not print anything on standard error. The user can take advantage of the fact that rsh(1) will start the shell non-interactively. The start-up script can exit early in this case, before executing many commands relevant only to interactive sessions and likely to generate output.
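
One way to check the last prerequisite by hand is sketched below. It is not what recon itself does; the helper name check_quiet_startup and the RSH variable are hypothetical, and the sketch assumes that running the command true on the remote machine through rsh is enough to trigger the start-up script.

```shell
#!/bin/sh
# check_quiet_startup and RSH are hypothetical names, not part of LAM.
RSH=${RSH:-rsh}

# Returns 0 if the remote shell start-up writes nothing to standard error.
check_quiet_startup() {
    host=$1
    # Capture only standard error: 2>&1 goes to the capture, stdout is discarded.
    noise=$($RSH "$host" true 2>&1 >/dev/null)
    if [ -n "$noise" ]; then
        echo "$host: start-up script wrote to standard error:"
        echo "$noise"
        return 1
    fi
    return 0
}
```

For example, `check_quiet_startup cambuci.ic.uff.br` would print any offending output from that machine and return a non-zero status.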

All of these prerequisites must be met before LAM will function properly. If recon does not complete successfully, the "-d" option will give verbose descriptions of what it tried to do, and suggestions to fix the problem.

Also keep in mind that even if recon succeeds, lamboot itself may still fail. This usually happens when the hboot program (which lamboot invokes on remote nodes) fails for some reason. Again, the "-d" option to lamboot will enable extremely verbose output and suggest solutions to common problems.

For more information, see the online manual (man recon).

Step 4A - Running the Application

To run an MPI program, use the mpirun command. Most common syntax:

prompt> mpirun -np

For more information, see the online manual (man mpirun).

Step 4B - Interrupted Runs

While an application is running, the user may interrupt some of its processes while other processes of the same application keep running. As these leftover processes accumulate, they overload the machines and can hang servers and bring down the network. Therefore, whenever you notice an abnormal termination, or cause one by pressing "CTRL+c", use the lamclean command, which kills your processes running on the network.
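
A wrapper along the following lines could make this clean-up automatic. It is a sketch only: run_mpi is a hypothetical name, and it assumes that mpirun returns a non-zero status after an abnormal or interrupted run.

```shell
#!/bin/sh
# run_mpi is a hypothetical wrapper name, not part of LAM.
# Assumption: mpirun returns non-zero after an abnormal or interrupted run.
run_mpi() {
    np=$1
    shift
    trap ':' INT                  # keep this script alive on CTRL+c so it can clean up
    mpirun -np "$np" "$@"
    status=$?
    trap - INT                    # restore the default CTRL+c behaviour
    if [ "$status" -ne 0 ]; then
        echo "mpirun exited with status $status; running lamclean" >&2
        lamclean                  # kill leftover processes on the network
    fi
    return "$status"
}
```

It would be called as, for example, `run_mpi 4 ./hello`, where hello is a hypothetical executable.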

Step 5 - Shutting Down the Environment

After using the MPI environment you must shut it down; to do so, use the wipe command. This command automatically runs tkill on every node in the boot schema, killing all running MPI processes that belong to the user.

Most common syntax:

prompt> wipe -v

Final Comments

Users should read the MPI manual page (on the Solaris Machines) to get started using the LAM commands, tools and libraries. The following Solaris machines are available to run your LAM-MPI programs:

beta.ic.uff.br;
camboim.ic.uff.br;
cambuci.ic.uff.br;
jaboticaba.ic.uff.br;
murici.ic.uff.br;
omega.ic.uff.br; and
pitomba.ic.uff.br
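
Should a personal boot schema ever be needed instead of the default one, the machines above could be written to a file as sketched here. The helper write_bhost and the file name lamhosts are hypothetical, and whether personal boot schemas are appropriate in the Laboratory is an assumption to confirm with suporte@ic.uff.br first.

```shell
#!/bin/sh
# write_bhost is a hypothetical helper; it writes one host name per line,
# which is the layout a LAM boot schema file uses.
write_bhost() {
    cat > "$1" <<'EOF'
# Solaris machines available for LAM-MPI programs
beta.ic.uff.br
camboim.ic.uff.br
cambuci.ic.uff.br
jaboticaba.ic.uff.br
murici.ic.uff.br
omega.ic.uff.br
pitomba.ic.uff.br
EOF
}
```

After `write_bhost lamhosts`, the file can be passed to the boot command, as in `lamboot -v lamhosts`.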

To access this environment remotely, first ssh to ssh.ic.uff.br and then telnet to cambuci.ic.uff.br. Remember: do NOT run MPI programs on the Linux servers abacate and manga.

The source for LAM-MPI can be found in

/export/Solaris_2.5.1/apps/mpi2-lam6.3.2/lam-6.3.2.

Compiling MPI Programs

Currently, the version of the gcc compiler available on all Solaris machines compiles only C programs. A newer version (GCC version 2.95.3) will soon be made available, at which point the latest version of LAM-MPI will be installed to run C, C++ and Fortran MPI programs.

Last modified on 22nd June 2001, Vinod Rebello. suporte@ic.uff.br