SC-Camp 2014 - Challenges

Introduction

We first need access to the Guane cluster before we can start using a distributed infrastructure. OAR is the batch scheduler that grants users exclusive access to the cluster's computing nodes.

Some useful links:

Create an account to access Guane.

Go to the website https://grid.uis.edu.co/usuarios. To create an account, use the link Solicitud de Cuenta. Fill in the form with your data; in the field Proyecto, enter adiestramiento SC3. Please avoid special characters such as accents, because they may crash the system.

Using Guane: Step-by-step

Log into the cluster facility.

$ ssh <youruser>@toctoc.grid.uis.edu.co
youruser@toctoc:~$

Once logged in, access the Guane front end at 192.168.66.70.

$ ssh 192.168.66.70
youruser@guane:~$

List the available nodes.

$ oarnodes -l
guane09
guane10
guane11
guane12
guane13
guane14
guane15
guane16

You can ask OAR to run a specific command for you on the nodes it reserves. Below we ask OAR to reserve two nodes and run hostname (a command that prints the hostname of the current machine). The command runs on all reserved nodes. Note that the nodes assigned to your reservation may differ from the example below.

$ oarsub -l nodes=2 hostname
[ADMISSION RULE] Set default walltime to 7200.
[ADMISSION RULE] Modify resource description with type constraints
OAR_JOB_ID=552

The output of your job is written to the files OAR.xxx.stderr and OAR.xxx.stdout, where xxx is your job ID. In the example above the job ID is 552, so OAR created the files OAR.552.stdout and OAR.552.stderr. To check the standard output of our program on all hosts, we can print the file contents.

$ cat OAR.552.stdout 
guane15
guane16
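
When scripting this step, the job ID can be picked out of the oarsub output instead of being read by eye. The following is only a minimal sketch: it assumes oarsub prints a line of the form OAR_JOB_ID=<number>, as in the example above, and the function name job_id_from_output is hypothetical.

```shell
# Hypothetical helper: read text on standard input and print the number
# from the first line that looks like "OAR_JOB_ID=<number>".
job_id_from_output() {
    sed -n 's/^OAR_JOB_ID=\([0-9][0-9]*\)$/\1/p' | head -n 1
}

# Sample text copied from the example above. In practice you would pipe
# the real oarsub output into the function instead.
sample='[ADMISSION RULE] Set default walltime to 7200.
[ADMISSION RULE] Modify resource description with type constraints
OAR_JOB_ID=552'

job_id=$(printf '%s\n' "$sample" | job_id_from_output)

# Build the output file names described above from the job ID.
echo "stdout file: OAR.${job_id}.stdout"
echo "stderr file: OAR.${job_id}.stderr"
```

With the file names in hand, a script can wait for the job to finish and then read the results without manual copying of the ID.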

First challenge.

Crack the password of a compressed file by brute force

We have hidden the answers to all the exercises of the MPI hands-on in a compressed tgz file. As a good lazy student, you will want to use them over the next few days so you can spend more time enjoying nature and the camp facilities.

We are going to help you get them. But it is unlikely that you will succeed without using the cluster and grid programming skills you are learning here ;)

The brute-force method consists of simply trying every possible password. Of course, without any hint about the password it would take years to crack, even with the help of a small cluster. However, since we like you all, we will tell you that the password is a number (digits only) in the range [1, 10 000 000].

You can find the files for this challenge here http://www.sc-camp.org/challange/students_pack.tgz.

Tips

First copy the files to Guane.

$ scp students_pack.tgz <youruser>@toctoc.grid.uis.edu.co:.
students_pack.tgz                                     100%   22KB 21.7KB/s   00:00

To uncompress the student files:

$ tar zxvf students_pack.tgz

The script crack-secret-opessl.sh tries to crack the password within a specific range. For instance, to try every password from 111 to 222, inclusive, use:

$ ./crack-secret-opessl.sh 111 222

Your main problem is to give a different task to each processor. When each process tries a different range, the solution can be found in less time.
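
One possible way to hand out different ranges is sketched below. It is only an illustration, not the expected solution: the helper name chunk_range and the choice of 8 workers are assumptions, and the sketch only prints the command each worker would run. How to launch those commands on the nodes is left to you.

```shell
# Hypothetical helper: print the inclusive sub-range "start end" that
# worker number $1 (0-based) out of $2 workers should try, splitting the
# interval [$3, $4] as evenly as possible.
# Assumes there are at least as many passwords as workers.
chunk_range() {
    worker=$1; workers=$2; first=$3; last=$4
    total=$(( last - first + 1 ))
    base=$(( total / workers ))      # passwords every worker gets
    extra=$(( total % workers ))     # the first $extra workers get one more
    if [ "$worker" -lt "$extra" ]; then
        start=$(( first + worker * (base + 1) ))
        end=$(( start + base ))
    else
        start=$(( first + extra * (base + 1) + (worker - extra) * base ))
        end=$(( start + base - 1 ))
    fi
    echo "$start $end"
}

# Example: 8 workers (an assumed number) over the range given above.
workers=8
w=0
while [ "$w" -lt "$workers" ]; do
    # Only print the command; nothing is executed here.
    echo "worker $w: ./crack-secret-opessl.sh $(chunk_range "$w" "$workers" 1 10000000)"
    w=$(( w + 1 ))
done
```

Because the script treats both ends of its range as inclusive, consecutive chunks must not share an endpoint; the helper makes each chunk start one past the previous chunk's end.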

Good luck!

Second challenge

Parallel Rendering

Now that you are familiar with parallel and distributed computing, we have a more advanced challenge for you. We created an exercise in the style of programming contests, where understanding the problem is part of the problem. The problem statement is at http://www.sc-camp.org/challange/anti-aliasing.pdf

Tips

To generate random input we have made a special program that can be found here http://www.sc-camp.org/challange/generate.c.

The rendering is just a small example of an application that can profit from several parallel and distributed programming practices. The challenge is to split the original matrix into regions so that each computer works on a separate region. This way we avoid sending the entire matrix to, and receiving it from, every process.

OpenMP and CUDA are two examples of programming practices that can be used to achieve in-node parallelism. MPI can be used to distribute the computation among nodes.

The border of each region creates a problem, because pixels on the edge of a region need information from neighboring regions. To avoid a complex communication scheme, send redundant information to the nodes.
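
The idea of sending redundant border information can be sketched as follows. This is a minimal illustration under stated assumptions: the matrix is split by rows, the number of extra rows needed on each side (called halo here) depends on the filter defined in the problem statement and is just a parameter, and the helper name region_with_halo is hypothetical.

```shell
# Hypothetical helper: for worker $1 (0-based) of $2 workers, over a matrix
# with $3 rows, print "own_first own_last send_first send_last" (0-based,
# inclusive). Rows own_* are the ones the worker computes; rows send_* also
# include up to $4 redundant rows on each side, clamped to the matrix.
region_with_halo() {
    worker=$1; workers=$2; rows=$3; halo=$4
    own_first=$(( worker * rows / workers ))
    own_last=$(( (worker + 1) * rows / workers - 1 ))
    send_first=$(( own_first - halo ))
    send_last=$(( own_last + halo ))
    if [ "$send_first" -lt 0 ]; then
        send_first=0
    fi
    if [ "$send_last" -gt $(( rows - 1 )) ]; then
        send_last=$(( rows - 1 ))
    fi
    echo "$own_first $own_last $send_first $send_last"
}

# Example: 4 workers, a 10-row matrix, 1 redundant row per side
# (all three numbers are assumptions chosen for illustration).
w=0
while [ "$w" -lt 4 ]; do
    echo "worker $w: $(region_with_halo "$w" 4 10 1)"
    w=$(( w + 1 ))
done
```

Each worker receives slightly more rows than it computes, so it never has to ask a neighbor for border pixels while rendering; the cost is a small amount of duplicated data.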