This article gives some guidelines on how to do MATLAB-based number crunching on Amazon’s Elastic Compute Cloud (EC2). However, this is not a detailed HOWTO. I just want to point you in the right direction.
While working on my diploma thesis I had to simulate a huge number of independent random experiments to acquire a database for a solid statistical analysis of the results of those experiments.
Instead of running n experiments in a single process, I was able to modify my simulation to run in two independent processes with n/2 experiments each. Due to the large number of experiments and the independence of each experiment, my problem was almost arbitrarily scalable, and therefore perfect for cloud computing.
Of course you need MATLAB, but you also need the MATLAB compiler, which is a separate product. I’m assuming that you have access to a cloud computing service that offers you a Linux-based machine. I used EC2 with a basic Debian installation. It can’t hurt to have a local (virtual) machine with the same OS as the target machine to do some local testing.
As far as I can tell, MATLAB’s GUI does not support cross-compiling, so you might need to run MATLAB on the same architecture (x86 or amd64) and roughly the same OS (Windows or Linux) as your target machine. I did all of this on a 64-bit Arch Linux.
We will use the MATLAB compiler to turn your MATLAB code into an executable binary for your target system (most likely a 64-bit Linux). Then we will transfer the binary together with a runtime environment to the cloud and run multiple instances of it to do the number crunching.
Preparing the MATLAB code
First, you have to create a main function file, for example main.m. The function implemented in this file can take arguments from the command line, which is handy because you might want to do the same calculation with different parameters in parallel, or at least want the processes to save their results in different files.
All command line arguments passed to the main function are passed as strings, so you might need to do some type conversion to get their correct representation. A simple main.m might look like this:
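What follows is only a minimal sketch that matches the description, not a verbatim listing. It assumes the result is stored in a variable called result (my choice of name, nothing depends on it) and it is wrapped in a shell heredoc so that the file can be created straight from a terminal.

```shell
# Write a minimal main.m into the current directory.
# The variable name "result" is an illustrative assumption.
cat > main.m <<'EOF'
function main(x, datapath)
% Command line arguments arrive as strings, so convert x to a number first.
if ischar(x)
    x = str2double(x);
end
result = sqrt(x);
% Store the variable "result" in the file named by datapath.
save(datapath, 'result');
EOF
```

The ischar check keeps the function usable from the MATLAB prompt as well, where x may already be numeric.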
This calculates the square root of the first parameter x and saves the result in a file specified by the second parameter datapath.
Generating the binary package
Now it’s time to compile the executable binary file. Open the compiler GUI by typing “deploytool” into the MATLAB command line. Create a new standalone application project and save it. I named my example project SquareRoot.prj. Then add main.m to the “Main function” subfolder. Any other files used by your script should be added to the “Other files” subfolder.
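If you prefer a scripted build over the GUI, the compiler also comes with a command line front end called mcc. The following is a rough sketch that I have not tied to a specific MATLAB release: -m requests a standalone application and -o sets the name of the binary. The function name build_standalone and the MCC override variable are made up for this example.

```shell
# Build a standalone executable from a main function file without the GUI.
# MCC is a hypothetical override for the compiler command (handy for dry runs).
build_standalone() {
    project=$1      # name of the resulting binary, e.g. SquareRoot
    main_file=$2    # entry point, e.g. main.m
    # -m: build a standalone application, -o: name of the output binary
    "${MCC:-mcc}" -m "$main_file" -o "$project"
}
```

A call would look like build_standalone SquareRoot main.m. As far as I know, additional files can be added with one -a option per file, and the output of a command line build lands in the current directory instead of a distrib subdirectory, so adjust the paths in the following steps accordingly.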
You can now try to build the binary file. If no errors occur, you’re fine. If you run into errors, don’t panic. I had problems during linking because my MATLAB distribution shipped its own 64-bit libraries but not its own gcc. Because I have a recent Linux distribution and a relatively old MATLAB build, my local gcc was newer than the corresponding libc shipped with MATLAB. However, thanks to my Linux distribution I was able to install a matching version of gcc. I then edited the CC variable in the glnxa64 section of ~/.matlab/<MATLAB version>/mbuildopts.sh to point to the matching gcc version. After that I had to create some symlinks in /opt/matlab/bin/glnxa64 pointing to the appropriate libraries of my Linux distribution to resolve the remaining linking errors.
After successfully compiling your MATLAB code, you will find two files in the <project name>/distrib subdirectory: an executable binary file named after your project and a launcher script named run_<project name>.sh. Create a tar archive containing these two files and the MATLAB Compiler Runtime (MCR) installer (I found mine at /opt/matlab/toolbox/compiler/deploy/glnxa64/MCRInstaller.bin). Then copy the archive to your test machine (or directly to your cloud instance if you’re tough).
On your test machine, run MCRInstaller.bin (with the “-console” flag if you don’t have an X display) to install the MCR, for example in ~/MCR. Because the installer might have more dependencies than your script, you should copy the installed ~/MCR directory, not the MCR installer, to your EC2 instance, so you don’t have to run the installer again.
Now you can try calling run_<project name>.sh. The first parameter is the path to your MCR. The remaining arguments are passed to your MATLAB code. For my example project I would run (replace XX with your MCR version):
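The packaging step can be sketched as a small shell function. The name make_package and its argument order are my own invention. The function only assumes that the binary and the launcher script sit in the same directory.

```shell
# Bundle the compiled binary, its launcher script and the MCR installer
# into <project>.tar.gz in the current directory.
# Usage: make_package <project> <distrib dir> <path to MCRInstaller.bin>
make_package() {
    project=$1
    # Resolve absolute paths, because each -C is relative to the previous one.
    distrib=$(cd "$2" && pwd) || return 1
    installer_dir=$(cd "$(dirname "$3")" && pwd) || return 1
    installer_name=$(basename "$3")
    tar -czf "$project.tar.gz" \
        -C "$distrib" "$project" "run_$project.sh" \
        -C "$installer_dir" "$installer_name" || return 1
    # Print the archive name so callers can pass it on, e.g. to scp.
    echo "$project.tar.gz"
}
```

For the example project this would be called as make_package SquareRoot SquareRoot/distrib /opt/matlab/toolbox/compiler/deploy/glnxa64/MCRInstaller.bin. The resulting archive unpacks into three files without any leading directories.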
./run_SquareRoot.sh ~/MCR/vXX 5 data1.mat
On your first try, this will probably cause some errors due to missing libraries in your Linux install. Try to install those. The following worked for me:
apt-get install xserver-xorg libxp6
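To find out which libraries are actually missing, ldd can help. This is just a small helper sketch; the function name missing_libs is made up.

```shell
# Print the shared libraries that the dynamic linker cannot resolve for a binary.
missing_libs() {
    # ldd lists every dependency; unresolved ones are marked "not found".
    # "|| true" keeps the exit status clean when nothing is missing.
    ldd "$1" 2>/dev/null | grep 'not found' || true
}
```

Running missing_libs ./SquareRoot prints one line per unresolved library and nothing if everything is found. Keep in mind that libraries belonging to the MCR itself will probably also be reported as missing unless LD_LIBRARY_PATH contains the MCR directories, which, as far as I can tell, is what the launcher script takes care of.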
That’s it. Move your binary file, the launcher script and your MCR directory to your cloud computing instance and let the number crunching begin. 😉
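Running several instances in parallel then boils down to starting the launcher script in the background once per parameter. A minimal sketch, assuming one input value per process and output files named data1.mat, data2.mat and so on (the function name run_batch and the file naming scheme are my own choices):

```shell
# Launch one background process per parameter, each writing to its own file.
# Usage: run_batch <launcher script> <MCR path> <x1> [<x2> ...]
run_batch() {
    launcher=$1   # e.g. ./run_SquareRoot.sh
    mcr=$2        # e.g. ~/MCR/vXX
    shift 2
    i=0
    for x in "$@"; do
        i=$((i + 1))
        # Each job gets its own result file and its own log file.
        "$launcher" "$mcr" "$x" "data$i.mat" > "job$i.log" 2>&1 &
    done
    wait   # block until every background job has finished
}
```

For example, run_batch ./run_SquareRoot.sh ~/MCR/vXX 4 9 16 starts three processes at once. It makes sense not to start more processes than your instance has CPU cores.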
You might really want to check out GNU Octave, which is an open-source MATLAB alternative. Its syntax is MATLAB compatible, so your scripts might run on Octave without any changes. If you do not use special MATLAB toolboxes or objects, chances are that you can use Octave instead of MATLAB and therefore avoid messing around with the MATLAB compiler, its runtime and those library issues. Just install Octave on your cloud instance, for example with
apt-get install octave
and directly run your MATLAB scripts.
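To call the same main function from a shell under Octave, something like the following sketch should work. It assumes that main.m lies in the current directory. The function name run_with_octave and the OCTAVE override variable are made up for this example.

```shell
# Run main.m under Octave with the same arguments the compiled binary takes.
# OCTAVE is a hypothetical override for the interpreter command.
run_with_octave() {
    x=$1
    datapath=$2
    # --eval executes the given code and exits. Both arguments are passed as
    # strings, mirroring what the compiled binary receives.
    "${OCTAVE:-octave}" --silent --eval "main('$x', '$datapath')"
}
```

For the example project, run_with_octave 5 data1.mat takes the place of the run_SquareRoot.sh call. Arguments containing single quotes would need extra escaping.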
In my case, I managed to get my script working with Octave with only minor changes to avoid the use of objects, but Octave turned out to need about 50 times more computation time than MATLAB. That pretty much cancels out the advantage of cloud computing for me. That’s why I still use MATLAB.