Licensed Software/MATLAB

MATLAB is a commercial numerical computing environment and programming language developed by MathWorks. It is installed on Anunna under the WUR academic licence, so it is available to WUR users only. It is one of the licensed software packages on the cluster.

You can run MATLAB on the cluster either by loading the module and working in a cluster session, or by configuring a MATLAB client on your own desktop to submit jobs to Anunna. The sections below cover the parallel-computing setup, which lets MATLAB dispatch work to the cluster's compute nodes.

Configuring MATLAB on the cluster

After logging in, load the module and run configCluster.sh once per MATLAB version. This configures MATLAB so that jobs default to the cluster rather than the login node.

module load matlab
configCluster.sh

Configuring a MATLAB client on your desktop

To submit from MATLAB on your own machine, first install the WUR support package: download the archive for your platform and unpack it in the directory that MATLAB reports for userpath. Then configure the cluster profile (once per MATLAB version):

>> userpath
>> configCluster

Submitting to the cluster requires SSH credentials: MATLAB prompts for your SSH username and either a password or an identity file, and remembers them for later sessions. Jobs then default to the cluster instead of your local machine. To run on your local machine instead, request the local profile (named 'Processes' rather than 'local' from R2022b onward):

>> c = parcluster('local');

Configuring jobs

Before submitting, set job parameters through AdditionalProperties. Only MemUsage and WallTime are required; the rest are optional.

>> c = parcluster;

% Required
>> c.AdditionalProperties.MemUsage = '6gb';        % memory per core (default 4gb)
>> c.AdditionalProperties.WallTime = '05:00:00';   % walltime, e.g. 5 hours

% Optional
>> c.AdditionalProperties.AccountName = 'account-name';
>> c.AdditionalProperties.Comment = 'a-comment';
>> c.AdditionalProperties.Constraint = 'V100';     % request a specific GPU flavour
>> c.AdditionalProperties.EmailAddress = 'user-id@wur.nl';
>> c.AdditionalProperties.GpusPerNode = 1;
>> c.AdditionalProperties.QoS = 'the-qos';         % default: std
>> c.AdditionalProperties.QueueName = 'queue-name';
>> c.AdditionalProperties.RequireExclusiveNode = true;
>> c.AdditionalProperties.Reservation = 'a-reservation';
>> c.AdditionalProperties.Tmp = '20g';             % local /tmp space

Save the profile so the settings persist between MATLAB sessions, and display the current values with:

>> c.saveProfile
>> c.AdditionalProperties
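
The WallTime value shown above is an 'HH:MM:SS' string. If you estimate run times in hours, the string can be built rather than typed by hand. This is only a sketch: the variable names are illustrative, and the 'HH:MM:SS' format is assumed from the example value '05:00:00' rather than from a documented specification.

```matlab
hours = 5.5;                          % example estimate: five and a half hours
s  = round(hours * 3600);             % total seconds, rounded to a whole number

% Split the total into hours, minutes and seconds and zero-pad each field.
wt = sprintf('%02d:%02d:%02d', floor(s/3600), floor(mod(s, 3600)/60), mod(s, 60));

c = parcluster;
c.AdditionalProperties.WallTime = wt; % here wt is '05:30:00'
c.saveProfile
```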

Unset a value by assigning an empty string and saving again:

>> c.AdditionalProperties.EmailAddress = '';
>> c.saveProfile

Interactive pool jobs

Use parpool as usual; the pool now runs across cluster nodes rather than locally. Depending on cluster load, it may take a while for the requested resources to become available.

>> c = parcluster;
>> pool = c.parpool(64);          % open a pool of 64 workers

>> a = zeros(1, 1000);            % preallocate the output
>> parfor idx = 1:1000
       a(idx) = sqrt(idx);        % replace with your own computation
   end

>> pool.delete                    % close the pool when done
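
A pool that is left open keeps its cluster cores allocated, so it can help to make the clean-up automatic. The following is a minimal sketch using MATLAB's onCleanup; the pool size of 8 and the squared-index computation are placeholders chosen for illustration.

```matlab
c = parcluster;
pool = c.parpool(8);                         % illustrative pool size
cleanup = onCleanup(@() delete(pool));       % runs when 'cleanup' is cleared or goes out of scope

n = 1000;
a = zeros(1, n);                             % preallocate the sliced output
parfor idx = 1:n
    a(idx) = idx^2;                          % placeholder computation
end

clear cleanup                                % triggers delete(pool)
```

Inside a function, the clean-up object is destroyed when the function exits, including on error, so the explicit clear is only needed at the command line or in a script.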

Independent batch jobs

The batch command submits asynchronous jobs and returns a job object used to retrieve the output.

>> c = parcluster;
>> job = c.batch(@pwd, 1, {}, 'CurrentFolder', '.', 'AutoAddClientPath', false);

>> job.State                      % query state
>> job.fetchOutputs{:}            % fetch results once finished
>> job.delete                     % delete when no longer needed
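
fetchOutputs only succeeds once the job has finished, so a script that submits and then collects results usually waits in between. A minimal sketch, assuming a 600-second timeout (an arbitrary choice for illustration):

```matlab
c = parcluster;
job = c.batch(@pwd, 1, {}, 'CurrentFolder', '.', 'AutoAddClientPath', false);

% Block until the job reaches the 'finished' state, or give up after 600 s.
finished = wait(job, 'finished', 600);

if finished && strcmp(job.State, 'finished')
    out = fetchOutputs(job);                 % cell array, one cell per requested output
    disp(out{1})
else
    fprintf('Job %d is in state "%s".\n', job.ID, job.State);
end
```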

To list current and past jobs, and fetch results from an earlier one:

>> c = parcluster;
>> jobs = c.Jobs;                 % array of all jobs
>> job2 = c.Jobs(2);              % a job by index
>> job2.fetchOutputs{:}

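fetchOutputs retrieves the return values of a function job; for a script job, use load instead to copy the script's workspace variables into your client session. Data that a job writes to files on the cluster is not returned either way and must be retrieved from the filesystem directly.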
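
For a script job, the pattern might look like the sketch below. The script name my_script and the variable name A are hypothetical; substitute a script of your own and a variable it creates.

```matlab
c = parcluster;

% 'my_script' is a hypothetical script name, not something shipped with the cluster setup.
job = c.batch('my_script', 'CurrentFolder', '.', 'AutoAddClientPath', false);
wait(job);                                   % block until the script has run

load(job, 'A');                              % copy only variable A into the client workspace
% load(job)                                  % or copy every variable the script created
```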

Parallel batch jobs

To run a parallel workflow, pass a Pool size to batch. Take this example function, saved as parallel_example.m:

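function [t, A] = parallel_example(iter)
% Fill A(1:iter) in a parfor loop and return the elapsed time t in seconds.

if nargin == 0
    iter = 8;                 % default number of iterations
end

disp('Start sim')
t0 = tic;                     % start the timer
parfor idx = 1:iter
    A(idx) = idx;             % sliced output: each iteration writes one element
    pause(2)                  % stand-in for two seconds of real work
end
t = toc(t0);                  % elapsed time of the loop
disp('Sim completed')
save RESULTS A                % writes RESULTS.mat in the job's current folder

end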

Submit it with a pool of workers:

>> c = parcluster;
>> job = c.batch(@parallel_example, 1, {16}, 'Pool', 4, 'CurrentFolder', '.', 'AutoAddClientPath', false);
>> job.State
>> job.fetchOutputs{:}

A pool job always requests N+1 CPU cores: one worker manages the pool, so a job needing eight workers consumes nine cores. Allocating too many workers can give diminishing returns once the coordination overhead outweighs the computation, so tune the worker count for your workload.
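
Applied to the submission above (Pool of 4, 16 iterations), the arithmetic works out as in this sketch. The loop-time estimate ignores scheduling and start-up overhead and assumes iterations are spread evenly over the workers, so treat it as a rough lower bound.

```matlab
poolSize  = 4;                               % value passed as 'Pool' in the example above
iter      = 16;                              % argument passed to parallel_example
coresUsed = poolSize + 1;                    % N workers plus one to manage the pool

% Each iteration pauses for 2 s, so the loop needs about ceil(iter/poolSize) rounds of 2 s.
estLoopTime = ceil(iter / poolSize) * 2;

fprintf('Cores requested: %d, estimated loop time: %d s\n', coresUsed, estLoopTime);
% prints: Cores requested: 5, estimated loop time: 8 s
```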

To retrieve a job later, keep its ID and find it again:

>> id = job.ID
>> clear job

>> c = parcluster;
>> job = c.findJob('ID', id);
>> job.State
>> job.fetchOutputs{:}
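
findJob also accepts other job properties, which is useful when the ID was not recorded. A minimal sketch that lists finished jobs; the delete call is left commented out because it permanently removes the jobs and their stored results.

```matlab
c = parcluster;

% All jobs in this profile's job storage that have finished.
finishedJobs = c.findJob('State', 'finished');

for k = 1:numel(finishedJobs)
    fprintf('Job %d (%s)\n', finishedJobs(k).ID, finishedJobs(k).Name);
end

% Once results have been fetched, old jobs can be removed to free storage.
% delete(finishedJobs)
```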

Debugging

If a job fails, view its log with getDebugLog. For an independent multi-task job, pass the specific task; for a pool job, pass the job object:

>> c.getDebugLog(job.Tasks(3))   % independent job, specific task
>> c.getDebugLog(job)            % pool job
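
Before reading the full debug log, it can be quicker to check which tasks reported an error. The sketch below assumes job is a job object obtained as in the earlier examples (for instance via findJob); it uses the standard ErrorMessage task property and the diary function for batch jobs.

```matlab
% Assumes 'job' already exists in the workspace, e.g. job = c.findJob('ID', id);
for k = 1:numel(job.Tasks)
    t = job.Tasks(k);
    if ~isempty(t.ErrorMessage)
        fprintf('Task %d failed: %s\n', t.ID, t.ErrorMessage);
    end
end

diary(job)                                   % show command-window output captured for a batch job
```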

If an administrator asks for the scheduler ID of a job:

>> schedID(job)
