Licensed Software/MATLAB: Difference between revisions

Created page with "MATLAB is a non-free calculation language owned by Mathworks. The HPC has this installed as part of the WUR academic license, and so this is only available to WUR users. == U..."
 
Phase 1 § 5 P1.5.8: rewrite — sentence-case headings, fix broken [url|text] links, syntaxhighlight matlab, fix MATLAB ... continuation; link into Licensed Software hub (via update-page on MediaWiki MCP Server)
 
(7 intermediate revisions by 2 users not shown)
MATLAB is a commercial numerical computing language owned by MathWorks. It is installed on Anunna under the WUR academic licence, so it is available to WUR users only. It is one of the [[Licensed Software|licensed packages]] on the cluster.


You can run MATLAB on the cluster either by loading the module and working in a cluster session, or by configuring a MATLAB client on your own desktop to submit jobs to Anunna. The sections below cover the parallel-computing setup, which lets MATLAB dispatch work to the cluster's compute nodes.


== Configuring MATLAB on the cluster ==


After logging in, load the module and run <code>configCluster.sh</code> once per MATLAB version. This configures MATLAB so that jobs default to the cluster rather than the login node.
 
<syntaxhighlight lang="bash">
module load matlab
configCluster.sh
</syntaxhighlight>
 
== Configuring a MATLAB client on your desktop ==
 
To submit from MATLAB on your own machine, first install the WUR support package. Download the archive for your platform:
 
* [https://git.wur.nl/WUR-MATLAB-tools/support_packages/-/raw/main/wur.nonshared.R2022a.zip?inline=false Windows (R2022a, .zip)]
* [https://git.wur.nl/WUR-MATLAB-tools/support_packages/-/raw/main/wur.nonshared.R2022a.tar.gz?inline=false Linux / macOS (R2022a, .tar.gz)]
 
Unpack it in the directory MATLAB reports for <code>userpath</code>, then configure the cluster profile (once per MATLAB version):
 
<syntaxhighlight lang="matlab">
>> userpath
>> configCluster
</syntaxhighlight>
 
Submitting to the cluster requires SSH credentials; MATLAB prompts for your SSH username and either a password or an identity file, and remembers them for later sessions. Jobs then default to the cluster instead of your local machine. To run on your local machine instead, select the <code>local</code> profile explicitly:
 
<syntaxhighlight lang="matlab">
>> c = parcluster('local');
</syntaxhighlight>
 
== Configuring jobs ==
 
Before submitting, set job parameters through <code>AdditionalProperties</code>. Only <code>MemUsage</code> and <code>WallTime</code> are required; the rest are optional.
 
<syntaxhighlight lang="matlab">
>> c = parcluster;
 
% Required
>> c.AdditionalProperties.MemUsage = '6gb';        % memory per core (default 4gb)
>> c.AdditionalProperties.WallTime = '05:00:00';   % walltime, e.g. 5 hours

% Optional
>> c.AdditionalProperties.AccountName = 'account-name';
>> c.AdditionalProperties.Comment = 'a-comment';
>> c.AdditionalProperties.Constraint = 'V100';     % request a specific GPU flavour
>> c.AdditionalProperties.EmailAddress = 'user-id@wur.nl';
>> c.AdditionalProperties.GpusPerNode = 1;
>> c.AdditionalProperties.QoS = 'the-qos';         % default: std
>> c.AdditionalProperties.QueueName = 'queue-name';
>> c.AdditionalProperties.RequireExclusiveNode = true;
>> c.AdditionalProperties.Reservation = 'a-reservation';
>> c.AdditionalProperties.Tmp = '20g';             % local /tmp space
</syntaxhighlight>
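
The <code>WallTime</code> value is written as hours, minutes and seconds. As a minimal sketch, assuming the cluster profile expects exactly the <code>HH:MM:SS</code> form used in the example above, the string can be built from a run-time estimate in minutes. The variable names are illustrative and not part of the WUR support package.

<syntaxhighlight lang="matlab">
% Build an HH:MM:SS walltime string from an estimate in minutes.
% Assumption: the profile accepts the same format as '05:00:00' above.
estimatedMinutes = 300;                    % e.g. a five-hour job

hrs  = floor(estimatedMinutes / 60);       % whole hours
mins = mod(estimatedMinutes, 60);          % remaining minutes
wallTime = sprintf('%02d:%02d:%02d', hrs, mins, 0);

disp(wallTime)

% The string could then be assigned to the profile:
%   c.AdditionalProperties.WallTime = wallTime;
</syntaxhighlight>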
 
Save the profile so the settings persist between MATLAB sessions, and display the current values with:
 
<syntaxhighlight lang="matlab">
>> c.saveProfile
>> c.AdditionalProperties
</syntaxhighlight>
 
Unset a value by assigning an empty string and saving again:
 
<syntaxhighlight lang="matlab">
>> c.AdditionalProperties.EmailAddress = '';
>> c.saveProfile
</syntaxhighlight>
 
== Interactive pool jobs ==
 
Use <code>parpool</code> as usual; the pool now runs across cluster nodes rather than locally. Depending on cluster load, it may take a while for the requested resources to become available.
 
<syntaxhighlight lang="matlab">
>> c = parcluster;
>> pool = c.parpool(64);          % open a pool of 64 workers
 
>> parfor idx = 1:1000
      a(idx) = ...
  end
 
>> pool.delete                    % close the pool when done
</syntaxhighlight>
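
The loop body in the example above is elided. As a minimal runnable sketch, with a hypothetical body that is not from the original page, the loop below shows the two most common <code>parfor</code> patterns: a sliced output array, where each iteration writes only its own element, and a reduction variable accumulated with <code>+</code>. It uses an open pool if there is one; if the Parallel Computing Toolbox is installed and no pool is open, MATLAB may start one automatically, depending on your parallel preferences.

<syntaxhighlight lang="matlab">
n = 1000;
squares = zeros(1, n);    % sliced output: iteration idx writes only squares(idx)
total   = 0;              % reduction variable: partial sums are combined by MATLAB

parfor idx = 1:n
    squares(idx) = idx^2;       % placeholder computation, replace with your own
    total = total + idx^2;      % order of accumulation is not guaranteed
end

fprintf('sum of squares = %d\n', total);
</syntaxhighlight>

Iterations must not depend on each other's results, because they can run in any order and on any worker.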
 
== Independent batch jobs ==
 
The <code>batch</code> command submits asynchronous jobs and returns a job object used to retrieve the output.
 
<syntaxhighlight lang="matlab">
>> c = parcluster;
>> job = c.batch(@pwd, 1, {}, 'CurrentFolder', '.', 'AutoAddClientPath', false);
 
>> job.State                      % query state
>> job.fetchOutputs{:}            % fetch results once finished
>> job.delete                     % delete when no longer needed
</syntaxhighlight>
 
To list current and past jobs, and fetch results from an earlier one:
 
<syntaxhighlight lang="matlab">
>> c = parcluster;
>> jobs = c.Jobs;                 % array of all jobs
>> job2 = c.Jobs(2);              % a job by index
>> job2.fetchOutputs{:}
</syntaxhighlight>
 
<code>fetchOutputs</code> retrieves the return values of a function job; for a job that ran a script, use <code>load</code> on the job object instead to bring its workspace variables into your session. Data written to files on the cluster must be retrieved from the filesystem directly.
 
== Parallel batch jobs ==
 
To run a parallel workflow, pass a <code>Pool</code> size to <code>batch</code>. Take this example function, saved as <code>parallel_example.m</code>:
 
<syntaxhighlight lang="matlab">
function [t, A] = parallel_example(iter)
 
if nargin == 0
    iter = 8;
end
 
disp('Start sim')
t0 = tic;
parfor idx = 1:iter
    A(idx) = idx;
    pause(2)
end
t = toc(t0);
disp('Sim completed')
save RESULTS A
 
end
</syntaxhighlight>
 
Submit it with a pool of workers:
 
<syntaxhighlight lang="matlab">
>> c = parcluster;
>> job = c.batch(@parallel_example, 1, {16}, 'Pool', 4, 'CurrentFolder', '.', 'AutoAddClientPath', false);
>> job.State
>> job.fetchOutputs{:}
</syntaxhighlight>
 
A pool job always requests '''N+1''' CPU cores: one worker manages the pool, so a job needing eight workers consumes nine cores. Allocating too many workers can give diminishing returns once the coordination overhead outweighs the computation, so tune the worker count for your workload.
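
To make the trade-off concrete, here is a rough estimate for <code>parallel_example</code> with 16 iterations of two seconds each. It is an idealised sketch, not a measurement: it assumes the iterations are spread evenly over the workers and ignores queueing, pool start-up and communication costs.

<syntaxhighlight lang="matlab">
% Idealised cost of a pool job for different worker counts.
iter     = 16;             % loop iterations in parallel_example
pauseSec = 2;              % seconds per iteration (the pause(2) call)
workers  = [4 8 16 32];    % candidate 'Pool' sizes

coresRequested = workers + 1;                         % N workers plus one managing the pool
idealSeconds   = ceil(iter ./ workers) * pauseSec;    % lower bound on loop time

for k = 1:numel(workers)
    fprintf('%2d workers -> %2d cores, at least %d s\n', ...
        workers(k), coresRequested(k), idealSeconds(k));
end
</syntaxhighlight>

Under these assumptions, going from 16 to 32 workers requests 16 more cores without shortening the loop, because there are only 16 iterations to share out.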
 
To retrieve a job later, keep its ID and find it again:
 
<syntaxhighlight lang="matlab">
>> id = job.ID
>> clear job
 
>> c = parcluster;
>> job = c.findJob('ID', id);
>> job.State
>> job.fetchOutputs{:}
</syntaxhighlight>
 
== Debugging ==
 
If a job fails, view its log with <code>getDebugLog</code>. For an independent job with several tasks, pass the task of interest; for a pool job, pass the job object:
 
<syntaxhighlight lang="matlab">
>> c.getDebugLog(job.Tasks(3))   % independent job, specific task
>> c.getDebugLog(job)            % pool job
</syntaxhighlight>
 
If an administrator asks for the scheduler ID of a job:
 
<syntaxhighlight lang="matlab">
>> schedID(job)
</syntaxhighlight>
 
== See also ==
 
* [[Licensed Software]]
* [[Environment Modules]]
* [[Batch Jobs]]
 
== External links ==
 
* [https://www.mathworks.com/help/parallel-computing/ MATLAB Parallel Computing Toolbox documentation]
* [https://www.mathworks.com/help/parallel-computing/examples.html MATLAB Parallel Computing examples]

Latest revision as of 14:23, 16 June 2026
