LaPalma3 (1): Introduction
Please note that all the SIEpedia's articles address specific issues or questions raised by IAC users, so they do not attempt to be rigorous or exhaustive, and may or may not be useful or applicable in different or more general contexts.
Introduction | Connecting | Useful Commands (preparations) | Useful Commands (executions) | Script files | FAQs |
IMPORTANT: This documentation is deprecated. It will not be further updated. The new documentation for LaPalma can be found here for external users or here if you are connected to IAC's internal network.
Introduction
This user's guide for the LaPalma Supercomputer (v3) is intended to provide the minimum amount of information needed by a new user of this system. As such, it assumes that the user is familiar with many of the standard aspects of supercomputing, such as the Unix operating system.
We hope you can find most of the information you need to use our computing resources: from applications and libraries to technical documentation about LaPalma, how to include references in publications, and so on. Please read this document carefully, and if any doubt arises do not hesitate to contact our support group at res_support@iac.es
System Overview
LaPalma comprises 252 IBM dx360 M4 compute nodes. Each node has 16 cores (Intel E5-2670) at 2.6 GHz, runs a Linux operating system, and has 32 GB of RAM (2 GB per core) and 500 GB of local disk storage. Two Bull R423 servers are connected to a pair of NetApp E5600 storage systems providing a total of 346 TB of disk storage, accessible from every node through the Lustre parallel filesystem. The networks that interconnect LaPalma are:
- Infiniband Network: High-bandwidth network used for the communications of parallel applications.
- Gigabit Network: Ethernet network used by the nodes to remotely mount their root filesystem from the servers; it is also the network over which Lustre works.
File Systems
IMPORTANT: It is your responsibility as a user of the LaPalma system to backup all your critical data. NO backup of user data will be done in any of the filesystems of LaPalma.
Each user has several areas of disk space for storing files. These areas may have size or time limits; please read this whole section carefully to learn the usage policy of each of these filesystems.
There are 3 different types of storage available inside a node:
- Root filesystem: It is the filesystem where the operating system resides
- Lustre filesystems: Lustre is a distributed networked filesystem which can be accessed from all the nodes
- Local hard drive: Every node has an internal hard drive
Root Filesystem
The root filesystem, where the operating system is stored, does not reside on the node: it is an NFS filesystem mounted from a Network Attached Storage (NAS) server.
As this is a remote filesystem, only operating system data should reside on it. The use of /tmp for temporary user data is NOT permitted. The local hard drive can be used for this purpose instead, as described later.
Furthermore, the environment variable $TMPDIR is already configured so that applications use the local hard drive to store their temporary files.
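For example, you can check which directory it points to from any shell on LaPalma and refer to it directly in your own scripts:
usertest@login1:~> echo $TMPDIR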
Lustre Filesystem
Lustre is an open-source, parallel file system that provides fast, reliable data access from all nodes of the cluster to a global filesystem, with remarkable scalability and performance. Lustre allows parallel applications simultaneous access to a set of files (even a single file) from any node that has the Lustre filesystem mounted, while providing a high level of control over all file system operations. These filesystems are the recommended ones for most jobs, because Lustre provides high-performance I/O by "striping" blocks of data from individual files across multiple disks on multiple storage devices and reading/writing these blocks in parallel. In addition, Lustre can read or write large blocks of data in a single I/O operation, thereby minimizing overhead.
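If you want to check or tune how a file or directory is striped across the Lustre storage targets, the lfs utility provides the getstripe and setstripe subcommands (the paths below are only examples):
usertest@login1:~> lfs getstripe /storage/scratch/usertest/mydata
usertest@login1:~> lfs setstripe -c 4 /storage/scratch/usertest/bigdata
Here -c 4 makes new files created under that directory be striped over four storage targets, which can improve throughput for large files.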
Even though there is only one Lustre filesystem mounted on LaPalma, there are different locations for different purposes:
- /storage/home: This location holds the home directories of all the users. When you log into LaPalma you start in your home directory by default. Every user has their own home directory to store executables, their own source code and their personal data.
- /storage/projects: In addition to the home directory, there is a directory in /storage/projects for each group of users of LaPalma. For instance, the group iac01 will have a /storage/projects/iac01 directory ready to use. This space is intended to store data that needs to be shared between the users of the same group or project. All the users of the same project share their common /storage/projects space, and it is the responsibility of each project manager to determine and coordinate the best use of this space and how it is distributed or shared among their users.
- /storage/scratch: Each LaPalma user has a directory under /storage/scratch; you must use this space to store temporary files of your jobs during their execution (a short illustration follows this list).
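As an illustration, shared input data would live in the project space, temporary job files in scratch, and your own code and personal data in your home directory (the group name iac01 and the directory names below are just examples):
usertest@login1:~> ls /storage/projects/iac01/shared_data
usertest@login1:~> mkdir -p /storage/scratch/usertest/job001
usertest@login1:~> ls $HOME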
The previous three locations share the same quota in order to limit the amount of data that can be saved by each group. Since the locations /storage/home, /storage/projects and /storage/scratch are in the same filesystem, the quota assigned is the sum of “Disk Projects” and “Disk Scratch” established by the access committee.
The quota and the space currently in use can be checked with the lfs quota command:
usertest@login1:~> lfs quota -hg <GROUP> /storage
For example, if your group has been granted the following resources: Disk Projects: 1000 GB and Disk Scratch: 500 GB, the quota command will report the sum of the two values:
usertest@login1:~> lfs quota -hg usergroup /storage
Disk quotas for grp usergroup (gid 123):
     Filesystem    used   quota   limit   grace   files   quota   limit   grace
      /storage/    500G    1.5T    1.5T       -     700  100000  100000       -
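If you do not remember your Unix group name, you can obtain it with the id command and pass it directly to lfs quota:
usertest@login1:~> id -gn
usergroup
usertest@login1:~> lfs quota -hg $(id -gn) /storage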
The number of files is limited as well. By default, the quota is set to 100000 files.
If you need more disk space or a larger number of files, the person responsible for your project has to make a request for the extra space, specifying how much is requested and the reasons why it is needed. The request can be sent by email or through any other contact channel to the user support team.
- /storage/apps: This location holds the applications and libraries that have already been installed on LaPalma. Take a look at its directories (for example by listing them, as shown below) or at XXXXXX to know which applications are available for general use. Before installing any application that is needed by your project, first check whether it is already installed on the system. If some application that you need is not on the system, you will have to ask our user support team to install it. If it is a general application with no restrictions on its use, it will be installed in a public directory under /storage/apps, so that all users of LaPalma can make use of it. If the application needs some type of license and its use must be restricted, a private directory under /storage/apps will be created, so that only the entitled users of LaPalma can make use of it.
All applications installed on /storage/apps will be installed, controlled and supervised by the user support team. This does not mean that users cannot help in this task; both can work together to get the best result. The user support team can provide its wide experience in compiling and optimizing applications on the LaPalma platform, and the users can provide their knowledge of the application to be installed. Any general application that has been modified in some way from its normal behavior by the project's users for their own study, and that may not be suitable for general use, must be installed under /storage/projects or /storage/home depending on the usage scope of the application, but not under /storage/apps.
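For example, to get a quick overview of what is already installed you can simply list the contents of that directory:
usertest@login1:~> ls /storage/apps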
Local Hard Drive
Every node has a local hard drive that can be used as a local scratch space to store temporary files during
executions of one of your jobs. This space is mounted over /scratch directory. The amount of space within
the /scratch
filesystem varies from node to node (depending on the total amount of disk space available). All
data stored in these local hard drives at the compute nodes will not be available from the login nodes. Local
hard drive data is not automatically removed, so each job should have to remove its data when finishes.
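A minimal sketch of this pattern inside a job script (the program name and directory layout are hypothetical) is to create a private working directory on the local disk, run there, copy the final results back to Lustre and clean up before the job ends:
WORKDIR=/scratch/$USER/job_$$            # private working directory on the node-local disk
mkdir -p $WORKDIR && cd $WORKDIR
my_program input.dat > results.dat       # hypothetical application writing its output locally
cp results.dat /storage/scratch/$USER/   # copy the results back to Lustre, visible from the login nodes
cd && rm -rf $WORKDIR                    # remove the local data when the job finishes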
Acknowledgments
Please add the following text to your publications if they are based on results obtained on LaPalma:
The author thankfully acknowledges the technical expertise and assistance provided by the Spanish Supercomputing Network (Red Española de Supercomputación), as well as the computer resources used: the LaPalma Supercomputer, located at the Instituto de Astrofísica de Canarias.
Where can I get support if I have any issue?
- Please read the other sections, like Connecting, Useful Commands (preparations), Useful Commands (executions), examples of Script files and FAQs.
- Man pages: Information about most of the commands and software installed is available via the standard UNIX man command. For example, to read about command-name, just type the following in a shell on LaPalma:
usertest@login1:~> man command-name
You can also search the available man pages by keyword using the -k flag. For example:
usertest@login1:~> man -k compiler
usertest@login1:~> man man
- If you need help or further information, please contact us by sending an email to res_support@iac.es