The current health situation forces us to work in degraded mode. We are doing what is necessary to keep PlaFRIM operational as long as the building remains open.
User follow-up and support will probably also be degraded.
Of course, we can still be reached at plafrim-support@inria.fr.
Thank you for your understanding.
New parallel storage space
A new parallel BeeGFS file system (see https://www.beegfs.io/content/) is now available on PlaFRIM. You will find a /beegfs/<LOGIN> directory on all PlaFRIM nodes.
This parallel file system replaces the /lustre storage, which is no longer under warranty and whose use is no longer recommended.
The data in this directory are not backed up. The quota is 1 TB per user.
If you need to read or write heavily and/or in parallel, prefer this file system over your /home/<LOGIN> directory; a quick way to check your usage is sketched below.
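To check your usage against the 1 TB quota, you can use the beegfs-ctl tool shipped with BeeGFS. Whether per-UID quota accounting is enabled on PlaFRIM is an assumption here, so treat this as a sketch rather than the official procedure:

    # Ask BeeGFS for the quota and usage of your own UID
    # (assumes beegfs-ctl is installed on the nodes).
    beegfs-ctl --getquota --uid $(id -u)

    # Fallback: measure the size of your directory directly.
    du -sh /beegfs/$USER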
Information on all available storage spaces can be found in the PlaFRIM website's FAQ (see here).
Please do not hesitate to contact us if you need further information.
New AMD nodes available
New AMD compute nodes are now available (see the Hardware Documentation):
– diablo[01-04]: 2 × AMD EPYC 7452 32-core processors and 256 GB RAM
– diablo05: 2 × AMD EPYC 7702 64-core processors and 1 TB RAM
To reserve them, use the “amd” or “diablo” constraint, for example as shown below.
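A minimal sketch with Slurm's -C option (node count and time limit are illustrative assumptions):

    # Request one diablo node interactively for one hour.
    salloc -C diablo -N 1 -t 01:00:00

    # The same constraint in a batch script:
    #   #SBATCH --constraint=amd

Presumably “amd” matches any AMD node, while “diablo” restricts the job to this specific set of machines.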
Please do not hesitate to contact us if you need further information.
End of network instability on bora[001-040] and devel[01-02]
The network switch was replaced on January 15. Since then, we have not observed any outages.
We confirm that the access problems on the bora[001-040] and devel[01-02] nodes are resolved.
Please do not hesitate to contact us if you need further information.
New SSH fingerprint for the PlaFRIM nodes
Many of you have reported connection problems related to the changed SSH fingerprint of the development servers.
So that you can verify and accept this change, the new fingerprint of these servers is given below:
SHA256:HTzNYIkxeVcDVORkoKXleJTOEqbtq5gs9UfqjyHBOGY
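If your SSH client still has the old key cached, you can remove the stale entry and inspect the fingerprint the server presents before accepting it. A minimal sketch, assuming plafrim is the host name you connect to (replace it with the alias you actually use):

    # Remove the stale entry from your known_hosts file.
    ssh-keygen -R plafrim

    # Display the SHA256 fingerprint of the key the server presents,
    # to compare with the fingerprint published above.
    ssh-keyscan plafrim 2>/dev/null | ssh-keygen -lf -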
Do not hesitate to contact us if you need more information.
Migration
As indicated in the message of November 8th, the migration is ongoing.
Currently, as a reminder:
1. when you connect to PlaFRIM, you arrive on the devel01 or devel02 front nodes, which are latest-generation Skylake machines.
2. you can still connect to the old front node by typing ‘ssh plafrim2’; this part of the platform (miriel, breeze, mouse, arm01, sirocco…) will soon be migrated to the new version of the platform.
3. soon, you will find on PlaFRIM:
* the machines you have been working on until now
* the new bora machines (dual-socket Skylake, 36 cores and 192 GB of memory)
* modules, with defaults dedicated to the target architecture you are working on
* a /dev space for modules where all users can provide their own software stacks and make them available to other users; all previous dev modules will have to be redone, since the old ones no longer work on this new version of the platform
* a single slurm partition (routing) that allows you to address all machines; to choose a particular category of machines, specify the associated “feature” with slurm’s -C option (salloc -C Bora, for example); to list the “features” associated with each node, run sinfo -Nel (see the sketch after this list)
* guix to manage your experimental environments (an example follows at the end of this section)
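As mentioned in the list above, node categories are selected through Slurm features. A minimal sketch (feature names are case-sensitive, so check the sinfo output for the exact spelling before using it):

    # List every node together with its associated features.
    sinfo -Nel

    # Request one bora node through the routing partition,
    # using the feature name exactly as reported by sinfo.
    salloc -C Bora -N 1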
Finally, if on your first connection to this new platform you see a message containing:

    ssh-keygen -f "~/.ssh/known_hosts" -R "plafrim-ext"

type the given command and everything should work.
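For guix, a minimal sketch of building a throwaway experimental environment (the package names are illustrative assumptions, not a PlaFRIM-specific list):

    # Enter a one-off environment providing a compiler and Open MPI,
    # without touching your profile.
    guix environment --ad-hoc gcc-toolchain openmpi

    # Or install a package into your own profile.
    guix package -i python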
PlaFRIM changes and schedule
We remind you that from next Tuesday, disruptions are to be expected on PlaFRIM.
When you connect to PlaFRIM, you will land on the new front nodes (devel01 and devel02). You will then be able to submit jobs to the new bora001 to bora040 nodes.
As long as the PlaFRIM2 nodes have not been migrated to PlaFRIM3, you can still connect to the old development nodes (devel11 to devel13) via the “ssh plafrim2” command from the new devel nodes (devel01 or devel02).
The planned schedule remains the same: migrate as many nodes as possible to PlaFRIM3 over the next week.
We still have some configuration to do before everything is fully operational on PlaFRIM3. Feel free to test it next Tuesday and give us feedback via plafrim-support@inria.fr.
Feel free to contact us for any information you may need about this migration.
PlaFRIM shutdown from 1st to 6th August 2018
The server room of the Talence building will undergo refurbishment work on its cooling system, which requires a total shutdown of PlaFRIM from 1st August starting at 12:00, with a restart during the day on 6th August.
Thank you for your understanding.
Cleaning of /lustre storage and quota implementation
The Lustre file system /lustre is close to saturation (in terms of inodes). If the system reaches saturation, creating new files on /lustre will no longer be possible.
It is therefore necessary for each user to clean up their personal directory /lustre/<LOGIN>.
We remind you that the /lustre storage space is a temporary work area and is not backed up.
To store your large, non-temporary data, you can also use the iRODS service. Information on it is available here.
We will implement a quota on the number of files per user (~400,000); a sketch for checking your usage follows.
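To see how many files you currently have, you can query Lustre's per-user accounting or simply count your files. A minimal sketch (assumes the lfs client tool is available and that inode accounting is enabled):

    # Report block and inode usage for your user on /lustre
    # (the "files" column is the count the upcoming quota applies to).
    lfs quota -u $USER /lustre

    # Fallback: count the files under your personal directory.
    find /lustre/$USER -type f | wc -l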
Thank you for your understanding.