SGI tools

Introduction

SGI machines have their own NUMA architecture. To obtain good performance you need to use SGI's own MPI library (MPT).

For MPI programs a dedicated tool is provided: perfboost.

If hyper-threading is enabled, note that the first half of the core numbers corresponds to "real" (physical) cores and the second half to virtual cores.
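To check which logical CPUs share a physical core, the Linux sysfs topology files can be read directly. This is a minimal sketch, assuming a Linux kernel that exposes thread_siblings_list; the helper name cpu_siblings and its optional second argument (an alternative sysfs root, handy for testing) are illustrative and not part of any SGI tool.

```shell
# Print the hardware-thread siblings of one logical CPU.
# $1: logical CPU number
# $2: sysfs root (optional, defaults to /sys)
cpu_siblings() {
    cat "${2:-/sys}/devices/system/cpu/cpu$1/topology/thread_siblings_list"
}
```

Calling cpu_siblings 0 lists the logical CPUs that share a core with CPU 0. With the numbering described above, one entry would be expected in the lower half of the CPU range and the other in the upper half.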

SGI documentation

- on the machine: web_browser /usr/share/doc/packages/sgi
- web: http://techpubs.sgi.com (uv2000, uv100)

MPT library (SGI's MPI) and perfboost

Message Passing Toolkit (MPT) User Guide

SGI UV machines use the NUMAlink interconnect technology. MPT (Message Passing Toolkit) is the MPI library to use on them.

module load mpt
compile with: -lmpt

When recompiling with MPT is not possible, the perfboost tool can be used instead.

module load mpt
module load perfboost
man perfboost
mpirun -np NN perfboost -XMPI YOUR_BINARY

If you get the error:

MPT ERROR: with GRU send from rank 0 to rank 7: -1

then you can try:

export MPI_SHARED_NEIGHBORHOOD=HOST

perf & monitoring tools

Monitoring/information tools

nodeinfo: memory information (local/remote)

Memory Statistics  Tue Dec 9 10:32:10 2014

      ------------------------ Per Node KB -------------------------  ----- Preferred Alloc -----  -- Loc/Rem --
node       Total        Free      Used  Dirty    Anon    Slab  Shmem  hit  miss  foreign  interlv  local  remote
   0   257905644   248978080   8927564     20   14760   57684     24    0     0        0        0      0       0
   1   260030464   255714748   4315716     20   13600   55488     20    0     0        0        0      0       0
   2   260030464   255774932   4255532     24   21572   23904     20    0     0        0        0      0       0
   3   260030464   255766680   4263784     24    1868   32888     24    0     0        0        0      0       0
   4   260030464   255793728   4236736     16    8484   29052     12    0     0        0        0      0       0
   5   260030464   255781864   4248600     28   13452   29500     28    0     0        0        0      0       0
   6   260030464   255775168   4255296     72   26572   26020     32    0     0        0        0      0       0
   7   260030464   255805596   4224868     16   11128   24088      8    0     0        0        0      0       0
   8   260030464   253300928   6729536     16   10824   96568     28  564     0        0        0    564       0
   9   260030464   255650724   4379740     16   10340   30356     12    0     0        0        0      0       0
  10   260030464   255795236   4235228     24    2500   21912      8    0     0        0        0      0       0
  11   260030464   255785304   4245160     56    3376   32900      4    0     0        0        0      0       0
 TOT  3118240748  3059922988  58317760    332  138476  460360    220  564     0        0        0    564       0

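Columns displayed:

  • Per Node KB (memory of each node)
    • Total / Free / Used / Dirty (modified pages not yet written back) / Anon (anonymous pages, not backed by a file) / Slab (kernel slab caches) / Shmem (shared memory)
    • Preferred Alloc: hit / miss / foreign / interlv; these appear to correspond to the Linux per-node NUMA counters (numa_hit, numa_miss, numa_foreign, interleave_hit)
    • Loc/Rem: allocations served from local (fast) / remote (slow) memory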
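The columns described above can be condensed with a short awk filter. This is a sketch only, assuming the column order of the sample output shown earlier (Total in column 2, Used in column 4, remote in column 14); the function name nodeinfo_summary is hypothetical.

```shell
# Read nodeinfo-style output on stdin and print, for each numbered node:
#   node id, used memory as a percentage of total, remote allocation count.
# Header lines and the TOT line are skipped because their first field is not a number.
nodeinfo_summary() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 14 && $2 > 0 {
        printf "%s %.1f %s\n", $1, 100 * $4 / $2, $14
    }'
}
```

It would typically be fed from the tool itself, for example by piping the output of nodeinfo into nodeinfo_summary. A non-zero third field points at a node whose processes are using remote (slow) memory.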

gr_systat: monitoring with a graphical widget

Graphical tool (X11) that displays real-time information

module load gr_systat
gr_systat

collectl: monitoring (text version)

collectl -sxX : InfiniBand statistics
collectl -i10 : report statistics every 10 seconds

pmchart: monitoring (graphical version)

pmchart -c Linkstat-uv : visualisation of NUMAlink transfers

pmgcluster

Graphical view (X11) in real time

  • Point with the mouse + space: shows the context.
  • Right click: quit

pmgcluster -t 10 localhost

cpumap

Displays information about the cores

Information on the machine state

lscpu

List the cores on the machine.

lstopo (hwloc)

Displays the hardware elements of the machine (topology)

gstack: the call stack

Displays the call stack and the state of a process

gstack PID

NUMA information

Information about the nodes and the distances between nodes

numactl --hardware
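To see at a glance how much memory is still free on each node, the per-node "free" lines of this output can be filtered. This is a minimal sketch, assuming the usual numactl --hardware layout with lines of the form "node N free: X MB"; the function name numa_free is illustrative.

```shell
# Read `numactl --hardware` output on stdin and print "node free_MB" per NUMA node.
numa_free() {
    awk '$1 == "node" && $3 == "free:" { print $2, $4 }'
}
```

Typical use would be to pipe numactl --hardware into numa_free. The distance matrix at the end of the output is ignored by this filter.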

Profiling

perf top : profiling

perf displays the time spent within functions (-g must be used)

perf top -p PID

About the process placement

Architecture

The machine has 12 nodes, each with 256 GB of memory. Because memory access is non-uniform (NUMA), a process should preferably use the memory local to the node it runs on.

The dplace tool should be used to optimize resource usage.

dplace is an SGI tool.

dplace: process placement (without Torque)

dplace MY_PROGRAM
mpirun -np 2 dplace MY_PROGRAM

  • dplace -q : displays the cores occupied by processes placed with dplace
  • dplace -c N1,N2,N3 : assigns processes to the cores N1, N2, N3
  • … man dplace
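The options above can be combined in a small wrapper that prints the command before running it. This is a sketch under assumptions: the function name run_placed and the DRY_RUN variable are hypothetical, and only the mpirun -np and dplace -c options already shown in this page are used. The right CPU list depends on the machine and should be checked against man dplace.

```shell
# Run (or, if DRY_RUN is set, only print) an MPI job whose processes are pinned with dplace.
# $1: number of MPI ranks
# $2: CPU list passed to dplace -c (for example 0,1)
# $3...: program and its arguments
run_placed() {
    np=$1
    cpus=$2
    shift 2
    # When DRY_RUN is non-empty the command line is echoed instead of executed.
    ${DRY_RUN:+echo} mpirun -np "$np" dplace -c "$cpus" "$@"
}
```

Setting DRY_RUN=1 before calling run_placed 2 0,1 MY_PROGRAM prints the resulting command line, which makes it easy to check the placement before launching a long job.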