| author | Vito Graffagnino <vito@graffagnino.xyz> | 2020-09-08 18:10:49 +0100 |
|---|---|---|
| committer | Vito Graffagnino <vito@graffagnino.xyz> | 2020-09-08 18:10:49 +0100 |
| commit | 3b0142cedcde39e4c2097ecd916a870a3ced5ec6 (patch) | |
| tree | 2116c49a845dfc0945778f2aa3e2118d72be428b /vimwiki/Nodes.md | |
| parent | 8cc927e930d5b6aafe3e9862a61e81705479a1b4 (diff) | |
Added the relevant parts of the .config directory. Also add ssh config
Diffstat (limited to 'vimwiki/Nodes.md')
| -rw-r--r-- | vimwiki/Nodes.md | 24 |
1 files changed, 24 insertions, 0 deletions
diff --git a/vimwiki/Nodes.md b/vimwiki/Nodes.md
new file mode 100644
index 0000000..4555781
--- /dev/null
+++ b/vimwiki/Nodes.md
@@ -0,0 +1,24 @@
+__Ganglia__ (https://uhhpc.herts.ac.uk/ganglia/) can be useful to see the state of nodes.
+
+If a node goes down while a user’s job is running on it, the job will not terminate properly
+and may flood the user’s inbox with notifications. If `Ganglia` or `showstate` reports that a node
+is down, consider rebooting it with
+
+`sudo rebootnode.pl nodexxx`
+
+This will prompt you for the IDRAC password, which is `rianhs4b`. Once a node has been rebooted,
+wait a few minutes, then check that you can ssh into it as a normal user and view your home
+directory and /beegfs. If so, bring it back online with
+
+`sudo pbsnodes -c nodexxx`
+
+If a node is misbehaving and you don’t want to or can’t reboot it, you can temporarily remove it
+from the pool used by the job control system with
+
+`pbsnodes -o nodexxx`
+
+This is also reversed by
+
+`pbsnodes -c nodexxx`
+
+
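The reboot-and-return procedure in the note above can be sketched as a small shell function. This is a minimal sketch, not part of the commit: `recover_node`, `run`, and `DRY_RUN` are hypothetical names, the 300-second wait is an assumed reading of "a few minutes", and the ssh check assumes the invoking user is the "normal user" the note refers to. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the node recovery steps from Nodes.md.
# Hypothetical helper names: run, recover_node, DRY_RUN.

# Default to dry-run so nothing is rebooted by accident.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

recover_node() {
    node="$1"
    if [ -z "$node" ]; then
        echo "usage: recover_node nodexxx" >&2
        return 2
    fi
    # Reboot via the site script; it prompts for the IDRAC password.
    run sudo rebootnode.pl "$node" || return 1
    # Assumed wait of five minutes for the node to come back up.
    run sleep 300
    # Check that login works and that the home directory and /beegfs are visible.
    run ssh "$node" 'ls "$HOME" /beegfs >/dev/null' || return 1
    # Return the node to the pool used by the job control system.
    run sudo pbsnodes -c "$node"
}
```

After sourcing the file, `recover_node nodexxx` lists the steps, and `DRY_RUN=0 recover_node nodexxx` would carry them out. Stopping at the first failed step means a node that cannot be reached over ssh is not put back into the pool.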
