author Vito Graffagnino <vito@graffagnino.xyz> 2020-09-08 18:10:49 +0100
committer Vito Graffagnino <vito@graffagnino.xyz> 2020-09-08 18:10:49 +0100
commit 3b0142cedcde39e4c2097ecd916a870a3ced5ec6
tree 2116c49a845dfc0945778f2aa3e2118d72be428b /vimwiki/Nodes.md
parent 8cc927e930d5b6aafe3e9862a61e81705479a1b4
Added the relevant parts of the .config directory. Also add ssh config
Diffstat (limited to 'vimwiki/Nodes.md')
-rw-r--r-- vimwiki/Nodes.md 24
1 files changed, 24 insertions, 0 deletions
diff --git a/vimwiki/Nodes.md b/vimwiki/Nodes.md
new file mode 100644
index 0000000..4555781
--- /dev/null
+++ b/vimwiki/Nodes.md
@@ -0,0 +1,24 @@
+__Ganglia__ (https://uhhpc.herts.ac.uk/ganglia/) can be useful to see the state of nodes.
+
+If a node goes down while a user’s job is running on it, the job will not terminate properly
+and may flood the user’s inbox with notifications. If `Ganglia` or `showstate` reports that a
+node is down, consider rebooting it with
+
+`sudo rebootnode.pl nodexxx`
+
+This will prompt you for the IDRAC password, which is `rianhs4b`. Once a node has been rebooted,
+wait a few minutes, then check that you can ssh into it as a normal user and view your home
+directory and /beegfs. If so, bring it back online with
+
+`sudo pbsnodes -c nodexxx`
+
+If a node is misbehaving and you cannot or do not want to reboot it, you can temporarily remove it
+from the pool used by the job control system with
+
+`pbsnodes -o nodexxx`
+
+This is also reversed with
+
+`pbsnodes -c nodexxx`
+
+
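Review note: the post-reboot steps in this file (ssh in, check the home directory and /beegfs, then clear the offline flag) could be wrapped in a small helper. The following is only a sketch under assumptions: the function names `node_ready` and `bring_online` and the `SSH` / `PBSNODES` override variables are hypothetical and not part of this commit; only `ssh`, `sudo pbsnodes -c` and the paths come from the note itself.

```shell
#!/bin/sh
# Hypothetical helper for the checks described in Nodes.md; not part of the commit.

# Commands can be overridden for a dry run; these defaults are assumptions.
: "${SSH:=ssh}"
: "${PBSNODES:=sudo pbsnodes}"

# Return 0 if the node accepts ssh and both $HOME and /beegfs are visible as
# directories. This does not prove that /beegfs is actually mounted.
node_ready() {
    node=$1
    $SSH -o BatchMode=yes -o ConnectTimeout=10 "$node" \
        'test -d "$HOME" && test -d /beegfs'
}

# Clear the offline flag only when the checks above pass.
bring_online() {
    node=$1
    if node_ready "$node"; then
        $PBSNODES -c "$node"
    else
        echo "$node failed the ssh, home or /beegfs check; leaving it offline" >&2
        return 1
    fi
}
```

After sourcing the file as a normal user, `bring_online nodexxx` would run the checks and then call `sudo pbsnodes -c nodexxx`. Setting `PBSNODES=echo` first prints the command instead of running it.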