Commit Graph

75 Commits

Author SHA1 Message Date
Wirawan Purwanto
dfb9db6a60 * Added tool to get a detailed picture of GPU utilization on a SLURM cluster.
Tested on Wahab cluster.
2023-04-06 16:25:30 -04:00
Wirawan Purwanto
3aa1688f8e * Added basic README. 2023-03-01 10:39:12 -05:00
Wirawan Purwanto
4af174ea34 * Containers: Added simplistic tool to dump info about Python
inside the container.
2023-03-01 09:35:41 -05:00
Wirawan Purwanto
ce695cd672 * Created initial tool to scan user dirs for the size of their trash folders. 2022-10-03 15:35:30 -04:00
Wirawan Purwanto
c4601a5a30 * (WIP) Minor improvement + documentation. 2022-09-30 17:22:19 -04:00
Wirawan Purwanto
5edd528511 * Saved a sample revised ipython module loader, to be used with
"fix3" version of lmod_python_fix.py.

  Sigs:
    # -rw------- 1 wpurwant users 291 2020-06-09 20:09 000-odurc-lmod.py.rev3
    # 70e84e7eaaf28da297fa342c8318fe2b  000-odurc-lmod.py.rev3
2022-02-22 11:23:54 -05:00
Wirawan Purwanto
4a1b5d0a69 * Saved developmental notebooks to devise/test the "lmod_python_fix.py".
These were done on Wahab with legacy (lmod-based) Python suite.

  Sigs:
    # -rw-r--r-- 1 wpurwant users 56588 2020-06-09 17:35 debug_ipython_start_1.ipynb
    # -rw-r--r-- 1 wpurwant users 20395 2020-06-09 17:34 debug_ipython_start_2.ipynb
    # -rw-r--r-- 1 wpurwant users 21965 2020-06-09 19:20 debug_ipython_start_3.ipynb
    # -rw-r--r-- 1 wpurwant users 10717 2020-06-09 19:23 debug_ipython_start_4_vanilla.ipynb
    # -rw-r--r-- 1 wpurwant users 37041 2020-06-09 19:59 debug_ipython_start_5.ipynb
    # b061fbc819806bc18d48287a9654bcc5  debug_ipython_start_1.ipynb
    # bc262fc0d3cd290da1f46a242a6fcccb  debug_ipython_start_2.ipynb
    # 5f289c909e967096712b3196a19979b7  debug_ipython_start_3.ipynb
    # b5f0cd9251e325b0c7e0222ba4cfc037  debug_ipython_start_4_vanilla.ipynb
    # abc440a885da0872d5bc215adfc3edd7  debug_ipython_start_5.ipynb
2022-02-22 10:44:05 -05:00
Wirawan Purwanto
0ad73f50d6 * Saved README.txt of the lmod_python fix (notes written to Minhao, 2020-06-09). 2022-02-22 10:41:11 -05:00
Wirawan Purwanto
d159d57578 * Third attempt to fix lmod "module" for ipython ("fix3")
(dated: 2020-06-09).
* Supports both Turing and Wahab.
* Allows both addition and deletion of paths from sys.path.

* Author's note: Later on, I discovered that these complex steps do not appear
  to be necessary; somehow the lmod Python commands were able to take care
  of additions and deletions through modification of os.environ['PYTHONPATH']
  alone, so I am not sure this fix is strictly needed. (?)

  Sigs:
    # -rw------- 1 wpurwant users 3298 2020-06-09 20:07 lmod_python_fix3.py
    # 78abc7bae90bf23a42aa40285175ca48  lmod_python_fix3.py
2022-02-22 10:28:14 -05:00
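
The "addition and deletion of paths from sys.path" idea in the fix3 entry above can be illustrated with a minimal sketch. This is not the content of lmod_python_fix3.py, which is not shown in this log; the function name `sync_sys_path` and its arguments are hypothetical, and the sketch assumes that the only input is the PYTHONPATH value before and after a module command.

```python
import os
import sys


def sync_sys_path(old_pythonpath, new_pythonpath, path=None):
    """Apply the difference between two PYTHONPATH strings to a path list.

    Hypothetical helper for illustration only; defaults to sys.path.
    """
    if path is None:
        path = sys.path
    old = [p for p in old_pythonpath.split(os.pathsep) if p]
    new = [p for p in new_pythonpath.split(os.pathsep) if p]

    # Deletion: drop entries that disappeared from PYTHONPATH.
    removed = set(old) - set(new)
    path[:] = [p for p in path if p not in removed]

    # Addition: prepend entries that are new, keeping their order and
    # skipping anything already present so the list cannot grow forever.
    added = [p for p in new if p not in old and p not in path]
    path[:0] = added
    return path
```

Modifying the list in place (`path[:] = ...`) matters here, because other code may already hold a reference to `sys.path`.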
Wirawan Purwanto
232cc5cafb * Second attempt to fix lmod "module" for ipython ("fix2")
(dated: 2020-06-09).
  WIP: Prevents indefinite lengthening of sys.path, but is not yet a
  complete fix.

  Sigs:
    # -rw------- 1 wpurwant users 1290 2020-06-09 19:28 lmod_python_fix2.py
    # e218a4ac73762e0380c4bbe2290906b1  lmod_python_fix2.py
2022-02-22 10:22:45 -05:00
Wirawan Purwanto
5c0b516026 * First attempt to fix lmod "module" for ipython ("fix1")
(dated: 2020-06-09)

  Sigs:
    # -rw------- 1 wpurwant users 915 2020-06-09 17:20 lmod_python_fix1.py
    # bb42b2bde01f2e235eb205fe555fab9e  lmod_python_fix1.py
2022-02-22 10:16:47 -05:00
Wirawan Purwanto
7c02045fa1 * Saved the original lmod_python.py and
~/.ipython/profile_default/startup/000-odurc-lmod.py
  from Wahab cluster on the "legacy" lmod-based ipython/jupyter
  session. Last update: 2020-06-09.

  Sigs:
    # -rw-r--r-- 1 wpurwant users 179 2020-06-03 15:49 000-odurc-lmod.py
    # -rw-r--r-- 1 wpurwant users 843 2020-03-17 14:02 lmod_python.py
    # 8718a2d7a7d6d152701e6c31254f7168  000-odurc-lmod.py
    # 24a740e83656875bd39f942529378414  lmod_python.py
2022-02-22 10:11:47 -05:00
Wirawan Purwanto
9edb1e040d * Imported tools to dump information about files in a container.
The "dump-info" tool works only for Debian-based distros for now.
2021-06-29 13:29:56 -04:00
Wirawan Purwanto
e828a16a49 * Expanded support to Wahab.
Last updated 2021-03-02.
2021-06-29 13:25:15 -04:00
Wirawan Purwanto
cc95430e06 * Added lmod module for OPENIB-based nwchem installation on Wahab. 2021-04-20 00:29:57 -04:00
Wirawan Purwanto
f9d350aa7c * Added make-based install procedure for nwchem 7.0.0. 2021-04-20 00:29:27 -04:00
Wirawan Purwanto
9e7c5b3312 * Modularized & tidied up the nwbuild05 script. 2021-04-20 00:28:39 -04:00
Wirawan Purwanto
b3f1877f73 * Initial version of nwchem "build05": Using "OPENIB" backend. 2021-04-15 12:41:17 -04:00
Wirawan Purwanto
67c05e7efd * Added small build notes for nwchem on Wahab. 2021-04-15 12:40:55 -04:00
Wirawan Purwanto
08dfff3ce4 * Modifications for building armci-mpi at Wahab's install location. 2021-04-15 12:40:36 -04:00
Wirawan Purwanto
8cf081b4b1 * Copied armci-mpi build script from ODU container repo.
This is taken from commit #dd7cf94ecdb1c1368fed2458c4f45bd95e1a140c
   from ODU container repo. No changes were made at all.
2021-04-14 12:01:31 -04:00
Wirawan Purwanto
b97592a8c7 * Copied "build04" nwchem build script from ODU container repo.
This is taken from commit #a278c9cb246fc267f93018e607856c1141942d32
  from ODU container repo. No changes were made at all.
2021-04-14 11:36:44 -04:00
Wirawan Purwanto
d68c1cc608 * Wahab: Imported comsol-51, last modified 2021-02-11 to accommodate
launching from another SLURM job (e.g. OOD virtual desktop interface).
2021-02-12 15:16:43 -05:00
Wirawan Purwanto
0331ea6ed3 * Committed: Original user-invocable "qe-<VERSION>" from Wahab.
Signatures:
    # -rwxr-xr-x 1 wpurwant users 520 2020-03-10 14:39:05 qe-6.3
    # -rwxr-xr-x 1 wpurwant users 514 2020-03-10 14:29:54 qe-6.4
    # -rwxr-xr-x 1 wpurwant users 531 2020-03-10 14:28:29 qe-6.4-intel
    # bda38c3efc9bb0f83db351924e06da43  qe-6.3
    # 2b2e9ec9c641dd60206776a5a80c75a3  qe-6.4
    # 083455403fa47897c4d6e7ce6d591f9a  qe-6.4-intel
    # 8d29ad9caa756b55747e58a06c39aa59576a896b  qe-6.3
    # 1443cdb47e4340be1106d4e80ba3dc47717d9ca1  qe-6.4
    # 1bf39ffc828cdc784658ea510fbda0602d34e237  qe-6.4-intel
2021-01-28 09:54:38 -05:00
Wirawan Purwanto
e9597af157 * Minor typographical improvements. 2021-01-28 09:43:50 -05:00
Wirawan Purwanto
dda0e1b7ad * ARCHIVE: Imported some tools from my stats work back in 2016.
Warning: These tools are obsolete, and may or may not work.
  Orig-cwd: TURING:/home/wpurwant/hpc-explore/turing-hw/tests20160826
  Sigs:
    # -rwx------ 1 wpurwant users 214 2016-09-13 11:58:21 show-disabled-nodes2.awk
    # -rwx------ 1 wpurwant users 111 2016-08-26 13:10:14 show-disabled-nodes.sh
    # -rwx------ 1 wpurwant users 122 2016-08-26 10:25:16 show-node-summary.sh
    # -rwx------ 1 wpurwant users 153 2016-08-26 12:06:25 show-non-full-nodes.sh
    # 71c64a7e6cd9df34ca6a9348ebc44ab1  show-disabled-nodes.sh
    # 8a3570829d132c526cc44adf72212887  show-disabled-nodes2.awk
    # ebbc0642cd29a1575670c86629d89c72  show-node-summary.sh
    # 5eee777a8380333654d3a06e3482f38e  show-non-full-nodes.sh
    # dfb36573c6d758bc675f70247ff0befe9ddbe365  show-disabled-nodes.sh
    # d1995fc91d402c9dcc30adc0b461e2beaa25ab67  show-disabled-nodes2.awk
    # 3267d169b476396d66f6127e1343bd7c23b37e1f  show-node-summary.sh
    # 18673a78914d9cce47c95d4fd5146805fc227416  show-non-full-nodes.sh
2021-01-26 10:59:44 -05:00
Wirawan Purwanto
9429ad3697 * Added capability to parse & create scratch dirs. 2020-12-09 17:37:06 -05:00
Wirawan Purwanto
7dc03821a8 * g09slurm initial update: correctly parse nprocshared / nproc in link0
command in a case-insensitive manner.
2020-12-09 13:35:21 -05:00
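
Case-insensitive parsing of the nprocshared / nproc link0 command, as described in the g09slurm entry above, might look like the following sketch. It is an illustration only, not the g09slurm code: the function name `parse_nproc`, the default of 1, and the choice to use the first matching line are all assumptions.

```python
import re

# Matches link0 lines such as "%NProcShared=8" or "%nproc = 4",
# regardless of letter case. "%NProcLinda=..." is deliberately not matched.
_NPROC_RE = re.compile(
    r"^\s*%\s*nproc(?:shared)?\s*=\s*(\d+)",
    re.IGNORECASE | re.MULTILINE,
)


def parse_nproc(input_text, default=1):
    """Return the processor count requested in a Gaussian input text.

    Hypothetical helper: uses the first matching link0 line and falls
    back to `default` when none is present (both are assumptions).
    """
    match = _NPROC_RE.search(input_text)
    return int(match.group(1)) if match else default
```

A wrapper script could pass the returned value on to the job scheduler as the number of CPUs to request.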
19eacfbf54 * Committed: Original user-invocable "g09slurm" from Wahab
and the "g09.slurm" backend script.

  Signatures:
    # -rwxr-xr-x 1 wpurwant users  394 2020-09-10 01:06:20 g09.slurm
    # -rwxr-xr-x 1 wpurwant users 1120 2019-09-16 15:24:20 g09slurm
    # 387f828b04454678b23b05af61e4f334  g09.slurm
    # 9ee4bfcfad766165a81239cb13d7ef30  g09slurm
    # e0dbc8b24aeecbbecf7255a3e3f84dc3fc3a422f  g09.slurm
    # d5a33e554766c41f17f6839242c5c7f4d60e7fe4  g09slurm

  g09.slurm is originally located at
  /cm/shared/apps/gaussian/g09revD.01/script/g09/script/g09.slurm
2020-12-09 13:32:14 -05:00
Wirawan Purwanto
19c833c3ff * sinfo-report-node-stats.sh: Simple tool to report status of compute
nodes based on SLURM's "sinfo" output.
2020-06-29 13:30:04 -04:00
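
A node-status report based on "sinfo" output, as in the sinfo-report-node-stats.sh entry above, can be sketched as follows. The actual script is a shell script whose contents are not shown here; this Python sketch assumes the text comes from `sinfo -N -h -o "%N %T"` (one "node state" pair per line), and the function name `count_node_states` is hypothetical.

```python
from collections import Counter


def count_node_states(sinfo_text):
    """Count nodes per state from 'node state' lines.

    Assumed input: output of `sinfo -N -h -o "%N %T"`. With -N, a node
    that belongs to several partitions is listed more than once, so
    nodes are deduplicated by name before counting.
    """
    state_of = {}
    for line in sinfo_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        node, state = fields[0], fields[1]
        # A trailing "*" marks a non-responding node; fold it into the
        # base state for this summary (a simplification).
        state_of[node] = state.rstrip("*")
    return Counter(state_of.values())
```

Printing `count_node_states(text).most_common()` then gives a short summary ordered by how many nodes are in each state.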
Wirawan Purwanto
18d79dd34b * In custom sacct1.sh, also include the job's most recent state. 2020-06-21 17:45:23 -04:00
Wirawan Purwanto
d384d0320d * Added explicit headless option (--headless).
* Increased wait time to 5 minutes.
2020-01-17 13:58:59 -05:00
Wirawan Purwanto
c2a5ae8863 * slurm: Modified fields to print by default. 2019-11-25 11:12:43 -05:00
Wirawan Purwanto
32b82db7a3 * slurm: Added custom sacct wrapper script which contains my preferences. 2019-07-19 14:58:47 -04:00
Wirawan Purwanto
db2ca075ed * Must re-add SLURM if it is not loaded. 2019-04-25 17:35:34 -04:00
Wirawan Purwanto
a65338a8bf * Added "wgo" (what's going on) to check the status of processes
on the login node. Currently intended for use on Turing only.
2019-04-25 17:34:23 -04:00
Wirawan Purwanto
1387997010 * Prints an error message and quits if Jupyter
does not start within 2 minutes.
2019-03-27 13:49:19 -04:00
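
The "give up if Jupyter did not start after 2 minutes" behavior in the entry above is a poll-with-timeout pattern. A minimal sketch follows; it is not taken from launch_jupyter, and the name `wait_until`, the 1-second polling interval, and the idea of passing the readiness check as a callable are assumptions.

```python
import time


def wait_until(predicate, timeout=120.0, interval=1.0):
    """Poll `predicate` until it returns True or `timeout` seconds pass.

    Returns True on success and False on timeout. The caller decides
    how to report the failure (e.g. print an error message and exit).
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Hypothetical usage: the readiness check here is a placeholder.
# if not wait_until(lambda: os.path.exists("jupyter.log")):
#     sys.exit("Error: Jupyter did not start within 2 minutes.")
```

Using `time.monotonic()` rather than wall-clock time keeps the timeout correct even if the system clock is adjusted while waiting.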
Wirawan Purwanto
ba6c9f53ed * Added accommodation for Anaconda as well as site-provided python.
* Added support for fully headless mode (connect via local browser).
* Added some safeguards against job failing to start due to executable
  not found, etc.
2019-03-27 13:28:33 -04:00
Wirawan Purwanto
1c8a5da492 * Imported launch_jupyter from Turing.
Originally furnished by Min Dong, 2019-03-01 14:30 EST.
2019-03-27 11:25:32 -04:00
Wirawan Purwanto
28eb7a0d98 * sq: Customizable squeue wrapper: introduce new defaults / default
behavior on squeue.
2018-07-20 11:20:22 -04:00
Wirawan Purwanto
94e0aa9490 * jupyter-anaconda2: A script that will start Jupyter notebook process
for Anaconda2 distribution.

  Note: for now it is a script that has to be submitted in the compute
  node. I will upgrade this to become a self-submitting script eventually.
2018-07-20 09:19:59 -04:00
Wirawan Purwanto
95034685ff * interact: Tool to allocate an interactive session on a regular
compute node under SLURM.
2018-06-01 12:50:56 -04:00
Wirawan Purwanto
dbce662c5a * interact-gpu: Tool to allocate an interactive session on a GPU
compute node under SLURM.
2018-06-01 12:49:31 -04:00
Wirawan Purwanto
82ea3bc689 * Archived: SGE version of pwscf-5.3 script. 2017-05-24 14:37:32 -04:00
Wirawan Purwanto
9c82c4d465 * Update bash module support: with recent changes on Turing,
'module' seems to be supported out of the box for bash.
  If 'module' environment is detected, we skip the initiation step.
2017-05-24 14:28:24 -04:00
Wirawan Purwanto
68c0e70d4d * Added "regular" runsas which runs with more limited memory. 2016-11-10 12:48:43 -05:00
Wirawan Purwanto
27b8ccd6ae * Added runsas-himem from earlier consultation this year. 2016-11-10 12:45:29 -05:00
Wirawan Purwanto
df6facce86 * pwscf: Ad-hoc fix for Turing after 2016 upgrade.
We force using the old (TCL) module system since the new module
  system (LMOD) always executes itself whenever a bash batch script
  is executed on Turing right now.
2016-11-07 11:57:34 -05:00
Wirawan Purwanto
739d765f53 * Added convenience for gathering & analyzing CPUs on the cluster.
* Documentation update.
2016-10-31 15:21:10 -04:00
Wirawan Purwanto
aa597b907c * In hoststats subcommand: Also print node status flags if they exist. 2016-10-20 10:11:18 -04:00