Tag: Slurm

  • Killing Jobs Using Slurm

    Cancel a pending or running job: to delete a job, use "scancel" followed by the job ID. Cancel all of your pending and running jobs: to delete all your jobs across all partitions at once, in case they were submitted by mistake, use "scancel" with the "--user" option, which terminates all of your jobs, both pending and running.
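A minimal sketch of the two cancellation commands described above (the job ID 123456 is a placeholder):

```shell
# Cancel a single pending or running job by its job ID
scancel 123456

# Cancel all of your jobs (pending and running) across all partitions
scancel --user=$USER
```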

  • Sbatch Options

    The following table can be used as a reference for the basic flags available to sbatch, salloc, and a few other commands. To get a better understanding of the commands and their flags, use the "man" command while logged into Discover; for more information on sbatch, refer to its man page. Use the…
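As an illustration, a few of the basic flags shared by these commands (all standard Slurm options; the values and script name are placeholders):

```shell
# sbatch, salloc, and srun accept the same core resource flags
sbatch --ntasks=8 --time=01:00:00 --job-name=myjob job.sh   # batch submission
salloc --nodes=2 --time=00:30:00                            # interactive allocation
```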

  • Slurm Example Scripts

    Serial Job Script By default, Slurm executes your job from the directory where you submitted it. You can change the working directory by "cd"-ing to it in the script, or by specifying the --workdir option for SBATCH. OpenMP Job Script Note: The option "--cpus-per-task=n" advises the Slurm controller that ensuing job steps will require "n"…
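A minimal OpenMP job script along the lines described above (the executable name is a placeholder; partition and account flags are omitted and will vary by site):

```shell
#!/bin/bash
#SBATCH --job-name=omp-test
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8      # advise Slurm that the job step needs 8 CPUs
#SBATCH --time=00:10:00

# Match the OpenMP thread count to the CPUs Slurm allocated
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./my_openmp_program            # placeholder executable
```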

  • Srun Environment Variables

    The following information is largely replicated from SchedMD’s srun man page, and is the subset that is likely most relevant to most NCCS users. The srun command honors the following environment variables, when present (these override any inline directives within your batch script, but will be overridden by those also specified on the srun command…
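For example, srun honors SLURM_NTASKS from the environment (equivalent to -n/--ntasks), and an explicit command-line flag overrides it, matching the precedence described above:

```shell
export SLURM_NTASKS=4   # picked up by srun as if -n 4 were given
srun hostname           # runs 4 tasks
srun -n 2 hostname      # the command-line flag wins: runs 2 tasks
```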

  • Salloc Environment Variables

    The following information is largely replicated from SchedMD’s salloc man page, and is the subset that is likely most relevant to most NCCS users. The salloc command honors the following environment variables, when present (these override any inline directives within your batch script, but will be overridden by those also specified on the salloc command…
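As a sketch, salloc reads variables such as SALLOC_PARTITION (equivalent to -p/--partition; "debug" is a placeholder partition name):

```shell
export SALLOC_PARTITION=debug   # same effect as salloc --partition=debug
salloc --nodes=1                # allocation is requested in the debug partition
```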

  • Sbatch Environment Variables

    The following information is largely replicated from SchedMD’s sbatch man page, and is the subset that is likely most relevant to most NCCS users. The sbatch command honors the following environment variables, when present (these override any inline directives within your batch script, but will be overridden by those also specified on the sbatch command…
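For instance, sbatch honors SBATCH_TIMELIMIT (equivalent to -t/--time), and the precedence works as described above: the variable overrides an inline #SBATCH directive but yields to the command line (job.sh is a placeholder script):

```shell
export SBATCH_TIMELIMIT=01:00:00   # same as --time=01:00:00
sbatch job.sh                      # inherits the time limit from the environment
sbatch --time=02:00:00 job.sh      # command-line option overrides the variable
```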

  • Running Jobs on Discover using Slurm

    Submit a job In general, you will create a batch job script. Either a shell script or a Python script is allowed, but throughout the user guide we use only shell scripts for demonstration. You will then submit the batch script using sbatch; the following example requests 2 nodes with at least 2 GB of memory…
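A minimal sketch of such a batch script, submitted with "sbatch job.sh" (the memory figure follows the excerpt; names are placeholders):

```shell
#!/bin/bash
#SBATCH --nodes=2        # request 2 nodes
#SBATCH --mem=2G         # at least 2 GB of memory per node
#SBATCH --time=00:05:00

srun hostname            # run one task per allocated node slot
```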

  • Slurm Best Practices on Discover

    The following approaches allow Slurm’s advanced scheduling algorithm the greatest flexibility to schedule your job to run as soon as possible. Learn how to request Cascade Lake or Milan nodes to run your Slurm job. Inline directives (#SBATCH) should be included at the beginning of your job script. See “man sbatch” for the corresponding command-line options. Feel free to…
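Node generations are typically selected with a feature constraint; a sketch (the feature names "cas" and "mil" are assumptions here, so check Discover's documentation for the actual feature names):

```shell
#SBATCH --constraint=mil   # request Milan nodes ("mil" is an assumed feature name)
# or, for Cascade Lake nodes:
# #SBATCH --constraint=cas
```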

  • Using Slurm

    NCCS provides SchedMD’s Slurm resource manager for users to control their scientific computing jobs and workflows on the Discover supercomputer. This video gives instructions on how users can submit jobs to be scheduled, specifying resource requests such as CPU time and memory, along with other options for optimizing productivity. Use Slurm commands to request both…