Run jobs on ifarm
Slurm jobs
Here is a memo on running jobs with Slurm. Useful links:
- JLab Scientific Computing website
- Simple (very) Slurm start
- Check that you are not in jobLimit list
- Check jobs by username
- JLab slurm info page (partitions, nodes, etc)
- Slurm commands cheat sheet
- More explanation from JLab
Common commands:
# Start a job
sbatch try_slurm
# see jobs
squeue -u romanov
# Stop/cancel jobs
scancel 41887209
# Check which account the user belongs to
sacctmgr show user romanov
# Check existing partitions
sinfo -s
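The commands above can be chained in a small script. Below is a minimal sketch, assuming it runs on a machine where Slurm is installed; `submit_job` is a hypothetical helper name, while `--parsable` is a standard sbatch option that prints only the job ID (optionally followed by `;cluster`).

```shell
#!/bin/bash
# Sketch: submit a batch script and keep its job ID for later commands.
# submit_job is a hypothetical helper, not part of Slurm.
submit_job() {
    local script="$1"
    local job_id
    # --parsable makes sbatch print "jobid" or "jobid;cluster" and nothing else
    job_id=$(sbatch --parsable "$script") || return 1
    # Drop the optional ";cluster" suffix
    echo "${job_id%%;*}"
}

# Usage on a machine with Slurm:
#   id=$(submit_job try_slurm)
#   squeue -j "$id"     # show only this job
#   scancel "$id"       # cancel it
```

Keeping the ID in a variable avoids copying numbers such as 41887209 by hand.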
Job output can be redirected with #SBATCH --output=<filename-pattern>. By default both standard output and standard error are written to the same file, but a separate #SBATCH --error=<filename-pattern> can be added to split the two streams. In the pattern, %j is replaced by the job ID:
#SBATCH --output=/u/scratch/romanov/test_slurm/%j.out
#SBATCH --error=/u/scratch/romanov/test_slurm/%j.err
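To see the two directives in context, here is a minimal sketch that writes a tiny batch script to a temporary file and checks its syntax. The job name `test_streams` is made up for illustration, and the sketch assumes the output directory already exists when the job is actually submitted.

```shell
#!/bin/bash
# Write a minimal batch script to a temporary file (location chosen by mktemp)
job_script=$(mktemp)
cat > "$job_script" <<'EOF'
#!/bin/bash
#SBATCH --job-name=test_streams
#SBATCH --output=/u/scratch/romanov/test_slurm/%j.out
#SBATCH --error=/u/scratch/romanov/test_slurm/%j.err
echo "this line goes to the .out file"
echo "this line goes to the .err file" >&2
EOF

# Check the syntax only; nothing is submitted here
bash -n "$job_script" && echo "syntax OK: $job_script"
```

The file could then be submitted with `sbatch "$job_script"`; under plain bash the #SBATCH lines are ordinary comments.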
Run eic_shell on slurm
There are several ways to run eic_shell under Slurm on ifarm.
For Meson-Structure campaigns we create two scripts: one for batch submission and another that defines what to do inside the container (both are listed below). It is also possible to run everything from a single script submitted to Slurm.
eic_shell is a thin wrapper around singularity (now apptainer) containers. Instead of eic_shell, a direct singularity command can be used:
singularity exec -B /host/dir:/container/dir {image} script_to_run.sh
Where:
- -B binds an existing host directory to a path in the container. All paths that are used must be bound. The container also does not follow links that point to something that is not bound.
- {image} is the container image. Images are available on ifarm on CVMFS:
# eic_shell images location:
ls /cvmfs/singularity.opensciencegrid.org/eicweb/
# the latest `nightly` image:
/cvmfs/singularity.opensciencegrid.org/eicweb/eic_xl:nightly
# release or stable images are available by names like:
/cvmfs/singularity.opensciencegrid.org/eicweb/eic_xl:25.07-stable
- script_to_run.sh is your script to run in the eic shell.
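Because links that point outside the bound directories do not resolve in the container, it can help to resolve them on the host first. The following is a minimal sketch, assuming GNU `readlink -f` is available; `build_bind_args` and the `BIND_ARGS` array are hypothetical names introduced only for this illustration.

```shell
#!/bin/bash
# build_bind_args (hypothetical helper): resolve every given directory to its
# real path and collect "-B real:real" pairs in the BIND_ARGS array.
build_bind_args() {
    BIND_ARGS=()
    local dir real
    for dir in "$@"; do
        # Follow symlinks so the bound path is the real target
        real=$(readlink -f "$dir") || return 1
        if [ ! -d "$real" ]; then
            echo "not a directory: $dir" >&2
            return 1
        fi
        BIND_ARGS+=(-B "$real:$real")
    done
}

# Usage on ifarm (the image and script are placeholders, as in the text above):
#   build_bind_args /volatile/eic/romanov/meson-structure-2025-07
#   singularity exec "${BIND_ARGS[@]}" {image} script_to_run.sh
```

Binding each directory to the same path inside the container keeps absolute paths and link targets valid on both sides.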
Here is an example (it is wordy, but it shows real ifarm paths and images):
singularity exec \
-B /volatile/eic/romanov/meson-structure-2025-07:/volatile/eic/romanov/meson-structure-2025-07 \
/cvmfs/singularity.opensciencegrid.org/eicweb/eic_xl:25.07-stable \
/volatile/eic/romanov/meson-structure-2025-07/reco/k_lambda_18x275_5000evt_073.container.sh
Instead of script_to_run.sh one can put commands directly, but this can be tricky in terms of quotes, special symbols, etc. Here is an example from csv_convert/convert_campaign.slurm.sh:
singularity exec -B "$CAMPAIGN":/work -B "$CSV_CONVERT_DIR":/code "$IMG" \
bash -c 'cd /code && python3 convert_campaign.py /work && cd /work && for f in *.csv; do zip "${f}.zip" "$f"; done'
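One way to reduce quoting problems is to keep the inner command single-quoted and pass values as positional parameters of `bash -c`. The sketch below uses plain bash in place of the container so that it can be tried anywhere; the demo directory and file names are made up for illustration.

```shell
#!/bin/bash
# The inner script is single-quoted, so the outer shell expands nothing in it.
# Values arrive as positional parameters; "_" fills $0 of the inner shell.
inner='cd "$1" && for f in *.csv; do echo "would zip: $f"; done'

# Demo directory with a file name that contains a space
demo_dir=$(mktemp -d)
touch "$demo_dir/a b.csv" "$demo_dir/c.csv"

# Plain bash stands in for the container here. With a container it would be:
#   singularity exec -B "$CAMPAIGN":/work "$IMG" bash -c "$inner" _ /work
result=$(bash -c "$inner" _ "$demo_dir")
echo "$result"
```

The same inner string can be tested outside the container first, which makes quoting mistakes easier to spot.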
For simulation campaigns, full-sim-pipeline/create_jobs.py creates such jobs for each hepmc file.
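The idea of one job per hepmc file can be sketched in a few lines of bash. This is not the actual create_jobs.py, only a simplified stand-in under stated assumptions: `make_jobs` is a hypothetical function, and the generated script contains far fewer options than the full example in this document.

```shell
#!/bin/bash
# make_jobs (hypothetical): write one Slurm script per *.hepmc file.
# Usage: make_jobs <dir_with_hepmc> <output_dir> <image>
make_jobs() {
    local in_dir="$1" out_dir="$2" image="$3"
    local hepmc name
    mkdir -p "$out_dir"
    for hepmc in "$in_dir"/*.hepmc; do
        [ -e "$hepmc" ] || continue   # skip the literal pattern if nothing matches
        name=$(basename "$hepmc" .hepmc)
        # Unquoted heredoc: variables are expanded while the file is written
        cat > "$out_dir/$name.slurm.sh" <<EOF
#!/bin/bash
#SBATCH --job-name=$name
#SBATCH --output=$out_dir/$name.slurm.log
#SBATCH --error=$out_dir/$name.slurm.err
singularity exec -B "$out_dir":"$out_dir" "$image" "$out_dir/$name.container.sh"
EOF
        echo "$out_dir/$name.slurm.sh"
    done
}

# Usage: print the generated script names, then submit each with sbatch
#   make_jobs /path/to/hepmc /path/to/reco /path/to/image
```

The matching `.container.sh` files are assumed to be generated separately.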
Full scripts example
Full script to start a Slurm job (it can still be run under plain bash for debugging purposes):
#!/bin/bash
#SBATCH --account=eic
#SBATCH --partition=production
#SBATCH --job-name=k_lambda_18x275_5000evt_073
#SBATCH --time=24:00:00
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=5G
#SBATCH --output=/volatile/eic/romanov/meson-structure-2025-07/reco/k_lambda_18x275_5000evt_073.slurm.log
#SBATCH --error=/volatile/eic/romanov/meson-structure-2025-07/reco/k_lambda_18x275_5000evt_073.slurm.err
set -e
# Ensure singularity is available
if ! command -v singularity &> /dev/null; then
echo "singularity not found. Please load the module or install singularity."
exit 1
fi
echo "Running job k_lambda_18x275_5000evt_073 on $(hostname)"
echo "Using container image: /cvmfs/singularity.opensciencegrid.org/eicweb/eic_xl:25.07-stable"
echo "Executing container-run script: /volatile/eic/romanov/meson-structure-2025-07/reco/k_lambda_18x275_5000evt_073.container.sh"
# Execute the container-run script inside the container.
singularity exec -B /volatile/eic/romanov/meson-structure-2025-07:/volatile/eic/romanov/meson-structure-2025-07 /cvmfs/singularity.opensciencegrid.org/eicweb/eic_xl:25.07-stable /volatile/eic/romanov/meson-structure-2025-07/reco/k_lambda_18x275_5000evt_073.container.sh
echo "Slurm job finished for k_lambda_18x275_5000evt_073!"
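When the same script is run both through sbatch and directly under bash, it can be useful to log which mode is active. Here is a minimal sketch, relying on the `SLURM_JOB_ID` environment variable that Slurm sets inside a job; `describe_run_mode` is a hypothetical helper name.

```shell
#!/bin/bash
# describe_run_mode (hypothetical): report whether the script runs inside a
# Slurm job. Under plain bash the #SBATCH lines are ordinary comments and
# SLURM_JOB_ID is not set.
describe_run_mode() {
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        echo "slurm job ${SLURM_JOB_ID}"
    else
        echo "interactive bash run (no Slurm)"
    fi
}

echo "Run mode: $(describe_run_mode)"
```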
Script that defines what to do inside the container:
#!/bin/bash
set -e
# This script is intended to run INSIDE the Singularity container.
echo "Sourcing EIC environment..."
# Adjust the path if needed:
source /opt/detector/epic-main/bin/thisepic.sh
echo ">"
echo "=ABCONV===================================================================="
echo "==========================================================================="
echo " Running afterburner on:"
echo " /volatile/eic/romanov/meson-structure-2025-07/eg-hepmc-priority-18x275/k_lambda_18x275_5000evt_074.hepmc"
echo " Resulting files:"
echo " /volatile/eic/romanov/meson-structure-2025-07/reco/k_lambda_18x275_5000evt_074.afterburner.*"
/usr/bin/time -v abconv /volatile/eic/romanov/meson-structure-2025-07/eg-hepmc-priority-18x275/k_lambda_18x275_5000evt_074.hepmc --output /volatile/eic/romanov/meson-structure-2025-07/reco/k_lambda_18x275_5000evt_074.afterburner 2>&1
echo ">"
echo "=NPSIM====================================================================="
echo "==========================================================================="
echo " Running npsim on:"
echo " /volatile/eic/romanov/meson-structure-2025-07/reco/k_lambda_18x275_5000evt_074.afterburner.hepmc"
echo " Resulting file:"
echo " /volatile/eic/romanov/meson-structure-2025-07/reco/k_lambda_18x275_5000evt_074.edm4hep.root"
echo " Events to process:"
echo " 5000"
/usr/bin/time -v npsim --compactFile=$DETECTOR_PATH/epic.xml --runType run --inputFiles /volatile/eic/romanov/meson-structure-2025-07/reco/k_lambda_18x275_5000evt_074.afterburner.hepmc --outputFile /volatile/eic/romanov/meson-structure-2025-07/reco/k_lambda_18x275_5000evt_074.edm4hep.root --numberOfEvents 5000 2>&1
echo ">"
echo "=EICRECON=================================================================="
echo "==========================================================================="
echo " Running eicrecon on:"
echo " /volatile/eic/romanov/meson-structure-2025-07/reco/k_lambda_18x275_5000evt_074.edm4hep.root"
echo " Resulting files:"
echo " /volatile/eic/romanov/meson-structure-2025-07/reco/k_lambda_18x275_5000evt_074.edm4eic.root"
/usr/bin/time -v eicrecon -Ppodio:output_file=/volatile/eic/romanov/meson-structure-2025-07/reco/k_lambda_18x275_5000evt_074.edm4eic.root /volatile/eic/romanov/meson-structure-2025-07/reco/k_lambda_18x275_5000evt_074.edm4hep.root 2>&1
echo "All steps completed for k_lambda_18x275_5000evt_074!"
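The three steps above repeat the same banner-and-run pattern, which could be factored into a small function. This is only a sketch of that idea, not the generated script itself: `run_step` is a hypothetical helper, and it omits the `/usr/bin/time -v` wrapper used above.

```shell
#!/bin/bash
# run_step (hypothetical): print a banner, show the command, then run it.
# Usage: run_step <LABEL> <command> [args...]
run_step() {
    local label="$1"
    shift
    echo ">"
    echo "=${label}="
    echo " Command: $*"
    # Run the command as given; stderr is merged into stdout for the log.
    # The exit status of the command becomes the status of run_step.
    "$@" 2>&1
}

# Usage inside the container script (with set -e a failing step stops the job):
#   run_step ABCONV abconv input.hepmc --output output.afterburner
run_step DEMO echo "hello from a step"
```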