troubleshooting

re-running samples

Always good to check your outputs.

In the trimmomatic step of our pipeline, we use our catcher function to check whether all samples were processed. Because catcher returns the IDs of any missing samples, an empty result (no sample IDs returned) suggests that everything went fine.

# again, using the catcher function we wrote in the programme set-up stage.
catcher $filt 

# looks okay, but...

However, if we check the SLURM log outputs (the .slrm files) from the trimmomatic step, we see a different story:

# search the trimmomatic logs for errors ("rror" matches both "Error" and "error")
grep "rror" ./omm__2*

>  ./omm__2__trimmo4.1228821.slrm:slurmstepd: error: *** JOB 1228821 ON adacompute01 CANCELLED AT 2026-06-16T17:09:21 DUE TO TIME LIMIT ***
>  ./omm__2__trimmo4.1228846.slrm:slurmstepd: error: *** JOB 1228846 ON adacompute01 CANCELLED AT 2026-06-16T17:36:51 DUE TO TIME LIMIT ***
>  ./omm__2__trimmo4.1228851.slrm:slurmstepd: error: *** JOB 1228851 ON adacompute01 CANCELLED AT 2026-06-16T17:39:51 DUE TO TIME LIMIT ***
>  ./omm__2__trimmo4.1228853.slrm:slurmstepd: error: *** JOB 1228853 ON adacompute01 CANCELLED AT 2026-06-16T17:43:21 DUE TO TIME LIMIT ***

We have a problem: 4 samples (note that the IDs are not mentioned) took longer than the time limit we defined in our SLURM script, so the HPC killed those jobs. However, because trimmomatic had already started running, some output files were produced, and these were picked up by catcher, which is why it looked like everything was fine. We can assume that these four files are broken, malformed, incomplete, or all of the above.
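
Rather than only assuming the files are broken, their integrity can be tested directly. The following is a minimal sketch, assuming the trimmomatic outputs in $filt are gzip-compressed (names ending in .gz); check_gz is a made-up helper name, not part of the pipeline.

```shell
# check_gz: hypothetical helper, not part of the pipeline.
# Assumes outputs are gzip-compressed; prints the path of every .gz file
# in the given directory that fails `gzip -t` (a file cut off mid-write
# by a killed job fails this test).
check_gz() {
  local dir="$1" f
  for f in "$dir"/*.gz; do
    [ -e "$f" ] || continue              # no .gz files at all: nothing to report
    gzip -t "$f" 2>/dev/null || echo "$f"
  done
}

# usage: check_gz "$filt"
```

A file that passes only shows that its compression stream is intact, not that every read was written, so treat this as a quick first check.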

So what do we do? We need to get those IDs, increase the time limit on trimmomatic, and re-submit those 4 samples to SLURM. We could also consider deleting the incomplete outputs, just to be safe.

You can do this manually if you like (it would take about three minutes), but for fun’s sake we’re going to script it here (an hour to code, +1 second to run).

# the -l option just returns the names of the matching logfiles - we grab the IDs from those.
grep -l "rror" ./omm__2*

# add that grep into a for-loop and search for sample IDs:
for slrm in $( grep -l "rror" ./omm__2* ) ;
do
  grep -E "__join/.*_R1" "$slrm" ;
done


## no good solution yet :(
# # challenge: the output is long, with multiple instances of each ID. How to parse it properly?
# for slrm in $( grep -l "rror" ./omm__2* ) ;
# do
# grep -E ".*__join\/.*_R1.*" $slrm | sed -E 's/.*__join\/(.*)_R1.*/########\1#######/g' ;
# done
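
One possible approach to the parsing challenge above, as a sketch only: print just the matching part of each line with grep -o, strip the surrounding text with sed, and collapse the repeats with sort -u. It assumes the log lines contain input paths of the form ...__join/<ID>_R1..., as the pattern above implies, and that IDs contain no spaces or slashes. failed_ids is a made-up name.

```shell
# failed_ids: hypothetical helper, not part of the pipeline.
# Given SLURM logfiles as arguments, prints each sample ID found in the
# logs that contain "rror", once per ID.
failed_ids() {
  local slrm
  for slrm in "$@"; do
    # skip logs that contain no error
    grep -q "rror" "$slrm" || continue
    # -o keeps only the matched text; sed then drops the directory
    # prefix and the _R1 read suffix, leaving the bare ID
    grep -oE '__join/[^/ ]+_R1' "$slrm" | sed -E 's|^__join/||; s|_R1$||'
  done | sort -u
}

# usage: failed_ids ./omm__2*
```

If the logs are laid out differently, the pattern will need adjusting, so copy and paste remains a perfectly good fallback.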

Copy and paste and write it down. No shame.

vaginal-69-Visit-1_S36
vaginal-95-Visit-1_S45
vaginal-98-Visit-3_S96
vaginal-9-Visit-3_S56

First, inspect and then delete the broken versions:

# always inspect before deleting something - just to be sure.
# (lk looks like a listing alias; if it is not defined in your shell, ls -l does the same job)
lk $filt/{vaginal-69-Visit-1_S36,vaginal-95-Visit-1_S45,vaginal-98-Visit-3_S96,vaginal-9-Visit-3_S56}*

# Note: $filt/{a,b,c,d}* first expands to $filt/a* $filt/b* $filt/c* $filt/d* (a bash trick called brace expansion); each * then matches any file starting with that ID (globbing).

# if you're sure, delete them!
rm $filt/{vaginal-69-Visit-1_S36,vaginal-95-Visit-1_S45,vaginal-98-Visit-3_S96,vaginal-9-Visit-3_S56}*

Then increase the max time limit in the trimmo4.sh script:

nano $mat/ucc__fbio__slurm__trimmo4.sh    # from 20 to 30?
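
The same change can be made without opening an editor. The sketch below assumes the script requests its time with an `#SBATCH --time=` directive given in minutes; the script itself is not shown here, so both the directive form and the unit are assumptions. It works on a throwaway stand-in file so nothing real is modified.

```shell
# Build a stand-in for the real script (hypothetical contents).
demo=$(mktemp)
printf '%s\n' '#!/bin/bash' '#SBATCH --time=20' 'echo "trimming $1"' > "$demo"

# New limit; a bare number is read by SLURM as minutes.
new_limit=30

# Rewrite the time request in place (-i as used here is GNU sed syntax).
sed -i -E "s/^#SBATCH --time=.*/#SBATCH --time=${new_limit}/" "$demo"

# Show the edited directive.
grep '^#SBATCH --time' "$demo"
```

Alternatively, options given on the sbatch command line take precedence over #SBATCH directives in the script, so passing --time=30 to sbatch when resubmitting should raise the limit for those jobs without editing the file at all.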

Then refire with SLURM:

for id in vaginal-69-Visit-1_S36 vaginal-95-Visit-1_S45 vaginal-98-Visit-3_S96 vaginal-9-Visit-3_S56 ; 
do 
  sbatch $mat/${proj}__slurm__trimmo4.sh $id $join $filt ; 
done

Seems to work fine…

connection issues

Under construction: solutions can go here in the future.

corrupted `FASTQ`

Under construction: solutions can go here in the future.

`climber_v0.4`

`jfg`, Jun 2026
