Consider this use case: A build has 4 steps which can each be lengthy. Say it takes 5 minutes to do each step.
The last step requires making many API calls to a service that will IP-ban you if too many processes hit it at once, since that violates its throttling rules.
Currently I have to set my whole pipeline to concurrency: limit: 1, so that if I push 2 separate branches at the same time they won’t “DDOS” the server by both reaching Step 4 around the same time. However, my cluster can easily handle running Steps 1-3 for many builds concurrently; it’s only the Step 4s that need to be sequenced. So if I push 5 builds, the last one won’t finish for 100 minutes, even though all 5 builds could have been ready to start Step 4 after 15 minutes and then single-filed through it in another 25.
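The timing claim above can be checked with a quick back-of-the-envelope calculation (5 builds, 4 steps, 5 minutes per step):

```python
# Rough timing comparison for 5 builds of 4 steps, 5 minutes per step.
STEP_MINUTES = 5
BUILDS = 5
STEPS = 4

# Today: a pipeline-level concurrency limit of 1 serializes entire builds,
# so each build's 20 minutes stack end to end.
serialized_total = BUILDS * STEPS * STEP_MINUTES

# With a step-level limit: Steps 1-3 run fully in parallel (15 minutes),
# then the 5 builds single-file through Step 4 (25 minutes).
step_limited_total = (STEPS - 1) * STEP_MINUTES + BUILDS * STEP_MINUTES

print(serialized_total, step_limited_total)  # 100 vs 40 minutes
```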
To make this work, concurrency would also need to be declarable at the step level. The scheduler could check how many instances of this step of this pipeline are currently running before allowing another one to start, just as today it presumably counts how many jobs from this pipeline are running.
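A sketch of what that might look like in a pipeline file. The step-level concurrency key here is the proposed syntax, not something that exists today, and the step names and commands are made up for illustration:

```yaml
# Hypothetical: today `concurrency` is only valid at the pipeline level;
# this imagines the same key allowed on an individual step.
steps:
  - name: step-1
    command: ./build.sh      # Steps 1-3 run under normal scheduling,
  - name: step-2             # concurrently across builds
    command: ./test.sh
  - name: step-3
    command: ./package.sh
  - name: step-4
    command: ./publish.sh    # makes the rate-limited API calls
    concurrency:
      limit: 1               # proposed: at most one Step 4 running
                             # across all builds of this pipeline
```

With this, pushing 5 branches would let every build race through Steps 1-3, and only the final publish steps would queue behind one another.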