Skip to content

Create new job to make cancelled jobs as failure jobs in CI runner#780

Open
ThreeMonth03 wants to merge 1 commit into
solvcon:masterfrom
ThreeMonth03:issue772
Open

Create new job to make cancelled jobs as failure jobs in CI runner#780
ThreeMonth03 wants to merge 1 commit into
solvcon:masterfrom
ThreeMonth03:issue772

Conversation

@ThreeMonth03
Copy link
Copy Markdown
Collaborator

@ThreeMonth03 ThreeMonth03 commented May 14, 2026

In issue #772, we notice that when a job times out, it would be cancelled instead of fail.
Therefore, this pull request is to create new job, that could monitor cancelled jobs, and make cancelled jobs as failure jobs in CI runner.

Copy link
Copy Markdown
Collaborator Author

@ThreeMonth03 ThreeMonth03 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yungyuc Please review this pull request when you are available. Thanks.

Comment thread .github/workflows/nouse_install.yml Outdated
Comment on lines +101 to +109
send_email_on_failure:
needs: [nouse_install, fail_on_cancelled_jobs]
# Run if any of the dependencies failed or timed out in master branch
if: ${{ always() && contains(needs.*.result, 'failure') && github.ref_name == 'master' && github.event.repository.fork == false }}
uses: ./.github/workflows/send_email_on_fail.yml
with:
job_results: |
- nouse_install: ${{ needs.nouse_install.result }}
- fail_on_cancelled_jobs: ${{ needs.fail_on_cancelled_jobs.result }}
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a job is cancelled, a notified email would be sent now.

Comment thread .github/workflows/nouse_install.yml Outdated
Comment on lines +88 to +99
send_email_on_failure:
fail_on_cancelled_jobs:
needs: [nouse_install]
# Run if any of the dependencies failed in master branch
if: ${{ always() }}
runs-on: ubuntu-24.04
steps:
- name: Fail on cancelled jobs
run: |
result="${{ needs.nouse_install.result }}"
echo "nouse_install: ${result}"
if [ "${result}" = "cancelled" ]; then
exit 1
fi
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot make a time-out job as a failure job easily. There are two workaround method.
The first opinion is to create a new job to monitor the cancelled jobs, and it would raise failure when there is a cancelled job.
The second opinion is to time the process by a timer. If the process times out, the timer would return 1.

I prefer the first opinion, so I create a new job to monitor the cancelled jobs.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would work, but is not as elegant as I hoped for.

If there is not a better way, I would take it. But I hope there is.

@yungyuc yungyuc added the build Build system and automation label May 14, 2026
Copy link
Copy Markdown
Member

@yungyuc yungyuc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ExplorerRay @chestercheng Could you please help brainstorm?

Comment thread .github/workflows/devbuild.yml Outdated
fail_on_cancelled_jobs:
needs: [standalone_buffer, build, build_windows]
# Run if any of the dependencies failed in master branch
if: ${{ always() }}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to write the if, if it always() happens?

Copy link
Copy Markdown
Collaborator Author

@ThreeMonth03 ThreeMonth03 May 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is. It's an interesting technique. See this reference.

If a job fails or is skipped, all jobs that need it are skipped unless the jobs use a conditional expression that causes the job to continue. If a run contains a series of jobs that need each other, a failure or skip applies to all jobs in the dependency chain from the point of failure or skip onwards. If you would like a job to run even if a job it is dependent on did not succeed, use the always() conditional expression in jobs.<job_id>.if.

In other words, if we remove if: ${{ always() }}, this job would be skipped when the jobs in needs are cancelled or fail. However, we don't want to skip this job at any time, so we need to add if: ${{ always() }}.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is cool. Could you please add a code comment above the line for the link to the reference? It look so innocent and future maintainers will thank for the point.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, I've added the code comment to make the statement clear.

Comment thread .github/workflows/nouse_install.yml Outdated
Comment on lines +88 to +99
send_email_on_failure:
fail_on_cancelled_jobs:
needs: [nouse_install]
# Run if any of the dependencies failed in master branch
if: ${{ always() }}
runs-on: ubuntu-24.04
steps:
- name: Fail on cancelled jobs
run: |
result="${{ needs.nouse_install.result }}"
echo "nouse_install: ${result}"
if [ "${result}" = "cancelled" ]; then
exit 1
fi
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would work, but is not as elegant as I hoped for.

If there is not a better way, I would take it. But I hope there is.

@ThreeMonth03 ThreeMonth03 force-pushed the issue772 branch 2 times, most recently from 26947d3 to d82bb1d Compare May 15, 2026 07:20
@ExplorerRay
Copy link
Copy Markdown
Collaborator

I think this PR can use https://github.com/marketplace/actions/run-and-post-run-action to solve this so that we don't need to handle a new job in CI.

I am considering adding a step like the following for each job that might timeout. And we may differentiate return code for failure and timeout (I write 2 in the following example).

- name: Failure when job timeout cancelled
  uses: gacts/run-and-post-run@v1
  with:
    post: |
      if [ "${{ job.status }}" == "cancelled" ]; then
        echo "::error::CI job timed out! Forcing failure."
        exit 2
      fi

@ThreeMonth03 ThreeMonth03 force-pushed the issue772 branch 2 times, most recently from e2a463d to 187499c Compare May 16, 2026 06:25
Copy link
Copy Markdown
Collaborator

@ExplorerRay ExplorerRay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to adding post check step into setup dependency composite action.

If there isn't a composite action to set up dependency like Windows, add an extra step like what you original did.

job_results: |
- standalone_buffer: ${{ needs.standalone_buffer.result }}
- build: ${{ needs.build.result }}
- build_windows: ${{ needs.build_windows.result }}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch. I forget this part.

Comment thread .github/workflows/devbuild.yml Outdated

steps:

- name: Fail if job was cancelled
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these lines can be moved into the beginning of setup dependency composite action so that we can set this when running that action.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

build Build system and automation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants