WORKFLOW-199: polling every X seconds for job completion #138

Open
wants to merge 1 commit into master

Conversation

bahamas10
Contributor

This adds the ability to pass two new parameters to POST /jobs when creating a new job via the API:

  • wait: set this to true to have the POST request block while the job completes
  • callback_urls: an optional array of endpoints to call, by the runner, when a job completes.

A caller of the API service can now create a job like this:

{
    "workflow": "82280dc4-f5d9-4def-b5c1-a1361db9dd27",
    "wait": true,
    "callback_urls": [
        "http://localhost:8080/jobdone"
    ]
}

where wait tells the API to return the headers immediately (the same headers that POST /jobs would normally give), but to hold the body until the job has completed. The body of the response will be the same body that GET /jobs/:uuid would supply, i.e. a JSON body that gives the exit status of the job (completed, failed, etc.)
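
A rough sketch of the "headers now, body later" behavior, assuming a plain Node http.ServerResponse. The function name, the onJobDone hook, and the 200 status code are all made up for illustration; the PR's actual handler may look nothing like this.

```javascript
// Hypothetical sketch, not code from this PR.
// `res` is assumed to be a Node http.ServerResponse.
// `onJobDone` is an assumed hook: it takes a function and calls it
// once, with the finished job, when the job completes.
const ASSUMED_STATUS = 200; // placeholder, the real code may differ

function sendWaitingResponse(res, headers, onJobDone) {
    // Send the status line and headers right away...
    res.writeHead(ASSUMED_STATUS, headers);
    res.flushHeaders();

    // ...but hold the body until the job has finished.
    onJobDone((job) => {
        res.end(JSON.stringify(job));
    });
}
```

The caller sees the usual headers immediately, and the connection simply stays open until the body arrives.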

callback_urls should contain the URL of the API service's /jobdone endpoint (the runner calls every URL in callback_urls when the job completes). /jobdone is a new endpoint on the API service that takes a POST request with the output of a job, and uses that to end any requests that are currently waiting on it.
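
One way to picture the bookkeeping behind /jobdone is a small in-memory registry keyed by job uuid. This is only a sketch under assumptions: createWaitRegistry, wait, done, pending, and the status field are hypothetical names, not identifiers from this PR.

```javascript
// Hypothetical sketch, not code from this PR.
// Tracks callers blocked on POST /jobs (wait: true) by job uuid.
function createWaitRegistry() {
    const waiting = new Map(); // job uuid -> array of callbacks

    return {
        // Register a waiting request for the given job.
        wait(uuid, callback) {
            if (!waiting.has(uuid)) {
                waiting.set(uuid, []);
            }
            waiting.get(uuid).push(callback);
        },

        // Meant to be called by the POST /jobdone handler with the
        // finished job. Returns how many waiters were released.
        done(job) {
            const callbacks = waiting.get(job.uuid) || [];
            waiting.delete(job.uuid);
            callbacks.forEach((callback) => callback(job));
            return callbacks.length;
        },

        // Number of jobs that still have at least one waiter.
        pending() {
            return waiting.size;
        }
    };
}
```

Because the registry lives in process memory, a waiter is only released if the runner's callback reaches the same API process that accepted the original request.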

todo

  • add tests
  • add ability to handle headers already sent with restify (if it makes sense)
  • limit to 80 columns
  • i think node times out open HTTP requests after 2 minutes by default, so that timeout should be raised or disabled for waiting requests
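
For the timeout item, a minimal sketch of raising the limit on a bare Node http.Server. Older Node releases default server.timeout to 120000 ms (2 minutes), while newer ones default to 0 (no timeout). The 10 minute value is only an example, and how this maps onto the restify server used here is not shown.

```javascript
const http = require('http');

// Assumed value: 10 minutes is an example, not a figure from the PR.
const WAIT_TIMEOUT_MS = 10 * 60 * 1000;

const server = http.createServer((req, res) => {
    res.end('ok\n');
});

// Raise the idle socket timeout for connections on this server.
// Passing 0 instead would disable the timeout entirely.
server.setTimeout(WAIT_TIMEOUT_MS);
```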

caveats

  • if the runner crashes after a job completes, but before the callbacks are fired, they will never trigger
  • specifying wait: true without listing the API service in callback_urls will result in the request hanging indefinitely

@kusor
Contributor

kusor commented Jun 17, 2015

@bahamas10 can't we modify it so that when wait: true is given w/o callback_urls we either return a 400 Bad Request or behave as if wait: true weren't present?

Additionally, we should check whether we can keep HTTP requests open for more than 2 minutes; otherwise this will not be very useful for jobs like "SDC import images", where the average run time is almost always longer than that.

And, speaking of such jobs: do you think it would make sense to add a variable to the workflow definition that allows or disallows using wait for a given type of job?

Everything else looks fine to me. Remember to add some test case and fix the 80 columns issues, please ;-)

@bahamas10 bahamas10 force-pushed the dave.eddy-1434050760 branch from ca779f3 to 5c169d2 Compare June 17, 2015 19:22
@bahamas10
Contributor Author

@kusor thanks for the review. I've added the logic to throw a ConflictError (409) if wait is specified but callback_urls is not (or is set but is an empty array or something equally bad). The rest of the code uses that status code to signify missing or conflicting parameters.
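
Roughly, the check described above could look like the sketch below. checkWaitParams and the shape of its return value are hypothetical; the actual change throws a ConflictError, which is not reproduced here.

```javascript
// Hypothetical sketch, not the code from this PR.
// Returns null when the parameters are acceptable, or an object
// describing the 409 conflict otherwise.
function checkWaitParams(params) {
    if (!params.wait) {
        return null; // nothing to validate without wait
    }

    const urls = params.callback_urls;
    if (!Array.isArray(urls) || urls.length === 0) {
        return {
            statusCode: 409,
            message: '"wait" requires a non-empty "callback_urls" array'
        };
    }

    return null;
}
```

Rejecting the request up front avoids the "hangs indefinitely" caveat for the case where no callback URLs are given at all. It does not verify that one of the URLs actually points back at the API service.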

I'll test the 2 minute timeout, and also look into restricting blocking jobs in the workflow definition.

2 participants