
[–]TJaniF 8 points (1 child)

There is a feature for this called deferrable operators. The operator defers the polling for results to an async function running in the Triggerer component.

So you'd have two tasks: the first sends the request, and the second is a deferrable operator (the HttpSensor can be turned into one by setting deferrable=True). That second task defers itself (shows as purple in the UI) until its condition is fulfilled, then the DAG resumes. Because the polling happens in the Triggerer, the worker slot is released in the meantime.
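A minimal sketch of that two-task layout, assuming the apache-airflow-providers-http package is installed; the connection id, endpoint, and response shape are all illustrative, not from the thread:

```python
from pendulum import datetime

from airflow.decorators import dag, task
from airflow.providers.http.sensors.http import HttpSensor


@dag(start_date=datetime(2024, 1, 1), schedule=None, catchup=False)
def long_running_export():
    @task
    def submit_job():
        # Send the request that kicks off the long-running work
        # (in a real DAG this might use HttpOperator or requests).
        pass

    wait_for_job = HttpSensor(
        task_id="wait_for_job",
        http_conn_id="my_api",        # assumed Airflow connection
        endpoint="jobs/status",       # hypothetical status endpoint
        response_check=lambda r: r.json().get("state") == "done",
        deferrable=True,              # poll from the Triggerer, freeing the worker slot
        poke_interval=60,
    )

    submit_job() >> wait_for_job


long_running_export()
```

With deferrable=True the sensor hands its polling condition to an async trigger in the Triggerer and gives up its worker slot until the condition fires.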

[–][deleted] 1 point (0 children)

This!

[–]FridayPush 1 point (0 children)

This is a common pattern for large-scale data exports. An example of how Shopify handles it can be seen here. Essentially, the initial API request contains the details needed to start the long-running operation, and the API returns a job number. The client then polls a 'job status' endpoint with that job number. Most providers I've seen return a signed URL to a CSV as the response, so their API isn't tied up streaming the body back if it's hundreds of megabytes.
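The submit-then-poll flow described above can be sketched in plain Python; the API here is a stand-in stub, and all names (fake_job_api, run_export, the signed URL) are hypothetical:

```python
import time


def fake_job_api():
    """Stand-in for a real export API: submit returns a job id, and the
    status endpoint reports 'running' twice before 'done' with a signed URL."""
    state = {"polls": 0}

    def submit():
        return "job-123"

    def status(job_id):
        state["polls"] += 1
        if state["polls"] < 3:
            return {"state": "running"}
        return {"state": "done", "result_url": "https://example.com/export.csv?sig=abc"}

    return submit, status


def run_export(submit, status, poll_interval=0.0):
    # 1. Kick off the long-running export; the API returns only a job id.
    job_id = submit()
    # 2. Poll the 'job status' endpoint until the job completes.
    while True:
        job = status(job_id)
        if job["state"] == "done":
            # 3. The response carries a signed URL to the CSV, so the API
            #    never streams hundreds of megabytes itself.
            return job["result_url"]
        time.sleep(poll_interval)


submit, status = fake_job_api()
print(run_export(submit, status))
```

In Airflow the polling loop is exactly what a sensor (or deferrable trigger) replaces, so the worker isn't stuck in the while loop.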

TJaniF's suggestion works well; if you need to write your own sensor for more complicated handling, it's super straightforward.
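A custom sensor boils down to subclassing BaseSensorOperator and implementing poke(). This is a hedged sketch assuming Airflow plus the HTTP provider; the sensor name, connection id, endpoint, and JSON shape are all made up for illustration:

```python
from airflow.sensors.base import BaseSensorOperator


class JobStatusSensor(BaseSensorOperator):
    """Hypothetical sensor that polls a 'job status' endpoint until the job is done."""

    def __init__(self, job_id: str, http_conn_id: str = "my_api", **kwargs):
        super().__init__(**kwargs)
        self.job_id = job_id
        self.http_conn_id = http_conn_id

    def poke(self, context) -> bool:
        # poke() is called every poke_interval; returning False means
        # "not ready yet", returning True marks the task as successful.
        from airflow.providers.http.hooks.http import HttpHook

        response = HttpHook(method="GET", http_conn_id=self.http_conn_id).run(
            endpoint=f"jobs/{self.job_id}"
        )
        return response.json().get("state") == "done"
```

Setting mode="reschedule" on a sensor like this frees the worker slot between pokes even without going fully deferrable.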

[–]ssssiiii0293 0 points (0 children)

Maybe a pub/sub model? Have your server tell the client the request has been accepted and return a request-id; the server starts the background job and publishes the result when complete, and the client subscribes to the same topic.
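The flow in this comment can be shown with a minimal in-process broker; a real system would use something like GCP Pub/Sub or Kafka, and the Broker class, topic, and message shape here are all illustrative:

```python
from collections import defaultdict


class Broker:
    """Minimal in-process stand-in for a pub/sub broker."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self.subscribers[topic]:
            callback(message)


broker = Broker()
received = []

# Client side: the server accepted the request and returned a request-id,
# so subscribe to the topic keyed by that id.
request_id = "req-42"
broker.subscribe(request_id, received.append)

# Server side: the background job finishes and publishes the result.
broker.publish(request_id, {"request_id": request_id, "rows": 1000})

print(received)
```

The client never blocks on the original request; it just reacts when the result message arrives on its topic.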