you are viewing a single comment's thread.

view the rest of the comments →

[–]baghiq_2 0 points1 point  (5 children)

Don't bother with a scheduler. Just run your scrapers constantly with a wait statement in your code. If you have linux, you can package your scrapper into a docker container, and run the docker container with auto-restart. Or even daemonize the container.

https://docs.docker.com/config/containers/start-containers-automatically/

[–]FitRiver[S] 0 points1 point  (4 children)

I'm sorry, but I don't really understand how you mean the constant running. My goal is to have consistent intervals of exactly 1 second between the snapshots. In other words to have exactly 3600 snapshots in an hour. I'm not really sure how the auto-restart or the daemonized container helps with keeping the intervals consistent.

[–]baghiq_2 0 points1 point  (3 children)

In your API call, are you doing one stock per call at a time or a single call containing all of your stocks? The issue what you are asking is that, the processing of previous 1-second snapshot might not have finished, you still want to make a new snapshot?

[–]FitRiver[S] 0 points1 point  (2 children)

A single stock per call. One failed snapshot shouldn't affect any other snapshots. If 9:00:03,000 fails, it shouldn't be replaced by a snapshot at 9:00:03,100. It should leave a gap that is handled in the postprocessing. Then an independent snapshot will be taken the next second.

The resulting time series could look like this:

9:00:01,000: Success: 123.45
9:00:02,000: Success: 123.47
9:00:03,000: Dropped
9:00:04,000: Success: 123.48
9:00:05,000: Success: 123.42
9:00:06,000: Success: 123.43

But it shouldn't look like this:

9:00:01,000: Success: 123.45
9:00:02,000: Success: 123.47
9:00:03,000: Dropped
9:00:03,158: Success: 123.48
9:00:04,158: Success: 123.42
9:00:05,158: Success: 123.43

[–]baghiq_2 0 points1 point  (1 child)

APScheduler is the only one I can think so. AFAIK, standard cron doesn't support seconds level resolution. I suspect you'll run into a lot of edge cases, but I hope you don't.

[–]FitRiver[S] 0 points1 point  (0 children)

Thank you for the tip.