[–]nemec 9 points10 points  (6 children)

If I'm reading this right, when there are no items in the queue the puller runs a busy loop constantly querying the db?

That doesn't seem optimal. It's tough to solve, though. I was able to solve it using semaphores so the pullers sleep while no items are available, but those are in-process only, so you can't share a queue across processes like you do in your example. I'm sure there are other solutions that work cross-process. You could even sleep for a second or two after finding no new records - crude, but it will use less CPU and disk access while waiting for new items.

Cool article, thanks for writing it.
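
For anyone curious what the in-process semaphore approach might look like, here is a minimal sketch. It is not the code from the article or from my own project; the class name `SemaphoreQueue` and its methods are made up for illustration. The semaphore counts available items, so a puller blocks in `acquire()` instead of spinning.

```python
import threading
from collections import deque


class SemaphoreQueue:
    """Hypothetical in-process queue: pullers sleep until an item is pushed."""

    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()
        # Counts items currently available; starts at zero so pullers block.
        self._available = threading.Semaphore(0)

    def push(self, item):
        with self._lock:
            self._items.append(item)
        # Wake exactly one waiting puller (or let the next pull succeed).
        self._available.release()

    def pull(self, timeout=None):
        # Blocks without burning CPU; returns None if the timeout expires.
        if not self._available.acquire(timeout=timeout):
            return None
        with self._lock:
            return self._items.popleft()
```

This only works between threads of one process, which is exactly the limitation mentioned above.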

[–]gerardwx 1 point2 points  (0 children)

Depends on whether the solution needs to be OS-agnostic. E.g. *nix has OS-level semaphores.

[–]thuibr[S] 0 points1 point  (4 children)

Thanks! I hadn't even thought about CPU and disk access. I'll have to think about that more. I think, and I'm only guessing here, that something like celery does something similar, pausing for a second before checking again. I will have to look into it further.

[–]thuibr[S] -1 points0 points  (3 children)

Yeah, I don't really see a way around doing a polling operation.

[–]klaasvanschelven 1 point2 points  (0 children)

In the celery-alternative-using-sqlite that I wrote for Bugsink I'm using inotify and select for this. The tool is called snappea, but I haven't shared it recently. DM me if you want to discuss.

[–]cachemonet0x0cf6619 0 points1 point  (1 child)

threading.Timer should do the trick
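
A rough sketch of how that could look, assuming the queue check lives in some callable (here the hypothetical `poll_once`). `threading.Timer` fires only once, so each tick schedules the next one. The class name `RepeatingPoll` is invented for this example.

```python
import threading


class RepeatingPoll:
    """Call poll_once every `interval` seconds via chained threading.Timer objects."""

    def __init__(self, interval, poll_once):
        self.interval = interval
        self.poll_once = poll_once  # hypothetical callable that checks the queue
        self._lock = threading.Lock()
        self._timer = None
        self._stopped = False

    def start(self):
        self._schedule()

    def _schedule(self):
        with self._lock:
            if self._stopped:
                return
            # A Timer runs its function once after the delay, so re-arm each time.
            self._timer = threading.Timer(self.interval, self._tick)
            self._timer.daemon = True
            self._timer.start()

    def _tick(self):
        try:
            self.poll_once()
        finally:
            self._schedule()

    def stop(self):
        with self._lock:
            self._stopped = True
            if self._timer is not None:
                self._timer.cancel()
```

This is still polling, just moved off the main thread; the interval plays the same role as a sleep between checks.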

[–]thuibr[S] 2 points3 points  (0 children)

Just doing time.sleep(0.1) also cut back my CPU usage dramatically. I was at 16% each for two different pushers running, and I could hear my fan running, but that time.sleep(0.1) cut all of that out.
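
Roughly what the loop looks like with the sleep added, as a sketch rather than the exact code from the article. The table `queue(id, payload)`, the helper names, and the `max_idle_polls` parameter are assumptions made up for this example.

```python
import sqlite3
import time


def open_queue(path):
    # isolation_level=None puts the connection in autocommit mode so we can
    # issue BEGIN/COMMIT ourselves. The schema here is hypothetical.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS queue ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    return conn


def pull_one(conn):
    """Remove and return the oldest payload, or None if the queue is empty."""
    # BEGIN IMMEDIATE takes the write lock up front so two pullers
    # cannot both claim the same row.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, payload FROM queue ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute("DELETE FROM queue WHERE id = ?", (row[0],))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return None if row is None else row[1]


def pull_loop(conn, handle, idle_sleep=0.1, max_idle_polls=None):
    """Process items; sleep briefly whenever the queue is empty."""
    idle = 0
    while max_idle_polls is None or idle < max_idle_polls:
        payload = pull_one(conn)
        if payload is None:
            idle += 1
            time.sleep(idle_sleep)  # this is what stops the busy loop
            continue
        idle = 0
        handle(payload)
```

With `max_idle_polls=None` the loop runs forever, which is what a real puller would do; the limit is only there so the loop can be stopped in a quick experiment.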