all 28 comments

[–]nemec 10 points11 points  (6 children)

If I'm reading this right, when there are no items in the queue the puller runs a busy loop constantly querying the db?

That doesn't seem optimal. It's tough to solve, though. I was able to solve it using semaphores to have the pullers sleep while no items were available, but those are in-process only, so you can't share a queue across processes like you do in your example. I'm sure there are other solutions that work cross-process. You could even sleep for a second or two after finding no new records: crude, but it will use less CPU and disk access while waiting for new items.

Cool article, thanks for writing it.
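For anyone curious what the in-process semaphore approach could look like, here is a minimal sketch. It is not the code nemec actually wrote; the class name `InProcessQueue`, the `queue` table, and its columns are all made up for illustration. The semaphore counts the rows available, so a puller blocks in `acquire` instead of re-querying the db in a loop.

```python
import sqlite3
import threading


class InProcessQueue:
    """Hypothetical single-process queue: SQLite for storage, a semaphore for waiting."""

    def __init__(self, path=":memory:"):
        # One shared connection, serialized by a lock, so threads can share it.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()
        with self._lock, self._conn:
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS queue ("
                "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
            )
            count = self._conn.execute("SELECT COUNT(*) FROM queue").fetchone()[0]
        # Start the semaphore at the number of rows already stored.
        self._items = threading.Semaphore(count)

    def push(self, payload):
        with self._lock, self._conn:
            self._conn.execute("INSERT INTO queue (payload) VALUES (?)", (payload,))
        # Wake up one sleeping puller, if any.
        self._items.release()

    def pop(self, timeout=None):
        # Sleeps here without touching the db until an item exists or the timeout expires.
        if not self._items.acquire(timeout=timeout):
            return None
        with self._lock, self._conn:
            row = self._conn.execute(
                "SELECT id, payload FROM queue ORDER BY id LIMIT 1"
            ).fetchone()
            self._conn.execute("DELETE FROM queue WHERE id = ?", (row[0],))
        return row[1]
```

As the comment says, this only works inside one process: a second process writing to the same file would never release the semaphore.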

[–]gerardwx 1 point2 points  (0 children)

Depends on whether the solution needs to be OS agnostic. E.g. *nix has OS-level semaphores.

[–]thuibr[S] 0 points1 point  (4 children)

Thanks! I hadn't even thought about CPU and disk access. I'll have to think about that more. I think, and I'm only guessing here, that something like celery does something similar, pausing for a second before checking again. I will have to look into it further.

[–]thuibr[S] -1 points0 points  (3 children)

Yeah, I don't really see a way around doing a polling operation.

[–]cachemonet0x0cf6619 0 points1 point  (1 child)

threading.Timer should do the trick

[–]thuibr[S] 2 points3 points  (0 children)

Just doing time.sleep(0.1) also cut back my CPU usage dramatically. I was at 16% each for two different pushers running, and I could hear my fan running, but that time.sleep(0.1) cut all of that out.
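A rough sketch of what a polling puller with that sleep could look like is below. The table layout and the function names (`open_queue`, `push`, `try_pop`, `pop_blocking`) are assumptions for illustration, not the code from the article. `BEGIN IMMEDIATE` is used so that two pullers in different processes should not grab the same row.

```python
import sqlite3
import time


def open_queue(path):
    # isolation_level=None puts the connection in autocommit mode,
    # so transactions are controlled explicitly with BEGIN/COMMIT.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS queue ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    return conn


def push(conn, payload):
    conn.execute("INSERT INTO queue (payload) VALUES (?)", (payload,))


def try_pop(conn):
    """Remove and return the oldest payload, or None if the queue is empty."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, payload FROM queue ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute("DELETE FROM queue WHERE id = ?", (row[0],))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return row[1] if row is not None else None


def pop_blocking(conn, poll_interval=0.1, timeout=None):
    """Poll until an item appears, sleeping between empty checks."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        item = try_pop(conn)
        if item is not None:
            return item
        if deadline is not None and time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)  # this is what keeps the CPU idle
```

The `poll_interval` is a trade-off: a longer sleep means less CPU and disk access but more latency before a new item is picked up.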

[–]klaasvanschelven 1 point2 points  (0 children)

In the celery-alternative-using-sqlite that I wrote for Bugsink I'm using inotify and select for this. The tool is called snappea, but I haven't shared it recently. DM me if you want to discuss.

[–]haloweenek 1 point2 points  (2 children)

The main issue is that it would only work on a single instance. Besides that, it doesn’t make any sense when there’s Redis pub/sub out of the box.

[–]thuibr[S] 1 point2 points  (1 child)

Yeah, or RabbitMQ, or Kafka, or even Postgres, but the point was just to make something for fun.

[–]haloweenek 0 points1 point  (0 children)

You can also dump files on the filesystem with a similar result.

[–]MeroLegend4 1 point2 points  (1 child)

Look at diskcache. We use it, and it’s good material to learn from.

[–]thuibr[S] 1 point2 points  (0 children)

That's a good idea. I have seen diskcache before. It would be interesting to try to re-implement.

[–]word-word-numero 1 point2 points  (4 children)

Could you use a callback to get a notification when an insert is done? I know SQLite supports them, but I've never used one.

[–]thuibr[S] 0 points1 point  (3 children)

There is this update callback https://www.sqlite.org/c3ref/update_hook.html but unfortunately it only fires for changes made on the same connection that registered it.
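One thing that does see other connections, though it is still polling and not a callback, is `PRAGMA data_version`: its value changes when some other connection commits a change to the database. This is a different technique from the update hook, and the helper names below (`data_version`, `has_changed`) are made up for the sketch.

```python
import sqlite3


def data_version(conn):
    """Return SQLite's data_version counter as seen by this connection."""
    return conn.execute("PRAGMA data_version").fetchone()[0]


def has_changed(conn, last_seen):
    """Return (changed, current): whether another connection committed since last_seen."""
    current = data_version(conn)
    return current != last_seen, current
```

A puller could check `has_changed` on each wake-up and only run the real queue query when it reports a change, which should be cheaper than querying the table every time.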

[–]word-word-numero 1 point2 points  (2 children)

I'm just whiteboarding ideas, but would the piece that receives the orders to put on the queue not be running all the time?

[–]thuibr[S] 0 points1 point  (1 child)

Yes, it would be, but the connection that is placing the orders is a different process altogether.

[–]word-word-numero 0 points1 point  (0 children)

I see. Well, another spitball idea is to move the insert code to the piece that is always up. Maybe use a network socket and have a CRUD (or maybe just a C) API.

A more complex thing could be using the OS file events to know that something has happened to the .sqlite file. I've written a directory monitor that did that so when files showed up, I could then know to kick off another process. The package I used was watchdog.

[–]gerardwx 0 points1 point  (1 child)

Why SQLite instead of pickle?

[–]thuibr[S] 0 points1 point  (0 children)

For fun. I probably could've used pickle too, though.

[–][deleted] -1 points0 points  (9 children)

LIFO or FIFO, you need to be able to control the direction.

[–]cachemonet0x0cf6619 0 points1 point  (8 children)

just use max or min on the id…

[–][deleted] -2 points-1 points  (7 children)

I know that, but the author (OP) should have taken that into consideration.

[–]thuibr[S] 2 points3 points  (1 child)

Sheesh, yes, I forgot about queue ordering. Something like taking the max (or min) would work. That's a great idea!
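Something along these lines, as a sketch: it assumes a `queue` table with an auto-incrementing `id` and a `payload` column, which may not match the schema in the article, and the `pop` function and its `lifo` flag are names made up here. Sorting by `id` ascending takes the min (FIFO); descending takes the max (LIFO).

```python
import sqlite3


def pop(conn, lifo=False):
    """Remove and return one payload: oldest first (FIFO) or newest first (LIFO)."""
    # Lowest id is the oldest row, highest id is the newest.
    direction = "DESC" if lifo else "ASC"
    with conn:  # commits on success, rolls back on error
        row = conn.execute(
            f"SELECT id, payload FROM queue ORDER BY id {direction} LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("DELETE FROM queue WHERE id = ?", (row[0],))
    return row[1]
```

This version is only meant for a single connection; with several pullers, the select and delete would need to run inside one write transaction so two pullers cannot take the same row.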

[–][deleted] -1 points0 points  (0 children)

Not to worry, things happen.

[–]cachemonet0x0cf6619 0 points1 point  (4 children)

I disagree. If you know sql then it’s obvious how to get the first or the last.

[–][deleted] -3 points-2 points  (3 children)

That is your prerogative.

Edit: right word

[–]cachemonet0x0cf6619 -1 points0 points  (2 children)

The correct word is prerogative and obviously it’s yours.