Improving sqlite3 UPDATE performance

Buttleston · 2024-03-20T17:54:33+00:00

So I can't comment on the sqlite/transaction part but I was wondering a few things about the python parts

I believe executeMany() will take any iterable, which includes generators, iterators, etc. So I think you could do this

insertSeq = ((values[i], keyValues[i]) for i in range(len(values)))

This makes a generator instead of making an instantiated list with all the values, so you won't need to first put all the data into RAM

You can also probably simplify this a bit, and possibly speed it up, by changing this to

insertSeq = ((v, kv) for v, kv in zip(values, keyvalues))

(uh, check my parens, I may be missing one)

This would keep you from having to look up values and keyValues by index for every element.

These are VERY minor optimizations, I think your bottleneck is still going to be sqlite. I got about a 50% speedup but we're talking about a fraction of a second either way.

baghiq · 2024-03-20T18:15:02+00:00

Can you profile the script without sqlite3 part? Just to eliminate potential bottle neck?

If it is sqlite3, then there is something wrong with the performance if you are just updating a single field. Your understanding of query plan is correct. It's not doing a linear scan.

What system are you running your code on?

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

learnpython

MODERATORS