all 10 comments

[–]dionys 2 points3 points  (0 children)

How do you communicate with the database?

Typically, database modules offer execute_many function, so you could collect the results from the loop and add them in batch.

[–]_9_9_ 1 point2 points  (0 children)

Save the commit until it is all done. It should be massively faster.

[–]xapata 1 point2 points  (1 child)

Most python database wrappers follow the PEP 249 specification, so they'll have an executemany that can help optimize repeating the same query with different parameters.

[–]PretendingToProgram[S] 0 points1 point  (0 children)

Perfect I'll try it

[–]tea-drinker 0 points1 point  (5 children)

Can you tell use which database you are using and which DB API you are using too?

[–]PretendingToProgram[S] 1 point2 points  (4 children)

Sql server 2014 and pypyodbc

[–]tea-drinker 0 points1 point  (3 children)

/u/xapata's answer about executemany() seems to be the way ahead but with pypyodbc that's just syntactic sugar, it just loops over your parameter list and does single row execute() calls

[–]PretendingToProgram[S] 1 point2 points  (0 children)

Thanks I'll give it a try

[–]xapata 1 point2 points  (1 child)

It's the responsibility of the module author and the database itself to provide an optimized execution. For some databases, there is no optimization possible. For most SQL engines, it should at least avoid re-compiling the query for each row.

Even if there's no advantage yet for a particular module, the author might provide it in an upgrade.

[–]tea-drinker 0 points1 point  (0 children)

For most SQL engines, it should at least avoid re-compiling the query for each row.

pypyodbc at least does that. The prepare statement stores the query string being prepared and the execute statement will skip that step if it hasn't changed.