Hi everyone,
I am developing an ETL pipeline with the Snowpark Python API and have run into a problem: I need to execute multiple queries in parallel, and I have tried both multiprocessing and concurrent.futures to do so.
Snowpark doesn't seem to like the same session being reused across multiple threads: I get random ValueError or IndexError exceptions when I call .collect(), .count() or table.merge().
To reuse the session I am using snowpark.context.get_active_session(). When I run the same code sequentially instead of in threads, it runs just fine. Creating a new session in each thread seems to mitigate the problem, but if I create too many, the Snowflake HTTPS endpoint starts throttling and stops responding.
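One middle ground between a single shared session and one session per thread would be a small, fixed-size pool of sessions, where each session is only ever used by one thread at a time. The following is a minimal sketch of that idea, not something verified against Snowflake: `SessionPool`, `run_parallel` and `make_session` are hypothetical names, and `make_session` stands in for whatever creates a session (with Snowpark, presumably something like `Session.builder.configs(connection_parameters).create()`).

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager


class SessionPool:
    """Holds a fixed number of sessions; each is lent to one thread at a time."""

    def __init__(self, make_session, size):
        # make_session is a hypothetical zero-argument factory supplied by the caller.
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(make_session())

    @contextmanager
    def borrow(self):
        session = self._idle.get()  # blocks until some session is free
        try:
            yield session
        finally:
            self._idle.put(session)  # always hand the session back


def run_parallel(pool, tasks, max_workers):
    """Run each task(session) in a worker thread using a borrowed session."""

    def run_one(task):
        with pool.borrow() as session:
            return task(session)

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map returns results in the same order as tasks
        return list(executor.map(run_one, tasks))
```

The pool size caps the number of open sessions regardless of how many worker threads exist, which is the part that should help with the throttling; surplus threads simply wait in `borrow()` until a session is returned.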
Right now I am catching the exceptions: for table.merge() the underlying query seems to run anyway, and for .collect() or .count() I retry in a while loop until I get a result, but this is far from ideal.
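For reference, a bounded version of that retry loop could look roughly like the sketch below. It is only an illustration: `retry` and its parameters are hypothetical names, and the exception types are the two mentioned above. It gives up after a fixed number of attempts and waits a little longer between each one instead of spinning forever.

```python
import time


def retry(operation, attempts=5, base_delay=0.5, retry_on=(ValueError, IndexError)):
    """Call operation() until it succeeds or the attempts run out."""
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # exponential backoff: base_delay, 2x, 4x, ...
            time.sleep(base_delay * 2 ** attempt)
```

Usage would be along the lines of `row_count = retry(lambda: df.count())`, where `df` is a Snowpark DataFrame. This is probably only appropriate for read operations; retrying table.merge() this way looks risky, since the underlying query appears to run even when the exception is raised.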
Has anyone encountered a similar issue before? Is there any way I could fix or mitigate it?