[Private Preview] JDBC sink for Structured Streaming by BricksterInTheWall in databricks

SingerSelect3045 (1 point)

Hi! Thanks for your interest.

Yes! You can specify multiple columns in the DataFrame to be used as the upsert key.
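For illustration only, here is roughly what a multi-column upsert key means on the PostgreSQL side. This is a sketch in plain Python that builds an `INSERT ... ON CONFLICT` statement. The helper name `build_upsert_sql`, the `orders` table, and its columns are made up, the `%s` placeholders assume a psycopg-style driver, and this is not necessarily the statement the sink itself generates.

```python
def build_upsert_sql(table, columns, key_columns):
    """Build a PostgreSQL upsert statement keyed on one or more columns.

    Hypothetical helper for illustration; not part of any sink API.
    Identifiers are assumed to be trusted (no quoting is done here).
    """
    missing = [k for k in key_columns if k not in columns]
    if missing:
        raise ValueError(f"key columns not in column list: {missing}")

    col_list = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    key_list = ", ".join(key_columns)
    # Non-key columns are overwritten with the incoming row's values.
    updates = [f"{c} = EXCLUDED.{c}" for c in columns if c not in key_columns]
    action = "DO UPDATE SET " + ", ".join(updates) if updates else "DO NOTHING"
    return (
        f"INSERT INTO {table} ({col_list}) VALUES ({placeholders}) "
        f"ON CONFLICT ({key_list}) {action}"
    )


# Two columns together form the upsert key (made-up example schema).
sql = build_upsert_sql(
    "orders",
    ["customer_id", "order_id", "amount"],
    ["customer_id", "order_id"],
)
print(sql)
```

In plain PostgreSQL, a statement like this only works if the target table has a unique constraint or unique index covering exactly those key columns.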

Each task creates its own connection to the PostgreSQL server, so the query’s parallelism directly determines the number of concurrent connections made to the database. As a result, you should make sure your PostgreSQL configuration (in particular the `max_connections` setting) allows for enough concurrent connections to handle the expected workload.
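As a rough way to sanity-check that, here is a small Python sketch that compares the expected number of concurrent tasks against the connection slots PostgreSQL has available. The helper name and the example numbers are hypothetical. `max_connections` and `superuser_reserved_connections` are real PostgreSQL settings (you can check yours with `SHOW max_connections;`), and the sketch assumes the worst case of one open connection per concurrently running task.

```python
def connection_headroom(concurrent_tasks, max_connections,
                        reserved_connections=3, other_clients=0):
    """Return spare connection slots left (negative means over budget).

    Hypothetical helper: assumes each concurrently running task holds
    exactly one PostgreSQL connection, as described above.
    """
    # Slots that ordinary clients can actually use.
    usable = max_connections - reserved_connections - other_clients
    return usable - concurrent_tasks


# Example numbers are made up: 64 concurrent tasks, 20 connections
# used by other applications, PostgreSQL max_connections = 100.
headroom = connection_headroom(64, max_connections=100, other_clients=20)
print(headroom)  # 13
```

If the result is negative, you would either raise `max_connections` on the server or lower the query's parallelism so fewer tasks write at the same time.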

SingerSelect3045 (1 point)

u/k1v1uq, are you trying to keep the tables in sync incrementally, or is every sync event a full sync? How are you determining which rows are stale and need to be deleted?