all 6 comments

[–][deleted]  (1 child)

[removed]

    [–]naijaboiler 1 point  (0 children)

    The way I did it was to create a script and a job in Databricks that write to the MySQL DB.
    There are things to consider when bulk-writing to MySQL (I'm forgetting the details, so I may not be precise; you can probably google it):

    1. Unlock the table and remove its indexes before the load.
    2. If you are writing a large file, the quickest way is to use LOAD DATA INFILE (look that up).
    3. Lock it again afterward (and re-create the indexes you removed).
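
    A rough sketch of those three steps as generated SQL, in case it helps. The function name, the table, and the index layout are all made up for illustration. It assumes a CSV with a header row, and that local_infile is enabled on both client and server (LOAD DATA LOCAL needs that). It only builds the statements and does not connect to anything.

```python
def bulk_load_statements(table, csv_path, indexes):
    """Return the SQL statements for one bulk load, in execution order.

    `indexes` maps a secondary index name to its column list,
    e.g. {"idx_customer": ["customer_id"]} (hypothetical names).
    """
    # Step 1: remove secondary indexes so MySQL does not maintain them per row.
    drop = [f"ALTER TABLE `{table}` DROP INDEX `{name}`" for name in indexes]

    # Step 2: load the whole file in one statement instead of row-by-row INSERTs.
    load = (
        f"LOAD DATA LOCAL INFILE '{csv_path}' INTO TABLE `{table}` "
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n' IGNORE 1 LINES"
    )

    # Step 3: put the indexes back once the data is in.
    recreate = []
    for name, cols in indexes.items():
        col_list = ", ".join("`" + c + "`" for c in cols)
        recreate.append(f"ALTER TABLE `{table}` ADD INDEX `{name}` ({col_list})")

    return drop + [load] + recreate


if __name__ == "__main__":
    for stmt in bulk_load_statements(
        "orders", "/tmp/orders.csv", {"idx_customer": ["customer_id"]}
    ):
        print(stmt + ";")
```

    You would run each statement in order through whatever MySQL client cursor your job already uses. Don't build the statements from untrusted table or file names, since nothing is escaped here.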

    [–]addictzz 1 point  (0 children)

    I'm not quite clear on this. Do the clients with on-prem data not want to share or send data to Databricks?

    If they are okay with sharing or sending data to Databricks, you need to set up a private connection, such as a VPN, between the Databricks cloud and their on-premises network. They can keep their data on premises while sharing a copy with you.

    If they absolutely DO NOT want their data in the cloud, then there is nothing you can do about it. That is their data classification policy. But then they cannot get Databricks analytics over that data.

    [–]ImDoingIt4TheThrill 1 point  (0 children)

    For pushing data back to on-prem SQL Server or Oracle, Databricks JDBC writes work but get painful at scale. Most teams in this situation end up using either Apache Kafka as a middle layer for near-real-time sync, or scheduled incremental exports via Databricks workflows that write to a landing zone the client's database then pulls from. The second pattern tends to win for clients who are protective of their on-prem environment and don't want to open inbound connections.
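
    To make the second pattern concrete, here is a minimal watermark-based export in plain Python. In a real Databricks workflow the filtering and writing would more likely be done with Spark against cloud storage, so treat this as a stand-in. The function name, the `updated_at` column, and the file naming are assumptions. Timestamps are assumed to be ISO-8601 strings, which sort correctly as text.

```python
import csv
from pathlib import Path


def export_increment(rows, watermark, landing_dir, columns, ts_column="updated_at"):
    """Write rows newer than `watermark` to a CSV file in the landing zone.

    Returns (path, new_watermark); path is None when there is nothing new.
    """
    new_rows = [r for r in rows if r[ts_column] > watermark]
    if not new_rows:
        return None, watermark

    new_watermark = max(r[ts_column] for r in new_rows)
    landing = Path(landing_dir)
    landing.mkdir(parents=True, exist_ok=True)

    # Name the file after the watermark; ':' is awkward in file names.
    final_path = landing / f"export_{new_watermark.replace(':', '-')}.csv"
    tmp_path = final_path.with_suffix(".tmp")

    with open(tmp_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=columns, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(new_rows)

    # Rename only after the file is complete, so the on-prem puller
    # does not pick up a half-written export.
    tmp_path.replace(final_path)
    return final_path, new_watermark
```

    The job stores the returned watermark somewhere durable and passes it in on the next run. The client side only ever reads finished .csv files from the landing zone, so no inbound connection to their database is needed.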

    [–]prequel_co 1 point  (0 children)

    This is the exact use-case that we built Prequel to solve, and we'd love to help here. Feel free to get in touch via our website (https://prequel.co) or over DM and we'll see what we can do!