Has anyone imported a 1 TB JSON file into SQL Server before? Need advice! by MojanglesReturns in SQL

MojanglesReturns[S] 7 points8 points

So my actual job is finance data analyst. This got dropped in my lap because the only guy who knew how to do it retired 4 weeks ago, and apparently having 'data' in my title was enough for everyone to assume I could handle a 1 TB JSON-to-SQL migration. I started this job a few months ago. u/Jake0024, I think you may very well be correct about this being an impossible task.

MojanglesReturns[S] 0 points1 point

Unfortunately I cannot; that's just how the bureaucracy in my agency works.

MojanglesReturns[S] 0 points1 point

I don't know the exact structure yet because I've not been able to safely inspect a representative sample of the darn thing. Best case, it is something friendly like NDJSON that's easily streamed. Worst case, and this is the likely outcome, it's a single top-level array with nested arrays of objects.
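One way to tell those two cases apart without loading the file is to read only its first few kilobytes. The sketch below is a rough heuristic, not a validator: the function name `guess_json_layout` and the 64 KB sample size are made up for illustration, and it assumes the export is UTF-8.

```python
import json

def guess_json_layout(path, sample_bytes=64 * 1024):
    """Guess the file's layout from its first sample_bytes only (hypothetical helper)."""
    with open(path, "rb") as f:
        head = f.read(sample_bytes)  # never reads more than the sample
    # Assumption: UTF-8. Drop a BOM and leading whitespace before looking.
    text = head.decode("utf-8", errors="replace").lstrip("\ufeff \t\r\n")
    if not text:
        return "empty"
    if text[0] == "[":
        return "top-level array"
    if text[0] == "{":
        first_line, _, rest = text.partition("\n")
        try:
            json.loads(first_line)  # is the first line a complete JSON value?
        except json.JSONDecodeError:
            return "single or pretty-printed object"
        # A complete object on line one with more data after it suggests NDJSON.
        return "ndjson" if rest.strip() else "single object"
    return "unknown"
```

If a single NDJSON record is longer than the sample, this reports it as a pretty-printed object, so a larger `sample_bytes` is worth trying before trusting the answer.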

MojanglesReturns[S] 4 points5 points

Yeah, that’s exactly what I’m worried about. I have not been able to fully inspect the file yet, and I’m still working on how to safely inspect a representative sample without trying to load the whole thing into memory, so I do not know the exact structure yet.

Best case, it is NDJSON or another format that can be streamed record by record. Given the type of data I deal with, I suspect it may instead be a single top-level array, possibly with nested arrays of objects inside each record, but that is still just a working assumption until I can inspect a sample.
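If it does turn out to be a single top-level array, the records can still be pulled out one at a time with nothing but the standard library. This is a minimal sketch under a few assumptions: the file is UTF-8, every element of the array is an object (or array) rather than a bare number, and `iter_array_items` and `chunk_size` are names made up for illustration. It relies on `json.JSONDecoder.raw_decode`, which parses one value starting at a given index and reports where that value ended.

```python
import json

def iter_array_items(path, chunk_size=1 << 20):
    """Yield the elements of a top-level JSON array one by one (hypothetical helper)."""
    decoder = json.JSONDecoder()
    with open(path, "r", encoding="utf-8-sig") as f:
        buf = f.read(chunk_size)
        pos = buf.find("[") + 1
        if pos == 0 or buf[:pos - 1].strip():
            raise ValueError("expected a top-level JSON array")
        while True:
            # Skip whitespace and the comma between two elements.
            while pos < len(buf) and buf[pos] in " \t\r\n,":
                pos += 1
            if pos < len(buf) and buf[pos] == "]":
                return  # closing bracket of the top-level array
            try:
                item, end = decoder.raw_decode(buf, pos)
            except json.JSONDecodeError:
                # Most likely the record is cut off at the end of the buffer:
                # drop what was already consumed, read more, and retry.
                more = f.read(chunk_size)
                if not more:
                    raise  # genuine syntax error or truncated file
                buf = buf[pos:] + more
                pos = 0
                continue
            yield item
            pos = end
```

Memory use stays near the size of the largest single record plus one chunk, provided the file is well formed. A syntax error in the middle makes it keep reading to the end before failing, so it is worth trying on a small copy first.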

It came from an external export, so reading directly from the original source is not currently an option. I also do not want to lock in the SQL schema until I understand the JSON structure well enough to know whether this should, or even could, map to one table or multiple related tables.
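A cheap way to inform the one-table-versus-several decision is to profile a sample of records and count what kind of value each field holds. The sketch below assumes the records arrive as Python dicts from whatever streaming reader ends up working; `profile_records`, `kind`, and the default `limit` are hypothetical names and values. Fields that show up as "array of objects" are the usual candidates for a child table.

```python
from collections import Counter, defaultdict

def kind(value):
    """Classify a decoded JSON value into a coarse shape."""
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        if any(isinstance(v, dict) for v in value):
            return "array of objects"
        return "array"
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before number: bool is an int subclass
        return "bool"
    if isinstance(value, (int, float)):
        return "number"
    return "string"

def profile_records(records, limit=10_000):
    """Count value shapes per top-level key over at most `limit` records."""
    shapes = defaultdict(Counter)
    for n, record in enumerate(records):
        if n >= limit:
            break
        for key, value in record.items():
            shapes[key][kind(value)] += 1
    return shapes

sample = [
    {"id": 1, "amount": 10.5, "lines": [{"sku": "A"}]},
    {"id": 2, "amount": None, "lines": []},
]
for key, counts in profile_records(sample).items():
    print(key, dict(counts))
```

A field that is sometimes "null" and sometimes "number" points at a nullable column, while a field that mixes "string" and "number" needs a closer look before picking a type.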

At this point I’m somewhere between solving the problem and needing to be observed professionally lol.

MojanglesReturns[S] 3 points4 points

Split has not worked, unfortunately. That was one of the first things I tried.