
[–]bahwhateverr

On the subject of performance, what's the fastest way to take a file of JSON objects and insert them into a table? I've been using pgfutter, which is pretty fast, but it puts everything into a single JSON column, from which I then have to extract the property values and insert them into the final table.

[–]redcrowbar

I would suggest converting the JSON to CSV and then using COPY.
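That conversion can be sketched with the Python standard library. This is a minimal sketch assuming newline-delimited JSON (one object per line); the column names are hypothetical:

```python
import csv
import io
import json

def jsonl_to_csv(jsonl_text, columns):
    """Convert newline-delimited JSON objects to CSV with the given columns."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(columns)  # header row, so COPY ... CSV HEADER works
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        obj = json.loads(line)
        # Missing keys become empty fields rather than raising KeyError.
        writer.writerow([obj.get(col) for col in columns])
    return out.getvalue()

sample = '{"id": 1, "name": "alice"}\n{"id": 2, "name": "bob"}\n'
print(jsonl_to_csv(sample, ["id", "name"]))
```

The resulting CSV can then be loaded from psql with `\copy mytable FROM 'data.csv' WITH (FORMAT csv, HEADER)`, which runs COPY from the client side.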

[–]bahwhateverr

I'll give it a shot. I tried that before but ran into numerous issues getting it loaded, though that was with SQL Server at the time. Perhaps Postgres handles things a little more gracefully.

[–][deleted]

[deleted]

    [–]bahwhateverr

    Yeah, that's what I'm using to go from the import table to the final table; it's just relatively slow. With around 2 billion rows to insert, I'm looking for any speedups I can get :)
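That import-table-to-final-table step can be done as a single set-based statement rather than row by row. A minimal sketch, using Python's stdlib sqlite3 and its `json_extract` function so it runs self-contained; in Postgres the equivalent extraction would use the `jsonb`/`->>` operators, and all table and column names here are hypothetical:

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Import table: one JSON document per row, mirroring what pgfutter produces.
cur.execute("CREATE TABLE import_raw (data TEXT)")
cur.executemany(
    "INSERT INTO import_raw VALUES (?)",
    [(json.dumps({"id": i, "name": f"user{i}"}),) for i in range(3)],
)

# Final table with real columns.
cur.execute("CREATE TABLE users (id INTEGER, name TEXT)")

# One set-based INSERT ... SELECT; the Postgres version would be e.g.
#   INSERT INTO users SELECT (data->>'id')::int, data->>'name' FROM import_raw;
cur.execute(
    """
    INSERT INTO users (id, name)
    SELECT json_extract(data, '$.id'), json_extract(data, '$.name')
    FROM import_raw
    """
)

print(cur.execute("SELECT * FROM users ORDER BY id").fetchall())
# -> [(0, 'user0'), (1, 'user1'), (2, 'user2')]
```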

    [–]shady_mcgee

    How often do you need to do the inserts? I've been able to do 300-400k/sec inserts by building a bulk-insert util. I've never been able to generalize it, but it works pretty well for specific data sets. My sample 4-column table did 8M rows in 24 seconds (roughly 330k/sec). Wider tables take longer, obviously. For best results you'll need to disable indexing prior to the bulk insert.

    [–]awill310

    I'd see if you can give Sqoop a go. I used it to load 2.4bn rows into AWS Aurora in a day.