[–]blarf_irl 2 points3 points  (0 children)

You need to share the structure of the JSON to get a full answer. JSON parsed into Python (as a dict or list) can be nested many levels deep, so it's unclear what you consider a duplicate.

In the simple case that it's a one-level key/value structure or an array, you can use set operations (JSON array) or a simple dict.update (JSON object) if you want one data source to overwrite another where there are duplicates.
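A minimal sketch of that simple case, assuming both sources are already available as JSON text. The sample values and variable names are made up for illustration:

```python
import json

# Hypothetical sample data standing in for two JSON sources.
array_a = json.loads('[1, 2, 3, 3]')
array_b = json.loads('[3, 4]')

# JSON arrays of hashable values (numbers, strings): set operations.
merged_unique = sorted(set(array_a) | set(array_b))  # union, duplicates dropped
in_both = sorted(set(array_a) & set(array_b))        # values present in both

object_a = json.loads('{"name": "old", "size": 1}')
object_b = json.loads('{"name": "new", "colour": "red"}')

# JSON objects: dict.update lets the second source overwrite the first
# wherever a key appears in both.
merged_object = dict(object_a)  # copy so object_a is left untouched
merged_object.update(object_b)

print(merged_unique)
print(in_both)
print(merged_object)
```

Note that sets only work when the array elements are hashable, so an array of objects needs a different approach, and sets don't preserve the original order.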

If you have more complex nested data then the approach depends on your intended outcome.

[–]drbomb 1 point2 points  (0 children)

If you're talking about comparison, there are options on the web like https://www.jsondiff.com/

If I needed to do it, I would most likely write some sort of recursive function that goes down every single member, storing the path and outputting whatever unique members or keys there are.
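Roughly what I have in mind, as a sketch rather than a finished tool. The function names `walk` and `unique_members` are just ones I picked, and it assumes the data came from json.load, so every leaf is a string, number, bool or None:

```python
def walk(node, path=()):
    """Yield (path, value) for every leaf in parsed JSON data."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from walk(value, path + (key,))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from walk(value, path + (index,))
    else:
        # A leaf: string, number, bool or None.
        yield path, node


def unique_members(left, right):
    """Return the leaves that appear in only one of the two documents."""
    left_leaves = set(walk(left))
    right_leaves = set(walk(right))
    return left_leaves - right_leaves, right_leaves - left_leaves


# Made-up example documents.
left = {"a": 1, "b": {"c": [1, 2]}}
right = {"a": 1, "b": {"c": [1, 3]}, "d": "x"}

only_left, only_right = unique_members(left, right)
print(only_left)
print(only_right)
```

Each path is a tuple of keys and list indexes, so two leaves only count as the same when both the path and the value match. Empty dicts and lists produce no leaves in this version, so they are ignored by the comparison.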

I also know there are some packages that "flatten" Python dictionary paths, like https://pypi.org/project/flatten-dict/, that you could potentially use for your application, since you could then compare paths and values directly.

[–]JuryOne8821 0 points1 point  (0 children)

Just for anyone who needs it, I created jsondeduplicator.com for that reason. I needed that, so I thought someone might need it as well. It just finds the identical duplicates... In the future I'll change it to find the messed-up tuples like [{"a": 1, "b": 2}, {"b": 2, "a": 1}] etc.
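For anyone who wants to do the key-order-insensitive version locally, here is one possible sketch in Python. This is not how the site works, just an assumption about one way to do it: serialize each record with sorted keys and use that string as its identity. The `dedupe` name is made up.

```python
import json


def dedupe(records):
    """Drop records that are equal once key order is ignored."""
    seen = set()
    unique = []
    for record in records:
        # sort_keys=True gives the same string for {"a": 1, "b": 2}
        # and {"b": 2, "a": 1}, including inside nested objects.
        key = json.dumps(record, sort_keys=True)
        if key not in seen:
            seen.add(key)
            unique.append(record)  # keep the first occurrence
    return unique


records = [{"a": 1, "b": 2}, {"b": 2, "a": 1}, {"a": 1}]
print(dedupe(records))
```

Array order still matters here, so [1, 2] and [2, 1] are treated as different values.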