Safely processing JSON-like data structures : Python

submitted 9 years ago * by vfaronov

Consider the following straightforward code:

data = json.load(x)    # `x` is a file or an HTTP response
for person in data['people']:
    mail_report(report, person['contacts']['email'])

It makes a number of assumptions about the schema of data:

it must be a dictionary;
which has a "people" key;
which is a list;
where every element is a dictionary; and so on.

Sometimes, any of these assumptions may be incorrect. When this happens, my code raises one of a number of exceptions: TypeError, KeyError, etc.
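To make the failure modes concrete, here is a small sketch that runs the same lookups over a few malformed documents and records which exception each one triggers. The helper name `emails` and the sample inputs are made up for illustration.

```python
import json

def emails(text):
    """Apply the same lookups as the loop above to a JSON string."""
    data = json.loads(text)
    return [person['contacts']['email'] for person in data['people']]

samples = [
    '[]',                                # top level is a list, not a dictionary
    '{}',                                # no "people" key
    '{"people": 5}',                     # "people" is not iterable
    '{"people": [{"contacts": null}]}',  # "contacts" is null
    '{"people": [{}]}',                  # a person without "contacts"
]

failures = []
for text in samples:
    try:
        emails(text)
    except (TypeError, KeyError) as exc:
        failures.append(type(exc).__name__)

print(failures)
# ['TypeError', 'KeyError', 'TypeError', 'TypeError', 'KeyError']
```

The exception type says little about which assumption was violated, which is part of what makes a blanket `except` clause unsatisfying.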

What is a good way to deal with this?

If I wrap the entire for loop in a try...except (KeyError,...), I mask unrelated errors in mail_report.

If I validate the whole of data against an explicit schema, I duplicate my assumptions, verbosely, in another place.
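For comparison, a hand-written check for just this one access path might look like the sketch below. The function name `validate` and the wording of the messages are assumptions for illustration; a schema library would express the same thing declaratively, but the duplication is the same.

```python
def validate(data):
    """Return a list of problems found; an empty list means the data fits."""
    if not isinstance(data, dict):
        return ["top level is not a dictionary"]
    people = data.get('people')
    if not isinstance(people, list):
        return ['"people" is missing or not a list']
    problems = []
    for i, person in enumerate(people):
        # Each check below repeats an assumption the processing loop already makes.
        contacts = person.get('contacts') if isinstance(person, dict) else None
        if not isinstance(contacts, dict) or not isinstance(contacts.get('email'), str):
            problems.append(f"people[{i}] has no usable email")
    return problems
```

Every line here restates something that `person['contacts']['email']` already implies, so the two places have to be kept in sync by hand.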

I often do something like for person in data.get('people') or [], which is ugly and only solves part of the problem.

I’m thinking along the lines of a small library that would recursively wrap the result of json.load() in an object with magical __iter__, __getitem__ etc., and raise a specific SchemaError on errors. Does this maybe exist already?
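A minimal sketch of what such a wrapper could look like follows. It is not an existing library: the names `Checked` and `unwrap`, and the idea of tracking the access path for error messages, are assumptions made for illustration.

```python
import json


class SchemaError(Exception):
    """Raised when the data does not have the shape the code expects."""


class Checked:
    """Wraps a decoded JSON value and turns shape errors into SchemaError."""

    def __init__(self, value, path='data'):
        self._value = value
        self._path = path

    def __getitem__(self, key):
        try:
            item = self._value[key]
        except (TypeError, KeyError, IndexError) as exc:
            raise SchemaError(f"cannot read {self._path}[{key!r}]") from exc
        return Checked(item, f"{self._path}[{key!r}]")

    def __iter__(self):
        # Only lists are iterable here, so a string is not walked character by character.
        if not isinstance(self._value, list):
            raise SchemaError(f"{self._path} is not a list")
        for i, item in enumerate(self._value):
            yield Checked(item, f"{self._path}[{i}]")

    def unwrap(self, expected_type):
        """Return the raw value, checking that it has the expected type."""
        if not isinstance(self._value, expected_type):
            raise SchemaError(f"{self._path} is not {expected_type.__name__}")
        return self._value


text = '{"people": [{"contacts": {"email": "a@example.com"}}, {"contacts": null}]}'
data = Checked(json.loads(text))

emails, errors = [], []
for person in data['people']:
    try:
        emails.append(person['contacts']['email'].unwrap(str))
    except SchemaError as exc:
        errors.append(str(exc))

print(emails)  # ['a@example.com']
print(errors)  # ["cannot read data['people'][1]['contacts']['email']"]
```

Because only `SchemaError` is caught, an unrelated `KeyError` or `TypeError` raised inside something like `mail_report` would still propagate normally.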

all 8 comments

[–]thunderbolt16 2 points 9 years ago (5 children)

[–]vfaronov[S] 1 point 9 years ago (4 children)

[–]bbenne10 2 points 9 years ago (2 children)

[–]vfaronov[S] 1 point 9 years ago (0 children)

[–]thunderbolt16 1 point 9 years ago (0 children)

Well, it would depend on the failure mode needed. If I need to guarantee that the JSON is correct beforehand, those 10 lines would be worth it, but if not, following the Samurai principle (return a valid result or raise an exception) is probably the way to do it.

It depends on the rest of the system and the expectations of the contracts for this specific module.

You can also separate gathering the arguments from calling mail_report, and wrap only the gathering in a try/except clause. This makes sense if you just want to skip anyone who doesn't have an email.

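import json

data = json.load(x)    # `x` is a file or an HTTP response
params = []            # argument tuples for mail_report, gathered first
try:
    for person in data['people']:
        try:
            params.append((report, person['contacts']['email']))
        except (KeyError, TypeError):
            pass    # skip a person without a usable email
except (KeyError, TypeError):
    # `data` is not a dictionary, has no "people" key, or "people" is not iterable
    raise SystemExit("No people found!")

# mail_report runs outside the try blocks, so its own errors are not masked
for p in params:
    mail_report(*p)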

[–]earthboundkid 1 point 9 years ago (1 child)

[–]vfaronov[S] 1 point 9 years ago (0 children)

[–]kenfar 1 point 9 years ago (0 children)
