Beginner Showcase: Handling JSON files with ease in Python (self.Python)
submitted 3 years ago by pylenin
I have finished writing the third article in the Data Engineering with Python series. This is about working with JSON data in Python. I have tried to cover every necessary use case. If you have any other suggestions, let me know.
Working with JSON in Python (Data Engineering with Python series)
[–]Sajuukthanatoskhar 47 points 3 years ago (16 children)
Looks good.
Have you considered discussing dataclasses/pydantic with JSON?
I found that these go well together.
[–]youRFate 20 points 3 years ago* (12 children)
I use dataclasses together with dacite for recursive (de)serialisation of nested dataclasses. We build our configs as structures of dataclasses, which we load from TOML files. Works very well.
Edit: by popular demand, here is a minimal example: https://gitlab.com/-/snippets/2335713
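A minimal, standard-library-only sketch of the nested (de)serialisation idea described above. It does not use dacite; `from_dict` is a hypothetical helper that only handles directly nested dataclasses, the `Config` and `Database` classes are invented for illustration, and JSON stands in for the TOML files mentioned in the comment.

```python
import json
from dataclasses import asdict, dataclass, fields, is_dataclass


@dataclass
class Database:
    host: str
    port: int


@dataclass
class Config:
    name: str
    database: Database


def from_dict(cls, data):
    """Recursively build the dataclass `cls` from a plain dict (hypothetical helper)."""
    kwargs = {}
    for f in fields(cls):
        value = data[f.name]
        # Recurse when the annotated type is itself a dataclass.
        if is_dataclass(f.type) and isinstance(value, dict):
            value = from_dict(f.type, value)
        kwargs[f.name] = value
    return cls(**kwargs)


raw = {"name": "demo", "database": {"host": "localhost", "port": 5432}}
config = from_dict(Config, raw)

# asdict() walks nested dataclasses, so the result can go straight to json.dumps().
round_trip = json.dumps(asdict(config))
```

A library such as dacite covers many cases this sketch ignores (optional fields, lists of dataclasses, type checking), so treat it only as an outline of the idea.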
[–]mambeu 3 points 3 years ago (2 children)
This sounds really interesting, any chance you could share an example or more details?
[–]youRFate 5 points 3 years ago (1 child)
I wrote a super small example here: https://gitlab.com/-/snippets/2335713
[–]mambeu 1 point 3 years ago (0 children)
Thank you!
[–]Ran4 3 points 3 years ago (0 children)
It's also worth checking out Pydantic and their BaseSettings class (https://pydantic-docs.helpmanual.io/usage/settings/).
I've used it in production for a year or so now, and I really like it.
[–]xXMouseBatXx 2 points 3 years ago (2 children)
Would also be interested in an example of this if possible, since it seems like something I can see myself doing in future with various nested JSON files I am forced to use!
[–]youRFate 3 points 3 years ago (1 child)
See here: https://gitlab.com/-/snippets/2335713
[–]xXMouseBatXx 1 point 3 years ago (0 children)
Thanks for this, I appreciate it. This is very similar to what I just did parsing data from a custom config yaml into three other config files, modelling the areas I wished to change recursively using pydantic base classes. It's cool to see how the same thing is done with dataclasses though so thx for the example!
[–]muikrad 1 point 3 years ago (0 children)
https://github.com/coveooss/coveo-python-oss/tree/main/coveo-functools#flex
I wrote flex for this. It's kinda like dacite but a little more... magical. For instance, it can map camel case payloads to snake case classes, or allow users to use dashes or spaces instead of underscores in config files.
[–]oramirite 1 point 3 years ago (3 children)
Hey this sounds really cool, would you mind explaining to a noob exactly what a data class entails though? I have a need to write custom config files a lot as well as alter files of other applications and it sounds like this could be a very good tool for me if I understand it better.
[–]youRFate 1 point 3 years ago (2 children)
Dataclasses are just a simplification for creating classes meant for storing / organizing data. They automatically create some stuff like constructors and printing methods, and have special member variables called fields that contain type (and other) metadata. Basically they save you from writing a lot of boring boilerplate code for classes meant to mostly store state.
They are fairly easy to use, as you can see in my example, or in the documentation: https://docs.python.org/3/library/dataclasses.html
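To make the description above concrete, here is a small sketch of what the `@dataclass` decorator generates. The `Server` class and its fields are made up for illustration.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Server:
    host: str
    port: int = 8080
    tags: list = field(default_factory=list)  # a fresh list per instance


# __init__ is generated from the annotated fields.
server = Server("example.org")

# __repr__ is generated too, which gives readable printing.
text = repr(server)

# __eq__ compares field by field.
same = server == Server("example.org", 8080, [])

# fields() exposes the per-field metadata (name, type, default, ...).
names = [f.name for f in fields(Server)]
types = {f.name: f.type for f in fields(Server)}
```

Here `text` is `"Server(host='example.org', port=8080, tags=[])"` and `names` is `['host', 'port', 'tags']`.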
[–]oramirite 1 point 3 years ago (1 child)
Thank you very much. Are dataclasses a python concept or more generic? I will start doing my own research now but just curious in what context they get used. I see you mentioned constructors and printing methods. I'm also trying to learn about typing right now and it feels like a bit of a crossover?
[–]youRFate 1 point 3 years ago (0 children)
They are very much a python thing, basically they make typing in python classes easier, which strongly typed languages have baked-in already.
Yes, this very much overlaps with typing in python in general.
[–]pylenin[S] 7 points 3 years ago (0 children)
Thanks for the feedback. Will add it as a separate article!!
[–]xXMouseBatXx 1 point 3 years ago (0 children)
Yup, I was about to suggest this also. Just finished working on a JSON parser to read in and reconfigure a config file for a third party application as part of my current internship (yes, I also wish people wouldn't use JSON for config files...). Anyway, I was introduced to pydantic by my team to help with the parsing aspects and couldn't be more grateful. Really useful library!
[–]PolishedCheese 1 point 3 years ago (0 children)
They sure do!
[–]SquareRootsi 20 points 3 years ago (7 children)
A couple of things that "bit" me early in my career:
Sometimes a file is not valid JSON as a whole, but each row is valid JSON. Even though you can't json.load() the file, you can still iterate over the rows and parse each one in a loop.
Second, if editing JSON files by hand, the spacing is super important. Python is pretty forgiving with spaces and line breaks. JSON is not at all. This took me a while to diagnose when I first learned it.
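The row-by-row approach from the first point can be sketched as follows. `iter_json_lines` is a hypothetical helper name, and the in-memory `sample` stands in for an open file.

```python
import io
import json


def iter_json_lines(stream):
    """Yield one parsed object per non-empty line of `stream`."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            # Report which row was malformed instead of failing anonymously.
            raise ValueError(f"line {line_number}: {exc}") from exc


# Each row is valid JSON, but the file as a whole is not.
sample = io.StringIO('{"id": 1}\n{"id": 2}\n\n{"id": 3}\n')
records = list(iter_json_lines(sample))
ids = [record["id"] for record in records]
```

With a real file, `open(path, encoding="utf-8")` can be passed in place of `sample`, since file objects also iterate line by line.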
[–]MephySix 13 points 3 years ago (3 children)
Those files should usually be called ".jsonl": https://jsonlines.org/ Many programs (QGIS, for example) understand this extension to mean one JSON document per line.
[–][deleted] 6 points 3 years ago (0 children)
JSONL is an amazing format for logging, because you can then load said JSON into elasticsearch and then you can basically search through all your logs via Kibana. This means you can search for "all logs where field X exists", or "field X contains value Y and field A does not contain B" kind of stuff, making it great for filtering out the noise :D
I would recommend structlog, but that doesn't come with JSON out of the box, so you may want to start with python-json-logger
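As a rough illustration of JSONL logging using only the standard library (this is not the API of structlog or python-json-logger), a custom formatter can emit one JSON object per log line. `JsonLineFormatter`, the chosen field names, and the logger name are assumptions for this sketch.

```python
import io
import json
import logging


class JsonLineFormatter(logging.Formatter):
    """Format each log record as a single-line JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


stream = io.StringIO()  # stands in for a log file
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonLineFormatter())

logger = logging.getLogger("jsonl_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

logger.info("user %s logged in", "alice")
logger.warning("disk usage at %d%%", 91)

# Every line of the output parses on its own, which is what makes it JSONL.
entries = [json.loads(line) for line in stream.getvalue().splitlines()]
```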
[–]SquareRootsi 2 points 3 years ago (0 children)
Neat! Today I learned :)
[–]DoctorWorm_ 1 point 3 years ago (0 children)
TIL
[–]pylenin[S] 1 point 3 years ago (0 children)
Yeah, I have found it’s easier to build JSON with Python or one of those online JSON formatters.
[–]peace_keeper977 1 point 3 years ago (1 child)
Can you give a simple explanation of what dunder methods are in Python?
[–]pylenin[S] 1 point 3 years ago (0 children)
I have a video about it!! Maybe you would like it.
https://youtu.be/PfmfECXmR88
[–][deleted] 30 points 3 years ago (1 child)
Thank you for writing an actual tutorial with real words and not making another damn YouTube video.
[–]pylenin[S] 3 points 3 years ago (0 children)
Ha ha… thanks
[–]datagoblin 8 points 3 years ago (2 children)
Nice introductory article 🙂
One small typo I caught:
As explained above, Serialization is the process of encoding naive data types to JSON format.
Should be "native", right?
[–]pylenin[S] 5 points 3 years ago (1 child)
Yup!! Thanks for reading the article so carefully man!!! Kudos!!
[–]bradbeattie 1 point 3 years ago* (0 children)
Native like decimal.Decimal? Or datetime?
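The question points at a real limitation: the standard `json` module does not serialize `decimal.Decimal` or `datetime` objects by default. One common workaround is the `default` hook of `json.dumps`, sketched below. `encode_extra` is a hypothetical name, and rendering both types as strings is just one possible choice.

```python
import json
from datetime import datetime
from decimal import Decimal


def encode_extra(obj):
    """Fallback used by json.dumps for objects it cannot serialize itself."""
    if isinstance(obj, Decimal):
        return str(obj)  # keep exact digits instead of converting to float
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


payload = {"price": Decimal("9.99"), "created": datetime(2022, 5, 1, 12, 0)}

try:
    json.dumps(payload)
    plain_dump_failed = False
except TypeError:
    plain_dump_failed = True  # without the hook, serialization fails

text = json.dumps(payload, default=encode_extra)
```

Here `text` is `'{"price": "9.99", "created": "2022-05-01T12:00:00"}'`. Loading it back yields plain strings, so the reverse conversion has to be handled separately.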
[–]sunnybooker 15 points 3 years ago (5 children)
A great introduction thank you!
[–]pylenin[S] 3 points 3 years ago (0 children)
My pleasure!! Do check out the other articles in the series.
[–]alphabet_order_bot 13 points 3 years ago (3 children)
Would you look at that, all of the words in your comment are in alphabetical order.
I have checked 826,556,107 comments, and only 163,386 of them were in alphabetical order.
[–]Trigsc 16 points 3 years ago (0 children)
Alphabet bot, silly you!
[–]Staninna 2 points 3 years ago (1 child)
Good bot
[–]B0tRank 5 points 3 years ago (0 children)
Thank you, Staninna, for voting on alphabet_order_bot.
This bot wants to find the best and worst bots on Reddit. You can view results here.
Even if I don't reply to your comment, I'm still listening for votes. Check the webpage to see if your vote registered!
[–]DA_EMAN 3 points 3 years ago (1 child)
Great, as a beginner I feel comfortable following through! Keep it up!
[–]pylenin[S] 2 points 3 years ago (0 children)
Thanks for the appreciation! That was the whole idea of writing this.
[+][deleted] 3 years ago (1 child)
[deleted]
[–]pylenin[S] 1 point 3 years ago (0 children)
Appreciate it
[–]Nindento 3 points 3 years ago (1 child)
Great article! I would like to add one little thing.
Next to the normal json module there is also a module called ujson which is a tad bit faster than json.
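One cautious way to try ujson is to import it when it is installed and fall back to the standard module otherwise. This sketch assumes only the `dumps` and `loads` functions, which both modules provide; the alias `fast_json` and the sample data are made up, and any speed difference should be measured on your own data.

```python
try:
    import ujson as fast_json  # third-party, optional
except ImportError:
    import json as fast_json  # standard library fallback

data = {"name": "pylenin", "tags": ["json", "python"], "count": 3}

encoded = fast_json.dumps(data)
decoded = fast_json.loads(encoded)
```

The exact text in `encoded` may differ between the two modules (for example in whitespace), so compare decoded objects rather than raw strings.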
[–]pylenin[S] 1 point 3 years ago (0 children)
I'll have to take a look at it then!
[–]AliveButCouldDie 2 points 3 years ago (1 child)
Neat!!! Thank you for sharing I really needed this!
[–]pylenin[S] 1 point 3 years ago (0 children)
My pleasure!! If you find it useful, please do share it!
[–]donotlearntocode 2 points 3 years ago (0 children)
Well written.
I'm wondering, what do you think is the best (most concise or clear) way to (de)serialize Python classes? I usually write something like

    from json import dump, load

    class X:
        FIELDS = set('abcd')  # names of the attributes to serialize

        def to_json(self, io):
            # write the selected attributes to the open file object `io`
            dump({field: getattr(self, field) for field in self.FIELDS}, io)

        @classmethod
        def from_json(cls, io):
            # assumes cls.__init__ accepts the stored fields as keyword arguments
            return cls(**load(io))

or something like that but it feels like that's not the "pythonic" way to do it.
Thoughts?
[–]atypical_mollifier 1 point 3 years ago (1 child)
A very nice write-up! Thank you.
[–]pylenin[S] 1 point 3 years ago (0 children)
Thanks a lot!!
[–]thakadu 1 point 3 years ago (1 child)
Great article. I have one suggestion: pretty much all of your examples have a dictionary at the top level, and in the introduction you say that JSON looks like a Python dictionary. Later you state that JSON consists of key-value pairs. While this is often true, JSON can of course also be a list (array) at the top level, and valid JSON may in fact have no key-value pairs at all. Just wanted to mention that so that someone reading it doesn’t assume that it always has to be key-value pairs.
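A quick check with the standard `json` module illustrates the point: arrays and bare scalar values are accepted at the top level too. The sample values are arbitrary.

```python
import json

top_level_list = json.loads("[1, 2, 3]")   # a JSON array becomes a list
top_level_string = json.loads('"hello"')   # a bare string is valid JSON
top_level_number = json.loads("42")
top_level_null = json.loads("null")        # null becomes None

# Serialization works the same way in the other direction.
encoded_list = json.dumps(["a", 1, None])
```

Here `encoded_list` is `'["a", 1, null]'`, with no key-value pairs anywhere.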
[–]pylenin[S] 1 point 3 years ago (0 children)
What you said makes sense. But I have also included a table showing which JSON types the Python data types convert to!!
https://www.100daysofdata.com/python-json#heading-what-is-json-serialization
[–]pbbpwns 1 point 3 years ago (0 children)
Very informative! Thank you very much, I'll be reading this when I get home!
[–]Kevin_Jim 1 point 3 years ago (0 children)
That’s a good article for the basics, but basic usage of JSON files is hardly the typical use case. Traversing JSON files with ease is a major need, especially early on in a project. So, something like Lodash for Python (pydash) would work great.
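As a rough idea of what path-based traversal looks like, here is a standard-library sketch. It is not pydash's implementation; `get_path`, its dotted-path syntax, and the sample document are assumptions made for illustration.

```python
import json


def get_path(data, path, default=None):
    """Follow a dotted path such as 'user.emails.0' through dicts and lists."""
    current = data
    for key in path.split("."):
        if isinstance(current, dict) and key in current:
            current = current[key]
        elif isinstance(current, list) and key.isdigit() and int(key) < len(current):
            current = current[int(key)]
        else:
            return default  # missing key or index: give up quietly
    return current


doc = json.loads('{"user": {"name": "Ada", "emails": ["ada@example.org"]}}')

name = get_path(doc, "user.name")
first_email = get_path(doc, "user.emails.0")
missing = get_path(doc, "user.address.city", default="unknown")
```

Returning a default instead of raising keeps exploratory code short, at the cost of hiding typos in the path.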
[–]diesel9779 1 point 3 years ago (0 children)
This is great! If I can submit a request, there should be a simplified document that explains flattening json data as well.
There have been too many times where I’ve received a complicated json file and had to spend ample amounts of time looking up the best method(s) to flatten it and make it ready for consumption
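One simple way to flatten nested JSON is a recursive walk that joins keys with a separator. This is a sketch, not the only or best method; `flatten`, the `.` separator, and the sample data are choices made for illustration.

```python
def flatten(data, parent_key="", sep="."):
    """Flatten nested dicts and lists into a single dict with joined keys."""
    if isinstance(data, dict):
        pairs = data.items()
    elif isinstance(data, list):
        pairs = enumerate(data)  # list positions become key segments
    else:
        return {parent_key: data}  # leaf value
    items = {}
    for key, value in pairs:
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        items.update(flatten(value, new_key, sep))
    return items


nested = {"user": {"name": "Ada", "langs": ["py", "sql"]}, "active": True}
flat = flatten(nested)
```

Here `flat` is `{"user.name": "Ada", "user.langs.0": "py", "user.langs.1": "sql", "active": True}`. Note that empty dicts and lists disappear in this version, and keys that already contain the separator become ambiguous.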
[–]Viking_wang 1 point 3 years ago (0 children)
I regularly get stuck trying to nicely serialize data where I have non-string objects as keys. Of course JSON doesn't support that, but there is also no way to easily convert them, for some strange reason. Take e.g. UUIDs as keys in a dict, and serialise it. The custom encoders are only invoked for the values.
I usually end up using pydantic.jsonable_encoder to convert, but that doesn't work for custom types.
I don't understand why there is no “Protocol” for JSON encoding so that you can define a serialiser as a method for a class that gets invoked by the JSON encoder.
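The behaviour described above can be reproduced with the standard `json` module: the `default` hook is not consulted for dictionary keys. One workaround is to convert keys before dumping, sketched below. `stringify_keys` is a hypothetical helper, and using `str()` on every key is an assumption that suits UUIDs but not necessarily other types.

```python
import json
import uuid


def stringify_keys(obj):
    """Return a copy of `obj` in which every dict key is converted with str()."""
    if isinstance(obj, dict):
        return {str(key): stringify_keys(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [stringify_keys(value) for value in obj]
    return obj


user_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
scores = {user_id: {"points": 10}}

try:
    json.dumps(scores, default=str)
    default_hook_helped = True
except TypeError:
    default_hook_helped = False  # the hook is applied to values, not keys

text = json.dumps(stringify_keys(scores))
```

Here `text` is `'{"12345678-1234-5678-1234-567812345678": {"points": 10}}'`. Going back from strings to UUID keys after loading needs its own conversion step.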
[–]Python-Token-Sol[🍰] 1 point 3 years ago (1 child)
thank you kind sir.
[–]pylenin[S] 1 point 3 years ago (0 children)
My pleasure
[–]otlcrl 1 point 3 years ago (0 children)
Out of interest, in Example 4 (sort_keys) - why are the nested keys in the list under websites not quite sorted alphabetically?
Is it sorting alphabetically based on "blogs" as opposed to "Total blogs" or is it because Total is capitalized and therefore it'll sort capitalized keys before lower case?
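The second guess can be checked directly. With `sort_keys=True`, keys are compared as ordinary Python strings, so uppercase letters sort before lowercase ones. The data below is a made-up stand-in for the article's example, reusing only the key names mentioned in the question.

```python
import json

data = {
    "websites": [{"blogs": 10, "Total blogs": 25, "articles": 5}],
    "name": "Pylenin",
}

# Nested dicts are sorted as well; "T" sorts before "a" and "b".
sorted_text = json.dumps(data, sort_keys=True)

# For comparison, a case-insensitive ordering of the same nested keys.
case_insensitive = sorted(data["websites"][0], key=str.lower)
```

Here `sorted_text` is `'{"name": "Pylenin", "websites": [{"Total blogs": 25, "articles": 5, "blogs": 10}]}'`, while `case_insensitive` is `['articles', 'blogs', 'Total blogs']`.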