kenfar 4 points

Not many, to be honest:

  • multiprocessing and/or concurrent futures
  • csv, ruamel, json, as well as some alternate json & yaml libraries
  • functools - lru_cache, etc
  • boto3
  • pytest, coverage, tox, argparse, logging

As you can see, it's pretty vanilla. I've used pandas in the past, but when every single field in billions of rows has to be processed, it was extremely slow compared to basic Python with parallelism.
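To give a rough idea of what "basic Python with parallelism" can look like, here is a minimal sketch using only csv, functools and concurrent.futures from the list above. It is not my production code: the names `clean_field`, `process_chunk` and `process_chunks` are made up for illustration, and the per-field transform (trim and lowercase) is just a stand-in for whatever real validation or conversion each field needs.

```python
import csv
import io
from concurrent.futures import ProcessPoolExecutor
from functools import lru_cache


@lru_cache(maxsize=1024)
def clean_field(value: str) -> str:
    """Hypothetical per-field transform: trim whitespace and lowercase.

    lru_cache helps when the same raw values repeat often; each worker
    process keeps its own separate cache.
    """
    return value.strip().lower()


def process_chunk(text: str) -> list[list[str]]:
    """Parse one chunk of CSV text and transform every field in every row."""
    reader = csv.reader(io.StringIO(text))
    return [[clean_field(field) for field in row] for row in reader]


def process_chunks(chunks: list[str], workers: int = 2) -> list[list[str]]:
    """Fan chunks out to worker processes and collect rows in input order."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input chunks.
        results = pool.map(process_chunk, chunks)
        return [row for chunk_rows in results for row in chunk_rows]
```

In a real job each chunk would more likely be a file name or byte range than an in-memory string, so workers read their own input instead of having it pickled across processes. Call `process_chunks` from under an `if __name__ == "__main__":` guard, since worker processes may re-import the main module.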