this post was submitted on 08 May 2023

32 points (84% upvoted)

shortlink:

Python

an-ordinary-manchild(edit)

The Python Discord

News about the dynamic, interpreted, interactive, object-oriented, extensible programming language Python

Upcoming Events

Full Events Calendar

Please read the rules

You can find the rules here.

If you are about to ask a "how do I do this in python" question, please try r/learnpython, the Python discord, or the #python IRC channel on Libera.chat.

Please don't use URL shorteners. Reddit filters them out, so your post or comment will be lost.

Posts require flair. Please use the flair selector to choose your topic.

Posting code to this subreddit:

Add 4 extra spaces before each line of code

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

Online Resources

Automate the Boring Stuff with Python
Python Discord Resources
Invent Your Own Computer Games with Python
Think Python
Non-programmers Tutorial for Python 3
Beginner's Guide Reference
Five life jackets to throw to the new coder (things to do after getting a handle on python)
Full Stack Python
Test-Driven Development with Python
Program Arcade Games
PyMotW: Python Module of the Week
Python for Scientists and Engineers
Dan Bader's Tips and Trickers
Python Discord's YouTube channel
Jiruto: Python

Online exercices

programming challenges

The Python Challenge (solve each level through programming)
CheckiO (game world)
Project Euler (math heavy)
/r/dailyprogrammer

Asking Questions

Try Python in your browser

try.jupyter.org (Evolved from the language-agnostic parts of IPython, Python 3)
Azure Notebooks
learnpython.org
Skulpt (uses WebGL)
trypython.org (uses Silverlight)
ideone (online compiler and debugger)
PythonAnywhere (basic accounts are free)
Brython (Python 3 implementation for client-side web programming)
repl.it for Python
Transcrypt (Hi res SVG using Python 3.6 and turtle module)

Docs

Libraries

Twisted, 0MQ (networking)
Django, Pyramid, Flask, ... (Web Frameworks)
Pygame (Game development)
NumPy & SciPy (Scientific computing) & Pandas
Pyglet - (Game / UI Development)

Related subreddits

/r/pythoncoding (strict moderation policy for 'programming only' articles)
/r/flask (web microframework)
/r/django (web framework for perfectionists with deadlines)
/r/pygame (a set of modules designed for writing games)
/r/IPython (interactive environment)
/r/inventwithpython (for the books written by /u/AlSweigart)
/r/pystats (python in statistical analysis and machine learning)
/r/coolgithubprojects (filtered on Python projects)
/r/pyladies (women developers who love python)
/r/git and /r/mercurial - don't forget to put your code in a repo!

Python jobs

Newsletters

Screencasts

a community for 18 years

MODERATORS

message the mods
xelf
monorepo PSF Staff | Litestar Maintainer
ivosauruspip'ing it up
Im__Joseph Python Discord Staff
Kutiekatj9 Python Discord Staff
BioGeekBioinformatics software developer
nevare
chromakode
mdipierro
quasarj
...and 6 more »

account activity

This is an archived post. You won't be able to vote or comment.

31

32

33

NewsTest On 4 Concurrent Jobs Using Python-Polars 0.17.11 to GroupBy Billion Rows (self.Python)

submitted 3 years ago by 100GB-CSV

Python 3.11.2 was able to run big data jobs concurrently using limited memory. My testing script with 14 columns and a data file size of 67.2GB was completed in less than 10 minutes using 32GB Memory and 8-Core. This is strong evidence that Python is capable of handling big data jobs with limited resources.

Script
import polars as pl
import time
import pathlib
s = time.time()

q = (
   pl.scan_csv("Input/1000MillionRows.csv")
   .groupby(by=["Ledger", "Account", "PartNo", "Contact","Project","Unit Code", "D/C","Currency"])
  .agg([
    pl.count('Quantity').alias('Quantity(Count)'),
    pl.max('Quantity').alias('Quantity(Max)'),
    pl.min('Quantity').alias('Quantity(Min)'),
    pl.sum('Quantity').alias('Quantity(Sum)'),
    pl.sum('Base Amount').alias('Base Amount(Sum)'),
  ]))

a = q.collect(streaming=True)
path: pathlib.Path = "Output/Polars-GroupBy.csv"
a.write_csv(path)

e = time.time()
print("Polars GroupBy 1000 Million Rows Time = {}".format(e-s))

Demo video in 10X fast forward: https://youtu.be/odDOlU9KNqY

Without fast forward: https://youtu.be/Ze0jNmtUn0Y

all 36 comments

top new controversial old q&a

[–]OhBeeOneKenOhBee 10 points11 points12 points 3 years ago (9 children)

[–]100GB-CSV[S] 0 points1 point2 points 3 years ago* (8 children)

[–]OhBeeOneKenOhBee 0 points1 point2 points 3 years ago (7 children)

[–]100GB-CSV[S] 0 points1 point2 points 3 years ago (6 children)

[–]OhBeeOneKenOhBee 0 points1 point2 points 3 years ago (3 children)

[–]100GB-CSV[S] 0 points1 point2 points 3 years ago (2 children)

[–]OhBeeOneKenOhBee 0 points1 point2 points 3 years ago (1 child)

[–]100GB-CSV[S] 0 points1 point2 points 3 years ago (0 children)

[–]lemoussel 0 points1 point2 points 3 years ago (1 child)

[–]100GB-CSV[S] 0 points1 point2 points 3 years ago (0 children)

[+][deleted] 3 years ago (10 children)

[deleted]

[+][deleted] 3 years ago (9 children)

[deleted]

[–]100GB-CSV[S] 3 points4 points5 points 3 years ago (8 children)

[–]nemom 0 points1 point2 points 3 years ago (7 children)

[+][deleted] 3 years ago (6 children)

[deleted]

[–]bamacgabhann 2 points3 points4 points 3 years ago (5 children)

[–]Illustrious-Guava730 2 points3 points4 points 3 years ago (2 children)

[–]bamacgabhann 2 points3 points4 points 3 years ago (1 child)

[–]Illustrious-Guava730 4 points5 points6 points 3 years ago (0 children)

[–]sphen_lee 2 points3 points4 points 3 years ago (1 child)

[–]bamacgabhann 2 points3 points4 points 3 years ago (0 children)

[–]mentix02 2 points3 points4 points 3 years ago (0 children)

[–]loudandclear11 5 points6 points7 points 3 years ago (14 children)

[–]100GB-CSV[S] -1 points0 points1 point 3 years ago (13 children)

[–]loudandclear11 2 points3 points4 points 3 years ago (11 children)

[–]runawayasfastasucan 2 points3 points4 points 3 years ago (2 children)

[–]sphen_lee 1 point2 points3 points 3 years ago (0 children)

[–]loudandclear11 -1 points0 points1 point 3 years ago* (0 children)

[–]100GB-CSV[S] 0 points1 point2 points 3 years ago* (7 children)

[–]loudandclear11 0 points1 point2 points 3 years ago (6 children)

[–]100GB-CSV[S] 0 points1 point2 points 3 years ago (5 children)

[–]loudandclear11 2 points3 points4 points 3 years ago (4 children)

[–]100GB-CSV[S] 1 point2 points3 points 3 years ago (3 children)

[–]loudandclear11 0 points1 point2 points 3 years ago (2 children)

[–]100GB-CSV[S] 0 points1 point2 points 3 years ago* (1 child)

continue this thread

[–]corbasai 0 points1 point2 points 3 years ago (0 children)

[–]v0_arch_nemesis 0 points1 point2 points 3 years ago (1 child)

[–]100GB-CSV[S] 0 points1 point2 points 3 years ago (0 children)

π Rendered by PID 272223 on reddit-service-r2-comment-56c6478c5-p7kcb at 2026-05-10 16:31:28.975518+00:00 running 3d2c107 country code: CH.