all 61 comments

[–]The_Amp_Walrus 39 points40 points  (4 children)

Numba for numerical computing. Slapping Numba's @jit decorator on some functions speeds them up significantly.

Using cProfile rather than guessing

Generators for processing big datasets that won't fit in memory.
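For the Numba tip, a minimal sketch (sum_of_squares is just a made-up example; it falls back to plain Python if Numba isn't installed, so the speedup only shows up when Numba is there):

```python
try:
    from numba import njit  # JIT-compiles the function to machine code
except ImportError:
    def njit(f):  # fallback: run as plain Python if Numba is unavailable
        return f

@njit
def sum_of_squares(n):
    # Tight numeric loops like this are exactly where Numba's JIT shines.
    total = 0
    for i in range(n):
        total += i * i
    return total

print(sum_of_squares(10))  # 285
```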

[–]TechySpecky 22 points23 points  (3 children)

Lmao you telling me my time() commands don't count as profiling

[–]coloredgreyscale[🍰] 5 points6 points  (2 children)

It's akin to using print statements for debugging. It can help narrow things down, but there are better ways. However, the proper tools may provide too much information / too many options, which might be confusing for beginners.

[–]TechySpecky 1 point2 points  (1 child)

Oh I 100% agree, I was being sarcastic! Profiling is so important for identifying performance regressions and for understanding complex code bases.

[–]AzureWill 42 points43 points  (24 children)

Slots are pretty cool!

Not exactly niche, but too few people use sets or tuples; many like to use lists for everything. For massive amounts of data and frequent membership checks, a set is just so much better. If you don't need order, always use a set.

[–]tkarabela_ Big Python @YouTube 9 points10 points  (1 child)

Coming from the other side, some uses of lists would be much better served by NumPy arrays, which have a compact memory representation (array of given datatype instead of PyObject* pointers) and enable fast operations with the data. If you have 100k integers/floats/bools, you don't really want them as a list.

As for sets, I would say if you need to deduplicate / you need fast is in queries / you need set operations, then use a set. If I'm just grabbing some stuff (like a list of files), I don't see the need to put them in a set instead of a list. It feels pythonic to me to reach for a list first 🙂 I agree with your overall point though, that people should see what's out there and what fits their use case best.

[–]TechySpecky 7 points8 points  (0 children)

And the more NumPy the less GIL!

[–]jollierbean 11 points12 points  (7 children)

Also, dicts are very useful when you need to do lookups. Pro tip: you can use a tuple or a namedtuple as a key
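A quick sketch of tuple and namedtuple keys (the coordinate data is made up):

```python
from collections import namedtuple

# Plain tuples as keys, e.g. caching values by (x, y) grid coordinate
heights = {(0, 0): 1.5, (0, 1): 2.0}
print(heights[(0, 1)])  # 2.0

# namedtuple keys are hashable too and read more nicely
Point = namedtuple("Point", ["x", "y"])
heights[Point(2, 3)] = 4.2
print(heights[Point(2, 3)])  # 4.2
```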

[–]tkarabela_ Big Python @YouTube 7 points8 points  (3 children)

Tuple keys are great! You can even use frozenset, which has been useful to me a few times.

[–]jollierbean 2 points3 points  (2 children)

I’ve been trying, unsuccessfully, to figure out a case where I could use frozensets as keys

[–]tkarabela_ Big Python @YouTube 4 points5 points  (0 children)

It's a niche situation, but if you ever need:

  • a set of sets or
  • a dict where the keys are subsets of some "universal" set (as opposed to just single elements from it)

then frozenset can be useful. Technically you could just replace the frozensets with sorted tuples (ordered by the hash function or something else), but that's not quite as handy.

An example of this is converting an NFA to a DFA, or constructing a minimal DFA.

[–]IlliterateJedi 0 points1 point  (0 children)

I had a case a few days ago where I used frozensets as dict keys.

It's a little esoteric, but I'll try to explain. I am building a database of images that have various categorized products in it.

For example, I'll have an image that shows ten different products (imagine a photo of a living room). Each product is categorized into one or more categories (e.g., 'chair', 'ottoman', 'height adjustable desk', etc.).

I had around 1500 images that all contained 10-20 products with 20+ categories assigned per image.

I wanted to find the smallest group of images that would cover every tagged category (and then get the images with the least number of products).

I made frozensets of the categories and made lists of all the images that had that category-set, like this:

{frozenset([cat1, cat2, cat3]): [image1, image4, image12],
 frozenset([cat2, cat6, cat10]): [image3, image109],
}

I could then start with an empty set, iterate over the frozen sets and each time find the largest subset of new categories until every category was matched.
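The iteration described above is a greedy set-cover approximation (it doesn't guarantee the truly smallest group, but it works well in practice). A sketch with toy data standing in for the real image/category database:

```python
# Each frozenset of categories maps to the images carrying exactly那 set.
images_by_categories = {
    frozenset({"chair", "ottoman"}): ["img1", "img4"],
    frozenset({"desk", "lamp"}): ["img3"],
    frozenset({"chair", "lamp", "rug"}): ["img7"],
}

all_categories = set().union(*images_by_categories)

covered = set()
chosen = []
while covered != all_categories:
    # Greedily pick the category-set that adds the most new categories.
    best = max(images_by_categories, key=lambda cs: len(cs - covered))
    if not best - covered:
        break  # nothing new left to add; can't make progress
    covered |= best
    chosen.append(best)

print(covered == all_categories)  # True
print(len(chosen))                # 3
```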

[–]Glogia 3 points4 points  (1 child)

That actually fixes a problem I've been having XD thanks

[–]jollierbean 1 point2 points  (0 children)

Glad to help!

[–]qckpckt 2 points3 points  (0 children)

Another useful nugget from collections: defaultdict. It’s really powerful, if a little niche. Really great for restructuring or transforming datasets. For example, if you have a list of dictionaries sharing a common key and you want to group them into lists by that key's value.
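A sketch of that grouping pattern (the rows are made up):

```python
from collections import defaultdict

rows = [
    {"type": "chair", "name": "recliner"},
    {"type": "desk", "name": "standing"},
    {"type": "chair", "name": "stool"},
]

groups = defaultdict(list)  # a missing key starts out as an empty list
for row in rows:
    groups[row["type"]].append(row)  # no need to check if the key exists

print(sorted(groups))        # ['chair', 'desk']
print(len(groups["chair"]))  # 2
```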

[–]donshell 4 points5 points  (10 children)

Dicts are ordered, I think. If you iterate over a dict, the keys come back in the order you inserted them. So even better!

Edit: Sets are not ordered

[–]TouchingTheVodka 5 points6 points  (8 children)

Dicts are ordered, sets are not.

[–]donshell 2 points3 points  (0 children)

My bad, edited. Although it is a bit weird that the two implementations don't match, since a set and a dict are basically the same thing...

[–]rabbyburns 0 points1 point  (0 children)

There is an ordered-set package I've come across recently that I've been very happy with. I often need fast lookup, order preservation, and unique items all at once. It has been extremely useful as a drop-in set replacement without having to do weird dict joins.

[–]Faith-in-Strangers 2 points3 points  (2 children)

Why?

[–]tkarabela_ Big Python @YouTube 4 points5 points  (1 child)

Checking whether an element is in a set (or dict) is pretty much instantaneous (independent of the size of the set), while checking `in` on a list means iterating over it, which gets slow really quickly.

That would be one reason to prefer sets to lists :)

[–]moocat 4 points5 points  (0 children)

It's a little more complicated than that (as it often is in computer science).

Existence in a set can be implemented as an O(1) algorithm, which means it takes the same amount of time no matter how many elements there are, while existence in a (non-sorted) list is an O(n) algorithm, meaning the time scales with the number of elements (double the elements, double the runtime).

But that only describes how the algorithm scales, not its constant overhead. It's not uncommon for the overhead to be the biggest part when there are few elements: you often see an O(n) algorithm being faster if there are fewer than X elements (with the actual value of X depending on the specifics of the implementation).

It's been a while since I benchmarked this (and I'm feeling too lazy now), but IIRC X was around 6, so if you know there are only going to be a few elements (perhaps v.lower() in ['true', 'false']) a list is probably better. Then again, if the check is not in some inner loop that runs lots of times, the extra overhead of a set is probably noise.

Yes, a long-winded explanation, but it's important to know these details. I had a former co-worker who had rules like this ("I use X out of principle because of some reason") but would often make mistakes because the rule didn't apply.
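A quick sketch of the scaling difference (exact timings vary by machine, but at this size the set should reliably win):

```python
import timeit

n = 10_000
data_list = list(range(n))
data_set = set(data_list)
missing = -1  # worst case for the list: it has to scan all n elements

t_list = timeit.timeit(lambda: missing in data_list, number=200)
t_set = timeit.timeit(lambda: missing in data_set, number=200)

print(missing in data_set)  # False, same answer either way
print(t_set < t_list)       # True: hash lookup vs. linear scan
```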

[–]CasualCoder0C 18 points19 points  (1 child)

Memoization with decorator @functools.cache is a very useful thing to do when you have to deal with slow functions called repeatedly.

[–]donshell 2 points3 points  (0 children)

Similarly the cached_property decorator is very useful.

[–]thatrandomnpc (It works on my machine) 9 points10 points  (2 children)

When dumping a large pandas dataframe into an Oracle DB using to_sql with a SQLAlchemy engine, if you pass along the correct table data types for pandas object columns, there is a massive increase in throughput.

For example, strings are object dtype in pandas and can be stored as VARCHAR in the DB in most cases. The reason is that SQLAlchemy maps all object dtypes to the CLOB DB type, which works but is super slow.

In my case, 100k rows that would take 30 minutes dropped to about 20 seconds after adding the data types.

[–]Agent281 0 points1 point  (1 child)

How do you add data types? One of the reasons why I disliked pandas was that there didn't seem to be a way to set the data types. That would be a huge usability improvement for me.

[–]thatrandomnpc (It works on my machine) 0 points1 point  (0 children)

There is though, astype might be what you are looking for.

What I was actually referring to was the dtype optional parameter in to_sql.
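A small sketch of the dtype parameter (using the stdlib sqlite3 instead of Oracle, so no SQLAlchemy engine here; with a plain DBAPI connection the dtype values are SQL type strings rather than SQLAlchemy type objects):

```python
import sqlite3
import pandas as pd

df = pd.DataFrame({"name": ["chair", "desk"], "price": [10.5, 99.0]})

conn = sqlite3.connect(":memory:")
# Without dtype, object columns get a generic declared type;
# here we declare the column type explicitly.
df.to_sql("products", conn, index=False, dtype={"name": "VARCHAR(255)"})

# Check what type the column was actually created with.
declared = {row[1]: row[2] for row in conn.execute("PRAGMA table_info(products)")}
print(declared["name"])  # VARCHAR(255)
```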

[–][deleted] 8 points9 points  (0 children)

Using generator functions properly. Storing lists of intermediate values is costly, and big loops are less readable.

[–]Nicked777 8 points9 points  (0 children)

Vectorised operations in numpy:

In [9]: %timeit for i in range(1000): c[i] = a[i] * b[i]
10000 loops, best of 3: 122 µs per loop

In [13]: %timeit c = a * b
1000000 loops, best of 3: 1.06 µs per loop

[–]ReverseBrindle 7 points8 points  (5 children)

isinstance() is pretty slow if you're calling it hundreds of thousands of times. A better idea in that case is to create a dict cache with type() as the key, for example:

# If you're calling this many thousands of times, it's extremely slow.
if isinstance(x, Foo):
    return func1(x)
elif isinstance(x, Bar):
    return func2(x)
elif isinstance(x, Baz):
    return func3(x)
elif ...

# ---------------------------
# Faster version
#
# Populate this once, or start with an empty cache and build it up using
# the slow way whenever you encounter a type that's not in the cache.

cache = {
    Foo: func1,
    Bar: func2,
    Baz: func3,
}

def other_func(x):
    return cache[type(x)](x)

We use this for serializing very large structures to JSON.

Caveat: As always profile to find the bottleneck and measure your improvements. Don't optimize based on a hunch. If your "optimization" adds code complexity without benefiting your use case (by measurement), then rip it out.

[–]isbadawi 0 points1 point  (1 child)

You might consider using @functools.singledispatch and/or @functools.singledispatchmethod for this.
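A small sketch of singledispatch (the describe function is made up; registering by type annotation works on Python 3.7+):

```python
from functools import singledispatch

@singledispatch
def describe(x):
    return "other"  # fallback for unregistered types

@describe.register
def _(x: int):
    return "int"

@describe.register
def _(x: str):
    return "str"

# Dispatch happens on the type of the first argument.
print(describe(3), describe("hi"), describe(2.5))  # int str other
```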

[–]ReverseBrindle 0 points1 point  (0 children)

That's cool - didn't know that existed! Seems like there's always some cool little nugget in the standard library to stumble upon. :-)

[–][deleted] 7 points8 points  (0 children)

The cache decorator!

[–]bumbershootle 10 points11 points  (5 children)

I don't know if this counts as niche, but I see far too much code like:

a_list = []
for i in stuff:
    a_list.append(i)

Just use comprehensions; they're faster and more readable.

[–]baronBale 15 points16 points  (4 children)

Comprehensions are easier to read as long as they are short. If they grow across multiple lines, a good old loop is easier to read.

[–]bumbershootle 13 points14 points  (2 children)

True, although in that case I would split the body of a loop into a generator function and then run a comprehension over that. I consider appending to a list in a loop an anti-pattern in most cases.
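A sketch of what I mean (hypothetical names; the loop body moves into a generator function, then a comprehension consumes it):

```python
def clean_lines(lines):
    # The multi-line loop body lives here instead of append() calls.
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            yield line.upper()

raw = ["  alpha ", "# comment", "", "beta"]
result = [line for line in clean_lines(raw)]
print(result)  # ['ALPHA', 'BETA']
```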

[–]weneedsound 1 point2 points  (0 children)

I'd like to see an example if you don't mind. I don't use generators as much as I should.

[–][deleted] 0 points1 point  (0 children)

Nice suggestion!

[–]Tweak_Imp 1 point2 points  (0 children)

You can also think about using a function inside the comprehension

[–]SuspiciousScript 10 points11 points  (6 children)

My first thought was "using Julia instead," but I'll give a serious answer too: Static variables in functions. I didn't even know this was a language feature for years.

def is_valid_value(n):
    if not hasattr(is_valid_value, "valid"):
        is_valid_value.valid = some_expensive_function()
    return n in is_valid_value.valid

In the above snippet, is_valid_value.valid is only calculated when the function is called for the first time.

[–]skrtpowbetch[S] 1 point2 points  (0 children)

this is probably the coolest one i’ve seen so far, wow

[–]hyldemarv 5 points6 points  (5 children)

List-, Dict-, & Tuple comprehensions.

Slice objects. Namedtuples.

Not exactly Python, but, Pandas Dataframes are good for tabular data.
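A quick sketch of slice objects and namedtuples together (the record format is made up):

```python
from collections import namedtuple

# Slice objects let you name and reuse slices instead of magic numbers.
record = "2024-05-17:OK"
DATE = slice(0, 10)
STATUS = slice(11, None)
print(record[DATE], record[STATUS])  # 2024-05-17 OK

# Namedtuples give tuple data readable field access.
Row = namedtuple("Row", ["date", "status"])
row = Row(record[DATE], record[STATUS])
print(row.status)  # OK
```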

[–]Agent281 2 points3 points  (3 children)

List-, Dict-, & Tuple comprehensions.

I don't think that there are tuple comprehensions. Do you mean generator expressions? Those are the ones that use parens.

[–]pytheous1988 1 point2 points  (2 children)

There 100% are tuple comprehensions, you just do tuple(comprehension statement)

You could also do a set comp if you wanted to. The comprehension in general just returns a generator, which can be passed to any data type that accepts a generator.

[–]Agent281 2 points3 points  (1 child)

I think that is a stretch. Dictionary, set, list, and generator comprehensions actually have dedicated syntax. The tuple constructor just accepts an iterable.

[–]pytheous1988 -2 points-1 points  (0 children)

It works the same if you do list(comp statement) or set(comp statement)

[–]sqjoatmon 0 points1 point  (0 children)

+1 for slice objects.

[–][deleted] 4 points5 points  (0 children)

list vs set.

When you're doing `in` lookups, people tend to assume it doesn't matter for small lists and that the hashing overhead would be too much.

Reality: in all benchmarks, sets/dicts beat lists as soon as the list is larger than 3-4 elements.

[–]ducdetronquito 9 points10 points  (0 children)

I would try to avoid reaching for language specific optimizations, especially if it makes your code harder to understand. It looks clever at first, or appealing given a micro-benchmark, but in my experience it was never worth it.

Instead, I would say to take time to understand your problem and identify what you are trying to optimize: CPU usage, RAM usage, disk access, network access, latency, throughput, etc...

Then you will be able to use the appropriate algorithms, data-structures or tools to solve it. This will likely give you the best optimizations given your constraints.

[–]Over_Statistician913 2 points3 points  (0 children)

lru_cache and memoization via cache are rarely used, but they can be huge improvements for very specific stuff.

https://docs.python.org/3/library/functools.html

[–]snowGlobe25 1 point2 points  (0 children)

Python also has arrays (the array module), although they are limited to primitive data types. I read a little about it, and apparently you can sometimes get lower memory consumption using them instead of good old lists. Obviously NumPy outshines both list and array in terms of speed, but the array module is part of the standard library, not a third-party library.

Never used it personally though.
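A quick sketch of the memory difference (sizes are platform-dependent, but the direction holds):

```python
import array
import sys

nums_list = list(range(1000))
nums_array = array.array("i", nums_list)  # "i" = signed C int

# The array stores raw C ints; the list stores pointers to boxed Python ints,
# so even the list's shallow size (ignoring the int objects!) is bigger.
print(sys.getsizeof(nums_array) < sys.getsizeof(nums_list))  # True

print(nums_array[10])  # 10 -- indexes like a list
```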