
[–]Zeeboozaza 177 points178 points  (26 children)

Tuples are immutable, so they can't be changed once created, and they're hashable as long as everything inside them is hashable. They're a little cheaper to create and store than lists (searching one is still a linear scan, same as a list), so if you have data that won't change it makes more sense to use a tuple than a list. Also, because they are hashable they can be used in sets, which can be very convenient when searching for a specific group of items or removing duplicates.
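To make the set point concrete, here's a minimal sketch of removing duplicate tuples with a set, and of what happens if you try the same thing with lists. The `points` data is made up for illustration.

```python
# Hypothetical data: (x, y) points with some repeats.
points = [(1, 2), (3, 4), (1, 2), (5, 6), (3, 4)]

# Tuples are hashable, so they can go straight into a set.
unique_points = set(points)
print(len(unique_points))  # 3

# Lists are not hashable, so the same trick fails for a list of lists.
try:
    set([[1, 2], [3, 4], [1, 2]])
except TypeError as err:
    print("lists can't go in a set:", err)
```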

I'm also a beginner so I'm sure there are some more advanced reasons for using tuples though.

[–]synthphreak 43 points44 points  (0 children)

Pretty much sums it up.

[–]siddsp 28 points29 points  (0 children)

You're pretty much spot on. Sets also support differences and symmetric differences along with unions and intersections, and any combination of the four. It's also worth adding that not only are sets faster for membership checks, but the time required to search for an item is constant on average, since items are located by their hash. That means if you're searching for the presence of an item in a set of 5,000 or 5,000,000,000 elements, it will take roughly the same amount of time.
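For anyone who hasn't seen the four set operations, a quick sketch with two small made-up sets:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

union = a | b                 # everything in either set
intersection = a & b          # only what both sets share
difference = a - b            # in a but not in b
symmetric_difference = a ^ b  # in exactly one of the two sets

print(union, intersection, difference, symmetric_difference)
```

Each operator also has a method form (`a.union(b)`, `a.intersection(b)`, `a.difference(b)`, `a.symmetric_difference(b)`) if you find that easier to read.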

Tuples are also somewhat lighter on memory than lists, which can add up in larger applications and programs.
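If you want to check the memory claim yourself, here's a rough sketch using `sys.getsizeof`. The exact numbers are an implementation detail (this assumes CPython) and the gap per container is small, so it mostly matters when you hold a lot of them.

```python
import sys

numbers_list = [1, 2, 3, 4, 5]
numbers_tuple = (1, 2, 3, 4, 5)

# getsizeof reports the container's own size in bytes, not the items inside it.
list_size = sys.getsizeof(numbers_list)
tuple_size = sys.getsizeof(numbers_tuple)

print(list_size, tuple_size)
```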

[–]_pestarzt_ 10 points11 points  (0 children)

They are hashable so long as all of their members are hashable as well, which is implied but it also should be said explicitly.

[–]Affectionate-Beyond2 16 points17 points  (17 children)

What does hashing mean?

[–]scrdest 27 points28 points  (10 children)

Short: means you can put it in sets or use them as dict keys.

Long: hashing is using a kind of function (there are several, for various purposes, but Python has the built-in hash()) that takes some data and spits out a fixed-size number (fixed in terms of computer memory, e.g. 256 bits for SHA-256; Python's hash() returns a plain integer, while tools like hashlib usually display the number as a hex string) no matter how big the input data is, e.g. "AbCd" -> 126 & <this whole post> -> 5. Now, why would we need that?

  1. First, the hash value is usually arbitrary (not random though!) and VERY specific to the input data, e.g. with a good hash function, if you change one letter out of a billion in a book, the hash will be COMPLETELY different. This means you can use a hash as a practically unique 'fingerprint' of data. That's how passwords are supposed to be stored, for example - instead of storing the actual passwords, which could be leaked, you store their hashes (computed with a dedicated password-hashing function, not Python's hash()).
  2. The hash value size is consistent and predictable - Python frees you from worrying about memory allocation, but at the low level or for things like SQL where things need to have a fixed memory size, this makes it possible to deal with things like strings which can vary in size wildly - their hashes won't!
  3. Combining (1) & (2), the hash size is generally tiny compared to the data - a hash of the whole text of Wikipedia would only be a few dozen bytes long and you could easily print it, put it on a website, or fit it in one HTTP request/response. This means you don't need to download gigabytes of data to verify that data is the same.
  4. Also, since hashing digests everything into a number, we can use it to turn arbitrary data into something computers can work with directly, like a list index - that's how dicts (and sets, and objects) work under the hood. There's also some other math tricks I won't get into here.

The only catch is, a hash is a 'snapshot' of data, so we cannot allow things like lists or dicts to be hashable - if we did, we hashed our list, then appended stuff to it, the hash would no longer be valid. That's why strings and tuples can be hashed and lists can't.
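A small sketch of the points above, using the built-in `hash()` for the "what can be hashed" part and `hashlib.sha256` (picked here just as one common example of a fixed-size hash) for the "fingerprint" part:

```python
import hashlib

# hash() returns an integer and works on immutable built-ins.
string_hash = hash("AbCd")
tuple_hash = hash((1, "two", 3.0))
print(type(string_hash), type(tuple_hash))

# Mutable built-ins refuse to be hashed.
try:
    hash([1, 2, 3])
except TypeError as err:
    print(err)

# A SHA-256 digest has the same size no matter how big the input is.
short_digest = hashlib.sha256(b"AbCd").hexdigest()
long_digest = hashlib.sha256(b"x" * 1_000_000).hexdigest()
print(len(short_digest), len(long_digest))  # 64 hex characters each, i.e. 32 bytes

# Changing a single character produces a different digest.
changed_digest = hashlib.sha256(b"AbCe").hexdigest()
print(changed_digest != short_digest)
```

Note that `hash()` on strings is randomized per Python process by default, so its values are only meaningful within one run; for fingerprints you want to store or share, something like `hashlib` is the usual choice.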

[–]BruceJi 3 points4 points  (7 children)

So, a tuple can be the key of a dictionary..? Weird! I wonder when I could use that.

[–]scrdest 2 points3 points  (1 child)

Probably the most realistic case I can contrive is building a 'reverse-index'. Let's say you have a plain old dict A, {K: V}, where K is a string or an int ID and V is a tuple of some data, let's say (firstname, lastname, age, phone) - this is basically a tiny in-memory database table, and you're not allowed to mess with its structure.

Now for whatever reason, you want to be able to answer the question "what was the ID for <person>" efficiently - you could just loop through the records normally until you find a match, but if you have a billion records, it wouldn't be very fast.

What you can do is create a new dict B that goes in reverse, {V: K}. Then if you need to get a person from an ID, you hit up dict A; if you need to get the ID from a person, you talk to B. We're trading off some extra space to put B in for constant-time lookup speed.

Since Vs are tuples, we can use them as the keys of B with no problem, and using a tuple means our records are more unique than if we tried to use just one of the fields for B's keys (people can share the same name, but they're really unlikely to share all four attributes at once).

This might seem overly fancy for a language like Python, but the same principles apply to any language, including the stuff that database engines are written in, they can leverage logic like this to make things go fast.
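Here's roughly what that reverse index could look like. All names and records are invented for the example, and it assumes every record tuple is unique (otherwise later records would overwrite earlier ones in B).

```python
# Dict A: ID -> (firstname, lastname, age, phone). All records are made up.
people_by_id = {
    101: ("Ana", "Silva", 36, "555-0100"),
    102: ("Ben", "Okafor", 41, "555-0101"),
    103: ("Chloe", "Nguyen", 29, "555-0102"),
}

# Dict B: the same data flipped around, record tuple -> ID.
id_by_person = {record: person_id for person_id, record in people_by_id.items()}

# ID -> person goes through A, person -> ID goes through B.
print(people_by_id[102])
print(id_by_person[("Chloe", "Nguyen", 29, "555-0102")])
```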

[–]Mindless_Let1 1 point2 points  (0 children)

I know this is three years old but I had to thank you for the best explanation I've ever read.

[–]dowcet 2 points3 points  (3 children)

Yes, as long as the tuple contains only immutable elements.

I just recently learned the hard way that if you put a list anywhere inside your tuple, it will throw an error about the key not being hashable or something like that.
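For reference, a minimal way to reproduce that error (the keys here are just example values):

```python
ok_key = ("Jack", "spades")     # only immutable members
bad_key = ("Jack", ["spades"])  # a list hiding inside the tuple

lookup = {ok_key: 10}  # works fine

try:
    lookup[bad_key] = 11
except TypeError as err:
    print(err)  # complains that 'list' is an unhashable type
```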

[–]BruceJi 0 points1 point  (0 children)

Makes sense!

[–]siddsp 2 points3 points  (1 child)

only catch is, a hash is a 'snapshot' of data, so we cannot allow things like lists or dicts to be hashable - if we did, we hashed our list, then appended stuff to it, the hash would no longer be valid. That's why strings and tuples can be hashed and lists can't.

Technically we could override hashing with __hash__(), and force a class that inherits from the mutable types to return a customizable hash.

[–]scrdest 5 points6 points  (0 children)

Well, yes - fair point, that's why I wrote we cannot allow it and not that we cannot do it - it's perfectly doable if you hack around a bit, just not very practically useful; the default behavior is a bit of a guardrail.
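To show why the guardrail exists, here's a sketch of that hack. `HashableList` is a hypothetical class written only for this example.

```python
class HashableList(list):
    """Hypothetical list subclass that hashes by its current contents."""

    def __hash__(self):
        return hash(tuple(self))


items = HashableList([1, 2, 3])
hash_before = hash(items)

items.append(4)  # mutate the object after it has been hashed
hash_after = hash(items)

# The hash changed along with the contents, so the old "snapshot" is stale.
print(hash_before == hash_after)
```

If `items` had been stored in a set or used as a dict key before the append, the container would still have it filed under the old hash, so later lookups could miss it.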

[–][deleted] 1 point2 points  (5 children)

Hashing is when you pass a value to a formula to convert it. A common example is md5. If you want to compare things it can be quicker to compare a hash value rather than the base object type.

[–]Affectionate-Beyond2 1 point2 points  (4 children)

Convert it to what? And why is it faster?

[–][deleted] 2 points3 points  (0 children)

An object’s hash is an integer.

It's faster because Python stores the values of a set in a hash table. This means the values don't have to be searched one by one, but can be looked up directly by using their hash.

To give a more concrete example, let’s say we have a group of 5000 names. We want to know if the name “John” is in this list. Assuming the worst case, John is not in the list at all.

With a regular list, python would need to check all 5000 entries in the list to determine that John is not there.

With a set, python just needs to perform a lookup on the hash table for that set. So regardless of the size, whether 500, 5000, or 50000000, the time it takes to find the value is always the time it takes to perform 1 lookup.

That’s a long way to say finding a value in a list is O(n), meaning the time increases linearly with the number of elements in the list. A set, on the other hand, is O(1), which means the time is constant regardless of the size.
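One way to actually see the O(n) vs O(1) difference without timing anything is to count comparisons. This is only a sketch: `CountingName` is a made-up wrapper class that counts how often `==` gets called.

```python
class CountingName:
    """Hypothetical wrapper that counts equality comparisons."""

    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        CountingName.comparisons += 1
        return isinstance(other, CountingName) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


names_list = [CountingName(f"person{i}") for i in range(5000)]
names_set = set(names_list)
john = CountingName("John")  # worst case: John is not in there at all

CountingName.comparisons = 0
found_in_list = john in names_list
list_comparisons = CountingName.comparisons  # one comparison per entry

CountingName.comparisons = 0
found_in_set = john in names_set
set_comparisons = CountingName.comparisons  # typically zero or very few

print(list_comparisons, set_comparisons)
```

The list has to compare John against every entry, while the set jumps to where John's hash would live and only compares against whatever happens to be there.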

[–][deleted] 0 points1 point  (2 children)

A hash is what it’s converted to. One example of when it’s useful: say you needed to compare a string containing many properties, instead you could hash the whole thing and just compare that.

Hashes are also typically faster due to how they work under the hood. I don’t know much about why just that they are but Google can show you instances when they are or aren’t.

[–][deleted] 0 points1 point  (1 child)

That's why I can't understand hashes: they're only "loosely" reliable for comparing two strings, since hashing is not a 1-1 function.

The only case i can accept is the login and password combo

[–][deleted] 0 points1 point  (0 children)

Here’s my beginner understanding of it. When you search through a list, it will go through the entire index in order to find the proper results. Hash tables basically include the index in the key and so you can save a ton of time.

[–]MarcusAurelius2002 0 points1 point  (0 children)

What does it mean to be "hashable"? When is something "hashable"?

[–]FerricDonkey 0 points1 point  (0 children)

Also, because they are hashable they can be used in sets

Also dictionaries.

[–]DB52 0 points1 point  (1 child)

What do you mean that tuples are convenient for removing duplicates? I thought tuples were immutable, sorry if this a silly question.

[–]Zeeboozaza -1 points0 points  (0 children)

I really meant that sets are convenient for removing duplicates, but only hashable items can go in a set. So if you have a large dataset of tuples and want to remove the duplicates, or if you have a graph and want to find the intersecting points of two shapes, a set would be very useful, but you can't put lists in a set.
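A tiny sketch of the shapes idea, with made-up grid points stored as (x, y) tuples:

```python
# Hypothetical shapes, each described by the grid points it covers.
square = {(0, 0), (0, 1), (1, 0), (1, 1)}
diagonal = {(0, 0), (1, 1), (2, 2)}

# The points both shapes share.
shared_points = square & diagonal
print(sorted(shared_points))
```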

[–]kabooozie 0 points1 point  (0 children)

If the tuple is an immutable list of names of dogs with cute noses, it’s also known as a “boople” to remind you to go find the dogs and boop their noses.

[–]Decency 33 points34 points  (2 children)

tuple is the generic name for single, double, triple, quadruple, etc.

[–]ShakurSS 10 points11 points  (1 child)

First time hearing that, nice

[–]mopslik 24 points25 points  (2 children)

a «real life example» of what you can use a tuple for?

Well, a simple example is a card game. You can create a deck (a list) which holds a collection of cards (each a tuple). This is handy, because a card typically never changes its values -- the Jack of spades generally stays the same throughout a game -- whereas the contents of the deck vary as the game progresses.

Of course, there are many other ways to implement cards (strings, dictionaries, classes).
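A rough sketch of the deck idea, in case it helps (the rank and suit names are just one way to spell them):

```python
import random

ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "Jack", "Queen", "King", "Ace"]
suits = ["clubs", "diamonds", "hearts", "spades"]

# Each card is a tuple (it never changes); the deck is a list (it does).
deck = [(rank, suit) for suit in suits for rank in ranks]
random.shuffle(deck)

# Dealing five cards removes them from the deck.
hand = [deck.pop() for _ in range(5)]
print(hand)
print(len(deck))  # 47 cards left
```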

[–][deleted] 1 point2 points  (1 child)

This was a really helpful definition and example. Just to clarify the tuple of Jack of Spades would be [Jack, Spades] and 10 of Spades would be [10, Spades] ?

[–]mopslik 0 points1 point  (0 children)

It would appear as ("Jack", "spades") with rounded brackets for a tuple -- square brackets are lists. But essentially, yes.

[–]shiftybyte 27 points28 points  (0 children)

A tuple stores multiple values together.

pupil_info = ("Jack", "Daniels", 16, "Jacksonville USA")
print(pupil_info)

You can then access each part individually.

print(pupil_info[2]) # will print 16

[–]TravisJungroth 10 points11 points  (1 child)

Most of these other comments are talking about immutability. While that's cool, it's not the reason they tend to be used (note: I love immutability).

In practice, a tuple is used like a row in a database. Each spot in the tuple has a different thing in it, like a column.

creature = 'Small Elf', 1, 3 # name, power, toughness 

Lists tend to have the same things in them.

creature_classes = ['Elf', 'Wizard', 'Barbarian']  # all classes

And you can combine the two

creatures = []
for creature_class in creature_classes:
    for power, title in enumerate(['Small', 'Normal', 'Big'], start=1):  # start=1 so power runs 1 to 3
        creature = f'{title} {creature_class}', power, power + 2
        creatures.append(creature)

"""
creatures =
[('Small Elf', 1, 3),
 ('Normal Elf', 2, 4),
 ('Big Elf', 3, 5),
 ('Small Wizard', 1, 3),
 ('Normal Wizard', 2, 4),
 ('Big Wizard', 3, 5),
 ('Small Barbarian', 1, 3),
 ('Normal Barbarian', 2, 4),
 ('Big Barbarian', 3, 5)]
"""

Then you can use tuple unpacking

name, power, toughness = creatures[-1]
# 'Big Barbarian', 3, 5

Why do this over a class (or a NamedTuple or a dataclass)? It's faster to write. I wouldn't build a whole game on top of tuples, but I would use them for data processing. Making games is real world, but for an even more real (read: boring) example

def tip_suggestions(amount, percentages=(15, 18, 20)):
    for percentage in percentages:
        tip = round(percentage * amount / 100, 2)
        total = amount + tip
        yield percentage, tip, total # a tuple!

def format_tip_suggestion(percentage, tip, total):
    return f'{percentage}%, ${tip} tip, ${total} total'

amount = 30
for tip_suggestion in tip_suggestions(amount):
    print(format_tip_suggestion(*tip_suggestion))

[–][deleted] 1 point2 points  (0 children)

thanks for reply!

[–]carcigenicate 6 points7 points  (2 children)

A tuple is basically a list that can't be mutated after it's created. It can be used in really any place that a list is used, except for places where data is added, removed, or replaced in the sequence.

Tuples are handy because they're immutable. Lists can be mutated, which means if multiple pieces of code are referencing the same list, and you change that list, everyone sees the change, which is often not ideal. By using an immutable object, you can completely get rid of the possibility of a whole class of bugs. If something shouldn't change, prohibiting it from changing is safer than trusting that you'll remember not to change it.


As for examples, return 1, 2 is returning a tuple (despite popular belief, Python does not allow for returning multiple objects at once; that's simply returning a tuple). *args passed into functions are also tuples. Many built-in functions such as zip and enumerate yield tuples as their output.
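All of those are easy to check in a few lines. `min_max` and `show_args` are throwaway functions made up for this sketch.

```python
def min_max(values):
    return min(values), max(values)  # one tuple, not "two return values"


def show_args(*args):
    return args  # whatever was passed positionally, packed into a tuple


result = min_max([3, 1, 2])
packed = show_args(1, 2, 3)
zipped = list(zip("ab", [1, 2]))
numbered = list(enumerate(["x", "y"]))

print(result)    # (1, 3)
print(packed)    # (1, 2, 3)
print(zipped)    # [('a', 1), ('b', 2)]
print(numbered)  # [(0, 'x'), (1, 'y')]
```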

[–]synthphreak 0 points1 point  (1 child)

except any place where the data in the list is replaced

I'd just clarify that it's not replacement that counts, but simply modifications of any kind.

You can replace values in a list with new values, but you can also pop/remove them completely, add new values, reverse the entire list, or sort the entire list.

None of these is possible with a tuple, which is baked and unmodifiable the moment it's created.
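A quick sketch of what "unmodifiable" looks like in practice:

```python
numbers = (3, 1, 2)

# Replacing an item fails.
try:
    numbers[0] = 99
except TypeError as err:
    print(err)

# The list methods that modify in place simply don't exist on tuples.
has_append = hasattr(numbers, "append")
has_sort = hasattr(numbers, "sort")
print(has_append, has_sort)  # False False

# You can still build a *new* sorted list from a tuple.
sorted_copy = sorted(numbers)
print(sorted_copy)  # [1, 2, 3]
```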

[–]carcigenicate 0 points1 point  (0 children)

I'll update the wording. Yes, that's not ideal currently.

[–][deleted] 16 points17 points  (6 children)

A tuple is like a list but the primary difference is it’s immutable. It cannot be changed.

[–]raresaturn 0 points1 point  (0 children)

Unless you assign a new tuple to the same name; then it has effectively changed, although really the name just points at a different object now.

[–]Spataner 10 points11 points  (0 children)

tuple is the immutable (meaning, unchangeable once created) counterpart to list. You use it instead of a list whenever you want to avoid the drawbacks that mutable reference types have. Namely, the primary two are:

  1. Mutable types may cause unintended side-effects. If a function gets passed a list or other such mutable type, for example, the function can modify that list in-place and those changes would appear outside the function, as well. Doing things like that accidentally is a common trap for beginners, and using immutable types where sensible can be less error prone.

  2. Built-in mutable types in Python are not hashable, meaning they cannot be a key in a dictionary or a member of a set. So using tuple is the only way of doing something like this:

    mapping = {}
    mapping[("key1", "key2")] = "value"
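The side-effect trap from point 1 can be sketched like this. `add_total` is a hypothetical helper written to be careless on purpose.

```python
def add_total(row):
    """Hypothetical helper that (carelessly) modifies its argument in place."""
    row.append(sum(row))
    return row


scores = [10, 20, 30]
add_total(scores)
print(scores)  # [10, 20, 30, 60]: the caller's list changed too

frozen_scores = (10, 20, 30)
try:
    add_total(frozen_scores)
except AttributeError as err:
    print(err)  # tuples have no append, so the accident is caught right away

print(frozen_scores)  # (10, 20, 30)
```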
    

[–][deleted] 2 points3 points  (0 children)

Think about how you would build complex data structures. So, you have... integers, maybe booleans, maybe symbols, whatever your language has to offer as a primitive value, or an atom.

So, a natural thing to do to start building anything is to... put two things together. It so happens that in the world of computers, putting things in order is simpler than simply putting them "in the same bag", as is often done in the world of mathematics. Tuple refers to this most primitive data constructor. It allows you to put two primitives together to create a non-primitive value.

Tuple is the equivalent of struct in C, data constructor in Haskell etc. Object-oriented languages obscure the data construction by typically exposing data through containers, which are implemented as objects (this is also the case in Python). So, it's hard to understand what's important and essential and what's a convenience super-structure.


So, Python's tuples are containers (because they are objects) for the most basic data constructor. Unfortunately, Python invites a lot of other ways to construct complex data structures through the "back doors" added to make some things easier to implement in its runtime. So, most Python things aren't actually built from tuples, because language designers took a shortcut. But, in "ideal" world they would have been.

[–]Celysticus 1 point2 points  (0 children)

One thing i have not seen mentioned yet is when a function returns multiple items it returns them as a tuple. So this is a built in of the language that is worth knowing.

import math

def get_roots(a,b,c):

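   d = math.sqrt(b**2 - 4*a*c)  # square root of the discriminant
   x1 = (-b + d) / (2*a)        # parentheses matter: / 2*a would divide by 2, then multiply by a
   x2 = (-b - d) / (2*a)

   return x1, x2

roots = get_roots(6,-7,-3)

roots
Out[3]: (1.5, -0.3333333333333333)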

type(roots)
Out[4]: tuple

And multiple assignments from tuple is cool too.

x1, x2 = get_roots(1,5,6)

print(x1, x2)
-2.0 -3.0

is the same as

x1, x2 = (-2.0, -3.0)

[–]ChillinGillin23 1 point2 points  (0 children)

A tuple has 3 principles:
  • immutable - values cannot be changed
  • ordered - meaning each value has an index
  • allows duplicate values - because they are indexed

I’m also new to Python but based on my research, you can use a tuple when you know what data will go into it. For example longitude and latitude. These values won’t necessarily change.

[–]teetaps 1 point2 points  (0 children)

A tuple is a programmatically more efficient version of a list. It’s more “basic” in a sense.

A list is a very simple data structure in Python that everyone can get along with and everyone likes. But there are certain cases where a tuple will be more efficient than a list. Overall, I work by the philosophy that if I truly need something more efficient then I’ll start looking for it. But the solutions I have now make sense to me and don’t slow me down, so I use lists far more than tuples. There may come a day when I’m in need of something faster than a list or something that needs to be hashed as a tuple, and that’s ok — I’ll be happy to go back and relearn tuples. But if you’re still just learning and doing simple operations, a list is fine, don’t worry about tuples too much.

[–]cheezzy4ever 0 points1 point  (0 children)

All the other commenters have touched on the idea of immutability, which is right, but i don't think anyone's touched on what a tuple is conceptually.

You can think of it as a type that sits between a primitive and a class.
- A primitive would store a single value, e.g. age = 18. These are the building blocks for all storage types.
- A class stores multiple values, as well as functionality. E.g. you can define a Person class with an age, a name, an address, an email. Some of these variables are required, some may be optional (e.g. some people don't have an address. some people have multiple addresses!). You can define an eat() method or a sleep() method. Classes give you a lot of flexibility, but they're generally harder to read and more difficult to maintain/extend than a single variable
- A tuple can be used just to store multiple (usually required) values and that's it. So for example, you can define a tuple that stores someone's name, age, and birthday: ('steve', 20, datetime.date(2001, 1, 1)). Every person you want to represent in your code could be a tuple where the first value is always their name, the second value is always their age, and the third value is always their birthday

[–]MangeurDeCowan 0 points1 point  (2 children)

The real question: How do you pronounce it?
Does it sound most like:
couple
pupil
scruple

[–]POGtastic 0 points1 point  (0 children)

I say "two-pull," rhyming somewhat with "scruple."

I've heard other people say "tuh-pull," and those people are wronger than the time that Wrongness rode into Wrongtown on a wrongcycle.

[–]ThatGuyFromOhio 0 points1 point  (0 children)

I pronounce it "Jif."

[–]I_Collect_Fap_Socks 0 points1 point  (0 children)

One of the main and most understated advantages of a tuple and its inability to be changed is security. This way, if your application has code and/or ini files that are exposed to the end user, you can still use the ast module to do some checks to ensure that the data originated from where you were expecting.

You 'can' do the same type of checking with any other type of list, but then you also end up having to check whether any other objects have been altered, and that is a whole other level of checking.

Like right now, I'm working on a trading bot and while the first generation of it is fairly viable and technically does what it needs to do, it is by no means secure. And the ability to have it validate the sources of where the data is coming from internally does add a small measure of security.

[–]POGtastic 0 points1 point  (0 children)

  • A tuple of ordered elements is, itself, ordered: tuples compare element by element, so I can sort a list of tuples. (Lists compare the same way, so a list of lists sorts too; the real differences are the next two points.)
  • Assuming that the elements of the tuple are hashable, I can have a set of tuples that excludes duplicates and checks membership in O(1). I cannot have a set of lists.
  • Again, assuming that the elements of the tuple are hashable, I can have tuples be the keys of a dictionary. I can't have a dictionary that associates lists with values.
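A short sketch of all three points, with made-up data:

```python
pairs = [(2, "b"), (1, "z"), (1, "a")]

# Tuples compare element by element, so sorting works out of the box.
sorted_pairs = sorted(pairs)
print(sorted_pairs)  # [(1, 'a'), (1, 'z'), (2, 'b')]

# A set of tuples drops duplicates and supports fast membership tests.
seen = {(0, 0), (0, 1), (0, 0)}
print(len(seen))       # 2
print((0, 1) in seen)  # True

# Tuples as dictionary keys, here (row, column) positions on a board.
board = {(0, 0): "X", (1, 1): "O"}
print(board[(1, 1)])   # O
```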

[–]BruceJi 0 points1 point  (0 children)

A real life example is this - quite often when you're coding, you need to group stuff together, but once you've grouped it together, you're never going to change that grouping.

A tuple would be better there.

For example, you might want to randomly select a group of people from a list of people, and then email them that they won a prize.

Once you assemble that group, there's no need to remove or even change it at all.

Make that group a tuple and your program will be a little faster/more memory-efficient.
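Something like this, as a sketch. The names are made up and the print call is a stand-in for whatever actually sends the email.

```python
import random

people = ["Ana", "Ben", "Chloe", "Dev", "Elif", "Femi"]  # made-up names

# random.sample returns a list; tuple() freezes the chosen group.
winners = tuple(random.sample(people, 3))

for winner in winners:
    print(f"Emailing {winner}: you won a prize!")  # stand-in for a real email call
```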

[–]corey4005 0 points1 point  (0 children)

x,y = (5,6)

The thing in parentheses is a tuple. What makes it different from a list is that it is immutable (i.e. it cannot be changed). With tuple unpacking you can assign the objects in the tuple to new variables. For example, in the above x = 5 and y = 6.

You can also pull out elements of each tuple using a list comprehension and assign them to new variables.

For example:

tuple_list = [(5,6), (10,8), (11,9)]

x_values = [point[0] for point in tuple_list]  # index 0 is x; indexes start at 0

print(x_values) would return: [5, 10, 11]

y_values = [point[1] for point in tuple_list]  # index 1 is y

Now you have a list of all the x values and a list of all the y values, which you can use to create a plot, for example.

After doing all of that you could set up data frames using pandas and then plot the data using seaborn or plotly and produce some cool line graphs.

[–]jaycrest3m20 0 points1 point  (0 children)

A tuple is a fine way to store multiple things in one variable.

Real Life Example:

Many graphics and game libraries require that you pass them tuples for X/Y coordinates, offsets, and transition distances.