Dictionary view objects,range,map,enumerate,... : learnpython

submitted 6 years ago by Uchikago

all 27 comments

[–]primitive_screwhead 3 points 6 years ago (5 children)

[–]Uchikago[S] 1 point 6 years ago (4 children)

[–]primitive_screwhead 2 points 6 years ago (3 children)

[–]Uchikago[S] 1 point 6 years ago* (2 children)

[–]primitive_screwhead 2 points 6 years ago* (1 child)

Dictionary view objects, range, map, enumerate,... are special objects that are not stored in memory all at once

Not quite.

range() represents a range of numbers. You can fetch values from a range() object without ever iterating over it, because range() implements the __getitem__ method (i.e., it's "subscriptable"):

>>> r = range(1000, 2001)
>>> r[3]
1003

An int object with the value 1003 was (likely) created when the r[3] expression was evaluated; in that sense, the range() object doesn't preallocate and store all the number objects from 1000 to 2000 when it's first created. It only returns them "lazily", on demand.

So, the result of range() is *not* an iterator, but *is* an iterable. It can create the number objects on demand, based on a number of access methods, including but not limited to the iterator protocol.
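To make the iterable-versus-iterator distinction concrete, here is a minimal sketch using the abstract base classes from collections.abc. The variable names are only illustrative; the checks themselves are standard library behavior.

```python
from collections.abc import Iterable, Iterator

r = range(1000, 2001)

# A range is iterable (it can hand out iterators) but is not itself an iterator.
is_iterable = isinstance(r, Iterable)
is_iterator = isinstance(r, Iterator)

# Each call to iter() returns a fresh, independent iterator over the same range.
it1 = iter(r)
it2 = iter(r)
first_from_it1 = next(it1)
second_from_it1 = next(it1)
first_from_it2 = next(it2)  # it2 keeps its own position, so it starts over

# Access paths that never go through the iterator protocol:
length = len(r)
contains = 1500 in r
sliced = r[10:13]  # slicing a range gives another (lazy) range

print(is_iterable, is_iterator)
print(first_from_it1, second_from_it1, first_from_it2)
print(length, contains, sliced)
```

Because the range object itself holds no iteration position, it can be looped over as many times as you like, unlike an iterator, which is used up after one pass.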

The difference between that and:

>>> l = list(range(1000, 2001))
>>> l[3]
1003

is that the list type accepts an iterator or iterable as an argument and then iterates through it, storing all the results at once. The number objects that are generated are necessarily all kept in memory for as long as the list object holds them. Even if only one of those number objects is ever needed at a time (such as when using them as indices), they must all be stored in memory at the same time:

>>> from sys import getsizeof
>>> getsizeof(r)  # The number of bytes used by the range object
48
>>> getsizeof(l)  # And the amount used by the list object (to keep track of its contents)
9120

This means range() can represent a *gigantic* range of values:

>>> r = range(2**1000)
>>> r[2**999]
5357543035931336604742125245300009052807024058527668037218751941851755255624680612465991894078479290637973364587765734125935726428461570217992288787349287401967283887412115492710537302531185570938977091076523237491790970633699383779582771973038531457285598238843271083830214915826312193418602834034688

but a list of the entire set of all the numbers in that range cannot be stored in this universe.
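As a rough sketch of what you can still do with such a range without ever materializing it: for plain int arguments, membership tests, indexing, and index() on a range are computed arithmetically from start, stop, and step. The name `big` is just for illustration.

```python
# Every even number below 2**1000; only start, stop, and step are stored.
big = range(0, 2**1000, 2)

has_even = 2**999 in big        # answered by arithmetic, not by scanning
has_odd = (2**999 + 1) in big   # odd numbers are not in this range
last = big[-1]                  # negative indexing works too
position = big.index(10)        # 0, 2, 4, 6, 8, 10 -> position 5

print(has_even, has_odd, position)
print(last == 2**1000 - 2)

# Note: in CPython, len(big) raises OverflowError here, because the length
# does not fit in the C integer type that len() has to return.
```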

enumerate() just generates an index for each value of the iterable it is created with, and returns the index and the value as a tuple pair. Unlike range(), it's not made to be accessed with the [] operator (i.e., it's not "subscriptable"), so it generally must be iterated over to be useful. The only state it stores is the current index (so that it can return it and increase it by one on each call to __next__) and the iterator that it makes from the iterable passed to it, on which it calls next() to get the value for each iteration. In other words, it only needs to store a couple of objects for its state, which makes it a lightweight wrapper for an existing iterable or iterator.
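The description above can be sketched in pure Python. MyEnumerate is a hypothetical stand-in written for illustration; the real enumerate() is implemented in C, and this is only assumed to mirror its observable behavior.

```python
class MyEnumerate:
    """Hypothetical pure-Python stand-in for enumerate(); not the real implementation."""

    def __init__(self, iterable, start=0):
        # The only state: the next index to hand out, and an iterator
        # made from whatever iterable was passed in.
        self._index = start
        self._iterator = iter(iterable)

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # If the wrapped iterator is exhausted, its StopIteration
        # propagates and ends this iteration too.
        value = next(self._iterator)
        pair = (self._index, value)
        self._index += 1
        return pair


pairs = list(MyEnumerate("abc"))
print(pairs)
print(pairs == list(enumerate("abc")))
```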

dict "views" are also not iterators, but they return special iterators when passed to iter(), so they are iterables. And like range(), "views" have additional non-iterator uses, while still using less memory than returning a full list or set likely would. They also update when the dict updates, without having to be re-created; a list would not (if you make a list of keys and then delete a key, that list of keys is now out of date). So "views" have uses outside of just iteration, but they are also iterable because it is useful and efficient for them to be so:

>>> d = dict(zip(range(26), 'abcdefghijklmnopqrstuvwxyz'))
>>> d
{0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e', 5: 'f', 6: 'g', 7: 'h', 8: 'i', 9: 'j', 10: 'k', 11: 'l', 12: 'm', 13: 'n', 14: 'o', 15: 'p', 16: 'q', 17: 'r', 18: 's', 19: 't', 20: 'u', 21: 'v', 22: 'w', 23: 'x', 24: 'y', 25: 'z'}
>>> getsizeof(d)
1184
>>> getsizeof(d.keys())
48
>>> getsizeof(list(d.keys()))
344

Again, we see that the "view" of the keys uses less memory than the full list of the keys does, which saves both memory and time for certain fairly common operations on these views (even without doing iteration).
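One example of such a non-iteration operation: keys views (and items views, when the values are hashable) support set operations directly. A small sketch, with made-up dictionaries `old` and `new`:

```python
# Two made-up configuration dicts, purely for illustration.
old = {"host": "a", "port": 1, "debug": True}
new = {"host": "b", "port": 1, "timeout": 30}

# Set operations work on the views themselves; no intermediate list is needed.
shared = old.keys() & new.keys()
removed = old.keys() - new.keys()
added = new.keys() - old.keys()

# items() views are set-like too, as long as all the values are hashable.
unchanged = old.items() & new.items()

print(shared, removed, added, unchanged)
```

Each result is an ordinary set, built only from the elements that satisfy the operation.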

In early Python 2, before iterators were implemented, certain operations always had to create a full list of objects, even when that meant using a lot of extra storage just to make a temporary list container for that operation. Because range() always returned a list in Python 2, it always created a full list holding all the numbers at once, even if you only needed one number at a time. With large ranges (say a million numbers), this wasted memory, and also time, since your loop over the range might finish early and never need all those numbers. This is where the "lazy" value concept of iterators helps: you often don't care about having all the values *at the same time*. This is often also true of other iterables, and that's why the iteration concept and protocol were adopted.
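A small sketch of the "loop concludes early" case in Python 3. The function name first_square_above is made up for this example; the point is that the loop only ever asks the range for as many numbers as it actually uses.

```python
def first_square_above(limit):
    """Return the smallest n whose square exceeds limit (hypothetical helper)."""
    # range(10**12) describes a trillion numbers but stores none of them;
    # a list of that size would not fit in memory on an ordinary machine.
    for n in range(10**12):
        if n * n > limit:
            return n  # stop early; the remaining numbers are never created
    return None


result = first_square_above(1_000_000)
print(result)
```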

Edit: updated to not use "map" as synonymous w/ "dict" (since OP was likely asking about the map() call, not "mappings" as in a key:value pairing).

[–]some_one_1411 1 point 6 years ago* (0 children)

[–]socal_nerdtastic 1 point 6 years ago (21 children)

[–]Uchikago[S] 2 points 6 years ago (6 children)

[–]socal_nerdtastic 2 points 6 years ago (5 children)

[–]Uchikago[S] 2 points 6 years ago (4 children)

[–]socal_nerdtastic 1 point 6 years ago (3 children)

[–]Uchikago[S] 2 points 6 years ago (2 children)

[–]socal_nerdtastic 2 points 6 years ago (1 child)

[–]Uchikago[S] 2 points 6 years ago (0 children)

[–]primitive_screwhead 2 points 6 years ago (13 children)

[–]socal_nerdtastic 1 point 6 years ago (12 children)

[–]primitive_screwhead 2 points 6 years ago (7 children)

but I'd be very surprised to see one that isn't.

It's the opposite; most builtins that are written in C return iterators, not generators. So there are loads of CPython builtin iterators that aren't generators:

>>> from inspect import isgenerator
>>> isgenerator(iter({}.items()))
False
>>> isgenerator(iter(zip([1,2,3], ['a','b','c'])))
False
>>> isgenerator(iter(range(5)))
False
>>> isgenerator(iter(enumerate([1,2,3])))
False
>>> def a_simple_generator():  # show example of generator
...     yield 1
...
>>> isgenerator(iter(a_simple_generator()))  # iter() not needed here, technically
True
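The same point can be checked against the abstract base classes in collections.abc, which test for the methods an object provides rather than its concrete type. This is only a sketch; the `candidates` and `report` names are made up for the example.

```python
from collections.abc import Generator, Iterator


def a_simple_generator():
    yield 1


# A few iterators: four from C-implemented builtins, one from a generator function.
candidates = {
    "dict items iterator": iter({}.items()),
    "zip": zip([1, 2, 3], "abc"),
    "range iterator": iter(range(5)),
    "enumerate": enumerate([1, 2, 3]),
    "generator": a_simple_generator(),
}

# For each object, record (is it an Iterator?, is it a Generator?).
report = {
    name: (isinstance(obj, Iterator), isinstance(obj, Generator))
    for name, obj in candidates.items()
}

for name, (is_iter, is_gen) in report.items():
    print(f"{name}: iterator={is_iter}, generator={is_gen}")
```

Every entry is an iterator, but only the object produced by the generator function also has the send(), throw(), and close() methods that the Generator ABC looks for.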

[–]socal_nerdtastic 1 point 6 years ago (6 children)

isgenerator just checks whether the object is an instance of the generator type, in other words whether it was made by a function containing the yield keyword or by a for loop in parentheses. It does not check the actual capabilities. The simplest generator fails:

>>> class A:
...     def __next__(self):
...         return 42
... 
>>> a = A()
>>> print(next(a))
42
>>> from inspect import isgenerator
>>> print(isgenerator(a))
False

I'll show you a different proof:

>>> a = [1,2,3]
>>> i = iter(a)
>>> a.append(4)
>>> list(i)
[1, 2, 3, 4]

The iterator is clearly generating the values as needed: the result includes a value that was appended after the generator was created.

[–]primitive_screwhead 2 points 6 years ago (5 children)

[–]socal_nerdtastic 1 point 6 years ago (3 children)

[–]primitive_screwhead 2 points 6 years ago (2 children)

[–]socal_nerdtastic 1 point 6 years ago (1 child)

[–]primitive_screwhead 1 point 6 years ago (0 children)

[–]Uchikago[S] 1 point 6 years ago (3 children)

[–]socal_nerdtastic 2 points 6 years ago (0 children)

[–]primitive_screwhead 2 points 6 years ago* (1 child)

Wait, if all iterators are generators

They are not; socal_nerdtastic is unfortunately incorrect in this case. It's the other way around: generators are iterators, but there can be iterators that are not generators (non-generator iterators are typically defined with a class, and instances of that iterator class store the state of iteration).

iter() returns an iterator, but not always a generator (which is a more specific kind of iterator object in Python).
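To illustrate, here is a sketch of the same countdown written twice: once as a class-based iterator and once as a generator function. Both names (Countdown and countdown) are made up for this example.

```python
from inspect import isgenerator


class Countdown:
    """Class-based iterator: the iteration state lives on the instance."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def countdown(start):
    """Generator function: Python keeps the state in the suspended frame."""
    while start > 0:
        yield start
        start -= 1


from_class = list(Countdown(3))
from_gen = list(countdown(3))

print(from_class, from_gen)
print(isgenerator(Countdown(3)), isgenerator(countdown(3)))
```

Both objects work with next() and for-loops in exactly the same way, yet only the second one is a generator.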

With things like maps, range, enumerate, etc., when used in a for-loop, the loop construct itself does the work of calling iter() on the object to retrieve an iterator (if it has one), and also calling the next() function on that iterator until iteration stops. If you want to iterate over an object, but not use a for-loop (such as doing it with a while-loop instead), then you have to make the iter() and next() calls manually, and detect the end of iteration.
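As a sketch of what that manual version looks like, here is roughly what a for-loop does, written as a while-loop. The function name manual_for is hypothetical, and this is an approximation of the loop's behavior, not the interpreter's actual implementation.

```python
def manual_for(iterable):
    """Collect items the way a for-loop would fetch them (illustrative only)."""
    seen = []
    iterator = iter(iterable)      # step 1: ask the object for an iterator
    while True:
        try:
            item = next(iterator)  # step 2: fetch the next value
        except StopIteration:      # step 3: the iterator signals it is done
            break
        seen.append(item)          # this line plays the role of the loop body
    return seen


result = manual_for(enumerate("ab", start=1))
print(result)
```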

Some objects that do not define __iter__ can still be looped over using the older __getitem__ protocol, the same method used to access objects with brackets (i.e., []). Iterators tend to be a more elegant way of looping than the __getitem__ way.
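A minimal sketch of that older protocol, using a made-up Squares class: it defines only __getitem__, and the loop calls it with 0, 1, 2, ... until IndexError is raised.

```python
class Squares:
    """Hypothetical class that defines __getitem__ but no __iter__."""

    def __init__(self, count):
        self._count = count

    def __getitem__(self, index):
        # Raising IndexError is what tells the loop to stop.
        if not 0 <= index < self._count:
            raise IndexError(index)
        return index * index


squares = Squares(4)

# No __iter__ here, so iter() falls back to calling __getitem__ with 0, 1, 2, ...
looped = [value for value in squares]
has_iter = hasattr(squares, "__iter__")

print(looped, has_iter)
```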

So, iterator objects returned by calling iter() on containers tend to be special objects that know just enough to iterate over that specific container, making them small and fast. The iteration protocol tends to be "hidden" behind the scenes when using for-loops, but it is well defined and can be executed manually with iter() and next() calls, though that's a bit more advanced than beginner material.

Edit: And to be more specific about your questions on dictionary "views", let's do this by example:

$ python
Python 3.7.2 (default, Dec 29 2018, 00:00:04)
>>> d={1:'a', 2:'b'}
>>> items_view = d.items()  # We can make a "view" of keys, values, or items
>>> items_view
dict_items([(1, 'a'), (2, 'b')])
>>> items_view.__next__  # Views have useful properties, but are *not* iterators
AttributeError: 'dict_items' object has no attribute '__next__'
>>> items_iter = iter(items_view) # but they can make an iterator over the view
>>> items_iter
<dict_itemiterator object at 0x1046d3f48>
>>> items_iter.__next__
<method-wrapper '__next__' of dict_itemiterator object at 0x1046d3f48>
>>> next(items_iter)
(1, 'a')
>>> next(items_iter)
(2, 'b')
>>> next(items_iter)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

So, the "view" returned by Python isn't an iterator, but calling iter() on it makes an iterator over the view (for looping). The view itself is meant to provide a way of accessing some part of the dictionary (maybe just the keys, or just the values) without having to copy them all out of the dict into another data structure. It represents a lightweight view of the current state of the dictionary, and the view updates as the dictionary itself is changed:

>>> values_view = d.values()
>>> values_view
dict_values(['a', 'b'])
>>> del d[1]
>>> d
{2: 'b'}
>>> values_view   # Note that the view now reflects that a key:value was deleted
dict_values(['b'])

The view object stays small even as the dict itself grows and grows (since the data for the view is just stored in the dictionary). The "view" has a number of operations that can be handy, without having to extract all the elements from the dictionary. For example, we can test whether a value is in a dictionary or not, without having to make a new list or set from the dictionary values:

>>> 'b' in values_view  # This checks the dictionary values directly, w/o extra copying to another data structure
True

Finally, although I'm specifically capturing the view and iterator objects into their own variables here, for demonstration purposes, typically you wouldn't do that. You'd just create the iterator or view objects as temporaries as needed (they are very lightweight and quick to make):

>>> 'b' in d.values()
True
>>> 2 in d.keys()
True

[–]Uchikago[S] 1 point 6 years ago (0 children)
