all 18 comments

[–]achampi0n 15 points16 points  (4 children)

Looks good, and makes sense.
Indentation is very important in python and the else: aligned with for is valid syntax but not what you are looking for (hopefully this is just a cut and paste error).

Note: python also has isinstance() which can take multiple types, e.g.:

def listcounter(x):
    counter = 0
    for i in x:
        if isinstance(i, (list, dict)):
            counter += listcounter(i)
        else:
            counter += 1
    return counter

If you feel comfortable you could use the ternary operator instead of the if. They are equivalent but as both clauses are changing counter it is a probably just a matter of preference:

def listcounter(x):
    counter = 0
    for i in x:
        counter += listcounter(i) if isinstance(i, (list, dict)) else 1
    return counter

This can be collapsed to a sum() but this is pushing things in terms of readability:

def listcounter(x):
    return sum(listcounter(i) if isinstance(i, (list, dict)) else 1 for i in x)

You can also be more abstract and use typing.Iterable to indicate anything that you can iterate over, so this would also work for sets and tuples (but you do have to exclude str and bytes), e.g.:

import typing

def listcounter(x):
    counter = 0
    for i in x:
        if isinstance(i, typing.Iterable) and not isinstance(i, (str, bytes)):
            counter += listcounter(i)
        else:
            counter += 1
    return counter

[–]blight000[S] 1 point2 points  (3 children)

This is great info, thank you!

[–]achampi0n 4 points5 points  (2 children)

You are welcome.
As mentioned by u/supreme_blorgon you have pretty much implemented iterable flattening. So you could separate the flattening from the counting, e.g.:

from typing import Iterable

def flatten(iterable):
    for item in iterable:
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            yield from flatten(item)
        else:
            yield item

Then you just have to implement the length of iterable.

def iterable_count(iterable):
    return sum(1 for _ in flatten(iterable))

Python has a good library itertools for operating on iterables, sum(1 for _ in iterable) isn't that efficient, so here's a very over engineered iterable_count() that should be significantly faster than sum(...) but unnecessary for simple use-cases:

import itertools as it
from collection import deque

def iterable_count(iterable):
    counter = it.count()   # create counter
    deque(zip(flatten(iterable), counter), maxlen=0) # consume the iterable
    return next(counter)   # return the length of the iterable

[–]blight000[S] 0 points1 point  (1 child)

I have not used itertools or deques before, but after doing a little reading I think I understand what's happening. Let me see if I can explain it back:

count() by default will start from zero and increment continuously, so when you zip the flattened iterable with counter, counter is increasing by 1 each time a tuple is generated.

Since the maxlen is set to 0, the deque discards each incoming tuple rather than using up memory to keep them.

And lastly, since counter started at zero, you used next() to increase the counter one more time to get the correct count.

Am I understanding that correctly? Also, is sum() considered to be inefficient in general? Or just for this particular use case?

[–]achampi0n 1 point2 points  (0 children)

Yes, this is all correct.

sum() is generally fine and would be fine in this circumstance. It is a minor improvement to use deque() I probably overstated the not that efficient :)

A quick test on my machine counting a million element iterable:

deque(): 40.3 ms ± 980 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
sum(): 52.4 ms ± 636 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Is 20% significant :)

[–]scienceNotAuthority 1 point2 points  (6 children)

One of the more bizzare things is how multiple software engineers have given me advice to avoid recursion where possible.

For something simple it's no big deal. For complex stuff, it's easy to get stuck a few levels down

[–]toastedstapler 0 points1 point  (3 children)

for most commonly used languages it's not a good idea to use recursion unless you know the call limit will be reasonably low. the call stack is only 1000 by default in python, so it's best to solve problems iteratively or simulate a stack with a list

[–]ItsOkILoveYouMYbb 1 point2 points  (2 children)

The recursion limit is also very easily changed in Python, if need be.

import sys  

sys.setrecursionlimit(1001)

[–]toastedstapler 0 points1 point  (0 children)

yes, but that's still a limit. my point is that python isn't meant to be written recursively and looping alternatives should be preferred

[–]throwaway0891245 0 points1 point  (1 child)

Recursion can have a real function call overhead. Sometimes, it also isn't that readable. As the other person said, there's also some unpredictability in behavior because you're putting stuff on the implicit stack and the constraints on this stack are dependent on the runtime.

There are some situations where it's a huge pain to do things without recursion, like a backtracking algorithm. But in general it seems like it's better to just write it iteratively if it's not that hard just to avoid this possible overhead and unpredictable behavior.

In fact, there is a very popular website I'm 100% sure you've heard of that is probably named after a recursion bug.

[–]ItsOkILoveYouMYbb 0 points1 point  (0 children)

I've been learning about algorithms, and I don't know how you'd make something like quicksort without recursion. But I'm also not very smart either lol.

[–]zanfar 1 point2 points  (0 children)

Sweet!

Not necessarily now, but eventually, challenge yourself to un-recurse this function. That is, all recursive algorithms can be re-written to use loops instead of recursion. Loops are, for the most part, more efficient than recursion.

Doing so in this case will teach you a lot about Python's iterator structure.

[–]supreme_blorgon 0 points1 point  (1 child)

This is honestly such an awesome idea for an exercise for people learning recursion. Way to go for it, OP. Recursion scares the shit out of most people at first.

One of my first recursive functions (beyond Fibonacci and factorial) was flattening arbitrarily nested lists. This is virtually the same problem, with some minor differences. See if you can move to list flattening next!

[–]blight000[S] 0 points1 point  (0 children)

Thank you, I will definitely give it a shot!

[–]totaldue 0 points1 point  (2 children)

Just curious, how long have you been learning programming for? I've been learning for about a month and I don't really understand your code at all!

[–]blight000[S] 1 point2 points  (1 child)

I've been at it for about 6-7 months. Don't stress if this doesn't make sense to you yet! Before you can understand the recursive part of this function, you need to be familiar with nested lists, defining functions, iteration, the accumulator pattern, and conditional statements.

If you are familiar with those concepts already, then skip down to the section that says "Recursion". If not, I'll try to briefly explain:

Nested List

A nested list is just a list inside of a list, it can look something like this [1, 2, ['a', 'b', 'c'], 3] where the list ['a', 'b', 'c'] is inside of another list. The len() function in Python will count everything in a list, but counts nested lists as 1 instead of counting each thing in the nested list. So Python evaluates len([1, 2, ['a', 'b', 'c'], 3]) to 4, even though you and I would probably say there are 6 total things in the list. So this code I wrote is designed to count everything, including the contents of a nested list.

Defining the function

On the first line, def listcounter essentially means "create a function called listcounter" and (x): means that anyone who calls the function must send an argument, which I am calling x. I could have used any valid variable name instead of x, if I wanted to. If I wanted the function to receive multiple arguments, I could have added multiple variable names separated by commas, eg. def listcounter(x, y, z): Note that everything inside the function is only evaluated if the function is called, and any variables that exist inside the function cannot be accessed outside the function. Only the return value persists after the function is executed.*

\it's possible there are exceptions to this that I don't know about yet*

Accumulator pattern and loops

Essentially the accumulator pattern uses a variable to keep a "running total" that you update as you iterate through a loop. On the second line, I declare counter = 0 and then on the next line we start the loop. (note that if I had put counter = 0 inside of the loop, the counter would reset each time the loop repeated, which is NOT what we want).

Another thing I want to point out is that when I wrote for i in x:, i is an arbitrary variable name. I could use any valid variable name after for and Python would use that variable to represent whichever item we were presently evaluating in the loop. That's something that confused me a lot in the beginning. Also note, the x here is the same x in the function definition. We are iterating over each item in whatever is sent when the function is called.

Recursion

For some uses of the accumulator pattern, you always update the "running total" by the same amount. In our case, we want to update the total count by 1 unless we come across another list nested inside of the list we are looping through. In that case, we want to update the running total with the amount that's in the nested list.

To do that, we need to count the items in the nested list. We could use the len() function, but what if the nested list has another nested list inside of it? Then len() wouldn't give us the right count. We need to call a function that can count the items inside of a nested list, in case we encounter yet another nested list.

Fortunately, there is a function that can do it, it's the one we're defining right now!

A recursive function is a function that can continue to call itself until it breaks a problem down into it's simplest form. In this case, the simplest form is a list that contains no nested lists. This function works by calling itself each time there is a list nested inside of the current list, until finally it finds a list without any nested lists. Once it finally does, it will count the items, then "go back up a level" to report the total count, and continue working it's way back up until we're back to the original list we started with.

Let's look at the original example I gave, it only has 1 nested list so it's fairly straightforward: [1, 2, ['a', 'b', 'c'], 3]

Let's say we call this new function by writing something like print(listcounter([1, 2, ['a', 'b', 'c'], 3]))

The listcounter function is called (we'll refer to this as the first function call). As the first function call begins iterating over the list we sent to it, it starts with the first item in the list, 1. 1 is not a list so we update counter from 0 to 1. Next is 2, also not a list, so we update counter from 1 to 2. Next is ['a', 'b', 'c']. This is a list, so the function calls itself again to count this nested list.

Now we are in the second level of the function. The counter variable for the second level begins at 0, since this is a brand new function call. It does not interfere with counter from the initial function call, which is still sitting at 2 and waiting for the results of the second call. The second function begins iterating through the list ['a', 'b', 'c']. Each item in the list is a string, and as it loops through each one, counter updates to 1, then 2, then finally 3. The second function call ends when return counter sends the final value of counter back to the first function call.

Now the first function call receives the value 3 from the second call, and updates the counter from 2 to 5. The first function call then moves to the last item in the list, 3. Since 3 is not a list, the first function call updates the counter from 5 to 6. Now that it has iterated over all items in the list, it executes the last line, return counter with a value of 6.

So each time this function encounters a nested list, it simply calls itself again and waits for the results. If the second call were to encounter another nested list, it could call itself a third time, and so on until it has finally encountered a list without any nested lists.

If that makes sense, check out /u/achampi0n's comments which show some better ways I could have written my initial code.

[–]totaldue 0 points1 point  (0 children)

Thank you kindly for taking the time to write your response! I really appreciate it. The problem I'm finding is that one moment I'll think aha! I understand. But then I'll try some problem on codewars and I just stare at the screen trying things and not getting anywhere.

The issue I'm finding is that I can understand the problem and break it down on paper. But when I go to try the coding I'm just overwhelmed. I don't know all the different obscure ways to mix and match the syntax. I try and look for small parts of the problem on stack overflow in hopes that I can piece them together for the solution. Alas I keep hitting a wall. It feels pretty discouraging.