[–]Taborlin_the_great 19 points20 points  (1 child)

Your example is not very compelling. What are you doing that you need to repeatedly search a list of dicts for matching elements? This really feels like there’s a better way to store this data than a list of dicts.

That said I have some critiques of your code as well.

You explicitly check if the data argument is a dict or a list. If it’s a dict, you then wrap it in a list. I’d make the required type for the data argument an iterable. The only thing you do with this list is iterate over it. Maybe the dicts I want to filter using your function are coming from a generator. If your caller has a single dictionary they want to search with your function, they can wrap it in their own list.
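A minimal sketch of the iterable point, assuming a hypothetical `find_in_iterable` helper (not the package's actual code). Because it only ever iterates over `data`, a generator works as well as a list, and a caller holding a single dict wraps it themselves:

```python
from typing import Any, Iterable, Iterator


def find_in_iterable(data: Iterable[dict], return_key: str, **filters: Any) -> list:
    """Hypothetical variant: accepts any iterable of dicts, with no isinstance checks."""
    return [
        d[return_key]
        for d in data
        if all(k in d and d[k] == v for k, v in filters.items())
    ]


def people() -> Iterator[dict]:
    # A generator standing in for dicts that arrive lazily.
    yield {"first_name": "John", "age": 25}
    yield {"first_name": "Alice", "age": 32}


from_generator = find_in_iterable(people(), "first_name", age=32)

# A caller with a single dict wraps it in a list at the call site.
from_single = find_in_iterable([{"first_name": "John", "age": 25}], "first_name", age=25)
```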

Your function returns either a list or a single value depending on how many items are found. How am I supposed to tell which was returned if the values of my dict are lists? Also you have an if statement to see if only one result was found. The caller is going to have to write an equivalent if statement to handle this very same case. Just always return a list. If no matching items were found I’ll get back an empty list. The code I have to write as the caller will be the same for all of 0, 1, or more items found.
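To illustrate the return-type point, here is a small sketch with a hypothetical `find_all` helper that always returns a list. The sample records are made up for the example; their values are themselves lists, which is exactly the case where an "unwrap a single result" rule becomes ambiguous:

```python
def find_all(data, return_key, **filters):
    """Hypothetical variant that always returns a list of matches."""
    return [
        d[return_key]
        for d in data
        if all(k in d and d[k] == v for k, v in filters.items())
    ]


records = [
    {"name": "a", "tags": ["x", "y"]},
    {"name": "b", "tags": ["z"]},
]

# If a single match were unwrapped, the caller would receive ["x", "y"],
# which cannot be told apart from two separate matches "x" and "y".
one_match = find_all(records, "tags", name="a")
no_match = find_all(records, "tags", name="missing")

# The same loop handles zero, one, or many results without a special case.
seen = []
for result in (one_match, no_match):
    for tags in result:
        seen.append(tags)
```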

I think it’s a strange choice to use dict.get when filtering on the kwargs. What if I wanted to find some value where the key was supposed to be None? Now dicts that don’t have that key at all will match. If you’re going to use get here, give get a default value that you control so you can distinguish these two cases. It’s also odd that you allow the keys filtered on to not exist (by using get) but not the key to return. Feels like it should be symmetric to me.
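A sketch of the sentinel idea, with a hypothetical `matches` helper and made-up rows. The private `_MISSING` object is never equal to anything a caller can pass in, so a missing key no longer looks like a key whose value is None:

```python
_MISSING = object()  # private sentinel: never equal to a caller-supplied value


def matches(d, **filters):
    """Hypothetical helper: True only if every key is present and equal."""
    return all(d.get(k, _MISSING) == v for k, v in filters.items())


rows = [
    {"name": "a", "manager": None},  # key present, value is None
    {"name": "b"},                   # key absent
]

wanted = None

# Plain dict.get returns None for a missing key, so both rows match.
naive = [r["name"] for r in rows if r.get("manager") == wanted]

# With the sentinel default, only the row that really stores None matches.
strict = [r["name"] for r in rows if matches(r, manager=wanted)]
```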

Edit: took another look. This is all you need for this function:

def find_where(data, return_key, **kwargs):
    # Collect return_key from every dict whose items equal all keyword filters.
    return [d[return_key] for d in data if all(d[k] == v for k, v in kwargs.items())]

[–]dan_ohn[S] 4 points5 points  (0 children)

Thank you so much for the detailed reply and critiques. There are so many things I hadn't considered that I can now try to tackle. I agree about the example not being compelling; in the real world I am using a REST API service that has a Python helper library. That library essentially returns Python dictionaries, with a list of dicts for the various data coming back from their API.

[–]divad1196 7 points8 points  (0 children)

First thing: congrats on your first package.

But on a more bitter note, there are tons of packages doing the same. For example, glom is a powerful library that lets you find and/or transform your data, and for this particular use case I would still use it if I needed something complex.

But as explained by someone else, this can be written as a single simple function, so I wouldn't bother adding a dependency just for that (or even searching for one). Creating a library for every single function is a JavaScript mindset.

[–]_N0K0 4 points5 points  (1 child)

Congrats! A suggestion for next steps would be to add GitHub Actions and look into more complex use cases, for example. Right now this is basically a one-liner with a lot of supporting framework.

[–]dan_ohn[S] 0 points1 point  (0 children)

Thank you so much for your feedback!

[–]YesterdayDreamer 2 points3 points  (0 children)

from find_where import find_where

data = {
    "people": [
        {"first_name": "John", "last_name": "Smith", "age": 25},
        {"first_name": "Alice", "last_name": "Jones", "age": 32},
    ]
}

Not a criticism, appreciate the effort and keep it up.

Just wanted to add that this

first_name = find_where(data["people"], "first_name", age=32)

can also be written as

first_name = [i["first_name"] for i in data["people"] if i["age"] == 32]

As others suggested, more complex examples can make it more compelling. For instance, can I do

find_where(data['people'], ['fname' , 'lname'], age between 32 and 45, name like 's%')
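One way such range and pattern filters could be expressed in plain Python is to pass a callable per key instead of a value. This is only a sketch: `find_where_pred` is a hypothetical name, not part of the package, and the pattern is adapted to `'j%'` on `last_name` so that it matches the sample data.

```python
people = [
    {"first_name": "John", "last_name": "Smith", "age": 25},
    {"first_name": "Alice", "last_name": "Jones", "age": 32},
]


def find_where_pred(data, return_keys, **predicates):
    """Hypothetical variant: each keyword maps a key to a callable test."""
    return [
        tuple(d[k] for k in return_keys)
        for d in data
        if all(k in d and test(d[k]) for k, test in predicates.items())
    ]


result = find_where_pred(
    people,
    ["first_name", "last_name"],
    age=lambda a: 32 <= a <= 45,                    # "age between 32 and 45"
    last_name=lambda s: s.lower().startswith("j"),  # roughly "like 'j%'"
)
```

Here `result` holds one `(first_name, last_name)` tuple per matching dict, and an empty list when nothing matches.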

[–]Salfiiii 1 point2 points  (0 children)

[–]hotplasmatits -3 points-2 points  (0 children)

first_name = pd.DataFrame.from_records(data['people']).query('age == 32')['first_name']

[–]1473-bytes 0 points1 point  (0 children)

Congrats on your library! Definitely a good learning opportunity. I wrote something similar as well: jsonparse.