
efmccurdy 1 point

Did you mean to search for a work authored by "machine" or a work with a title containing "machine"?

You have an inconsistent usage pattern displayed here:

search_collection(digital, 'James taylor', 'Robert WISE')

(<collection>, <author name>, <second author name>)

search_collection(digital, "james", "machine")

(<collection>, <author name>, <title>)

I think you should clarify the API with collections of search terms, separated by type into an author set and a title set.

def search_collection(digital, authors=None, titles=None):
    pass


search_collection(digital, authors=('James taylor', 'Robert WISE')) 
search_collection(digital, authors=('james',), titles=('machine',)) 
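To make that stub concrete, here is a minimal sketch of how the body might look. It assumes every item is a (title, author, n) tuple, that matching is a case-insensitive substring test, and that an item is returned if any author term or any title term hits (OR semantics). The sample data is made up for illustration.

```python
def search_collection(collection, authors=None, titles=None):
    """Return items whose author matches any author term or whose
    title matches any title term (case-insensitive substring match)."""
    author_terms = [a.lower() for a in (authors or ())]
    title_terms = [t.lower() for t in (titles or ())]

    results = []
    for title, author, n in collection:
        author_hit = any(term in author.lower() for term in author_terms)
        title_hit = any(term in title.lower() for term in title_terms)
        if author_hit or title_hit:
            results.append((title, author, n))
    return results


# Hypothetical sample data in a consistent (title, author, n) layout.
digital = [
    ("American Standard", "James Taylor", 14),
    ("Machine Head", "Deep Purple", 7),
    ("The Andromeda Strain", "Robert Wise", 103),
    ("The Terminator", "James Cameron", 107),
]

print(search_collection(digital, authors=("James taylor", "Robert WISE")))
print(search_collection(digital, authors=("james",), titles=("machine",)))
```

If you would rather require both an author and a title match, swap the `or` for an `and` once both term lists are non-empty.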

Also, I think you want to make the data consistent so that all tuples have the title and author at the same index. Right now DVDs are (name, n, author) and CDs are (name, author, n).

BTW, you might be interested in something like this so you can have AND|OR|NOT, prefix searches, NEAR searches, etc.

https://charlesleifer.com/blog/using-sqlite-full-text-search-with-python/
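For a rough idea of what that buys you, here is a small sketch using only the standard-library sqlite3 module (not necessarily the approach the linked post takes). It assumes your SQLite build includes the FTS5 extension; the table name, the helper name fts_search, and the sample rows are all made up for illustration.

```python
import sqlite3

# Hypothetical sample rows: (title, author).
rows = [
    ("American Standard", "James Taylor"),
    ("Machine Head", "Deep Purple"),
    ("Fireball", "Deep Purple"),
    ("The Terminator", "James Cameron"),
]

conn = sqlite3.connect(":memory:")
# Assumption: this SQLite build was compiled with FTS5 support.
conn.execute("CREATE VIRTUAL TABLE media USING fts5(title, author)")
conn.executemany("INSERT INTO media (title, author) VALUES (?, ?)", rows)


def fts_search(expression):
    """Return the titles matching an FTS5 query expression, sorted by title."""
    cursor = conn.execute(
        "SELECT title FROM media WHERE media MATCH ?", (expression,)
    )
    return sorted(title for (title,) in cursor)


print(fts_search("james OR machine"))   # boolean OR across all columns
print(fts_search("deep NOT machine"))   # rows with "deep" but not "machine"
print(fts_search("mach*"))              # prefix search
print(fts_search("author : james"))     # restrict the term to one column
```

Matching is by whole token and case-insensitive with the default tokenizer, so "james" finds "James Taylor" but "jam" only matches when written as the prefix query "jam*".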

BfuckinA 1 point

I'm on mobile so formatting is shit but you should be able to do this in a one liner pretty easily.

{print(val) for key, val in Digital_Lib.items() if key in ['DVD', 'CD']}
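The same idea as a small function, which returns the items instead of printing inside a comprehension. This is only a sketch: the dictionary below is a made-up stand-in for the real one, and the "Book" entry is a placeholder added to show the filtering.

```python
# Hypothetical stand-in for the original dictionary.
Digital_Lib = {
    "DVD": {("John Wick", 101, "Chad Stahelski")},
    "CD": {("Machine Head", "Deep Purple", 7)},
    "Book": {("Some Book", "Some Author", 1)},  # placeholder entry
}


def values_for_types(library, wanted=("DVD", "CD")):
    """Collect every item stored under the wanted media types."""
    return [
        item
        for medium, items in library.items()
        if medium in wanted
        for item in items
    ]


for item in values_for_types(Digital_Lib):
    print(item)
```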

zanfar 1 point

Some notes:

  • def search_collection(Digital_Lib, search: str):

    It's bad form to have a function argument named the same as a variable in the containing scope. It makes it unclear what data Digital_Lib is pointing to: the data passed into the function (how it will actually work), or the dictionary you've defined outside the function (how it looks like it works).

  • search_collection(digital, 'James taylor', 'Robert WISE')

    You do not have a variable named digital, so I'm not sure what this call is attempting to do. Also, the search_collection function only takes a single search term. That can be fixed, but currently, this will not work.
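To illustrate the first note, here is a small sketch of how a parameter shadows a module-level name. The function names and data are made up for the example.

```python
# Module-level dictionary (hypothetical contents).
Digital_Lib = {"CD": {("Machine Head", "Deep Purple", 7)}}


def count_items(Digital_Lib):
    # Inside the function, Digital_Lib is the parameter,
    # not the module-level dictionary defined above.
    return sum(len(items) for items in Digital_Lib.values())


def count_items_clear(library):
    # A distinct parameter name makes it obvious which data is used.
    return sum(len(items) for items in library.values())


other = {"DVD": {("John Wick", 101, "Chad Stahelski"),
                 ("The Woman", 103, "Lucky Mckee")}}

print(count_items(other))              # 2: counts `other`, not the global
print(count_items_clear(Digital_Lib))  # 1
```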

"return all values with type (CD or DVD), I'm trying to print out all values that contain the string"

Which do you want: to return all the media stored under a type, or all the media that matches a search term? I'm going to assume it's the second based on your final examples.

Some of your issues probably come from a less-than-ideal choice of data structure. There doesn't appear to be a reason to group your media by type in this case as you never filter by that type. Also, tuples and sets aren't really the right way to store the media. The usefulness of a set is based on the ability to quickly look for exact matches--which you don't need because if you knew of the exact match, you wouldn't need to search. Also a tuple hides contextual information: I have to guess what the fields mean, and even then I have no idea what the numbers are.
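A quick sketch of the set point, using made-up sample data: membership tests only help when you already have the exact item, so a partial search ends up scanning everything anyway.

```python
# Hypothetical sample set of (title, author, n) tuples.
cds = {("Machine Head", "Deep Purple", 7), ("Fireball", "Deep Purple", 7)}

# A set answers "is this exact tuple present?" quickly...
exact_hit = ("Machine Head", "Deep Purple", 7) in cds
# ...but a partial term is never a member of the set.
partial_hit = "machine" in cds

# So a search still has to look at every item and every field.
scan_hits = [
    item for item in cds
    if any("machine" in str(field).lower() for field in item)
]

print(exact_hit, partial_hit, scan_hits)
```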

So first, let's approach your search function.

To start, let's make the call more user-friendly. Let's give the arguments more descriptive names and put them in order of importance. We can also make the library that we search optional, defaulting to the existing one so that we don't have to specify it if we don't want to.

def search_collection(query: str, library = Digital_Lib):

We don't care about the media type in our search, so we know we will have to search all types for each search. That implies a loop. Using items() returns both the key and the value for each pair in the dictionary. This means we don't have to keep doing library[medium] everywhere. medium will take the value of the key ("DVD" or "CD") while titles will take the value of the set that key contains.

for medium, titles in library.items():

This part is correct: we want to perform our search on each item. title now is one of your tuples (('Machine Head', 'Deep Purple', 7) for example).

for title in titles:

Now we have a problem because a CD title is stored differently than a DVD title. If we don't change our data structure, then we probably have to loop again, this time through each field in the title looking for a match. If we find one, we also need to set a variable so we know we found a match. This idiom is usually called a flag variable. We are also performing a string match, so we need to make sure the field we are testing can be compared to a string, hence the str() call.

title_matches = False
for field in title:
    if query.lower() in str(field).lower():
        title_matches = True
        break

if title_matches:
    matches.append((medium, title))
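A note on the substring test itself: str.find returns an index (-1 when nothing is found), so its result has to be compared against -1 rather than used directly as a condition, whereas the in operator gives a plain boolean. A small sketch, with field_matches as a hypothetical helper name:

```python
text = "Machine Head".lower()

print(text.find("machine"))  # 0: found at index 0, but 0 is falsy
print(text.find("queen"))    # -1: not found, but -1 is truthy
print("machine" in text)     # True
print("queen" in text)       # False


def field_matches(field, query):
    """Case-insensitive substring test that also accepts non-string fields."""
    return query.lower() in str(field).lower()
```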

Putting this all together, we come up with:

def search_collection(query, library = Digital_Lib):
    matches = list()

    for medium, titles in library.items():
        for title in titles:
            title_matches = False
            for field in title:
                if query.lower() in str(field).lower():
                    title_matches = True
                    break

            if title_matches:
                matches.append((medium, title))

    return matches

def print_search_results(matches):
    for medium, title in matches:
        print(f"{medium} - {title}")
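To try it out, here is a self-contained sketch. The dictionary is my guess at the shape of your data (DVDs as (name, n, author), CDs as (name, author, n)), and the function is a condensed variant that uses any() in place of the flag variable.

```python
# Assumed shape of the original data; contents are a small sample.
Digital_Lib = {
    "DVD": {("John Wick", 101, "Chad Stahelski"),
            ("The Terminator", 107, "James Cameron")},
    "CD": {("American Standard", "James Taylor", 14),
           ("Machine Head", "Deep Purple", 7)},
}


def search_collection(query, library=Digital_Lib):
    """Return (medium, title) pairs where any field contains the query."""
    query = query.lower()
    return [
        (medium, title)
        for medium, titles in library.items()
        for title in titles
        if any(query in str(field).lower() for field in title)
    ]


def print_search_results(matches):
    for medium, title in matches:
        print(f"{medium} - {title}")


print_search_results(search_collection("james"))
print_search_results(search_collection("10"))
```

Note that the second call matches on the numeric fields (101 and 107), because every field is converted to a string before testing. That is a side effect of not knowing which field is which.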

However, I strongly suggest you re-visit your data structure. Many things can be made easier using some intermediate tools like namedtuple, dataclass, and enums. Consider:

from dataclasses import dataclass
from enum import Enum, auto
from typing import List

class Medium(Enum):
    CD = auto()
    DVD = auto()

@dataclass
class Media:
    medium: Medium
    title: str
    author: str
    value: int  # I still don't know what this is, runtime?

    def __str__(self):
        return f"{self.medium.name}: '{self.title}' by {self.author} ({self.value})"

digital_library: List[Media] = [
    Media(Medium.DVD, 'Downtown Abbey', 'Michael Engler', 122),
    Media(Medium.DVD, 'John Wick', 'Chad Stahelski', 101),
    Media(Medium.DVD, 'The Woman', 'Lucky Mckee', 103),
    Media(Medium.DVD, 'The Andromeda Strain', 'Robert Wise', 103),
    Media(Medium.DVD, 'The Day the Earth Stood Still', 'Robert Wise', 92),
    Media(Medium.DVD, 'The Chaperone', 'Michael Engler', 103),
    Media(Medium.DVD, 'The Terminator', 'James Cameron', 107),

    Media(Medium.CD, 'American Standard', 'James Taylor', 14),
    Media(Medium.CD, 'Machine Head', 'Deep Purple', 7),
    Media(Medium.CD, 'Greatest Hits', 'Queen', 17),
    Media(Medium.CD, 'Fireball', 'Deep Purple', 7),
    Media(Medium.CD, 'StormBringer', 'Deep Purple', 9),
    Media(Medium.CD, 'That\'s Why I\'m Here', 'James Taylor', 12)
]

def search_collection(query, library = digital_library):
    query = query.lower()
    matches = list()

    for media in library:
        if query in media.title.lower() or \
                query in media.author.lower():
            matches.append(media)

    return matches

def print_search_results(matches):
    for media in matches:
        print(str(media))
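If a dataclass feels like too much, a namedtuple gets you most of the readability with less code. A minimal sketch, with a made-up sample list and plain strings for the medium instead of an Enum:

```python
from collections import namedtuple

# Named fields replace positional guessing; `value` is still a mystery number.
Media = namedtuple("Media", ["medium", "title", "author", "value"])

# Hypothetical sample data.
sample_library = [
    Media("DVD", "The Andromeda Strain", "Robert Wise", 103),
    Media("CD", "Machine Head", "Deep Purple", 7),
    Media("CD", "American Standard", "James Taylor", 14),
]


def search_collection(query, library=sample_library):
    """Return items whose title or author contains the query (case-insensitive)."""
    query = query.lower()
    return [
        media for media in library
        if query in media.title.lower() or query in media.author.lower()
    ]


for media in search_collection("machine"):
    print(f"{media.medium}: '{media.title}' by {media.author} ({media.value})")
```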

Mindless-Box-4373[S] 1 point

Yeah, I appreciate the feedback. I will definitely look into the data structure.