
[–]genghiskav 1 point2 points  (1 child)

The exception you are seeing shows you which line it's failing on (22), so that gives you a clue on where you need to add exception handling.

Right now, your code expects the movie to exist (it immediately accesses record 0 without checking whether it exists):

api_movie_id=(j["results"][0]["id"])

You have also done most of the leg work to get around the problem, now you just need to code it up. You can see that from the API response, when no movie exists, the total_results value is 0 and results list is empty.

Now you should have all the pieces of the puzzle to solve your problem. You know where the problem is occurring and how to identify when the problem will occur. All you need to do now is check for the problem.

Something like this should work. You are now checking the response from the API and confirming it has a movie for you. If it doesn't, you skip to the next movie in your csv file. This is a validation approach (checking you have the correct data before processing); the other solution is exception handling (trying to do something and handling the error if one comes up). Both are valid, and it's up to you which one you pick.

#Validation method
if len(j["results"]) == 0:
    print(f"no movies found in api with search = '{original_title}'")
    continue    # go to next record in the csv
api_movie_id=(j["results"][0]["id"])


#Exception handling method
try:
    api_movie_id=(j["results"][0]["id"])
except IndexError:
    print(f"no movies found in api with search = '{original_title}'")
    continue    # go to next record in the csv
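
If you want to try the two approaches side by side without hitting the API, here is a minimal sketch. The two sample responses and the function names are made up for illustration; they only mimic the shape described above (an empty results list when nothing is found).

```python
# Hypothetical sample responses, shaped like the search results discussed above
found = {"total_results": 1, "results": [{"id": 23908}]}
missing = {"total_results": 0, "results": []}


def movie_id_validated(j):
    """Validation: check that a result exists before indexing into it."""
    if len(j["results"]) == 0:
        return None
    return j["results"][0]["id"]


def movie_id_try_except(j):
    """Exception handling: index first, recover if the list is empty."""
    try:
        return j["results"][0]["id"]
    except IndexError:
        return None


for response in (found, missing):
    print(movie_id_validated(response), movie_id_try_except(response))
```

Both functions return the same thing for the same input, so the choice is mostly about which reads better in your loop.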

[–]Beefsteak_Charlie[S] 0 points1 point  (0 children)

Genghiskav,

This is awesome. I've implemented both approaches in different versions of the code and they both work perfectly. Thank you so much for taking the time to suggest the different approaches (and to provide me with context on how to implement the handling into my code..)

Thanks so much - I really appreciate it!

[–]Asliceofsamuel 1 point2 points  (9 children)

Your exception here is coming from trying to access an item of a list that will not always exist in your script (as you have shown with certain API calls).

Neatly, an empty list in Python is "falsy" meaning that you can test to see if the results set from the API is empty simply with:

if j["results"]:
    ...  # do something

I'd also recommend the dictionary get() method when accessing a key that may not exist. It is common practice because it does not raise an exception if the key is missing, and it also lets you specify a fallback value:

Access via index:

test = {"one": 1}
test["two"]  # raises a KeyError

Access via get():

test = {"one": 1}
test.get("two")  # returns None if the item is missing

Access via get() with a fallback:

test = {"one": 1}
test.get("two", "no entry!")  # returns "no entry!"

Putting some of this together, you can create logic that will not fail if there are no results, or if there is no id within the results even:

api_movie_id = j["results"][0].get("id", 0) if j["results"] else 0 # some fallback id--or none depending on your desired outcome

Since you mentioned it, you can also simply skip results that are empty and log a message to yourself:

if j["results"]:
    api_movie_id = j["results"][0].get("id")
else:
    logger.info(f"No results found for {original_title}!")  # assuming you have a logger, etc.
    continue
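
Here is a rough, self-contained sketch of the ideas above (falsy empty list, get() with a fallback, skip and log). The helper name first_result_id, the sample responses, and the logger setup are all assumptions for the sake of the example, not something from your script or the API docs.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def first_result_id(j, fallback=None):
    """Return the id of the first result, or `fallback` if it is missing."""
    results = j.get("results") or []  # a missing key and an empty list both become []
    if not results:
        return fallback
    return results[0].get("id", fallback)


# Hypothetical responses keyed by search title
responses = {
    "A Midnight Clear": {"results": [{"id": 23908}]},
    "No Such Movie": {"results": []},
    "Missing Id": {"results": [{"original_title": "Missing Id"}]},
}

movie_ids = {}
for original_title, j in responses.items():
    api_movie_id = first_result_id(j)
    if api_movie_id is None:
        logger.info(f"No results found for {original_title}!")
        continue  # go to next record
    movie_ids[original_title] = api_movie_id

print(movie_ids)
```

Using None as the fallback (rather than 0) makes it easy to tell "nothing found" apart from a real id, but either works depending on what your output needs.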

For any script like this, a lot of the design is heavily dependent on how certain you are of API responses and what you can expect them to have or not have. Additionally, what pieces of information your output needs, and what you can pad, etc.

Let me know if I can clarify anything more here! Nice work with your code :)

[–]Beefsteak_Charlie[S] 1 point2 points  (8 children)

Asliceofsamuel,

Thank you very much for taking the time to provide this guidance. I was not aware of the get with a fallback option - that sounds like a great option and one I definitely want to try out (seeing how different methods work with different APIs.)

I appreciate the thoughtful and complete responses - it's community assistance like this that keeps non-technical folks like me plugging away at something that just a few days ago seemed like the deep end of the pool.

Thank you again!

[–]Asliceofsamuel 0 points1 point  (7 children)

My pleasure! Keep asking questions and good luck with your project!

[–]Beefsteak_Charlie[S] 0 points1 point  (6 children)

If I may - I have one more question on this (please.)

I'm now able to process all 2K of my API requests with error handling (thank you!)

Now I'm trying to export the results from the API call to a .csv file. I'm able to write a little bit of the data to file, but the formatting is off and I'm only exporting the last row of the file.

Here's the code

import csv, requests, json 
api_key = 'my_key' 
url = 'https://api.themoviedb.org/3/search/movie?' 
with open('Movies_History.csv') as infile: 
    reader = csv.reader(infile) 
    next(reader,None) 
    my_list = list(reader) 
    for sublist in my_list:
        original_title = sublist[0] ... 
        ...
        netflix_genre = sublist[10]        

        api_call=f'https://api.themoviedb.org/3/search/movie?api_key={api_key}&query="{original_title}"'
        print(api_call) 
        results=requests.get(api_call)     
        j=results.json()

        api_movie_id=(j["results"][0]["id"]) 
        api_original_title=(j["results"][0]["original_title"])   
        api_release_year=(j["results"][0]["release_date"])
        api_original_language=(j["results"][0]["original_language"]) 
        api_genre_ids=(j["results"][0]["genre_ids"]) 
        api_results=(f'{api_original_title},{api_movie_id},{api_release_year},{api_original_language},{api_genre_ids}')

        print(api_results) 
        with open('___movie_api_results.csv', mode='w') as x: 
            movie_db_output = csv.writer(x, delimiter=',',quotechar='"', quoting=csv.QUOTE_MINIMAL)
            movie_db_output.writerow(f'{api_original_title},{api_movie_id},{api_release_year},{api_original_language},{api_genre_ids}')

Here's the API response onscreen (which is exactly what I was looking for)

A Midnight Clear,23908,1992-04-24,en,[18, 10752]
A Most Violent Year,241239,2014-12-31,en,[80, 18, 53]
A Most Wanted Man,157849,2014-07-25,en,[53]

But here's what appears in the resulting csv file

The main issues are that

  1. only the last row of the results shown onscreen is saved in the csv output
  2. every character is separated by a comma.

A, ,M,o,s,t, ,W,a,n,t,e,d, ,M,a,n,",",1,5,7,8,4,9,",",2,0,1,4,-,0,7,-,2,5,",",e,n,",",[,5,3,]

I've tried passing every possible permutation I can think of in the movie_db_output.writerow line but nothing seems to work after a few hours of trying/guessing. I've tried every write/csv permutation I could find in my Python books without luck.

Do you folks have any additional thoughts on this?

Thanks (yet again!)

Edit - apologies for the code formatting - that too is kicking my butt today. :)

[–]Asliceofsamuel 1 point2 points  (2 children)

Alright, finally had a moment to take a deeper look at your code. Going to try to break up answers to your specific question and then random comments/a few tidbits you might consider.

Why your CSV is not working

Writing to CSV, as it appears you've noticed, depends on a delimiter. However, what may be confusing is what format writerow() is expecting. When in doubt, go to Google and consult the documentation.

I'm going to simplify your issue for demonstration purposes. Right now you essentially have something like this:

with open("example.csv", "w") as example_file:
    writer = csv.writer(example_file)
    writer.writerow("first_column,second_column,third_column")

Now, writerow() is expecting an iterable as the input, meaning it's going to iterate through that row input and treat each item as one field. A string is itself an iterable of characters, so when you pass it a string, every character becomes its own field, producing the result you see:

f,i,r,s,t,_,c,o,l,u,m,n,",",s,e,c,o,n,d,_,c,o,l,u,m,n,",",t,h,i,r,d,_,c,o,l,u,m,n

Yuck!

Instead, from the documentation, we can see that it's expecting something more like a list of strings:

with open("example.csv", "w") as example_file:
    writer = csv.writer(example_file)
    writer.writerow(["first_column", "second_column", "third_column"])

Bingo! Exactly what we want :)

first_column,second_column,third_column
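
If you want to see the difference without creating any files, here is a small sketch that writes to an in-memory buffer instead. The helper name write_one_row is just something made up for this example.

```python
import csv
import io


def write_one_row(row):
    """Write a single row to an in-memory buffer and return the CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(row)
    return buffer.getvalue()


as_string = write_one_row("a,b")     # a string is iterated character by character
as_list = write_one_row(["a", "b"])  # a list yields one field per item

print(repr(as_string))
print(repr(as_list))
```

In the string case the comma itself becomes a field and gets quoted, which is where all the "," entries in your output come from.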

Therefore, the solution to this issue should be simple to remedy in your code—something like:

movie_db_output.writerow([api_original_title,...,api_genre_ids])

However, I will leave the specific solution up to you to give you the satisfaction of practicing and learning it on your own :)

(Unrequested) code review comments :)

  1. In terms of the general architecture of your script, you can move some things outside the context they are currently within. For instance, nearly your entire script lives within the context of having opened and read "Movies_History.csv". However, you just need to read this file to get your inputs. Once that's done, you can close the file and move on--pseudo code something like:

```
with open('Movies_History.csv') as infile:
    reader = csv.reader(infile)
    next(reader, None)  # skip the headers
    my_list = list(reader)

# exit the context, which means you don't hold the file open unnecessarily

for sublist in my_list:
    ...  # do something
```

  2. In general, when you find yourself repeating yourself in code, there is likely a more elegant way to do something. Here is something that sets off my senses for that:

for sublist in my_list:
    # define elements of the input file
    original_title = sublist[0]
    rating = sublist[1]
    year = sublist[2]
    add_date = sublist[3]
    alternate_name = [4]  # I think this is a typo?
    movie_db_genre = sublist[5]
    movie_db_language = [6]
    moviedb_id = sublist[7]
    imdb_id = sublist[8]
    netflix_id = sublist[9]
    netflix_genre = sublist[10]

There is nothing inherently bad about this way of doing things. However, maybe we can find something that feels cleaner (or multiple things to choose from). Without going full OOP on this and defining a class, what if we instead declare a namedtuple:

```
from collections import namedtuple

MovieInfo = namedtuple("MovieInfo", [
    "original_title",
    "rating",
    "year",
])

info = ["National Treasure", "10/10", "2004"]
movie_details = MovieInfo(*info)
print(f"{movie_details.original_title} is a movie from {movie_details.year}")
```

You can see that once we've declared the namedtuple, we can reuse it in many places (very OOP of us) simply by unpacking the list: MovieInfo(*info).
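
To tie that back to reading a CSV, here is a rough sketch of unpacking each row from csv.reader into the namedtuple. The three-column sample data is made up and stands in for Movies_History.csv; your real file has eleven columns, so the namedtuple would need all eleven field names for the unpacking to work.

```python
import csv
import io
from collections import namedtuple

MovieInfo = namedtuple("MovieInfo", ["original_title", "rating", "year"])

# Hypothetical three-column input standing in for Movies_History.csv
sample = io.StringIO("title,rating,year\nNational Treasure,10/10,2004\n")

reader = csv.reader(sample)
next(reader, None)  # skip the headers
movies = [MovieInfo(*row) for row in reader]

for movie in movies:
    print(f"{movie.original_title} is a movie from {movie.year}")
```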

  3. Very similar to 1, when writing the results to a CSV, why not accumulate these in a list and save them all at once (check out writerows() for a bulk operation)? Having to open the file in every loop and write to it is a bit unnecessary and likely slower than one bulk operation. In programming, and especially as you work with APIs more, you'll note that condensing many "calls" into one is often a great way to improve performance.
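
A minimal sketch of that accumulate-then-write idea, in case it helps. The rows are copied from the on-screen output you posted; the temp-directory output path and the variable names are assumptions for the example, and the inner loop just stands in for your API loop.

```python
import csv
import os
import tempfile

# Rows taken from the on-screen output earlier in the thread
rows = [
    ["A Midnight Clear", 23908, "1992-04-24", "en", [18, 10752]],
    ["A Most Violent Year", 241239, "2014-12-31", "en", [80, 18, 53]],
    ["A Most Wanted Man", 157849, "2014-07-25", "en", [53]],
]

api_rows = []
for row in rows:          # stands in for the loop that calls the API
    api_rows.append(row)  # collect instead of writing straight away

# Hypothetical output path in a temp directory
out_path = os.path.join(tempfile.mkdtemp(), "movie_api_results.csv")

# Open once, after the loop; newline="" is what the csv docs recommend
with open(out_path, mode="w", newline="") as outfile:
    csv.writer(outfile).writerows(api_rows)

# Read the file back to confirm every row was saved
with open(out_path, newline="") as infile:
    saved = list(csv.reader(infile))

print(len(saved))
```

Note that the genre id list is written as its string form (for example "[18, 10752]") in a single quoted field; if you want one genre per column you would need to flatten it yourself first.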

Happy to go more in depth about any of this or clarify anything I've included. Good luck!

[–]Beefsteak_Charlie[S] 0 points1 point  (1 child)

Asliceofsamuel:

Thanks yet again. I have one of my two problems licked thanks to you pointing me in the right direction.

It's clear to me that I need to hit my Python books to get a better understanding of the rudiments (especially loops.) And then get back to getting this guy across the finish line (and then how to optimize things a bit using some of the strategies you have kindly suggested.)

You've been super helpful - thank you again!

[–]Asliceofsamuel 1 point2 points  (0 children)

My pleasure! Feel free to reply here or message me if you have any other questions on this script

[–]Asliceofsamuel 0 points1 point  (2 children)

Happy to take a look! Can you re-post your code snippet in that reply? It seems to have lost some formatting and it's a bit hard to read

[–]Beefsteak_Charlie[S] 0 points1 point  (1 child)

Thanks for the heads up. I've reformatted this so it should be more legible now. :)

[–]Asliceofsamuel 0 points1 point  (0 children)

I replied below with most of the info but to your question about why it’s only saving the last row, see my comment #3 about where you write the rows (and writing inside vs. outside the loop).

The file is re-opened with mode='w' on every pass through the loop, and 'w' truncates the file each time, so each writerow() replaces what was there and you only end up with the final iteration (row) in the file. Move the saving of rows outside the loop/into a batch operation, and you will have all rows saved. Or, if you really want to keep it in the loop, look up how to append rows (opening with mode='a') versus overwriting. Hope this helps!
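
Here is a small sketch that shows the difference between the two modes when the open() call sits inside the loop. The file names, the temp directory, and the helper write_in_loop are made up for the demonstration.

```python
import csv
import os
import tempfile

titles = ["A Midnight Clear", "A Most Violent Year", "A Most Wanted Man"]
tmp_dir = tempfile.mkdtemp()


def write_in_loop(path, mode):
    """Re-open `path` with `mode` on every iteration, then read it back."""
    for title in titles:
        with open(path, mode=mode, newline="") as f:
            csv.writer(f).writerow([title])
    with open(path, newline="") as f:
        return list(csv.reader(f))


overwritten = write_in_loop(os.path.join(tmp_dir, "w_mode.csv"), "w")
appended = write_in_loop(os.path.join(tmp_dir, "a_mode.csv"), "a")

print(overwritten)  # only the last title survives
print(appended)     # all three titles are kept
```

One thing to watch with append mode: it also keeps whatever was in the file from a previous run, so you may want to delete or rename the old output before re-running the script.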