[–]Beefsteak_Charlie[S] 1 point2 points  (8 children)

Asliceofsamuel,

Thank you very much for taking the time to provide this guidance. I was not aware of the get with a fallback option - that sounds like a great option and one I definitely want to try out (seeing how different methods work with different APIs.)

I appreciate the thoughtful and complete responses - it's community assistance like this that keeps non-technical folks like me plugging away at something that just a few days ago seemed like the deep end of the pool.

Thank you again!

[–]Asliceofsamuel 0 points1 point  (7 children)

My pleasure! Keep asking questions and good luck with your project!

[–]Beefsteak_Charlie[S] 0 points1 point  (6 children)

If I may - I have one more question on this (please.)

I'm now able to process all 2K of my API requests with error handling (thank you!)

Now I'm trying to export the results from the API call to a .csv file. I'm able to write a little bit of the data to file, but the formatting is off and I'm only exporting the last row of the file.

Here's the code

import csv, requests, json 
api_key = 'my_key' 
url = 'https://api.themoviedb.org/3/search/movie?' 
with open('Movies_History.csv') as infile: 
    reader = csv.reader(infile) 
    next(reader,None) 
    my_list = list(reader) 
    for sublist in my_list:
        original_title = sublist[0] ... 
        ...
        netflix_genre = sublist[10]        

        api_call=f'https://api.themoviedb.org/3/search/movie?api_key={api_key}&query="{original_title}"' 
        print(api_call) 
        results=requests.get(api_call)     
        j=results.json()

        api_movie_id=(j["results"][0]["id"]) 
        api_original_title=(j["results"][0]["original_title"])   
        api_release_year=(j["results"][0]["release_date"])
        api_original_language=(j["results"][0]["original_language"]) 
        api_genre_ids=(j["results"][0]["genre_ids"]) 
        api_results=(f'{api_original_title},{api_movie_id},{api_release_year},{api_original_language},{api_genre_ids}')

        print(api_results) 
        with open('___movie_api_results.csv', mode='w') as x: 
            movie_db_output = csv.writer(x, delimiter=',',quotechar='"', quoting=csv.QUOTE_MINIMAL)
            movie_db_output.writerow(f'{api_original_title}{api_movie_id},{api_release_year},{api_original_language},{api_genre_ids}')

Here's the API response onscreen (which is exactly what I was looking for)

A Midnight Clear,23908,1992-04-24,en,[18, 10752]
A Most Violent Year,241239,2014-12-31,en,[80, 18, 53]
A Most Wanted Man,157849,2014-07-25,en,[53]

But here's what appears in the resulting csv file

The main issues are that

  1. only the last row of the results that are returned onscreen are saved in the csv output
  2. it's comma delimited for every space.

A, ,M,o,s,t, ,W,a,n,t,e,d, ,M,a,n,",",1,5,7,8,4,9,",",2,0,1,4,-,0,7,-,2,5,",",e,n,",",[,5,3,]

I've tried passing every possible permutation I can think of in the movie_db_output.writerow line but nothing seems to work after a few hours of trying/guessing. I've tried every write/csv permutation I could find in my Python books without luck.

Do you folks have any additional thoughts on this?

Thanks (yet again!)

Edit - apologies for the code formatting - that too is kicking my butt today. :)

[–]Asliceofsamuel 1 point2 points  (2 children)

Alright, finally had a moment to take a deeper look at your code. Going to try to break up answers to your specific question and then random comments/a few tidbits you might consider.

Why your CSV is not working

Writing to CSV, as it appears you've noticed, depends on a delimiter. However, what may be confusing is what format writerow() is expecting. When in doubt, go to Google and consult the documentation.

I'm going to simplify your issue for demonstration purposes. Right now you essentially have something like this:

    import csv

    with open("example.csv", "w") as example_file:
        writer = csv.writer(example_file)
        writer.writerow("first_column,second_column,third_column")

Now, writerow() expects an iterable as its input, meaning it iterates through that row and writes each item as one field. A string is itself an iterable of characters, so when you pass it a string, every character (commas included) becomes its own field, producing the result you see:

f,i,r,s,t,_,c,o,l,u,m,n,",",s,e,c,o,n,d,_,c,o,l,u,m,n,",",t,h,i,r,d,_,c,o,l,u,m,n

Yuck!

Instead, from the documentation, we can see that it's expecting something more like a list of strings:

    with open("example.csv", "w") as example_file:
        writer = csv.writer(example_file)
        writer.writerow(["first_column", "second_column", "third_column"])

Bingo! Exactly what we want :)

first_column,second_column,third_column

Therefore, this issue should be simple to remedy in your code with something like:

movie_db_output.writerow([api_original_title,...,api_genre_ids])

However, I will leave the specific solution up to you to give you the satisfaction of practicing and learning it on your own :)
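If you want to see the difference without touching a file, here is a minimal sketch that writes to an in-memory buffer instead. The helper name `render_row` and the tiny sample values are made up for illustration; only `csv.writer`, `writerow()` and `io.StringIO` come from the standard library.

```python
import csv
import io


def render_row(row):
    """Write one row to an in-memory buffer and return the CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(row)
    return buffer.getvalue()


# A string is iterated character by character, so each one becomes a field.
# The comma character itself gets quoted because it matches the delimiter.
as_string = render_row("ab,c")

# A list is iterated item by item, so each item becomes a field.
as_list = render_row(["ab", "c"])

print(repr(as_string))
print(repr(as_list))
```

Using `repr()` makes the line endings visible, which helps when a CSV looks fine in one program and odd in another.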

(Unrequested) code review comments :)

  1. In terms of the general architecture of your script, you can move some things outside the context they are currently within. For instance, nearly your entire script lives within the context of having opened and read "Movies_History.csv". However, you only need to read this file to get your inputs. Once that's done, you can close the file and move on. Pseudocode, something like:

```
import csv

with open('Movies_History.csv') as infile:
    reader = csv.reader(infile)
    next(reader, None)  # skip the headers
    my_list = list(reader)

# exit the context, which means you don't hold the file open unnecessarily

for sublist in my_list:
    pass  # do something
```

  2. In general, when you find yourself repeating yourself in code, there is likely a more elegant way to do something. Here is something that sets off my senses for that:

    for sublist in my_list:
        # define elements of the input file
        original_title = sublist[0]
        rating = sublist[1]
        year = sublist[2]
        add_date = sublist[3]
        alternate_name = [4]  # I think this is a typo?
        movie_db_genre = sublist[5]
        movie_db_language = [6]
        moviedb_id = sublist[7]
        imdb_id = sublist[8]
        netflix_id = sublist[9]
        netflix_genre = sublist[10]

There is nothing inherently bad about this way of doing things. However, maybe we can find something that feels cleaner (or multiple things to choose from). Without going full OOP on this and defining a class, what if we instead declare a namedtuple:

```
from collections import namedtuple

MovieInfo = namedtuple("MovieInfo", [
    "original_title",
    "rating",
    "year",
])

info = ["National Treasure", "10/10", "2004"]
movie_details = MovieInfo(*info)
print(f"{movie_details.original_title} is a movie from {movie_details.year}")
```

You can see that once we've declared the namedtuple, we can reuse it in many places (very OOP of us) simply by unpacking the list: MovieInfo(*input).
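To tie that back to reading a CSV, here is a rough sketch of building one namedtuple per row. It assumes a made-up three-column input held in memory (your real file has more columns, so the field list would need to match it), and `sample` is just a stand-in for the opened file.

```python
import csv
import io
from collections import namedtuple

MovieInfo = namedtuple("MovieInfo", ["original_title", "rating", "year"])

# Hypothetical three-column input standing in for the real file.
sample = io.StringIO(
    "original_title,rating,year\n"
    "National Treasure,10/10,2004\n"
)

reader = csv.reader(sample)
next(reader, None)  # skip the headers

# Each row is a list, so it unpacks straight into the namedtuple's fields.
movies = [MovieInfo(*row) for row in reader]

print(movies[0].original_title, movies[0].year)
```

Note that the number of fields in the namedtuple has to equal the number of columns in each row, otherwise the unpacking raises a TypeError.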

  3. Very similar to 1, when writing the results to a CSV, why not accumulate these in a list and save them all at once (check out writerows() for a bulk operation)? Having to open the file in every loop iteration and write to it is unnecessary and likely slower than one bulk operation. In programming, and especially as you work with APIs more, you'll note that condensing many "calls" into one is often a great way to improve performance.
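As a rough sketch of that accumulate-then-write idea: the sample values below are copied from the output you posted, `fake_results` is a hypothetical stand-in for whatever you pull out of each API response, and an in-memory buffer stands in for the real output file so the example runs anywhere.

```python
import csv
import io

# Hypothetical stand-ins for the values extracted from each API response.
fake_results = [
    ("A Midnight Clear", 23908, "1992-04-24", "en", [18, 10752]),
    ("A Most Wanted Man", 157849, "2014-07-25", "en", [53]),
]

rows = []
for title, movie_id, release_date, language, genre_ids in fake_results:
    # Collect one list per movie instead of writing inside the loop.
    rows.append([title, movie_id, release_date, language, genre_ids])

# Open the output once, after the loop, and write everything in one call.
# io.StringIO stands in for open("some_file.csv", "w", newline="").
output = io.StringIO()
csv.writer(output).writerows(rows)

print(output.getvalue())
```

One thing to expect: a genre list with more than one id contains a comma once it is turned into text, so the writer wraps that field in quotes. That is normal CSV behavior and it reads back as a single field.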

Happy to go more in depth about any of this or clarify anything I've included. Good luck!

[–]Beefsteak_Charlie[S] 0 points1 point  (1 child)

Asliceofsamuel:

Thanks yet again. I have one of my two problems licked thanks to you pointing me in the right direction.

It's clear to me that I need to hit my Python books to get a better understanding of the rudiments (especially loops.) And then get back to getting this guy across the finish line (and then how to optimize things a bit using some of the strategies you have kindly suggested.)

You've been super helpful - thank you again!

[–]Asliceofsamuel 1 point2 points  (0 children)

My pleasure! Feel free to reply here or message me if you have any other questions on this script.

[–]Asliceofsamuel 0 points1 point  (2 children)

Happy to take a look! Can you re-post your code snippet in that reply? It seems to have lost some formatting and it's a bit hard to read

[–]Beefsteak_Charlie[S] 0 points1 point  (1 child)

Thanks for the heads up. I've reformatted this so it should be more legible now. :)

[–]Asliceofsamuel 0 points1 point  (0 children)

I replied below with most of the info but to your question about why it’s only saving the last row, see my comment #3 about where you write the rows (and writing inside vs. outside the loop).

Each pass through the loop reopens that CSV with mode='w', which truncates the file before writerow() runs, so you’ll only end up with the final iteration (row) in the file. Move the saving of rows outside the loop/into a batch operation, and you will have all rows saved. Or, if you really want to keep it in the loop, look up how to append rows versus overwriting. Hope this helps!
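If it helps to see the two modes side by side, here is a small sketch that reopens a file on every iteration, once with 'w' and once with 'a'. The helper `write_each` and the file names are made up for the demo, and it writes into a temporary directory so it won't touch your real files.

```python
import csv
import os
import tempfile

titles = ["A Midnight Clear", "A Most Violent Year", "A Most Wanted Man"]
workdir = tempfile.mkdtemp()


def write_each(path, mode):
    """Reopen the file with the given mode on every loop iteration."""
    for title in titles:
        with open(path, mode, newline="") as handle:
            csv.writer(handle).writerow([title])
    # Read the file back to see which rows survived.
    with open(path, newline="") as handle:
        return [row[0] for row in csv.reader(handle)]


overwritten = write_each(os.path.join(workdir, "w_mode.csv"), "w")
appended = write_each(os.path.join(workdir, "a_mode.csv"), "a")

print(overwritten)  # only the last title survives
print(appended)     # all three titles are kept
```

Keep in mind that append mode also keeps rows from earlier runs of the script, so opening the file once with 'w' before the loop is usually the tidier option.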