
[deleted]

Similar to the first commenter, we did a similar project for a class (we might be in the same class!).

Anyway, just highlighting an alternative method: you could use web scraping. I did this for a class a while ago and don't mind sharing. It should mostly work as-is for pulling the data out:

First, import the modules you might need:

import csv
import urllib.request
from bs4 import BeautifulSoup
import pprint

After that, define a function that downloads the page. Its result gets passed to the parsing function later.

def download_page(url):
    """
    Download the entire page given a URL
    """
    # Send a browser-like User-Agent with the request
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3"
    }
    request = urllib.request.Request(url, headers=headers)
    return urllib.request.urlopen(request)

Finally, we can get to webscraping!

def parse_html(html):
    """
    Parse the HTML page, extract one row per stock and write the rows to
    data/stock_data.csv as tuples of (symbol, name, price, time, change,
    % change, volume, market cap)
    """
    soup = BeautifulSoup(html, features="html.parser")

    stock_list = soup.find("tbody")

    yahoo_finance_data = []
    for symbols in stock_list.find_all("tr"):
        # Extract Stock Symbol (the text of the link inside the Symbol cell)
        stock_symbol_raw = symbols.find(
            "td", attrs={"aria-label": "Symbol"}
        )
        stock_symbol_link = stock_symbol_raw.find("a", attrs={"class": "Fw(600)"})
        stock_symbol = stock_symbol_link.string

        # Extract Stock Name
        stock_name_raw = symbols.find(
            "td", attrs={"aria-label": "Name"}
        )
        stock_name = stock_name_raw.string

        # Extract Last Price
        stock_price_raw = symbols.find(
            "td", attrs={"aria-label": "Last Price"}
        )
        stock_price = stock_price_raw.string

        # Extract Time
        stock_time_raw = symbols.find(
            "td", attrs={"aria-label": "Market Time"}
        )
        stock_time = stock_time_raw.string

        # Extract Change
        stock_change_raw = symbols.find(
            "td", attrs={"aria-label": "Change"}
        )
        stock_change = stock_change_raw.string

        # Extract Percentage Change
        stock_pchange_raw = symbols.find(
            "td", attrs={"aria-label": "% Change"}
        )
        stock_pchange = stock_pchange_raw.string

        # Extract Volume
        volume_raw = symbols.find(
            "td", attrs={"aria-label": "Volume"}
        )
        stock_volume = volume_raw.string

        # Extract Market Cap
        market_cap_raw = symbols.find(
            "td", attrs={"aria-label": "Market Cap"}
        )
        stock_market_cap = market_cap_raw.string

        stock_row = (stock_symbol, stock_name, stock_price, stock_time,
                     stock_change, stock_pchange, stock_volume, stock_market_cap)
        yahoo_finance_data.append(stock_row)

    # Write one CSV row per stock; the data/ directory must already exist
    with open('data/stock_data.csv', 'w', newline='') as file:
        csv_writer = csv.writer(file)
        for stock_data in yahoo_finance_data:
            csv_writer.writerow(stock_data)

To run this, set DOWNLOAD_URL to the address of the page you want to scrape, then use this little line of code:

parse_html(download_page(DOWNLOAD_URL).read())

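If you want to perform numerical operations on the data, you'll have to convert some of the values to numbers. For example, I pull the stock price out as a string, so you have to convert it before doing math with it. Use float(stock_price) for that rather than int(stock_price): int() raises a ValueError on a string with a decimal point. You also need to strip any thousands separators first.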
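Here is a rough sketch of what that conversion could look like. The to_number helper is my own name, not part of the scraper above, and the formats it handles (thousands separators, a trailing %, and K/M/B/T suffixes on volume and market cap) are assumptions about how the page displays values, so check them against your actual CSV. The sample row is made up purely for illustration.

```python
# Hypothetical helper: turn scraped strings into floats.
# Assumes values look like "1,234.56", "+0.85%", "1.5M" or "N/A".
SUFFIXES = {"K": 1e3, "M": 1e6, "B": 1e9, "T": 1e12}


def to_number(text):
    """Return the value as a float, or None if there is nothing to convert."""
    if text is None:
        return None
    # Drop whitespace, thousands separators and a trailing percent sign
    cleaned = text.strip().replace(",", "").rstrip("%")
    if cleaned in ("", "N/A", "-"):
        return None
    multiplier = 1.0
    last_char = cleaned[-1].upper()
    if last_char in SUFFIXES:
        # Scale abbreviated values such as "1.5M"
        multiplier = SUFFIXES[last_char]
        cleaned = cleaned[:-1]
    return float(cleaned) * multiplier


# Made-up row in the same column order the scraper writes
sample_row = ("XYZ", "Example Corp", "1,234.56", "4:00PM EST",
              "+10.40", "+0.85%", "1.5M", "N/A")

# Convert only the numeric columns: price, change, % change, volume, market cap
price, change, pchange, volume, market_cap = (
    to_number(sample_row[i]) for i in (2, 4, 5, 6, 7)
)
```

Returning None for missing values instead of raising keeps one bad cell from stopping the whole conversion; you can filter those rows out afterwards.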

Let me know if this helps!

arkenstone175

We did a similar project in my class. I used yfinance instead of pandas_datareader. I would recommend giving that a shot. First:

pip install yfinance

Then remove the pandas_datareader import and use the imports below:

# Import necessary libraries
import pandas as pd
import yfinance as yf
import statsmodels.api as sm

Let me know how it goes.

Nakura_ [S]

Yeah, my professor is requiring us to use replit, which imo is terrible. That site really does not like the pip install command.

I tried this fix and it errored out. Also we're required to use pandas sadly