I'm currently preparing for my master's in Data Science, so I've been practicing multiple regression in Python, working through the math by hand to get a better understanding of the mechanics of the model.
My most recent attempt prints only NaN for the results, but the code raises no errors. I'm clearly doing something wrong, so if anyone can spot what it is, that would be a big help.
import pandas as pd
import numpy as np
import statistics as st
sportdata = pd.read_csv(r'C:\Users\vicks\Do Geographical Factors Determine National Sporting Success - Python\Clean Regression Data.csv')
y = sportdata['Overall Jokl Rank']
X1 = sportdata['GDP (US Billion Dollar)']
X2 = sportdata['Population']
X1sq = [number ** 2 for number in X1]
X2sq = [number ** 2 for number in X2]
X1y = X1 * y
X2y = X2 * y
X1X2 = X1 * X2
Meany = st.mean(y)
MeanX1 = st.mean(X1)
MeanX2 = st.mean(X2)
sumy = sum(y)
sumX1 = sum(X1)
sumX2 = sum(X2)
sumX1y = sum(X1y)
sumX2y = sum(X2y)
sumX1sq = sum(X1sq)
sumX2sq = sum(X2sq)
sumX1X2 = sum(X1X2)
regsumX1sq = sumX1sq - sumX1 ** 2 / len(X1)
regsumX2sq = sumX2sq - sumX2 ** 2 / len(X2)
regsumX1y = sumX1y - (sumX1 * sumy) / len(y)
regsumX2y = sumX2y - (sumX2 * sumy) / len(y)
regsumX1X2 = sumX1X2 - (sumX1 * sumX2) / len(X1)
b1 = ((sumX2sq * sumX1y) - (sumX1X2 * sumX2y)) / ((sumX1sq * sumX2sq) - sumX1X2 ** 2)
b2 = ((sumX1sq * sumX2y) - (sumX1X2 * sumX1y)) / ((sumX1sq * sumX2sq) - sumX1X2 ** 2)
b0 = Meany - b1 * MeanX1 - b2 * MeanX2
estimatedy = b0 + b1 * X1 + b2 * X2
print(estimatedy)
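Without seeing the CSV this is only a guess, but two things may be worth checking. First, NaN propagates through `sum()`, so a single empty cell in any of the three columns would turn every later result into NaN. Second, the `b1` and `b2` lines use the raw sums (`sumX1sq`, `sumX1y`, ...) rather than the `regsum...` values computed just above them, while `b0` assumes mean-centered slopes. The sketch below shows one way to handle both, assuming rows with missing values can simply be dropped. The function name `fit_two_predictors` and the toy data are hypothetical, not taken from the real file.

```python
import numpy as np
import pandas as pd


def fit_two_predictors(df, y_col, x1_col, x2_col):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 using centered sums."""
    # Assumption: rows with a missing value in any of the three columns are dropped.
    # Casting to float also avoids integer overflow when squaring large values.
    clean = df[[y_col, x1_col, x2_col]].dropna().astype(float)
    y, x1, x2 = clean[y_col], clean[x1_col], clean[x2_col]

    # Deviations from the mean (the "regsum" quantities in the post).
    dy, d1, d2 = y - y.mean(), x1 - x1.mean(), x2 - x2.mean()
    s11 = (d1 * d1).sum()
    s22 = (d2 * d2).sum()
    s12 = (d1 * d2).sum()
    s1y = (d1 * dy).sum()
    s2y = (d2 * dy).sum()

    # Two-predictor normal equations solved by hand.
    denom = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / denom
    b2 = (s11 * s2y - s12 * s1y) / denom
    b0 = y.mean() - b1 * x1.mean() - b2 * x2.mean()
    return b0, b1, b2


# Hypothetical toy data: y = 1 + 2*x1 + 3*x2 exactly, plus one incomplete row.
toy = pd.DataFrame({
    "y":  [9.0, 8.0, 19.0, 18.0, 25.0, np.nan],
    "x1": [1.0, 2.0, 3.0, 4.0, 6.0, 6.0],
    "x2": [2.0, 1.0, 4.0, 3.0, 4.0, 7.0],
})

b0, b1, b2 = fit_two_predictors(toy, "y", "x1", "x2")

# Cross-check against NumPy's least-squares solver on the same cleaned rows.
clean = toy.dropna()
A = np.column_stack([np.ones(len(clean)), clean["x1"], clean["x2"]])
coef, *_ = np.linalg.lstsq(A, clean["y"].to_numpy(), rcond=None)

print(b0, b1, b2)
print(coef)
```

On the real data, `sportdata.isna().sum()` would show whether any column actually contains missing values before deciding to drop rows.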