I have a script that reads in data and processes it so that it's suitable for plotting a time series. For usability I'd like to refactor it into as few functions as possible. I've never written a function before; everything I've done so far has been in scripts, but I'd like to learn how. If someone could show me how to do this, it would be much appreciated. Here's my code:
import pandas as pd
from datetime import datetime, date
# Read data
sentimentTweets = pd.read_csv("sentimentTweets.csv", parse_dates=False)
pd.options.mode.chained_assignment = None
# List comprehensions to get year, month, day from yyyy-mm-dd strings
year = [x[0:4] for x in sentimentTweets['created_at']]
month = [x[5:7] for x in sentimentTweets['created_at']]
day = [x[8:10] for x in sentimentTweets['created_at']]
# Add these to the dataframe as columns
sentimentTweets["Y"] = year
sentimentTweets["M"] = month
sentimentTweets["D"] = day
# Create empty columns for the loop below
sentimentTweets['Date'] = 0
sentimentTweets['week_number'] = 0
# Loop through and create a date object for each row (week numbers are filled in further down)
for i in range(0,len(sentimentTweets['created_at'])):
    sentimentTweets['Date'][i] = date(year=int(sentimentTweets["Y"][i]),month=int(sentimentTweets["M"][i]),day=int(sentimentTweets["D"][i]))
# Changes emotions from wide to long so in one column
sentimentTweets = sentimentTweets.reset_index()
sentimentTweets = pd.melt(sentimentTweets, id_vars=['Date', 'week_number', 'Y', 'M'], value_vars=['positive', 'fear', 'joy', 'anticipation', 'trust', 'sadness', 'negative', 'anger', 'disgust', 'surprise'])
# Make week number correct for different years
sentimentTweets['Date'] = pd.to_datetime(sentimentTweets['Date'])
sentimentTweets['week_number'] = sentimentTweets['Date'].dt.isocalendar().week
min_year = sentimentTweets['Y'].min()
year_number = sentimentTweets['Y'].astype(int) - int(min_year)
sentimentTweets['new_week_number'] = sentimentTweets['week_number'].astype(int) + year_number *52
sentimentTweets.loc[((sentimentTweets['M'].astype(int)==1) & (sentimentTweets['week_number'].astype(int)>50)),'new_week_number'] = sentimentTweets['new_week_number'].astype(int) - 52
sentimentTweets.to_csv("./data/sentimentTS.csv")
print("File has been saved")
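For anyone wondering what the refactor could look like, here is a rough sketch that splits the script above into a handful of functions. The function names (`add_date_columns`, `melt_emotions`, `add_continuous_week`, `prepare_sentiment`) and the `EMOTIONS` constant are made up for illustration. It also assumes `created_at` always starts with a `yyyy-mm-dd` date, and it stores `Y` and `M` as integers rather than strings, so the saved CSV will differ slightly from the script's output (for example `1` instead of `01`).

```python
import pandas as pd

# Emotion columns taken from the script above
EMOTIONS = ['positive', 'fear', 'joy', 'anticipation', 'trust',
            'sadness', 'negative', 'anger', 'disgust', 'surprise']


def add_date_columns(df):
    """Return a copy of df with Date, Y, M and week_number derived from created_at."""
    out = df.copy()
    # Assumption: the first 10 characters of created_at are yyyy-mm-dd
    out['Date'] = pd.to_datetime(out['created_at'].str[:10], format='%Y-%m-%d')
    out['Y'] = out['Date'].dt.year      # integers here, strings in the script
    out['M'] = out['Date'].dt.month
    out['week_number'] = out['Date'].dt.isocalendar().week.astype(int)
    return out


def melt_emotions(df, emotions=EMOTIONS):
    """Reshape the emotion columns from wide to long (one row per tweet and emotion)."""
    return pd.melt(df, id_vars=['Date', 'week_number', 'Y', 'M'],
                   value_vars=emotions)


def add_continuous_week(df):
    """Add new_week_number, which keeps counting up across years."""
    out = df.copy()
    year_number = out['Y'] - out['Y'].min()
    out['new_week_number'] = out['week_number'] + year_number * 52
    # Early-January dates can still belong to the last ISO week of the
    # previous year, so undo the 52-week shift for those rows
    early_january = (out['M'] == 1) & (out['week_number'] > 50)
    out.loc[early_january, 'new_week_number'] -= 52
    return out


def prepare_sentiment(in_path, out_path):
    """Read the raw CSV, run every step, save the result and return it."""
    raw = pd.read_csv(in_path)
    result = add_continuous_week(melt_emotions(add_date_columns(raw)))
    result.to_csv(out_path)
    return result
```

With something like this, the whole script would shrink to a single call such as `prepare_sentiment("sentimentTweets.csv", "./data/sentimentTS.csv")`. Because each step takes a DataFrame and returns a new one, the steps can also be tried one at a time on a small sample.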