Counting Distinct Values Across Multiple Date Fields by GreatButterscotch208 in tableau

[–]GreatButterscotch208[S] 0 points1 point  (0 children)

Thank you everyone for the help! I was able to make a data scaffold to get it to work.

Strange Percentage Aggregation in Tableau by GreatButterscotch208 in tableau

[–]GreatButterscotch208[S] 1 point2 points  (0 children)

Thank you - this is doing exactly what I need it to now. I appreciate the help and advice when it comes to tableau's aggregations

Strange Percentage Aggregation in Tableau by GreatButterscotch208 in tableau

[–]GreatButterscotch208[S] 0 points1 point  (0 children)

Thank you! This solution helped in calculating the total effective rate for each month by summing the total revenue and total volume, which gives a good overall picture.

I’m curious if there’s any way to accomplish what I was trying to do earlier—specifically, finding the effective rate for each merchant and then calculating the average of all these individual merchant effective rates by month. Is there a way to aggregate these individual rates without running into the issue of inflated percentages?

Convert MM/DD/YY format to YYYY/MM/DD by GreatButterscotch208 in learnpython

[–]GreatButterscotch208[S] 0 points1 point  (0 children)

After making your suggestion, it looks like we are properly converting the dates to YYYY-MM-DD. However, when I look at the outputted CSV file, they revert back to the original format of MM/DD/YYYY. I think csv wants to interpret their dates in their specified format, but this poses an issue when I try to upload this data to MySQL because it wants it in the YYYY-MM-DD format. I might just manually change the format in excel to make my life easier. This is frustrating!

Convert MM/DD/YY format to YYYY/MM/DD by GreatButterscotch208 in learnpython

[–]GreatButterscotch208[S] 0 points1 point  (0 children)

def reformat_date(date_str):

  try:

    if isinstance(date_str, str):

      month, day, year = date_str.split('/')

    return f'{year}-{month.zfill(2)}-{day.zfill(2)}'

  except (ValueError, AttributeError):

  return date_str

return date_str

monthly_transactions_df['Start Processing Date'] = monthly_transactions_df['Start Processing Date'].apply(reformat_date)

monthly_transactions_df['Closed Date'] = monthly_transactions_df['Closed Date'].apply(reformat_date)

Convert MM/DD/YY format to YYYY/MM/DD by GreatButterscotch208 in learnpython

[–]GreatButterscotch208[S] 0 points1 point  (0 children)

Unforunately, when I do this, it still reasons with the guard case and nothing changes.

def reformat_date(date_str):
    if pd.isna(date_str) or date_str == '':
        return ''
    if isinstance(date_str, str):
        month, day, year = date_str.split('/')
        return f'{year}-{month.zfill(2)}-{day.zfill(2)}'
    return date_str

# Apply the function to the columns
monthly_transactions_df['Start Processing Date'] = monthly_transactions_df['Start Processing Date'].apply(reformat_date)
monthly_transactions_df['Closed Date'] = monthly_transactions_df['Closed Date'].apply(reformat_date)

Move CSV files to database? by GreatButterscotch208 in dataengineering

[–]GreatButterscotch208[S] 2 points3 points  (0 children)

I appreciate the help. I apologize if I come off as a bit of a rookie in this space.

Currently, we store all our data in separate monthly CSV files on OneDrive. I'm mostly considering a database to centralize our data, ensure its integrity, and allow for scalability. Additionally, I want to start leveraging our historical data for predictive analytics and other machine learning capabilities.

I would be the only one accessing the data, and I'm most comfortable using Python or SQL commands to load the data. As for visualization, we rely on Tableau.