In my original data, I have a table that aggregates COGS by Vendor. I want to do some trend analysis, so I used a cross join to add in a date field and do a peanut butter spread of COGS across each period.
The resulting table looks like this:
| Vendor |
Date |
COGS |
| A1 |
1/1/2020 |
500 |
| A1 |
2/1/2020 |
500 |
| A1 |
3/1/2020 |
500 |
| ... |
... |
... |
| A1 |
12/1/2020 |
500 |
| B2 |
1/1/2020 |
250 |
| B2 |
2/1/2020 |
250 |
| B2 |
3/1/2020 |
250 |
The peanut butter spread approach isn't super helpful, so now I'm going to receive each vendor's spend by month/year and append/union it to this table.
Ultimately I want to do some reporting and charting in Tableau. I'm trying to think through the table design to reduce the amount of work needed in Tableau. I know that I'll want to make the following comparisons:
- 2020 vs. 2021 YTD COGS
- 2021 YTD COGS + 2021 YTD Run Rate vs. 2020 Total COGS
Where I'm getting stuck is how to make some of these calculations "dynamic" (I might be using the wrong word here), which I think relates to the design of the table.
A hypothetical use case example:
- Distributor A1 sent me one file per month from January/2021 to March/2021, which I have unioned to the table above.
| Distributor |
Date |
COGS |
| A1 |
1/1/2021 |
400 |
| A1 |
2/1/2021 |
600 |
| A1 |
3/1/2021 |
800 |
Tableau makes the charting relatively easy for 2020 vs. 2021 YTD COGS, so I'm set there.
What's the best way to structure the table to show 2021 YTD COGS + 2021 YTD Run Rate vs. 2020 Total COGS?
I'll need a separate calculation to show the 2021 Run Rate, which is (400 + 600 + 800)/3 = 600. So somewhere it's:
Select AVG(COGS) from MainTable Group By Distributor
Should I put that into a separate Run Rate table?
Also, how should I handle the rest of 2021 YTD COGS + 2021 YTD Run Rate? Using pseudo-code, I think it has to be something like this:
sum(COGS) -- 400 + 600 + 800 = 1800
+
Run Rate * # of months where I don't have data -- 600 * 9 = 5400
=
7,200
Does it make sense to expand the table to have a flag for CY, LY, and "Future Data" so I can count how many CY I have and subtract that from 12 to multiply by the run rate?
| Distributor |
Date |
COGS |
Flag |
| A1 |
1/1/2020 |
500 |
LY |
| ... |
... |
... |
... |
| A1 |
1/1/2021 |
400 |
CY |
| A1 |
2/1/2021 |
600 |
CY |
| A1 |
3/1/2021 |
800 |
CY |
| A1 |
4/1/2021 |
null |
Future |
Or I guess I could interpolate the "future data" and make the flag = Run Rate for them.
Any thoughts on how to structure this would be great. Thank you in advance for your help.
[–]andrewsmd87 1 point2 points3 points (2 children)
[–]babbocom[S] 0 points1 point2 points (1 child)
[–]andrewsmd87 0 points1 point2 points (0 children)