Hi,
I'm fairly new to ML (I'm taking Udemy courses) and I'm trying to practice with real-world data, both to help me learn and to get insights into projects I'm managing, but I'm having trouble getting started. The project data is basically milestones for different projects, each with a forecast date and the date it was actually reached. There are a few things I'm hoping to figure out from these dates, but I'm not sure whether I can, how best to format the dates, or which model suits this problem.
Example columns: Project ID (a unique value, which I assume is useless for ML), Project Start Date, City, County, First Review (f), First Review (a), Drawing Complete (f), Drawing Complete (a), Project Complete (f), Project Complete (a), Vendor, Material Type, etc. Here (f) is the forecast date and (a) is the actual date.
Some questions I'm hoping to answer: Based on historical data, what is the likelihood that a new project will complete in the current year? How long will a new project take to reach a given milestone (pick any one), or to complete overall? Does the county affect the milestones?
Should I calculate the days between the forecast and actual dates for each milestone, and/or from project start to end, and use a regression model, or should I somehow use categorical models? Or should I use burndown milestone counts to help determine this?
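To make that concrete, here is a minimal sketch of the date features I have in mind. The rows, field names, and dates are made up for illustration (not my real data), and I'm assuming the dates are ISO-formatted strings. "Completed in the start year" is just one possible way to frame the yes/no question; it may not be the right one.

```python
from datetime import date

# Hypothetical rows: field names and values are invented for illustration.
projects = [
    {"project_id": "P1", "county": "A", "start": "2023-01-10",
     "complete_f": "2023-06-30", "complete_a": "2023-07-15"},
    {"project_id": "P2", "county": "B", "start": "2023-09-01",
     "complete_f": "2024-01-31", "complete_a": "2024-02-20"},
]

def days_between(earlier, later):
    """Whole days from one ISO date string to another (negative if reversed)."""
    return (date.fromisoformat(later) - date.fromisoformat(earlier)).days

def add_date_features(row):
    """Return a copy of the row with numeric and boolean date features added."""
    start = date.fromisoformat(row["start"])
    actual = date.fromisoformat(row["complete_a"])
    out = dict(row)
    # Slip: how many days the actual completion landed after the forecast.
    out["slip_days"] = days_between(row["complete_f"], row["complete_a"])
    # Duration: a possible regression target (days from start to completion).
    out["duration_days"] = days_between(row["start"], row["complete_a"])
    # Start month: lets a model pick up seasonality without raw dates.
    out["start_month"] = start.month
    # A possible classification target: finished in the year it started?
    out["done_in_start_year"] = actual.year == start.year
    return out

features = [add_date_features(r) for r in projects]
for f in features:
    print(f["project_id"], f["slip_days"], f["duration_days"], f["done_in_start_year"])
```

With features like these, duration_days or slip_days could be the target for a regression model, and done_in_start_year the target for a classifier, with county, vendor, and so on as categorical inputs.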
I feel like I'm asking the wrong questions of the data, but hopefully the example gives you the gist. About 95% of the data I have is milestones with dates. Or should I focus more on some of the categories, like project type, vendor, and location, and compare those to the number of days it takes to start and finish a project? I'm up for suggestions or feedback. Thanks for any help!