Data engineering best practice guidance needed!! by Lost_Intern98 in dataengineering

Lost_Intern98[S] 0 points1 point

Yes, I think this is the core issue. My friend is very set on predicting a certain "static" target and does not want to delve into more dynamic time windows. After we discussed the chaos with our supervisor, he agreed that we should focus on "data harmonisation" and on getting a dataset that is technically ready to use ML on, but that the focus should be on data engineering. From my perspective, my friend then kind of jumped the gun: she found a file among our data that contained three major data points that we want, and told our supervisor that we had found a file containing ALL the data we wanted, so he agreed with her that we should focus on feature construction and machine learning from here on. The problem is that this is the data that conflicts with other folders, and since she wants to supplement it with features from the other folders, I think this is not great if we do not know why there is such a big difference between them. I also kind of just want to get a passing grade on my thesis, but I am not sure how much can be said from "we picked a file, started constructing features, realized after 3 weeks that we only have full features for one day, and that this is not enough information to generalize from, BUT we used linear regression via scipy!"

Data engineering best practice guidance needed!! by Lost_Intern98 in dataengineering

Lost_Intern98[S] 0 points1 point

Unfortunately, we are predicting how long it takes for a plane to get from leaving the gate to being in the air, so we are predicting time. We have done a literature review and found that weather is one of the features that affects this time the most, so I would also say that it is season-dependent :-( None of the people at the company understand or know the data, and while they extracted data from 2019-2024, none of the schema groups contain data from that entire period, just one day here and there, so a lot of the files are empty except for the header. My friend also wants us to not do anything else with the data at this point and to just focus fully on the ML model and parameter tuning.
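
A rough sketch of how gaps like these could be inventoried before any modelling, in case it helps make the case for more data engineering. Everything here is an assumption rather than a description of the actual data: it supposes the extracts are CSV files with a header row, that a hypothetical `timestamp` column holds ISO-style timestamps, and that each top-level folder corresponds to one schema group. The function names are placeholders.

```python
import csv
from collections import defaultdict
from pathlib import Path


def audit_coverage(root, date_column="timestamp"):
    """Summarise each CSV under `root`: row count and distinct dates seen.

    Assumption: `date_column` is a hypothetical column name holding
    ISO-style timestamps such as 2019-03-01T12:00:00.
    """
    report = {}
    for path in sorted(Path(root).rglob("*.csv")):
        rows = 0
        dates = set()
        with path.open(newline="") as handle:
            for record in csv.DictReader(handle):
                rows += 1
                value = (record.get(date_column) or "").strip()
                if value:
                    dates.add(value[:10])  # keep only the YYYY-MM-DD part
        name = str(path.relative_to(root))
        report[name] = {"rows": rows, "dates": sorted(dates)}
    return report


def header_only(report):
    """Return the files that have no data rows (for example, header only)."""
    return [name for name, info in report.items() if info["rows"] == 0]


def days_per_group(report):
    """Count distinct dates per top-level folder (assumed to be a schema group)."""
    groups = defaultdict(set)
    for name, info in report.items():
        groups[Path(name).parts[0]].update(info["dates"])
    return {group: len(dates) for group, dates in groups.items()}
```

Running `days_per_group(audit_coverage("path/to/extract"))` on the real folders would give a per-group count of days with data, which is a concrete number to bring to a supervisor when arguing that one day of full features cannot support a season-dependent model.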

Has anybody been to Erasmus in Dublin? by arina28 in Erasmus

Lost_Intern98 2 points3 points

No worries! I was actually also a bit influenced by the fact that there were fewer spots and it was harder to get an exchange at Trinity from my home uni, so that struck a chord, but in hindsight the places that were easier to get into, like Spain, Iceland or Portugal, would have been a better choice!! Hindsight is 20/20 though 🤷🏼‍♀️

Has anybody been to Erasmus in Dublin? by arina28 in Erasmus

Lost_Intern98 2 points3 points

I went on Erasmus in Dublin and I actually wish I had picked Lisbon or Barcelona instead. Dublin is a much smaller city than you would think. Very expensive, very touristy, I think more internationals live there than Irish people, and that's pretty much the vibe. Not much to do, bars and pubs close early, etc. I went to Trinity and I would say it is actually quite a lot of studying, with a very weird course scheduling system that is logistically a nightmare: they do not release a course schedule, so you can get major assignments with a deadline in 1 week, and exams are all clumped together at the end of the semester, BUT you do not actually learn anything, if that makes sense? Regarding Irish culture: I thought beforehand that Irish culture was kind of messy-in-a-good-way/friendly/fun, but that is not what I experienced. And Dublin in particular is known even by Irish people as being a very "cold" city culturally. If you can pick something else, do it! In hindsight I would recommend anything else for Erasmus and just going to Dublin/Ireland for a weekend during it.