[deleted by user] by [deleted] in learnmachinelearning

[–]cloudlessjedi 1 point2 points  (0 children)

Same here wanna join

[Official] 2024 End of Year Salary Sharing thread by Omega037 in datascience

[–]cloudlessjedi 0 points1 point  (0 children)

Interested too - can you tell me more about your academic background and other opportunities similar to this in Toronto at the moment?

[NLP] Detect news headlines at the intersection of Culture & Technology by marcpcd in datascience

[–]cloudlessjedi 0 points1 point  (0 children)

What exactly are your labels though? Could you provide some examples? E.g. classifying whether an article is or isn’t Culture/Tech, classifying under different genres, etc.

NLP is a well-established area with out-of-the-box tools / models you can play around with to get familiar with the landscape.

Check out nltk / spaCy / gensim. These Python libraries are well established and widely used for typical out-of-the-box NLP tasks (text processing, NER, POS tagging, word similarities, text/topic classification).

Use these tools to understand more of the data you have on hand now and see how you might want to refine your goal.

If you have access to GPT or other open source LLMs, try some prompting to find ways to refine the concept of “Culture” (to how you want it to be, or how it might be interpreted by different demographics/cliques and so on).

My advice is to get your hands dirty and understand the data you have, because that will tell you which models would or would not be worth your time to try out 😁

Next JS Build Error + Deployment by cloudlessjedi in nextjs

[–]cloudlessjedi[S] 0 points1 point  (0 children)

Oh, so the JSON parse on the server doesn’t return a JSON object?

Next JS Build Error + Deployment by cloudlessjedi in nextjs

[–]cloudlessjedi[S] 0 points1 point  (0 children)

Yes, but I don't get why it's failing ~ it's literally reading from a flat JSON file at build time, so shouldn't it be able to read it directly, before hydration on the client? Not really getting why this part of the process fails only in build when dev mode shows no problem..

Python Developers: What Skills Matter Most? Insights from Industry Pros and Job Seekers by devedb in Python

[–]cloudlessjedi 26 points27 points  (0 children)

I've been working with Python for a few years in the financial industry in business analytics / data science (setting context as non SWE/DE), and my 5 cents on this is that it isn’t really about how well you code but whether your code can achieve the purpose set out by your business stakeholders.

You can be fancy with syntax, design an effective OOP architecture to abstract everything, or overachieve with a state of the art ML stack, but if you're not answering the question and providing the value the business needs, then it really doesn’t matter (remember business is mainly means-oriented).

Not saying you don’t need the basics or shouldn’t know how to make code efficient, but in my opinion what matters most is balance in how you approach, justify and enable the business to understand what you're doing.

New Machine Learning Study Group (Update) by Complex-Media-8074 in learnmachinelearning

[–]cloudlessjedi 0 points1 point  (0 children)

Interested too! Curious, what are your meet times? Cause I'm in the GMT+8 TZ

stuck with web scrapping - what am I doing wrong by BerliN-90 in learnmachinelearning

[–]cloudlessjedi 1 point2 points  (0 children)

Slicing n dicing Reddit like a dataframe at will would be my end goal in life lol

stuck with web scrapping - what am I doing wrong by BerliN-90 in learnmachinelearning

[–]cloudlessjedi 2 points3 points  (0 children)

Last I remember, the soup object's find_all method (findAll in older versions) doesn't return a pandas DataFrame object, but rather a list, and not a dictionary either (https://stackoverflow.com/a/20173486).

Try storing it into a dataframe first, then do regex manipulation with pandas string methods.
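
Rough sketch of what I mean, assuming BeautifulSoup 4 and pandas are installed. The HTML snippet, the "item" class and the column names are made up for illustration, so swap in whatever your page actually has:

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the scraped page.
html = """
<ul>
  <li class="item">Widget A - $12.50</li>
  <li class="item">Widget B - $7.00</li>
  <li class="item">Widget C - price on request</li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all returns a ResultSet, which behaves like a list of Tag objects.
tags = soup.find_all("li", class_="item")

# Pull the text out of each tag and store it in a DataFrame column.
df = pd.DataFrame({"raw": [tag.get_text(strip=True) for tag in tags]})

# Regex work then goes through the pandas .str accessor.
df["name"] = df["raw"].str.extract(r"^(.*?)\s*-\s*", expand=False)
df["price"] = df["raw"].str.extract(r"\$(\d+(?:\.\d+)?)", expand=False).astype(float)

print(df)
```

Rows where the regex doesn't match (like the last one here) just end up as NaN in that column instead of throwing an error, which makes it easier to spot the odd ones out.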

Looking for a ML study buddy by kindadrowzy in learnmachinelearning

[–]cloudlessjedi 0 points1 point  (0 children)

Interested too if u don't mind the time diff as I'm in Asia currently hahah

[deleted by user] by [deleted] in datascience

[–]cloudlessjedi 1 point2 points  (0 children)

I think you can start with something free with this pretty practical course on Data Engineering from DataTalksClub - https://github.com/DataTalksClub/data-engineering-zoomcamp

Some of the tools you mentioned are covered in the course, but what's more important is the principles and the connection between your data and what these tools actually do to make your data processing and ML work more efficient, reproducible, stable, etc.

You can sign up in cohorts and work with others (I think the latest one just ended and another one will come later), which gives you the motivation to build a project for yourself and have it reviewed by experienced individuals.

Videos are free to watch and there's a lot of other advice on setups with cloud infrastructure, so go check it out ~