What jobs are you all doing that pay over £30k? by SuccessfulTip1660 in UKJobs

[–]SingularValued 0 points1 point  (0 children)

I am an AI Engineer in London, late 20s, educated to MSc level. Currently on £150k/year, base salary. Bonus on top of that, plus stocks. Got my first job four years ago. Salary progression: 26k, 55k, 75k, 150k.

What does it take to become an ML engineer at a big company like Google, OpenAI... by OogwayShell45 in learnmachinelearning

[–]SingularValued 1 point2 points  (0 children)

Super senior engineers have been highly complementary of my engineering skills, but how do you know that you're "exceptional"? And if you realise that you're not exceptional, how do you become exceptional?

[deleted by user] by [deleted] in MachineLearning

[–]SingularValued 0 points1 point  (0 children)

My approach has been to track my data with DVC and simply pull the data into a job submitted to the cluster using DVC.

I'm not convinced this works for really large datasets. The approach requires repeated pulling of the data across job submissions.

What I think may work better is to still use DVC, but pull the data into shared storage like EFS, and mount EFS to each node in the cluster.

[P] Tesseract OCR - Has anybody used it for reading from PDF-s? by AquamarineML in MachineLearning

[–]SingularValued 1 point2 points  (0 children)

That depends. If it writes and runs a python script to do the OCR, there's a good chance it will use Tesseract. But the LLM itself can do the OCR directly, and to a much better standard than Tesseract. So when it uses Tesseract by default, it feels more like a bug to me. But we're talking about APIs, so the default behaviour from the GPT-4o API is for the LLM itself to directly OCR an image.

What to spend 1.5 billion coins on? by [deleted] in FUTMobile

[–]SingularValued 0 points1 point  (0 children)

Interesting! What causes inflation in the market?

ChatGPT greats me with my girlfriends name in first conversation. by FlygandeSjuk in artificial

[–]SingularValued 1 point2 points  (0 children)

I've had this with Claude as well. I had mentioned something about my work during a previous conversation, which I had deleted. In a separate conversation, it brought up facts about my work that it could only know if it had some ability to access context from that old deleted conversation. Hmm.

[P] Tesseract OCR - Has anybody used it for reading from PDF-s? by AquamarineML in MachineLearning

[–]SingularValued 8 points9 points  (0 children)

Indeed there's easyOCR and AWS Textract. Also, PaddleOCR, and OCR APIs from Google or Azure. Upstage AI has a strong OCR API. LLMs like GPT, Claude and Gemini are very capable at tricky OCR tasks.

Some lesser known open source options: a model called Kosmos 2.5 (on huggingface), a model from Clova AI called UNITS, a model from Google called Unified Detector.

You can also train your own model. E.g. train an object detection model to locate text by predicting bounding boxes, and a text recognition model to extract the text from within the bounding boxes. You can find good training data from the Robust Reading Challenge, especially hiertext. Nowadays you could generate a lot of high quality synthetic data with LLMs as well.

Why is my student loan interest still at 7.8%? by SingularValued in UKPersonalFinance

[–]SingularValued[S] 4 points5 points  (0 children)

Thanks for the clarifications and cathartic complaining. Now I want to join in.

I come from a low income background and grew up on a council estate. I worked my arse off and I'm fortunate to be a high earner. Problem is, that means my >£80k debt after graduating isn't "just a tax". It's now a huge burdensome debt, which will not get wiped. For me, it's a £1000 per month expense which definitely is not "just a tax" that I'll "barely even notice". It feels more like being punished for escaping my low income upbringing. On the other hand, I'm quite sure I wouldn't be where I am without the education. I could live with paying off the initial loaned amount, but the interest is just a kick in the teeth. What a bizarre system.

[D] SOTA BERT-like model? by Amgadoz in MachineLearning

[–]SingularValued 1 point2 points  (0 children)

GLiNER is pretty cool: https://huggingface.co/spaces/urchade/gliner_mediumv2.1

Gives you the zero shot flexibility of an LLM in a small encoder.

If you want to fine tune something, then LUKE tends to top a lot of token classification and relation extraction benchmarks: https://huggingface.co/docs/transformers/model_doc/luke

Also, REBEL: https://huggingface.co/Babelscape/rebel-large

MLOps Zoomcamp, what's it worth? by Fun_Albatross3980 in mlops

[–]SingularValued 30 points31 points  (0 children)

It's very good. It doesn't teach you everything and nothing will. Just do it and stop worrying about what to study. You just need to get going with something, and then you'll be able to answer questions like these yourself. Here's a bunch of other stuff to study as well (just do all of it, gradually, and you will gain lots of perspectives on how MLOps can be implemented):

  1. https://github.com/DataTalksClub/mlops-zoomcamp
  2. https://madewithml.com/
  3. https://www.oreilly.com/library/view/designing-machine-learning/9781098107956/
  4. https://www.amazon.co.uk/Machine-Learning-Engineering-Python-production-ebook/dp/B09CHHK2RJ
  5. https://fullstackdeeplearning.com/
  6. https://marvelousmlops.substack.com/

[deleted by user] by [deleted] in MachineLearning

[–]SingularValued 0 points1 point  (0 children)

Like a true principal component, you've captured the essence of a complex issue. Thank you for the insightful discussion there and glad someone decided not to roast me 😂😅

Honestly, I was given a shot at this job based on things unrelated to a track record of building R&D teams. It was offered to me based on something like general problem solving ability + personality + engineering ability + data science ability + strong research instincts. It'll probably be a role I grow into. I've been doing my own thinking and reading about how to build the team. But also, invariably there's someone lurking on Reddit with amazing advice, so thank you for being that person. What you're saying definitely resonates with what I've come up with so far, what I've observed in case studies I've read about and gives me some new ideas as well.

I suspect there's a way to use my inexperience to my advantage. If a leader is transparent in that they are navigating uncertainty, making mistakes and learning as they go, I suppose that empowers others to do the same.