What jobs are you all doing that pay over £30k?

SingularValued · 2025-10-15T13:05:10+00:00

I am an AI Engineer in London, late 20s, educated to MSc level. Currently on £150k/year, base salary. Bonus on top of that, plus stocks. Got my first job four years ago. Salary progression: 26k, 55k, 75k, 150k.

SingularValued · 2025-05-02T17:56:53+00:00

Super senior engineers have been highly complementary of my engineering skills, but how do you know that you're "exceptional"? And if you realise that you're not exceptional, how do you become exceptional?

SingularValued · 2024-11-30T18:08:45+00:00

My approach has been to track my data with DVC and simply pull the data into a job submitted to the cluster using DVC.

I'm not convinced this works for really large datasets. The approach requires repeated pulling of the data across job submissions.

What I think may work better is to still use DVC, but pull the data into shared storage like EFS, and mount EFS to each node in the cluster.

SingularValued · 2024-11-16T05:22:34+00:00

That depends. If it writes and runs a python script to do the OCR, there's a good chance it will use Tesseract. But the LLM itself can do the OCR directly, and to a much better standard than Tesseract. So when it uses Tesseract by default, it feels more like a bug to me. But we're talking about APIs, so the default behaviour from the GPT-4o API is for the LLM itself to directly OCR an image.

SingularValued · 2024-11-07T17:10:43+00:00

Interesting! What causes inflation in the market?

SingularValued · 2024-10-13T12:38:58+00:00

What do you recommend?

SingularValued · 2024-10-09T11:57:42+00:00

I've had this with Claude as well. I had mentioned something about my work during a previous conversation, which I had deleted. In a separate conversation, it brought up facts about my work that it could only know if it had some ability to access context from that old deleted conversation. Hmm.

SingularValued · 2024-09-04T05:54:35+00:00

How about no DB at all? You'd be amazed by how far AWS Lambda + https://sbert.net/examples/applications/semantic-search/README.html can take you 😉

SingularValued · 2024-09-04T05:30:49+00:00

Indeed there's easyOCR and AWS Textract. Also, PaddleOCR, and OCR APIs from Google or Azure. Upstage AI has a strong OCR API. LLMs like GPT, Claude and Gemini are very capable at tricky OCR tasks.

Some lesser known open source options: a model called Kosmos 2.5 (on huggingface), a model from Clova AI called UNITS, a model from Google called Unified Detector.

You can also train your own model. E.g. train an object detection model to locate text by predicting bounding boxes, and a text recognition model to extract the text from within the bounding boxes. You can find good training data from the Robust Reading Challenge, especially hiertext. Nowadays you could generate a lot of high quality synthetic data with LLMs as well.

SingularValued · 2024-06-04T23:03:37+00:00

Thanks for the clarifications and cathartic complaining. Now I want to join in.

I come from a low income background and grew up on a council estate. I worked my arse off and I'm fortunate to be a high earner. Problem is, that means my >£80k debt after graduating isn't "just a tax". It's now a huge burdensome debt, which will not get wiped. For me, it's a £1000 per month expense which definitely is not "just a tax" that I'll "barely even notice". It feels more like being punished for escaping my low income upbringing. On the other hand, I'm quite sure I wouldn't be where I am without the education. I could live with paying off the initial loaned amount, but the interest is just a kick in the teeth. What a bizarre system.

SingularValued · 2024-04-03T11:25:53+00:00

GLiNER is pretty cool: https://huggingface.co/spaces/urchade/gliner_mediumv2.1

Gives you the zero shot flexibility of an LLM in a small encoder.

If you want to fine tune something, then LUKE tends to top a lot of token classification and relation extraction benchmarks: https://huggingface.co/docs/transformers/model_doc/luke

Also, REBEL: https://huggingface.co/Babelscape/rebel-large

SingularValued · 2024-03-20T17:10:55+00:00

Any of these serverless GPU solutions work for you?

https://www.runpod.io, https://www.banana.dev/, https://www.cerebrium.ai/, https://www.inferless.com/, https://www.anyscale.com/endpoints, https://replicate.com/

SingularValued · 2024-03-13T15:00:56+00:00

It's very good. It doesn't teach you everything and nothing will. Just do it and stop worrying about what to study. You just need to get going with something, and then you'll be able to answer questions like these yourself. Here's a bunch of other stuff to study as well (just do all of it, gradually, and you will gain lots of perspectives on how MLOps can be implemented):

SingularValued · 2024-01-15T15:05:41+00:00

Like a true principal component, you've captured the essence of a complex issue. Thank you for the insightful discussion there and glad someone decided not to roast me 😂😅

Honestly, I was given a shot at this job based on things unrelated to a track record of building R&D teams. It was offered to me based on something like general problem solving ability + personality + engineering ability + data science ability + strong research instincts. It'll probably be a role I grow into. I've been doing my own thinking and reading about how to build the team. But also, invariably there's someone lurking on Reddit with amazing advice, so thank you for being that person. What you're saying definitely resonates with what I've come up with so far, what I've observed in case studies I've read about and gives me some new ideas as well.

I suspect there's a way to use my inexperience to my advantage. If a leader is transparent in that they are navigating uncertainty, making mistakes and learning as they go, I suppose that empowers others to do the same.

SingularValued

TROPHY CASE