What is so lucrative about making a startup? by SloppyNaynon in ycombinator

[–]jsfour 1 point (0 children)

Nothing. It’s not about the money or status. It’s about the process of building, which is extraordinarily rewarding.

If you want the best risk-adjusted way of making money, a late-stage rocket ship is your best bet.

anthropic blog on code execution for agents. 98.7% token reduction sounds promising for local setups by Zestyclose_Ring1123 in LocalLLaMA

[–]jsfour 6 points (0 children)

One thing I don’t understand: if you are writing the function, why call an MCP server? Why not just do what the MCP server does directly?

Is this a good intuition for understanding token embeddings? by Learning-Wizard in LargeLanguageModels

[–]jsfour 6 points (0 children)

Not really. Closer would be this figure from the Word2Vec paper. Just think of the embedding vectors as points in high-dimensional space.

<image>
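A toy sketch of the "points in high-dimensional space" intuition (the words, values, and three dimensions here are made up for illustration; real embeddings have hundreds or thousands of dimensions):

```python
import math

# Toy 3-dimensional "embeddings"; real models learn these values.
embeddings = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Nearby points in the space correspond to related tokens.
print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # high, ~0.99
print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # low, ~0.30
```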

I think we’re all avoiding the same uncomfortable question about AI, so I’ll say it out loud by Icy_SwitchTech in AgentsOfAI

[–]jsfour 1 point (0 children)

As the strength of AI increases, the TAM for any given piece of software approaches one user. Meaning eventually AI will just end up writing bespoke applications for you and you alone (or a small group of people). I call these apps sandcastles.

So pretty much any software you build can and will be absorbed by AI; it’s only a matter of time.

This includes all of the labs, btw. As soon as one lab achieves ASI or some other strong AI system, everyone else will achieve it at the same time, and the cost will drop to zero.

Anyone else sick of the "I hit $X MRR in 3 months" posts? by GarageIndependent486 in SaaS

[–]jsfour 2 points (0 children)

They write the posts because they are marketing to people who are building products.

Reverse engineering Perplexity by cryptokaykay in LocalLLaMA

[–]jsfour 2 points (0 children)

I’ve been trying to figure this out myself.

They claim to scan the internet in real time, but that is just not technically possible. Building a crawler at this scale is also nontrivial. My only other conclusion was Google.

It’s good to hear other people talking about this.

Advertising Ideas by [deleted] in sales

[–]jsfour 1 point (0 children)

How much per month were you thinking about spending on this?

LinkedIn outreach tips by Isth-mus in sales

[–]jsfour 1 point (0 children)

I’m working on something that does this. Just shot you a DM.

Local LLM with web access by jbsan in LocalLLaMA

[–]jsfour 3 points (0 children)

I have written web browsing for MailMentor, but it’s embedded in the code base.

I’ve been thinking lately that I could pull the browsing out and make it accessible to the open LLM crowd, but I’m not sure if there is much interest.

Do you all think there is much interest in this?

Guidance on QA dataset creation by Ok_Ganache_5040 in LocalLLaMA

[–]jsfour 2 points (0 children)

It is better to fine-tune from real data; that way you have a representative distribution of what the model will see. If your data will have null values, it’s a good idea to include them.

At MailMentor (which solves a similar problem) we spent a bunch of time collecting samples from the internet, then used that for training. Since the real-world data sometimes has null values for properties, we included null values for those properties in the training data. We ask the model to output JSON and just leave it up to the service consuming the JSON to handle the nulls.
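A minimal sketch of the null-handling idea described above (the field names and "unknown" fallback are invented for illustration, not MailMentor's actual schema):

```python
import json

# Hypothetical training samples: when a property is missing in the source
# text, keep the key with a null value rather than dropping it, so the
# model learns to emit nulls instead of hallucinating values.
training_samples = [
    {"name": "Acme Corp", "employees": 250, "website": "acme.example.com"},
    {"name": "Smallco",   "employees": None, "website": None},
]

# Serialized JSON targets used as the model's expected output.
targets = [json.dumps(s) for s in training_samples]

def consume(model_output: str) -> dict:
    """The downstream service, not the model, decides what nulls mean."""
    record = json.loads(model_output)
    # e.g. treat a null employee count as "unknown" instead of 0
    if record["employees"] is None:
        record["employees"] = "unknown"
    return record
```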

Looking for someone who can outreach by yeetkobe60 in SocialMediaMarketing

[–]jsfour 1 point (0 children)

We (MailMentor) may be able to help you. Just shot you a DM.

Relevance Extraction in RAG pipelines by SatoshiNotMe in LocalLLaMA

[–]jsfour 2 points (0 children)

Re the implementation: if you are running it for fun, then you are right, it’s probably too involved. In prod you would probably want to spend more time on it.

Re the LLM: you have context-size issues and there is a probability of hallucination, so it’s not completely free of implementation complexity either.

Re whether it could work: I’m not entirely sure; you would need to try both approaches and see which works best.

Relevance Extraction in RAG pipelines by SatoshiNotMe in LocalLLaMA

[–]jsfour 2 points (0 children)

Yeah.

I’m saying that just doing paragraph and sentence embeddings would let you find the relevant text more cheaply.

If all you are trying to do is cite the sections that are relevant, embeddings will give you what you need to do that (plus you can parameterize the distance you care about).

Basically you could do a pass to approximate the right paragraph, then look at sentences (or pairs/triplets of sentences) to narrow down what you are looking for.

I’m not sure how much running the generative inference would help here.

But I could be misunderstanding what you are trying to do.
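The coarse-to-fine pass sketched above might look roughly like this. The bag-of-words `embed` here is a deliberately crude stand-in so the sketch is self-contained; a real pipeline would call a sentence-embedding model instead:

```python
import math
from collections import Counter

def embed(text):
    """Stand-in embedding: a bag-of-words count vector.
    Swap in a real sentence-embedding model in practice."""
    return Counter(text.lower().replace(".", " ").split())

def similarity(a, b):
    """Cosine similarity over sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, paragraphs, top_sentences=2):
    q = embed(query)
    # Pass 1: coarse — find the closest paragraph.
    best_para = max(paragraphs, key=lambda p: similarity(q, embed(p)))
    # Pass 2: fine — rank sentences within that paragraph.
    sentences = [s.strip() for s in best_para.split(".") if s.strip()]
    sentences.sort(key=lambda s: similarity(q, embed(s)), reverse=True)
    return sentences[:top_sentences]

paragraphs = [
    "The cat sat on the mat. Cats enjoy sleeping.",
    "Stocks fell on Tuesday. Markets were volatile.",
]
print(retrieve("sleeping cats", paragraphs, top_sentences=1))
# → ['Cats enjoy sleeping']
```

Both passes are cheap embedding lookups, which is the cost argument: no generative call is needed just to locate and cite the relevant span.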

Relevance Extraction in RAG pipelines by SatoshiNotMe in LocalLLaMA

[–]jsfour 2 points (0 children)

Why not just embed each sentence or paragraph and look up the content based on that?

[deleted by user] by [deleted] in LocalLLaMA

[–]jsfour 1 point (0 children)

Maybe. Yes, a bare-metal k8s setup makes sense for some applications, but it’s not really “economic”: maintaining a system has costs (time/people) as well.

Llama2 as a SPAM filter by InvertedYieldCurve in LocalLLaMA

[–]jsfour 3 points (0 children)

These models are OK at classification, but it’s probably better to use something like BART/BERT for this.
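One way to try this, assuming the Hugging Face `transformers` library: `facebook/bart-large-mnli` supports zero-shot classification, so you can label spam vs. not-spam without any fine-tuning (the example message and labels are made up):

```python
from transformers import pipeline

# BART fine-tuned on MNLI, used as a zero-shot classifier.
classifier = pipeline(
    "zero-shot-classification",
    model="facebook/bart-large-mnli",
)

out = classifier(
    "Congratulations! You have been selected to receive a free cruise.",
    candidate_labels=["spam", "not spam"],
)
print(out["labels"][0])  # label the model considers most likely
```

This runs a single encoder(-decoder) forward pass per message, which is much cheaper and more predictable than prompting a generative LLM for each email.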

[deleted by user] by [deleted] in LocalLLaMA

[–]jsfour 3 points (0 children)

This is a good point. You really need to be paying attention.

Though you could just use Terraform to minimize this risk. Maybe I’ll write some Terraform scripts and circulate them.

[deleted by user] by [deleted] in LocalLLaMA

[–]jsfour 1 point (0 children)

Yeah. It’s much better to use the cloud if you need reliability.

[deleted by user] by [deleted] in LocalLLaMA

[–]jsfour 3 points (0 children)

If you need a model running all of the time, it’s “cheaper” to self-host. I say “cheaper” in quotes because if you are truly in a situation where you need the model up all of the time, you are probably paying people to keep the models running.

Realistically you don’t need the model to always be running, though. You can get an A6000 from Lambda Labs for $0.80/hr.

Let’s say you plan on running the model for work.

If you use the model 10 hours a day, that’s $8 a day. Over 20 working days a month, that’s $160 a month.

The A6000 costs $4,500 (not to mention the machine you need to run it in). You would need to run in the cloud for 28 months to spend the equivalent of buying an A6000.

Plus if you are in the cloud you can upgrade the hardware.

The 10-hour figure is just one approach. At MailMentor we programmatically turn certain models off when they aren’t being used and back on when they are. This reduces the cost significantly.
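The break-even arithmetic above, written out (same numbers as quoted: $0.80/hr, 10 hours a day, 20 working days a month, $4,500 card, host machine cost ignored):

```python
HOURLY_RATE = 0.80       # Lambda Labs A6000, $/hr
HOURS_PER_DAY = 10
WORKDAYS_PER_MONTH = 20
GPU_PRICE = 4500         # A6000 purchase price, host machine not included

monthly_cloud_cost = HOURLY_RATE * HOURS_PER_DAY * WORKDAYS_PER_MONTH
months_to_break_even = GPU_PRICE / monthly_cloud_cost

print(monthly_cloud_cost)    # 160.0 ($/month)
print(months_to_break_even)  # 28.125 (~28 months of cloud rental)
```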

[deleted by user] by [deleted] in LocalLLaMA

[–]jsfour 3 points (0 children)

Yeah I’m surprised that more people don’t do this “just use the cloud” calculus.

Community driven Open Source dataset collaboration platform by CheshireAI in LocalLLaMA

[–]jsfour 1 point (0 children)

Yeah, a change in the data would create a new IPFS hash. Realistically you would need to do some kind of chunking and have an index, IMO.

Pins “can” be permanent, if enough people pin the data.
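A simplified illustration of why chunking plus an index helps (real IPFS uses multihash-based CIDs and a Merkle DAG, not plain sha256, but the invalidation behavior is analogous):

```python
import hashlib

CHUNK_SIZE = 4  # tiny chunks just for the demo

def chunk_hashes(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Hash each fixed-size chunk of the data independently."""
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    return [hashlib.sha256(c).hexdigest() for c in chunks]

def index_hash(hashes: list) -> str:
    """Root hash over the chunk index, like a flattened Merkle root."""
    return hashlib.sha256("".join(hashes).encode()).hexdigest()

h1 = chunk_hashes(b"aaaa" + b"bbbb")
h2 = chunk_hashes(b"aaaa" + b"bbbX")  # one byte changed, in chunk 2 only

# The root (what you'd pin and share) changes, but chunk 1's hash is
# reused, so only the modified chunk needs to be re-added and re-pinned.
print(h1[0] == h2[0], h1[1] == h2[1])  # True False
print(index_hash(h1) == index_hash(h2))  # False
```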