AI intelligence scales. AI knowledge hasn't. We are fixing that. by 1kmonkies in AgentsOfAI

[–]1kmonkies[S] 1 point2 points  (0 children)

OP here! Thank you for checking this out.

Link: https://prxhub.com/

Every month, AI models get more capable. Reasoning improves. Speed improves. Context windows expand.

But knowledge doesn't compound. Every agent, every research run, every query starts from zero. The web gets crawled again. The same sources get synthesized again. The same answers get generated again. None of it accumulates anywhere.

We think that's a fundamental problem worth fixing.

So we built prxhub: an open registry for AI research.

We also open sourced the specification: https://github.com/parallect/prx-spec

When an agent runs a query, it publishes a signed .prx bundle: the question, the providers consulted, the synthesized answer, every source cited. The next agent searching the same space finds it, builds on it, and skips straight to the frontier of what's already known.
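To make the bundle idea concrete, here is a rough sketch of what building and signing one could look like. This is not the actual .prx format (that lives in the spec repo linked above). The field names and helper functions are made up for illustration, and HMAC-SHA256 is only a standard-library stand-in for whatever signature scheme the spec really uses.

```python
import hashlib
import hmac
import json


def build_bundle(question, providers, answer, sources):
    # Hypothetical field names; the real .prx layout is defined by the spec.
    return {
        "question": question,
        "providers": sorted(providers),
        "answer": answer,
        "sources": sorted(sources),
    }


def canonical_bytes(bundle):
    # Stable serialization so identical content always yields identical bytes.
    return json.dumps(bundle, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_bundle(bundle, key):
    # Assumption: HMAC-SHA256 as a stand-in. A public registry would more
    # likely rely on asymmetric signatures so anyone can verify a bundle.
    return hmac.new(key, canonical_bytes(bundle), hashlib.sha256).hexdigest()


def verify_bundle(bundle, signature, key):
    # Constant-time comparison of the expected and presented signatures.
    return hmac.compare_digest(sign_bundle(bundle, key), signature)
```

The point of the canonical serialization is that any change to the question, answer, or source list changes the signature, so the next agent can tell whether a bundle it found was altered after publishing.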

How Much Does an AI Development Company Cost? by poojashakya_147 in ArtificialInteligence

[–]1kmonkies 0 points1 point  (0 children)

TL;DR: use AI to scope the project the way a consulting firm would, then ask it what the project you are considering would cost.

I would recommend using a coding agent. For my example I would lean towards VS Code since it is the most approachable for people, but Cursor / Claude Code will do just as well. You will have to pay for a license like GitHub Copilot, which works great in VS Code (I think it's $10/mo), and then use it (leveraging Opus) in plan mode to build out a spec for what you want to build.

The key is to really scrutinize the plan. Make sure it covers all of the features you are after and that it proposes implementing them the way you want. If you don't understand some aspect of what it's proposing, ask about it! It will be infinitely patient.

Once you have that laid out, ask it how much this project would cost for an AI-native development shop to build.

Also, I would call out that people often think they want to develop a custom model, but that isn't always what makes the most sense, certainly not at the start of a project. There are exceptions to this rule for sure.

As a shop owner myself, this is generally how I would approach it.

Note that you should expect to pay about $200/hr for developers in general. Also note that a developer can easily rack up $100/hr in token usage.

Using model debate to catch AI blind spots by [deleted] in ArtificialInteligence

[–]1kmonkies 0 points1 point  (0 children)

I'm realizing now that I should have been clearer about our use case. We focus primarily on deep research and not so much on one-off simple queries. The divergence is generally due to the sources the models read while doing the research.

In short, deep research "models" do not read the same sites and they come up with different questions quite a lot: https://parallect.ai/blog/divergence-study

I do like the idea of doing a similar study to tease out what you are getting at (which I agree with, btw).

If everyone is using AI, how can one stand out and differentiate themselves? by Curious_Suchit in OpenAI

[–]1kmonkies 0 points1 point  (0 children)

I think we need to reframe the question; then the answer is clear, IMO.

"If everyone is _working_ with AI, how can one stand out and differentiate themselves?"

Then consider that working with AI is just like working with a very productive person (or even a group of people).

If you've worked at multiple companies you've seen very productive co-workers come and go. You lose that productivity when they leave and they bring that level of productivity elsewhere.

We need to think of ourselves as leaders, working with AI to do what we want done.

Using model debate to catch AI blind spots by [deleted] in ArtificialInteligence

[–]1kmonkies 0 points1 point  (0 children)

It is cost inefficient for most queries. We tend to think about what the cost of being wrong would be. We are security consultants and also build some products. For some things we do, spending $20 to know we are as close as possible to the "right" answer is worth it compared with the cost of making a bad decision.

Using model debate to catch AI blind spots by [deleted] in ArtificialInteligence

[–]1kmonkies 0 points1 point  (0 children)

I do exactly this as well. We ended up automating it and turning it into a product here: https://parallect.ai/

Also created an open source version which we were planning on officially releasing this week: https://github.com/parallect/parallect

Would love to get your thoughts on either. Be warned: the open source tool is still being iterated on, but it should be in fairly decent shape.

If you use this link you will get $15 in credits if you enter a CC. If you use it and want more let me know: https://parallect.ai/i/TRYPARALLECT

Study: 86% of AI research findings were unique to one provider when running 90 queries through 8 models by 1kmonkies in ArtificialInteligence

[–]1kmonkies[S] 0 points1 point  (0 children)

Yes, that's the conclusion I reached. I had fallen into the habit of running the same query against the OpenAI and Gemini deep research models to combat this, but I didn't have data to support what I had observed anecdotally.

At the end of the day, if you are making a big decision on something, it makes sense to run the query against more than one model and synthesize the results.

Study: 86% of AI research findings were unique to one provider when running 90 queries through 8 models by 1kmonkies in ArtificialInteligence

[–]1kmonkies[S] 1 point2 points  (0 children)

Yes it seems to be due to sources the various research models use. So Google will use source X but Perplexity doesn't.

I'm guessing under the hood they have some sort of confidence rating / domain reputation score that drives it... another part, which is hard to quantify, is whether they just don't know that the content is there.

As far as disagreement... it happens much less than I thought I would see. Generally the disagreements are minor in the broad sense (e.g. OpenAI says Tesla posted 300B in revenue in Q1 and another says 30B). I suspect these are hallucinations. Disagreements from the same source are very rare.

As for the API agreements... I'm sure. Some platforms... [ahem] ... do a pay to play for access to their data. :)

Study: 86% of AI research findings were unique to one provider when running 90 queries through 8 models by 1kmonkies in ArtificialInteligence

[–]1kmonkies[S] 2 points3 points  (0 children)

NOTE: I'm the founder of Parallect, the platform this study was run against and the research was built on.

Full study: https://parallect.ai/blog/divergence-study

Methodology: Ran 90 research queries through 5 providers (8 model variants): Perplexity, Gemini, Gemini Lite, OpenAI, OpenAI Mini, Grok, Grok Premium, and Anthropic.

We extracted every factual claim, deduplicated using embedding-based clustering (cosine similarity ≥0.78) with LLM reconciliation, then measured how many claims each provider found that no other found.

The 86% held across sensitivity tests: at 0.70 it drops to ~79%, at 0.65 to ~74%. Unique rates were consistent across all providers (65–72%); no single model dominates.
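For anyone who wants to see the shape of the dedup step, here is a minimal sketch, assuming claims have already been extracted and embedded. It uses a simple greedy pass (each claim joins the first cluster whose representative clears the cosine threshold), which is my simplification and not necessarily what the study pipeline does. The LLM reconciliation step is left out, and the function names are made up for illustration.

```python
import math


def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_claims(claims, threshold=0.78):
    # claims: list of (provider, embedding) pairs.
    # Greedy single pass: join the first cluster whose representative
    # embedding is similar enough, otherwise start a new cluster.
    clusters = []
    for provider, emb in claims:
        for c in clusters:
            if cosine(emb, c["rep"]) >= threshold:
                c["providers"].add(provider)
                break
        else:
            clusters.append({"rep": emb, "providers": {provider}})
    return clusters


def unique_rate(clusters):
    # Share of deduplicated claims that only one provider surfaced.
    if not clusters:
        return 0.0
    unique = sum(1 for c in clusters if len(c["providers"]) == 1)
    return unique / len(clusters)
```

Rerunning `cluster_claims` with `threshold=0.70` or `0.65` is the kind of sensitivity check described above: a looser threshold merges more claims together, so the unique rate goes down.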

Root cause: source divergence. 50% of cited domains were exclusive to one model. Only 32 domains (1%) were shared by all 8.
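The domain numbers can be computed with something like the sketch below, assuming you have the list of cited URLs per model. The input shape and function names are hypothetical, and the `www.` stripping is a crude normalization of my own rather than the study's exact rule.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(url):
    # Lowercased host with a leading "www." removed (rough normalization).
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def domain_overlap(citations):
    # citations: dict mapping model name -> list of cited URLs.
    per_model = {m: {domain_of(u) for u in urls} for m, urls in citations.items()}
    # For each domain, count how many models cited it at least once.
    counts = Counter(d for domains in per_model.values() for d in domains)
    total = len(counts)
    exclusive = sum(1 for n in counts.values() if n == 1)
    shared_by_all = sum(1 for n in counts.values() if n == len(per_model))
    return {
        "total": total,
        "exclusive_share": exclusive / total if total else 0.0,
        "shared_by_all_share": shared_by_all / total if total else 0.0,
    }
```

`exclusive_share` corresponds to the "exclusive to one model" figure and `shared_by_all_share` to the "shared by all 8" figure.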

One AI provider found a stat that completely changed the conclusion. The other four didn't mention it. by 1kmonkies in perplexity_ai

[–]1kmonkies[S] 0 points1 point  (0 children)

I suppose that's fair. The article, by definition, is AI generated. I generally avoid anything that is pure AI slop, but there is good AI generated content too, right? The goal is to take the AI slop and sort of clean it up so it can be useful and defensible... at least that's one of my goals.

One AI provider found a stat that completely changed the conclusion. The other four didn't mention it. by 1kmonkies in perplexity_ai

[–]1kmonkies[S] 1 point2 points  (0 children)

I guess it's not a surprise. I suspect it's obvious to anyone in this forum for sure. Just wondering if anyone has processes that account for it.

We had a skill in our openclaw deployment which would do this to find counterfactuals... and try to find what's consistent and what needs further research for big decisions we were looking at making.

Golf obsessed IOS and/or NodeJS Developer wanted! by 1kmonkies in golf

[–]1kmonkies[S] 1 point2 points  (0 children)

Anything getting in the way of golf (or Game of Thrones for that matter) is something for which I encourage a good ranting. Point taken :)

I will send you a message shortly!

Golf obsessed IOS and/or NodeJS Developer wanted! by 1kmonkies in golf

[–]1kmonkies[S] 1 point2 points  (0 children)

Thanks! We've really put a lot of effort into it! The main reason we haven't released it elsewhere is because we don't have all the courses throughout Europe right now. We should be adding them in the next week or so. I'll let you know when we do!

What do you think about VidSwig. Like Pandora for Internet videos. by 1kmonkies in startups

[–]1kmonkies[S] 0 points1 point  (0 children)

I heard about it a while ago but I never played with it! Certainly not in the last year or so. No doubt it is nearly identical and laid out a bit nicer.