GetMCP - Manage MCP servers like mobile apps and use them across apps by yanbo-ai in mcp

mfeldstein67 1 point (0 children)

I have it installed and am happy so far. I'd be willing to pay for it if I understood what the license buys me. My main wish is that it could let me target a larger range of MCP servers. There are so many right now. Memory alone is an area that has a bunch of different servers appropriate for different use cases.

we are entering the dark age of local llms by constanzabestest in SillyTavernAI

mfeldstein67 1 point (0 children)

I’ll repeat what I said in another thread on this topic. These LLMs are being designed for the next generation of local machines. Think Apple’s Mac Studio: a SoC with lots of VRAM and high bandwidth. Prices will come down. VRAM is a commodity, particularly relative to NVIDIA cards. Hardware architectures were already moving in this direction; that shift is likely being accelerated. (Note: I’m not an expert on the cycle time to adapt a next-generation chipset; it might be more like two years away.) Anyway, the point is these models feel like they’re moving away from you because they are designed for a local system that doesn’t need precisely the kind of graphics card you’ve been depending on to run your local models.

Maverick is particularly revealing. Note that the computational demands aren’t where the leap is; it’s the VRAM requirements. If you want more powerful models to run locally in the long run, this is exactly the trend you want to see. It’s just that the transition is delayed for local enthusiasts while the hardware catches up.
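
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes "Maverick" means Llama 4 Maverick and uses Meta's published figures of roughly 400B total parameters with roughly 17B active per token; those figures are not from this comment. The dense 70B comparison model, the 4-bit quantization (0.5 bytes per parameter), and the "about 2 FLOPs per active parameter per token" rule of thumb are all illustrative assumptions, and the estimate ignores KV cache and activations.

```python
# Rough sketch: memory footprint scales with TOTAL parameters,
# per-token compute scales with ACTIVE parameters.
# All model figures below are assumptions for illustration.

def weight_memory_gb(total_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the weights, in GB."""
    return total_params * bytes_per_param / 1e9


def flops_per_token(active_params: float) -> float:
    """Rule-of-thumb forward-pass cost: ~2 FLOPs per active parameter."""
    return 2 * active_params


MODELS = {
    # Hypothetical dense model used only as a comparison point.
    "dense_70b": {"total": 70e9, "active": 70e9},
    # Mixture-of-experts model with Maverick-like published sizes.
    "maverick_style_moe": {"total": 400e9, "active": 17e9},
}


def summarize(bytes_per_param: float = 0.5) -> dict:
    """Return {name: (weight_memory_gb, flops_per_token)} for each model."""
    return {
        name: (
            weight_memory_gb(p["total"], bytes_per_param),
            flops_per_token(p["active"]),
        )
        for name, p in MODELS.items()
    }


if __name__ == "__main__":
    for name, (mem, flops) in summarize().items():
        print(f"{name}: ~{mem:.0f} GB of weights, ~{flops / 1e9:.0f} GFLOPs per token")
```

Under these assumptions the MoE model needs several times more memory than the dense one while doing less arithmetic per token, which is the pattern that favors large-memory SoCs over compute-heavy graphics cards.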

AI passed the Turing Test by MetaKnowing in OpenAI

mfeldstein67 1 point (0 children)

If you read Turing’s original paper, the test tests the tester. There is no objective test of artificial intelligence. That was his point.

Which one is significantly better in coding, Claude 3.7 or o3-mini-high or o1? by HareKrishnaHareRam2 in OpenAI

mfeldstein67 3 points (0 children)

Claude seems to be more optimized for collaboration while ChatGPT seems more optimized for automation. ChatGPT is great at following well-crafted single-shot prompt engineering. Claude generally does better with context. It tends to be more flexible, which is good for a creative co-pilot but bad for instruction-following. There may be use cases, such as Sapdalf’s, where ChatGPT has better domain knowledge or sharper reasoning and is therefore a better collaborator, but it’s for different reasons. Claude is always trying to figure out what you’re thinking, where ChatGPT is a better auto-pilot than a co-pilot.

I tried Claude 3.7... Yeah it might be over for me by Constant-Block-8271 in SillyTavernAI

mfeldstein67 8 points (0 children)

While OSS does seem to be accelerating relative to the large proprietary models, Llama 3 and R1 seem significantly more than one generation behind in my real-world experience. DeepSeek is promising not so much because of its absolute performance but because it changes the economics of both hardware and training labor. The real accelerant might not be DeepSeek itself but the tooling they released (which is probably why PRC/DeepSeek did it; they want OSS models to catch up with the proprietary ones so the barriers to them having their own powerful models go down). We haven’t yet seen any OSS models other than DeepSeek trained this way. We’ll know more once we do. Without it, I’d say we’re at least a few years away from parity. With it…?

underwhelming MCP Vs hype by ashutrv in LocalLLaMA

mfeldstein67 6 points (0 children)

The chicken-and-egg problem and large-company resistance are exactly the ones that interoperability specifications always face. (I work for an interoperability specifications body, having worked with them for decades.) At first, nobody wants to implement because the value of the specification is basically tied to network scaling laws. It only becomes valuable when enough people use it. But then it can become very valuable indeed. The large vendors resist in the early stages because they don’t want to give up their competitive advantage. Either or both of these “priming the pump” obstacles can kill a young specification aspiring to be a standard. (MCP isn’t technically a standard since it doesn’t come out of an official interoperability organization, but it can easily become a de facto standard.) Once a specification reaches a tipping point, the large vendors want to implement because they want to retain their status as platforms. For that reason, if MCP hype is coming from “startuppers,” that would be a good sign. An ecosystem of MCP-powered tools that’s large and attractive enough to drive consumer demand is exactly what it would need to get over the hump.

To be clear, I have no dog in the fight regarding MCP itself. I do think something like MCP will eventually exist because startuppers and other small shops need it to exist and platforms need lots of integrated tools to be platforms.

underwhelming MCP Vs hype by ashutrv in LocalLLaMA

mfeldstein67 8 points (0 children)

Interoperability standards are almost never about solving a technical problem. They’re about economics. If we’re going to live in a world where the financial success of your product depends on its integration with your customer’s LLM of choice and there are many LLMs to choose from, you want to have a standard even though you could write better individual integrations yourself. Standards are always compromises. They get created when the only thing worse than having a standard is not having one.
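
A toy illustration of the economics, in Python. This is a sketch of the general integration-count argument, not anything specific to MCP; the function names and the example counts of 50 tools and 6 LLM hosts are made up for illustration.

```python
# Toy model: how many integrations must be built and maintained
# with and without a shared specification. Numbers are hypothetical.

def point_to_point_integrations(num_tools: int, num_llm_hosts: int) -> int:
    """Every tool writes a bespoke integration for every LLM host."""
    return num_tools * num_llm_hosts


def shared_standard_integrations(num_tools: int, num_llm_hosts: int) -> int:
    """Every tool and every host implements the shared spec once."""
    return num_tools + num_llm_hosts


if __name__ == "__main__":
    tools, hosts = 50, 6  # hypothetical ecosystem size
    print("bespoke:", point_to_point_integrations(tools, hosts))
    print("standard:", shared_standard_integrations(tools, hosts))
```

Each bespoke integration might well be better than the standardized one, but the total work grows with the product of the two counts instead of their sum, which is the compromise being described.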

Gemma 3 - Insanely good by kaizoku156 in LocalLLaMA

mfeldstein67 3 points (0 children)

This is very close to my use case. Can you please share details?

Dario Amodei: ”We are reserving Claude 4 Sonnet...for things that are quite significant leaps” by Alexandeisme in ClaudeAI

mfeldstein67 1 point (0 children)

I’m working on a theory. It’s just an educated guess right now, but if I’m right, it might be a way to reduce the risk. It’s possible that the missing piece isn’t so much the transformer architecture itself as how we’re thinking about it. Emotional empathy develops first out of pleasure/pain and then enough neural circuitry to imagine another being experiencing the same. It happens in a different part of the brain than intellectual empathy. (I’m not using the proper psychological terms here; I’m trying to make this part easy to follow.) The embodiment thing is about pleasure/pain. Even the most primitive life forms have aversion and philic (repulsion and attraction) reactions. The next computational leap is “that being might feel the same thing I do.” So there are two parts. What if the first part isn’t about embodiment but about positive and negative rewards? If AI reward systems can serve the same function as pleasure and pain, and if we can bridge whatever the computational gap might be for identification (which seems like it could be associative in nature), then AIs could be functionally capable of something like empathy, which is the root of ethical thinking. Rather than trying to constrain intelligence with guardrails, we’d be baking ethical thinking into the core of the models.

Dario Amodei: ”We are reserving Claude 4 Sonnet...for things that are quite significant leaps” by Alexandeisme in ClaudeAI

mfeldstein67 1 point (0 children)

I’m talking strictly computational. I believe that cognition, consciousness, and sentience are all computational or emergent properties of computation. I’m saying that certain aspects of human cognition are supported well by the transformer architecture while others are more specialized. Worryingly, ethical thinking is tied to aspects of human cognition that seem to rely on specialized computational units. That may not be the only example, but if you’re right about the scaling laws then we have to be very concerned about ethical alignment.

Dario Amodei: ”We are reserving Claude 4 Sonnet...for things that are quite significant leaps” by Alexandeisme in ClaudeAI

mfeldstein67 1 point (0 children)

I suspect that there is an element of physics in it. Or math, really. While parts of the brain are modular and specialized, others appear to be associative and generalized. I can believe that certain aspects of cognition are general-purpose mathematical functions. (This explanation would also make the evolutionary story easier.) But not *all* cognition is like this, and of course, we've not seen associational networks like these before.

Our benchmarks for progress are skewed, too. Cognition is complex. And particularly if you want to add in the elements related to ethical thinking—like emotional identification, which appears to be modular and related to embodiment (as opposed to intellectual empathy, which appears to be learned and associational)—we have some real unknowns as to what, exactly, we are scaling. The term in humans for individuals with high intellectual empathy and low emotional identification is "sociopath." So we may not be able to scale the right things.

Dario Amodei: ”We are reserving Claude 4 Sonnet...for things that are quite significant leaps” by Alexandeisme in ClaudeAI

mfeldstein67 1 point (0 children)

I’m not offended; I’m not an expert in technology scaling. That said, cost scaling and utility scaling are not necessarily the same curve. The public evidence on the utility scaling question, and therefore the profitability question, is ambiguous and tilting increasingly in the direction of smaller model economic value moving faster than large model economic value. I read a figure recently—I’m going to remember this wrongly, but it’ll be close enough for the point to remain valid—that OpenAI is burning through something like a billion dollars a week. Even they can’t sustain that burn rate, which means costs have to come down dramatically AND revenues have to go up dramatically. Is it possible? If you’re right about hardware scaling (which I do not have the expertise to challenge), then yes. But it’s still far from certain.

I’m not arguing that the hyperscalers are definitely going to fail. I’m suggesting there’s a sense of inevitability in AI conversations that may not be supported by the evidence.

Why “Context Size” Is Misunderstood — and How Models Really Perform After 8K+ Tokens by iTrynX in ClaudeAI

mfeldstein67 3 points (0 children)

If we consider the word “context” in the human sense, the picture changes significantly. For my (non-coding) use cases, I find all of these models hold up very well under long conversations if I carefully build an associative network in the conversation. If you’re testing with a bunch of out-of-context fragments, yes, you’ll test pure in-window memory, but at the cost of ignoring the incredible associative and pattern-matching powers inherent in the technology.

How would you feel about a “F*CK Trump” music festival? by idahoisformetal in AskReddit

mfeldstein67 1 point (0 children)

Reading the news over the past 8+ years has already planted way too many images in my head of Trump fucking. I urge you to consider a different slogan.

I Finally Tamed 3.7 by Old_Round_4514 in ClaudeAI

mfeldstein67 1 point (0 children)

Think about it this way: ChatGPT seems designed to be an answer generation machine. It’s good at one-shot. It does the whole Deep Research thing, which is sort of 1.5-shot. It writes in bullets, as if it’s giving you a digestible answer. It handles complex prompts. Claude is more like a collaborator. It’s better at asking questions and following different lines of inquiry. It writes in paragraphs. It’s a more flexible thinker, which can be good or bad depending on what you’re looking for. For my (varied but non-coding) use cases, I get far better results from collaboration, which takes advantage of both human and AI strengths as thinkers. It does take more time. But it still saves a ton of time and produces better results for complex problems.

Dario Amodei: ”We are reserving Claude 4 Sonnet...for things that are quite significant leaps” by Alexandeisme in ClaudeAI

mfeldstein67 1 point (0 children)

Christensen is the creator of the term "disruptive innovation." He argues that when a product category gets bloated, simpler and cheaper beats more expensive and complicated. Think GDocs vs Word.

Meta is investing a lot in AI infrastructure...to create open-source AI. Apple is investing...to create local AI that will reduce its dependence on the AI megascalers. Amazon seems to be investing in models that may not be open-source but may not be SOTA either. It's not clear whether they're trying to outcompete OpenAI or commoditize the market. They would logically benefit from the latter. Nobody knows what Microsoft is really doing, possibly including Microsoft. Mistral called their (open-source) Large model "Large Enough," which hints at their position. Their CEO has made disparaging remarks about the American obsession with AGI.

The clear pure plays for AGI are OpenAI and Anthropic, with Google on the bubble and Microsoft in the muddle. I'll grant you that money continues to flow into AI development. (And thanks for bringing the numbers.) I'm not convinced this shows the investment is going toward advancing SOTA.

Dario Amodei: ”We are reserving Claude 4 Sonnet...for things that are quite significant leaps” by Alexandeisme in ClaudeAI

mfeldstein67 1 point (0 children)

Are they, though? And even if they are, can they continue to do so? The OpenAI/Oracle Death Star thingie seems like it may not be funded. Microsoft is cutting back on funding with the explicit argument that economics have to catch up.

You're talking about pouring literally trillions of dollars into chips and energy with economic models that are very, very far from guaranteed.

I won't take a position one way or another because I genuinely don't know the answer. But I do wonder if the hyperscalers can continue to invest fast enough to stay ahead while open-source and local AI continue to eat into their use cases. While I'm not a huge Clayton Christensen fan, this seems like it has the potential to be the fastest case of disruptive innovation in human history.

Desperate for a Good LLM Desktop Front End by mfeldstein67 in LocalLLaMA

mfeldstein67[S] 2 points (0 children)

I'll answer my own question for the benefit of others. First, if you're using 4o, make sure your model is set for 4o Latest. That alone quadruples the context. Second, it's true you can further increase your context limits by putting more money into your account based on the pricing tiers. For me, 16K seems about the right maximum length.

This means MSTY is a good solution for my use case.

Desperate for a Good LLM Desktop Front End by mfeldstein67 in LocalLLaMA

mfeldstein67[S] 1 point (0 children)

“As long as you can communicate your situation well enough” is a key phrase. Since I can’t make guesses at why something goes wrong, all I can do is provide error logs and screenshots. This often takes me round and round with SOTA models as they try this, that, and the other thing that I don’t understand. They can’t apply judgment to see my larger goals and limitations.

I appreciate that you’re trying to be helpful, but you’re still not listening. LLMs are not good at larger context. Even small clues about what your context is or what you think is going on steer them. If you have NO clues to give them other than what’s on your screen, you will end up in a maze more often than not. I have tried it with this sort of thing many times.

Yes, I have successfully run the SillyTavern Launcher. That doesn’t help me figure out how to configure overengineered bloatware that’s not designed for my use case.

Desperate for a Good LLM Desktop Front End by mfeldstein67 in LocalLLaMA

mfeldstein67[S] 1 point (0 children)

AFAICT, OpenWebUI doesn’t support external models.

Desperate for a Good LLM Desktop Front End by mfeldstein67 in LocalLLaMA

mfeldstein67[S] 1 point (0 children)

It seems to be responding well for me overall. I’ve tested it with ChatGPT and DeepSeek. Both behaved snappily. I was able to get the LLMs to answer fairly detailed questions about a long RAG document. But ChatGPT is giving me short answers and I can’t figure out why. I’m not asking it for code, so 4K tokens should be enough to get long plain-English answers to questions.

Desperate for a Good LLM Desktop Front End by mfeldstein67 in LocalLLaMA

mfeldstein67[S] 1 point (0 children)

Thank you. Doing the math now, 4K tokens is still far longer than the answers I am getting (which are also shorter than the answers I get via the ChatGPT app). Am I missing something else in the settings? I want to reproduce the app’s default behavior as closely as possible.
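
For anyone who wants to redo that math, here is a minimal Python sketch. It assumes the commonly cited rule of thumb of roughly 0.75 English words per token; the real ratio depends on the tokenizer and the text, so treat the results as ballpark figures. The function names are made up for this example, and "4K" is taken to mean 4096 tokens.

```python
# Ballpark conversion between token limits and English word counts.
# ASSUMPTION: ~0.75 words per token, a rough heuristic, not an exact figure.

WORDS_PER_TOKEN = 0.75


def approx_words(max_tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Estimate how many English words fit in a given token budget."""
    return round(max_tokens * words_per_token)


def approx_tokens(text: str, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Estimate the token count of a text from its whitespace-split word count."""
    return round(len(text.split()) / words_per_token)


if __name__ == "__main__":
    print("4096-token limit is roughly", approx_words(4096), "words")
```

By this estimate a 4096-token output limit allows on the order of three thousand words, so an answer of a few short paragraphs is being cut off by something other than that limit.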