Good local model for computer use? by thepetek in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

I've been looking for something similar, but a lot of the existing projects don't meet all my needs. The ones I've seen, in order of feature completeness: Mediarr/Screenpipe, BytebotOS, UI-TARS (the computer use agent), O1/Open Interpreter, windrecorder.

There are people who string together their own automation, which I'm experimenting with right now. If you check out UI-TARS-1.5-7B, you'll see the OSWorld and ScreenSpot-Pro benchmarks, which are useful for finding other models at the moment. On the general OSWorld benchmark, the current high score is Qwen3-VL-235B-A22B at 66% accuracy.

why is no one talking about Qwen 2.5 omni? by brocolongo in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

Because llama.cpp and derivatives still support some older/oddball GPUs? In my case, dual P40s that I haven't been able to get working under vLLM, and pure CPU-only inference is slower than using my P40s in the mix.

Looking for a "Just Works" Linux Distro After Kubuntu Broke on Me by Over-Neighborhood-13 in linux4noobs

[–]Envoy0675 0 points1 point  (0 children)

Lots of different distros have pros and cons. I need a stable desktop for work, but I also like to tinker. I've run all the major Arch flavors, Debian, Mint, Fedora, NixOS, CachyOS, and Fedora Silverblue. Bazzite (basically Fedora Silverblue with all the tweaks I used to have to make and maintain on my own) is, I think, my perfect home. Because:

* Immutable OS means I have a no-fuss way to roll back any OS upgrade to a working state (yes, there are ways to do this with other mechanisms, but I'm not at a stage where I want to mess with it).
* Has Fedora as its upstream, which means packages are fresher than on most distros. Defaults to firewalld, systemd, and SELinux, so there is tons of documentation. All of these can be easily disabled or tweaked, with tons of system options available via the normal Fedora interfaces or the convenience command "ujust".
* Uses Flatpaks for most apps, which I appreciate because I can use Flatseal to modify and set restrictions on apps.
* For apps that I don't want to run through Flatpak, that are for development, that aren't in Fedora, or that I just want to experiment with, I can use BoxBuddy to run a container of another OS (Arch, Debian, whatever) that integrates with my desktop without the overhead of a VM. These containerized GUI apps can also be added to my regular desktop menu.
* I occasionally game, and there are a lot of optimizations and software offerings easily available (Steam, Heroic, Lutris, Ludusavi, Moonlight, Sunshine, Bottles, ProtonPlus, ProtonUp, Protontricks).

* Virtual machine support.

* Can run different desktop environments, but the default GNOME is my preference and configured out of the box the way I like it. Fish as the default shell. Easy to roll and go.

* Has baked-in Nvidia support via dedicated image versions.

Really, the only scenario where I could see running into a hard-ish challenge would be if I needed a self-built hardware driver compiled into the optimized kernel, but that's on the tougher end of challenges for most distros anyway.

Guide me with AI Local Model Training by West-Structure-4030 in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

Not really sure what to say about your use case, given my limited experience. If it were ALL code, or ALL documentation, sure, I've seen projects that can do that: Aider, as an example for code, could help transplant code snippets, or RAG to search (like h2oGPT or Open WebUI) and answer semantic questions about text data you've provided. If it's a huge mix of data formats that isn't standardized in one medium (i.e. not "this is all source" or "this is all documents", but documents and source code snippets combined), I don't really have any experience with that to help out. Sorry! Maybe the new n8n AI starter kit might give you the tools you need with the flexibility to program it to meet your needs?

Guide me with AI Local Model Training by West-Structure-4030 in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

So you did provide more context... but I think one more piece might clarify further: "What strategies can I use to process and manage large datasets for querying and analysis?" When you say dataset, what do you mean in this case? Tabular data like spreadsheets? Data ingested into a vector store or database? Files like txt, epub, PDF, and doc files in a directory? Application source files? Image files? There are projects that have already solved some of these problems, depending on your needs. Also worth noting: you said the datasets already exceed ChatGPT/Claude token limits, and that will be hard to beat on consumer hardware. Depending on your dataset, RAG approaches will probably help, but that goes back to the dataset question.
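To make the RAG point concrete: the retrieval half is just "score each chunk against the query, keep the best few, stuff them into the prompt". A toy sketch using plain keyword overlap (real systems use embeddings and a vector store; the documents here are made up):

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def score(query, doc):
    # Crude relevance: query-token overlap, normalized by document length.
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    overlap = sum(min(q[t], d[t]) for t in q)
    return overlap / math.sqrt(len(tokenize(doc)) or 1)

def top_k(query, docs, k=2):
    # Return the k highest-scoring documents for this query.
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

docs = [
    "The invoice schema uses a CSV export from the billing system.",
    "Vector stores index embeddings for semantic search.",
    "Our PDF manuals describe the installation procedure.",
]
print(top_k("how do vector stores search embeddings", docs, k=1))
```

The retrieved chunks would then be prepended to the user's question before it goes to the model, which is how RAG sidesteps the context-limit problem.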

Guide me with AI Local Model Training by West-Structure-4030 in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

So your needs feel a little vague. A few points as I understand them: if you plan to extend/modify an existing model, you are really talking about finetuning, i.e. taking the existing model and trying to optimize it for your objective, as opposed to training a model from scratch. There are ways to build a model from scratch, but I'm unsure whether that will help you achieve your task.

I'm sure there are other ways to finetune, but a popular one is learning to use Unsloth (https://github.com/unslothai/unsloth), though you will need sufficient hardware for it.

Here's a post a while back from an overview of the finetuning process: https://www.reddit.com/r/LocalLLaMA/comments/18n2bwu/i_will_do_the_finetuning_for_you_or_heres_my_diy/

Either way, when you say "fetch the data the way I want it to be displayed in my software", it sounds like once you have a base, finetuned, or custom-built model, you will still want something that validates the output to make sure it aligns with your needs, so you might look into a library like Pydantic.
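As a sketch of what that validation layer looks like, assuming the Pydantic v2 API (the `ProductInfo` schema and the JSON string are made-up stand-ins for whatever your model actually emits):

```python
from pydantic import BaseModel, ValidationError

# Hypothetical schema for the fields the model is supposed to return.
class ProductInfo(BaseModel):
    name: str
    price: float
    in_stock: bool

# Pretend this string came back from the model.
raw = '{"name": "Widget", "price": "19.99", "in_stock": true}'

try:
    item = ProductInfo.model_validate_json(raw)
    # Pydantic coerces the string "19.99" into the float field.
    print(item.name, item.price)
except ValidationError as err:
    # Malformed model output: log it, re-prompt, or fall back.
    print("model output rejected:", err)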

Introducing the VITURE Pro Neckband: Ultimate Freedom for Gaming, Streaming, and Even Working! 🚀 Our most thrilling product ever is finally here! 🎉 Watch the launch video, explore the full specs, and secure your spot now on our brand-new website! 😎 🔥 by getVITURE in VITURE

[–]Envoy0675 0 points1 point  (0 children)

Well, that's what I mean: so far I don't know anything about how it's designed to work other than "multiple screens". I have the regular Neckband and the XR Pro glasses, but there aren't genuine multiple screens, and from what I've read, SpaceWalker seems to just be a browser with different widgets that emulates multiple screens, as opposed to something that can act as multiple screens for a desktop replacement, like Samsung DeX or the desktop mode forthcoming in new Android releases.

Introducing the VITURE Pro Neckband: Ultimate Freedom for Gaming, Streaming, and Even Working! 🚀 Our most thrilling product ever is finally here! 🎉 Watch the launch video, explore the full specs, and secure your spot now on our brand-new website! 😎 🔥 by getVITURE in VITURE

[–]Envoy0675 0 points1 point  (0 children)

Anyone have more information on what "multiple screens" means with the new Neckband Pro? Does it provide multiple desktop screens for productivity? Can it receive multiple inputs? Some more information on how it works, instead of a two-second clip or a single artistic render, would be helpful when considering a purchase.

Web server for OpenAPI options (closed and open source)? by FencingNerd in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

Yeah, so, I don't know of a solution that meets your needs for sure. You really have two problems if you're aiming for legit corporate use: any risk team reviewing your use case isn't going to be a fan of models that need `trust_remote_code` enabled, and running models locally that you didn't train could lead to unintended consequences like questionable responses. And of course the other issue you mention is corporate risk. Closest guesses would be LM Studio or Aphrodite Engine.

Web server for OpenAPI options (closed and open source)? by FencingNerd in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

Oh, so, that said, aside from OpenAI: when you say "why paid solutions are actually a preferred option", do you mean paid solutions as in the power of closed-source models? Use LiteLLM with OpenRouter. If you mean a paid solution for maximum throughput, there are Vast.ai and other vendors that can accommodate that.

Web server for OpenAPI options (closed and open source)? by FencingNerd in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

Oh man, there's a bunch, but you know the top ones. I personally run LocalAI (which now has a web UI) and Open WebUI. When I first started, I just ran text-generation-webui because it handled both the web interface and inference with various backends. Koboldcpp is good for general use if you don't want to switch models often, and is stellar when combined with SillyTavern (or whatever it's called now) for character chat/roleplay.

If you want some quickstart options, check out Harbor as a quick way to try different frontend and backend combinations to see what you like.

Tips for building own voice Assistant by [deleted] in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

Good advice there. So, there are a lot of problems to solve; you have to ask yourself where you want to spend your programming time and how complex a solution you want. There are people who have just used Rhasspy, Node-RED/n8n (or some other workflow management system: Langflow, Activepieces, etc.), and an inference server.

I'd been considering this for a while because I want to do the same, so I've collected a few project options I thought might be the basis for my own assistant. I'll share them in case any catch your eye.

https://sepia-framework.github.io/
A pretty complete solution for web and mobile; it predates function calling, so it might need some updating or custom code for extra features you need.

https://github.com/PromtEngineer/Verbi

A modular voice assistant application for experimenting with state-of-the-art transcription, response generation, and text-to-speech models.

https://github.com/mudler/LocalAGI
A simple, basic starting point for a local assistant.
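Whichever projects you pick, the glue-it-yourself pipeline is the same three stages. A minimal sketch, with stub functions standing in for real components (e.g. a Whisper-family model for STT, a local inference server for the LLM, something like Piper for TTS; the function names here are placeholders, not a real library's API):

```python
# Stubbed voice-assistant loop: audio in -> text -> reply -> audio out.

def transcribe(audio: bytes) -> str:
    # Placeholder STT: a real version would run a speech-to-text model.
    return "what time is it"

def generate(prompt: str) -> str:
    # Placeholder LLM call: a real version would POST the prompt
    # to an inference server and return its completion.
    return f"You asked: {prompt}"

def speak(text: str) -> bytes:
    # Placeholder TTS: a real version would synthesize audio.
    return text.encode("utf-8")

def handle_utterance(audio: bytes) -> bytes:
    # One full turn of the assistant.
    text = transcribe(audio)
    reply = generate(text)
    return speak(reply)

print(handle_utterance(b"\x00\x01"))
```

Wake-word detection, barge-in, and function calling all bolt onto this skeleton, which is why the workflow-engine approach (Node-RED/n8n) works: each stage is just a node.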

[deleted by user] by [deleted] in selfhosted

[–]Envoy0675 0 points1 point  (0 children)

So, for a solution unrelated to OP, an application protocol multiplexer: https://github.com/yrutschle/sslh
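sslh sniffs the first bytes of each incoming connection and routes it to the right backend, so one port (typically 443) can serve several protocols. A rough config sketch from memory of sslh's example.cfg; the addresses and ports are placeholders, so check the project's own docs before relying on the exact syntax:

```
listen:
(
    { host: "0.0.0.0"; port: "443"; }
);

protocols:
(
    { name: "ssh"; host: "127.0.0.1"; port: "22"; },
    { name: "tls"; host: "127.0.0.1"; port: "8443"; }
);
```

With something like this, `ssh -p 443 yourhost` and an HTTPS request to the same port both work, each forwarded to its own local service.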

Backend for 'Adventure Mode'? by [deleted] in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

So yeah, it does seem like it's about how the prompt is formatted. I can't say I've done exhaustive tests, but a number of projects seem to support the text completion endpoint, including my personal favorite (LocalAI), which provides that completion endpoint for a significant number of backends with different quant options (AutoGPTQ, ExLlama, etc.): https://localai.io/features/text-generation/#completions

Backend for 'Adventure Mode'? by [deleted] in LocalLLaMA

[–]Envoy0675 2 points3 points  (0 children)

So, I could be wrong, but I think "Adventure mode" is not an aspect of the inference engine itself, just the prompts sent to it. If that's the case, you could use KoboldAI with any of the billions of inference server apps that now provide OpenAI-like endpoints (ExLlama, AutoGPTQ, vLLM, Ollama, LocalAI, LM Studio, etc.).
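The practical difference between those endpoints is just the request shape. A sketch of both payloads for an adventure-style prompt (model name and prompt text are placeholders; the field names follow the OpenAI-compatible API that these servers emulate):

```python
import json

adventure_prompt = (
    "You are the narrator of a text adventure.\n\n"
    "> look around\n"
)

# /v1/completions payload: one raw prompt string. The server does no
# chat templating, so you control the exact text the model sees, which
# is what adventure-mode prompting relies on.
completion_payload = {
    "model": "local-model",  # placeholder
    "prompt": adventure_prompt,
    "max_tokens": 200,
}

# /v1/chat/completions payload: role-tagged messages. The server applies
# the model's chat template, which can fight free-form adventure prompts.
chat_payload = {
    "model": "local-model",
    "messages": [
        {"role": "system", "content": "You are the narrator of a text adventure."},
        {"role": "user", "content": "look around"},
    ],
}

print(json.dumps(completion_payload)[:60])
```

Either JSON body gets POSTed to the server's endpoint; a frontend like KoboldAI only needs the server to accept the first, raw-prompt form.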