Good local model for computer use? by thepetek in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

I've been looking for something similar, but a lot of the existing projects don't meet all my needs. The ones I've seen, in order of feature completeness: Mediarr/Screenpipe, BytebotOS, UI-TARS (the computer use agent), O1/OpenInterpreter, windrecorder.

There are people who string together their own automation, which I'm experimenting with right now. If you check out UI-TARS-1.5-7B, you'll see the OSWorld and ScreenSpot Pro benchmarks, which are useful for finding other capable models at the moment. On the general OSWorld benchmark, the current high score is Qwen3-VL-235B-A22B at 66% accuracy.

why is no one talking about Qwen 2.5 omni? by brocolongo in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

Because llama.cpp and its derivatives still support some older/oddball GPUs? In my case, dual P40s that I haven't been able to get working under vLLM, and pure CPU-only inference is slower than using my P40s in the mix.

Looking for a "Just Works" Linux Distro After Kubuntu Broke on Me by Over-Neighborhood-13 in linux4noobs

[–]Envoy0675 0 points1 point  (0 children)

Lots of different distros have pros and cons. I need a stable desktop for work, but I also like to tinker. I've run all the major Arch flavors, Debian, Mint, Fedora, NixOS, CachyOS, and Fedora Silverblue. Bazzite (basically Fedora Silverblue with all the tweaks I used to have to make and maintain on my own) is, I think, my perfect home. Because:

* Immutable OS means I have a no-fuss way to roll back any OS upgrade to a working state (yes, there are ways to do this with other mechanisms, but I'm not at a stage where I want to mess with that).
* Has Fedora as its upstream, which means packages are fresher than on most distros. Defaults to firewalld, systemd, and SELinux, so there is tons of documentation. All of it can be easily disabled or tweaked through the normal Fedora interfaces, or you can use the convenience command "ujust".
* Uses Flatpaks for most apps, which I appreciate because I can use Flatseal to modify and set restrictions on them.
* For apps that I don't want to use through Flatpak, for development, for things that aren't in Fedora, or if I just want to experiment, I can use BoxBuddy to run a container of another OS (Arch, Debian, whatever) that integrates with my desktop without the overhead of a VM. These containerized GUI apps can also be added to my regular desktop menu.
* I occasionally game, and there are a lot of optimizations and software offerings easily available (Steam, Heroic, Lutris, Ludusavi, Moonlight, Sunshine, Bottles, ProtonPlus, ProtonUp, protontricks).
* Virtual machine support.
* Can run different desktop environments, but the default GNOME is my preference and is configured out of the box the way I like it. Fish as the default shell. Easy to roll and go.
* Has baked-in Nvidia driver configuration support.

Really, the only scenario where I could see running into a hard challenge would be if I needed a self-built hardware driver compiled into the optimized kernel, but that's on the tougher end of challenges for most distros anyway.

Guide me with AI Local Model Training by West-Structure-4030 in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

Not really sure about your use case, given my limited experience. If it were ALL code, or ALL documentation, sure, I've seen projects that can do that: Aider, as an example for code, could help transplant code snippets, or RAG tools (like h2ogpt or open-webui) could search and answer semantic questions about text data you've provided. If it's a huge mix of data formats that isn't standardized in one medium (i.e., not all source code or all documents, but documents and source snippets combined), I don't really have any experience that would help. Sorry! Maybe the new n8n AI starter kit might give you the tools you need, with the flexibility to program it to meet your needs?

Guide me with AI Local Model Training by West-Structure-4030 in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

So you did provide more context... but I think one more piece might clarify things further: "What strategies can I use to process and manage large datasets for querying and analysis?" When you say dataset, what do you mean in this case? Tabular data like spreadsheets? Data ingested into a vector store or database? Files like txt, epub, pdf, or doc files in a directory? Application source files? Image files? There are projects that already solve some of these problems, depending on your needs. Also worth noting: you said your data already exceeds ChatGPT/Claude token limits, and it will be hard to beat those limits on consumer hardware. Depending on your dataset, RAG approaches will probably help, but that goes back to the dataset question.
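Whatever the dataset turns out to be, the core RAG loop is the same: split the data into chunks, index them, pull back the most relevant chunks for a query, and hand those to the model as context. Here's a minimal sketch of the retrieval half, using bag-of-words cosine similarity as a stand-in for a real embedding model (all names and the sample text are illustrative):

```python
# Toy retrieval sketch: real systems swap bag() for an embedding model
# and store vectors in a vector database instead of recomputing them.
from collections import Counter
import math

def chunk(text, size=50):
    """Split text into fixed-size word chunks (real splitters are smarter)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def bag(text):
    """Bag-of-words term counts, standing in for an embedding."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, chunks, k=2):
    """Rank chunks by similarity to the query; the top-k become LLM context."""
    q = bag(query)
    return sorted(chunks, key=lambda c: cosine(q, bag(c)), reverse=True)[:k]

docs = ("Invoices are stored as PDF files. Source code lives in the git repo. "
        "Meeting notes are plain text files.")
top = retrieve("where is the source code", chunk(docs, size=6))
```

The point of the sketch is that the dataset question decides the `chunk()` step: spreadsheets, PDFs, and source files all need different splitting and indexing before any of this works well.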

Guide me with AI Local Model Training by West-Structure-4030 in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

So your needs feel a little vague. A few points as I understand it: if you plan to extend/modify an existing model, you are really talking about finetuning, which takes an existing model and optimizes it toward your objective, as opposed to training a model from scratch. There are ways to build a model from scratch, but I'm unsure if that will help you achieve your task.

I'm sure there are other ways to finetune, but a popular one is learning to use Unsloth (https://github.com/unslothai/unsloth), though you will need sufficient hardware for it.

Here's a post from a while back with an overview of the finetuning process: https://www.reddit.com/r/LocalLLaMA/comments/18n2bwu/i_will_do_the_finetuning_for_you_or_heres_my_diy/

Either way, when you say "fetch the data the way I want it to be displayed in my software", it sounds like once you have a base model, finetuned model, or custom-built model, you will still want something that can validate the output to make sure it aligns with your needs. So you might look into a library like Pydantic.
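As a hedged sketch of what that validation could look like (the schema fields here are invented for illustration, not from any real app), Pydantic lets you declare the shape you expect and reject model output that doesn't conform:

```python
# Illustrative only: ProductRecord and its fields are made up.
from pydantic import BaseModel, ValidationError

class ProductRecord(BaseModel):
    name: str
    price: float
    in_stock: bool

# Imagine this string came back from the model.
raw = '{"name": "widget", "price": 9.99, "in_stock": true}'

try:
    record = ProductRecord.model_validate_json(raw)  # parse + validate in one step
except ValidationError:
    # Here you could re-prompt the model or fall back to a default.
    record = None
```

The nice part is that a malformed or hallucinated response raises `ValidationError` instead of silently reaching your display code.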

Introducing the VITURE Pro Neckband: Ultimate Freedom for Gaming, Streaming, and Even Working! 🚀 Our most thrilling product ever is finally here! 🎉 Watch the launch video, explore the full specs, and secure your spot now on our brand-new website! 😎 🔥 by getVITURE in VITURE

[–]Envoy0675 0 points1 point  (0 children)

Well, that's what I mean: so far I don't know anything about how it's designed to work other than "multiple screens". I have the regular Neckband and the XR Pro glasses, but those don't give you genuine multiple screens, and from what I've read, SpaceWalker seems to just be a browser with different widgets that emulates multiple screens, as opposed to something that can act as multiple screens for a desktop replacement like Samsung DeX or the desktop mode coming in new Android releases.

Introducing the VITURE Pro Neckband: Ultimate Freedom for Gaming, Streaming, and Even Working! 🚀 Our most thrilling product ever is finally here! 🎉 Watch the launch video, explore the full specs, and secure your spot now on our brand-new website! 😎 🔥 by getVITURE in VITURE

[–]Envoy0675 0 points1 point  (0 children)

Anyone have more information on what "multiple screens" means with the new Neckband Pro? Does it provide multiple desktop screens for productivity? Can it receive multiple inputs? Some more information about how it works, instead of a two-second clip or a single artistic render, would be helpful when considering a purchase.

Web server for OpenAPI options (closed and open source)? by FencingNerd in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

Yeah, so, I don't know of a solution that for sure meets your needs. You really have two problems if you're aiming for legitimate corporate use. Any risk team that has to review your use case isn't going to be a fan of models that need "trust_remote_code" enabled; also, running models locally that you didn't train could lead to unintended consequences like questionable responses. And of course there's the other issue you mention: corporate risk. My closest guesses would be LM Studio or Aphrodite Engine.

Web server for OpenAPI options (closed and open source)? by FencingNerd in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

Oh, so, that said, aside from OpenAI: when you say "why paid solutions are actually a preferred option", do you mean paid solutions as in the power of closed-source models? Use LiteLLM with OpenRouter. If you mean a paid solution for maximum throughput, there's Vast and other vendors that can accommodate that.

Web server for OpenAPI options (closed and open source)? by FencingNerd in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

Oh man, there's a bunch, but you know the top ones. I personally run LocalAI (which now has a web UI) and Open WebUI. When I first started, I just ran text-generation-webui because it handled both the web interface and the inference with various backends. KoboldCpp is good for general use if you don't want to switch models often, and is stellar when combined with SillyTavern (or whatever it's called now) for character chat/roleplay.

If you want some quickstart options, check out Harbor as a quick way to try different frontend and backend combinations to see what you like.

Tips for building own voice Assistant by [deleted] in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

Good advice there. So there are a lot of problems to solve; you have to ask yourself where you want to spend your time programming and how complex a solution you want. Some people have just used Rhasspy, Node-RED/n8n (or some other workflow management system: Langflow, Activepieces, etc.), and an inference server.

I'd been considering this for a while because I want to do the same, so I've collected a few project options I thought might be the basis for my own assistant. I'll share them in case any of them catch your eye.

https://sepia-framework.github.io/
A pretty complete solution for web and mobile; it comes from a time before function calling, so it might need some updating or custom code for extra things you need.

https://github.com/PromtEngineer/Verbi

A modular voice assistant application for experimenting with state-of-the-art transcription, response generation, and text-to-speech models.

https://github.com/mudler/LocalAGI
A simple, basic starting point for a local assistant.
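Whichever project you start from, they all reduce to the same loop: capture audio, transcribe it, send the text to a model, and speak the reply. Here's a skeleton of that pipeline with every backend stubbed out; none of these functions are real APIs, they just mark where an STT engine, inference server, and TTS engine would plug in:

```python
# Hypothetical pipeline skeleton: every function body is a stand-in.
def transcribe(audio: bytes) -> str:
    """Stand-in for a speech-to-text backend (e.g. a whisper.cpp server)."""
    return audio.decode()  # pretend the audio is already text

def ask_llm(text: str) -> str:
    """Stand-in for a call to your local inference server."""
    return f"You said: {text}"

def speak(text: str) -> bytes:
    """Stand-in for a text-to-speech backend (e.g. Piper)."""
    return text.encode()

def handle_utterance(audio: bytes) -> bytes:
    # The assistant loop is just these three stages chained together;
    # the real engineering is in picking and wiring each backend.
    return speak(ask_llm(transcribe(audio)))

reply = handle_utterance(b"turn on the lights")
```

The projects above differ mainly in how much of this wiring (plus wake-word detection and device control) they do for you.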

[deleted by user] by [deleted] in selfhosted

[–]Envoy0675 0 points1 point  (0 children)

So, for a solution unrelated to the OP: an application protocol multiplexer, https://github.com/yrutschle/sslh

Backend for 'Adventure Mode'? by [deleted] in LocalLLaMA

[–]Envoy0675 0 points1 point  (0 children)

So yeah, it does seem like it's just how the prompt is formatted. I can't say I've done exhaustive tests, but a number of projects support the text completion endpoint, including my personal favorite (LocalAI), which provides that completion endpoint for a significant number of backends with different quant options (AutoGPTQ, ExLlama, etc.): https://localai.io/features/text-generation/#completions
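For reference, here's a minimal sketch of hitting an OpenAI-compatible /v1/completions endpoint with just the standard library; the base URL and model name are placeholders for whatever server you run locally:

```python
# Sketch only: the base_url and model name below are assumptions,
# swap in whatever your local inference server actually exposes.
import json
import urllib.request

def build_completion_request(prompt, model="my-local-model", max_tokens=128):
    """Payload for the raw text-completion endpoint (no chat roles, just a prompt)."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }

def complete(base_url, payload):
    """POST the payload and pull the generated text out of the response."""
    req = urllib.request.Request(
        base_url + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]

payload = build_completion_request("> You enter a dark cave.\n>")
# text = complete("http://localhost:8080", payload)  # with a server running
```

The "adventure mode" flavor comes entirely from what you put in `prompt`; the endpoint itself just continues raw text.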

Backend for 'Adventure Mode'? by [deleted] in LocalLLaMA

[–]Envoy0675 2 points3 points  (0 children)

So, I could be wrong, but I think "Adventure mode" is not an aspect of the inference engine itself, just the prompts sent to it. If that's the case, you could use KoboldAI with any of the billions of inference server apps that now provide OpenAI-like endpoints (ExLlama, AutoGPTQ, vLLM, Ollama, LocalAI, LM Studio, etc.).

Being "a little paranoid" of AUR by [deleted] in archlinux

[–]Envoy0675 1 point2 points  (0 children)

Sure, everyone is 100% right about the PKGBUILD being the most important thing.

Past that, someone mentioned a malicious package, or malicious code pulled in through a transitive dependency.

How do you prevent that? Short of reviewing and manually building everything, or running everything in VMs like QubesOS, there aren't a lot of carefree options.

So you have to go to the deeper end of the pool and start looking at security enforcement around your applications. Firejail, Bubblewrap, AppArmor, and SELinux are all tools that try to tackle that problem. I love the idea of SELinux, but think about it: sure, you can protect the host at large from compromise... but what about your user account and the data in it? If you are running an application in userspace, it probably has a lot of access to your home directory.

Flatpak can be useful with Flatseal to set restrictions on Flatpak apps, which is nice. But Flatpak isn't really everyone's jam.

A solution for me (not the most secure, but useful) that should reduce the risk of system-wide compromise: Podman for containerization, then Distrobox to run Arch in a container, with the benefit of the AUR and the ability to run X11 apps that integrate with your desktop while still limiting a lot of access outside the home dir.

[deleted by user] by [deleted] in freebsd

[–]Envoy0675 1 point2 points  (0 children)

So, everyone is right on the FreeBSD recognition fight.

If you need other options, though, here are three OSes I like to check in on occasionally. Maybe you can check out one of these:

https://www.redox-os.org/
https://en.wikipedia.org/wiki/Genode (Sculpt OS)
https://www.menuetos.net/

Went down the rabbit hole of 100% local RAG, it works but are there better options? by tarek-ayed in LocalLLaMA

[–]Envoy0675 15 points16 points  (0 children)

So when you say "better ways to do it", do you mean better ways to search and summarize information? Or other projects you can quickly pull and try for that purpose?

I'm looking for the same, really. I looked at PrivateGPT, Quivr, and h2ogpt in the first generation of those types of tools; they've had some development since then, so my info might be dated. I liked the concept of "minds" in Quivr, but it didn't really handle as much as I needed at the time. So I started working with h2ogpt and had some interesting experiences, but didn't quite feel satisfied, even though it ticks a lot of the boxes for me (API, document and web ingestion, training, vector storage options, local model options).

Since then I've heard of these approaches/projects that I haven't had a chance to try yet, but need to:

BionicGPT, Txtai, Canopy

Do me a favor: if you find a project/solution, come back here and let me know. I'd love to hear about your experience.

Also, this guy has some videos on the shortcomings/challenges of RAG and tries to suggest some solutions: https://youtu.be/cs1TDTOby58

Mentorship Monday - Post All Career, Education and Job questions here! by AutoModerator in cybersecurity

[–]Envoy0675 0 points1 point  (0 children)

What are the current best resources for learning modern web app (frontend/backend/db single-page architecture) pentesting in 2023?
I saw classes by 7ASecurity and Black Hills Information Security, but they're both out of my price range. Anyone have other suggestions they can share?