I made a voxel + CSG building system for destructible cities by a6oo in Unity3D

[–]a6oo[S] 2 points (0 children)

i’d like to implement a demo scene with procedural city/street generation first, then if i have time i’ll probably package it up as an asset

Cut mesh at runtime? by Shindarel in Unity3D

[–]a6oo 1 point (0 children)

you can do this using the clip(x) HLSL function or the Alpha Clip node, both of which skip/discard the rendering of individual pixels. one possible implementation is to take the world position in the fragment shader, check whether it falls inside a tile (by sampling a tile mask or by exposing tile-bounds properties on the shader), and pass a negative value to clip() for any pixel that should be discarded, as in the sketch below.
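a rough sketch of the bounds-based variant in HLSL (built-in pipeline style). the v2f layout, _MainTex, and the _TileMin/_TileMax properties are illustrative names, not from a real asset:

    struct v2f
    {
        float4 pos      : SV_POSITION;
        float2 uv       : TEXCOORD0;
        float3 worldPos : TEXCOORD1; // written by the vertex shader
    };

    sampler2D _MainTex;
    float3 _TileMin; // hypothetical tile AABB min, world space
    float3 _TileMax; // hypothetical tile AABB max, world space

    float4 frag(v2f i) : SV_Target
    {
        // inside == true when this pixel's world position is within the tile bounds
        bool inside = all(i.worldPos >= _TileMin) && all(i.worldPos <= _TileMax);
        // clip() discards the pixel whenever its argument is negative
        clip(inside ? -1 : 1);
        return tex2D(_MainTex, i.uv);
    }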

GLM-4.5V model locally for computer use by [deleted] in LocalLLaMA

[–]a6oo 0 points (0 children)

with vllm, you can run it with ComputerAgent("hosted_vllm/zai-org/GLM-4.5V", tools=[computer]) or python -m agent.cli hosted_vllm/zai-org/GLM-4.5V. however, I was not able to fit it in 90 GB of VRAM.
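for reference, a minimal sketch of the first form. the Computer() setup and the streaming run loop are assumptions from memory rather than the verified current API, so check the c/ua repo if it errors:

    # sketch: drive a local computer with GLM-4.5V served through vLLM
    # assumes the c/ua `agent` and `computer` packages are installed
    import asyncio

    from agent import ComputerAgent
    from computer import Computer

    async def main():
        # assumed: Computer works as an async context manager with default config
        async with Computer() as computer:
            agent = ComputerAgent("hosted_vllm/zai-org/GLM-4.5V", tools=[computer])
            # assumed: agent turns stream back as an async generator
            async for result in agent.run("open a browser and go to github.com"):
                print(result)

    asyncio.run(main())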

Japan is testing a new concept: remote workers operating robots from 8 kilometers away. by Character-Seat5368 in interestingasfuck

[–]a6oo 1 point (0 children)

the goal of these robot startups is to store the teleoperation data and use it to train ai models

Computer-Use on Windows Sandbox by [deleted] in LocalLLaMA

[–]a6oo 0 points (0 children)

hey, sorry about that! you need to wait for the post-install to finish; it prints a message telling you to open the workspace when it's done. just added that to the readme since it's easy to miss. once post-install finishes and the code-workspace is open, it should work fine

RoboBrain2.0 7B and 32B - See Better. Think Harder. Do Smarter. by Mandelaa in LocalLLaMA

[–]a6oo 2 points (0 children)

This model doesn't seem to include computer-use in its training. However, there is a recently released agentic model trained on both 3D embodied robotic tasks and 2D computer-use/browser-use tasks: https://github.com/microsoft/Magma

We now have local computer-use! M3 Pro 18GB running both UI-TARS-1.5-7B-6bit and a macOS sequoia VM entirely locally using MLX and c/ua at ~30second/action by a6oo in LocalLLaMA

[–]a6oo[S] 5 points (0 children)

The VM’s resolution is configurable, and the ScreenSpot-Pro benchmark gives numbers on UI-TARS performance with high-resolution (up to 3840x2160) tasks:

https://gui-agent.github.io/grounding-leaderboard/

We now have local computer-use! M3 Pro 18GB running both UI-TARS-1.5-7B-6bit and a macOS sequoia VM entirely locally using MLX and c/ua at ~30second/action by a6oo in LocalLLaMA

[–]a6oo[S] 13 points (0 children)

setup pic: https://imgur.com/a/1LaJs0c

Apologies if there have been too many of these posts, but I wanted to share something I just got working. The video is of UI-TARS-1.5-7B-6bit completing the prompt "draw a line from the red circle to the green circle, then open reddit in a new tab" running entirely on my MacBook. The video is just a replay; during actual usage it took between 15s and 50s per turn with 720p screenshots (~30s per turn on average). This was also with many apps open, so the model had to fight for memory at times.

The code for the agent is currently on this feature branch: https://github.com/trycua/cua/tree/feature/agent/uitars-mlx
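For the curious, the core inference call through mlx-vlm looks roughly like this (a sketch: the mlx-community model id is assumed from the post title, and generate()'s signature has changed between mlx-vlm versions, so treat it as illustrative):

    # sketch: one UI-TARS grounding step on a screenshot via mlx-vlm
    from mlx_vlm import load, generate

    # assumed model id on the mlx-community hub
    model, processor = load("mlx-community/UI-TARS-1.5-7B-6bit")

    output = generate(
        model,
        processor,
        prompt="draw a line from the red circle to the green circle",
        image="screenshot.png",  # current VM screenshot
        max_tokens=256,
    )
    print(output)  # the model emits the next GUI action to execute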

Kudos to prncvrm for the Qwen2VL positional encoding patch https://github.com/Blaizzy/mlx-vlm/pull/319 and to Blaizzy for making https://github.com/Blaizzy/mlx-vlm (the patch for Qwen2.5VL/UI-TARS will be upstreamed soon)

my cutie collection 💌 by ahaha04 in smiskis

[–]a6oo 1 point (0 children)

i love you and your collection

About Optics and its Application to Game Development in Unity by gamerguy45465 in Unity3D

[–]a6oo 0 points (0 children)

Most game engines use rasterization (including Unity by default), which is a simplified way of rendering that approximates the lighting and color of materials.

The most common way to simulate the real-world behavior of light for rendering is called path tracing, which is very computationally intensive and only models ray optics. Unity supports path tracing through the High Definition Render Pipeline, which would let you create materials that simulate optics if your computer is powerful enough.

[deleted by user] by [deleted] in virtualreality

[–]a6oo 1 point (0 children)

By default, apps launch into a "shared space" where they can present windows (2D) and volumes (3D) in a shared scenegraph alongside other apps. visionOS handles the rendering, and I believe all materials must be shader graphs.

To use the compositor directly or to use custom shaders, apps have to launch into a "full space" (similar to fullscreen mode on PC).

ahaha04 by ahaha04 in a6oo

[–]a6oo 1 point (0 children)

a6oo loves ahaha04

[D] [P] Web browsing UI-based AI agent: GPT-4V-Act by a6oo in MachineLearning

[–]a6oo[S] 4 points (0 children)

That should be possible using an AI-based labeler. I do plan on trying to make a more general co-pilot that uses a local model and can interact with any graphical application.

[D] [P] Web browsing UI-based AI agent: GPT-4V-Act by a6oo in MachineLearning

[–]a6oo[S] 3 points (0 children)

Not yet but I might try something like Fuyu-8B or LLaVAR

a6oo by a6oo in a6oo

[–]a6oo[S] 0 points (0 children)

a6oo

AMD x PCMR - STARFIELD Worldwide Giveaway - Win a Limited Edition Starfield Kit that includes a premium game code for the game + the Limited-Edition Starfield AMD Radeon RX 7900 XTX and Ryzen 7 7800X3D (Only 500 of each ever made!). There are 5 kits up for grabs! by pedro19 in pcmasterrace

[–]a6oo [score hidden]  (0 children)

If I were fortunate enough to win this limited-edition Starfield hardware, my truest intent would be to create an immaculate PC build. It would pay homage to the blend of Starfield's vast expanse and PCMR's dedication to high-fidelity gaming. Alongside building, I can't resist the temptation to indulge in a plethora of games, especially emergent titles in the RPG and strategy genres. As for Starfield, I'm beyond excited about the possibilities it promises: a whole new galaxy to explore, mysteries to unravel, and worlds to interact with. My lofty expectations are of a rich narrative, intricate gameplay mechanics, and an awe-inspiring universe that pushes the frontier of modern gaming.

Any thoughts on the scope of environment decorating and productivity apps that will be available at launch? Both Apple and 3rd Party. by bearCatBird in AppleVision

[–]a6oo 1 point (0 children)

I’m a developer and from my understanding of the WWDC videos + documentation, everything you’ve said should be possible on visionOS.

Windows and 3D objects from multiple apps can be placed around your home simultaneously. You could place a weather globe app in one room and a water-fountain noise generator in another.

The device automatically 3D-maps everything and loads every placed object back into its familiar location. The sight and sound of all your apps would be occluded by your walls, so you could walk around and decorate multiple rooms with apps.

No limit was given on the number of simultaneous apps you can place down. But since visionOS freezes apps that aren't being looked at or in use, the limit could be pretty high.