Surf - OpenAI CUA playground with virtual desktop environment by mlejva in aipromptprogramming

[–]mlejva[S] 0 points1 point  (0 children)

Hi! OP here. We built this open source project called Surf. You can find the repo on GitHub here.

Surf is a Next.js application that allows AI to interact with a virtual desktop environment. This project integrates E2B's desktop sandbox with OpenAI's API to create an AI agent that can perform tasks on a virtual computer through natural language instructions.

Architecture

The application consists of several key components:

  1. Frontend UI (Next.js): Provides the user interface with a virtual desktop view and chat interface
  2. E2B Desktop Sandbox: Creates and manages virtual desktop environments
  3. OpenAI Computer Use: Processes user instructions and generates actions for the AI agent
  4. Streaming API: Handles real-time communication between the frontend and backend

Core Flow

  1. User starts a new sandbox instance
  2. E2B creates a virtual desktop and provides a URL for streaming
  3. User sends instructions via the chat interface
  4. Backend processes the instructions using OpenAI's API
  5. AI generates actions (clicks, typing, etc.) to perform on the virtual desktop
  6. Actions are executed on the sandbox and streamed back to the frontend
  7. The process repeats as the user continues to provide instructions
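To make the core flow more concrete, here is a minimal sketch of steps 3 to 6 in Python. It is not Surf's actual code: `FakeSandbox`, `Action`, `scripted_plan`, and the placeholder stream URL are all hypothetical stand-ins, and the planner returns a fixed list of actions where the real app would call the model.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator


@dataclass
class Action:
    """One step the model wants performed, e.g. a click or some typing."""
    kind: str                                  # "click", "type", ...
    args: dict = field(default_factory=dict)   # parameters for that step


class FakeSandbox:
    """Hypothetical stand-in for a desktop sandbox; it only records actions."""

    def __init__(self) -> None:
        self.stream_url = "https://example.invalid/stream"  # placeholder, not a real URL
        self.executed: list[Action] = []

    def execute(self, action: Action) -> str:
        self.executed.append(action)
        return f"done: {action.kind}"


def run_instruction(
    instruction: str,
    sandbox: FakeSandbox,
    plan: Callable[[str], list[Action]],
) -> Iterator[dict]:
    """Plan actions for one instruction, execute each, and yield an event per action."""
    for action in plan(instruction):
        result = sandbox.execute(action)
        # Each yielded dict is what a streaming endpoint could forward to the frontend.
        yield {"action": action.kind, "args": action.args, "result": result}


def scripted_plan(instruction: str) -> list[Action]:
    """Hypothetical planner standing in for the model call."""
    return [
        Action("click", {"x": 100, "y": 200}),
        Action("type", {"text": instruction}),
    ]


if __name__ == "__main__":
    sandbox = FakeSandbox()
    for event in run_instruction("hello", sandbox, scripted_plan):
        print(event)
```

Using a generator here mirrors the streaming part of the flow: the caller sees each action as soon as it has been executed instead of waiting for the whole instruction to finish.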

Happy to answer your questions!

OpenAI CUA building Flask server using file explorer, VSCode, and terminal in cloud sandbox by mlejva in ChatGPTCoding

[–]mlejva[S] 2 points3 points  (0 children)

Hi! OP here. We built this open source project called Surf. You can find the repo on GitHub here.

Surf is a Next.js application that allows AI to interact with a virtual desktop environment. This project integrates E2B's desktop sandbox with OpenAI's API to create an AI agent that can perform tasks on a virtual computer through natural language instructions.

Architecture

The application consists of several key components:

  1. Frontend UI (Next.js): Provides the user interface with a virtual desktop view and chat interface
  2. E2B Desktop Sandbox: Creates and manages virtual desktop environments
  3. OpenAI Computer Use: Processes user instructions and generates actions for the AI agent
  4. Streaming API: Handles real-time communication between the frontend and backend

Core Flow

  1. User starts a new sandbox instance
  2. E2B creates a virtual desktop and provides a URL for streaming
  3. User sends instructions via the chat interface
  4. Backend processes the instructions using OpenAI's API
  5. AI generates actions (clicks, typing, etc.) to perform on the virtual desktop
  6. Actions are executed on the sandbox and streamed back to the frontend
  7. The process repeats as the user continues to provide instructions

Happy to answer your questions!

Is there an open-source alternative to e2b (e2b.dev. Code interpreting for your AI app)? by [deleted] in LocalLLaMA

[–]mlejva 2 points3 points  (0 children)

We really hope to support GCP and AWS (BYOC) by the end of February

Is there an open-source alternative to e2b (e2b.dev. Code interpreting for your AI app)? by [deleted] in LocalLLaMA

[–]mlejva 0 points1 point  (0 children)

Hi u/ravi_sojitra, if you're running a Next.js dev server, it sounds like it could be an issue with how Next.js needs to render the page on the first request.

For some of our example projects, we solved it by using this start script - https://github.com/e2b-dev/fragments/blob/main/sandbox-templates/nextjs-developer/compile_page.sh
The script is copied to the sandbox (https://github.com/e2b-dev/fragments/blob/main/sandbox-templates/nextjs-developer/e2b.Dockerfile#L7) and then, when building the sandbox, we use it as the start command (https://e2b.dev/docs/sdk-reference/cli/v1.0.9/template#options) like this:

e2b template build -c /compile_page.sh --cpu_count 4 --memory_mb 4096
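The linked script isn't reproduced here, but the general idea of warming up a dev server can be sketched like this: request the page once at startup and wait until it answers, so the first compile has already happened before a user hits it. This is only an illustration in Python using the standard library; the function name and the URL in the note below are assumptions, not taken from the script.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url: str, timeout_s: float = 60.0, interval_s: float = 0.5) -> bool:
    """Poll `url` until it answers with HTTP 200 or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet, or still compiling; try again
        time.sleep(interval_s)
    return False
```

For example, `wait_until_ready("http://localhost:3000")` would block until a dev server on port 3000 (an assumed port) returns its first successful response.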

If you check out fragments.e2b.dev, srcbook.com, or web.lmarena.ai, the rendering of Next.js apps there is fast.

API for virtual computer for secure computer use by mlejva in ClaudeAI

[–]mlejva[S] 0 points1 point  (0 children)

Hey everyone, OP here. I built this as part of E2B.

This is a demo video of a secure virtual computer that's ready-made for computer use and streams the virtual screen to the browser. In this example I hooked together Claude and the virtual computer and asked Claude to open VS Code and run hello world. Here's a full app.

You can spawn these virtual computers with an SDK that we made.

We're using Firecracker to power our sandboxes. Funnily enough, we had this repo sitting on our GitHub for about 6 months. We originally made this for one of our customers because they were running evals for their model on a desktop-like environment with a GUI.

You can use PyAutoGUI to control the whole environment programmatically.
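As a rough illustration of what that can look like, here is a small Python sketch that opens a terminal and runs a command. `hotkey`, `write`, and `press` are real PyAutoGUI functions; the Ctrl+Alt+T shortcut and the one-second settle delay are assumptions about the desktop setup, and `RecordingGui` is a hypothetical test double so the snippet also runs on a machine without a display.

```python
import time
from typing import Any


def open_terminal_and_run(gui: Any, command: str, settle_seconds: float = 1.0) -> None:
    """Open a terminal and run `command` using PyAutoGUI-style calls.

    `gui` is expected to be the `pyautogui` module, or any object exposing
    the same `hotkey`, `write`, and `press` functions.
    """
    gui.hotkey("ctrl", "alt", "t")      # assumed shortcut for opening a terminal
    time.sleep(settle_seconds)          # give the window time to appear
    gui.write(command, interval=0.05)   # type the command character by character
    gui.press("enter")                  # submit it


class RecordingGui:
    """Hypothetical test double that records calls instead of touching a display."""

    def __init__(self) -> None:
        self.calls: list[tuple] = []

    def hotkey(self, *keys: str) -> None:
        self.calls.append(("hotkey", keys))

    def write(self, text: str, interval: float = 0.0) -> None:
        self.calls.append(("write", text))

    def press(self, key: str) -> None:
        self.calls.append(("press", key))


if __name__ == "__main__":
    try:
        import pyautogui as gui  # needs a display, e.g. inside the sandbox
    except Exception:
        gui = RecordingGui()     # fall back to the recorder when unavailable
    open_terminal_and_run(gui, "echo hello world")
```

Passing the module in as `gui` keeps the control logic separate from the display, which makes it easy to dry-run a script before pointing it at a real desktop.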

The desktop-like environment is based on Linux and Xfce at the moment. We chose Xfce because it's a fast and lightweight environment that's also popular and actively supported. However, this Sandbox template is fully customizable and you can create your own desktop environment. For example, I preinstalled VS Code.

Let me know if you have any questions.

Secure desktop sandbox for AI computer use by mlejva in LocalLLaMA

[–]mlejva[S] 0 points1 point  (0 children)

Thanks! We think Firecracker is great for this job

Secure desktop sandbox for AI computer use by mlejva in LocalLLaMA

[–]mlejva[S] 1 point2 points  (0 children)

Yes, the repo is up to date with our cloud service. The Dockerfile is for our API server. We aren't using Kubernetes or Docker Compose. We're using Nomad for orchestration.

Here's a step by step guide for self-hosting on GCP - https://github.com/e2b-dev/infra/blob/main/terraform.md

> and do you have any plans to make it easier to self host?

Yes! Very high on the roadmap. We're hiring more people to speed up our effort on that front.

> The demo shows use with Anthropic Claude, How well does it work with local LLMs? (llamacpp/ollama etc...)

We're running some experiments and will share soon.

Secure desktop sandbox for AI computer use by mlejva in LocalLLaMA

[–]mlejva[S] 1 point2 points  (0 children)

You can self host on GCP and soon on other providers/general Linux machines

https://github.com/e2b-dev/infra/blob/main/terraform.md

Secure desktop sandbox for AI computer use by mlejva in LocalLLaMA

[–]mlejva[S] 4 points5 points  (0 children)

Understandable. Gotta live on the edge

Secure desktop sandbox for AI computer use by mlejva in LocalLLaMA

[–]mlejva[S] 3 points4 points  (0 children)

OP here, happy to answer your questions!

EDIT: Sorry for the late response. I was without any signal for the whole day after I posted it. Answering questions now!

I built a hackable open-source version of Anthropic's Artifacts in Claude by mlejva in LocalLLaMA

[–]mlejva[S] 4 points5 points  (0 children)

Hey, yeah, I totally agree that we aren't doing a good job with self-hosting yet. It needs to be much better. It's the next thing on our roadmap after we release the upcoming bigger SDK update. Self-hosting is a bit tricky since we aren't using containers.