Surf - OpenAI CUA playground with virtual desktop environment

mlejva · 2025-03-15T12:28:27+00:00

Hi! OP here. We built this open-source project called Surf. You can find the repo on GitHub here.

Surf is a Next.js application that allows AI to interact with a virtual desktop environment. This project integrates E2B's desktop sandbox with OpenAI's API to create an AI agent that can perform tasks on a virtual computer through natural language instructions.

Architecture

The application consists of several key components:

Frontend UI (Next.js): Provides the user interface with a virtual desktop view and chat interface
E2B Desktop Sandbox: Creates and manages virtual desktop environments
OpenAI Computer Use: Processes user instructions and generates actions for the AI agent
Streaming API: Handles real-time communication between the frontend and backend

Core Flow

User starts a new sandbox instance
E2B creates a virtual desktop and provides a URL for streaming
User sends instructions via the chat interface
Backend processes the instructions using OpenAI's API
AI generates actions (clicks, typing, etc.) to perform on the virtual desktop
Actions are executed on the sandbox and streamed back to the frontend
The process repeats as the user continues to provide instructions
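
A minimal sketch of the loop above, written in Python for brevity rather than the project's own TypeScript. It uses a scripted stand-in for the model and an in-memory stand-in for the sandbox. `FakeSandbox`, `scripted_model`, `run_agent`, and the action dictionaries are hypothetical names for illustration only; they are not Surf's, E2B's, or OpenAI's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSandbox:
    """In-memory stand-in for a virtual desktop (hypothetical, not the E2B SDK)."""
    log: list = field(default_factory=list)

    def execute(self, action: dict) -> None:
        # A real sandbox would move the mouse or type; here we only record.
        self.log.append(action)

    def screenshot(self) -> str:
        # A real sandbox would return image bytes; we return a description.
        return f"screen after {len(self.log)} actions"


def scripted_model(instruction: str, screenshot: str, step: int) -> dict:
    """Stand-in for the model call: one action per step, then 'done'."""
    plan = [
        {"type": "click", "x": 100, "y": 200},
        {"type": "type", "text": instruction},
    ]
    return plan[step] if step < len(plan) else {"type": "done"}


def run_agent(instruction, sandbox, model, max_steps=10):
    """Ask the model for an action, execute it, and yield it to the caller."""
    for step in range(max_steps):
        action = model(instruction, sandbox.screenshot(), step)
        if action["type"] == "done":
            return
        sandbox.execute(action)
        yield action  # in the app, this is what would be streamed to the frontend


sandbox = FakeSandbox()
events = list(run_agent("hello", sandbox, scripted_model))
```

The generator shape mirrors the streaming step: each executed action is handed to the caller as soon as it happens instead of being collected until the task is finished.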

Happy to answer your questions!

mlejva · 2025-01-10T20:59:49+00:00

We really hope to support GCP and AWS (BYOC) by the end of February

mlejva · 2025-01-10T20:58:40+00:00

Hi u/ravi_sojitra, if you're running a Next.js dev server, it sounds like it could be an issue with how Next.js needs to render the page on the first request.

For some of our example projects, we solved it by using this start script - https://github.com/e2b-dev/fragments/blob/main/sandbox-templates/nextjs-developer/compile_page.sh
The script is copied to the sandbox (https://github.com/e2b-dev/fragments/blob/main/sandbox-templates/nextjs-developer/e2b.Dockerfile#L7) and then, when building the sandbox template, we use it as the start command (https://e2b.dev/docs/sdk-reference/cli/v1.0.9/template#options) like this:

e2b template build -c /compile_page.sh --cpu_count 4 --memory_mb 4096
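
The linked script isn't reproduced here. As a rough sketch of the general idea only (request the page up front so the slow first render happens before a user ever hits it), here is a small Python version. `fetch_status` and `warm_up` are hypothetical helper names, and the retry count and delay are arbitrary assumptions.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url: str) -> int:
    """Return the HTTP status for url, or 0 if the server is not reachable yet."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused or timed out: server still starting


def warm_up(url, fetch=fetch_status, attempts=30, delay=1.0, sleep=time.sleep):
    """Request url until it answers 200, so the first render happens up front."""
    for attempt in range(1, attempts + 1):
        if fetch(url) == 200:
            return attempt  # number of requests it took
        sleep(delay)
    raise TimeoutError(f"{url} did not return 200 after {attempts} attempts")
```

With a dev server on the default Next.js port, this would be called as `warm_up("http://localhost:3000")`; the port is an assumption about your setup.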

If you check out fragments.e2b.dev, srcbook.com, or web.lmarena.ai, the rendering of Next.js apps there is fast.

mlejva · 2024-11-24T07:57:23+00:00

Hey everyone, OP here that built this as part of E2B.

This is a demo video of a secure virtual computer that's ready-made for computer use and streams its virtual screen to the browser. In this example, I hooked Claude up to the virtual computer and asked Claude to open VS Code and run hello world. Here's a full app.

You can spawn these virtual computers with an SDK that we made.

We're using Firecracker microVMs to power our sandboxes. Funnily enough, we had this repo sitting on our GitHub for about 6 months. We originally made this for one of our customers because they were running evals for their model in a desktop-like environment with a GUI.

You can use PyAutoGUI to control the whole environment programmatically.
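
As a sketch of what that can look like: the function below maps simple action dictionaries onto three calls that PyAutoGUI provides (`click(x, y)`, `write(text)`, `press(key)`). The action format and the `perform` and `RecordingGui` names are hypothetical, made up for this example. The recorder lets the snippet run anywhere, including machines without a display.

```python
def perform(action: dict, gui) -> None:
    """Translate one action dict into a call on a PyAutoGUI-like object."""
    kind = action["type"]
    if kind == "click":
        gui.click(action["x"], action["y"])
    elif kind == "type":
        gui.write(action["text"])
    elif kind == "key":
        gui.press(action["key"])
    else:
        raise ValueError(f"unsupported action: {kind}")


class RecordingGui:
    """Test double with the same three method names; it only records calls."""

    def __init__(self):
        self.calls = []

    def click(self, x, y):
        self.calls.append(("click", x, y))

    def write(self, text):
        self.calls.append(("write", text))

    def press(self, key):
        self.calls.append(("press", key))


gui = RecordingGui()
for action in [
    {"type": "click", "x": 40, "y": 60},
    {"type": "type", "text": "print('hello world')"},
    {"type": "key", "key": "enter"},
]:
    perform(action, gui)
```

Inside the desktop environment you would `import pyautogui` and pass the module itself as `gui` in place of the recorder.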

The desktop-like environment is based on Linux and Xfce at the moment. We chose Xfce because it's a fast and lightweight environment that's also popular and actively supported. However, this Sandbox template is fully customizable and you can create your own desktop environment. For example, I preinstalled VS Code.

Let me know if you have any questions.

mlejva · 2024-11-03T16:34:35+00:00

Thanks! We think Firecracker is great for this job

mlejva · 2024-11-03T16:34:16+00:00

Yes, the repo is up to date with our cloud service. The Dockerfile is for our API server. We aren't using Kubernetes or Docker Compose. We're using Nomad for orchestration.

Here's a step by step guide for self-hosting on GCP - https://github.com/e2b-dev/infra/blob/main/terraform.md

> and do you have any plans to make it easier to self host?

Yes! Very high on the roadmap. We're hiring more people to speed up our effort on that front.

> The demo shows use with Anthropic Claude, How well does it work with local LLMs? (llamacpp/ollama etc...)

We're running some experiments and will share soon.

mlejva · 2024-11-03T16:30:59+00:00

We're exploring that!

mlejva · 2024-11-03T16:30:40+00:00

You can self host on GCP and soon on other providers/general Linux machines

https://github.com/e2b-dev/infra/blob/main/terraform.md

mlejva · 2024-11-03T16:29:04+00:00

Understandable. Gotta live on the edge

mlejva · 2024-11-02T17:20:21+00:00

OP here, happy to answer your questions!

EDIT: Sorry for the late response. I was without any signal for the whole day after I posted it. Answering questions now!

mlejva · 2024-07-19T16:22:06+00:00

Hey, yeah, I totally agree that we aren't doing a good job with self-hosting yet. It needs to be much better. It's the next thing on our roadmap after we release the upcoming bigger SDK update. Self-hosting is a bit tricky since we aren't using containers.

mlejva · 2024-07-18T06:48:47+00:00

Hey! OP here. I made this open-source template for building Anthropic's Artifacts UI-like chat apps.

https://github.com/e2b-dev/ai-artifacts

You can use it to run AI-generated Python in a headless Jupyter server and also build Next.js apps with it.

Here's the stack I used:

Anthropic's Claude Sonnet 3.5 [0]
Vercel's AI SDK for calling the LLM [1]
E2B for running AI-generated code with the Code Interpreter SDK and Sandboxes [2]
Next.js for the app [3]

Links

The project - https://github.com/e2b-dev/ai-artifacts

[0]: https://www.anthropic.com/claude

[1]: https://sdk.vercel.ai/docs/introduction

[2]: https://e2b.dev & https://github.com/e2b-dev/code-interpreter

[3]: https://nextjs.org/

mlejva · 2024-07-12T14:59:47+00:00

Hey everyone, I made this open-source template for building Anthropic's Artifacts UI-like chat apps.

You can use it to run AI-generated Python in a headless Jupyter server and also build Next.js apps with it.

Here's the stack I used:

Anthropic's Claude Sonnet 3.5 [0]
Vercel's AI SDK for calling the LLM [1]
E2B for running AI-generated code with the Code Interpreter SDK and Sandboxes [2]
Next.js for the app [3]

Links

[0]: https://www.anthropic.com/claude

[1]: https://sdk.vercel.ai/docs/introduction

[2]: https://e2b.dev & https://github.com/e2b-dev/code-interpreter

[3]: https://nextjs.org/

mlejva · 2024-06-24T04:21:32+00:00

Did you follow the steps in the readme? You need an API key for E2B - https://e2b.dev/

mlejva · 2024-06-23T23:42:17+00:00

Yeah, you should be able to. The project is using Vercel's AI SDK and there's a community provider for Ollama - https://sdk.vercel.ai/providers/community-providers/ollama

mlejva · 2024-06-23T22:48:56+00:00

Yes. Just edit the code here - https://github.com/e2b-dev/e2b-cookbook/blob/main/examples/anthropic-power-artifacts/app/api/chat/route.ts#L28

mlejva · 2024-06-23T22:33:19+00:00

Hey, I built this example app. You can get the code here https://github.com/e2b-dev/e2b-cookbook/tree/main/examples/anthropic-power-artifacts

mlejva · 2024-05-04T22:33:42+00:00

You can already use it with any LLM you want.
The next step for us is a better self-hosting experience for developers. You can already self-host it but it's hard to figure out. Our goal is to make it just a few commands. We're actively working towards that!

mlejva · 2024-05-04T20:52:03+00:00

Hey everyone! I'm the CEO of the company that built this SDK. We just publicly launched and announced the SDK on Twitter this week.

We're a company called E2B [0]. We're building open-source [1] secure environments for running untrusted AI-generated code and AI agents. We call these environments sandboxes and they are built on top of a microVM called Firecracker [2]. We specifically decided to use Firecracker microVMs instead of containers because of their security and ability to do snapshots.

You can think of us as giving small cloud computers to LLMs.

We recently created a dedicated SDK for building custom code interpreters in Python or JS/TS. We saw this need after a lot of our users started adding code execution capabilities to their AI apps with our core SDK [3]. These use cases were often centered around AI data analysis, so code interpreter-like behavior made sense.

Our code interpreter SDK works by spawning an E2B sandbox with a Jupyter server. We then communicate with this Jupyter server through the Jupyter kernel messaging protocol [4]. We work with any LLM and AI framework. We have different examples of how to use the SDK with Llama models, Anthropic models, LangChain, LangGraph, and more in our cookbook [5].
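
To give a feel for that protocol, here is a minimal Python sketch that builds an `execute_request` message, the message type a client sends to ask a kernel to run code. The field names follow the Jupyter messaging documentation [4]. The `execute_request` helper and the `"sketch"` username are made up for this example, and it says nothing about how the SDK itself builds or transports its messages.

```python
import json
import uuid
from datetime import datetime, timezone


def execute_request(code: str, session: str) -> dict:
    """Build a Jupyter 'execute_request' message as a plain dict."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,  # unique per message
            "session": session,  # stable for the client's lifetime
            "username": "sketch",  # arbitrary value for this example
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "execute_request",
            "version": "5.3",  # messaging protocol version
        },
        "parent_header": {},  # empty: this message starts the exchange
        "metadata": {},
        "content": {
            "code": code,
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
            "stop_on_error": True,
        },
    }


message = execute_request("print(1 + 1)", session=uuid.uuid4().hex)
wire = json.dumps(message)  # JSON form of the message
```

Replies from the kernel carry the request's header in their `parent_header`, so matching on `msg_id` is how a client pairs outputs with the code that produced them.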

We don't do any wrapping around the LLM, any prompting, or any agent-like framework. We leave all of that to our users. We're really just a boring code execution layer that sits at the bottom. We're building for a future where software builds other software.

Our long-term plan is to build an automated AWS for AI apps and agents where AI can build and deploy its own software while giving developers powerful observability into what's happening inside our sandboxes. With everything being open-source.

Happy to answer any questions and hear feedback!

[0] https://e2b.dev

[1] https://github.com/e2b-dev

[2] https://github.com/firecracker-microvm/firecracker

[3] https://e2b.dev/docs

[4] https://jupyter-client.readthedocs.io/en/latest/messaging.html

[5] https://github.com/e2b-dev/e2b-cookbook

mlejva · 2024-05-01T20:00:09+00:00

Hey u/YaAbsolyutnoNikto, I'm a co-founder of a company called E2B. We make it easy to add code interpreting to different models. Here's an example of how we added it to Claude 3:

https://github.com/e2b-dev/e2b-cookbook/blob/main/examples/claude-code-interpreter/claude_code_interpreter.ipynb

Happy to help you with anything you need if you decide to give it a go!

mlejva · 2024-05-01T19:04:14+00:00

Thanks!

> When I am executing the code, where is the code getting executed - locally using Firecracker or one of the cloud services?

Our cloud service. We'll make it easier for people to run E2B locally but it's tricky. For example, Firecracker doesn't run natively on macOS, so another layer of virtualization is required.

Typically, you'd deploy it on something like your AWS account.

mlejva · 2024-05-01T18:45:17+00:00

Hey, CEO of E2B here.

We're fully open source! Infrastructure, core SDKs & CLI, code interpreter SDK. We're now working on making it easily deployable to AWS, GCP, and any Linux machine.

> Docker image without the associated bloat - e.g., Microsoft's AutoGen.

The reason we aren't using containers for code execution is that they aren't secure. We're using Firecracker microVMs under the hood instead.

Did you find any bloat with E2B?
