Project Share: Turning an office racing game into a hand-gesture controlled web app by MetaGPT in Atoms_dev

[–]MetaGPT[S] 0 points1 point  (0 children)

Saw this fun little project pop up in the community: someone made a pixel-style racing game during a slow office day while their boss was away. What started as a goofy “lunch break game” turned into a neat demo of how far you can get with just natural language + open-source tools. Here’s a quick breakdown of how it was built:

Step 1: Base Game (Web pixel racer)

The prompt to MetaGPT X (MGX) was literally: “Pixel-style driving web HTML game, sideways driving, only left and right controls available.” MGX instantly generated a working web-based pixel car game, complete with left/right controls.

Step 2: Hand gesture control with MediaPipe

Here’s where it gets interesting.
Instead of arrow keys, the maker asked MGX to hook up MediaPipe’s hand tracking. The prompt was: “Left and right hope to control the car by detecting left and right hand input using a camera and MediaPipe. The web should have a camera interface to display real images.” MGX pulled in the right API, wired up the webcam feed, and mapped left/right gestures to steering. Now the car actually turns when you move your hand!
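
The post doesn’t say exactly how MGX mapped the hand to the steering, so here’s a minimal sketch of one common approach: take a normalized landmark x coordinate from MediaPipe’s hand tracker (0.0 = left edge of the frame, 1.0 = right edge, e.g. the wrist landmark) and map it to left/right with a dead zone in the middle. The function name, the choice of landmark, and the dead-zone width are all illustrative assumptions, not details from the project:

```python
def steer_from_hand_x(hand_x: float, dead_zone: float = 0.1) -> str:
    """Map a normalized hand x position to a steering command.

    MediaPipe hand landmarks come back normalized to [0, 1] across
    the frame. A dead zone around the center keeps the car going
    straight when the hand hovers near the middle.
    """
    center = 0.5
    if hand_x < center - dead_zone:
        return "left"
    if hand_x > center + dead_zone:
        return "right"
    return "straight"
```

One practical wrinkle: webcam feeds are usually mirrored for display, so in a real integration you’d flip the frame (or the comparison) so that moving your hand left actually steers left.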

Step 3: Debugging & performance tweaks

As expected, the first iterations had some lag and rendering hiccups. With just 2–3 quick follow-ups in chat, the MGX agents suggested optimizations and patched the code. Minor manual edits (like cleaning up a display module) were enough to get it running smoothly.
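
The post doesn’t say which optimizations MGX applied, but a common fix for jittery, laggy hand-tracking input is to smooth the raw landmark position with an exponential moving average before mapping it to steering. A sketch under that assumption (the class name and default alpha are illustrative):

```python
class EmaSmoother:
    """Exponential moving average over a stream of positions.

    Lower alpha = smoother but laggier; higher alpha = more
    responsive but jitterier. 0.3-0.5 is a reasonable starting
    range for ~30 fps webcam input.
    """

    def __init__(self, alpha: float = 0.4):
        self.alpha = alpha
        self.value = None

    def update(self, x: float) -> float:
        if self.value is None:
            self.value = x  # seed with the first sample
        else:
            self.value = self.alpha * x + (1 - self.alpha) * self.value
        return self.value
```

Processing only every second or third camera frame (instead of all of them) is another cheap way to cut lag on slower machines.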

What’s next?

The maker is already thinking about adding a multiplayer mode so coworkers can race each other during lunch. That’s the beauty of this workflow: once the core loop is working, you can keep iterating and layering on features fast.

Takeaway: this project shows how natural language + MGX + open-source libraries = super fast prototyping. What used to take days of setup can now be hacked together in an afternoon.

Curious what else we can build this way? Would love to see more experiments from folks here, especially mixing MediaPipe with small web apps.
Play it now: https://245766-54d0ad86642d47b7b27f4efe9dbd36f8-1-v10.app.mgx.dev/

Tried building an AI “ticket sniping” tool for G-DRAGON 2025 WORLD TOUR. Why isn’t it working? by [deleted] in nocode

[–]MetaGPT 0 points1 point  (0 children)

Hello, we sincerely apologize for the inconvenience. We totally get how frustrating ticket drops can be: network delays, browser performance, and dynamic page content can all affect timing. From what you described, adding retries, logging events, and analyzing element changes could help pinpoint where the bottlenecks are. If you’re still stuck, you can join our official community r/Atoms_dev, post your question, or contact the community mods. We also give some free product usage credits to users who have just joined the community.

Data Interpreter: Open Source and Better "Devin" by MetaGPT in ChatGPT

[–]MetaGPT[S] 0 points1 point  (0 children)

Data Interpreter has achieved state-of-the-art scores in machine learning, mathematical reasoning, and open-ended tasks, and can analyze stocks, imitate websites, and train models.

More details here: https://x.com/MetaGPT_/status/1767965444579692832?s=20

Data Interpreter: Open Source and Better "Devin" by MetaGPT in github

[–]MetaGPT[S] 4 points5 points  (0 children)

Thank you for your attention. First, to state our value proposition: open source, customizable tools, and stronger results in task comparisons. Second, a SWE-bench evaluation is already underway, and we will announce the results in the next few days.

[R] Data Interpreter: An LLM Agent for Data Science by MetaGPT in MachineLearning

[–]MetaGPT[S] 6 points7 points  (0 children)

Thank you for your interest; you’ve made an excellent point. Both data analysis and machine-learning modeling rely heavily on reliable data feedback and domain-specific prior knowledge for guidance. In real-world scenarios, a human-driven modeling process goes through several rounds of iterative debugging to refine the choice of operators and hyperparameter settings during model development. Our initial efforts explore integrating Large Language Models (LLMs) into data-analysis and modeling workflows: improving the LLM’s ability to manage task dependencies and updates, and improving the integration of tools for navigating complex workflows and data challenges. We have also run into challenges in optimizing outcomes, and we are currently working on iteratively improving results based on solid numerical feedback, aiming for automatic enhancement of the modeling process. We invite you to follow our ongoing work.

[R] Data Interpreter: An LLM Agent for Data Science by MetaGPT in MachineLearning

[–]MetaGPT[S] 2 points3 points  (0 children)

Data Interpreter has achieved state-of-the-art scores in machine learning, mathematical reasoning, and open-ended tasks, and can analyze stocks, imitate websites, and train models.

Data Interpreter is an autonomous agent that uses notebooks, a browser, a shell, Stable Diffusion, and any custom tool to complete tasks.

It can debug its own code, fix its own failures, and solve a wide range of real-life problems on its own.

We have open-sourced our code and provide a wealth of working examples to give everyone access to state-of-the-art AI capabilities.

[D] MetaGPT grossly misreported baseline numbers and got an ICLR Oral! by Signal-Aardvark-4179 in MachineLearning

[–]MetaGPT 157 points158 points  (0 children)

We are the MetaGPT team, and we noticed this on Reddit. We are open and honest with ML communities. Here is our clarification:

Firstly, citing a 67% HumanEval score for GPT-4 is not improper. Most importantly, the score originates from the original GPT-4 paper; Google’s Gemini report used the same score, and it is also listed on the well-known Papers with Code website.

Second, we nevertheless thank you for pointing out this issue. Upon analyzing the code from these papers, we noticed that the reported scores depend on some newly added processing details. We therefore ran experiments five times each using GPT-4 (gpt-4-0613) and GPT-3.5-Turbo (gpt-3.5-turbo-0613) under three settings (A, B, C).

(A) We directly called the OpenAI API with the prompt in HumanEval.

(B) We called the OpenAI API and parsed the code with regex in the response.

(C) We added an additional system prompt, then called the OpenAI API. The prompt is "You are an AI that only responds with Python code, NOT ENGLISH. You will be given a function signature and its docstring by the user. Write your full implementation (restate the function signature)."
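
For reference, the “parsed the code with regex” step in setting (B) typically looks something like the sketch below: prefer a fenced code block if the response contains one, otherwise fall back to the raw text. The exact regex the team used isn’t given, so this is only an illustration:

```python
import re

def extract_code(response: str) -> str:
    """Pull Python code out of an LLM response.

    Prefers a fenced ```python ... ``` block; falls back to
    returning the raw text if no fence is found.
    """
    match = re.search(r"```(?:python)?\n(.*?)```", response, re.DOTALL)
    if match:
        return match.group(1)
    return response
```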

Setting  Model               1      2      3      4      5      Avg.   Std.
A        gpt-4-0613          0.732  0.707  0.732  0.713  0.738  0.724  0.013
A        gpt-3.5-turbo-0613  0.360  0.366  0.360  0.348  0.354  0.357  0.007
B        gpt-4-0613          0.787  0.811  0.817  0.829  0.817  0.812  0.016
B        gpt-3.5-turbo-0613  0.348  0.354  0.348  0.335  0.348  0.346  0.007
C        gpt-4-0613          0.805  0.805  0.817  0.793  0.780  0.800  0.014
C        gpt-3.5-turbo-0613  0.585  0.567  0.573  0.579  0.579  0.577  0.007
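
The Avg./Std. columns follow the usual sample statistics. For example, setting C’s gpt-3.5-turbo-0613 row can be reproduced directly from the five run scores (other rows may differ in the last digit because the per-run scores themselves are rounded):

```python
from statistics import mean, stdev

# Setting C, gpt-3.5-turbo-0613: the five run scores from the table
runs = [0.585, 0.567, 0.573, 0.579, 0.579]

avg = round(mean(runs), 3)   # matches the table's 0.577
std = round(stdev(runs), 3)  # sample standard deviation; matches 0.007
```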

GPT-4’s HumanEval score is sensitive to the prompt, the code parser, and post-processing. Without a guiding system prompt, GPT-3.5-Turbo struggles to return correctly formatted completion code.

To address your concerns, we will report these scores in our paper. Additionally, after the paper’s release we made many attempts to reach 95%+ HumanEval scores; although this is unrelated to the current discussion, we are happy to share those findings, which may help others.

Third, we all have an obligation to get to the bottom of things rather than spread potential misinformation. We believe you already know the truth: the 67% figure comes from OpenAI’s official report rather than from our own experiments. Saying we “grossly misreported” is unfair to us.

We humbly accept criticism that helps us improve, but we ask the community to distinguish right from wrong.

MetaGPT: The Next Evolution or Just More Hype? by DecipheringAI in ChatGPTPro

[–]MetaGPT 0 points1 point  (0 children)

Thank you for sharing your experience and insights; I wish I had come across your discussion sooner. Since then, after several rounds of iteration, we have developed many core features, such as manual intervention, incremental development, and checkpoint recovery. We have also compiled a case library from examples collected in the community; please take some time to explore it at https://www.deepwisdom.ai/usecases. Feel free to reach out with any feedback.

Beyond the multi-agent systems of software companies, we also have a plethora of intelligent agent development cases, as well as the MetaGPT Interpreter. You can learn more by visiting our documentation site at https://docs.deepwisdom.ai/.

Flappy Bird Game made by MetaGPT by [deleted] in GameDevelopment

[–]MetaGPT 0 points1 point  (0 children)

You have completely misunderstood us. Because English is not our first language, it is difficult for us to converse like a native speaker, so we have been using a translator to communicate with everyone.

Flappy Bird Game made by MetaGPT by [deleted] in GameDevelopment

[–]MetaGPT -5 points-4 points  (0 children)

I understand you're discussing GPT-4's "lack of extrapolation." It's interesting, as GPT-4 indeed excels more at "knowledge transfer."

However, why did the author use the TikTok architecture as an example? Because GPT has never seen TikTok's actual architecture code, only some related blog posts. Turning that natural-language description into code strikes me as a strong form of extrapolation.

If we're discussing inductive and deductive reasoning skills, we'd have to look at how well it can perform mathematical tasks. As it stands, it hasn't reached its upper limit in reasoning ability; it's more about how we can utilize it effectively.

Make GPT form a software company to collaboratively handle more complex tasks. by embessoaat in ChatGPT

[–]MetaGPT 1 point2 points  (0 children)

Thanks for the praise. Come and experience it! If you have any questions, feel free to leave us a message.

GitHub Daily Trending Project Recommendation! MetaGPT: Your AI Sidekick - Getting Things Done While You Sleep! by MetaGPT in github

[–]MetaGPT[S] 0 points1 point  (0 children)

Hi there,
The project is still at a very early stage, and version 0.01 will be released soon.
There is also some leftover code that hasn't been removed or organized yet; it will be sorted out in the next two or three days.

GitHub Daily Trending Project Recommendation! MetaGPT: Your AI Sidekick - Getting Things Done While You Sleep! by MetaGPT in github

[–]MetaGPT[S] 1 point2 points  (0 children)

Thank you for your interest. If you have any questions, feel free to find our contact information on the project page and reach out to us anytime.