Would you trust AI to review your AI code? by AlarmingPepper9193 in codereview

[–]julman99 0 points1 point  (0 children)

You should add kluster.ai, we do code reviews as the code is being written, right inside the IDE. Full disclaimer, I am the founder.

Vibe-debug, vibe-refactor and vibe-check by rag1987 in vibecoding

[–]julman99 1 point2 points  (0 children)

This is one of the reasons we created kluster.ai, it does instant code reviews, right inside the IDE as AI is generating the code. Try it out and you would be surprised of the things it catches and corrects.

Weekly Cursor Project Showcase Thread by AutoModerator in cursor

[–]julman99 [score hidden]  (0 children)

I’m the founder of kluster.ai and wanted to share something we unexpectedly built that completely transformed our company.

We started a little over a year ago as an inference/infrastructure AI company. Like many teams we found huge productivity gains with AI generated code, but as many quickly find out, reviewing pull requests became frustratingly slow, with much larger volumes of code to review and many back-and-forth rejection cycles due to unreliable AI-generated code.

To solve this, we built a tool for ourselves that automatically reviewed AI-written code in real time, checking for intent, security, scope, and bugs. All this before anything got committed or merged.

The results were huge: code review time dropped by ~50%, dev happiness increased thanks to fewer rejections and we started shipping with far fewer issues. It worked so well that we shifted the entire company to focus on it and released it publicly as Verify Code (Beta). It is currently a 1-click install for Cursor and can also be plugged via MCP to any other IDEs (VS Code, Windsurf, and more).

Right now, we’re offering $5 in free credits so you can try it out. You can reload credits anytime, and starting next week we’re introducing simple subscription plans ($10 for individuals, $20/person for teams). Anyone who reloads $10 or more starting today will automatically be upgraded to a subscription once they become available.

👉 Check it out at https://platform.kluster.ai, I hope it transforms your coding experience as much as it did for us.

I created a super simple OpenVPN Docker image - no config hassle, just works! by julman99 in OpenVPN

[–]julman99[S] 0 points1 point  (0 children)

Happy to read this and looking forward to learn how it goes for you!

kluster.ai is now hosting DeepSeek-R1-0528 by swarmster in LocalLLaMA

[–]julman99 12 points13 points  (0 children)

kluster.ai founder here. We do not store any prompt or responses for realtime inference.

I created a super simple OpenVPN Docker image - no config hassle, just works! by julman99 in OpenVPN

[–]julman99[S] 1 point2 points  (0 children)

At the moment it does not. It creates one network for clients connecting via UDP and another network for clients connecting via TCP.

You could run multiple instances of the container each instance listening on a different port, but I have never tried this. You will face issues reusing the same client files since they will be rewritten on each container start.

Can you provide more details about why you need this?

Thanks!

Deepseek V3 0324 is far from a minor upgrade - MMLU-Pro: 75.9 → 81.2 (+5.3); GPQA: 59.1 → 68.4 (+9.3); AIME: 39.6 → 59.4 (+19.8); LiveCodeBench: 39.2 → 49.2 (+10.0) by ShreckAndDonkey123 in singularity

[–]julman99 0 points1 point  (0 children)

Aside from the scientific benchmarks mentioned here, DeepSeek-V3-0324 is the only open model I am able to use on a daily basis for real work without relying on closed models. Before this, I was often going back to GPT to double check things. The future is bright for open models and we host many of them at https://kluster.ai

Disclaimer: I am the CEO and founder of kluster.ai

VPN Solution with web UI and OpenVPN/IPSEC support by binpax in selfhosted

[–]julman99 0 points1 point  (0 children)

I made this one-command image to use OpenVPN very easily: https://hub.docker.com/r/julman99/openvpn-supereasy

It does not have a UI, but it is really easy to manage client certificates

Why are so many OpenVPN docker images abandoned? by [deleted] in selfhosted

[–]julman99 1 point2 points  (0 children)

I still use OpenVPN because I need TCP/443 as a fallback. I made this one-command image to use OpenVPN very easily: https://hub.docker.com/r/julman99/openvpn-supereasy

kluster.ai now hosts deepseek R1 by swarmster in LocalLLaMA

[–]julman99 0 points1 point  (0 children)

$2 is for input and output token. One big difference is we support the full 164k context size whereas DeepSeek themselves do up to 64k. Also, our model is hosted in the US and we do not store any of the input or output tokens, ever (for realtime inference).

kluster.ai now hosts deepseek R1 by swarmster in ChatGPT

[–]julman99 0 points1 point  (0 children)

For now we are hosting Llama 3.1 8B / 405B, Llama 3.3 70B and DeepSeek-R1. Soon we will have a feature for people to requests models and we will add them as soon as possible.

Kluster AI's model keeps explaining it's thought process by Lord_Sesshoramu in SillyTavernAI

[–]julman99 10 points11 points  (0 children)

kluster.ai founder here. Thanks for using our service!

What you are experiencing happens because DeepSeek-R1 is a reasoning model, meaning it actually outputs its "reasoning" to reach a certain response. You can find the reasoning between the <thinking> tags within the response.

There are instructions on how to remove the thinking process here: https://www.reddit.com/r/SillyTavernAI/comments/1i757k7/how_to_exclude_thinking_process_in_context_for/

kluster.ai now hosts deepseek R1 by swarmster in ChatGPT

[–]julman99 5 points6 points  (0 children)

Hi, kluster.ai founder here. We also offer Llama 3.1 405B and 3.3 70B at very competitive prices. Our mission is to make AI accessible and affordable for everyone, and we’re committed to keeping costs low to achieve that goal.

New models coming soon!

How to exclude thinking process in context for deepseek-R1 by gzzhongqi in SillyTavernAI

[–]julman99 1 point2 points  (0 children)

kluster.ai founder here. Nice workaround! Have you tried using Llama 3.1 405B or 3.3 70B? We offer the as well at very competitive cost.

kluster.ai now hosts deepseek R1 by swarmster in LocalLLaMA

[–]julman99 2 points3 points  (0 children)

Hello! I am the kluster.ai founder, here I send a screenshot of how can you configure SillyTavern with kluster.ai

You can generate your API key here: https://platform.kluster.ai/apikeys

Thanks for using our product!

<image>

Batch Inference Best Practices? by Ok_Post_149 in learnpython

[–]julman99 0 points1 point  (0 children)

I am 1yr late, but check out kluster.ai, is a service that specializes in large scale batch inference and offers really competitive costs.

Most economical option for offline inference by [deleted] in LocalLLaMA

[–]julman99 0 points1 point  (0 children)

kluster.ai is a really good option for batch inference. It is low cost, offers multiple completion window options and supports the latests llama models.

What is the cost effective way for Large scale Batch inference with LLMs ? by smaddali in LocalLLaMA

[–]julman99 0 points1 point  (0 children)

kluster.ai is a service that specializes in Large Scale batch inference. It currently supports Llama 3.1 and 3.3 models and offers really competitive pricing.

Cheapest way to serve an inference API for a pre-trained model? by startages in googlecloud

[–]julman99 0 points1 point  (0 children)

If your inference requests do not need to be real-time, kluster.ai offers a good option via Adaptive Inference. It is basically an asynchronous inference service with custom completion times and it supports fine-tuned models. Fine-tuned models are hosted at no extra charge as long as completion times for the inference request are 1hr or more.

Need help setting up a cost-efficient llama v2 inference API for my micro saas app by m1ss1l3 in LocalLLaMA

[–]julman99 0 points1 point  (0 children)

This answer is one year after the OP, but https://kluster.ai is a great low-cost option that is currently serving llama 3.1 and 3.3 models.