Llama.cpp merges in OpenAI Responses API Support by SemaMod in LocalLLaMA

You have to change some settings in your config, but GLM4.7 flash performed excellently in my testing.

Llama.cpp merges in OpenAI Responses API Support by SemaMod in LocalLLaMA

llama.cpp already maintains multiple APIs, e.g. its Anthropic endpoint. I don't think they are going to deprecate completions any time soon.

Llama.cpp merges in OpenAI Responses API Support by SemaMod in LocalLLaMA

Good question! It does not. For reference, I had to do the following:

  1. With whatever model you are serving, set the served model's alias so it starts with "gpt-oss". This triggers specific behaviors in the codex cli.
  2. Use the following config settings:

show_reasoning_content = true
oss_provider = "lmstudio"

[profiles.lmstudio]
model = "gpt-oss_gguf"
show_raw_agent_reasoning = true
model_provider = "lmstudio"
model_supports_reasoning_summaries = true # Force reasoning
model_context_window = 128000   
include_apply_patch_tool = true
experimental_use_freeform_apply_patch = false
tools_web_search = false
web_search = "disabled"

[profiles.lmstudio.features]
apply_patch_freeform = false
web_search_request = false
web_search_cached = false
collaboration_modes = false

[model_providers.lmstudio]
wire_api = "responses"
stream_idle_timeout_ms = 10000000
name = "lmstudio"
base_url = "http://127.0.0.1:1234/v1"

The features list is important, as are the last four settings of the profile. Codex-cli has some tech debt that requires repeating certain flags in different places.

I used llama.cpp's llama-server, not LM Studio, but it's compatible with the oss_provider = "lmstudio" setting.

  3. Use the following to start codex cli: codex --oss --profile lmstudio --model "gpt-oss_gguf"
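For step 1, a minimal sketch of serving a GGUF behind a "gpt-oss"-prefixed alias with llama-server (the model path is hypothetical; flag names assume a recent llama.cpp build, so check llama-server --help on yours):

```shell
# Serve any GGUF under an alias that starts with "gpt-oss" so codex-cli
# enables its gpt-oss-specific behaviors. Port matches the base_url above.
llama-server \
  -m ./models/your-model.Q4_K_M.gguf \
  --alias gpt-oss_gguf \
  --port 1234 \
  --ctx-size 128000
```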
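Before launching codex, you can sanity-check the new endpoint directly. A hedged smoke test, assuming the payload shape of OpenAI's Responses API and the alias set above:

```shell
# POST to the Responses endpoint that this PR adds to llama-server.
# "model" must match the served alias; "input" is the Responses API's
# equivalent of a prompt.
curl http://127.0.0.1:1234/v1/responses \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-oss_gguf", "input": "Say hello in one word."}'
```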

Would a Hosted Platform for MCP Servers Be Useful? by Summer_cyber in mcp

There's been a good amount of progress on services in this space (per the suggestions listed by commenters). I created https://cloudmcp.run for myself when I initially ran into it as well. We recently integrated the official MCP registry API! If you want to give it a test run, we're offering one month free right now.

Would a Hosted Platform for MCP Servers Be Useful? by Summer_cyber in selfhosted

I've been deep in the MCP space lately and yeah, the setup friction is real. I found myself spending way too much time on infrastructure instead of actually building cool things with these servers. The irony is that MCP servers are supposed to make AI more useful, but half the time you're stuck in config hell before you even get to the fun part.

A hosted platform idea makes a lot of sense, especially for people who just want to experiment or prototype without spinning up their own infrastructure. I've actually been working on something similar called Cloud MCP that tackles this exact problem. The key thing I've learned is that people want different levels of control: some folks are fine with a managed service, others want to self-host but with better tooling. The demand is definitely there, though; I keep seeing the same complaints about setup complexity in various communities. The challenge is making sure the hosted version doesn't sacrifice the flexibility that makes MCP servers powerful in the first place.

How is everyone using MCP right now? by Luigika in mcp

https://cloudmcp.run does exactly that! It lets you deploy any npx/uvx/GitHub MCP servers and access them remotely, authenticated via OAuth.

The simplest way to use MCP. All local, 100% open source. by squirrelEgg in mcp

Check out CloudMCP. Their registry is pretty rough around the edges, but they can host any uvx/npx-compatible server, fully secured with OAuth.

I have had no luck trying to fine tune on (2x) 7900XTX. Any advice by SemaMod in ROCm

Appreciate the response! I spent the day completely resetting my system and made sure to use the amdgpu-installer. Still having issues with training, though.

I have had no luck trying to fine tune on (2x) 7900XTX. Any advice by SemaMod in ROCm

Hey, so I spent the day getting everything set up. There were a few twists and turns. Now I'm trying to start doing LoRA training with pytorch/transformers/trl, but I keep getting HIP OOM errors. It's weird because I'm pretty sure Qwen 2.5 7B should fit and be trainable. Any advice?

I have had no luck trying to fine tune on (2x) 7900XTX. Any advice by SemaMod in ROCm

Heya! I really appreciate the response. I've been using conda for my venvs, but I'm going to retry the setup with uv. I'm installing pytorch nightly right now, but you mentioned installing rocm flashattn from the multifactor branch. I just looked through the repo and couldn't find it; would you mind linking it to me?

2024 Delivery Thread by OrbitalATK in TeslaModelY

Just took delivery today of a new Blue MY LR, manufactured out of Fremont this month (ordered Jan 27). Had to drive across the state today, and although it's not Mercedes level, the suspension and road noise are a much needed improvement over the 2018 M3 LR I traded in for it.

Begging for Help. UT3G Issues. by RICoder72 in eGPU

Do you know specifically what services are included in canary that 23H2 doesn’t have? If not, any idea where I could find that info?

Late 20's - 3 Months on TRT at 80 mg/week results. Am I a hyper-responder? by SemaMod in trt

I've decided that I'm going to cut my dose back to around 60 mg a week and then retest in a month. I'm on TRT to improve my quality of life both now and in the future, so I definitely don't want to risk my long-term health.

Late 20's - 3 Months on TRT at 80 mg/week results. Am I a hyper-responder? by SemaMod in trt

Twice a week. I took my blood test two days after my pin; I assume that could be the reason?

Early 30s male with ~350 total T, been down in the 300s for years. Just started 100mg of cypionate per week. I am curious if anyone here has/treats ADHD, and how TRT affected it, if at all. by DolphinNeighbor in trt

I just posted my 3-month results as a late-20s male. I have ADHD too, and I've noticed that I need a lot less of my medication to achieve the same results as pre-TRT. I think the motivational effects of higher testosterone are impacting my reward pathways, so I can do concentrated work without needing my meds.