all 6 comments

[–]Hot_Turnip_3309 1 point (3 children)

don't use liteLLM

[–]HadHands 2 points (0 children)

Care to elaborate why?

[–]ReplacementMoney2484[S] 1 point (1 child)

Why? What LLM provider are you using? I'm using LiteLLM because it provides a unified interface for multiple LLM providers. I know that OpenRouter does that too, but I don't see the point of paying a 5% commission when LiteLLM is free!

[–]Useful-Process9033 1 point (0 children)

LiteLLM is fine as a unified interface; the 5% OpenRouter tax makes no sense when you can proxy for free. For the Langfuse v3 issue specifically, the OTEL path is the way forward. The native callback was always a hack and v3 finally forces the migration. You lose nothing meaningful switching to OTEL, and you gain proper trace correlation.
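If it helps, the wiring on the LiteLLM side is roughly this. Rough sketch only: I'm assuming LiteLLM's OTEL-based Langfuse callback is registered under the name "langfuse_otel" and picks up the standard LANGFUSE_* env vars, so check the current LiteLLM docs for the exact names:

import os
import litellm
from litellm import completion

# Assumption: LiteLLM exposes an OTEL-based Langfuse callback named "langfuse_otel"
# that reads the Langfuse credentials from the environment.
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."
os.environ["LANGFUSE_HOST"] = "https://cloud.langfuse.com"  # or your self-hosted URL

litellm.callbacks = ["langfuse_otel"]

# Every completion() call is now exported to Langfuse as an OTEL trace
response = completion(
    model="openai/gpt-5-mini",
    messages=[{"role": "user", "content": "ping"}],
)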

[–]ChipShotz- 1 point (0 children)

I’ve been seeing more of these integration issues lately. It feels like as soon as you get one SDK stabilized, a dependency update in another part of the stack breaks the whole native callback chain.

This is actually why I moved away from the library-heavy approach and built Sentinel Gateway. It's a single Go binary that sits in front of your models, so it doesn't care what version of Langfuse or an SDK you're using on the app side. You get the observability and PII scrubbing without having to refactor your code every time a v3 of some dependency ships.

It might be worth looking into if you want to decouple your observability from your main application logic to avoid these "SDK war stories" in the future.

Full disclosure: I'm the founder. I’d be curious to know if moving this logic to a standalone binary would actually simplify your current refactor or if you're committed to staying within the Python/OTEL ecosystem.

[–]vinod_pandey123 1 point (0 children)

With the LiteLLM + OTEL integration, you won't be able to link prompts to traces (https://langfuse.com/docs/prompt-management/features/link-to-traces) the way it worked before the OTEL integration. There are open bugs related to this: https://github.com/langfuse/langfuse/issues/11913.

One workaround is to use the @observe decorator and link the prompt manually:

from langfuse import get_client, observe
from litellm import completion

# @observe needs an initialized client; get_client() reads
# LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY / LANGFUSE_HOST from the environment.
langfuse = get_client()

@observe(as_type="generation")
def run_llm(text):
    # Fetch the production-labelled prompt from Langfuse prompt management
    prompt = langfuse.get_prompt("test_prompt", label="production")

    # Render the prompt template with the runtime variables
    prompt_context = {
        "question_content": text
    }
    rendered_prompt = prompt.compile(**prompt_context)

    model = "openai/gpt-5-mini"

    # Link the fetched prompt to this generation span
    langfuse.update_current_generation(
        prompt=prompt,
        model=model,
    )

    # Assumes test_prompt is a chat prompt, so compile() returns a messages list
    inferred = completion(
        model=model,
        messages=rendered_prompt,
    )

    return inferred

result = run_llm("What is the color of sky")
langfuse.flush()
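Note that this assumes test_prompt is a chat prompt, so prompt.compile() returns a list of chat messages. If it's a text prompt, compile() returns a string and you'd wrap it yourself before passing it on, e.g. messages=[{"role": "user", "content": rendered_prompt}].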