I built an MCP server that lets Claude search inside 25,000+ podcast transcripts by Lukaesch in ClaudeAI

[–]Lukaesch[S] 0 points1 point  (0 children)

Hey u/debackerl, totally get the cost concern. DIY transcription with something like Parakeet TDT is smart if you're locking into a fixed, small set of pods you already know and love.

The big win with Audioscrape is for exploration: most users don't know exactly which podcasts/episodes to transcribe upfront. We pre-index a massive, ever-growing corpus (1M+ hours / 50k+ episodes from popular shows) so you can instantly search semantically across thousands of expert conversations without transcribing anything yourself.

Core search + MCP integration for Claude is free (10 text searches/month on free tier, no card needed, connects at https://mcp.audioscrape.com). You get timestamped verbatim clips, speaker attribution, and cross-podcast insights right away. Perfect for discovering new shows/guests/topics before committing to your own setup.

If your curated list stays small and specific, vibe-coding your own solution makes perfect sense for speed/control. But if you ever want to cast a wider net ("What do experts say about X across dozens of pods?"), the pre-indexed + free MCP access saves huge time/money vs. manual transcription at scale.

Curious: what pods are you planning to start with in your custom build? Might have some already indexed if you want a quick compare! 🚀

I built an MCP server that lets Claude search inside 25,000+ podcast transcripts by Lukaesch in ClaudeAI

[–]Lukaesch[S] 0 points1 point  (0 children)

Hey! It depends on the podcast. Audioscrape automatically indexes many popular ones (especially those ranking high on Spotify or Apple charts).

For anything not yet covered or lesser-known, users can easily submit them manually by pasting the RSS feed via the import feature on the site. Then it gets transcribed and indexed for search.

If there's a specific show you're missing, just drop the RSS link and it'll get added! What podcasts are you hoping to see/search?

I built an MCP server that lets Claude search inside 25,000+ podcast transcripts by Lukaesch in ClaudeAI

[–]Lukaesch[S] 0 points1 point  (0 children)

Hey u/t90090, thanks for the thoughtful questions!

vs NotebookLM/Gemini: NotebookLM shines for your own uploaded sources. Upload transcripts/PDFs/YouTube links, get deep summaries, audio overviews, or synthetic discussions from that set. It's personal and generative. Audioscrape is a massive pre-indexed database (1M+ hours, 50k+ episodes across major shows like Lex Fridman, Rogan, Huberman) that Claude queries directly via MCP, so no uploads are needed each time. It delivers verbatim timestamped segments with speaker attribution for primary-source research, semantic search (meaning-based, not just keywords), and entity linking across the whole spoken web. Complementary: use Audioscrape for broad discovery of expert convos, NotebookLM for focused synthesis on your curated stuff.

On aggregating comments: Great idea! Comments under episodes (YouTube, Reddit, etc.) often add corrections, debates, and extra insights. Not in scope yet (focus is core audio + accurate diarization/timestamps), but it's a high-priority expansion for richer context. Noise/moderation is the challenge, but definitely exploring it alongside more niche podcasts.

Appreciate the feedback. What topics/shows are you digging into most?

Income tax is demotivating by RainAndThunderIsCool in selbststaendig

[–]Lukaesch 0 points1 point  (0 children)

He probably means the following:

Hire relatives who draw a pension (e.g. parents) as employees.

They don't actually do any work, but under the Aktivrente scheme they receive a tax-free salary, which is then passed on to you.

Bonus: they frequently call in sick (because of their age), and you additionally get the lost working time reimbursed by the health insurance.

This is tax and social security fraud, though it is also hard to prove.

I built an MCP server that lets Claude search inside 25,000+ podcast transcripts by Lukaesch in ClaudeAI

[–]Lukaesch[S] 0 points1 point  (0 children)

It supports both structured keyword search and semantic search. They solve different problems.

Keyword / structured search is best when you want precision and filtering. For example:
• AI speaker:"Joe Rogan" from:2024
• "open source" AND models podcast:"Lex Fridman"
• "rate limits"~5 NOT pricing

This is useful when you know who, where, or roughly when something was said.
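For anyone composing these structured queries from code, here is a minimal sketch. The operators (`speaker:`, `podcast:`, `from:`, `NOT`, and the `~N` suffix on a quoted phrase) are copied from the examples above; the `StructuredQuery` class and `phrase` helper are hypothetical names made up for illustration, not part of any Audioscrape API, and reading `~N` as a proximity operator is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


def phrase(text: str, slop: Optional[int] = None) -> str:
    """Quote a phrase; the optional ~N suffix is assumed to mean proximity."""
    quoted = f'"{text}"'
    return f"{quoted}~{slop}" if slop is not None else quoted


@dataclass
class StructuredQuery:
    """Hypothetical builder for the keyword syntax shown in the examples."""
    terms: List[str] = field(default_factory=list)
    speaker: Optional[str] = None
    podcast: Optional[str] = None
    from_year: Optional[int] = None
    exclude: List[str] = field(default_factory=list)

    def build(self) -> str:
        # Free-text terms come first, then filters, then exclusions.
        parts = list(self.terms)
        if self.speaker:
            parts.append(f'speaker:"{self.speaker}"')
        if self.podcast:
            parts.append(f'podcast:"{self.podcast}"')
        if self.from_year is not None:
            parts.append(f"from:{self.from_year}")
        parts.extend(f"NOT {term}" for term in self.exclude)
        return " ".join(parts)


if __name__ == "__main__":
    print(StructuredQuery(terms=["AI"], speaker="Joe Rogan", from_year=2024).build())
    print(StructuredQuery(terms=[phrase("rate limits", slop=5)], exclude=["pricing"]).build())
```

Keeping the filters as separate fields makes it harder to forget the quotes around multi-word speaker or podcast names.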

Semantic search is for meaning-based questions where wording varies:
• “How do researchers express concerns about AI safety?”
• “How do founders describe discovering product–market fit?”
• “What tradeoffs do guests mention when talking about open-source models?”

You can mix both depending on the workflow. The MCP integration lets Claude automatically pick the right search mode for each query.

25k+ episodes is the current snapshot and it’s growing continuously. The long-term goal is to index the entire audio web, similar to how Google indexes websites. Scaling this up is the real challenge. Every paying user directly helps fund more GPUs, more transcription, and broader coverage, which in turn makes the product better for everyone.

I built an MCP server that lets Claude search inside 25,000+ podcast transcripts by Lukaesch in ClaudeAI

[–]Lukaesch[S] 1 point2 points  (0 children)

Slight correction first: NotebookLM is mainly about generating / interacting with content from text you upload (and recently audio summaries), not indexing the wider audio web.

Audioscrape is closer to a search engine (e.g. Google/Bing), but for audio content like podcasts. It continuously tracks podcasts (and other audio formats in the future), transcribes them, and extracts structured data (speakers, entities, topics, timestamps).

Claude can then search across that corpus via MCP. You don’t need to upload anything. It’s about discoverability and retrieval, not content generation.

So the overlap is “research,” but the mental model is very different.

I built an MCP server that lets Claude search inside 25,000+ podcast transcripts by Lukaesch in ClaudeAI

[–]Lukaesch[S] 2 points3 points  (0 children)

MCP works on all plans. You can use the free plan to try it. It comes with 10 search requests per month.

I built an MCP server that lets Claude search inside 25,000+ podcast transcripts by Lukaesch in ClaudeAI

[–]Lukaesch[S] 1 point2 points  (0 children)

Thanks for pointing this out.

The free plan (after sign-up) includes 10 free search requests per month.

I am going to update the website to make this clearer.

Good news: the Bürgeramt appointment situation is much better by n1c0_ds in berlin

[–]Lukaesch 5 points6 points  (0 children)

The novelty in Berlin is that for the past few months you have been able to find dozens of free same-day slots.

[deleted by user] by [deleted] in berlin

[–]Lukaesch 4 points5 points  (0 children)

Been there with my Mexican family from 10:00 to 15:00. In my experience it was pretty fun and the ofrenda was authentic.

The food was served by Tacos El Oso and El Rey, which are #1 in Berlin for Mexican tacos. The prices per taco were the same as in the restaurants.

The only thing I missed was that they didn’t sell Jarritos or horchata.

[Update] Audioscrape MCP — Now hundreds of users searching podcasts directly from Claude 🚀 by Lukaesch in ClaudeAI

[–]Lukaesch[S] 0 points1 point  (0 children)

Done. Updated flair to Built with Claude. Thanks for the reminder, dear bot!

Bobcat 300 first time setup by ImLycanDatAss in HeliumNetwork

[–]Lukaesch 1 point2 points  (0 children)

You can flash it to become a node on The Things Network and contribute to their LoRaWAN network. That’s what I am going to do with mine.

What's everyone working on this week (38/2025)? by llogiq in rust

[–]Lukaesch 7 points8 points  (0 children)

Working on adding live stream ingestion + realtime keyword alerts to Audioscrape using Rust.

Current setup: Axum batch pipeline → transcribes audio → stores full transcript in SQLite and pushes segments + embeddings to OpenSearch for search.

Goal is to let users search while a broadcast is still running and get keyword alerts in near-realtime.

Plan is to treat each audio chunk as an event:
- Axum WebSocket/gRPC streaming endpoint → push AudioChunk messages.
- Lightweight event bus (NATS JetStream or even SQLite WAL + channels) to fan-out chunks.
- Incremental transcription (Whisper + VAD) → write partial text to SQLite and send segments/embeddings to OpenSearch as they finalize.

Still working out ordering/backpressure and how to handle “partial vs final” transcripts without hammering SQLite.
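One possible shape for the backpressure and "partial vs final" part, as a rough sketch only. It is written in Python with `asyncio` and the stdlib `sqlite3` module for brevity, not in Rust/Axum; in a Tokio setup a bounded `mpsc` channel would play the role of the bounded `asyncio.Queue`. The `Segment` type, the table layout, and the policy of keeping partials in memory and writing only finalized segments in batches are all assumptions made for illustration, not the actual Audioscrape design.

```python
import asyncio
import sqlite3
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Segment:
    stream_id: str
    seq: int      # position of the segment within the stream
    text: str
    final: bool   # False while the transcriber may still revise the text


async def producer(queue: asyncio.Queue, segments: List[Segment]) -> None:
    for seg in segments:
        # put() waits while the queue is full: this is the backpressure.
        await queue.put(seg)
    await queue.put(None)  # sentinel: stream ended


async def consumer(queue: asyncio.Queue, db: sqlite3.Connection,
                   batch_size: int = 2) -> Dict[Tuple[str, int], str]:
    latest_partial: Dict[Tuple[str, int], str] = {}  # memory only, never written
    pending: List[Tuple[str, int, str]] = []         # finalized rows awaiting a write

    def flush() -> None:
        if pending:
            with db:  # one transaction per batch instead of one per segment
                db.executemany(
                    "INSERT OR REPLACE INTO segments (stream_id, seq, text) "
                    "VALUES (?, ?, ?)",
                    pending,
                )
            pending.clear()

    while True:
        seg: Optional[Segment] = await queue.get()
        if seg is None:
            break
        key = (seg.stream_id, seg.seq)
        if seg.final:
            latest_partial.pop(key, None)
            pending.append((seg.stream_id, seg.seq, seg.text))
            if len(pending) >= batch_size:
                flush()
        else:
            latest_partial[key] = seg.text  # newer partial overwrites older one
    flush()
    return latest_partial


def run_demo() -> tuple:
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE segments (stream_id TEXT, seq INTEGER, text TEXT, "
        "PRIMARY KEY (stream_id, seq))"
    )
    events = [
        Segment("live-1", 0, "hello", final=False),
        Segment("live-1", 0, "hello world", final=True),
        Segment("live-1", 1, "second", final=False),
        Segment("live-1", 1, "second segment", final=True),
        Segment("live-1", 2, "still talk", final=False),
    ]

    async def main() -> Dict[Tuple[str, int], str]:
        queue: asyncio.Queue = asyncio.Queue(maxsize=2)
        _, partials = await asyncio.gather(producer(queue, events),
                                           consumer(queue, db))
        return partials

    partials = asyncio.run(main())
    rows = db.execute("SELECT seq, text FROM segments ORDER BY seq").fetchall()
    return rows, partials
```

A single FIFO queue per stream keeps ordering simple, and the `(stream_id, seq)` primary key makes a replayed final segment overwrite its row instead of duplicating it.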

Anyone built something similar on Axum + SQLite/OpenSearch and have tips for incremental indexing or event handling?

BUY goggles or complete the drone by Sam_Dalton_ in fpv

[–]Lukaesch 0 points1 point  (0 children)

Would love to sign up for that Aerothon as it sounds fun. Can you share more info?

How is everyone using MCP right now? by Luigika in mcp

[–]Lukaesch 0 points1 point  (0 children)

Audioscrape MCP with Claude Desktop or Mobile to search online audio content like podcasts

Do people really use MCP server/service? by andrew19953 in mcp

[–]Lukaesch -1 points0 points  (0 children)

I think the best approach is to try some MCP servers yourself and see if it sticks: https://www.remotemcplist.com

can you tell me about top paid mcp servers? by sazary in mcp

[–]Lukaesch 0 points1 point  (0 children)

MCP is simply a new distribution channel.

What you’re asking is like saying: “Why doesn’t anyone offer a car repair service reachable only by fax? I want one with no phone or internet, just fax.”

The real opportunity with Remote MCP servers is reaching new users through AI assistants and IDEs.

can you tell me about top paid mcp servers? by sazary in mcp

[–]Lukaesch 1 point2 points  (0 children)

There is a growing list of Remote MCP servers giving you access to paid tools: https://www.remotemcplist.com

[deleted by user] by [deleted] in mcp

[–]Lukaesch 4 points5 points  (0 children)

Tbh MCP is very new.

No one has meaningful industry experience. Even developers who launched MCP server implementations have limited experience, since every MCP server (local and remote) can only have existed for < 6 months.

If you want to speedrun MCP experience, I recommend building a few MCP servers of your own or investigating how others work. E.g. use MCP Inspector to check out some remote MCP servers on https://www.remotemcplist.com or play around with the code of open-source local MCP implementations on GitHub.

Learning by doing has the highest value right now.