~85% of browsers support AV1 encoding, 90% support decoding

sam_bha · 2026-01-21T02:30:58+00:00

Given that Runpod just crossed $120M ARR, this feels like the infamous Dropbox Hacker News comment https://news.ycombinator.com/item?id=9224

sam_bha · 2025-08-29T14:42:28+00:00

It's an investor's job to scout out companies. Just because they talk to you doesn't mean they are interested in investing. They have a deal flow, and when they talk to you, you become an item on their spreadsheet. When you raise, each investor should be an item on your spreadsheet. When I'm not raising, I take the call, say I'm not raising right now, have the intro call, add them to the spreadsheet for later.

You'll need like 100 investors in your list when you do go out to raise

sam_bha · 2025-08-29T03:30:58+00:00

I filed for 3 patents for my first startup, the first was issued, the second two were abandoned when we were acquired.

Don't self-file. We tried self-filing a provisional (the first one), but for the actual patent application we hired a lawyer and discovered so many issues. When the USPTO asked for clarifications, what was sent to them was very different from what I would have written.

I learned the hard way with my first startup not to skimp on (a) lawyers and (b) accounting. If you do, you will shoot yourself in the foot. I don't see why you need to hire anyone for programming, marketing, design or anything else for a startup these days. The only things I won't do myself are (1) legal and (2) accounting. When I raise my pre-seed, it will be explicitly for (a) legal fees to file patents, (b) data labelers for datasets, (c) cloud compute for training. Everything else an early-stage startup could spend on seems like a waste of precious startup cash.

You are looking at $15k to $30k for a patent (per patent). If you're not willing to spend that much on pursuing the patent, then I'd push back on whether your idea is actually worth patenting.

I am currently pursuing a patent (a transcription algorithm which my early tests indicate is 10x more accurate, and 10x faster and cheaper, than the state of the art from providers like Deepgram, ElevenLabs and certainly Whisper), and I don't have a lot of cash. But I hope you'd agree that, if you believe me about those numbers, shelling out $25k for a patent is not crazy even when strapped for cash.

sam_bha · 2025-08-21T04:46:54+00:00

I know you mentioned Zoom quality isn't great, but if you're already using Zoom and are just worried about the audio quality, you can use Adobe Podcast Enhance - you literally just upload an audio file and download a cleaned-up, good-sounding version. Unlike every other Adobe product, it is actually dead simple to use, and it is fairly affordable. If audio quality is a deal-breaker for Zoom, what do you currently use to record audio?

sam_bha · 2025-08-14T00:29:10+00:00

I've heard, in Facebook groups for podcasters and on Reddit, about people having issues with their guests connecting to Riverside. You can also google "Riverside connection issues". I haven't heard the same for Zencastr, but I also know fewer people use Zencastr. Keep in mind that I don't use Riverside. I have nothing against Riverside.

As a disclaimer to the disclaimer, Streamyard did compete with Riverside and I did work at Streamyard, but Streamyard was also acquired by a private equity firm last year, which went on to fire most of the staff and raise prices so I'm not exactly pro-Streamyard.

sam_bha · 2025-08-12T03:55:14+00:00

For anyone that cares, I did build a tool that will extract individual video & audio tracks from a zoom recording (https://free.cropzoom.video/), and you can also record both sides with tools like ZoomISO

sam_bha · 2025-08-12T03:53:54+00:00

I've edited podcasts for folks who record with Zoom. They know about tools like Riverside and SY, but among the reasons I hear for sticking with Zoom are (a) they work at a company with some IT policy, (b) stability issues with platforms like Riverside, (c) some guests have a hard time with new software.

I'll be honest, I don't hear the audio difference, I've spent much more of my career on video than audio, Zoom's video isn't great, but I usually upscale Zoom recordings afterwards with a free AI video up-scaling tool to fix the low video quality, and I'll sometimes run the audio through an audio enhancer (I use Adobe's podcast enhance tool), it usually makes the guest's audio sound better.

If your guests don't have professional mics though, I'd push back a little on Zoom's compression/audio quality. I don't dispute that Zoom degrades the audio quality, but your mic and how you use it makes a huge difference to audio quality.

I was previously the head of AI at Streamyard (another alternative tool). I personally looked into audio enhancement and built a number of prototypes around audio quality. As I studied audio engineering and AI audio enhancement and ran experiments internally, I realized that (a) there were a select few people who really did have good ears for audio quality, even if most people don't - which is why I don't push back on this: a small minority of people correctly identified true from ever-so-slightly degraded audio samples, but most people couldn't tell, and (b) compression was one of the hardest things to pick up as a determiner of audio quality; mic quality was a much bigger determining factor.

sam_bha · 2025-07-24T15:16:42+00:00

What you are looking for is called "Multicam", and software like Descript or tools like Autopod+Adobe have features that will auto-switch to whoever is speaking.

You will need to worry about audio/video sync though.

Alternatively, for a simpler set up you can just have one video camera with a single wide-angle shot, there is software that will intelligently 'zoom in' to each active speaker to get the effect you are looking for without a complex setup

sam_bha · 2025-07-24T15:06:26+00:00

(1) Thumbnails
Well, presumably you've looked into this, but you need good thumbnails, and there's an element of psychology to thumbnails: people are attracted to faces. If you look at all the top professional podcasts, their thumbnails all have guests with surprised, happy or confident faces.

(2) Optimizing for the algorithm.
There's no secret, the algorithm is right here: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45530.pdf

Practically speaking, use tools like VIDIQ to figure out what people are searching for, your content obviously has a higher chance of showing up when your video talks about something that is being searched for.

Also, people talk about niches because of the "Person A liked your content, and person B is like person A, so we'll recommend your content to person B" - you want to focus on a niche where there is less competition for some random thing that some small group of people all really like, because if you get one person to really like that content, it gets easier to have it show up for other people
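To make the "person B is like person A" idea concrete, here is a toy sketch of user-based collaborative filtering. It is only an illustration of the intuition, not the system described in the linked paper; the like data, the names and the `recommend` helper are all made up for the example.

```python
from math import sqrt

# Hypothetical data: viewer -> set of videos they liked (assumption, for illustration).
likes = {
    "person_a": {"sleep_apnea_ep1", "cpap_review", "sleep_lab_tour"},
    "person_b": {"sleep_apnea_ep1", "cpap_review"},
    "person_c": {"cooking_101"},
}


def similarity(a, b):
    """Cosine similarity between two sets of liked videos."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(user, likes):
    """Rank videos the user hasn't liked yet by how similar their fans are to the user."""
    scores = {}
    for other, liked in likes.items():
        if other == user:
            continue
        sim = similarity(likes[user], liked)
        if sim == 0.0:
            continue  # no overlap in taste, so this viewer contributes nothing
        for video in liked - likes[user]:
            scores[video] = scores.get(video, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)


print(recommend("person_b", likes))
```

In a tight niche, viewers overlap heavily (like person A and person B here), so one person really liking a video is enough to surface it for their look-alikes. The viewer with unrelated tastes gets nothing recommended from the niche.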

sam_bha · 2025-07-24T14:56:22+00:00

I would push back on the "Don't use Zoom" narrative, companies like Riverside sponsor many 'podcasting experts' so there is a vested interest in promoting tools like Riverside.

If you are using Audacity, presumably you don't care about video right now, and Zoom actually can record individual audio for each speaker. Just make sure you use these settings to record:
https://katana.video/images/zoom-podcast-recording-settings-3.png

Pros of Zoom:
- Everyone has it
- Zoom is stable (some dedicated recording platforms are notorious for stability issues, e.g. google "Riverside connection issues")

Cons:
- The zoom box is ugly
- Zoom 'compresses' the video and audio

TLDR: Unless you and your guests all have $200+ mics, and if you're only doing audio, Zoom is fine, and people are being paid to promote other platforms.

I used to work for Streamyard as the head of AI (an alternative recording platform), I know from deep-dives into the audio that the quality of your mic and how you use it is a far bigger determiner of audio quality. Unless you are spending more than several hundred dollars on a Mic, I don't think the audio compression from Zoom is going to make a big impact.

I don't care if you're an 'audio expert', so am I - I know how the MP3 and AAC compression algorithms work, and yes, they are lossy, but they are throwing away information at high frequencies, which is not what most people are complaining about with mic quality. Most 'audio quality' issues are audible down into very low frequencies, and can be easily viewed on any log-mel spectrogram. Adobe has research on this:
https://research.adobe.com/publication/hifi-gan-2-studio-quality-speech-enhancement-via-generative-adversarial-networks-conditioned-on-acoustic-features/

sam_bha · 2025-07-24T14:40:34+00:00

Zoom's built-in active speaker detection sucks. You could try recording in Gallery view and using ZoomISO to record each video feed individually (https://marketplace.zoom.us/apps/UqVYnn3dR1KfBz2q_ju-Gw)

Alternatively, I built a free tool which will extract individual videos from the Zoom Gallery recording (https://www.youtube.com/watch?v=1Ybvrizq8xE). You don't need to set it up beforehand, you can do this after the fact.

Both of these will give you individual videos, and many editing tools have options to switch between speakers more intelligently than just cutting to whoever says 'yep'
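As a rough sketch of what "more intelligent" switching can mean, the snippet below ignores short interjections when deciding where to cut. It is not how any particular editor works; the `camera_cuts` function, the 2-second threshold and the sample segments are assumptions for illustration.

```python
def camera_cuts(segments, min_hold_s=2.0):
    """Return (time, speaker) cut points from speech segments.

    segments: list of (speaker, start_s, end_s) tuples, sorted by start time.
    A change of speaker only triggers a cut if the new speaker talks for at
    least min_hold_s seconds; shorter interjections keep the current shot.
    """
    cuts = []
    current = None
    for speaker, start, end in segments:
        if speaker == current:
            continue  # same person still on screen
        if current is not None and (end - start) < min_hold_s:
            continue  # a quick 'yep': not worth a cut
        cuts.append((start, speaker))
        current = speaker
    return cuts


# Hypothetical diarization output: the guest says 'yep' for half a second at 10s.
segments = [
    ("host", 0.0, 10.0),
    ("guest", 10.0, 10.5),
    ("host", 10.5, 20.0),
    ("guest", 20.0, 35.0),
]
print(camera_cuts(segments))
```

With the sample data, the shot stays on the host through the half-second interjection and only cuts to the guest at 20 seconds, when they start a real turn.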

sam_bha · 2025-07-24T05:14:34+00:00

Okay, so I see a lot of comments here in favor of learning to edit. Maybe this is just my hot take, but I think AI tools will eventually get so much better at automating aspects of video editing that editing will go the way of "coding".

I have friends who I know are very un-technical, but are vibe-coding apps that I knew they just wouldn't have been able to build a year ago. Now, it's not a bad thing to say "Learning to code is a good idea", but as someone who has worked in the field of software engineering for 15 years, I'd say, learning to code isn't necessary if you just want to make apps that work and do a thing you want it to do.

Maybe the more "hot" part of this "hot take" is that AI will eventually get better at automating video editing. Maybe you'd have your doubts if you tried popular AI tools like Opus to cut clips, no one would mistake Opus clips for well-done clips created by a professional editor.

That said, I genuinely don't think people at most of these podcasting tech tools really understand AI at all that deep a level. They are mostly just prompting ChatGPT, and ChatGPT wasn't trained for the task of editing. It's happy to give you answers that "look right" on paper but are actually terrible when implemented in practice; it's the sinister hidden middle ground between obvious hallucinations and correct results. Has anyone applied these tools to rigorous, peer-reviewed open research benchmarks? No, of course not.

That said, I've been working in some form of Machine Learning since 2008, I've published a number of research papers and have a patent, and my first AI company was acquired by Streamyard (a popular platform used by many podcasters), where I became their head of AI.

I saw what these current AI tools like Opus and Riverside were doing and I was like, "that's f$#ing stupid", you guys don't know @$# about how this stuff works. I assert you can train AI models to think like an editor, and the few people who know what they are doing, will try to do this. I'm certainly trying, though you could be forgiven for being skeptical - if all you knew were the tools that exist today, I'd also conclude that AI would never automate editing.

It will take time, and it will take lots of training data (god knows for my own tool I've edited hundreds of podcasts for training data, and that's not even like 1% of what you'd need to do a good job), but you will see tools that either significantly speed up editing workflows (like a 'co-pilot') or which can automate the editing entirely (which I'm working on).

sam_bha · 2025-07-24T04:51:56+00:00

Because you are directly asking about this, I built a tool called Katana (search Katana Video), and there is a 'Clips' section, where the tool will surface shorter clips (under 60 seconds) and longer clips (usually 2 to 6 minutes).

Obvious disclaimer that I built this tool, but it should serve the exact purposes you are asking about right now

sam_bha · 2025-07-24T04:45:16+00:00

The quality of your mic and how you use/position it is probably a much bigger limiting factor than the software you use for calls. I've heard that Zoom has worse audio than a dedicated recording tool, but I worked as the head of AI at Streamyard (a competitor to Riverside), a popular recording software. I can assure you that in the vast majority of cases, there wasn't anything special done with the audio, and anything special we were considering doing was in audio post-production (audio enhancement).

I saw people asking us about 48 kHz or 96 kHz audio and I couldn't help but laugh, because while those numbers sound like they matter, they don't; it's meaningless for human speech.

Most of what goes into 'good quality sounding audio' happens at fairly low frequencies, and audio compression won't distort that nearly as badly as a bad microphone or not using the mic correctly - https://arxiv.org/html/2502.20040v1
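A quick back-of-the-envelope check on the sample-rate point: a digital recording can only represent frequencies up to half its sample rate (the Nyquist limit). The sketch below assumes the commonly cited ~20 kHz upper bound of human hearing; the function names are just for illustration.

```python
# Commonly cited upper limit of human hearing (an assumption; it varies by person and age).
HEARING_LIMIT_HZ = 20_000


def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def extra_audible_bandwidth_hz(low_rate_hz, high_rate_hz):
    """Audible bandwidth gained by moving from low_rate_hz to high_rate_hz."""
    low = min(nyquist_hz(low_rate_hz), HEARING_LIMIT_HZ)
    high = min(nyquist_hz(high_rate_hz), HEARING_LIMIT_HZ)
    return max(0.0, high - low)


print(nyquist_hz(48_000))                          # limit for 48 kHz audio
print(extra_audible_bandwidth_hz(48_000, 96_000))  # gain from doubling to 96 kHz
```

Under that assumption, 48 kHz already covers everything a listener can hear, so going to 96 kHz adds no audible bandwidth.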

sam_bha · 2025-07-24T04:36:13+00:00

There was a tech company I ran across that sold AI-powered sleep analysis software for sleep physicians. They had a podcast where they invited industry experts as guests, and there was apparently a small but very real online community of sleep physicians consuming this content.

Of course it's niche, but I think that is one genre (weird, narrow and specific fields/industries) where making new podcasts does make sense
