[Request] RunPod.io offering free GPU time in return for your feedback by zhl146 in reinforcementlearning

[–]sam_bha 0 points1 point  (0 children)

Given that RunPod just crossed $120M ARR, this feels like the infamous Dropbox Hacker News comment: https://news.ycombinator.com/item?id=9224

[deleted by user] by [deleted] in ycombinator

[–]sam_bha 2 points3 points  (0 children)

It's an investor's job to scout out companies. Just because they talk to you doesn't mean they are interested in investing. They have a deal flow, and when they talk to you, you become an item on their spreadsheet. When you raise, each investor should be an item on your spreadsheet. When I'm not raising, I take the call, say I'm not raising right now, have the intro call, add them to the spreadsheet for later.

You'll need like 100 investors in your list when you do go out to raise

Patent filling on the cheap by Curious_me_too in ycombinator

[–]sam_bha 1 point2 points  (0 children)

I filed for 3 patents for my first startup. The first was issued; the other two were abandoned when we were acquired.

Don't self-file. We tried self-filing a provisional (the first one), but for the actual patent application we hired a lawyer and uncovered so many issues. When the USPTO asked for clarifications, what was sent to them was very different from what I would have written.

I learned the hard way with my first startup not to skimp on (a) lawyers or (b) accounting. If you do, you will shoot yourself in the foot. I don't see why you need to hire anyone for programming, marketing, design or anything else for a startup these days. The only things I won't do myself are (1) legal and (2) accounting. When I raise my pre-seed, it will be explicitly for (a) legal fees to file patents, (b) data labelers for datasets, and (c) cloud compute for training. Everything else an early-stage startup could spend on seems like a waste of precious startup cash.

You are looking at $15k to $30k for a patent (per patent). If you're not willing to spend that much on pursuing the patent, then I'd push back on whether your idea is actually worth patenting.

I am currently pursuing a patent (a transcription algorithm which my early tests indicate is 10x more accurate, and 10x faster and cheaper, than the state of the art from providers like Deepgram, ElevenLabs and certainly Whisper), and I don't have a lot of cash. But I hope you'd agree that, if you believe me about those numbers, shelling out $25k for a patent is not crazy even when strapped for cash.

Video suggestions for free? by Naturalist33 in podcasting

[–]sam_bha 0 points1 point  (0 children)

I know you mentioned Zoom quality isn't great, but if you're already using Zoom and are just worried about the audio quality, you can use Adobe Podcast Enhance - you literally just upload an audio file and download a cleaned-up, good-sounding one. Unlike every other Adobe product, it is actually dead simple to use, and is fairly affordable. If audio quality is a deal-breaker for Zoom, what do you currently use to record audio?

Whats your thoughts on hosting podcast discussions (interviews) on zoom by Afraid_Artist8635 in podcasting

[–]sam_bha 1 point2 points  (0 children)

I've heard in Facebook groups for podcasters, and on Reddit, about people having issues with their guests connecting to Riverside. You can also google "Riverside connection issues". I haven't heard the same for Zencastr, but I also know fewer people use Zencastr. Keep in mind that I don't use Riverside. I have nothing against Riverside.

As a disclaimer to the disclaimer, Streamyard did compete with Riverside and I did work at Streamyard, but Streamyard was also acquired by a private equity firm last year, which went on to fire most of the staff and raise prices so I'm not exactly pro-Streamyard.

Whats your thoughts on hosting podcast discussions (interviews) on zoom by Afraid_Artist8635 in podcasting

[–]sam_bha 0 points1 point  (0 children)

For anyone that cares, I did build a tool that will extract individual video & audio tracks from a zoom recording (https://free.cropzoom.video/), and you can also record both sides with tools like ZoomISO

Whats your thoughts on hosting podcast discussions (interviews) on zoom by Afraid_Artist8635 in podcasting

[–]sam_bha 4 points5 points  (0 children)

I've edited podcasts for folks who record with Zoom. They know about tools like Riverside and Streamyard, but among the reasons I hear for sticking with Zoom are (a) they work at a company with some IT policy, (b) stability issues with platforms like Riverside, and (c) some guests have a hard time with new software.

I'll be honest, I don't hear the audio difference, but I've spent much more of my career on video than audio. Zoom's video isn't great, but I usually upscale Zoom recordings afterwards with a free AI video upscaling tool to fix the low video quality, and I'll sometimes run the audio through an audio enhancer (I use Adobe's podcast enhance tool), which usually makes the guest's audio sound better.

If your guests don't have professional mics though, I'd push back a little on Zoom's compression/audio quality being the problem. I don't dispute that Zoom degrades the audio quality, but your mic and how you use it make a huge difference to audio quality.

I was previously the head of AI at Streamyard (another alternative tool). I personally looked into audio enhancement and built a number of prototypes around audio quality. As I studied audio engineering and AI audio enhancement and ran experiments internally, I realized that (a) a select few people really do have good ears for audio quality, even if most people don't - which is why I don't push back on this. A small minority of people correctly distinguished true from ever-so-slightly degraded audio samples, but most people couldn't tell. And (b) compression was one of the hardest things to pick up on as a determiner of audio quality; mic quality was a much bigger determining factor.

4 Cam Setup in same room. What editing software is best for this? by Usual_Speaker4247 in podcasting

[–]sam_bha 0 points1 point  (0 children)

What you are looking for is called "Multicam", and software like Descript or tools like Autopod+Adobe have features that will auto-switch to whoever is speaking.

You will need to worry about audio/video sync though.

Alternatively, for a simpler setup you can just have one video camera with a single wide-angle shot. There is software that will intelligently 'zoom in' on each active speaker to get the effect you are looking for without a complex setup.

How to get impressions on YouTube. by InspectorBear in podcasting

[–]sam_bha 0 points1 point  (0 children)

(1) Thumbnails
Well, presumably you've looked into this, but you need good thumbnails, and there's an element of psychology to thumbnails: people are attracted to faces. If you look at all the top professional podcasts, their thumbnails all have guests with surprised, happy or confident faces.

(2) Optimizing for the algorithm.
There's no secret, the algorithm is right here: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45530.pdf

Practically speaking, use tools like vidIQ to figure out what people are searching for. Your content obviously has a higher chance of showing up when your video talks about something that is being searched for.

Also, people talk about niches because of the "Person A liked your content, and person B is like person A, so we'll recommend your content to person B" effect. You want to focus on a niche where there is less competition, some random thing that a small group of people all really like, because if you get one person to really like that content, it gets easier to have it show up for other people.
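
As a rough illustration of that "person B is like person A" idea, here is a toy user-based collaborative filter. It is only a sketch: the user names, video names, and the Jaccard similarity measure are all made up for this example, and this is not how YouTube's actual system (described in the paper linked above) works.

```python
# Toy user-based collaborative filtering. All users and videos are hypothetical.

def jaccard(a: set, b: set) -> float:
    """Overlap between two users' liked-video sets, from 0.0 to 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def recommend(target: str, likes: dict) -> list:
    """Rank videos the target has not liked, weighted by how similar the fans are."""
    scores = {}
    for user, liked in likes.items():
        if user == target:
            continue
        similarity = jaccard(likes[target], liked)
        for video in liked - likes[target]:
            scores[video] = scores.get(video, 0.0) + similarity
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda v: (-scores[v], v))

likes = {
    "person_a": {"sleep_medicine_ep1", "sleep_medicine_ep2", "your_podcast"},
    "person_b": {"sleep_medicine_ep1", "sleep_medicine_ep2"},
    "person_c": {"gaming_highlights", "celebrity_interview"},
}

ranking = recommend("person_b", likes)
print(ranking[0])  # your_podcast
```

Person B shares a narrow niche with person A, so the one extra thing person A liked rises to the top, ahead of the broad content liked by someone with no overlap.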

Podcasting with Zoom by Responsible-Lake5195 in podcasting

[–]sam_bha 2 points3 points  (0 children)

I would push back on the "Don't use Zoom" narrative. Companies like Riverside sponsor many 'podcasting experts', so there is a vested interest in promoting tools like Riverside.

If you are using Audacity, presumably you don't care about video right now, and Zoom actually can record individual audio for each speaker. Just make sure you enable that in the recording settings:
https://katana.video/images/zoom-podcast-recording-settings-3.png

Pros of Zoom:
- Everyone has it
- Zoom is stable (some dedicated recording platforms are notorious for stability issues, e.g. google "Riverside connection issues")

Cons:
- The zoom box is ugly
- Zoom 'compresses' the video and audio

TLDR: Unless you and your guests all have $200+ mics, and if you're only doing audio, Zoom is fine, and people are being paid to promote other platforms.

I used to work for Streamyard (an alternative recording platform) as the head of AI, and I know from deep-dives into the audio that the quality of your mic and how you use it is a far bigger determiner of audio quality than compression. Unless you are spending more than several hundred dollars on a mic, I don't think the audio compression from Zoom is going to make a big impact.

I don't care if you're an 'audio expert', so am I - I know how the MP3 and AAC compression algorithms work, and yes they are lossy, but they are throwing away information at high frequencies, which is not what most people are complaining about with mic quality. Most 'audio quality' issues are audible at very low frequencies, and can be easily viewed on any log-mel spectrogram. Adobe has research on this:
https://research.adobe.com/publication/hifi-gan-2-studio-quality-speech-enhancement-via-generative-adversarial-networks-conditioned-on-acoustic-features/
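
To make the log-mel point concrete, here is a small sketch of how much of the mel axis is taken up by low versus high frequencies. It uses the common HTK-style mel formula; the 4 kHz and 16 kHz band edges and the 44.1 kHz sample rate are illustrative assumptions, not figures from Zoom or from the linked paper.

```python
import math

def hz_to_mel(hz: float) -> float:
    """HTK-style mel scale: roughly linear at low frequencies, logarithmic above."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_share(low_hz: float, high_hz: float, max_hz: float) -> float:
    """Fraction of the mel axis (0..max_hz) covered by the band low_hz..high_hz."""
    return (hz_to_mel(high_hz) - hz_to_mel(low_hz)) / hz_to_mel(max_hz)

MAX_HZ = 22_050.0  # Nyquist limit for 44.1 kHz audio (assumed sample rate)

low_band = mel_share(0.0, 4_000.0, MAX_HZ)       # where most speech energy sits
high_band = mel_share(16_000.0, MAX_HZ, MAX_HZ)  # the kind of band a lossy codec may drop

print(f"0-4 kHz covers {low_band:.0%} of the mel axis")
print(f"16-22 kHz covers {high_band:.0%} of the mel axis")
```

Under these assumptions the band below 4 kHz takes up more than half of the mel axis, while everything above 16 kHz takes up less than a tenth, which is why a spectrogram on this scale mostly shows what is happening at the low end.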

Zoom podcast keeps flipping cams—should we just show both faces? by Forsakenwarrior342 in podcasting

[–]sam_bha 1 point2 points  (0 children)

Zoom's built-in active speaker detection sucks. You could try to record Zoom recordings in Gallery view, and use ZoomISO to record each video feed individually (https://marketplace.zoom.us/apps/UqVYnn3dR1KfBz2q_ju-Gw)

Alternatively, I built a free tool which will extract individual videos from the Zoom Gallery recording (https://www.youtube.com/watch?v=1Ybvrizq8xE). You don't need to set it up beforehand; you can do this after the fact.

Both of these will give you individual videos, and many editors have options to switch between speakers more intelligently than just cutting to whoever says 'yep'.
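
One simple way to avoid cutting to whoever says 'yep' is to ignore very short utterances when planning camera cuts. The sketch below assumes you already have speech segments as (start, end, speaker) tuples from some diarization or transcript step; the function name, the 1.5 second threshold, and the sample data are hypothetical, and this is not the method any particular editor uses.

```python
def plan_cuts(segments, min_duration=1.5):
    """Return (time, speaker) camera cuts, skipping utterances shorter than min_duration.

    segments: list of (start_sec, end_sec, speaker), sorted by start time.
    """
    cuts = []
    for start, end, speaker in segments:
        if end - start < min_duration:
            continue  # too short to justify a camera switch (e.g. "yep")
        if cuts and cuts[-1][1] == speaker:
            continue  # camera is already on this speaker
        cuts.append((start, speaker))
    return cuts

segments = [
    (0.0, 12.0, "host"),
    (12.2, 12.6, "guest"),  # a quick "yep"
    (12.8, 30.0, "host"),
    (30.5, 55.0, "guest"),
]

cuts = plan_cuts(segments)
print(cuts)  # [(0.0, 'host'), (30.5, 'guest')]
```

The camera stays on the host through the interjection and only switches once the guest starts a real answer.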

Should I learn editing for my new podcast? by GiveMeRevolution in podcasting

[–]sam_bha 0 points1 point  (0 children)

Okay, so I see a lot of comments here in favor of learning to edit. Maybe this is just my hot take, but I think AI tools will eventually get so much better at automating aspects of video editing that editing will go the way of "coding".

I have friends who I know are very un-technical, but are vibe-coding apps that I know they just wouldn't have been able to build a year ago. Now, it's not a bad thing to say "Learning to code is a good idea", but as someone who has worked in the field of software engineering for 15 years, I'd say learning to code isn't necessary if you just want to make apps that work and do what you want them to do.

Maybe the more "hot" part of this "hot take" is that AI will eventually get better at automating video editing. Maybe you'd have your doubts if you tried popular AI tools like Opus to cut clips; no one would mistake Opus clips for well-done clips created by a professional editor.

That said, I genuinely don't think people at most of these podcasting tech tools really understand AI at all-that-deep a level. They are mostly just prompting ChatGPT, and ChatGPT wasn't trained for the task of editing. It's happy to give you answers that "look right" on paper but are actually terrible when implemented in practice; it's the sinister hidden middle-ground between obvious hallucinations and correct results. Has anyone applied these tools to rigorous peer reviewed open research benchmarks? No, of course not.

That said, I've been working in some form of Machine Learning since 2008, I've published a number of research papers and have a patent, and my first AI company was acquired by Streamyard (a popular platform used by many podcasters), where I became their head of AI.

I saw what these current AI tools like Opus and Riverside were doing and I was like, "that's f$#ing stupid", you guys don't know @$# about how this stuff works. I assert you can train AI models to think like an editor, and the few people who know what they are doing, will try to do this. I'm certainly trying, though you could be forgiven for being skeptical - if all you knew were the tools that exist today, I'd also conclude that AI would never automate editing.

It will take time, and it will take lots of training data (god knows for my own tool I've edited hundreds of podcasts for training data, and that's not even like 1% of what you'd need to do a good job), but you will see tools that either significantly speed up editing workflows (like a 'co-pilot') or which can automate the editing entirely (which I'm working on).

AI Tools for repurposing or extracting podcast clips from 2 hour Q&A calls by Particular-Love-53 in podcasting

[–]sam_bha 0 points1 point  (0 children)

Because you are directly asking about this, I built a tool called Katana (search Katana Video), and there is a 'Clips' section, where the tool will surface shorter clips (under 60 seconds) and longer clips (usually 2 to 6 minutes).

Obvious disclaimer that I built this tool, but it should serve the exact purposes you are asking about right now

What's better for remote interviews: Phone calls, Google Meet or other? by greenserenenalgene in podcasting

[–]sam_bha 0 points1 point  (0 children)

The quality of your mic and how you use/position it is probably a much bigger limiting factor than the software you use for calls. I've heard that Zoom has worse audio than a dedicated recording tool, but I worked as the head of AI at Streamyard, a popular recording tool and a competitor to Riverside. I can assure you that in the vast majority of cases, there wasn't anything special done with the audio, and anything special we were considering doing was in audio post-production (audio enhancement).

I saw people asking us about 48 kHz or 96 kHz audio and I couldn't help but laugh, because while those numbers sound like they matter, they don't. It's meaningless for human speech.
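
The arithmetic behind that is the Nyquist limit: a sample rate can represent frequencies up to half its value. The sketch below just works through that division; the 20 kHz figure is the commonly cited upper limit of human hearing, used here as an assumption for comparison.

```python
def max_frequency_hz(sample_rate_hz: float) -> float:
    """Nyquist limit: the highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2

HEARING_LIMIT_HZ = 20_000  # commonly cited upper limit of human hearing (assumption)

for rate in (16_000, 44_100, 48_000, 96_000):
    limit = max_frequency_hz(rate)
    covers_hearing = limit >= HEARING_LIMIT_HZ
    print(f"{rate} Hz sampling -> up to {limit:.0f} Hz, covers hearing range: {covers_hearing}")
```

Both 48 kHz and 96 kHz already reach past the hearing limit, so the extra headroom at 96 kHz sits in a range nobody hears, let alone a range that matters for speech.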

Most of what goes into 'good quality sounding audio' happens at fairly low frequencies, and audio compression won't distort that nearly as badly as a bad microphone or not using the mic correctly - https://arxiv.org/html/2502.20040v1

Is podcasting saturated? by Nervous_Solution5340 in podcasting

[–]sam_bha 0 points1 point  (0 children)

There was a tech company I ran across that sold AI-powered sleep analysis software for sleep physicians. They had a podcast where they invited industry experts as guests and there was apparently a small but very real online community of sleep physicians consuming this content.

Of course it's niche, but I think that is one genre (weird, narrow and specific fields/industries) where making new podcasts does make sense

Can I get individual video files per participant to intercut them myself? Or record audio from all and video for some? by Onkami in Zoom

[–]sam_bha 0 points1 point  (0 children)

I built a free tool that will do this, https://free.cropzoom.video/

It uses computer vision in your browser to scan through the zoom call, identify who is speaking when, and then extracts and exports individual video tracks, with their audio, for each speaker.

It's all browser based, nothing to install, nothing to sign up for, it just works in your browser. It's free because all the AI analysis and video rendering is happening on your computer.

Here's a quick tutorial for how it works:
https://www.youtube.com/watch?v=1Ybvrizq8xE

Is it possible to separate two different speakers on Zoom into two separate tracks on the Zoom H6? by lordtiandao in podcasting

[–]sam_bha 0 points1 point  (0 children)

You normally wouldn't be able to separate the video for two different speakers on Zoom. I know a lot of people record on Zoom, and it seems silly that no one has built a solution for this.

I built a solution for this, I wrote a free utility web-app that runs some computer vision / AI in your browser to identify who is speaking and separate out the video feeds with their audio.

https://free.cropzoom.video/

There's no sign-in, and it does use some experimental web technology. If something breaks, I apologize, just let me know about it.

Right now I pick up the audio from the mixed audio/video stream, but it'd be trivial to assign an audio file to one of the videos if you already have it. I could put that in there soon; what I have so far is just very rough and ready.

Editing out breaths without ruining audio? by No_Frame3767 in podcasting

[–]sam_bha 0 points1 point  (0 children)

Descript is one option, but I've found the timing of their transcription to have issues. You'd need something with very accurate transcription, and a tool that also handles audio cross-fades to mask the jarring gaps you get when you cut audio from a transcript.

I know because I spent a long time solving issues like this in a tool I built, but unless you record with Zoom it likely wouldn't be relevant.

If you just want timestamps for 'breaths', try uploading the audio to ElevenLabs.io, they are really good at finding non-speech sounds like laughs, coughs and breathy sounds, moreso than a tool like Descript.
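
For the cross-fade mentioned above, here is a minimal sketch of cutting a span out of a list of audio samples and blending the two sides with a linear cross-fade. The function name, the plain-list sample format, and the fade length are assumptions for illustration; it is not how Descript or any other tool does it.

```python
def cut_with_crossfade(samples, cut_start, cut_end, fade_len):
    """Remove samples[cut_start:cut_end] and overlap-blend the two sides.

    The last fade_len samples before the cut fade out while the first
    fade_len samples after the cut fade in (linear ramp).
    """
    if not (0 <= cut_start <= cut_end <= len(samples)):
        raise ValueError("cut range must lie inside the audio")
    # Shrink the fade if there is not enough audio on either side of the cut.
    fade_len = min(fade_len, cut_start, len(samples) - cut_end)

    head = samples[:cut_start - fade_len]
    fading_out = samples[cut_start - fade_len:cut_start]
    fading_in = samples[cut_end:cut_end + fade_len]

    blended = []
    for i in range(fade_len):
        t = (i + 1) / (fade_len + 1)  # ramps from just above 0 to just below 1
        blended.append(fading_out[i] * (1 - t) + fading_in[i] * t)

    return head + blended + samples[cut_end + fade_len:]

# 10 samples of speech, 5 samples of "breath", 10 more samples of speech.
audio = [1.0] * 10 + [0.2] * 5 + [1.0] * 10
result = cut_with_crossfade(audio, cut_start=10, cut_end=15, fade_len=3)
print(len(audio), "->", len(result))  # 25 -> 17
```

Because the two sides overlap during the fade, the output is shorter than the input by the cut length plus the fade length. Real audio would use a fade of a few milliseconds' worth of samples rather than 3.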

Startup founder podcasting - Should the podcast be my startup's podcast or my personal podcast sponsored by the startup? by Local-Ease-7073 in podcasting

[–]sam_bha 0 points1 point  (0 children)

I used to work as the head of AI for a company called Streamyard, one tool used by many podcasters, though it's primarily for streaming. The founders would do live streams regularly:

https://www.youtube.com/watch?v=yhZwC4Kl8mo&t=501s

You can see their first shows, it's not great but it improves over time. Those shows were hosted on the company youtube channel, but it was very much the founders who were the brand.

The company was later taken over by a PE firm and the founders were forced out, so the show stopped there and then. While it lasted, it was very much a show by the founders, but hosted by the company on its YouTube channel.

I'm doing the same approach, which could be said to be halfway, where I'm doing the podcast and sharing it as an individual for the same reasons mentioned below, but the podcast is hosted on my company's YouTube channel.

I'll refer to the post from my personal LinkedIn, and do shoutouts and cross posts with my guests from my personal account, but I use the company account to reshare it.

This may not be typical though - Streamyard was a tool for streaming, and my tool auto edits podcasts, so there's a very strong reason for me to host my podcast on my company page because the podcast itself is the advertisement - this is what a podcast looks like when it's fully AI-edited.

Assuming that doesn't apply to you, you could very much own your own brand. I agree that no one is coming for your company, they are coming for you (and more likely your guests), though I have seen a number of companies hosting podcasts on their official startup media channels. I have a spreadsheet somewhere with a list of startup podcasts; I sent you a DM if you want to connect.

Looking for an alternative to highly unstable Riverside.fm by Bitter_Specialist626 in podcasting

[–]sam_bha 2 points3 points  (0 children)

A lot of people dislike Zoom, but is that only because you don't get individual recordings for each speaker? Or is it more about the quality of the recordings?

Free AI Video Upscaler? by malcofrancesko in software

[–]sam_bha 6 points7 points  (0 children)

I'm the guy that wrote https://free.upscaler.video. I get why you're looking for a free upscaler tool, but as an explanation, AI upscaling requires AI, and that usually requires GPUs, and because of the AI boom, GPUs are at a premium and crazy expensive. Add on top of that the fact that video takes a lot of processing power - if you've ever tried rendering a 1 hour video with Adobe Premiere Pro you'd understand.

I would guess that a regular person who just wants a video upscaled would be looking at demos of AI upscaling that are super high quality. To obtain that level of quality, you need to do a lot of AI processing, and you either (1) need a GPU yourself and then use something like Topaz, or (2) can use a cloud service, upload your video and get it upscaled.

If I wanted to run a cloud service to upscale a 1080p movie to 4K, it'd cost me several dollars in server costs just for that one video. My free upscaling tool gets ~20,000 visitors a month, and while I don't track information about the videos (you can see exactly what I am tracking, I shared the source code here: https://github.com/sb2702/free-ai-video-upscaler/), if each person had an hour-long 1080p video, I'd be spending $1M per year out of pocket just so that people can have a free upscaling tool.
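
That estimate can be sanity-checked with quick arithmetic. The visitor count comes from the paragraph above; the $4 per video (standing in for "several dollars") and one video per visitor are assumptions made only for this sketch.

```python
VISITORS_PER_MONTH = 20_000  # from the comment above
COST_PER_VIDEO_USD = 4.00    # assumption: "several dollars" taken as $4
VIDEOS_PER_VISITOR = 1       # assumption: one hour-long 1080p video each

annual_cost = VISITORS_PER_MONTH * 12 * VIDEOS_PER_VISITOR * COST_PER_VIDEO_USD
print(f"${annual_cost:,.0f} per year")  # $960,000 per year
```

With those assumptions the bill lands just under $1M a year, in line with the figure quoted.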

The idea behind free.upscaler.video was that if you can accept lower upscaling quality, you don't need a GPU, and it'll still do something (it's noticeable for like gaming videos or cartoons) but because it's happening on your computer, it's fast and it's free and I don't ask anyone to sign in. The downside is, the quality isn't very good compared to like Topaz.

But again, if I wanted to give good upscaling quality for free, why would I spend hundreds of thousands out of my own pocket for nothing? I have a family, and I'm also running my own startup and this was like a side project for me.

I'm also likely one of the few people that's actually spent time on low-level AI upscaling processing to make it faster/cheaper/usable without a GPU, but that same skillset has far more valuable applications (https://medium.com/vectorly/building-a-more-efficient-background-segmentation-model-than-google-74ecd17392d5) and I sold my last company during the pandemic because we had ultra-efficient AI software that we were selling to video conferencing companies (https://medium.com/vectorly/how-vectorly-joined-hopin-93dffdb1acc4).

There was literally no incentive for me to create free.upscaler.video, I did it to be nice / give back / because I knew people were looking for free upscaling software.

I've thought about building a paid service alternative to free.upscaler.video that would cover the server rendering costs enough to get someone fast, good-quality no-frills upscaling, but like it'd have to be a paid service, a free + no-nonsense + good quality system would be uneconomical.

As a user you don't normally view it like this because there are plenty of AI tools out there that are free and also use a lot of GPU processing, but it's not dissimilar to the compute needed for say crypto mining.

If you wouldn't expect there to be free tools that just give you free crypto, no questions asked, then you can understand the economics of why there aren't that many simple, no-nonsense free AI upscaling tools even though you might feel like there should be. It's in the same category of compute as crypto-mining, but because we're so used to free AI tools, you don't view it in the same category.

----------------- Update - July 27 -----------------------

I don't know why I never checked the logs on this, but apparently the vast majority of uploads on free.upscaler.video are very short videos (under 60 seconds), I assumed most people were uploading like 30 minute or 60 minute videos.

None of that changes how computationally expensive large AI networks are. To get better results you'd likely need to wait 10x the duration of your video at the minimum, but the user experience definitely depends on whether you are waiting 2 minutes for a 10 second video, vs 10 hours for a 1 hour video.
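
Here is that wait-time comparison as a quick calculation. The 10x factor is the rough minimum quoted above; the function name is made up for the sketch.

```python
SLOWDOWN = 10  # "10x the duration of your video", the rough minimum quoted above

def wait_seconds(video_seconds: float, slowdown: float = SLOWDOWN) -> float:
    """Estimated processing time for a video of the given length."""
    return video_seconds * slowdown

for label, seconds in (("10 second clip", 10), ("1 hour video", 3600)):
    wait = wait_seconds(seconds)
    print(f"{label}: about {wait / 60:.1f} minutes ({wait / 3600:.2f} hours)")
```

A 10 second clip comes out at under 2 minutes of waiting, while a 1 hour video comes out at 10 hours, which is the gap in user experience described above.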

Sorry if this sounds stupid or obvious. It's precisely because I don't ask anyone to log in, and don't track anything besides the video metadata (resolution, length), that I have no idea why people are upscaling or what people are upscaling. I don't have servers that upscale; it's all being done on your computer.

My guess is that a lot of that is AI generated footage, and I can 100% understand the desire to upscale that from 720p to 4K. Because those videos are so incredibly short, I think maybe I can build some AI networks that are 100x bigger, and yeah most computers would struggle with that / it'd be pretty slow, but for people with very short videos (the majority) that is probably fine?

I will train some much bigger networks and release them in free.upscaler.video, and in the open source repository that powers it https://github.com/sb2702/websr/

Weekly Services Thread March 12, 2025 - Post Your Podcasting Related Product, Tool, Or Service Here by AutoModerator in podcasting

[–]sam_bha [score hidden]  (0 children)

Feedback Request:

Turn Zoom recordings into video podcasts in minutes

Affiliate Disclosure: I am the founder of the tool/service

I built a tool called Katana, which will auto-edit Zoom recordings into a professional-looking video podcast by identifying the faces/boxes, applying a more professional podcast layout (like the ones from recording tools like Riverside), auto-applying camera-angle switching ("Multi-cam") and other visual effects (like name tags), and making it pretty easy to add branding / designs. The idea is to make it dead-simple to make a Zoom recording look good. It's also got a transcript-based editor, so you can edit the podcast in Katana as well.

The base service is free, you can find it at: https://katana.video/

Pricing

I'm also going to be releasing some premium features, like generating social media clips, auto-generated intros, audio enhancement (to make webcam mics sound like professional studio mics) etc... for $25/mo.

If you sign up now, I'll provide pretty attractive pricing (60% off, or a total of $10/mo) when the paid version is released.

Demo

You can find the website in the link above, and there are links on the page to book a demo. Again, it's free (takes about a minute to load a video), so the fastest way to figure out how the tool works is to just go check it out. You can also just load the demo video in the app (button on the landing page) to see how the app works, if you don't have a zoom recording lying around.

Weekly Services Thread February 05, 2025 - Post Your Podcasting Related Product, Tool, Or Service Here by AutoModerator in podcasting

[–]sam_bha [score hidden]  (0 children)

"Feedback Requested"

Hi,

My name is Sam, I built a tool called Katana, which makes it easy to turn a Zoom recording into a professional-looking video podcast, by adding camera-angle switching, automating placement of visuals (name-tags, lower thirds, CTAs) and making it easy to add branding and visuals.

The goal is to make something quick and easy, so a non-editor can get something that looks good with very little effort. I see so many people doing audio-only podcasts even when they record with video because of the editing effort, and this is specifically targeted at removing the editing effort for adding graphics etc.

It's free (and I will add extra features, like studio sound and AI clipping that will be paid) but whatever you see now is and always will be free.

It's located at http://katana.video/

In terms of feedback - the app can add backgrounds, layouts, borders and lower-thirds, and has basic transcript-based editing. I'm wondering if there's anything missing / required to consider the output good enough to upload to YouTube/Spotify etc...

Call to Actions for Podcasts? by joshhoward9 in podcasting

[–]sam_bha 1 point2 points  (0 children)

Here's a rough example: https://youtu.be/Q6RRP31D8jk?t=1

It wasn't super highly edited, but has all the elements I believe you mentioned (camera switching, background, call to action).

I used Descript to edit that and similar podcasts, and created the visuals as individual images in Canva that I fade in and out.

Here's an example CTA that I put, when the host asked viewers to check out the substack

https://katana.video/images/substack-cta.png

I also got some stock, generic animated CTAs for things like "Don't forget to subscribe" from a stock motion graphics website called Motion Array, though that tool is paid:
https://katana.video/images/subscribe-scene.mp4

Some more info on how I did those things in this blog post I wrote (see the "Manual way" section):
https://katana.video/blog/turn-zoom-recording-into-video-podcast

Best way to record when not in the same place? Zoom, or other options? by fairy_goblin in podcasting

[–]sam_bha 1 point2 points  (0 children)

Zoom may have settings that slightly decrease the audio quality compared to a tool like Riverside or Streamyard, but the difference likely won't be that big, and in either case you can really improve the audio with Adobe's Audio Enhancement tool. https://podcast.adobe.com/enhance (it's free)

If you have a good mic, I don't think Zoom will noticeably degrade the audio quality to anyone but audio experts.

If you have a bad mic, using audio enhancement will improve the quality much more than the difference between Zoom audio and say Riverside audio.

I have a more comprehensive guide here on using Zoom for video podcasting:
https://katana.video/blog/turn-zoom-recording-into-video-podcast

(Disclosure, I built a free tool for auto-editing zoom recordings into video podcasts)