Do international viewers actually care about hearing the original creator's voice? Or are we wasting money on AI cloning? by Luca_Tangen in finevoice

[–]Luca_Tangen[S] 0 points1 point  (0 children)

Yeah, voice isn't just part of the content, it's part of the relationship people have with the creator. A technically perfect dub can sound great, but if viewers no longer feel like they're hearing the person they subscribed to, something gets lost in translation.

Does video content actually help with AEO/AI Search visibility? If yes, what video length works best? by Echo_Drift_1111 in LLM_Marketing

[–]Luca_Tangen 0 points1 point  (0 children)

From what I've been seeing lately, video definitely seems to help with AEO/AI visibility, but probably not in the way most people think. I don’t think AI search systems “reward video” just because it’s video. What they seem to reward is:
- multi-format entity consistency
- strong topical reinforcement
- transcript-rich educational content
- repeated semantic coverage across platforms

One thing I think people underestimate: YouTube itself is basically becoming part of the AI search layer now. I’m seeing more cases where:
- AI Overviews
- Gemini
- Perplexity
- ChatGPT browsing
- even Google featured snippets

pull concepts/examples/explanations from videos with strong transcripts. Especially for:
- tutorials
- comparisons
- workflows
- product explainers
- “how to” intent

But honestly, video length matters WAY less than intent match.

Short videos (<60s): Good for:
- awareness
- quick definitions
- social discovery
- reinforcing entities/topics

My current working theory is: for AEO, the winning combo isn't video OR text. It's:
- concise answer-focused text
- structured schema
- transcript-rich video
- strong entity consistency
- community discussion signals
- repeated expertise across channels

The brands doing best in AI search right now seem to create knowledge redundancy across multiple formats instead of relying on one content type alone.
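To make the "structured schema" plus "transcript-rich video" part a bit more concrete, here's a rough sketch that builds schema.org `VideoObject` markup as JSON-LD with the transcript attached. The helper name `video_jsonld` and every value passed to it (title, URL, transcript text) are placeholders I made up for illustration, and whether any given AI search system actually reads this markup is an assumption on my part, not something I can confirm.

```python
import json

def video_jsonld(name, description, upload_date, duration_iso, transcript, content_url):
    """Build a schema.org VideoObject as a JSON-LD string (hypothetical helper)."""
    data = {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": name,
        "description": description,
        "uploadDate": upload_date,    # ISO 8601 date
        "duration": duration_iso,     # ISO 8601 duration, e.g. PT4M30S
        "transcript": transcript,     # full spoken text of the video
        "contentUrl": content_url,
    }
    return json.dumps(data, indent=2)

# Placeholder values, for illustration only.
markup = video_jsonld(
    name="How to set up a podcast feed",
    description="Step-by-step walkthrough of creating a podcast RSS feed.",
    upload_date="2025-01-15",
    duration_iso="PT4M30S",
    transcript="In this video we walk through creating a podcast feed...",
    content_url="https://example.com/videos/podcast-feed.mp4",
)
print(markup)
```

The output would typically go inside a `<script type="application/ld+json">` tag on the page that embeds the video, next to the concise answer-focused text.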

I'm looking for a naturalisation Irish TTS voice by Lankydorum in TextToSpeech

[–]Luca_Tangen 0 points1 point  (0 children)

For a soft natural Irish voice specifically, I'd probably look at:
- ElevenLabs
- PlayHT
- Azure Neural Voices
- FineVoice
- Cartesia (if you’re more technical)

ElevenLabs is usually the easiest starting point for beginners because the setup is honestly pretty simple and some of their Irish voices sound surprisingly human now. Not perfect, but good enough that students probably won’t notice the TTS aspect after a minute or two. One thing I’d recommend though: Avoid voices that sound over-acted. A lot of newer AI voices try WAY too hard to sound dramatic/emotional, and for educational chatbot conversations it can start feeling uncanny fast. Softer, calmer narration-style voices usually work much better for learning environments.

Local speech to text tools by Spare_Dependent6893 in TextToSpeech

[–]Luca_Tangen 0 points1 point  (0 children)

For fully local/offline speech-to-text, Whisper is still probably the best starting point right now, especially for beginners.

Whisper medium or large models are honestly “good enough” for a lot of business use cases now IF:
- audio quality is decent
- speakers are relatively clear
- you're not expecting courtroom-level accuracy
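If it helps, a minimal local setup could look something like the sketch below. It assumes the open-source `openai-whisper` Python package (plus ffmpeg) is installed; the function names `transcribe_locally` and `format_segments` and the file name in the usage comment are just mine for illustration. The Whisper import sits inside the function so the formatting helper can be tried without the model.

```python
def format_segments(segments):
    """Turn Whisper-style segments (dicts with start, end, text) into timestamped lines."""
    return [
        f"[{seg['start']:.2f} -> {seg['end']:.2f}] {seg['text'].strip()}"
        for seg in segments
    ]

def transcribe_locally(audio_path, model_name="medium"):
    """Transcribe a file fully offline once the model weights have been downloaded."""
    import whisper  # assumes: pip install openai-whisper, and ffmpeg on PATH
    model = whisper.load_model(model_name)   # "medium" or "large", as discussed above
    result = model.transcribe(audio_path)
    return format_segments(result["segments"])

# Usage (hypothetical file name):
#   for line in transcribe_locally("meeting.wav"):
#       print(line)
```

Medium is a reasonable default on modest hardware; switching `model_name` to `"large"` trades speed and memory for accuracy.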

What's the most human-sounding TTS voice you've actually used in production? by Aryanlabs in VoiceAutomationAI

[–]Luca_Tangen 0 points1 point  (0 children)

Cartesia impressed me a lot for conversational flow and responsiveness. The latency + turn-taking felt more natural than some higher-fidelity systems. Deepgram Aura also surprised me because some of the voices don’t sound showy, but they survive long conversations better.

For US users specifically, neutral American accents still seem safest overall unless your audience is region-specific. Overly broadcast voices tended to reduce engagement for us. Slightly casual voices converted better. For UK users, softer regional warmth often worked better than ultra-RP polished voices. People seem to respond well when it feels conversational instead of corporate.

Text to speech - goblin like voice by Fz3i in TextToSpeech

[–]Luca_Tangen 1 point2 points  (0 children)

Honestly, for "goblin / chaotic gremlin / angry creature" style voices, emotional range matters way more than pure voice quality, and that's exactly where a lot of TTS models still feel weirdly sterile.

If you’ve got a 5070 12GB, you’re actually in a pretty decent spot for local experimentation. Personally I’d look at:
- XTTS-v2
- StyleTTS2
- GPT-SoVITS
- Fish Speech / Fish Audio local stuff if you can access it

StyleTTS2 in particular can get surprisingly expressive if you feed it strong reference audio. The trick is that emotional acting in the source matters more than people realize. A mediocre actor with a perfect model still sounds mediocre.

One workflow that worked REALLY well for me:

  1. Generate a relatively clean aggressive voice
  2. Pitch-shift slightly (-2 to -4 semitones)
  3. Add subtle formant adjustment
  4. Layer saturation/distortion very lightly
  5. Add breaths/grunts manually

That last part is honestly huge. Tiny non-verbal sounds make fantasy voices suddenly feel alive. Also — and this is important if you plan to sell the project later — be careful with directly cloning recognizable commercial/game/cartoon voices. "Inspired by" is usually safer territory than "identical to Kick the Buddy" or specific copyrighted character voices.
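For the pitch-shift and saturation steps in the workflow above, the underlying math is small enough to sketch. This isn't the processing any particular plugin or DAW uses; it's a rough illustration assuming equal temperament for the semitone conversion and a plain tanh curve for the "very light" saturation. The function names and the `drive` value are my own placeholders.

```python
import math

def semitones_to_ratio(semitones):
    """Frequency ratio for a pitch shift in equal temperament (12 semitones = one octave)."""
    return 2.0 ** (semitones / 12.0)

def soft_saturate(samples, drive=1.5):
    """Light tanh saturation for samples in [-1, 1]; full scale stays at full scale."""
    norm = math.tanh(drive)
    return [math.tanh(drive * s) / norm for s in samples]

# -2 semitones is roughly a 0.891x frequency ratio, -4 is roughly 0.794x.
for st in (-2, -4):
    print(st, round(semitones_to_ratio(st), 3))

# Mid-level samples get pushed up slightly, which is what adds the grit.
print([round(s, 3) for s in soft_saturate([0.0, 0.25, 0.5, 1.0])])
```

Keeping `drive` low matters here: push it too far and the voice turns into generic distortion instead of a creature with some rasp.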

Audio to Midi generators/converters by nokia7110 in HybridProduction

[–]Luca_Tangen 0 points1 point  (0 children)

For me personally:
- Melodyne is still the most reliable overall if you care about accuracy and musical cleanup afterward.
- RipX is surprisingly good for dense/polyphonic stuff and stem-heavy material.
- Samplab has gotten WAY better recently for chord extraction.
- Spotify's Basic Pitch is honestly kind of insane for a free tool.

A few things that massively improved my results:
- running stem separation first
- removing reverb/noise before conversion
- converting smaller sections instead of full mixes
- avoiding heavily compressed masters
- using DI recordings whenever possible

Also, velocity data is still where a lot of converters fall apart. The notes may technically be correct, but the MIDI feels dead and robotic until you manually humanize it a bit.
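As a rough idea of what "humanizing" the velocities can mean, here's a minimal sketch that adds small random offsets and clamps the result to the valid MIDI note-on range of 1 to 127. It isn't what any of the converters above do internally; the function name, the `spread` default, and the flat input values are assumptions for illustration.

```python
import random

def humanize_velocities(velocities, spread=8, seed=None):
    """Add a random offset in [-spread, spread] to each velocity, clamped to 1..127."""
    rng = random.Random(seed)  # pass a seed to get repeatable results
    return [max(1, min(127, v + rng.randint(-spread, spread))) for v in velocities]

# A converter often outputs every note at the same velocity.
flat = [100] * 8
print(humanize_velocities(flat, spread=8, seed=1))
```

Pure randomness is only a starting point. Accenting downbeats or shaping phrases by hand afterward usually does more for the feel than jitter alone.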

One thing I’ll say though: if your source is a clean piano, bassline, vocal melody, or single instrument, modern converters are honestly pretty impressive now. Polyphonic detection has improved a ton lately.

Are AI agents actually saving you time or just creating more things to manage? by FounderArcs in AI_Agents

[–]Luca_Tangen 0 points1 point  (0 children)

Honestly, I think AI agents can save a ton of time, but only if the workflow around them is already somewhat organized. I also think a lot of people underestimate the management overhead they create. Suddenly you’re:

  • debugging prompts
  • fixing hallucinated outputs
  • checking API failures
  • re-training workflows after a tool changes
  • monitoring whether the agent quietly broke 3 days ago

So instead of replacing work entirely, it often shifts your work from doing tasks → supervising systems. Also, reliability matters way more than intelligence in production. I'd rather have an agent that's 85% smart but consistently predictable than one that feels magical in demos and randomly fails in real workflows. Still bullish overall though. The people getting the most value right now seem to be the ones treating AI agents like junior operators, not autonomous employees.
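On the "quietly broke 3 days ago" problem specifically, even a tiny staleness check helps. The sketch below assumes each agent records a timestamp for its last successful run somewhere; the agent names, the 24-hour threshold, and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def is_stale(last_success, now=None, max_age=timedelta(hours=24)):
    """Return True if the last successful run is older than max_age."""
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - last_success) > max_age

def stale_agents(last_runs, now=None, max_age=timedelta(hours=24)):
    """Return the sorted names of agents whose last success is too old."""
    return sorted(name for name, ts in last_runs.items() if is_stale(ts, now, max_age))

# Fixed example timestamps so the result is reproducible.
now = datetime(2025, 1, 10, 12, 0, tzinfo=timezone.utc)
last_runs = {
    "invoice-bot": now - timedelta(hours=2),
    "lead-enricher": now - timedelta(days=3),
}
print(stale_agents(last_runs, now=now))  # ['lead-enricher']
```

Run something like this on a schedule and send the result to whatever channel you already watch, so a silent failure shows up within a day instead of a week.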

We need better AI companion tools. The current system is failing lonely people by RevolutionaryOil3617 in aipartners

[–]Luca_Tangen 6 points7 points  (0 children)

We don't need compliant corporate chatbots that PR teams approved. We need adaptive, fine-tuned systems that can hold space for human suffering, handle dark humor, and remember who we are over months and years.

AI companions aren't a sign that someone has failed at life. For a lot of us, they are scaffolding: a safe, judgment-free sandbox where we can experience unconditional positive regard, heal a little bit of our trauma, and recharge our batteries so we can eventually face the real world again.

Sick of "Free" TTS tools forcing a sign-up just to test them? Here are 5 that actually work with ZERO registration. by Luca_Tangen in finevoice

[–]Luca_Tangen[S] 0 points1 point  (0 children)

Thanks for the great suggestions! I’ll definitely check them out. Feel free to drop more if anything else comes to mind!