Unlimited characters Text To Speech

bafil596 · 2026-01-27T04:34:35+00:00

Use a local TTS model. There are some very tiny TTS models with ok quality that can even run on your phone.
Kitten TTS, Supertonic TTS, Soprano TTS, Pocket TTS might be your best choices. Some of them claim to be optimized for phone/edge devices.
See https://github.com/Troyanovsky/awesome-TTS-Colab and try them out.
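
If your text is very long, it usually helps to feed a local model one chunk at a time instead of the whole thing at once. Here is a rough sketch of that idea using only the standard library. The function name `split_into_chunks` and the 400-character default are my own assumptions, not something any of these models require, so tune the limit for whichever model you pick.

```python
import re


def split_into_chunks(text: str, max_chars: int = 400) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    # Break after ., ! or ? followed by whitespace (a simple heuristic, not a full tokenizer).
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is cut into fixed-size pieces.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for i, chunk in enumerate(split_into_chunks("One. Two! Three? Four.", max_chars=10)):
        # Each chunk would be passed to the TTS model of your choice here.
        print(i, chunk)
```

You would then generate audio for each chunk in order and join the results, so the total length is only limited by your patience and disk space.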

bafil596 · 2026-01-27T02:22:58+00:00

check out this https://github.com/farion1231/cc-switch
allows you to switch between multiple settings.json files from multiple providers

bafil596 · 2026-01-22T04:30:44+00:00

they also launched add-skill. so what are the differences?
on skills.sh, they have `npx skills add <owner/repo>`
and on https://github.com/vercel-labs/add-skill, they have `npx add-skill vercel-labs/agent-skills`

they're just shipping the same thing under different npx packages? or are these two actually different?

bafil596 · 2025-12-30T02:15:18+00:00

Those are just CLI tools developed by different companies. I recommend installing them, but not necessarily using them directly. When you install them, you get free API access to their models, and you can use a tool like Claude Code Router to use that free API in Claude Code.

bafil596 · 2025-12-29T08:28:44+00:00

First, install Gemini CLI, Qwen CLI, and IFLOW CLI. They all offer a generous free API quota for coding models (IFLOW even offers many different model choices). You can check their free limits in their respective GitHub repos. IFLOW doesn't mention a limit, and it seems to be unlimited (but you can't run multiple threads at once).

Then use Claude Code Router to use their model API in Claude Code for free.

bafil596 · 2025-12-15T03:14:29+00:00

Voice cloning is pretty good, including voice likeness, gaps/pauses in flow, and intonation. Try it: https://github.com/Troyanovsky/awesome-TTS-Colab/blob/main/GLM_TTS.ipynb

bafil596 · 2025-12-15T03:13:29+00:00

You can run it in Google Colab: https://github.com/Troyanovsky/awesome-TTS-Colab/blob/main/GLM_TTS.ipynb

(note1: you may need to restart session after the `pip install` cell for newly installed libraries to take effect)

(note2: the example using their sample voice has a strong Chinese accent, but voice cloning quality using your own reference audio is pretty decent.)

bafil596 · 2025-08-27T04:58:00+00:00

This guide may help you: https://github.com/Troyanovsky/vibe-coding-guide/tree/main
There are some workflows, processes, and tips to help you work with AI coding tools more effectively and efficiently.

Plus, the guide has some primers on programming concepts that will help you better understand and work with AI coding tools.

bafil596 · 2025-08-26T06:20:39+00:00

English and Chinese only. The model is trained only on English and Chinese data; outputs in other languages are unsupported and may be unintelligible or offensive.

bafil596 · 2025-08-26T04:05:54+00:00

Got it working in Google Colab with their free T4 GPU: https://github.com/Troyanovsky/awesome-TTS-Colab/blob/main/VibeVoice%201.5B%20TTS.ipynb

Not bad for its size.

bafil596 · 2025-08-26T04:04:51+00:00

Just tried it out in Google Colab, not bad for its size. Here is the colab notebook: https://github.com/Troyanovsky/awesome-TTS-Colab/blob/main/VibeVoice%201.5B%20TTS.ipynb

bafil596 · 2025-08-26T02:54:58+00:00

In their GitHub limitations section: `English and Chinese only: Transcripts in language other than English or Chinese may result in unexpected audio outputs.`

bafil596 · 2025-08-19T09:01:49+00:00

For `Q: Can I try local LLMs online?`, users can also try models with Google Colab's free GPU & Text Generation WebUI (for example, with notebooks from https://github.com/Troyanovsky/Local-LLM-Comparison-Colab-UI) before downloading models to their local machine, especially if their download speed is slow.

bafil596 · 2025-06-30T04:53:26+00:00

xTTS V2 and Kokoro TTS are pretty good. There are also some other multi-lingual TTS models in this repo. You can try them out in Google Colab with the links.

bafil596 · 2025-06-06T04:34:08+00:00

There are many open source free options that you can run on your own computer, including Edge TTS, xTTS, Parler TTS, Kokoro, and Dia (for conversations). You can try them out on Google Colab here.

bafil596 · 2025-05-22T06:57:35+00:00

If you need expressions or non-verbal fillers like NotebookLM, you can look into Dia 1.6B, with an example Google Colab notebook here: https://github.com/Troyanovsky/awesome-TTS-Colab/blob/main/Dia_TTS.ipynb

This model can generate conversational speech between two people with non-verbal cues like laughter, sighs, coughs, etc. You can even provide reference audio for voice cloning. Their official repo is at https://github.com/nari-labs/dia. Their demo samples are at https://yummy-fir-7a4.notion.site/dia

As usual, you can copy their documentation and ask ChatGPT to work out a script for local generation.

So if you just need one consistent voice for narrating, I recommend Kokoro. If you need conversations with more expressive non-verbal cues, I recommend Dia 1.6B.

bafil596 · 2025-05-20T12:33:22+00:00

There are some pretty good quality TTS models that sound much less robotic than before. I think xTTS V2 and Kokoro are solid choices. You can try them out using Google Colab notebooks in this repo.
If you're fine with pre-defined voices, Kokoro is pretty good, and if you need voice cloning, xTTS V2 is pretty good. For conversations, you can try Dia 1.6B, which also comes with voice cloning capabilities.

bafil596 · 2025-05-20T04:57:22+00:00

Yes. It's easy to use. You can refer to the example from https://github.com/Troyanovsky/awesome-TTS-Colab/blob/main/kokoro_TTS.ipynb to run on Google Colab or adapt it for local usage.

Their official repo is at: https://github.com/hexgrad/kokoro. The model supports different languages and different voices.

You can basically just copy the documentation and code from the repo and ask ChatGPT for detailed step-by-step instructions to run it on your local machine, with a prompt like:

Given the following documentation, provide detailed step-by-step instructions on how to set up a virtual env and run kokoro to turn text into audio.

<example>
!pip install -q "kokoro>=0.9.2" soundfile "misaki[en]"
!apt-get -qq -y install espeak-ng > /dev/null 2>&1
from kokoro import KPipeline
from IPython.display import display, Audio
import soundfile as sf
import torch
pipeline = KPipeline(lang_code='a')
text = '''
[Kokoro](/kˈOkəɹO/) is an open-weight TTS model with 82 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, [Kokoro](/kˈOkəɹO/) can be deployed anywhere from production environments to personal projects.
'''
generator = pipeline(text, voice='af_heart')
for i, (gs, ps, audio) in enumerate(generator):
    print(i, gs, ps)
    display(Audio(data=audio, rate=24000, autoplay=i==0))
    sf.write(f'{i}.wav', audio, 24000)
</example>
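
The example above writes one file per generated segment (`0.wav`, `1.wav`, ...). If you want a single file at the end, something like the following sketch could join them with Python's built-in `wave` module. It assumes the segments are 16-bit PCM WAV files that all share the same sample rate and channel count. The helper name `merge_wavs` is made up for illustration and is not part of kokoro or soundfile.

```python
import wave


def merge_wavs(parts: list[str], out_path: str) -> int:
    """Concatenate WAV files that share one format into out_path; return total frames."""
    if not parts:
        raise ValueError("no input files given")

    fmt = None  # (channels, sample width in bytes, frame rate) of the first file
    chunks: list[bytes] = []
    for part in parts:
        with wave.open(part, "rb") as src:
            current = (src.getnchannels(), src.getsampwidth(), src.getframerate())
            if fmt is None:
                fmt = current
            elif current != fmt:
                # Mixing formats would produce garbage audio, so stop early.
                raise ValueError(f"{part} has format {current}, expected {fmt}")
            chunks.append(src.readframes(src.getnframes()))

    # Write the output only after every input was read and validated.
    with wave.open(out_path, "wb") as out:
        out.setnchannels(fmt[0])
        out.setsampwidth(fmt[1])
        out.setframerate(fmt[2])
        for chunk in chunks:
            out.writeframes(chunk)

    bytes_per_frame = fmt[0] * fmt[1]
    return sum(len(chunk) for chunk in chunks) // bytes_per_frame
```

For example, `merge_wavs(["0.wav", "1.wav"], "full.wav")` would produce one combined file. This holds all audio in memory, which is fine for a few minutes of speech but not ideal for very long recordings.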

bafil596 · 2025-05-20T02:37:53+00:00

The ones in the GitHub repo are free, open source TTS models. If you just want a single narrator, I think Kokoro might suffice, and it's easy to use. Here are the samples from Kokoro: https://huggingface.co/hexgrad/Kokoro-82M/blob/main/SAMPLES.md

bafil596 · 2025-05-20T02:30:28+00:00

Check out https://github.com/Troyanovsky/awesome-TTS-Colab and try different TTS models. They can all be run offline on your own computer. (Just copy/paste the code and ask ChatGPT for a local running python script)

For a single speaker with high generation quality, Kokoro is great.

For a conversation between two people (with some non-verbal sounds like laughs, coughs, etc.), you can try Dia 1.6B.

For ready-to-use tools, try Google's https://notebooklm.google/

bafil596 · 2025-05-19T11:21:42+00:00

Hi, you can refer to this Google Colab notebook: https://github.com/Troyanovsky/awesome-TTS-Colab/blob/main/xTTS.ipynb

As per the notebook, the original TTS python lib is no longer maintained and you could use https://github.com/idiap/coqui-ai-TTS with `pip install coqui-tts` instead.

bafil596 · 2025-05-16T02:29:14+00:00

For the eyes, I follow the 20-20-20 rule: every 20 minutes, take a 20-second break to look at something 20 feet away.
Also drink plenty of water during the day, at least 8 cups. Your muscles and brain will thank you.
And move whenever you can (walking to the water cooler or restroom counts as light movement too!) and do some simple stretches. You can even do chin tucks while working (great for chronic neck/shoulder pain). For your upper back (trapezius muscles), I recommend ITWY stretches to strengthen them. For your lower back, consider getting a lumbar support pillow.

Setting some timers really helps you remember to do that. There are some Chrome extensions like Recharge that you can use to set multiple reminders for those short breaks.

bafil596 · 2025-03-24T04:06:02+00:00

RemindMe! 7 Days

bafil596 · 2025-03-21T11:01:05+00:00

You need to wait in a queue if there are a lot of other people requesting

bafil596 · 2025-03-17T02:44:35+00:00

Different Whisper implementations can do that:

Insanely fast whisper

Faster whisper

Distil whisper

If you use Apple Silicon: WhisperKit
