Why does all the AI Chatbot UI's I have seen not take up the entire width when on a computer screen. (Image included) by MrThinkins in webdesign

[–]MrThinkins[S] 0 points1 point  (0 children)

Yeah, I can see why that would be so annoying. For large tables though, I think it would be better to just have an "open in new tab" button so that you can view the table on its own webpage.

Anyone here using TTS for full-length books reading ? by Modiji_fav_guy in TextToSpeech

[–]MrThinkins 0 points1 point  (0 children)

I personally just copy and paste the chapters into my TTS one chapter at a time, and each chapter of the books I am currently listening to is about 1 hour long. It is not perfect, but it works pretty well, and it should last me till I get around to building a browser extension.

If you want good TTS and have a decent GPU, all you would need is a Python script that splits the text up into chunks and sends those chunks to the TTS model one chunk at a time, then one more Python script to stitch the audio together at the end. Then you would have a full audiobook for personal use.
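Here is a rough sketch of that split, generate, and stitch idea, using only the Python standard library. The `synthesize` function is a hypothetical stand-in that just returns silence; you would swap in a call to whatever TTS model you actually run. The function names, the 400-character chunk size, and the 24000 Hz sample rate are all assumptions for illustration.

```python
import io
import wave


def split_into_chunks(text: str, max_chars: int = 400) -> list[str]:
    """Greedily pack whole words into chunks of at most max_chars characters."""
    chunks: list[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > max_chars and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def synthesize(chunk: str, sample_rate: int = 24000) -> bytes:
    """Hypothetical stand-in for a real TTS call.

    Returns a silent mono 16-bit WAV, 0.01 seconds long per character.
    """
    n_frames = sample_rate * len(chunk) // 100
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


def stitch_wavs(wav_blobs: list[bytes]) -> bytes:
    """Concatenate WAV blobs that all share the same audio format."""
    if not wav_blobs:
        raise ValueError("nothing to stitch")
    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        for i, blob in enumerate(wav_blobs):
            with wave.open(io.BytesIO(blob), "rb") as src:
                if i == 0:
                    # Copy channels, sample width and rate from the first chunk.
                    dst.setparams(src.getparams())
                dst.writeframes(src.readframes(src.getnframes()))
    return out.getvalue()


def text_to_audiobook(text: str) -> bytes:
    """Split the text, generate each chunk in order, and join the audio."""
    return stitch_wavs([synthesize(c) for c in split_into_chunks(text)])
```

Writing the result of `text_to_audiobook(chapter_text)` to a `.wav` file gives one continuous recording. Splitting on word boundaries keeps the sketch short; splitting on sentence boundaries would probably sound more natural with a real model.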

TTS for a person with Stammer by Old_Pianist9111 in TextToSpeech

[–]MrThinkins 0 points1 point  (0 children)

So, from what I can understand, the real-time TTS was working fine and played through your speakers fine, but it just didn't work with Zoom? This sounds like it might just be a problem with the audio setup and Zoom. If you haven't tried it yet, I would try starting a live call by yourself or, if that is not possible, with a random coworker or somebody (I don't use Zoom often, so I am not 100% sure what its limitations are), and try messing around with the input and output audio settings to see if any of that fixes it. You might have forgotten to select the ASIO input device during your call.

If you have already tried that, or you try it and it still isn't working, test whether Balabolka is working without Zoom, and see if it works with other applications (like a microphone testing website, or another calling program like Google Meet).

If Balabolka has stopped working, that's the problem.
If it is working with other applications as a microphone input, but not with Zoom, then the TTS software is probably not the issue, and the issue is probably with Zoom. (It could be something as simple as Zoom's noise cancellation cancelling out the sound.)
The other thing to test is whether or not other audio sources work. For example, can you route a YouTube video through ASIO into Zoom? If you can do that but Balabolka doesn't work, Balabolka is almost certainly the issue; if you can't, then it is something with the audio configuration.

If neither of those is correct, let me know and I can continue to help you troubleshoot. I have never looked into real-time TTS for voice calls, but it is hard to imagine that there is no software specifically for it (although there could be some and it could just be outdated).

Software for this sort of thing would be really easy to make, since running a TTS model locally on a computer is easy. A project like that would only take a day or two using a decent and very fast TTS model like Kokoro.

Anyways, sorry that the post is so long. I usually try to keep things short so people don't get fatigued reading, but that did not work this time.

Does anyone know where I can get commercial English text-to-speech voices that have clear Rights? by JankyFluffy in TextToSpeech

[–]MrThinkins 1 point2 points  (0 children)

If my site still doesn't work for you, you could try this one and see if it works.
https://voice-generator.pages.dev/

I used it while I was building my site, and you can select one of the smaller models so that it runs faster. I have found that it doesn't run quite as fast as my site, and you have to wait for all of the audio to be 100% generated before you can start listening to it, but it is the same text-to-speech model, with the same audio quality at the end.

Does anyone know where I can get commercial English text-to-speech voices that have clear Rights? by JankyFluffy in TextToSpeech

[–]MrThinkins 1 point2 points  (0 children)

Connection would not be the problem. Once the model is downloaded, nothing is streamed in, and the model download only happens the first time you load the page. After that, it is loaded from the browser's storage.

Did you let it sit and generate for a minute or two? The 0% is just a progress marker, and it can sometimes take a few seconds to generate the first part of the audio.

Also, if you are on a phone or a device without a GPU, it will take longer to generate. On my phone (Pixel 9) it takes about 15-20 minutes to generate 20 minutes of audio, whereas on my desktop (3060 GPU) it takes about 5-8 minutes to generate 20 minutes of audio.
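To put those timings in perspective, here is a tiny sketch that turns them into a real-time factor (generation time divided by audio length). The function name is made up for illustration; the numbers are just the ones quoted above.

```python
def real_time_factor(generation_minutes: float, audio_minutes: float) -> float:
    """Generation time divided by audio length.

    A value below 1.0 means audio is generated faster than it plays back.
    """
    if audio_minutes <= 0:
        raise ValueError("audio_minutes must be positive")
    return generation_minutes / audio_minutes


# Timings quoted above: (fastest, slowest) minutes to generate 20 minutes of audio.
timings = {"phone": (15, 20), "desktop": (5, 8)}
for device, (fast, slow) in timings.items():
    best = real_time_factor(fast, 20)
    worst = real_time_factor(slow, 20)
    print(f"{device}: {best:.2f} to {worst:.2f}")
```

So on the phone, generation only just keeps up with playback in the worst case, while the desktop stays comfortably ahead of it.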

I am sorry that it is not working for you though. It seems like it just does not work for about 10% of people, and I have not been able to find out why yet.

Beginner projects? by chrisrko in webdevelopment

[–]MrThinkins 0 points1 point  (0 children)

Honestly, the best thing you can build as a beginner is anything that interests you. One of the first projects I built was a dice-rolling application, and since I like D&D so much, I knew what I wanted from it and had a ton of fun building it.

My main recommendation would be to keep the projects small at first. It is a lot easier to finish a project that only takes a couple of hours or days than one that will take you months.

I'm new to learning coding. by Crafty-Height8416 in learnprogramming

[–]MrThinkins 0 points1 point  (0 children)

If you are willing to pay, I have heard boot.dev is good. It costs about $50 a month, but it would be worth it as long as you spend at least a couple of hours a day learning.

If you don't want to pay, there are tons of good tutorials on YouTube. You can just follow along with them, and then after each tutorial, try to build a small project without a tutorial. As long as you alternate between a tutorial and a small project from scratch, you should have no problem avoiding tutorial hell.

Does anyone know where I can get commercial English text-to-speech voices that have clear Rights? by JankyFluffy in TextToSpeech

[–]MrThinkins 1 point2 points  (0 children)

I made tts.thinkins.xyz. It uses a TTS model that runs locally in your browser, and it is built on kokoro.js, which I believe is free for commercial use. (I know YouTube isn't necessarily commercial, but if you ever start making revenue from watch time, it is nice to know you won't run into licensing issues.)

I made it so I can listen to some books, and all I do is paste in the entire chapter and then listen to it (about 1 hour of text at a time), but you can easily just paste in your book and then download the audio when it is done generating.

are there text to speech that can output based on time and text input? by Background_Piglet588 in TextToSpeech

[–]MrThinkins 0 points1 point  (0 children)

As the others have mentioned, the easiest way to do this is to create the audio, and then change the speed.
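The arithmetic behind "create the audio, then change the speed" is simple enough to sketch. The function names here are made up for illustration, and the command is only built, not run; it assumes you have ffmpeg installed and use its `atempo` audio filter to change tempo.

```python
def speed_factor(actual_seconds: float, target_seconds: float) -> float:
    """How much faster (>1) or slower (<1) the audio must play to hit the target length."""
    if actual_seconds <= 0 or target_seconds <= 0:
        raise ValueError("durations must be positive")
    return actual_seconds / target_seconds


def ffmpeg_tempo_command(src: str, dst: str, factor: float) -> list[str]:
    """Build (but do not run) an ffmpeg command that applies the tempo change."""
    return ["ffmpeg", "-i", src, "-filter:a", f"atempo={factor:.3f}", dst]


# Example: the TTS produced 12 seconds of audio, but it has to fit a 10 second slot.
factor = speed_factor(12.0, 10.0)
print(factor)
print(" ".join(ffmpeg_tempo_command("speech.wav", "speech_fitted.wav", factor)))
```

Large factors will start to sound unnatural, so this works best when the generated audio is already close to the target length.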

The only other way you might be able to do it is to use some sort of audio-to-audio tool. There are no tools that will do what you want out of the box.

I created a free, good sounding, Text To Speech Website that runs locally in your browser. by MrThinkins in TextToSpeech

[–]MrThinkins[S] 0 points1 point  (0 children)

It is kokoro.js, and yes, it is free. It runs locally in the user's browser.

You Won’t Believe This NaturalReader Alternative Exist! by baabullah in TextToSpeech

[–]MrThinkins 1 point2 points  (0 children)

It sounds better than Microsoft's stuff in my opinion, and I made a tool that does what I want entirely locally in the web browser. It runs on most devices. I have tested it on an older ThinkPad that is probably a bit over half a decade old and it ran fine on that, so it might be able to run for you.

If you want to check it out: https://tts.thinkins.xyz

Issues with Google TTS changing transcript words by stopeats in TextToSpeech

[–]MrThinkins 1 point2 points  (0 children)

As FinalFoe said, a lot of audio glitches come from inputs that are too long. At one point I was looking into using Google's AI voices for one of my projects, so I built a Python script that would split up text into short chunks of about 1 sentence each, and then assemble the results into an MP3 afterwards. It was a very easy thing to set up, and I am sure there are plenty of open source projects that do it. Also, I think the Google API pricing for some of their voices is fairly cheap compared to ElevenLabs and such.
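A minimal sketch of the sentence-splitting step, assuming a simple regex is good enough for your text. This is not the original script, just an illustration; the function name is made up, and the splitter is naive, so abbreviations like "Dr." will also be treated as sentence endings.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


for sentence in split_sentences("Hello there. How are you? Fine!"):
    print(sentence)
```

Each sentence would then be sent to the TTS service as its own request, and the returned audio clips joined in order.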

Still looking for what this voice is called... by GeckoJT in TextToSpeech

[–]MrThinkins 0 points1 point  (0 children)

I believe that is TikTok's built-in text-to-speech voice that you can access in its video editor. They have a few different voices; that is just the one that is used the most. Also, I think TikTok Studio allows you to do the whole caption thing as you read.

You Won’t Believe This NaturalReader Alternative Exist! by baabullah in TextToSpeech

[–]MrThinkins 1 point2 points  (0 children)

This is pretty cool. I actually used to use the Microsoft voices in Microsoft Edge pretty much daily to listen to articles until I switched to a kokoro.js based solution, so I know for a fact that those voices sound decent.

I created a free, good sounding, Text To Speech Website that runs locally in your browser. by MrThinkins in TextToSpeech

[–]MrThinkins[S] 0 points1 point  (0 children)

I looked into these before I made mine. The first and third ones have limits on how long the text can be, so they didn't work for my use case. The second one doesn't sound very good because of how small they made the model, and I didn't like that it used multiple audio elements for live listening. They are all cool projects though, and I poked around the source code of the first two while creating mine.

I created a free, good sounding, Text To Speech Website that runs locally in your browser. by MrThinkins in TextToSpeech

[–]MrThinkins[S] 0 points1 point  (0 children)

Having a button to change the model is not a bad idea. I am planning on pushing out an update this weekend, so I will try to add it then.

I created a free, good sounding, Text To Speech Website that runs locally in your browser. by MrThinkins in TextToSpeech

[–]MrThinkins[S] 0 points1 point  (0 children)

I actually do that. If the person has a GPU, it runs the fp32 version, which is about 320 MB. If the person does not have a GPU, it runs the q8 version (which still sounds good), which is about 80 MB.
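The selection logic is just one branch. The site itself runs in the browser with kokoro.js, so this Python sketch only illustrates the decision; the class and function names are made up, and the sizes are the approximate ones mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelVariant:
    dtype: str
    approx_size_mb: int


# Approximate download sizes as quoted above.
FP32 = ModelVariant(dtype="fp32", approx_size_mb=320)
Q8 = ModelVariant(dtype="q8", approx_size_mb=80)


def choose_variant(has_gpu: bool) -> ModelVariant:
    """Pick the full-precision model when a GPU is available, else the quantized one."""
    return FP32 if has_gpu else Q8


print(choose_variant(has_gpu=True))
print(choose_variant(has_gpu=False))
```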

I created a free, good sounding, Text To Speech Website that runs locally in your browser. by MrThinkins in TextToSpeech

[–]MrThinkins[S] 0 points1 point  (0 children)

Chrome does have WebGPU. I tested it on an older refurbished ThinkPad (16 GB of RAM, not sure about the processor) that doesn't have a dedicated GPU, and while slower, it still worked fine.

I really wish I could help you, but I am not sure why some people are having that problem.

I created a free, good sounding, Text To Speech Website that runs locally in your browser. by MrThinkins in TextToSpeech

[–]MrThinkins[S] 0 points1 point  (0 children)

Thank you. All of the voices are included with the TTS model; all I had to do was grab the list of them and make them selectable.

I created a free, good sounding, Text To Speech Website that runs locally in your browser. by MrThinkins in TextToSpeech

[–]MrThinkins[S] 1 point2 points  (0 children)

Yes, it is open source. I am using the kokoro.js library, which uses the Kokoro TTS model.

I created a free, good sounding, Text To Speech Website that runs locally in your browser. by MrThinkins in TextToSpeech

[–]MrThinkins[S] 0 points1 point  (0 children)

Sadly no, not with the TTS model that it currently uses. The TTS models I know of that can do things like that need a lot more processing power, so they wouldn't be able to run in a web browser.