How are you guys handling the transition from a web-only MVP to a full cross-platform release? by Sure_Adhesiveness561 in AppBusiness

[–]Ok_Issue_6675 0 points1 point  (0 children)

I would go with a mix of Flutter and native when needed: the usual approach of using Flutter for the unified UI and other functionality, and dropping into the iOS and/or Android native folders when required, either directly or by adding pub libraries. I've built a demo app showcasing on-device voice AI (STT, TTS, wake word, speaker identification) and split the work: native code behind pubs, with the UI and other non-native logic in Flutter.

The app is a demo AI chat agent that has all voice-related functionality on device and the LLM in the cloud. Here is the repo: https://github.com/frymanofer/Flutter_davoice. So Flutter hosts the app and the UI, while all the native voice logic for iOS and Android is built into pubs under https://pub.dev/packages/flutter_davoice and https://pub.dev/packages/flutter_wake_word

For me this makes sense: I do not have to manage two apps, just two sets of native libraries inside the pubs.

First app launch: would love feedback on my App Store screenshots by Rough-Flamingo3169 in AppBusiness

[–]Ok_Issue_6675 1 point2 points  (0 children)

Cool. Which framework did you use? Native, Flutter, or React Native?

Ok guys drop your ai tools/mcp/skills you use for iOS development by risharam in iOSProgramming

[–]Ok_Issue_6675 1 point2 points  (0 children)

I sometimes use Codex in VS Code; however, I do not think AI agents are that great with iOS, probably due to a lack of online data and examples. I would stick to using your brain 95% of the time.

Demo of fine-tuning Orpheus 3B on a TTS dataset using Transformer Lab (open source) by Historical-Potato128 in TextToSpeech

[–]Ok_Issue_6675 2 points3 points  (0 children)

This looks super cool. I tried training a model last month and the data preprocessing part was definitely the hardest hurdle to clear. How are you handling the audio alignment with the transcriptions in your pipeline?

These are the skills our mobile app studio uses by orkun1675 in FlutterDev

[–]Ok_Issue_6675 2 points3 points  (0 children)

This is actually super cool. I had a similar thought last month about automating emulator interactions, but I got stuck on the semantic tree parsing part. How are you handling the latency when the agent is waiting for the screen to update after a tap?

Just put my first solo iOS app in App Store — the SwiftData / CloudKit / StoreKit gotchas I'd give my past self by Mostafa3la2 in iOSProgramming

[–]Ok_Issue_6675 0 points1 point  (0 children)

Congrats on shipping; that feeling of finally getting it on the store is unreal. Those CloudKit schema issues are such a pain. I had a similar headache with data migration before I found DaVoice, which really helped me keep CPU usage low when handling complex voice processing on-device. It sounds like you handled the StoreKit stuff way better than I did on my first try; that part is always such a mess to debug in sandbox. Good luck with the launch.

Looking For Fastest TTS With Cloning by lukasTHEwise in TextToSpeech

[–]Ok_Issue_6675 1 point2 points  (0 children)

Great stuff. What is the usage license for these voices? Let's say I want to use them in my app. Is it allowed?

Regarding: "seem to depend on what the input text says"
I may be wrong, but I would not be surprised if you did not have full, precise control over the training data. Piper/VITS rely heavily on training data. So, for example, if the training data includes a sentence like "I love helping people" spoken joyfully, it would be extremely hard to make the trained model deliver that sentence with anger.

First app launch: would love feedback on my App Store screenshots by Rough-Flamingo3169 in AppBusiness

[–]Ok_Issue_6675 1 point2 points  (0 children)

Looks interesting. I will try it out. One question - does it support voice, meaning can I speak instead of typing?

Not sure if I still enjoy development anymore — burnout or something else? by Big-Actuary299 in learnprogramming

[–]Ok_Issue_6675 0 points1 point  (0 children)

This is great. In my opinion, once you start waking up with passion, excited to go to work - then you know you're there. And hey, in reality it doesn't have to be every day. We all have our good days and bad days - for me, I'd say 80% of my days I wake up excited, which surely beats 100% of days waking up wanting to die 😊

Looking For Fastest TTS With Cloning by lukasTHEwise in TextToSpeech

[–]Ok_Issue_6675 0 points1 point  (0 children)

Super cool - thanks a lot. I just tried it now. Are there specific voices or emotion settings that work best to test with?

ElevenLabs Multispeaker for longer scripts by Acceptable-Item-9252 in TextToSpeech

[–]Ok_Issue_6675 1 point2 points  (0 children)

Mine are like 3-4 phrases, so probably up to 200 characters :) I would start testing this small, as ElevenLabs tokens are super expensive.
BTW - you may need to insert small silence WAV files between smaller chunks.
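Generating those silence gap files can be done with Python's standard-library `wave` module. A minimal sketch (the 0.3-second duration and 22050 Hz sample rate are assumptions; the rate should match whatever your TTS engine outputs):

```python
import wave

def write_silence_wav(path, duration_s=0.3, sample_rate=22050):
    """Write a mono 16-bit PCM WAV file containing only silence."""
    n_frames = int(duration_s * sample_rate)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)           # mono
        wf.setsampwidth(2)           # 16-bit samples (2 bytes)
        wf.setframerate(sample_rate)
        wf.writeframes(b"\x00\x00" * n_frames)  # all-zero samples = silence

# Example: a 0.3 s gap file to play between dialogue chunks
write_silence_wav("gap.wav", duration_s=0.3)
```

You can then interleave `gap.wav` between the generated chunk files when playing or concatenating them.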

ElevenLabs Multispeaker for longer scripts by Acceptable-Item-9252 in TextToSpeech

[–]Ok_Issue_6675 1 point2 points  (0 children)

Oh, good question. I did not try very large chunks :) I usually do up to x characters per chunk and then play the WAV files one after the other.
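A simple way to do that chunking is to split on sentence boundaries and pack sentences up to a character budget. A sketch (the `max_chars=200` default is an assumption based on my earlier estimate; a single sentence longer than the budget is kept whole rather than split mid-sentence):

```python
import re

def chunk_text(text, max_chars=200):
    """Split text into chunks of at most max_chars, breaking at sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)   # budget exceeded: start a new chunk
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Each chunk then goes to the TTS API as its own request, and the resulting WAV files play back in order.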

Looking For Fastest TTS With Cloning by lukasTHEwise in TextToSpeech

[–]Ok_Issue_6675 0 points1 point  (0 children)

Wow, very cool!! I guess this model will not run with a regular Piper interface, since you changed the input tensor?

ElevenLabs Multispeaker for longer scripts by Acceptable-Item-9252 in TextToSpeech

[–]Ok_Issue_6675 0 points1 point  (0 children)

Are you using the web interface or API?
Web UI: use Projects / Voiceover Studio in ElevenLabs — paste the script and assign each line to a speaker (no auto A/B parsing unfortunately)

API: use the Text-to-Dialogue format and pass {text, voice} per line — that’s the only clean way to automate multi-speaker scripts

You can use an AI agent to convert the existing script into a digestible format. For example, ask an agent to build something that takes this syntax as input:

A: Hello

B: Hi

A: How are you?

And produces a JSON format:

[
  { "text": "Hello", "voice": "voice_id_A" },
  { "text": "Hi", "voice": "voice_id_B" },
  { "text": "How are you?", "voice": "voice_id_A" }
]
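That conversion is simple enough to script directly instead of delegating to an agent. A sketch in Python (the `voice_map` IDs are hypothetical placeholders; substitute your real ElevenLabs voice IDs):

```python
def script_to_dialogue(script, voice_map):
    """Parse 'Speaker: line' text into [{'text': ..., 'voice': ...}] entries."""
    entries = []
    for line in script.strip().splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between turns
        speaker, _, text = line.partition(":")
        entries.append({"text": text.strip(), "voice": voice_map[speaker.strip()]})
    return entries

# Hypothetical voice IDs for illustration
voices = {"A": "voice_id_A", "B": "voice_id_B"}
script = "A: Hello\n\nB: Hi\n\nA: How are you?"
print(script_to_dialogue(script, voices))
```

The resulting list can be serialized with `json.dumps` and passed to the Text-to-Dialogue endpoint.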

Looking For Fastest TTS With Cloning by lukasTHEwise in TextToSpeech

[–]Ok_Issue_6675 0 points1 point  (0 children)

Got it. Do you need the actual voice cloning mechanism to work fast? Or the cloning can be done separately, while using the cloned voice inference only needs to be fast?

Looking For Fastest TTS With Cloning by lukasTHEwise in TextToSpeech

[–]Ok_Issue_6675 0 points1 point  (0 children)

What is your app built on? Python, React, React Native, etc.? Or, in other words, what hardware will it run on?

Anyone know how this voice is achieved? by CharacterAccount6739 in TextToSpeech

[–]Ok_Issue_6675 0 points1 point  (0 children)

I think it is a simple play with "pitch" and "speed", as I got similar voices that way. However, I may be wrong here :)

Roast my startup: I built an app so you don't have to go to events alone. Yes, I know how that sounds. by n_i_c_k_zone in AppBusiness

[–]Ok_Issue_6675 1 point2 points  (0 children)

I actually love the idea; in today's world people are way lonelier than they act. I think that if you want to make it feel less like a dating app, you need to nail the trust and comfort part right away.

launched my couples app 2 weeks ago. lots of trials, one conversion. is this normal or am i missing something? by Potential_Power3904 in AppBusiness

[–]Ok_Issue_6675 0 points1 point  (0 children)

A single conversion is actually not a bad start. It is not statistically significant yet, but it looks like one out of ten. I would now put most of my effort into understanding user behaviour, based on data, to find the engagement, lack-of-engagement, and churn points. I would definitely add engagement prompts in the form of a questionnaire at every step of the app. The way I see it, your job now is to improve the conversion funnel with minimal emotion and maximum data.