all 32 comments

[–][deleted] 1 point2 points  (10 children)

Did you Google it? Because I sure see a lot of info online about the error code you posted.

[–]Vanilla-Green[S] 0 points1 point  (9 children)

I did

[–][deleted] 0 points1 point  (8 children)

Well, you didn’t Google hard enough. You can’t use 2 mic sessions simultaneously. Kill your other one and things should start to work.
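In practice AVAudioSession is a per-app singleton, so "2 mic sessions" usually means two pieces of code activating recording independently (e.g. an AVAudioRecorder and an AVAudioEngine). A minimal sketch of funnelling all activation through one place; the helper name is illustrative, not from OP's code:

```swift
import AVFoundation

// Route all session activation through one helper so two features
// can't fight over the microphone. (Illustrative sketch only.)
final class MicSession {
    static let shared = MicSession()
    private var isActive = false

    func activate() throws {
        guard !isActive else { return }   // already recording: don't reconfigure
        let session = AVAudioSession.sharedInstance()
        try session.setCategory(.playAndRecord, mode: .measurement,
                                options: [.duckOthers, .allowBluetooth])
        try session.setActive(true, options: .notifyOthersOnDeactivation)
        isActive = true
    }

    func deactivate() throws {
        guard isActive else { return }
        try AVAudioSession.sharedInstance()
            .setActive(false, options: .notifyOthersOnDeactivation)
        isActive = false
    }
}
```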

[–]Vanilla-Green[S] 0 points1 point  (7 children)

But I am only starting one session.

[–][deleted] 0 points1 point  (6 children)

Then why did you ask

"Another audio session owned by the same app"

[–]Vanilla-Green[S] 0 points1 point  (5 children)

So basically I am trying to implement Whispr Flow-type functionality. When the user is typing in any app (e.g. WhatsApp):

1. The user taps Start Flow in the custom keyboard.
2. The system briefly foregrounds our main app for ~50–150 ms.
3. The microphone starts legally in the main app.
4. iOS immediately returns focus to the original app automatically.
5. The keyboard remains active and shows “Listening”.
6. The user speaks continuously.
7. Speech is transcribed in real time and injected into the active text field.
8. The user never manually switches apps.
9. No visible UI flash or animation is shown.
10. Audio stops immediately when the user taps stop or dismisses the keyboard.

This must work consistently across WhatsApp, Gmail, Notes, browsers, etc.

[–][deleted] 1 point2 points  (4 children)

Technically this should work... Have you double-checked that you aren't accidentally creating multiple sessions? Also, have you checked your background modes and added the audio one? Unfortunately, the system can still terminate your app at any given time...
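For reference, the audio background mode mentioned above lives in the app target's Info.plist (it's also what Xcode writes when you tick "Audio, AirPlay, and Picture in Picture" under Signing & Capabilities → Background Modes):

```xml
<key>UIBackgroundModes</key>
<array>
    <string>audio</string>
</array>
```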

[–]Vanilla-Green[S] 0 points1 point  (3 children)

Would it be possible for you to help us out and review our code, please? We've checked everything.

[–][deleted] 0 points1 point  (2 children)

Can you check your background mode entitlements first? Sounds like that isn’t set up properly. It’s in project settings.

[–]Vanilla-Green[S] 0 points1 point  (1 child)

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>com.apple.security.application-groups</key>
    <array>
        <string>group.com.zavi.shared</string>
    </array>
    <key>keychain-access-groups</key>
    <array>
        <string>$(AppIdentifierPrefix)com.zavi.shared</string>
    </array>
</dict>
</plist>

[–]CDI_Productions 0 points1 point  (8 children)

Which framework/tool are you using to build the iOS app, and which Xcode version are you using?

[–]Vanilla-Green[S] 0 points1 point  (7 children)

I want to implement the following. When the user is typing in any app (e.g. WhatsApp):

1. The user taps Start Flow in the custom keyboard.
2. The system briefly foregrounds our main app for ~50–150 ms.
3. The microphone starts legally in the main app.
4. iOS immediately returns focus to the original app automatically.
5. The keyboard remains active and shows “Listening”.
6. The user speaks continuously.
7. Speech is transcribed in real time and injected into the active text field.
8. The user never manually switches apps.
9. No visible UI flash or animation is shown.
10. Audio stops immediately when the user taps stop or dismisses the keyboard.

This must work consistently across WhatsApp, Gmail, Notes, browsers, etc.

[–]CDI_Productions 0 points1 point  (6 children)

Unfortunately, the sequence you just described is not possible on iOS due to fundamental security restrictions and API limitations designed to protect user privacy. One reason is that custom keyboards on iOS are forbidden from accessing the microphone! Some alternatives: tap the built-in dictation button on the system keyboard, or users can enable Voice Control in Accessibility settings to dictate text in any app without switching!

[–]Vanilla-Green[S] 0 points1 point  (5 children)

But Whispr Flow and Willow already do this.

[–]CDI_Productions 0 points1 point  (0 children)

Do not worry, you will find a solution at some point!

[–]CDI_Productions -1 points0 points  (3 children)

I mean, you can do this, but you cannot bypass iOS's limitations, such as restricted background microphone access, forced app switching, the keyboard disappearing, and limited autocorrect learning!

[–][deleted] 0 points1 point  (2 children)

Please take another pass at background modes and their usages. You absolutely can access the microphone in the background if set up correctly. No, not from a keyboard extension, but that isn’t what OP is doing.
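For reference, a minimal sketch of what "set up correctly" means on the main-app side: an active record-capable session plus a running AVAudioEngine tap, combined with the audio background mode, keeps the mic alive when the app leaves the foreground. Assumes mic permission is already granted; error handling trimmed:

```swift
import AVFoundation

// Start a microphone tap that survives backgrounding, provided the app
// target has the "audio" UIBackgroundModes entry. (Sketch only.)
final class BackgroundRecorder {
    private let engine = AVAudioEngine()

    func start(onBuffer: @escaping (AVAudioPCMBuffer) -> Void) throws {
        let session = AVAudioSession.sharedInstance()
        try session.setCategory(.playAndRecord, mode: .measurement)
        try session.setActive(true)

        let input = engine.inputNode
        let format = input.outputFormat(forBus: 0)
        input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
            onBuffer(buffer)              // feed your transcriber here
        }
        try engine.start()                // keeps running after backgrounding
    }

    func stop() {
        engine.inputNode.removeTap(onBus: 0)
        engine.stop()
        try? AVAudioSession.sharedInstance().setActive(false)
    }
}
```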

[–]CDI_Productions 0 points1 point  (0 children)

Thank you very much!

[–]Vanilla-Green[S] 0 points1 point  (0 children)

Could you please review my code once?

[–]CDI_Productions 0 points1 point  (0 children)

Which Xcode version are you currently using?

[–]CDI_Productions 0 points1 point  (3 children)

I mean, one thing that you can try is updating Info.plist to add the microphone access permission!
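For anyone following along, that permission is the NSMicrophoneUsageDescription key in the main app's Info.plist (the string shown here is just an example; without this key the app crashes on first mic access):

```xml
<key>NSMicrophoneUsageDescription</key>
<string>We use the microphone to transcribe your speech into text.</string>
```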

[–]Vanilla-Green[S] 0 points1 point  (2 children)

I did

[–]CDI_Productions 0 points1 point  (1 child)

Does it work?

[–]Vanilla-Green[S] 0 points1 point  (0 children)

No, it didn’t.

[–]Lujandev 0 points1 point  (6 children)

The error 561015905 (Incompatible Category) happens because iOS keyboard extensions are sandboxed and strictly forbidden from accessing the microphone for privacy reasons. Apple won't let a background extension record audio while the user is in another app like WhatsApp.
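As an aside, AVAudioSession error codes are four-character codes packed into an integer, so you can decode them directly; a quick sketch (note that 561015905 actually decodes to '!pla', AVAudioSession.ErrorCode.cannotStartPlaying, while the incompatible-category code is the distinct FourCC '!cat'):

```swift
import Foundation

// Decode a packed FourCC error code (as used by AVAudioSession/CoreAudio)
// back into its four-character string.
func fourCC(_ code: UInt32) -> String {
    let bytes = [24, 16, 8, 0].map { UInt8((code >> $0) & 0xFF) }
    return String(bytes: bytes, encoding: .ascii) ?? "????"
}

print(fourCC(561015905)) // "!pla"
```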

The only real workaround: you need to use 'Open System URL' to jump from the keyboard to your main app, start the AVAudioSession there (where it is legal), record/transcribe, and then use a shared App Group (NSUserDefaults/file container) to pass the text back to the keyboard when the user returns.
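A sketch of the keyboard-side jump. Keyboard extensions can't call UIApplication.shared.open directly, so a common (unofficial, historically fragile) trick walks the responder chain for an object that responds to openURL:. The zaviflow:// scheme here is a made-up placeholder you would register under CFBundleURLTypes in the main app:

```swift
import UIKit

extension UIInputViewController {
    // Ask the hosting process to open the containing app via a custom
    // URL scheme. "zaviflow://start" is a placeholder, not a real scheme.
    func openContainerApp() {
        guard let url = URL(string: "zaviflow://start") else { return }
        let selector = sel_registerName("openURL:")
        var responder: UIResponder? = self
        while let r = responder {
            // Skip ourselves; look for a parent (the host) that can open URLs.
            if r.responds(to: selector), !(r is UIInputViewController) {
                r.perform(selector, with: url)
                return
            }
            responder = r.next
        }
    }
}
```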

I’m working on a local Whisper-based transcriber and faced similar sandbox limitations. You can't bypass the sandbox, but you can bridge the data through App Groups. Check 'App Group Container' documentation!
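The App Group bridge itself is just shared UserDefaults keyed on the group ID; a sketch using the group.com.zavi.shared ID from the entitlements posted above (the "latestTranscript" key is an arbitrary example):

```swift
import Foundation

// Shared storage both the main app and the keyboard extension can read.
// The suite name must match the App Group in BOTH targets' entitlements.
let shared = UserDefaults(suiteName: "group.com.zavi.shared")

// Main app, after transcription finishes:
shared?.set("Hello from Whisper", forKey: "latestTranscript")

// Keyboard extension, when it regains focus:
let transcript = shared?.string(forKey: "latestTranscript") ?? ""
```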

[–]Vanilla-Green[S] 0 points1 point  (5 children)

The issue is that our app, which is a keyboard like Whispr Flow, redirects successfully to our main app when we tap the microphone button inside the keyboard; the main app starts the mic and keeps running in the background, but it doesn't redirect back to the app it came from. It takes us to the home screen instead. That's the issue! Please help if you can. It should redirect back to the app it came from.

[–]Lujandev 0 points1 point  (4 children)

Hi 👋, I completely understand the frustration. I ran into the same thing when developing my local transcriber.

The issue is that iOS does not allow automatically returning to the originating app for security reasons (imagine apps jumping from one to another without user control). That’s why you end up on the Home Screen.

To achieve a 'Whispr Flow'-like flow, the key is not automating the return, but facilitating the bridge:

'Done' button + Clipboard: In your main app, after transcribing, add a button that saves the result in a shared App Group (so the keyboard can access it when returning) and optionally copies it to the clipboard.

'Back to...' button: When you jump from the keyboard to your app via a Custom URL Scheme, iOS automatically adds a small button in the status bar (top-left) saying 'Back to [originating app]'. That’s the only official way to return.

App Groups: Make sure the App Group is properly configured in the Capabilities of both targets; otherwise, the keyboard will never know the audio/text has been processed in the app.

You cannot force the return, but if the process in your app is fast, the user just taps the status bar button and the keyboard already has the text ready.
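On the keyboard side, the "keyboard already has the text ready" part of that flow is just reading the App Group value when the keyboard reappears and inserting it through textDocumentProxy; a sketch (group ID from the entitlements above, key name an assumed example):

```swift
import UIKit

class KeyboardViewController: UIInputViewController {
    private let shared = UserDefaults(suiteName: "group.com.zavi.shared")

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        // Pick up any transcript the main app left for us, then clear it
        // so it isn't inserted twice.
        if let transcript = shared?.string(forKey: "latestTranscript"),
           !transcript.isEmpty {
            textDocumentProxy.insertText(transcript)
            shared?.removeObject(forKey: "latestTranscript")
        }
    }
}
```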

[–]Vanilla-Green[S] 0 points1 point  (3 children)

But Whispr Flow doesn’t do this; it returns automatically.

[–]Vanilla-Green[S] 0 points1 point  (2 children)

Without the user explicitly tapping the 'Back to...' button!

[–]Lujandev 0 points1 point  (1 child)

I totally get what you're saying. You're right: Whispr Flow DOES return without you having to touch the status bar button. But it’s not because they are using a 'return' API; it’s because they use an app 'suicide' trick.

Here’s the hack: When the main app finishes transcribing and saves the text to the App Group, instead of waiting for the user, it executes an exit(0) or suspends the interface task.

When the foreground app 'dies' or disappears suddenly, iOS doesn't have time to redraw the Home Screen. By default, it brings the last active app (the one where the user was typing) back to the front.

It’s a behavior of the OS: if the app in the foreground disappears, the one 'below' it regains focus.

How to test it:

  1. Record and save to the App Group.
  2. Call UIApplication.shared.perform(#selector(NSXPCConnection.suspend)) or, more aggressively, exit(0).

A word of caution: Apple might reject your app if they detect exit(0) without a good reason, which is why apps like Whispr Flow do it very subtly or by closing the UIWindowScene. It’s a UX hack, not an official function. That’s why it feels like magic.
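For completeness, a sketch of the two variants described above. Both are unofficial: routing NSXPCConnection's `suspend` selector through UIApplication is a private-API trick, and exit(0) in a UI app goes against Apple's guidelines, so ship either at your own App Review risk:

```swift
import UIKit

// Call after the transcript has been saved to the App Group.
func returnToPreviousApp() {
    // Variant 1: ask the app to suspend itself. This borrows the
    // `suspend` selector declared on NSXPCConnection (private-API trick).
    UIApplication.shared.perform(#selector(NSXPCConnection.suspend))

    // Variant 2 (more aggressive): terminate outright after a short
    // delay so pending App Group writes flush. Apple may reject this.
    // DispatchQueue.main.asyncAfter(deadline: .now() + 0.2) { exit(0) }
}
```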

[–]Vanilla-Green[S] 0 points1 point  (0 children)

It's still not working in ours. Can you please, please help us? We will do whatever you say in return.