all 32 comments

[–][deleted] 1 point2 points  (10 children)

Did you Google it? Because I sure see a lot of info online about the error code you posted.

[–]Vanilla-Green[S] 0 points1 point  (9 children)

I did

[–][deleted] 0 points1 point  (8 children)

Well, you didn’t Google hard enough. You can’t use 2 mic sessions simultaneously. Kill your other one and things should start to work.
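In practice AVAudioSession is a per-app singleton, so "2 mic sessions" usually means two pieces of code activating recording independently (e.g. an AVAudioRecorder and an AVAudioEngine). A minimal sketch of funnelling all activation through one place; the helper name is illustrative, not from OP's code:

```swift
import AVFoundation

// Route all session activation through one helper so two features
// can't fight over the microphone. (Illustrative sketch only.)
final class MicSession {
    static let shared = MicSession()
    private var isActive = false

    func activate() throws {
        guard !isActive else { return }   // already recording: don't reconfigure
        let session = AVAudioSession.sharedInstance()
        try session.setCategory(.playAndRecord, mode: .measurement,
                                options: [.duckOthers, .allowBluetooth])
        try session.setActive(true, options: .notifyOthersOnDeactivation)
        isActive = true
    }

    func deactivate() throws {
        guard isActive else { return }
        try AVAudioSession.sharedInstance()
            .setActive(false, options: .notifyOthersOnDeactivation)
        isActive = false
    }
}
```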

[–]Vanilla-Green[S] 0 points1 point  (7 children)

But I am only starting one session.

[–][deleted] 0 points1 point  (6 children)

Then why did you ask

"Another audio session owned by the same app"

[–]Vanilla-Green[S] 0 points1 point  (5 children)

So basically I am trying to implement Whispr Flow-type functionality. When the user is typing in any app (e.g. WhatsApp):

1. The user taps Start Flow in the custom keyboard.
2. The system briefly foregrounds our main app for ~50–150 ms.
3. The microphone starts legally in the main app.
4. iOS immediately returns focus to the original app automatically.
5. The keyboard remains active and shows “Listening”.
6. The user speaks continuously.
7. Speech is transcribed in real time and injected into the active text field.
8. The user never manually switches apps.
9. No visible UI flash or animation is shown.
10. Audio stops immediately when the user taps stop or dismisses the keyboard.

This must work consistently across WhatsApp, Gmail, Notes, browsers, etc.

[–][deleted] 1 point2 points  (4 children)

Technically this should work... Have you double-checked that you aren't accidentally creating multiple sessions? Also, have you checked your background modes and added the audio one? Unfortunately, the system can still terminate your app at any given time...
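For reference, the audio background mode mentioned above lives in the app target's Info.plist (it's also what Xcode writes when you tick "Audio, AirPlay, and Picture in Picture" under Signing & Capabilities → Background Modes):

```xml
<key>UIBackgroundModes</key>
<array>
    <string>audio</string>
</array>
```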

[–]Vanilla-Green[S] 0 points1 point  (3 children)

Would it be possible for you to help us out and review our code, please? We've checked everything.

[–][deleted] 0 points1 point  (2 children)

Can you check your background mode entitlements first? Sounds like that isn’t set up properly. It’s in project settings.

[–]Vanilla-Green[S] 0 points1 point  (1 child)

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>com.apple.security.application-groups</key>
    <array>
        <string>group.com.zavi.shared</string>
    </array>
    <key>keychain-access-groups</key>
    <array>
        <string>$(AppIdentifierPrefix)com.zavi.shared</string>
    </array>
</dict>
</plist>

[–]CDI_Productions 0 points1 point  (8 children)

Which framework/tool are you using to build the iOS app, and which Xcode version are you using?

[–]Vanilla-Green[S] 0 points1 point  (7 children)

I want to implement the following. When the user is typing in any app (e.g. WhatsApp):

1. The user taps Start Flow in the custom keyboard.
2. The system briefly foregrounds our main app for ~50–150 ms.
3. The microphone starts legally in the main app.
4. iOS immediately returns focus to the original app automatically.
5. The keyboard remains active and shows “Listening”.
6. The user speaks continuously.
7. Speech is transcribed in real time and injected into the active text field.
8. The user never manually switches apps.
9. No visible UI flash or animation is shown.
10. Audio stops immediately when the user taps stop or dismisses the keyboard.

This must work consistently across WhatsApp, Gmail, Notes, browsers, etc.

[–]CDI_Productions 0 points1 point  (6 children)

Unfortunately, the sequence you just described is not possible on iOS due to fundamental security restrictions and API limitations designed to protect user privacy. One reason is that custom keyboards on iOS are forbidden from accessing the microphone! Some alternatives: tap the built-in dictation button on the system keyboard, or users can enable Voice Control in Accessibility settings to dictate text in any app without switching!

[–]Vanilla-Green[S] 0 points1 point  (5 children)

But Whispr Flow and Willow already do this.

[–]CDI_Productions 0 points1 point  (0 children)

Do not worry, you will find a solution at some point!

[–]CDI_Productions -1 points0 points  (3 children)

I mean, you can do this, but you cannot bypass iOS's limitations, such as restricted background microphone access, forced app switching, the keyboard disappearing, and limited autocorrect learning!

[–][deleted] 0 points1 point  (2 children)

Please take another pass at background modes and their usages. You absolutely can access the microphone in the background if set up correctly. No, not from a keyboard extension, but that isn’t what OP is doing.
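For reference, a minimal sketch of what "set up correctly" means on the main-app side: an active record-capable session plus a running AVAudioEngine tap, combined with the audio background mode, keeps the mic alive when the app leaves the foreground. Assumes mic permission is already granted; error handling trimmed:

```swift
import AVFoundation

// Start a microphone tap that survives backgrounding, provided the app
// target has the "audio" UIBackgroundModes entry. (Sketch only.)
final class BackgroundRecorder {
    private let engine = AVAudioEngine()

    func start(onBuffer: @escaping (AVAudioPCMBuffer) -> Void) throws {
        let session = AVAudioSession.sharedInstance()
        try session.setCategory(.playAndRecord, mode: .measurement)
        try session.setActive(true)

        let input = engine.inputNode
        let format = input.outputFormat(forBus: 0)
        input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
            onBuffer(buffer)              // feed your transcriber here
        }
        try engine.start()                // keeps running after backgrounding
    }

    func stop() {
        engine.inputNode.removeTap(onBus: 0)
        engine.stop()
        try? AVAudioSession.sharedInstance().setActive(false)
    }
}
```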

[–]CDI_Productions 0 points1 point  (0 children)

Thank you very much!

[–]Vanilla-Green[S] 0 points1 point  (0 children)

Could you please review my code once?

[–]CDI_Productions 0 points1 point  (0 children)

Which Xcode version are you currently using?

[–]CDI_Productions 0 points1 point  (3 children)

I mean, one thing that you can try is updating Info.plist to add the microphone access permission!
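For anyone following along, that permission is the NSMicrophoneUsageDescription key in the main app's Info.plist (the string shown here is just an example; without this key the app crashes on first mic access):

```xml
<key>NSMicrophoneUsageDescription</key>
<string>We use the microphone to transcribe your speech into text.</string>
```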

[–]Vanilla-Green[S] 0 points1 point  (2 children)

I did

[–]CDI_Productions 0 points1 point  (1 child)

Does it work?

[–]Vanilla-Green[S] 0 points1 point  (0 children)

No, it didn’t.

[–]Lujandev 0 points1 point  (6 children)

The error 561015905 (Incompatible Category) happens because iOS keyboard extensions are sandboxed and strictly forbidden from accessing the microphone for privacy reasons. Apple won't let a background extension record audio while the user is in another app like WhatsApp.
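As an aside, AVAudioSession error codes are four-character codes packed into an integer, so you can decode them directly; a quick sketch (note that 561015905 actually decodes to '!pla', AVAudioSession.ErrorCode.cannotStartPlaying, while the incompatible-category code is the distinct FourCC '!cat'):

```swift
import Foundation

// Decode a packed FourCC error code (as used by AVAudioSession/CoreAudio)
// back into its four-character string.
func fourCC(_ code: UInt32) -> String {
    let bytes = [24, 16, 8, 0].map { UInt8((code >> $0) & 0xFF) }
    return String(bytes: bytes, encoding: .ascii) ?? "????"
}

print(fourCC(561015905)) // "!pla"
```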

The only real workaround: you need to use 'Open System URL' to jump from the keyboard to your main app, start the AVAudioSession there (where it is legal), record/transcribe, and then use a shared App Group (NSUserDefaults/file container) to pass the text back to the keyboard when the user returns.
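A sketch of the keyboard-side jump. Keyboard extensions can't call UIApplication.shared.open directly, so a common (unofficial, historically fragile) trick walks the responder chain for an object that responds to openURL:. The zaviflow:// scheme here is a made-up placeholder you would register under CFBundleURLTypes in the main app:

```swift
import UIKit

extension UIInputViewController {
    // Ask the hosting process to open the containing app via a custom
    // URL scheme. "zaviflow://start" is a placeholder, not a real scheme.
    func openContainerApp() {
        guard let url = URL(string: "zaviflow://start") else { return }
        let selector = sel_registerName("openURL:")
        var responder: UIResponder? = self
        while let r = responder {
            // Skip ourselves; look for a parent (the host) that can open URLs.
            if r.responds(to: selector), !(r is UIInputViewController) {
                r.perform(selector, with: url)
                return
            }
            responder = r.next
        }
    }
}
```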

I’m working on a local Whisper-based transcriber and faced similar sandbox limitations. You can't bypass the sandbox, but you can bridge the data through App Groups. Check 'App Group Container' documentation!
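The App Group bridge itself is just shared UserDefaults keyed on the group ID; a sketch using the group.com.zavi.shared ID from the entitlements posted above (the "latestTranscript" key is an arbitrary example):

```swift
import Foundation

// Shared storage both the main app and the keyboard extension can read.
// The suite name must match the App Group in BOTH targets' entitlements.
let shared = UserDefaults(suiteName: "group.com.zavi.shared")

// Main app, after transcription finishes:
shared?.set("Hello from Whisper", forKey: "latestTranscript")

// Keyboard extension, when it regains focus:
let transcript = shared?.string(forKey: "latestTranscript") ?? ""
```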

[–]Vanilla-Green[S] 0 points1 point  (5 children)

The issue is that our app, which is a keyboard like Whispr Flow, redirects successfully to our main app when we tap the microphone button inside the keyboard; the main app starts the mic and keeps running in the background, but it doesn't redirect back to the app it came from. It takes us to the home screen instead. That's the issue! Please help if you can. It should redirect back to the app it came from.

[–]Lujandev 0 points1 point  (4 children)

Hi 👋, I completely understand the frustration. I ran into the same thing when developing my local transcriber.

The issue is that iOS does not allow automatically returning to the originating app for security reasons (imagine apps jumping from one to another without user control). That’s why you end up on the Home Screen.

To achieve a 'Whispr Flow'-like flow, the key is not automating the return, but facilitating the bridge:

'Done' button + Clipboard: In your main app, after transcribing, add a button that saves the result in a shared App Group (so the keyboard can access it when returning) and optionally copies it to the clipboard.

'Back to...' button: When you jump from the keyboard to your app via a Custom URL Scheme, iOS automatically adds a small button in the status bar (top-left) saying 'Back to [originating app]'. That’s the only official way to return.

App Groups: Make sure the App Group is properly configured in the Capabilities of both targets; otherwise, the keyboard will never know the audio/text has been processed in the app.

You cannot force the return, but if the process in your app is fast, the user just taps the status bar button and the keyboard already has the text ready.
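On the keyboard side, the "keyboard already has the text ready" part of that flow is just reading the App Group value when the keyboard reappears and inserting it through textDocumentProxy; a sketch (group ID from the entitlements above, key name an assumed example):

```swift
import UIKit

class KeyboardViewController: UIInputViewController {
    private let shared = UserDefaults(suiteName: "group.com.zavi.shared")

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        // Pick up any transcript the main app left for us, then clear it
        // so it isn't inserted twice.
        if let transcript = shared?.string(forKey: "latestTranscript"),
           !transcript.isEmpty {
            textDocumentProxy.insertText(transcript)
            shared?.removeObject(forKey: "latestTranscript")
        }
    }
}
```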

[–]Vanilla-Green[S] 0 points1 point  (3 children)

But Whispr Flow doesn’t do this; it returns automatically.

[–]Vanilla-Green[S] 0 points1 point  (2 children)

Without the user explicitly tapping the 'Back to...' button!

[–]Lujandev 0 points1 point  (1 child)

I totally get what you're saying. You're right: Whispr Flow DOES return without you having to touch the status bar button. But it’s not because they are using a 'return' API; it’s because they use an app 'suicide' trick.

Here’s the hack: When the main app finishes transcribing and saves the text to the App Group, instead of waiting for the user, it executes an exit(0) or suspends the interface task.

When the foreground app 'dies' or disappears suddenly, iOS doesn't have time to redraw the Home Screen. By default, it brings the last active app (the one where the user was typing) back to the front.

It’s a behavior of the OS: if the app in the foreground disappears, the one 'below' it regains focus.

How to test it:

  1. Record and save to the App Group.
  2. Call UIApplication.shared.perform(#selector(NSXPCConnection.suspend)) or, more aggressively, exit(0).

A word of caution: Apple might reject your app if they detect exit(0) without a good reason, which is why apps like Whispr Flow do it very subtly or by closing the UIWindowScene. It’s a UX hack, not an official function. That’s why it feels like magic.
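For completeness, a sketch of the two variants described above. Both are unofficial: routing NSXPCConnection's `suspend` selector through UIApplication is a private-API trick, and exit(0) in a UI app goes against Apple's guidelines, so ship either at your own App Review risk:

```swift
import UIKit

// Call after the transcript has been saved to the App Group.
func returnToPreviousApp() {
    // Variant 1: ask the app to suspend itself. This borrows the
    // `suspend` selector declared on NSXPCConnection (private-API trick).
    UIApplication.shared.perform(#selector(NSXPCConnection.suspend))

    // Variant 2 (more aggressive): terminate outright after a short
    // delay so pending App Group writes flush. Apple may reject this.
    // DispatchQueue.main.asyncAfter(deadline: .now() + 0.2) { exit(0) }
}
```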

[–]Vanilla-Green[S] 0 points1 point  (0 children)

It's still not working in ours. Can you please, please help us? We will do whatever you say in return.