So,
I've been wondering how fast a an offline STT -> ML Prompt -> TTS roundtrip would be.
Interestingly, for many tests, the SpeechTranscriber (STT) takes the bulk of the time, compared to generating a FoundationModel response and creating the Audio using TTS.
E.g.
InteractionStatistics:
- listeningStarted: 21:24:23 4480 2423
- timeTillFirstAboveNoiseFloor: 01.794
- timeTillLastNoiseAboveFloor: 02.383
- timeTillFirstSpeechDetected: 02.399
- timeTillTranscriptFinalized: 04.510
- timeTillFirstMLModelResponse: 04.938
- timeTillMLModelResponse: 05.379
- timeTillTTSStarted: 04.962
- timeTillTTSFinished: 11.016
- speechLength: 06.054
- timeToResponse: 02.578
- transcript: This is a test.
- mlModelResponse: Sure! I'm ready to help with your test. What do you need help with?
Here, between my audio input ending and the Text-2-Speech starting top play (using AVSpeechUtterance) the total response time was 2.5s.
Of that time, it took the SpeechAnalyzer 2.1s to get the transcript finalized, FoundationModel only took 0.4s to respond (and TTS started playing nearly instantly).
I'm already using reportingOptions: [.volatileResults, .fastResults] so it's probably as fast as possible right now?
I'm just surprised the STT takes so much longer compared to the other parts (all being CoreML based, aren't they?)
Explore the integration of media technologies within your app. Discuss working with audio, video, camera, and other media functionalities.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
On an iPhone running iOS 26 beta 5, url(for: FilePath("subdir/asset.mov")) most always throws this error:
The URL for “subdir/asset.mov” couldn’t be retrieved: “asset.mov” couldn’t be copied to “subdir” because an item with the same name already exists.
Yet, contents(at: FilePath("subdir/asset.mov")) always returns Data for a playable AVMovie.
How can I avoid this url(for:) error?
The asset pack in question is downloaded. The error persists even after pack deletion, redownload, relaunch, and combinations of that.
// Assets repo root
subdir.aar
subdir/asset.mov
subdir/asset_thumb.heic
subdir/Manifest.json
// Manifest.json
{
"assetPackID": "subdir",
"downloadPolicy": {
"onDemand": {}
},
"fileSelectors": [
{
"directory": "subdir",
},
],
"platforms": [
"iOS",
"visionOS"
]
}
xcrun ba-package subdir/Manifest.json -o subdir.aar
xcrun ba-serve --host 192.168.0.10 -p 443 subdir.aar
Hello Apple Developer Community,
I am seeking clarification on the intended display behavior of HLS audio tracks within the iOS 26 (or current beta) native player, specifically concerning the NAME and LANGUAGE attributes of the EXT-X-MEDIA tag.
In our HLS manifests, we define alternative audio tracks using EXT-X-MEDIA tags, like so:
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="ja",NAME="AUDIO-1",DEFAULT=YES,AUTOSELECT=YES,URI="audio_ja.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="ja",NAME="AUDIO-2",URI="audio_en.m3u8"
Our observation is that when an audio track is selected and its name is displayed in the native iOS media controls (e.g., Control Center or within a full-screen video player's UI), the value specified in the NAME attribute ("AUDIO-1", "AUDIO-2") does not seem to be used. Instead, the display appears to derive from the LANGUAGE attribute ("ja", "en"), often showing the system's localized string for that language (e.g., "Japanese", "English").
We would like to understand the official or intended behavior regarding this.
Is it the expected behavior for the iOS native player to prioritize the LANGUAGE attribute (or its localized equivalent) over the NAME attribute for displaying the selected audio track's label?
If this is the intended design, what is the recommended best practice for developers who wish to present a custom, human-readable name for audio tracks (beyond the standard language name) in the native iOS UI?
Are there any specific AVPlayer properties or AVMediaSelectionOption considerations that would allow more granular control over this display, or is this entirely managed by the system based on the LANGUAGE attribute?
Any insights or official guidance on this behavior in iOS 26 (and potentially previous versions) would be greatly appreciated.
Thank you for your time and assistance.
AVAudioSessionCategoryOptionAllowBluetooth is marked as deprecated in iOS 8 in iOS 26 beta 5 when this option was not deprecated in iOS 18.6. I think this is a mistake and the deprecation is in iOS 26. Am I right?
It seems that the substitute for this option is "AVAudioSessionCategoryOptionAllowBluetoothHFP". The documentation does not make clear if the behaviour is exactly the same or if any difference should be expected... Has anyone used this option in iOS 26? Should I expect any difference with the current behaviour of "AVAudioSessionCategoryOptionAllowBluetooth"?
Thank you.
I'm developing a photo backup app.
To detect newly added or edited photos since the app launched, I keep a local dictionary in the format [localIdentifier: modification_date].
However, PHAsset.modificationDate is not reliable.
It often changes unexpectedly, possibly due to system operations like iCloud metadata updates.
Is there a more reliable way to detect whether a photo has been modified by user since the last app launch?
I'm thinking about using content hash instead, but I'm not sure how heavy this operation is in terms of performance.
In the past, when using Lightning, many external devices had to go through MFi certification. However, since the iPhone 15 switched from Lightning to USB-C, is MFi certification still required?
Our company has developed several UVC devices, and we have confirmed that iPads can read frames from external cameras through the external device type in AVFoundation. However, this is not supported on iPhones.
We are currently exploring feasible ways to enable UVC device support on iPhones. Is MFi certification the only option? If so, is the MFi certification process for USB-C the same as it was for Lightning? Does it still require purchasing an MFi chip and manufacturing specially designed USB-C cables?
Add RPSystemBroadcastPickerView to the app,
After clicking, no method of SampleHandler is triggered
The documentation for the Apple Music API indicates that the genreNames field for a given artist (see https://developer.apple.com/documentation/applemusicapi/artists/attributes-data.dictionary) is an array of strings. However, it only appears as though you return ONE SINGLE GENRE per Artist, regardless of how many genres might be attached to that artist's albums.
Am I missing something? Is there an artist where multiple genres may be returned, or is this a bug in the documentation?
Hello,
I have an existing AUv3 instrument plugin. In the plug in, users can access files (audio files, song projects) via a UIDocumentPickerViewController
In Logic Pro, (and some other hosts, but not all), the document picker is unable to receive touches, while a keyboard case is attached to the iPad.
Removing the case (this is an Apple brand iPad case) allows the interactions to resume and allows me to pick files in the usual way.
One of my users reports this non-responsive behavior occurs even after disconnecting their keyboard.
I have fiddled with entitlements all day, and have determined that is not the issue, since the keyboard disconnection appears to fix it every time for me.
Here is my, very boilerplate, presentation code :
guard let type = UTType("com.my.type") else {
return
}
let fileBrowser = UIDocumentPickerViewController(forOpeningContentTypes: [type])
fileBrowser.overrideUserInterfaceStyle = .dark
fileBrowser.delegate = self
fileBrowser.directoryURL = myFileFolderURL()
self.present(fileBrowser, animated: true) {
Hello everyone,
I’m working on an iOS app that fetches videos from the "Recently Deleted" album using the Photos framework in Swift. However, I’m unable to fetch any videos, even though the "Recently Deleted" album contains 233 items (including videos), as seen in the Photos app.
Environment:
iOS Version: 18.3.1
Xcode Version: 16.2
Swift Version: Swift 5
Device: iPhone (simulator and physical device both tested)
Photo Library Permission: "All Photos" access granted
Recently Deleted Lock: Face ID/Passcode is disabled for "Recently Deleted"
Some users reported that their images are not loading correctly in our app. After a lot of debugging we identified the following:
This only happens when the app is build for Mac Catalyst. Not on iOS, iPadOS, or “real” macOS (AppKit).
The images in question have unusual color spaces. We observed the issue for uRGB and eciRGB v2.
Those images are rendered correctly in Photos and Preview on all platforms.
When displaying the image inside of a UIImageView or in a SwiftUI Image, they render correctly.
The issue only occurs when loading the image via Core Image.
When comparing the different Core Image render graphs between AppKit (working) and Catalyst (faulty) builds, they look identical—except for the result.
Mac (AppKit):
Catalyst:
Something seems to be off when Core Image tries to load an image with foreign color space in Catalyst.
We identified a workaround: By using a CGImageDestination to transcode the image using the kCGImageDestinationOptimizeColorForSharing option, Image I/O will convert the image to sRGB (or similar) and Core Image is able to load the image correctly. However, one potentially loses fidelity this way.
Or might there be a better workaround?
Topic:
Media Technologies
SubTopic:
Photos & Camera
Tags:
Image I/O
Photos and Imaging
Core Image
Core Graphics
When using AVSampleBufferDisplayLayer to play uncompressed H.264 and H.265 video with B-frames more than 7, frame drops occur. The more B-frames there are, the more noticeable the frame drops become, for example 15 bframes.
Use FFmpeg to transcode a video file with visible timestamps and frame numbers (x264 or x265 ):
ffmpeg -i test.mp4 -vf "drawtext=fontsize=45:text=%{pts} %{n}:y=400" -c:v libx264 -x264-params "bframes=15:b-adapt=0" -crf 30 -y x264_bf15.mp4
ffmpeg -i test.mp4 -vf "drawtext=fontsize=45:text=%{pts} %{n}:y=400" -c:v libx265 -x265-params "bframes=15:b-adapt=0" -crf 30 -y x265_bf15.mp4
Use the demo player from this repository to reproduce the issue: https://github.com/msfrms/CustomPlayer
frame drops can be observed. And following log can be found in devices console.
mediaserverd <<<< IQ-CA >>>> piqca_gmstats_dump: FIQCA(0x1266f4000) recent frames: enqueued: 184, displayed: 138, dropped: 42, flushed: 0, evicted: 3, >16ms late: 2
PS. I was using iphone11 iOS14.6, to replay this issue.
May I ask why frame drops occur in this case?
Is there any configuration or API usage change that could help fix the frame drop issue?
Many thanks!
Hi guys,
Can I use CMIO to achieve the following feature on macOS when a USB device (Camera/Mic/Speaker) is connected:
When a third-party video conferencing app is not in a meeting, ensure the app defaults to using the USB device (Camera/Mic/Speaker).
When a third-party conferencing app is in a meeting, ensure the app automatically switches to the USB device (Camera/Mic/Speaker).
Issue:
Under certain conditions, using CallKit does not automatically enable the microphone.
Steps to Reproduce:
1.Start an outgoing call, then the user manually mutes the audio.
2.Receive a native incoming call, end the current call, then answer the new incoming call.(This order is important.)
3.End the incoming call.
4.Start another outgoing call and observe the microphone; do not manually mute or unmute.
Actual Behavior:
The audio icon indicates that the audio is unmuted, but the microphone remains off, and the small yellow dot in the top status bar (which represents the microphone) does not appear.
Expected Behavior:
The microphone should be on, consistent with the audio icon display, and the small yellow dot should appear in the top status bar.
Device:
iPhone 16 pro & iPhone 15 pro, iOS 18.0+
Can it be reproduced using speakerbox(CallKit Demo)?
YES
FairPlay-Protected HLS Files Not Transferred via Quick Start I have an iOS app that downloads HLS files, which are protected by FairPlay. These files are stored locally, and their locations are managed using Core Data. When playing these tracks, I use AVURLAsset to access the stored file paths.
Recently, a client upgraded to a new iPhone and used Quick Start to transfer data from his old device. While all other app data was successfully transferred, including Core Data records and UserDefaults, the actual HLS files were missing. As a result, the app retained metadata about the downloaded content, but the files themselves were gone, causing playback failures.
Does Quick Start exclude certain types of locally stored files, especially DRM-protected HLS downloads, or is the issue related to how FairPlay-protected content is handled during the transfer of locally stored files?
Topic:
Media Technologies
SubTopic:
Streaming
Tags:
FairPlay Streaming
HTTP Live Streaming
AVFoundation
I'm experiencing audio issues while developing for visionOS when playing PCM data through AVAudioPlayerNode.
Issue Description:
Occasionally, the speaker produces loud popping sounds or distorted noise
This occurs during PCM audio playback using AVAudioPlayerNode
The issue is intermittent and doesn't happen every time
Technical Details:
Platform: visionOS
Device: vision pro / simulator
Audio Framework: AVFoundation
Audio Node: AVAudioPlayerNode
Audio Format: PCM
I would appreciate any insights on:
Common causes of audio distortion with AVAudioPlayerNode
Recommended best practices for handling PCM playback in visionOS
Potential configuration issues that might cause this behavior
Has anyone encountered similar issues or found solutions? Any guidance would be greatly helpful.
Thank you in advance!
Bug Report: ScreenCaptureKit System Audio Capture Crashes with EXC_BAD_ACCESS
Summary
When using ScreenCaptureKit to capture system audio for extended periods, the application crashes with EXC_BAD_ACCESS in Swift's error handling runtime. The crash occurs in swift_getErrorValue when trying to process an error from the SCStream delegate method didStopWithError. This appears to be a framework-level issue in ScreenCaptureKit or its underlying ReplayKit implementation.
Environment
macOS Sonoma 14.6.1
Swift 5.8
ScreenCaptureKit framework
Detailed Description
Our application captures system audio using ScreenCaptureKit's audio capture capabilities. After successfully capturing for several minutes (typically after 3-4 segments of 60-second recordings), the application crashes with an EXC_BAD_ACCESS error. The crash happens when the Swift runtime attempts to process an error in the SCStreamDelegate.stream(_:didStopWithError:) method.
The crash consistently occurs in swift_getErrorValue when attempting to access the class of what appears to be a null object. This suggests that the error being passed from the system framework to our delegate method is malformed or contains invalid memory.
Steps to Reproduce
Create an SCStream with audio capture enabled
Add audio output to the stream
Start capture and write audio data to disk
Allow the capture to run for several minutes (3-5 minutes typically triggers the issue)
The app will crash with EXC_BAD_ACCESS in swift_getErrorValue
Code Sample
func stream(_ stream: SCStream, didStopWithError error: Error) {
print("Stream stopped with error: \(error)") // Crash occurs before this line executes
}
func stream(_ stream: SCStream, didOutputSampleBuffer sampleBuffer: CMSampleBuffer, of type: SCStreamOutputType) {
guard type == .audio, sampleBuffer.isValid else { return }
// Process audio data...
}
Expected Behavior
The error should be properly propagated to the delegate method, allowing for graceful error handling and recovery.
Actual Behavior
The application crashes with EXC_BAD_ACCESS when the Swift runtime attempts to process the error in swift_getErrorValue.
Crash Log Details
Thread #35, queue = 'com.apple.NSXPCConnection.m-user.com.apple.replayd', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
frame #0: 0x0000000194c3088c libswiftCore.dylib`swift::_swift_getClass(void const*) + 8
frame #1: 0x0000000194c30104 libswiftCore.dylib`swift_getErrorValue + 40
frame #2: 0x00000001057fba30 shadow`NewScreenCaptureService.stream(stream=0x0000600002de6700, error=Swift.Error @ 0x000000016b7b5e30) at NEW+ScreenCaptureService.swift:365:15
frame #3: 0x00000001057fc050 shadow`@objc NewScreenCaptureService.stream(_:didStopWithError:) at <compiler-generated>:0
frame #4: 0x0000000219ec5ca0 ScreenCaptureKit`-[SCStreamManager stream:didStopWithError:] + 456
frame #5: 0x00000001ca68a5cc ReplayKit`-[RPScreenRecorder stream:didStopWithError:] + 84
frame #6: 0x00000001ca696ff8 ReplayKit`-[RPDaemonProxy stream:didStopWithError:] + 224
Printing description of stream._streamQueue:
error: ObjectiveC.id:4294967281:18: note: 'id' has been explicitly marked unavailable here
public typealias id = AnyObject
^
error: /var/folders/v4/3xg1hmp93gjd8_xlzmryf_wm0000gn/T/expr23-dfa421..cpp:1:65: 'id' is unavailable in Swift: 'id' is not available in Swift; use 'Any'
Swift._DebuggerSupport.stringForPrintObject(Swift.UnsafePointer<id>(bitPattern: 0x104ae08c0)!.pointee)
^~
ObjectiveC.id:2:18: note: 'id' has been explicitly marked unavailable here
public typealias id = AnyObject
^
warning: /var/folders/v4/3xg1hmp93gjd8_xlzmryf_wm0000gn/T/expr23-dfa421..cpp:5:7: initialization of variable '$__lldb_error_result' was never used; consider replacing with assignment to '_' or removing it
var $__lldb_error_result = __lldb_tmp_error
~~~~^~~~~~~~~~~~~~~~~~~~
_
Before the crash, we observed this error message in the console:
[ERROR] *****SCStream*****RemoteAudioQueueOperationHandlerWithError:1015 Error received from the remote queue -16665
Additional Context
The issue occurs consistently after approximately 3-4 successful audio segment recordings of 60 seconds each
Commenting out custom segment rotation logic does not prevent the crash
The crash involves XPC communication with Apple's ReplayKit daemon
The error appears to be corrupted or malformed when crossing the XPC boundary
Workarounds Attempted
Added proper thread safety for all published properties using DispatchQueue.main.async
Implemented more robust error handling in the delegate methods
None of these approaches prevented the crash since it occurs at the Swift runtime level before our code executes.
Impact
This issue prevents reliable long-duration audio capture using ScreenCaptureKit.
This bug significantly limits the usefulness of ScreenCaptureKit for any application requiring continuous system audio capture for more than a few minutes.
Perhaps this issue might be related to a macOS bug where the system dialog indicates that the screen is being shared, even though nothing is actually being shared. Moreover, when attempting to stop sharing, nothing happens.
AVCaptureVideoDataOutput.preparesCellularRadioForNetworkConnection requires com.apple.developer.avfoundation.video-data-output-prepares-cellular-radio-for-machine-readable-code-scanning. But I cannot acquire its entitlement. I can't find its entitlement on 'Certificates, Identifiers & Profiles'. Any solutions?
Provisioning profile "iOS Team Provisioning Profile: ......" doesn't include the com.apple.developer.avfoundation.video-data-output-prepares-cellular-radio-for-machine-readable-code-scanning entitlement.
Hi everyone,
I'm developing a visionOS app for Apple Vision Pro, and I've encountered an issue related to window resizing at runtime when using AVPlayer to play a live HLS stream.
✅ What I'm Trying to Do
Play a live HLS stream (from Wowza) inside my app using AVPlayer.
Support resizing the immersive window using Vision Pro’s built-in runtime scaling gesture.
Stream works fine at default window size when the app launches.
❌ Problem
If I resize the app’s window at runtime (using the Vision Pro pinch-drag gesture), then try to start the stream, it does not play.
Instead, it just shows the "Loading live stream..." state and never proceeds to playback.
This issue only occurs after resizing the window — if I don’t resize, the stream works perfectly every time.
🧪 What I’ve Tried
Verified the HLS URL — it’s working and plays fine in Safari and in the app before resizing.
Set .automaticallyWaitsToMinimizeStalling = false on AVPlayer.
Observed that .status on AVPlayerItem never reaches .readyToPlay after resizing.
Tried to force window size back using UIWindowScene.requestGeometryUpdate(...), but behavior persists.
I'm developing an iOS app that requires continuous audio recording.
Currently, when a phone call comes in, the AVAudioSession is interrupted and recording stops completely during the ringing phase.
While I understand recording should stop if the call is answered, my app needs to continue recording while the phone is merely ringing.
I've observed that Apple's Voice Memos app maintains recording during incoming call rings. This indicates the hardware and iOS are capable of supporting this functionality.
Request
Please advise on any available AVAudioSession configurations or APIs that would allow my app to:
Continue recording during an incoming call ring
Only stop recording if/when the call is actually answered
Impact
This interruption significantly impacts the user experience and core functionality of my app. Workarounds like asking users to enable airplane mode are impractical and create a poor user experience.
Questions
Is there an approved way to maintain microphone access during call rings?
If not currently possible, could this capability be considered for addition to a future iOS SDK?
Are there any interim solutions or best practices Apple recommends for this use case?
Thank you for your help.
SUPPORT INFORMATION
Did someone from Apple ask you to submit a code-level support request?
No
Do you have a focused test project that demonstrates your issue?
Yes, I have a focused test project to submit with my request
What code level support issue are you having?
Problems with an Apple framework API in my app