I'm developing a video capture app using AVFoundation, designed specifically for use on a boat pylon to record slalom water skiing. This setup involves considerable vibration.
As you may know, the OIS that Apple began adding to its lenses with the iPhone 7 is very problematic in high-vibration conditions, ironically producing very shaky video, whereas lenses without OIS produce perfectly stable video. Because of this, up until the iPhone 14, the solution for my app was simply to use the selfie lens, which did not have OIS.
Starting with iPhone 14 through iPhone 16 (non-Pro models), technical specs suggest the selfie lens still does not include OIS. However, I’m still seeing the same kind of shaky video behavior I see on OIS-equipped lenses. The one hardware change I see in this camera module is the addition of PDAF (Phase Detection Autofocus), so that is my best guess as to what is causing the unstable video.
1- Does that make any sense - that in high vibration settings, PDAF could create unstable video in the same way that OIS does? Or could it be something else that was changed between the iPhone 13 and 14 Selfie lens?
Thinking that the issue was PDAF, I figured that if I enabled my app to set a Manual Focus level, that ought to circumvent PDAF (expecting that if a lens is manually focusing, it can’t also be autofocusing via PDAF).
However, even with manual focus locked via AVCaptureDevice in my app, on the Selfie lens of an iPhone 16, the video still comes out very shaky, basically unusable. I also tested with the built-in Apple Camera app (using the press-and-hold to lock focus and exposure) and another 3rd party camera app to lock focus, all with the same results, so it's not that my app just isn't correctly doing manual focus.
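For reference, this is the kind of focus lock I mean, as a minimal sketch where device is the selfie camera's AVCaptureDevice and the lens position value is arbitrary:

import AVFoundation

func lockFocus(on device: AVCaptureDevice) {
    do {
        try device.lockForConfiguration()
        if device.isLockingFocusWithCustomLensPositionSupported {
            // Lock the lens at a fixed position (0.0 = nearest, 1.0 = farthest)
            device.setFocusModeLocked(lensPosition: 1.0, completionHandler: nil)
        } else if device.isFocusModeSupported(.locked) {
            device.focusMode = .locked
        }
        device.unlockForConfiguration()
    } catch {
        print("Could not lock focus: \(error)")
    }
}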
So I'm stuck with these questions:
2- Does the selfie camera on iPhones 14–16 use PDAF even when focus is set to locked/manual mode?
3- Is there any way in AVFoundation to disable or suppress PDAF during video recording (e.g., a flag, device format setting, or private API)?
4- Is PDAF behavior or suppression documented or controllable via AVCaptureDevice or any related class?
5- If no control of PDAF is available, are there any best practices for stabilizing or smoothing this effect programmatically?
Note that I have also set my app to use the most aggressive form of stabilization available, so it defaults to .cinematicExtendedEnhanced and, if that's not available, falls back to .cinematicExtended, and so on (a short sketch of this fallback follows question 6 below). On the iPhone 16 selfie lens, it ends up using .cinematicExtended. As an additional question:
6- Would those be the most appropriate stabilization settings for a high vibration environment, and if not, what would be best?
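For reference, a minimal sketch of that fallback selection (assuming device and connection are the selfie camera's AVCaptureDevice and its video AVCaptureConnection):

import AVFoundation

// Pick the most aggressive stabilization mode the active format supports,
// then apply it to the video connection.
func applyBestStabilization(device: AVCaptureDevice, connection: AVCaptureConnection) {
    guard connection.isVideoStabilizationSupported else { return }
    let preferredOrder: [AVCaptureVideoStabilizationMode] = [
        .cinematicExtendedEnhanced,   // iOS 18+
        .cinematicExtended,
        .cinematic,
        .standard
    ]
    for mode in preferredOrder where device.activeFormat.isVideoStabilizationModeSupported(mode) {
        connection.preferredVideoStabilizationMode = mode
        break
    }
}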
My app is a camera app that supports Picture-in-Picture (PiP) mode.
Normally, when the device rotates, I get the device orientation from iOS and use it to rotate the camera feed so that the preview stays correctly aligned.
However, when the app enters PiP mode, it is considered to be in the background, and I can no longer receive orientation updates from the system.
As a result, I can’t apply rotation corrections to the camera video in PiP mode.
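For reference, a minimal sketch of the foreground rotation handling described above (iOS 17+ API; the exact angle mapping depends on camera position and is an assumption here):

import AVFoundation
import UIKit

func updateRotation(for connection: AVCaptureConnection) {
    let angle: CGFloat
    switch UIDevice.current.orientation {
    case .landscapeLeft:      angle = 0
    case .portrait:           angle = 90
    case .landscapeRight:     angle = 180
    case .portraitUpsideDown: angle = 270
    default:                  return
    }
    if connection.isVideoRotationAngleSupported(angle) {
        connection.videoRotationAngle = angle
    }
}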
Is there any way to retrieve device orientation while the app is in the background (specifically during PiP mode)?
Any guidance would be greatly appreciated.
Thank you!
Is there any way we can detect the status of the "Show When Muted" and "Show on Skip Back" device settings in code?
I have an app that has a WKWebView for watching YouTube videos. When the videos are windowed, the audio is fine, positionally as well. All perfect.
When I fullscreen the video and it goes into the native visionOS video player, the audio gets messed up.
It will suddenly sound like it is inside your ears, or maybe in just one ear channel, or the position will be wrong. It might be fine for a moment, but the second I touch the controls or move the window, the sound jumps across the room, away from the window, or switches to stereo.
Sometimes, even after closing the window entirely, you can still hear the video playing. If you reopen the window, go to another screen, and open another video, you then hear two videos playing at the same time with no way to stop the first one in the background, which requires force-restarting the app.
It is all sorts of glitchy. I haven't the slightest clue what is happening here. I strongly suspect this is a visionOS bug.
I tried using AVAudioSession to change some of the sound settings, and that makes zero difference in behavior.
Multiple testers have also reported this behavior and it has been seen on both visionOS 2.3 and 2.4 betas.
Thanks for the help! This is driving me mad! It is extremely consistent behavior!
Hello there!
Is there any list of voices that are always available on iOS/iPadOS devices?
It seems that AVSpeechSynthesisVoice(identifier: "com.apple.voice.compact.en-US.Samantha") is always available on all devices.
I thought that AVSpeechSynthesisVoice(identifier: "com.apple.ttsbundle.siri_Nicky_en-US_compact") and AVSpeechSynthesisVoice(identifier: "com.apple.ttsbundle.siri_Aaron_en-US_compact") were available by default on certain newer devices. Is this true?
I also noticed that on the same iPad where I was using those 2 voices (Nicky and Aaron) - when I updated to the iPadOS 26 beta, those voices were no longer available.
Any information you can share about which voices should be reliably available on which devices would be extremely helpful for our development. Thanks so much!
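For context, a minimal sketch of how voice availability can be checked at runtime on a given device:

import AVFoundation

// List every en-US voice the device currently reports, with its identifier and quality.
for voice in AVSpeechSynthesisVoice.speechVoices() where voice.language.hasPrefix("en-US") {
    print(voice.identifier, voice.name, voice.quality.rawValue)
}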
My app allows users to capture and save videos to the Photos app using the following Swift code:
PHPhotoLibrary.shared().performChanges {
    PHAssetChangeRequest.creationRequestForAssetFromVideo(atFileURL: fileURL)
} completionHandler: { success, error in
    // Handle success / error
}
Videos are successfully saved to Photos and play correctly. However, users report that when they AirDrop these videos from the Photos app to another device (e.g., iPad to iPhone), the videos are saved in the Files app on the receiving device instead of the Photos app. This issue is more common with higher-resolution videos, such as 2K, recorded in HEVC format at 30 fps.
I wasn't able to reproduce the issue locally.
I've found a thread on the public Apple forum: https://discussions.apple.com/thread/255276865?sortBy=rank but I wonder whether there are some special flags I should clear or add to my videos (e.g., via PHAssetChangeRequest)?
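For comparison, an equivalent save using PHAssetCreationRequest, which lets me attach resource options explicitly (whether this changes the AirDrop behavior is an open question; fileURL is the same file as above):

import Photos

PHPhotoLibrary.shared().performChanges {
    let options = PHAssetResourceCreationOptions()
    options.shouldMoveFile = false
    let request = PHAssetCreationRequest.forAsset()
    request.addResource(with: .video, fileURL: fileURL, options: options)
} completionHandler: { success, error in
    print("Saved:", success, error?.localizedDescription ?? "")
}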
Thank you!
Hello!
I'm experiencing an issue with iOS's audio routing system when trying to use Bluetooth headphones for audio output while also recording environmental audio from the built-in microphone.
Desired behavior:
Play audio through Bluetooth headset (AirPods)
Record unprocessed environmental audio from the iPhone's built-in microphone
Actual behavior:
When explicitly selecting the built-in microphone, iOS reports it's using it (in currentRoute.inputs)
However, the actual audio data received is clearly still coming from the AirPods microphone
The audio is heavily processed with voice isolation/noise cancellation, removing environmental sounds
Environment Details
Device: iPhone 12 Pro Max
iOS Version: 18.4.1
Hardware: AirPods
Audio Framework: AVAudioEngine (also tried AudioQueue)
Code Attempted
I've tried multiple approaches to force the correct routing:
func configureAudioSession() {
    let session = AVAudioSession.sharedInstance()

    // Configure to allow Bluetooth output but use built-in mic
    try? session.setCategory(.playAndRecord,
                             options: [.allowBluetoothA2DP, .defaultToSpeaker])
    try? session.setActive(true)

    // Explicitly select built-in microphone
    if let inputs = session.availableInputs,
       let builtInMic = inputs.first(where: { $0.portType == .builtInMic }) {
        try? session.setPreferredInput(builtInMic)
        print("Selected input: \(builtInMic.portName)")
    }

    // Log the current route
    let route = session.currentRoute
    print("Current input: \(route.inputs.first?.portName ?? "None")")

    // Configure audio engine with native format
    let inputNode = audioEngine.inputNode
    let nativeFormat = inputNode.inputFormat(forBus: 0)
    inputNode.installTap(onBus: 0, bufferSize: 1024, format: nativeFormat) { buffer, time in
        // Process audio buffer
        // Despite showing "Built-in Microphone" in route, audio appears to be
        // coming from AirPods with voice isolation applied - welp!
    }

    try? audioEngine.start()
}
I've also tried various combinations of:
Different audio session modes (.default, .measurement, .voiceChat)
Different option combinations (with/without .allowBluetooth, .allowBluetoothA2DP)
Setting session.setPreferredInput() both before and after activation
Diagnostic Observations
When AirPods are connected:
AVAudioSession.currentRoute.inputs correctly shows "Built-in Microphone" after setPreferredInput()
The actual audio data received shows clear signs of AirPods' voice isolation processing
Background/environmental sounds are actively filtered out...
When I record test audio played near the phone (not through the app), the recording is nearly silent; only voice picked up by the headset comes through.
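A small diagnostic sketch along these lines (the sample-rate check rests on the assumption that an HFP route forces the session down to 8 or 16 kHz):

import AVFoundation

let session = AVAudioSession.sharedInstance()
// An 8000/16000 Hz rate here usually means the Bluetooth (HFP) mic is actually active.
print("Sample rate:", session.sampleRate)
for input in session.currentRoute.inputs {
    print("Input:", input.portName, input.portType.rawValue)
}
for output in session.currentRoute.outputs {
    print("Output:", output.portName, output.portType.rawValue)
}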
Questions
Is there a workaround to force iOS to actually use the built-in microphone while maintaining Bluetooth output?
Are there any lower-level configurations that might resolve this issue?
Any insights, workarounds, or suggestions would be greatly appreciated. This is blocking a critical feature in my application that requires environmental audio recording while providing audio feedback through headphones 😅
Does anyone know how to play the sound of a specific instrument when you tap a button on the screen of an iPhone or iPad? I'm in the middle of creating a music learning app, and I'm thinking of assigning single notes or chords to the button-like frames on the on-screen keyboard and fingerboard. Can this be achieved with SwiftUI alone? I remember that MIDI level 1 once offered instrument playback, but I don't know how to implement the same functionality on the current OS. Please lend me your wisdom.
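For what it's worth, a minimal sketch of one possible approach using AVAudioUnitSampler (the SoundFont file name is a placeholder; without loading an instrument the sampler falls back to a plain default tone):

import AVFoundation

let engine = AVAudioEngine()
let sampler = AVAudioUnitSampler()
engine.attach(sampler)
engine.connect(sampler, to: engine.mainMixerNode, format: nil)

func playChord() {
    do {
        // Optional: load a General MIDI instrument from a bundled SoundFont.
        if let bank = Bundle.main.url(forResource: "Instruments", withExtension: "sf2") {
            try sampler.loadSoundBankInstrument(at: bank, program: 0, bankMSB: 0x79, bankLSB: 0)
        }
        try engine.start()
        // C major chord: MIDI notes 60, 64, 67
        for note: UInt8 in [60, 64, 67] {
            sampler.startNote(note, withVelocity: 80, onChannel: 0)
        }
    } catch {
        print("Audio setup failed: \(error)")
    }
}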
I am currently developing an HEVC player using VideoToolbox on an iOS device.
I have successfully created an HEVC decoder that receives HEVC streams from our custom image capture and encoding device, and it can decode and display images properly.
However, when my image capture and encoding device configures the encoder to output HEVC streams with fragmented NALUs (i.e., an I-frame or P-frame is split and stored across multiple slice NALUs), the iOS decoder can be initialized successfully but fails to decode and output images.
Can VideoToolbox properly decode HEVC bitstreams when a single frame is split into multiple slice NALUs?
Key Observations:
1. Single-NALU frames work fine.
2. Multi-NALU frames (sliced I/P-frames) cause decoding failure.
3. The decoder session is created successfully (VTDecompressionSessionCreate returns no error).
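For context, a minimal sketch of how I would expect the slices to be repackaged before decoding, assuming length-prefixed (hvcC-style) NALUs and that all slices of one access unit go into a single sample buffer:

import Foundation
import CoreMedia

// Hypothetical helper: combine all slice NALUs of one frame, each prefixed with a
// 4-byte big-endian length, into one CMSampleBuffer for VTDecompressionSession.
func makeSampleBuffer(forFrameNALUs nalus: [Data],
                      formatDescription: CMVideoFormatDescription,
                      presentationTime: CMTime) -> CMSampleBuffer? {
    var frameData = Data()
    for nalu in nalus {
        var length = UInt32(nalu.count).bigEndian
        frameData.append(Data(bytes: &length, count: 4))
        frameData.append(nalu)
    }

    var blockBuffer: CMBlockBuffer?
    guard CMBlockBufferCreateWithMemoryBlock(allocator: kCFAllocatorDefault,
                                             memoryBlock: nil,
                                             blockLength: frameData.count,
                                             blockAllocator: nil,
                                             customBlockSource: nil,
                                             offsetToData: 0,
                                             dataLength: frameData.count,
                                             flags: kCMBlockBufferAssureMemoryNowFlag,
                                             blockBufferOut: &blockBuffer) == kCMBlockBufferNoErr,
          let block = blockBuffer else { return nil }

    frameData.withUnsafeBytes { raw in
        _ = CMBlockBufferReplaceDataBytes(with: raw.baseAddress!,
                                          blockBuffer: block,
                                          offsetIntoDestination: 0,
                                          dataLength: frameData.count)
    }

    var timing = CMSampleTimingInfo(duration: .invalid,
                                    presentationTimeStamp: presentationTime,
                                    decodeTimeStamp: .invalid)
    var sampleSize = frameData.count
    var sampleBuffer: CMSampleBuffer?
    CMSampleBufferCreateReady(allocator: kCFAllocatorDefault,
                              dataBuffer: block,
                              formatDescription: formatDescription,
                              sampleCount: 1,
                              sampleTimingEntryCount: 1,
                              sampleTimingArray: &timing,
                              sampleSizeEntryCount: 1,
                              sampleSizeArray: &sampleSize,
                              sampleBufferOut: &sampleBuffer)
    return sampleBuffer
}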
I have added some custom views on top of my PiP window. In the Xcode 16 / iOS 18 environment, these controls disappear after opening the camera; on inspection, the custom views are not removed but appear to be obscured. They were displayed normally when built with Xcode 15.4. How can I make my custom views display normally?
I am developing an app that uses MusicKit to play music, and I then need to have spoken words played to the user while ducking the audio coming from MusicKit (the application music player).
The built-in Siri voices are not of sufficient quality, so I am using an external service to create an MP3 file and then play it back using AVAudioSession.
Sample code below.
The problem I am having is that .duckOthers is not ducking the Application Music Player output.
Is this a bug, or am I doing this wrong?
// Configure audio session for system-wide ducking
try AVAudioSession.sharedInstance().setCategory(.playback, mode: .spokenAudio, options: [.duckOthers, .mixWithOthers])
try AVAudioSession.sharedInstance().setActive(true)
// Request a short preferred IO buffer duration
try AVAudioSession.sharedInstance().setPreferredIOBufferDuration(0.005)
// Create and configure audio player
self.audioPlayer = try AVAudioPlayer(data: audioData)
self.audioPlayer?.delegate = self
self.audioPlayer?.volume = 1.0 // Ensure full volume for speech
self.audioPlayer?.prepareToPlay()
// Set the audio player's settings for maximum clarity
self.audioPlayer?.enableRate = false
self.audioPlayer?.pan = 0.0 // Center the audio
self.audioPlayer?.play()
How can I correctly set up AVSampleBufferDisplayLayer for video display when my input picture format is kCVPixelFormatType_32BGRA?
Currently the video is visible in the simulator but not on an iPhone. Am I missing something?
Render code:
var pixelBuffer: CVPixelBuffer?
let attrs: [String: Any] = [
    kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA,
    kCVPixelBufferWidthKey as String: width,
    kCVPixelBufferHeightKey as String: height,
    kCVPixelBufferBytesPerRowAlignmentKey as String: width * 4,
    kCVPixelBufferIOSurfacePropertiesKey as String: [:]
]

let status = CVPixelBufferCreateWithBytes(
    nil,
    width,
    height,
    kCVPixelFormatType_32BGRA,
    img,
    width * 4,
    nil,
    nil,
    attrs as CFDictionary,
    &pixelBuffer
)
guard status == kCVReturnSuccess, let pb = pixelBuffer else { return }

var formatDesc: CMVideoFormatDescription?
CMVideoFormatDescriptionCreateForImageBuffer(
    allocator: nil,
    imageBuffer: pb,
    formatDescriptionOut: &formatDesc
)
guard let format = formatDesc else { return }

var timingInfo = CMSampleTimingInfo(
    duration: .invalid,
    presentationTimeStamp: currentTime,
    decodeTimeStamp: .invalid
)

var sampleBuffer: CMSampleBuffer?
CMSampleBufferCreateForImageBuffer(
    allocator: kCFAllocatorDefault,
    imageBuffer: pb,
    dataReady: true,
    makeDataReadyCallback: nil,
    refcon: nil,
    formatDescription: format,
    sampleTiming: &timingInfo,
    sampleBufferOut: &sampleBuffer
)

if let sb = sampleBuffer {
    if CMSampleBufferGetPresentationTimeStamp(sb) == .invalid {
        print("Invalid video timestamp")
    }
    if displayLayer.status == .failed {
        displayLayer.flush()
    }
    DispatchQueue.main.async { [weak self] in
        guard let self = self else {
            print("Lost reference to self drawing")
            return
        }
        displayLayer.enqueue(sb)
    }
    frameIndex += 1
}
Hello,
I need to enumerate built-in media devices (cameras, microphones, etc.). For this purpose, I am using the CoreAudio and CoreMediaIO frameworks.
According to the table 'Daemon-Safe Frameworks' in Apple’s TN2083, CoreAudio is daemon-safe. However, the documentation does not mention CoreMediaIO.
Can CoreMediaIO be used in a daemon?
If not, are there any documented alternatives to detect built-in cameras in a daemon (e.g., via device classes in IOKit)?
Thank you in advance,
Pavel
Hello Apple Developer Community,
We are developing a music management platform for restaurants and cafes in Saudi Arabia. Our app enables businesses to schedule playlists and allows visitors to request songs via barcodes. Music playback is powered by Apple Music, and users must have their own Apple Music subscriptions to access the music. Our service charges a monthly subscription fee for these management features, not for music access itself.
Project Overview and MusicKit Role
Our app integrates MusicKit to leverage Apple Music’s catalog and playback capabilities. Users log in with their Apple Music accounts, ensuring they have an active subscription for music playback. Our platform’s value lies in its tools—playlist scheduling and song requests—which are built on top of MusicKit’s APIs. We offer these features exclusively in Saudi Arabia.
Legal Context in Saudi Arabia
In Saudi Arabia, to our understanding, no special licenses are required for playing music in commercial venues like restaurants and cafes. This means our clients can use Apple Music subscriptions for playback without additional performance rights licenses. While this aligns with local laws, we recognize that Apple’s global policies may impose stricter requirements, prompting our need for clarification.
Subscription Model and Monetization Concerns
We charge a monthly subscription fee for access to our app’s features (e.g., scheduling playlists and managing song requests). This fee is separate from the Apple Music subscription, which users must maintain for playback. However, Apple’s MusicKit terms state: "You agree not to require payment for or indirectly monetize access to the Apple Music service." We’re concerned whether our subscription model might be interpreted as indirectly monetizing Apple Music access, given its reliance on MusicKit for functionality.
Scheduling Feature and Synchronization Rights
Our app allows businesses to schedule playlists for general time slots (e.g., “play this playlist from 6 PM to 8 PM”). It does not support precise scheduling, such as playing a specific song at an exact moment (e.g., “play this song at 7:30 PM”). Apple’s guidelines mention that “deeper or more complex music integration” may require additional licenses, like synchronization rights. We’re unsure if our general scheduling feature crosses this threshold or remains within MusicKit’s standard usage.
Questions for Clarification
We’d greatly appreciate expert input on the following:
Monetization: Does our subscription fee for management features (scheduling and song requests) violate Apple’s policy against indirectly monetizing Apple Music access?
Local Context: Given that Saudi Arabia requires no additional licenses for commercial music playback, does this impact our compliance with Apple’s global terms?
Scheduling: Does our playlist scheduling for general time slots (not exact moments) fall within MusicKit’s permitted scope, or does it require further licensing?
Thank you in advance for any insights or guidance to ensure our app aligns with Apple’s policies!
Hello everyone,
I am working on an app that allows you to review your own music using Apple Music. Currently I am running into an issue with the skipping forwards and backwards outside of the app.
How it should work: When skipping forward or backwards on the lock or home screen of an iPhone, the next or previous song on an album should play and the information should change to reflect that in the app.
If you play a song in Apple Music, you can see a Now Playing view in the lock screen.
When you skip forward or backward, it performs the action, and you can see it reflected by the little frequency icon on the song's artwork.
What it's doing: When skipping forward or backwards on the lock or home screen of an iPhone, the next or previous song is reflected outside of the app, but not in the app.
When skipping a song outside of the app, it works correctly to head to the next song.
But when I return to the app, the change is not reflected there.
NOTE: I am not using MusicKit types such as Track or Album to display the songs. Since I want to grab the songs and review them, I need a rating, so I created my own model that stores the MusicItemID, name, artist(s), etc.
NOTE: I am using ApplicationMusicPlayer.shared
Is there a way to get the song to reflect in my app?
(If it's easier, a simple example would be nice. No need to create an entire Xcode project.)
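For what it's worth, a minimal sketch of the kind of observation that might be needed (assuming the player's queue is observable and currentEntry updates when the system skips tracks):

import SwiftUI
import MusicKit

struct NowPlayingFollower: View {
    // The shared queue publishes changes, including skips made from the Lock Screen.
    @ObservedObject private var queue = ApplicationMusicPlayer.shared.queue

    var body: some View {
        Text(queue.currentEntry?.title ?? "Nothing playing")
    }
}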
As of iOS 18, as far as I can tell, it appears there's still no AVPlayer options that allow users to toggle the caption / subtitle track on and off. Does anyone know of a way to do this with AVPlayer or with SwiftUI's VideoPlayer?
The following code reproduces this issue. It can be pasted into an app playground. This is a random video and a random vtt file I found on the internet.
import SwiftUI
import AVKit
import UIKit

struct ContentView: View {
    private let video = URL(string: "https://server15700.contentdm.oclc.org/dmwebservices/index.php?q=dmGetStreamingFile/p15700coll2/15.mp4/byte/json")!
    private let captions = URL(string: "https://gist.githubusercontent.com/samdutton/ca37f3adaf4e23679957b8083e061177/raw/e19399fbccbc069a2af4266e5120ae6bad62699a/sample.vtt")!

    @State private var player: AVPlayer?

    var body: some View {
        VStack {
            VideoPlayerView(player: player)
                .frame(maxWidth: .infinity, maxHeight: 200)
        }
        .task {
            // Captions won't work for some reason
            player = try? await loadPlayer(video: video, captions: captions)
        }
    }
}

private struct VideoPlayerView: UIViewControllerRepresentable {
    let player: AVPlayer?

    func makeUIViewController(context: Context) -> AVPlayerViewController {
        let controller = AVPlayerViewController()
        controller.player = player
        controller.modalPresentationStyle = .overFullScreen
        return controller
    }

    func updateUIViewController(_ uiViewController: AVPlayerViewController, context: Context) {
        uiViewController.player = player
    }
}

private func loadPlayer(video: URL, captions: URL?) async throws -> AVPlayer {
    let videoAsset = AVURLAsset(url: video)
    let videoPlusSubtitles = AVMutableComposition()
    try await videoPlusSubtitles.add(videoAsset, withMediaType: .video)
    try await videoPlusSubtitles.add(videoAsset, withMediaType: .audio)

    if let captions {
        let captionAsset = AVURLAsset(url: captions)
        // Must add as .text. .closedCaption and .subtitle don't work?
        try await videoPlusSubtitles.add(captionAsset, withMediaType: .text)
    }

    return await AVPlayer(playerItem: AVPlayerItem(asset: videoPlusSubtitles))
}

private extension AVMutableComposition {
    func add(_ asset: AVAsset, withMediaType mediaType: AVMediaType) async throws {
        let duration = try await asset.load(.duration)
        try await asset.loadTracks(withMediaType: mediaType).first.map { track in
            let newTrack = self.addMutableTrack(withMediaType: mediaType, preferredTrackID: kCMPersistentTrackID_Invalid)
            let range = CMTimeRangeMake(start: .zero, duration: duration)
            try newTrack?.insertTimeRange(range, of: track, at: .zero)
        }
    }
}
Hello!
In iOS 17.5, photogrammetry sessions cannot be performed on iPhones without LiDAR, but I don't think there is much difference in GPU performance between those with and without LiDAR. For example, the chip in both the iPhone 14 Pro and the iPhone 15 is the same A16 Bionic, so I think the GPU performance is also the same. Despite this, photogrammetry can be performed on the iPhone 14 Pro but not on the iPhone 15. Why is this?
In fact, we have confirmed that if you transfer images taken with an iPhone 16 without LiDAR to an iPhone 16 Pro and run a photogrammetry session using those images, a 3D model can be generated.
Also, will photogrammetry be able to be performed on high-performance iPhones without LiDAR in the future?
I'm trying to implement Ambisonic B-Format audio playback on Vision Pro with head tracking. So far audio plays, head tracking works, and the sound appears to be stereo. The problem is that it is not a proper binaural playback when compared to playing back the audiofile with a DAW. Has anyone successfully implemented B-Format playback on Vision Pro? Any suggestions on my current implementation:
func playAmbiAudioForum() async {
    do {
        try AVAudioSession.sharedInstance().setCategory(.playback)
        try AVAudioSession.sharedInstance().setActive(true)

        // Audio file loading/preparation
        guard let testFileURL = Bundle.main.url(forResource: "audiofile", withExtension: "wav") else {
            print("Test file not found")
            return
        }
        let audioFile = try AVAudioFile(forReading: testFileURL)
        let audioFileFormat = audioFile.fileFormat

        // Create AVAudioFormat with an Ambisonic B-Format channel layout
        guard let layout = AVAudioChannelLayout(layoutTag: kAudioChannelLayoutTag_Ambisonic_B_Format) else {
            print("layout failed")
            return
        }
        let format = AVAudioFormat(
            commonFormat: audioFile.processingFormat.commonFormat,
            sampleRate: audioFile.fileFormat.sampleRate,
            interleaved: false,
            channelLayout: layout
        )

        // Read the audio file into a buffer
        guard let buffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: UInt32(audioFile.length)) else {
            print("buffer failed")
            return
        }
        try audioFile.read(into: buffer)

        playerNode.renderingAlgorithm = .HRTF

        // Connect the nodes
        audioEngine.attach(playerNode)
        audioEngine.connect(playerNode, to: audioEngine.outputNode, format: format)
        audioEngine.prepare()

        playerNode.scheduleBuffer(buffer, at: nil) {
            print("File finished playing")
        }

        try audioEngine.start()
        playerNode.play()
    } catch {
        print("Setup error:", error)
    }
}
Dear Apple Developer Community,
I'm encountering a critical issue with the MusicLibrary.shared.createPlaylist() method in MusicKit that's affecting our app's core functionality. Despite implementing all recommended authorization checks, the app consistently freezes for some users when this method is called.
What we've already verified before calling createPlaylist():
Network connectivity is properly checked and confirmed
Apple Music authorization is explicitly requested via MusicAuthorization.request()
User subscription status is verified using MusicSubscription.current.canPlayCatalogContent
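For reference, a minimal sketch of the checks described above (the playlist name is a placeholder):

import MusicKit

func createPlaylistIfAllowed() async throws {
    let status = await MusicAuthorization.request()
    guard status == .authorized else { return }

    let subscription = try await MusicSubscription.current
    guard subscription.canPlayCatalogContent else { return }

    // This is the call that some users report hanging indefinitely.
    _ = try await MusicLibrary.shared.createPlaylist(name: "Example Playlist")
}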
Despite these precautions, many users report that their app completely freezes when attempting to create a playlist. This is particularly concerning as playlist creation is a core feature of our application.
User-reported workarounds (with mixed success):
Some users have resolved the issue by restarting their devices or reinstalling our app
Others report success after enabling "Sync Library" in Settings → Music. Unfortunately, a significant number of users are still experiencing the issue even after trying both solutions above.
We've reviewed the MusicKit documentation thoroughly and ensured our implementation follows all best practices. Our app correctly handles permissions and uses the async/await pattern as required by the API.
Is there a known issue with the createPlaylist() method that might cause it to block indefinitely? Are there additional authorization steps or settings we should be checking before calling this method? Could this be related to how MusicKit communicates with Apple Music servers?
Any insights from the developer community or official guidance would be greatly appreciated as this issue is severely impacting our user experience.
Thank you for your assistance
I’m currently working on a project where I capture both depth frames and RGB frames using AVCaptureDataOutputSynchronizer. Depth frames are stored as raw binary data and RGB frames are saved with AVAssetWriter.
The issue I’m facing is that AVAssetWriter enforces a fixed framerate, meaning it adds or discards frames to maintain that rate (as I understand it). This causes a desynchronization between the depth and RGB frames, which is a problem because I need each depth frame to be exactly matched with the corresponding RGB frame as they were captured.
How can I ensure that the RGB frames are saved without AVAssetWriter modifying the frame count?
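For context, a minimal sketch of appending RGB sample buffers with their original presentation timestamps (assumes expectsMediaDataInRealTime was set on the input at setup, and that the writer session is anchored to the first frame):

import AVFoundation

func appendSynchronizedFrame(_ sampleBuffer: CMSampleBuffer,
                             to writer: AVAssetWriter,
                             input: AVAssetWriterInput,
                             sessionStarted: inout Bool) {
    let pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
    if !sessionStarted {
        writer.startSession(atSourceTime: pts)   // anchor the timeline to the first frame
        sessionStarted = true
    }
    if input.isReadyForMoreMediaData {
        _ = input.append(sampleBuffer)           // preserves the buffer's own timestamp
    }
}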