This post documents an issue I reported in feedback FB19610114 and see if anyone knows of a workaround. Here is a copy of the feedback.
Short version
Manipulation (SwiftUI OR RealityKit) fails to translate entities after changing rooms. By changing rooms, I mean a human wearing an Apple Vision Pro leaving one room and entering another room. Once this issue occurs, it impacts all apps that use these features. A device restart is the only solution I have to fix it.
Feedback FB19610114
This is an odd one. I'm using the new Manipulation Component in visionOS 26. Most of the time this works well. Sometime it stops working and when it does the only way to get it working again is to reboot the headset.
When this happens, I can continue to rotate and scale items, but translation no longer works. It is as if the item is stuck to a fixed point in the parent scene (window, volume, etc). When this bug occurs, it affects every app across the entire operating system that is using manipulation, including the RealityKit component AND the SwiftUI version. This is not limited to one app and is not limited to apps that I am working on. Once this error occurs, it affects literally any application across the operating system that is using this API, including apps from Apple.
I won't speculate on the cause of this, but I do know of one way where I can always get it to happen.
Here is how to reproduce it:
Make an Xcode project with a single entity that uses the Manipulation Component. There is no need to customize the configuration of this component. The default implementation will work.
Build and run this app on device. You can keep running from device or quit and launch the app like normal on device.
Open the app and manipulate the entity - it should work as expected.
Physically walk into another room. It is vital that you leave the current room that you are in and enter a different room entirely.
Use the digital crown to recenter your view and bring your window or volume to you.
Test the manipulation on the entity again - it should still be working as expected at this point.
Physically, move yourself and your headset into the original room where you started.
Use the digital crown to recenter your view and bring your window or volume to you.
Test the manipulation on the entity again - you should now see the issue.
When I follow the steps above, then 100% of the time manipulation translation stops working at this point. It will impact any application using this API. The only way to fix it is to restart my headset.
A few points to keep in mind
It does not matter if an app is actively being run from Xcode.
When this occurs, it impacts every single app, not just one.
When this occurs, rotation and scaling continue to work, but the entity/view cannot be translated.
This impacts BOTH the SwiftUI version and the RealityKit version.
When this occurs, the only way to "fix" it is to reboot the device.
General
RSS for tagDiscuss Spatial Computing on Apple Platforms.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
The new Mac virtual display feature on visionOS 2 offers a curved/panoramic window. I was wondering if this is simply a property that can be applied to a window, or if it involves an immersive mode or SceneKit/RealityKit?
We have been using RoomPlan in our app for 2+ years. Through a combination of in-app and manual coaching on scanning best practices, most users are able to achieve high-quality scans on a consistent basis.
In recent weeks, however, we have observed an increase in reports of degraded scanning performance, even from veteran users who had not previously encountered issues. The RoomCaptureView overlay is jittery and crooked, and the resulting scan file has significant issues, even for simple, well-lit rectangular rooms.
It is difficult to troubleshoot these issues given the number of variables at play, and the overall volume of reports is still relatively low, but we'd appreciate any guidance on known issues or workarounds that could help unblock our users who are being affected by this. I noticed that this post includes an acknowledgement of FB14454922 and FB15035788. Our issues seem slightly different as the scans are simply inaccurate and jittery without failing outright. I haven't found any other threads on similar issues.
https://developer.apple.com/documentation/realitykit/videomaterial
The documentation: "Video materials support transparency if the source video’s file format also supports transparency."
I have a transparency video(Hand.mov, HEVC with alpha), I can show the video with transparency background correctly on Vision Pro Simulates, but on physic Device the video has a black background. I'm sure the video format is ok because I can see get the texture from video and display it on an UnlitMaterial.
How can I show the transparency video correctly with the RealityKit/VideoMaterial?
After adding TextComponents to my Entities on visionOS, I have observed that visualBounds will ignore the TextComponents.
Documentation states that it should render a rounded rectangle mesh. These mashes are visible on the device, but not visible in the debugger ("Capture Entity Hierarchy") and ignored by visualBounds.
Am I missing something?
static func makeDirection(_ direction: Direction) -> Entity {
let text = Entity()
text.name = direction.rawValue
text.setScale(SIMD3(repeating: 5), relativeTo: nil)
text.transform.rotation = direction.rotation
text.components.set(direction.textComponent)
return text
}
My workaround is to add a disabled ModelEntity and take its bounds 😬
When assigning a ManipulationComponent to an Entity SceneEvents.WillRemoveEntity will be called for that Entity.
Expected Behavior: the Entity is not (even if temporarily) removed from the Scene and no SceneEvents will be triggered as a result of assigning a ManipulationComponent.
FB20872220
I would like to translate info in a three.js based web app as a 3D model in a volumetric window. Is it possible to do this in a similar manner as loading a web page in a WKWebView?
This is related to the WWDC presentation, What's new in Metal rendering for immersive apps..
Specifically, the macOS spatial streaming to visionOS feature: For reference: the page in the docs.
The presentation demonstrates it using a full immersive space and Metal rendering using compositor services.
I'd like clarity on a few things:
Is the remote device wireless, or must the visionOS device be connected via a wired connected?
Is there a limit to the number of remote devices, and if not, could macOS render different things per remote device simultaneously?
Can I also use mixed mode with passthrough enabled, instead of just a fully-immersive mode?
Can I use RealityKit instead of Metal? If so, may I have an example, or would someone point to an example?
Hi I have a monitoring app, that will take input video from uvc and process it using Metal, and eventually get a MTLTexture.
The problem I'm facing is I have to convert MTLTexture to CGImage then call TextureResource.replace, which is super slow. Metal processing speed is same as input frame rate(50pfs), but MTLTexture -> CGImage -> TextureResource only got 7fps...
Is there any way I can make it faster?
Topic:
Spatial Computing
SubTopic:
General
Tags:
Media Player
Frameworks
Media Accessibility
Core Media
Platform: visionOS 2.6
Framework: RealityKit, SwiftUIComponent: ImagePresentationComponent
I’m working with the new ImagePresentationComponent from visionOS 26 and hitting a rendering limitation when switching to .spatialStereoImmersive viewing mode within a WindowGroup context.
This is what I’m seeing:
Pure immersive space: ImagePresentationComponent with .spatialStereoImmersive mode works perfectly in a standalone ImmersiveSpace
Mode switching API: All mode transitions work correctly (logs confirm the component updates)
Spatial content: .spatialStereo mode renders correctly in both window and immersive contexts.
This is where it’s breaking for me:
Window context: When the same RealityView + ImagePresentationComponent is placed inside a WindowGroup (even when that window is floating in a mixed immersive space), switching to .spatialStereoImmersive mode shows no visual change
The API calls succeed, state updates correctly, but the immersive content doesn’t render.
Apple’s Spatial Gallery demonstrates exactly what I’m trying to achieve:
Spatial photos displayed in a window with what feels like horizontal scroll view using system window control bar, etc.
Tapping a spatial photo smoothly transitions it to immersive mode in-place.
The immersive content appears to “grow” from the original window position by just changing IPC viewing modes.
This proves the functionality should be possible, but I can’t determine the correct configuration.
So, my question to is:
Is there a specific RealityView or WindowGroup configuration required to enable immersive content rendering from window contexts that you know of?
Are there bounds/clipping settings that need to be configured to allow immersive content to “break out” of window constraints?
Does .spatialStereoImmersive require a specific rendering context that’s not available in windowed RealityView instances?
How do you think Apple’s SG app achieves this functionality?
For a little more context:
All viewing modes are available: [.mono, .spatialStereo, .spatialStereoImmersive]
The spatial photos are valid and work correctly in pure immersive space
Mixed immersive space is active when testing window context
No errors or warnings in console beyond the successful mode switching logs I’m getting
Any insights into the proper configuration for window-hosted immersive content
Hello Community,
I am currently developing an experimental VisionOS app, to investigate the social effects of the new Spatial Persona feature, for my bachelor thesis. My setup includes a simple board game for the participants, in which they can engage with their persona avatars.
I tried to use the TabletopKit for this setup, but ran into issues when starting the SharePlay session. When I testes my app, I couldn't see the other spatial persona anymore, despite the green SharePlay button indicating the session started. The other person can see my actions in their version of the app on the board, but can not interact with anything. Also, we are both seat on the default side of the seat.
I tried to remove the environment I added, because it doesn't seem to synch with the other player. When I tried the FaceTime feature in the simulator without the environment, I could then see the test robot avatar, but at a totally wrong place. It's seems like it isn't just my environment occluding the seats, but a flaw in the seating process as well.
When I tried the FaceTime feature in the simulator on the official test scene (TabletopKit Sample), I got the same incorrect placement and the warning "role(for:inSeatNumber:): The provided role identifier does not match a role in the current template."
So my questions are:
What needs to be changed so the TabletopKit can handle seating correctly?
How can I correctly use immersive scenes in combination with the TabletopKit?
I tried to keep the implementation of the TabletopKit example as close as possible, so I think it will enough to look into this codebase for now.
I debugged the position of seats and they are placed correctly in front of their equipment. The personas are just not placed on them.
Hi guys,
I noticed that Apple created a really engaging visual effect for browsing spatial videos in the app. The video appears embedded in glass panel with glowing edges and even shows a parallax effect as you move around. When I tried to display the stereo video using RealityView, however, the video entity always floats above the panel.
May I ask how does VisionOS implement this effect? Is there any approach to achieve this effect or example code I can use in my own code.
Thanks!
let component = GestureComponent(DragGesture())
iOS: ☑️
visionOS: ❌
This bug from beta to public, please fix it.
Hi I know it's possible to play equirectangular VR180 video either SBS or MV-HEVC. And for fisheye video, the only way I know is to convert it into an AIVU for playback.
Is there any way to directly play fisheye video using AVPlayer? Thanks a lot!
I'm trying to run a PhotogrammetrySession based on photos taken in an AVCaptureSession and stored as .heic files.
When I load the files I'm always seeing the error "Sample 0 missing LiDAR point cloud!" showing up for each individual sample.
Debugging shows that sample.depthDataMap is populated, also the .heic contains depth data which can be extracted using e.g. heif-convert on my Mac.
Comparing the .heic I created to one of the ObjectCaptureSession which doesn't show the LiDAR warning, I noticed the only difference being the HEIC information here:
So my questions are:
Are these the missing information in my manual capture causing this warning?
Can I somehow add these information in an AVCaptureSession?
Do these information allow better photogrammetry results?
Game Controller Input Limitations in visionOS Volumetric Windows
Hello Apple Developer Community,
I'm developing a game for visionOS and have encountered significant limitations with game controller input when using volumetric windows (WindowGroup with .volumetric style). I'd appreciate clarification on whether this is expected behavior and any guidance on best practices.
🧩 Issue Summary
When using a DualSense controller with a volumetric window in visionOS, only a subset of controller inputs are available to the app. The remaining inputs appear to be reserved by the system for UI navigation.
✅ Working Inputs (Volumetric Window)
D-Pad (all directions)
L3 (left thumbstick button click)
R3 (right thumbstick button click)
Menu button
Options button
❌ Not Working Inputs (Volumetric Window)
Left thumbstick analog movement (used for UI scrolling instead)
Right thumbstick analog movement (used for UI scrolling instead)
Face buttons (Cross, Circle, Square, Triangle / A, B, X, Y)
Shoulder buttons (L1, R1)
Triggers (L2, R2)
Key observation: When moving the left thumbstick in a volumetric window, the window's UI scrolls vertically instead of sending input to my app's GameController handlers. Similarly, face buttons seem to be reserved for system UI interactions.
⚙️ Implementation Details
I'm using the standard GameController framework:
Connect to controller via GCController.controllers()
Access extendedGamepad profile
Set up valueChangedHandler and pressedChangedHandler for all inputs
Handlers confirmed registered via logging
Working inputs (D-Pad, L3, R3) trigger immediately and consistently
Non-working inputs (thumbsticks, face buttons) never trigger
🧠 Critical Finding: ImmersiveSpace Works Perfectly
When testing the exact same code in an ImmersiveSpace (.mixed immersion style), all controller inputs work perfectly:
✅ Both thumbsticks provide full analog input
✅ All face buttons trigger their handlers
✅ All shoulder buttons and triggers work correctly
✅ 100% success rate with no intermittent issues
This suggests the issue isn't with my code, but rather how visionOS handles controller input differently between Volumetric Windows and ImmersiveSpace.
🧪 Test Environment
I created a minimal test project (Controller-Playground) to isolate the issue:
A simple ControllerTester class that registers all GameController handlers
A visual UI showing real-time input state
No game logic, RealityKit physics, or other complexity
Results
In volumetric window: Only D-Pad, L3, R3, Menu, Options work
In ImmersiveSpace: All inputs work perfectly
This confirms the limitation exists at the visionOS platform level, not in app code.
🧰 Attempted Workarounds
I tried the following without success:
Setting GCSupportsControllerUserInteraction = false in Info.plist
Setting UIRequiresFullScreen = true
Changing window styles (.plain, .volumetric)
Polling vs. handler-based input approaches
Various threading models (MainActor, separate thread)
Result: The only way to enable full controller support is to switch to ImmersiveSpace.
❓ Questions for Apple
Is this input reservation behavior in volumetric windows intended and documented?
Are game controllers expected to have limited functionality in volumetric windows while full functionality is reserved for ImmersiveSpace?
Is there a way to request full controller input access in a volumetric window, or is ImmersiveSpace the only option for complete controller support?
Where can I find official documentation about controller input differences between window types?
Are there any APIs or configuration options to disable system controller shortcuts in volumetric windows?
🎯 Impact
This limitation has a significant effect on game design and architecture:
Volumetric windows offer a multitasking-friendly, less immersive experience
ImmersiveSpace provides full controller support but may be more immersive than some games require
Games that only need basic D-Pad and button input can work fine in volumetric windows
Games requiring analog sticks or face buttons must currently use ImmersiveSpace
It would be very helpful if Apple could clarify or reference existing documentation regarding controller input handling in different visionOS window types. If such documentation doesn't exist yet, it might be valuable to include this information in future developer guides or best-practice documents.
🕹 Current Workaround
For now, I'm using:
D-Pad for character movement (digital 8-direction)
R3 (right stick click) as a substitute for the "X" button
This setup allows the game to function within a volumetric window, though full controller support still requires ImmersiveSpace.
📄 Request
If this is expected behavior, I may have simply missed the relevant documentation — could you please point me to any existing resources that explain this design?
If there isn't one yet, it would be great if future visionOS documentation could:
Clearly outline controller input behavior across window types
Provide guidance on when to use Volumetric Windows vs. ImmersiveSpace for games
Consider adding an API option to request full controller access when appropriate
If this is not expected behavior, I'm happy to file a detailed bug report with sample code.
💻 System Information
visionOS: Latest Simulator
Xcode: Latest version
Controller: Sony DualSense
Framework: GameController (standard extendedGamepad profile)
Test project: Minimal reproducible example available
Thank you for any clarification or guidance you can provide. This information would be valuable for many developers working on visionOS games.
How do you force quit apps on the Vision Pro simulator?
I've read online you can double press the digital crown on a real Vision Pro device but there is no such button on the simulator. So far I've tried
Pressing the home button twice (⇧⌘H)
Pressing the Siri button twice (⌥⇧⌘H)
None of them works
I'm on Xcode 15.0 beta 5 (15A5209g) & visionOS 1.0 beta 2 (21N5207e)
Hello everyone,
I am currently developing an experience for visionOS using RealityKit and I would like to achieve volumetric light effects, such as visible light rays or shafts through fog or dust.
I found this GitHub project: https://github.com/robcupisz/LightShafts, which demonstrates the kind of visual style I am aiming for. I would like to know if there is a way to create similar effects using RealityKit on visionOS.
So far, I have experimented with DirectionalLight, SpotLight, ImageBasedLight, and custom materials (e.g., additive blending on translucent meshes), but none of these approaches can replicate the volumetric light shaft look shown in the repository above.
Questions:
Is there a recommended technique or workaround in RealityKit to simulate light shafts or volumetric lighting?
Is creating a custom mesh (e.g., cone or volume geometry with gradient alpha and additive blending) the only feasible method?
Are there any examples, best practices, or sample projects from Apple or other developers that showcase a similar visual style?
Any advice or hints would be greatly appreciated. Thank you in advance!
Topic:
Spatial Computing
SubTopic:
General
Tags:
RealityKit
Reality Composer Pro
Shader Graph Editor
visionOS
Hi everyone,
I’m building a visualization app for VisionPro that uses SharePlay and GroupActivities to explore datasets collaboratively.
I’ve successfully implemented the new SharedWorldAnchor feature, and everything works well with nearby, local participants.
However, I’m stuck on one point:
How can I share a world anchor with remote participants who join via FaceTime as spatial personas?
Apple’s demo app (where multiple users move a plane model around) seems to suggest that this is possible.
For context, I’m building an immersive app with Metal rendering.
Any guidance or examples would be greatly appreciated!
Thanks,
Jens
The WWDC25 video and notes titled “Learn About Apple Immersive Video Technologies” introduced the Apple Spatial Audio Format (ASAF) and codec (APAC). However, despite references throughout on using immersive video, there is scant information on ASAF/APAC (including no code examples and no framework references), and I’ve found no documentation in Apple’s APIs/Frameworks about its implementation and use months on.
I want to leverage ambisonic audio in my app. I don’t want to write a custom AU if APAC will be opened up to developers. If you read the notes below along with the iPhone 17 advertising (“Video is captured with Spatial Audio for immersive listening”), it sounds like this is very much a live feature in iOS26.
Anyone know the state of play? I’m across how the PHASE engine works, which is unrelated to what I’m asking about here.
Original quote from video referenced above: “ASAF enables truly externalized audio experiences by ensuring acoustic cues are used to render the audio. It’s composed of new metadata coupled with linear PCM, and a powerful new spatial renderer that’s built into Apple platforms. It produces high resolution Spatial Audio through numerous point sources and high resolution sound scenes, or higher order ambisonics.”
”ASAF is carried inside of broadcast Wave files with linear PCM signals and metadata. You typically use ASAF in production, and to stream ASAF audio, you will need to encode that audio as an mp4 APAC file.”
”APAC efficiently distributes ASAF, and APAC is required for any Apple immersive video experience. APAC playback is available on all Apple platforms except watchOS, and supports Channels, Objects, Higher Order Ambisonics, Dialogue, Binaural audio, interactive elements, as well as provisioning for extendable metadata.”
Topic:
Spatial Computing
SubTopic:
General