Can you help me write code that picks up an element that is a bit far from me, brings it near me, lets me flick it a bit, and then sends it back to its original position when I release it?
Thanks a lot,
Christophe
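One way to think about the request above is as a small piece of state that remembers where the element started. The sketch below is only an illustration in plain Swift: the `GrabSession` type and its method names are hypothetical, and it assumes the app would copy `currentPosition` onto the entity from its own gesture callbacks.

```swift
// Hypothetical helper: tracks one grabbed element. Positions are in meters.
struct GrabSession {
    let originalPosition: SIMD3<Float>   // where the element was when picked
    var currentPosition: SIMD3<Float>

    init(pickedAt position: SIMD3<Float>) {
        originalPosition = position
        currentPosition = position
    }

    // Move a fraction `t` (clamped to 0...1) of the remaining way toward `target`.
    mutating func step(toward target: SIMD3<Float>, t: Float) {
        let clamped = min(max(t, 0), 1)
        currentPosition += (target - currentPosition) * clamped
    }

    // Nudge the element by a small offset while it is held.
    mutating func flick(by offset: SIMD3<Float>) {
        currentPosition += offset
    }

    // On release, go back to where the element was picked.
    mutating func release() {
        currentPosition = originalPosition
    }
}

var session = GrabSession(pickedAt: SIMD3<Float>(0, 1, -3))
session.step(toward: SIMD3<Float>(0, 1, -0.5), t: 1)   // bring it near
session.flick(by: SIMD3<Float>(0.1, 0, 0))             // flick it a bit
session.release()                                      // back to the start
```

Calling `step` once per frame with a small `t` would give a smooth approach instead of a jump. Animating the return is left out of this sketch.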
Discuss spatial computing on Apple platforms and how to design and build an entirely new universe of apps and games for Apple Vision Pro.
Hi, I'm trying to create a virtual movie theater, but after running computeDiffuseReflectionUVs.py and applying the attenuation map, I noticed that the light falloff effect is simply laid over the objects. I used the Apple-provided attenuation map (I did not specify the attenuation map name in the Python script) with a sample size of 6000. I thought the Python script would process the vertices and create shadows for, say, the backs of the chairs. Am I misunderstanding this?
Hi everyone, I'm developing a MR Vision Pro app where I’d like to anchor virtual objects (such as UI elements) around the user's arm. However, I’ve noticed that Vision Pro seems to mask out the area where the user’s real arm is, hiding virtual content in that region so that you see your real arm.
Is there a way to render virtual elements on the user's arm—so that it looks like the object is placed directly on the arm despite the real-world passthrough? I was hoping there might be a way to adjust the depth or behavior of this masked-out region. Any insights or workarounds would be greatly appreciated! Thanks :)
When I debug my app on Vision Pro, launching it from Xcode blocks for a long time, and it takes a long time for any execution output to appear in the Xcode console.
I have tested the MagnifyGesture code below on multiple devices:
Vision Pro - working
iPhone - working
iPad - working
macOS - not working
In Reality Composer Pro, I have also added the below components to the test model entity:
Input Target
Collision
For macOS, I tried the trackpad pinch gesture and the mouse scroll wheel, but neither approach works. How can I resolve this issue? Thank you.
import SwiftUI
import RealityKit
import RealityKitContent

struct ContentView: View {
    var body: some View {
        RealityView { content in
            // Add the initial RealityKit content
            if let immersiveContentEntity = try? await Entity(named: "Immersive", in: realityKitContentBundle) {
                content.add(immersiveContentEntity)
            }
        }
        .gesture(MagnifyGesture()
            .targetedToAnyEntity()
            .onChanged(onMagnifyChanged)
            .onEnded(onMagnifyEnded))
    }

    func onMagnifyChanged(_ value: EntityTargetValue<MagnifyGesture.Value>) {
        print("onMagnifyChanged")
    }

    func onMagnifyEnded(_ value: EntityTargetValue<MagnifyGesture.Value>) {
        print("onMagnifyEnded")
    }
}
Topic: Spatial Computing
SubTopic: General
I want to create a screenshot (static image) of the current view on the Apple Vision Pro using written code in visionOS. Unfortunately, I currently can’t find a way to achieve this. The only option I’ve found so far is through Reality Composer Pro. However, since I want to accomplish this directly through code, this approach is not an option for me.
I am developing an app for Vision Pro using RealityKit and ARKit. I want my RealityKit entity to look more realistic, so it is important to render its shadow based on the light in the real world.
For example, when I turn on a light in the real world, the entity's shadow should change. Can this effect be implemented on Vision Pro?
In several visionOS apps, we readjust our scenes to the user's eye level (their heads). But, we have encountered issues whereby the WorldTrackingProvider returns bad/incorrect positions for the first x number of frames.
See the code below, which you can copy and paste into any immersive space. Relaunch the space and observe that the numberOfBadWorldInfos value is inconsistent.
a. What is the most reliable way to get the device's position?
b. Is this indeed a bug?
c. Are we using worldInfo improperly?
d. As a workaround, our apps wait 10 frames before using worldInfo. Should we set this threshold differently?
import ARKit
import Combine
import OSLog
import SwiftUI
import RealityKit
import RealityKitContent

let SUBSYSTEM = Bundle.main.bundleIdentifier!

struct ImmersiveView: View {
    let logger = Logger(subsystem: SUBSYSTEM, category: "ImmersiveView")
    let session = ARKitSession()
    let worldInfo = WorldTrackingProvider()

    @State var sceneUpdateSubscription: EventSubscription? = nil
    @State var deviceTransform: simd_float4x4? = nil
    @State var numberOfBadWorldInfos = 0
    @State var isBadWorldInfoLogged = false

    var body: some View {
        RealityView { content in
            try? await session.run([worldInfo])
            sceneUpdateSubscription = content.subscribe(to: SceneEvents.Update.self) { event in
                guard let pose = worldInfo.queryDeviceAnchor(atTimestamp: CACurrentMediaTime()) else {
                    return
                }
                // `worldInfo` does not return correct values for the first few frames (exact number of frames is unknown)
                // - known SO: https://stackoverflow.com/questions/78396187/how-to-determine-the-first-reliable-position-of-the-apple-vision-pro-device
                deviceTransform = pose.originFromAnchorTransform
                // Treat a device height below 1.6 m as a bad sample
                if deviceTransform!.columns.3.y < 1.6 {
                    numberOfBadWorldInfos += 1
                    logger.warning("\(#function) \(#line) deviceTransform.columns.3.y \(deviceTransform!.columns.3.y), numberOfBadWorldInfos \(numberOfBadWorldInfos)")
                } else {
                    if !isBadWorldInfoLogged {
                        logger.info("\(#function) \(#line) deviceTransform.columns.3.y \(deviceTransform!.columns.3.y), numberOfBadWorldInfos \(numberOfBadWorldInfos)")
                    }
                    isBadWorldInfoLogged = true // stop logging.
                }
            }
        }
    }
}
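As an alternative to the fixed 10-frame threshold mentioned in question d, one could wait until consecutive height samples stop changing. This is only a sketch of that idea in plain Swift. The `StabilityGate` type, its parameters, and the sample values are all hypothetical, and it is not an Apple-recommended approach.

```swift
// Hypothetical gate: reports the stream as settled only after `required`
// consecutive samples each differ from the previous one by less than `tolerance`.
struct StabilityGate {
    let required: Int
    let tolerance: Float
    private var last: Float? = nil
    private var stableCount = 0

    init(required: Int, tolerance: Float) {
        self.required = required
        self.tolerance = tolerance
    }

    // Feed one height sample (meters); returns true once the values look settled.
    mutating func accept(_ y: Float) -> Bool {
        if let previous = last, abs(y - previous) < tolerance {
            stableCount += 1
        } else {
            stableCount = 0   // first sample or a jump: start counting again
        }
        last = y
        return stableCount >= required
    }
}

var gate = StabilityGate(required: 3, tolerance: 0.01)
let samples: [Float] = [0.0, 0.4, 1.2, 1.65, 1.651, 1.652, 1.653]   // made-up values
let settled = samples.map { gate.accept($0) }
print(settled)
```

In the view above, `accept` would be fed `deviceTransform.columns.3.y` on each scene update. The tolerance and count would need tuning on a device.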
Hi Apple Team,
I’m working on a human portrait scanning application using PhotogrammetrySession, and I’ve been very impressed by the results. Thank you for building such a powerful and accessible photogrammetry solution into macOS!
I do, however, have a question regarding mesh detail limitations on different Mac hardware configurations.
When using PhotogrammetrySession.Request.Detail.custom and trying to set maximumPolygonCount = 1000000, I see the following log message:
Clamped max poly count: 1000000 to device limit. 250000 is used.
This is on an M1 Max with 32 GB RAM.
I’m aware that PhotogrammetrySession.limits can report values like maximumInputImageDimension and maximumNumberOfInputImages, but I haven’t found documentation on how the maximumPolygonCount is determined, and what hardware specs influence it.
Is it tied more to:
• GPU performance (e.g. neural/graphics cores)?
• CPU architecture?
• Memory size or bandwidth?
• Or is it fixed per SoC generation?
I’d love to understand what kind of hardware upgrades (e.g. moving to M4 Pro or increasing RAM) could allow me to increase mesh complexity and generate more detailed models.
Any insights would be greatly appreciated—and if this is covered in upcoming WWDC sessions or documentation, I’d be happy to tune in.
Thanks in advance!
KitCheng
I want to display a huge image in a RealityView in 3D space on Vision Pro. Of course, instead of one giant file, I'm using many large image tiles.
To achieve this, I generate multiple planes placed exactly next to each other and put one image on each. Although the planes are exactly adjacent, there is still a white gap between them (image below).
Does anybody know how to fix this issue?
Topic: Spatial Computing
SubTopic: General
Tags: RealityKit, Reality Composer Pro, Shader Graph Editor, visionOS
I want to select a sub-model within a large model in a mixed immersive space. When I select the sub-model, I want to add an outline stroke to it, similar to the selection effect in Reality Composer Pro. How can I create an entity outline like this?
In my Reality Composer Pro workflow for Vision Pro development, I’m using xcrun realitytool image to pre-compress textures into .ktx format, typically using ASTC block compression. These textures are used for cubemaps and environment assets.
I’ve noticed that regardless of the image content—whether it’s a highly detailed photo or a completely black image—once compressed with the same ASTC block size (e.g., ASTC_8x8), the resulting .ktx file size is nearly identical. There appears to be no content-aware logic that adapts the compression ratio to the actual texture complexity.
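The identical sizes follow from how ASTC works: it is a fixed-rate format in which every block occupies 128 bits (16 bytes), whatever the pixels contain. The sketch below shows the arithmetic for one image. The function name and the 2048×2048 resolution are illustrative assumptions, and the result ignores mipmaps and KTX container overhead.

```swift
// Bytes of ASTC payload for one 2D image. Every block is 16 bytes,
// regardless of image content, so only the dimensions and block size matter.
func astcPayloadBytes(width: Int, height: Int, blockWidth: Int, blockHeight: Int) -> Int {
    let blocksX = (width + blockWidth - 1) / blockWidth      // ceil(width / blockWidth)
    let blocksY = (height + blockHeight - 1) / blockHeight   // ceil(height / blockHeight)
    return blocksX * blocksY * 16
}

let face = astcPayloadBytes(width: 2048, height: 2048, blockWidth: 8, blockHeight: 8)
print(face)       // 1048576
print(face * 6)   // 6291456 for six cubemap faces
```

With a fixed-rate format, the main levers on size are the block size and the texture resolution, so a low-detail image might use larger blocks or fewer pixels. Whether a given toolchain exposes those choices is something to verify.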
In contrast, Unreal Engine behaves differently: even when all cubemap faces are imported at the same resolution as DDS textures, the engine performs content-aware compression during packaging:
Low-complexity images are compressed more aggressively
The final packaged file size varies based on content complexity
Since Reality Composer Pro requires textures to be pre-compressed as .ktx, there’s no opportunity for runtime optimization or per-image compression adjustment.
Just wondering: is there any recommended way to implement content-aware compression for .ktx textures in Reality Composer Pro?
Or any best practices to optimize .ktx sizes based on image complexity?
Thanks!
Hi, I would love your help with this matter.
I am trying to get the positions in space of two QR codes so that I can align content to them. The detection always reports the QR code position as (0, 0, 0), and I don't understand why. Here's my code:
import SwiftUI
import RealityKit
import RealityKitContent

struct AnchorView: View {
    @ObservedObject var qrCoordinator: QRCoordinator
    @ObservedObject var coordinator: ImmersiveCoordinator
    let qrName: String
    @Binding var startQRDetection: Bool

    @State private var anchor: AnchorEntity? = nil
    @State private var detectionTask: Task<Void, Never>? = nil

    var body: some View {
        RealityView { content in
            // Add the QR anchor once (must exist before detection starts)
            if anchor == nil {
                let imageAnchor = AnchorEntity(.image(group: "QRs", name: qrName))
                content.add(imageAnchor)
                anchor = imageAnchor
                print("📌 Created anchor for \(qrName)")
            }
        }
        .onChange(of: startQRDetection) { enabled in
            if enabled {
                startDetection()
            } else {
                stopDetection()
            }
        }
        .onDisappear {
            stopDetection()
        }
    }

    private func startDetection() {
        guard detectionTask == nil, let anchor = anchor else { return }
        detectionTask = Task {
            var detected = false
            while !Task.isCancelled && !detected {
                print("🔎 Checking \(qrName)... isAnchored=\(anchor.isAnchored)")
                if anchor.isAnchored {
                    // wait a short moment to let transform update
                    try? await Task.sleep(nanoseconds: 100_000_000)
                    let worldPos = anchor.position(relativeTo: nil)
                    if worldPos != .zero {
                        // relative to modelRootEntity if available
                        var posToSave = worldPos
                        if let modelEntity = coordinator.modelRootEntity {
                            posToSave = anchor.position(relativeTo: modelEntity)
                            print("converted to model position")
                        } else {
                            print("⚠️ modelRootEntity not available, using world position")
                        }
                        print("✅ \(qrName) detected at position: world=\(worldPos) saved=\(posToSave)")
                        if qrName == "reanchor1" {
                            qrCoordinator.qr1Position = posToSave
                            let marker = createMarker(color: [0,1,0])
                            // 2 cm above the QR code, in the anchor's local space
                            marker.position = SIMD3<Float>(0, 0.02, 0)
                            anchor.addChild(marker)
                            print("marker1 added")
                        } else if qrName == "reanchor2" {
                            qrCoordinator.qr2Position = posToSave
                            let marker = createMarker(color: [0,0,1])
                            // 2 cm above the QR code, in the anchor's local space
                            marker.position = SIMD3<Float>(0, 0.02, 0)
                            anchor.addChild(marker)
                            print("marker2 added")
                        }
                        detected = true
                    } else {
                        print("⚠️ \(qrName) anchored but still at origin, retrying...")
                    }
                }
                try? await Task.sleep(nanoseconds: 500_000_000) // throttle loop
            }
            print("🛑 QR detection loop ended for \(qrName)")
            detectionTask = nil
        }
    }

    private func stopDetection() {
        detectionTask?.cancel()
        detectionTask = nil
    }

    // Builds a small non-metallic sphere used to mark a detected QR code
    private func createMarker(color: SIMD3<Float>) -> ModelEntity {
        let sphere = MeshResource.generateSphere(radius: 0.05)
        let material = SimpleMaterial(color: UIColor(
            red: CGFloat(color.x),
            green: CGFloat(color.y),
            blue: CGFloat(color.z),
            alpha: 1.0
        ), isMetallic: false)
        let marker = ModelEntity(mesh: sphere, materials: [material])
        marker.name = "marker"
        return marker
    }
}
Topic: Spatial Computing
SubTopic: General
We’re trying to build a custom player for Unity. For this, we’re using AVPlayer with AVPlayerItemVideoOutput to get textures. However, we noticed that playback is not smooth and the stream often freezes.
For testing, we used this 8K video:
https://deovr.com/nwfnq1
The video was played using the following code:
@objc public func playVideo(urlString: String)
{
    guard let url = URL(string: urlString) else { return }

    let pItem = AVPlayerItem(url: url)
    playerItem = pItem
    pItem.preferredForwardBufferDuration = 10.0

    let pixelBufferAttributes: [String: Any] = [
        kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange,
        kCVPixelBufferMetalCompatibilityKey as String: true,
    ]
    let output = AVPlayerItemVideoOutput( pixelBufferAttributes: pixelBufferAttributes )
    pItem.add(output)

    // Start playback once the item is ready
    playerItemObserver = pItem.observe(\.status)
    {
        [weak self] pItem, _ in
        guard pItem.status == .readyToPlay else { return }
        self?.playerItemObserver = nil
        self?.player.play()
    }

    player = AVPlayer(playerItem: pItem)
    player.currentItem?.preferredPeakBitRate = 35_000_000
}
When AVPlayerItemVideoOutput is attached, the video stutters and the log looks like this:
🟢 Playback likely to keep up
🟡 Buffer ahead: 4.08s | buffer: 4.08s
🟡 Buffer ahead: 4.08s | buffer: 4.08s
🟡 Buffer ahead: -0.07s | buffer: 0.00s
🟡 Buffer ahead: 2.94s | buffer: 3.49s
🟡 Buffer ahead: 2.50s | buffer: 4.06s
🟡 Buffer ahead: 1.74s | buffer: 4.30s
🟡 Buffer ahead: 0.74s | buffer: 4.30s
🟠 Playback may stall
🛑 Buffer empty
🟡 Buffer ahead: 0.09s | buffer: 4.30s
🟠 Playback may stall
🟠 Playback may stall
🛑 Buffer empty
🟠 Playback may stall
🟣 Buffer full
🟡 Buffer ahead: 1.41s | buffer: 1.43s
🟡 Buffer ahead: 1.41s | buffer: 1.43s
🟡 Buffer ahead: 1.07s | buffer: 1.43s
🟣 Buffer full
🟡 Buffer ahead: 0.47s | buffer: 1.65s
🟠 Playback may stall
🛑 Buffer empty
🟡 Buffer ahead: 0.10s | buffer: 1.65s
🟠 Playback may stall
🟡 Buffer ahead: 1.99s | buffer: 2.03s
🟡 Buffer ahead: 1.99s | buffer: 2.03s
🟣 Buffer full
🟣 Buffer full
🟡 Buffer ahead: 1.41s | buffer: 2.00s
🟡 Buffer ahead: 0.68s | buffer: 2.27s
🟡 Buffer ahead: 0.09s | buffer: 2.27s
🟠 Playback may stall
🛑 Buffer empty
🟠 Playback may stall
When we remove AVPlayerItemVideoOutput from the player, the video plays smoothly, and the output looks like this:
🟢 Playback likely to keep up
🟡 Buffer ahead: 1.94s | buffer: 1.94s
🟡 Buffer ahead: 1.94s | buffer: 1.94s
🟡 Buffer ahead: 1.22s | buffer: 2.22s
🟡 Buffer ahead: 1.05s | buffer: 3.05s
🟡 Buffer ahead: 1.12s | buffer: 4.12s
🟡 Buffer ahead: 1.18s | buffer: 5.18s
🟡 Buffer ahead: 0.72s | buffer: 5.72s
🟡 Buffer ahead: 1.27s | buffer: 7.28s
🟡 Buffer ahead: 2.09s | buffer: 3.03s
🟡 Buffer ahead: 4.16s | buffer: 6.10s
🟡 Buffer ahead: 6.66s | buffer: 7.09s
🟡 Buffer ahead: 5.66s | buffer: 7.09s
🟡 Buffer ahead: 4.66s | buffer: 7.09s
🟡 Buffer ahead: 4.02s | buffer: 7.45s
🟡 Buffer ahead: 3.62s | buffer: 8.05s
🟡 Buffer ahead: 2.62s | buffer: 8.05s
🟡 Buffer ahead: 2.49s | buffer: 3.53s
🟡 Buffer ahead: 2.43s | buffer: 3.38s
🟡 Buffer ahead: 1.90s | buffer: 3.85s
We’ve tried different attribute settings for AVPlayerItemVideoOutput. We also removed all logic related to reading frame data, but the choppy playback still remained.
Can you advise whether this is a player issue or if we’re doing something wrong?
Greetings. I am having this issue with a Unity PolySpatial visionOS app.
We have our main Bounded Volume for our app.
We have other Native UI windows that appear when we interact with objects in our Bounded Volume.
If a user closes our main Bounded Volume...sometimes it quits the app. Sometimes it doesn't.
If we go back to the home screen and reopen the app, our main Bounded Volume doesn't always appear, and just the Native UI windows we left open are visible. But, we can sometimes still hear sounds that are playing in our Bounded Volume.
What solutions are there to make sure our Bounded Volume always appears when the app is open?
Hi, I have been using RealityRenderer to render scenes on macOS as spatial videos and view them on Vision Pro, and it is awesome. I understand that it uses PerspectiveCamera to render. What is the default FOV for this camera, and how far can we push it? Ideally, I want to render a scene with a 180-degree FOV. Thanks
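For context on the 180-degree goal: with a standard perspective (pinhole) projection, the half-width of the view at a given distance is distance × tan(FOV / 2), which grows without bound as the FOV approaches 180 degrees. The sketch below only illustrates that general relation. The function name is hypothetical and nothing here is specific to RealityRenderer.

```swift
import Foundation

// Half-width of the view frustum at `distance` for a horizontal field of view
// given in degrees (standard perspective relation).
func frustumHalfWidth(fovDegrees: Double, distance: Double) -> Double {
    let halfAngle = fovDegrees * .pi / 180 / 2
    return distance * tan(halfAngle)
}

// The half-width at 1 m grows rapidly as the FOV approaches 180 degrees.
for fov in [60.0, 120.0, 170.0, 179.0] {
    print(fov, frustumHalfWidth(fovDegrees: fov, distance: 1))
}
```

Because of this, very wide views are commonly built in graphics pipelines from several narrower renders that are reprojected afterwards. Whether that suits this workflow is an open question.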
Hi,
I'm encountering an issue in our app that uses RoomPlan and ARSession for scanning.
After prolonged use—especially under heavy load from both the scanning process and other unrelated app operations—the iPhone becomes very hot, and the following warning begins to appear more frequently:
"ARSession <0x107559680>: The delegate of ARSession is retaining 11 ARFrames. The camera will stop delivering camera images if the delegate keeps holding on to too many ARFrames. This could be a threading or memory management issue in the delegate and should be fixed."
I was able to reproduce this behavior using Apple’s RoomPlanExampleApp, with only one change: I introduced a CPU-intensive workload at the end of the startSession() function:
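DispatchQueue.global().asyncAfter(deadline: .now() + 5) {
    // Start four busy loops to keep the CPU under load
    for _ in 0..<4 {
        DispatchQueue.global().async {
            // Each loop owns its value, so the closures do not share mutable state
            var value = 10_000
            while true {
                value *= 10_000
                value /= 10_000
                value ^= 10_000
                value = 10_000
            }
        }
    }
}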
I suspect this is a RoomPlan API problem, which is why I filed a feedback report: 17441091
I am developing a Unity application for the Apple Vision Pro using PolySpatial and RealityKit integration.
The goal is to create a graspable object (for example, a handheld cube) that includes a secondary camera. When the user grabs and moves the object, the secondary camera should render its view to a RenderTexture, which is displayed on a quad attached to the object, simulating a live camera screen.
In the Unity Editor, this setup works correctly. The RenderTexture updates in real time, and the quad displays the camera’s view as expected.
However, when building and running the application on the Vision Pro, the quad only displays the clear background color of the secondary camera. No scene content appears. The graspable interaction itself works fine: the object can be grabbed and moved as intended.
Steps I have taken:
Created a new layer (CameraFeed) and assigned the relevant objects to it.
Set the secondary camera’s culling mask to render only the CameraFeed layer.
Assigned the RenderTexture as the camera’s target texture.
Applied the RenderTexture to an Unlit/Texture material on a quad.
Confirmed the camera is active and correctly positioned relative to the object.
From my research, it appears that once objects are managed by RealityKit through PolySpatial (for example, made graspable), they are no longer rendered through Unity's normal camera pipeline. Only the main XR camera (managed by RealityKit) seems able to see these objects. Secondary Unity cameras cannot render RealityKit-synced content to a RenderTexture. If this is correct, it seems there is currently no way to implement a true live secondary camera feed showing graspable objects on Vision Pro using Unity PolySpatial.
My questions are:
Is there any official way to enable multiple camera rendering of RealityKit-managed objects through PolySpatial?
Are there known workarounds to simulate a live camera feed that still allows objects to be grabbed?
Has anyone found alternative design patterns or methods for this kind of interaction?
Environment: Unity 6.0, PolySpatial 2.2.4, Apple visionOS XR 2.2.4
Any insight or suggestions would be greatly appreciated.
Thank you.
I’ve submitted the following feedback:
FB13820942 (List Outline View Not Using Accent Color on Disclosure Caret for visionOS)
I’d appreciate help on this to see if I’m doing something wrong or indeed it’s the way visionOS currently works and it’s a suggested feedback.
Hello, I am currently developing a Vision Pro VR application with Unreal Engine 5.5. Is it possible to interact with objects (grabbing, clicking on buttons)? I cannot find any information on this. Thank you.
Topic: Spatial Computing
SubTopic: General