Delve into the world of graphics and game development. Discuss creating stunning visuals, optimizing game mechanics, and share resources for game developers.

All subtopics
Posts under Graphics & Games topic

Post

Replies

Boosts

Views

Activity

How to apply the same SystemImage to both mainEmitter and spawnedEmitter without clipping in ParticleEmitterComponent?
Hi everyone, I’m currently learning about ParticleEmitterComponentParticleEmitterComponent and exploring the sample app provided in the Simulating particles in your visionOS app documentation. In the sample app, when I set the EmitterPreset to fireworks from the settings panel on the left side of the window and choose SystemImage, I noticed two issues: The image applied to mainEmitter appears clipped or cropped. The image on spawnedEmitter does not update to the selected SystemImage. What I want to achieve: Apply the same SystemImage to both mainEmittermainEmitter and spawnedEmitterspawnedEmitter so that it displays correctly without clipping. Remove the animation that changes the size of spawnedEmitterspawnedEmitter over time and keep it at a constant size. Could someone explain which properties should be adjusted to achieve this behavior? Any guidance or examples would be greatly appreciated. Thanks in advance!
0
0
485
Sep ’25
Apple Unity plugin issue
I use unity 2020.3.48f1 to develop a game; trying to implement Apple Services integration I use Apple unity plugins(https://github.com/apple/unityplugins) Using latest version of unity plugins I getting error in Unity project after plugin import It say "Not allowed platform VisionOS" When I tryed to use older version of the plugins I getting error on runtime when calling "var fetchItemsResponse = await GKLocalPlayer.Local.FetchItems();" in line 42 it drop EXC_BAD_ACCESS(code=257, address=0x0000...) error I tryed to use different commits from official repositorys and even custom branches of apple unity plugins like (https://github.com/muZZkat/unityplugins/tree/muzzkat/fix-fetch-items) but it did not help There is whole my script which trying to use apple unuity plugins using System.Threading.Tasks; using UnityEngine; using System.Collections; using System; using Apple.GameKit; using UnityEngine.UI; public class TheScript : MonoBehaviour { [SerializeField] InputField otp; string Signature; string TeamPlayerID; string Salt; string PublicKeyUrl; string Timestamp; void Start() { StartCoroutine(Call()); } private IEnumerator Call() { yield return new WaitForSeconds(5); Login(); } public async Task Login() { otp.text += $"Loginig... "; if (!Apple.GameKit.GKLocalPlayer.Local.IsAuthenticated) { try { var player = await GKLocalPlayer.Authenticate(); var localPlayer = GKLocalPlayer.Local; TeamPlayerID = localPlayer.TeamPlayerId; var fetchItemsResponse = await GKLocalPlayer.Local.FetchItems(); Signature = Convert.ToBase64String(fetchItemsResponse.GetSignature()); PublicKeyUrl = fetchItemsResponse.PublicKeyUrl; otp.text += $"Team Player ID: {TeamPlayerID} "; otp.text += $"PublicKeyUrl: {PublicKeyUrl} "; } catch(Exception e) { otp.text += $"Error: " + e.Message; } } else { Debug.Log("AppleGameCenter player already logged in."); } } async Task SignInWithAppleGameCenterAsync(string signature, string teamPlayerId, string publicKeyURL, string salt, ulong timestamp) { } }
0
1
211
May ’25
Deterministic RNG behaviour across Mac M1 CPU and Metal GPU – BigCrush pass & structural diagnostics
Hello, I am currently working on a research project under ENINCA Consulting, focused on advanced diagnostic tools for pseudorandom number generators (structural metrics, multi-seed stability, cross-architecture reproducibility, and complementary indicators to TestU01). To validate this diagnostic framework, I prototyped a small non-linear 64-bit PRNG (not as a goal in itself, but simply as a vehicle to test the methodology). During these evaluations, I observed something interesting on Apple Silicon (Mac M1): • bit-exact reproducibility between M1 ARM CPU and M1 Metal GPU, • full BigCrush pass on both CPU and Metal backends, • excellent p-values, • stable behaviour across multiple seeds and runs. This was not the intended objective, the goal was mainly to validate the diagnostic concepts, but these results raised some questions about deterministic compute behaviour in Metal. My question: Is there any official guidance on achieving (or expecting) deterministic RNG or compute behaviour across CPU ↔ Metal GPU on Apple Silicon? More specifically: • Are deterministic compute kernels expected or guaranteed on Metal for scientific workloads? • Are there recommended patterns or best practices to ensure reproducibility across GPU generations (M1 → M2 → M3 → M4)? • Are there known Metal features that can introduce non-determinism? I am not sharing the internal recurrence (this work is proprietary), but I can discuss the high-level diagnostic observations if helpful. Thank you for any insight, very interested in how the Metal engineering team views deterministic compute patterns on Apple Silicon. Pascal ENINCA Consulting
0
0
214
Nov ’25
Optimizing HZB Mip-Chain Generation and Bindless Argument Tables in a Custom Metal Engine
Hi everyone, I’ve been developing a custom, end-to-end 3D rendering engine called Crescent from scratch using C++20 and Metal-cpp (targeting macOS and visionOS). My primary goal is to build a zero-bottleneck, GPU-driven pipeline that maximizes the potential of Apple Silicon’s Unified Memory and TBDR architecture. While the fundamental systems are stable, I am looking for architectural feedback from Metal framework engineers regarding specific synchronization and latency challenges. Current Core Implementations: GPU-Driven Instance Culling: High-performance occlusion culling using a Hierarchical Z-Buffer (HZB) approach via Compute Shaders. Clustered Forward Shading: Support for high-count dynamic lights through view-space clustering. Temporal Stability: Custom TAA with history rejection and Motion Blur resolve. Asset Infrastructure: Robust GUID-based scene serialization and a JSON-driven ECS hierarchy. The Architectural Challenge: I am currently seeing slight synchronization overhead when generating the HZB mip-chain. On Apple Silicon, I am evaluating the cost of encoder transitions versus cache-friendly barriers. && m_hzbInitPipeline && m_hzbDownsamplePipeline && !m_hzbMipViews.empty(); if (canBuildHzb) { MTL::ComputeCommandEncoder* hzbInit = commandBuffer->computeCommandEncoder(); hzbInit->setComputePipelineState(m_hzbInitPipeline); hzbInit->setTexture(m_depthTexture, 0); hzbInit->setTexture(m_hzbMipViews[0], 1); if (m_pointClampSampler) { hzbInit->setSamplerState(m_pointClampSampler, 0); } else if (m_linearClampSampler) { hzbInit->setSamplerState(m_linearClampSampler, 0); } const uint32_t hzbWidth = m_hzbMipViews[0]->width(); const uint32_t hzbHeight = m_hzbMipViews[0]->height(); const uint32_t threads = 8; MTL::Size tgSize = MTL::Size(threads, threads, 1); MTL::Size gridSize = MTL::Size((hzbWidth + threads - 1) / threads * threads, (hzbHeight + threads - 1) / threads * threads, 1); hzbInit->dispatchThreads(gridSize, tgSize); hzbInit->endEncoding(); for (size_t mip = 1; mip < m_hzbMipViews.size(); ++mip) { MTL::Texture* src = m_hzbMipViews[mip - 1]; MTL::Texture* dst = m_hzbMipViews[mip]; if (!src || !dst) { continue; } MTL::ComputeCommandEncoder* downEncoder = commandBuffer->computeCommandEncoder(); downEncoder->setComputePipelineState(m_hzbDownsamplePipeline); downEncoder->setTexture(src, 0); downEncoder->setTexture(dst, 1); const uint32_t mipWidth = dst->width(); const uint32_t mipHeight = dst->height(); MTL::Size downGrid = MTL::Size((mipWidth + threads - 1) / threads * threads, (mipHeight + threads - 1) / threads * threads, 1); downEncoder->dispatchThreads(downGrid, tgSize); downEncoder->endEncoding(); } if (m_instanceCullHzbPipeline) { dispatchInstanceCulling(m_instanceCullHzbPipeline, true); } } My Questions: Encoder Synchronization: Would you recommend moving this loop into a single ComputeCommandEncoder using MTLBarrier between dispatches to maintain L2 cache residency, or is the overhead of separate encoders negligible for depth-downsampling on TBDR? visionOS Bindless Latency: For stereo rendering on visionOS, what are the best practices for managing MTL4ArgumentTable updates at 90Hz+? I want to ensure that updating bindless resources for each eye doesn't introduce unnecessary CPU-to-GPU latency. Memory Management: Are there specific hints for Memoryless textures that could be applied to intermediate HZB levels to save bandwidth during this process? I’ve attached a screenshot of a scene rendered with the engine (PBR, SSR, and IBL).
0
0
435
Feb ’26
Deterministic RNG behaviour across Mac M1 CPU and Metal GPU – BigCrush pass & structural diagnostics
Hello, I am currently working on a research project under ENINCA Consulting, focused on advanced diagnostic tools for pseudorandom number generators (structural metrics, multi-seed stability, cross-architecture reproducibility, and complementary indicators to TestU01). To validate this diagnostic framework, I prototyped a small non-linear 64-bit PRNG (not as a goal in itself, but simply as a vehicle to test the methodology). During these evaluations, I observed something interesting on Apple Silicon (Mac M1): • bit-exact reproducibility between M1 ARM CPU and M1 Metal GPU, • full BigCrush pass on both CPU and Metal backends, • excellent p-values, • stable behaviour across multiple seeds and runs. This was not the intended objective, the goal was mainly to validate the diagnostic concepts, but these results raised some questions about deterministic compute behaviour in Metal. My question: Is there any official guidance on achieving (or expecting) deterministic RNG or compute behaviour across CPU ↔ Metal GPU on Apple Silicon? More specifically: • Are deterministic compute kernels expected or guaranteed on Metal for scientific workloads? • Are there recommended patterns or best practices to ensure reproducibility across GPU generations (M1 → M2 → M3 → M4)? • Are there known Metal features that can introduce non-determinism? I am not sharing the internal recurrence (this work is proprietary), but I can discuss the high-level diagnostic observations if helpful. Thank you for any insight, very interested in how the Metal engineering team views deterministic compute patterns on Apple Silicon. Pascal ENINCA Consulting
0
0
284
Nov ’25
How can I uninstall game-porting-toolkit completely
So, I'm done with GPTK and decided to delete it. The only thing I installed was brew -v install apple/apple/game-porting-toolkit and the external libraries from the ditto command. Now, I tried to remove it, but even after brew remove game-porting-toolkit brew autoremove all of the dependencies installed with brew are still there. The most obvious was game-porting-toolkit-compiler, but even after removing this there are so many libraries that are now orphaned and it's just impossible to manually identify those. Is there a way or is the easiest way to simply uninstall Homebrew completely and reinstall it again?
0
0
281
May ’25
Hover effects w/ Compositor Services w/ PSVR2 controllers
Hi, I would like clarification on whether the new hover effects feature introduced in vision os 26 supported pinch gestures through the psvr 2 controllers. In your sample application, I was not able to confirm that this was working. Only pinch clicking with my hands worked. Pulling the trigger on the controller whilst looking at a 3d object did not activate the hover effect spatial event in the sample application. (The object is showing the highlight though) This is inconsistent with hover effect behavior with psvr2 controllers on swift ui views, where the trigger press does count as a button click. The sample I used was this one: https://developer.apple.com/documentation/compositorservices/rendering_hover_effects_in_metal_immersive_apps
0
0
454
Jan ’26
Can a compute pipeline be as efficient as a render pipeline for rasterization?
I'm new to graphics and game design and I just wanted to know if a compute pipeline could be as efficient as a render pipeline for rasterization and an explanation on how and why. Also is it possible to manually perform rasterization with a render pipeline as in manipulate individual pixel data in a metal texture yourself but do it with a render pipeline?
0
0
163
2w
MTL4FXTemporalDenoisedScaler initialization
I’m trying to use MTL4FXTemporalDenoisedScaler, and I’m seeing a crash during initialization even with a very simple sample app. I created a minimal sample here: https://github.com/tatsuya-ogawa/MetalFXInitExample The exception is: NSException: "-[AGXG16XFamilyHeap baseObject]: unrecognized selector sent to instance ..." What I found is: • This works: descriptor.makeTemporalDenoisedScaler(device: device) • This crashes: descriptor.makeTemporalDenoisedScaler(device: device, compiler: metal4Compiler) So the issue seems to happen only with the Metal4FX version. For testing, I’m using an iPhone 15 Pro. According to the Metal Feature Set Tables, MetalFX denoised upscaling should be supported on Apple9 and later, so I believe the device itself should meet the requirements. Reference: https://developer.apple.com/metal/Metal-Feature-Set-Tables.pdf Has anyone seen this before, or knows what might be causing it? I’d appreciate any advice. Thanks.
0
0
44
5d
Unity iOS Game Name Display Issue
When building a Unity iOS game, the app name displays incorrectly as "BigBall" on the iPhone home screen, despite setting the project name and bundle identifier to "Big Ball" in Unity and Apple Developer account. The correct name, "Big Ball," appears in TestFlight. I tried solutions from ChatGPT and DeepSeek, but none were satisfactory. Please help me.
0
0
104
Feb ’26
Open Shading Language (OSL) in Metal
Hi. I'm a 3D designer, using Blender for most of my work. The most recent Blender conference discussed utilizing the Open Shading Language (OSL) in their latest versions, which allows designers to write custom shaders for their workflows. At the moment, only Nvidia Optix GPU's can utilize this language for rendering (from what I understand), but Blender developers stated they are waiting on other GPU manufacturers to implement this feature as well. I'm not sure if there are any licensing issues here, but would this be something Apple could implement in Metal to make their hardware more attractive to the 3D design community? Any help or knowledge on this topic would be greatly appreciated.
0
0
265
Feb ’26
Xcode Metal Capture crash when using MTLSamplerState
The sample code just draw a triangle and sample texture. both sample code can draw a correct triangle and sample texture as expected. there are no error message from terminal. Sample code using constexpr Sampler can capture and replay well. Sample code using a argumentTable to bind a MTLSamplerState was crashed when using Metal capture and replay on Xcode. Here are sample codes. Sample Code Test Environment: M1 Pro MacOS 26.3 (25D125) Xcode Version 26.2 (17C52) Feedback ID: FB22031701
0
0
106
3w
The description of set_indices in the MSL reference seems incorrect.
I'm currently learning Metal. While reading the reference, I came across a strange description. Page 78 in Version 4 Reference (2025-10-25) says: It is legal to call the following set_indices functions to set the indices if the position in the index buffer is valid and if the position in the index buffer is a multiple of 2 (uchar2 overload) or 2 (uchar4 overload). The index I needs to be in the range [0, max_indices). void set_indices(uint I, uchar2 v); void set_indices(uint I, uchar4 v); However, it seems that the uchar4 overload should be multiple of 4. Furthermore, there is no explanation of what these methods actually do. I believe it involves setting two to four consecutive indices at once, but there is no mention of that here. I would like to know if the above understanding is correct.
0
0
109
Feb ’26
MTLBinaryArchive Size
I'm trying to use MTLBinaryArchive. I collected a BinaryArchive from one device and used metal-tt to translate it for all supported iPhone devices, ranging from iPhone 7 Plus to iPhone 16. However, this BinaryArchive is quite large, around 1.5GB uncompressed, and about 500MB compressed in the IPA. I'm wondering how to address the size issue. I watched the WWDC 2022 video, which mentioned that the operating system or app installation process would handle compatibility. Does this compatibility support different GPU chips? I tried installing an IPA with a BinaryArchive collected only from an iPhone 12 on an iPhone 13, but the BinaryArchive didn't take effect. I also saw that Apple supports App Thinning. However, it seems that resources in the Asset Catalog cannot be accessed via URL, and creating an MTLBinaryArchive requires a URL. Is it possible for MTLBinaryArchive to be distributed through App Thinning? The WWDC 2022 video also mentioned using the -Os optimization flag to reduce size. Can this give an estimate of how much compression it would achieve? Are there any methods to solve the BinaryArchive size issue without impacting performance?
0
1
114
Mar ’25
How do I control a SwiftUI TextField with a game controller?
I've coded a text-adventure game in SwiftUI. (My game has no graphics or sound effects.) My app already supports keyboard navigation; I would like to add support for game controllers on iPhone. I can't figure out how to do it. I especially can't see any way to allow controller users to enter text in a TextField. I've read https://developer.apple.com/documentation/gamecontroller/supporting-game-controllers and it's all about button events. There's no reference to SwiftUI at all in that documentation, or any input-method editing at all. The only mention of "keyboard" is about treating the keyboard itself as if it were a game controller providing button events. How do I implement this?
0
0
120
Feb ’26
Threadgroup configuration for tile shading
Hello! I have a question about how thread groups work with tile shading. When running "traditional" compute, I get to choose both thread group size and the grid size. However, when using tile shading kernel I only have dispatchThreadsPerTile method - this controls how many threads will be ran in each tile. So far so good, but what about thread groups? The examples in video "Tile Shading on A11" seem to suggest that there will be only one thread group per tile. In the video, [[thread_index_in_threadgroup]] is called "local_id" and it is used to access the image block. I assume this is the default configuration. So when one does the following: Creates MTLRenderPassDescriptor with tileWidth set to W and tileHeight set to H Fires up the tile shading kernel using dispatchThreadsPerTile with MTLSize size = { W, H, 1 } I understand that the result is 1-to-1 mapping between the tile "pixels" and kernel threads. Now, what I would like to do is to have more than one thread group there. I want this for performance reasons: I have a certain compute kernel which I know executes very well with small thread group size. In fact, { 32, 1, 1 } seems to be the fastest. My understanding is that even if I set tile size to 16x16, and so I am executing 256 threads there, there will only be one SIMD group active in a thread group. Meaning that this SIMD group has to execute 8 times over the tile. Is it possible somehow? Or perhaps the limitations of the API are pointing at the limitations of hardware itself, and if I want to execute with SIMD group sized thread groups I have to use "traditional" compute encoder? Will be grateful for help. Michał
0
0
89
Mar ’25
Does VisionOS have the equivalent of ARView.physicsOrigin?
I'm trying to scale the physics of a scene without changing the apparent size to avoid the low-speed zeroing-out of motion that the physics simulation does. I found a technique for using separate simulation and physics roots in the docs, but it relies on ARView, which VisionOS doesn't have. This seems more elegant than scaling absolutely everything with shared root -- any chance I'm just failing in my searches to find the equivalent functionality?
0
0
12
2h
'__abort_with_payload' from CompositorNonUI on visionOS 26.2 (device + simulator, Omniverse streaming)
I am developing a custom app for Apple Vision Pro using Compositor Services to stream content from NVIDIA Omniverse. The app is based on: https://github.com/NVIDIA-Omniverse/apple-configurator-sample Environment: Device: Apple Vision Pro OS Version: visionOS 26.2 Xcode Version: 26.2 The Issue: The application crashes hard (__abort_with_payload) in "libsystem_kernel.dylib" on Task 6 immediately after initialization. This appears to be a deliberate abort triggered by the compositor, not a typical crash. The issue occurs on both physical device and simulator. Important detail: The console output shows a specific CLIENT BUG assertion. By checking the metadata of the warning, I found that it is related to "Library: CompositorNonUI". Relevant console output before abort: Missed 'FrameLimiter' target of 90.0 Hz running compositor services to get IPD, FOV, etc fence tx observer 14f27 timed out after 0.600000 fence tx observer bc1b timed out after 0.600000 BUG IN CLIENT: For mixed reality experiences please use cp_drawable_compute_projection API
0
0
138
Jan ’26
How to apply the same SystemImage to both mainEmitter and spawnedEmitter without clipping in ParticleEmitterComponent?
Hi everyone, I’m currently learning about ParticleEmitterComponentParticleEmitterComponent and exploring the sample app provided in the Simulating particles in your visionOS app documentation. In the sample app, when I set the EmitterPreset to fireworks from the settings panel on the left side of the window and choose SystemImage, I noticed two issues: The image applied to mainEmitter appears clipped or cropped. The image on spawnedEmitter does not update to the selected SystemImage. What I want to achieve: Apply the same SystemImage to both mainEmittermainEmitter and spawnedEmitterspawnedEmitter so that it displays correctly without clipping. Remove the animation that changes the size of spawnedEmitterspawnedEmitter over time and keep it at a constant size. Could someone explain which properties should be adjusted to achieve this behavior? Any guidance or examples would be greatly appreciated. Thanks in advance!
Replies
0
Boosts
0
Views
485
Activity
Sep ’25
Apple Unity plugin issue
I use unity 2020.3.48f1 to develop a game; trying to implement Apple Services integration I use Apple unity plugins(https://github.com/apple/unityplugins) Using latest version of unity plugins I getting error in Unity project after plugin import It say "Not allowed platform VisionOS" When I tryed to use older version of the plugins I getting error on runtime when calling "var fetchItemsResponse = await GKLocalPlayer.Local.FetchItems();" in line 42 it drop EXC_BAD_ACCESS(code=257, address=0x0000...) error I tryed to use different commits from official repositorys and even custom branches of apple unity plugins like (https://github.com/muZZkat/unityplugins/tree/muzzkat/fix-fetch-items) but it did not help There is whole my script which trying to use apple unuity plugins using System.Threading.Tasks; using UnityEngine; using System.Collections; using System; using Apple.GameKit; using UnityEngine.UI; public class TheScript : MonoBehaviour { [SerializeField] InputField otp; string Signature; string TeamPlayerID; string Salt; string PublicKeyUrl; string Timestamp; void Start() { StartCoroutine(Call()); } private IEnumerator Call() { yield return new WaitForSeconds(5); Login(); } public async Task Login() { otp.text += $"Loginig... "; if (!Apple.GameKit.GKLocalPlayer.Local.IsAuthenticated) { try { var player = await GKLocalPlayer.Authenticate(); var localPlayer = GKLocalPlayer.Local; TeamPlayerID = localPlayer.TeamPlayerId; var fetchItemsResponse = await GKLocalPlayer.Local.FetchItems(); Signature = Convert.ToBase64String(fetchItemsResponse.GetSignature()); PublicKeyUrl = fetchItemsResponse.PublicKeyUrl; otp.text += $"Team Player ID: {TeamPlayerID} "; otp.text += $"PublicKeyUrl: {PublicKeyUrl} "; } catch(Exception e) { otp.text += $"Error: " + e.Message; } } else { Debug.Log("AppleGameCenter player already logged in."); } } async Task SignInWithAppleGameCenterAsync(string signature, string teamPlayerId, string publicKeyURL, string salt, ulong timestamp) { } }
Replies
0
Boosts
1
Views
211
Activity
May ’25
Deterministic RNG behaviour across Mac M1 CPU and Metal GPU – BigCrush pass & structural diagnostics
Hello, I am currently working on a research project under ENINCA Consulting, focused on advanced diagnostic tools for pseudorandom number generators (structural metrics, multi-seed stability, cross-architecture reproducibility, and complementary indicators to TestU01). To validate this diagnostic framework, I prototyped a small non-linear 64-bit PRNG (not as a goal in itself, but simply as a vehicle to test the methodology). During these evaluations, I observed something interesting on Apple Silicon (Mac M1): • bit-exact reproducibility between M1 ARM CPU and M1 Metal GPU, • full BigCrush pass on both CPU and Metal backends, • excellent p-values, • stable behaviour across multiple seeds and runs. This was not the intended objective, the goal was mainly to validate the diagnostic concepts, but these results raised some questions about deterministic compute behaviour in Metal. My question: Is there any official guidance on achieving (or expecting) deterministic RNG or compute behaviour across CPU ↔ Metal GPU on Apple Silicon? More specifically: • Are deterministic compute kernels expected or guaranteed on Metal for scientific workloads? • Are there recommended patterns or best practices to ensure reproducibility across GPU generations (M1 → M2 → M3 → M4)? • Are there known Metal features that can introduce non-determinism? I am not sharing the internal recurrence (this work is proprietary), but I can discuss the high-level diagnostic observations if helpful. Thank you for any insight, very interested in how the Metal engineering team views deterministic compute patterns on Apple Silicon. Pascal ENINCA Consulting
Replies
0
Boosts
0
Views
214
Activity
Nov ’25
Optimizing HZB Mip-Chain Generation and Bindless Argument Tables in a Custom Metal Engine
Hi everyone, I’ve been developing a custom, end-to-end 3D rendering engine called Crescent from scratch using C++20 and Metal-cpp (targeting macOS and visionOS). My primary goal is to build a zero-bottleneck, GPU-driven pipeline that maximizes the potential of Apple Silicon’s Unified Memory and TBDR architecture. While the fundamental systems are stable, I am looking for architectural feedback from Metal framework engineers regarding specific synchronization and latency challenges. Current Core Implementations: GPU-Driven Instance Culling: High-performance occlusion culling using a Hierarchical Z-Buffer (HZB) approach via Compute Shaders. Clustered Forward Shading: Support for high-count dynamic lights through view-space clustering. Temporal Stability: Custom TAA with history rejection and Motion Blur resolve. Asset Infrastructure: Robust GUID-based scene serialization and a JSON-driven ECS hierarchy. The Architectural Challenge: I am currently seeing slight synchronization overhead when generating the HZB mip-chain. On Apple Silicon, I am evaluating the cost of encoder transitions versus cache-friendly barriers. && m_hzbInitPipeline && m_hzbDownsamplePipeline && !m_hzbMipViews.empty(); if (canBuildHzb) { MTL::ComputeCommandEncoder* hzbInit = commandBuffer->computeCommandEncoder(); hzbInit->setComputePipelineState(m_hzbInitPipeline); hzbInit->setTexture(m_depthTexture, 0); hzbInit->setTexture(m_hzbMipViews[0], 1); if (m_pointClampSampler) { hzbInit->setSamplerState(m_pointClampSampler, 0); } else if (m_linearClampSampler) { hzbInit->setSamplerState(m_linearClampSampler, 0); } const uint32_t hzbWidth = m_hzbMipViews[0]->width(); const uint32_t hzbHeight = m_hzbMipViews[0]->height(); const uint32_t threads = 8; MTL::Size tgSize = MTL::Size(threads, threads, 1); MTL::Size gridSize = MTL::Size((hzbWidth + threads - 1) / threads * threads, (hzbHeight + threads - 1) / threads * threads, 1); hzbInit->dispatchThreads(gridSize, tgSize); hzbInit->endEncoding(); for (size_t mip = 1; mip < m_hzbMipViews.size(); ++mip) { MTL::Texture* src = m_hzbMipViews[mip - 1]; MTL::Texture* dst = m_hzbMipViews[mip]; if (!src || !dst) { continue; } MTL::ComputeCommandEncoder* downEncoder = commandBuffer->computeCommandEncoder(); downEncoder->setComputePipelineState(m_hzbDownsamplePipeline); downEncoder->setTexture(src, 0); downEncoder->setTexture(dst, 1); const uint32_t mipWidth = dst->width(); const uint32_t mipHeight = dst->height(); MTL::Size downGrid = MTL::Size((mipWidth + threads - 1) / threads * threads, (mipHeight + threads - 1) / threads * threads, 1); downEncoder->dispatchThreads(downGrid, tgSize); downEncoder->endEncoding(); } if (m_instanceCullHzbPipeline) { dispatchInstanceCulling(m_instanceCullHzbPipeline, true); } } My Questions: Encoder Synchronization: Would you recommend moving this loop into a single ComputeCommandEncoder using MTLBarrier between dispatches to maintain L2 cache residency, or is the overhead of separate encoders negligible for depth-downsampling on TBDR? visionOS Bindless Latency: For stereo rendering on visionOS, what are the best practices for managing MTL4ArgumentTable updates at 90Hz+? I want to ensure that updating bindless resources for each eye doesn't introduce unnecessary CPU-to-GPU latency. Memory Management: Are there specific hints for Memoryless textures that could be applied to intermediate HZB levels to save bandwidth during this process? I’ve attached a screenshot of a scene rendered with the engine (PBR, SSR, and IBL).
Replies
0
Boosts
0
Views
435
Activity
Feb ’26
Deterministic RNG behaviour across Mac M1 CPU and Metal GPU – BigCrush pass & structural diagnostics
Hello, I am currently working on a research project under ENINCA Consulting, focused on advanced diagnostic tools for pseudorandom number generators (structural metrics, multi-seed stability, cross-architecture reproducibility, and complementary indicators to TestU01). To validate this diagnostic framework, I prototyped a small non-linear 64-bit PRNG (not as a goal in itself, but simply as a vehicle to test the methodology). During these evaluations, I observed something interesting on Apple Silicon (Mac M1): • bit-exact reproducibility between M1 ARM CPU and M1 Metal GPU, • full BigCrush pass on both CPU and Metal backends, • excellent p-values, • stable behaviour across multiple seeds and runs. This was not the intended objective, the goal was mainly to validate the diagnostic concepts, but these results raised some questions about deterministic compute behaviour in Metal. My question: Is there any official guidance on achieving (or expecting) deterministic RNG or compute behaviour across CPU ↔ Metal GPU on Apple Silicon? More specifically: • Are deterministic compute kernels expected or guaranteed on Metal for scientific workloads? • Are there recommended patterns or best practices to ensure reproducibility across GPU generations (M1 → M2 → M3 → M4)? • Are there known Metal features that can introduce non-determinism? I am not sharing the internal recurrence (this work is proprietary), but I can discuss the high-level diagnostic observations if helpful. Thank you for any insight, very interested in how the Metal engineering team views deterministic compute patterns on Apple Silicon. Pascal ENINCA Consulting
Replies
0
Boosts
0
Views
284
Activity
Nov ’25
How can I uninstall game-porting-toolkit completely
So, I'm done with GPTK and decided to delete it. The only thing I installed was brew -v install apple/apple/game-porting-toolkit and the external libraries from the ditto command. Now, I tried to remove it, but even after brew remove game-porting-toolkit brew autoremove all of the dependencies installed with brew are still there. The most obvious was game-porting-toolkit-compiler, but even after removing this there are so many libraries that are now orphaned and it's just impossible to manually identify those. Is there a way or is the easiest way to simply uninstall Homebrew completely and reinstall it again?
Replies
0
Boosts
0
Views
281
Activity
May ’25
Hover effects w/ Compositor Services w/ PSVR2 controllers
Hi, I would like clarification on whether the new hover effects feature introduced in vision os 26 supported pinch gestures through the psvr 2 controllers. In your sample application, I was not able to confirm that this was working. Only pinch clicking with my hands worked. Pulling the trigger on the controller whilst looking at a 3d object did not activate the hover effect spatial event in the sample application. (The object is showing the highlight though) This is inconsistent with hover effect behavior with psvr2 controllers on swift ui views, where the trigger press does count as a button click. The sample I used was this one: https://developer.apple.com/documentation/compositorservices/rendering_hover_effects_in_metal_immersive_apps
Replies
0
Boosts
0
Views
454
Activity
Jan ’26
Combine 2 animations in RealityKit
Hello I would like to know how to combine 2 animations with RealityKit (one animation for the arms and one for the legs for example) I saw this apple demo that seems to explain it but I don't understand at all how to do it... Thanks
Replies
0
Boosts
0
Views
495
Activity
Jul ’25
Can a compute pipeline be as efficient as a render pipeline for rasterization?
I'm new to graphics and game design and I just wanted to know if a compute pipeline could be as efficient as a render pipeline for rasterization and an explanation on how and why. Also is it possible to manually perform rasterization with a render pipeline as in manipulate individual pixel data in a metal texture yourself but do it with a render pipeline?
Replies
0
Boosts
0
Views
163
Activity
2w
MTL4FXTemporalDenoisedScaler initialization
I’m trying to use MTL4FXTemporalDenoisedScaler, and I’m seeing a crash during initialization even with a very simple sample app. I created a minimal sample here: https://github.com/tatsuya-ogawa/MetalFXInitExample The exception is: NSException: "-[AGXG16XFamilyHeap baseObject]: unrecognized selector sent to instance ..." What I found is: • This works: descriptor.makeTemporalDenoisedScaler(device: device) • This crashes: descriptor.makeTemporalDenoisedScaler(device: device, compiler: metal4Compiler) So the issue seems to happen only with the Metal4FX version. For testing, I’m using an iPhone 15 Pro. According to the Metal Feature Set Tables, MetalFX denoised upscaling should be supported on Apple9 and later, so I believe the device itself should meet the requirements. Reference: https://developer.apple.com/metal/Metal-Feature-Set-Tables.pdf Has anyone seen this before, or knows what might be causing it? I’d appreciate any advice. Thanks.
Replies
0
Boosts
0
Views
44
Activity
5d
Unity iOS Game Name Display Issue
When building a Unity iOS game, the app name displays incorrectly as "BigBall" on the iPhone home screen, despite setting the project name and bundle identifier to "Big Ball" in Unity and Apple Developer account. The correct name, "Big Ball," appears in TestFlight. I tried solutions from ChatGPT and DeepSeek, but none were satisfactory. Please help me.
Replies
0
Boosts
0
Views
104
Activity
Feb ’26
Open Shading Language (OSL) in Metal
Hi. I'm a 3D designer, using Blender for most of my work. The most recent Blender conference discussed utilizing the Open Shading Language (OSL) in their latest versions, which allows designers to write custom shaders for their workflows. At the moment, only Nvidia Optix GPU's can utilize this language for rendering (from what I understand), but Blender developers stated they are waiting on other GPU manufacturers to implement this feature as well. I'm not sure if there are any licensing issues here, but would this be something Apple could implement in Metal to make their hardware more attractive to the 3D design community? Any help or knowledge on this topic would be greatly appreciated.
Replies
0
Boosts
0
Views
265
Activity
Feb ’26
Has anyone been able to create a window/portal using metal.
I am trying to create a simple portal like that in RealityKit, but using metal instead of RealityKit. Has anyone been able to create a window or portal like thing to show a skybox outside in mixed Reality?
Replies
0
Boosts
0
Views
224
Activity
Jan ’26
Xcode Metal Capture crash when using MTLSamplerState
The sample code just draw a triangle and sample texture. both sample code can draw a correct triangle and sample texture as expected. there are no error message from terminal. Sample code using constexpr Sampler can capture and replay well. Sample code using a argumentTable to bind a MTLSamplerState was crashed when using Metal capture and replay on Xcode. Here are sample codes. Sample Code Test Environment: M1 Pro MacOS 26.3 (25D125) Xcode Version 26.2 (17C52) Feedback ID: FB22031701
Replies
0
Boosts
0
Views
106
Activity
3w
The description of set_indices in the MSL reference seems incorrect.
I'm currently learning Metal. While reading the reference, I came across a strange description. Page 78 in Version 4 Reference (2025-10-25) says: It is legal to call the following set_indices functions to set the indices if the position in the index buffer is valid and if the position in the index buffer is a multiple of 2 (uchar2 overload) or 2 (uchar4 overload). The index I needs to be in the range [0, max_indices). void set_indices(uint I, uchar2 v); void set_indices(uint I, uchar4 v); However, it seems that the uchar4 overload should be multiple of 4. Furthermore, there is no explanation of what these methods actually do. I believe it involves setting two to four consecutive indices at once, but there is no mention of that here. I would like to know if the above understanding is correct.
Replies
0
Boosts
0
Views
109
Activity
Feb ’26
MTLBinaryArchive Size
I'm trying to use MTLBinaryArchive. I collected a BinaryArchive from one device and used metal-tt to translate it for all supported iPhone devices, ranging from iPhone 7 Plus to iPhone 16. However, this BinaryArchive is quite large, around 1.5GB uncompressed, and about 500MB compressed in the IPA. I'm wondering how to address the size issue. I watched the WWDC 2022 video, which mentioned that the operating system or app installation process would handle compatibility. Does this compatibility support different GPU chips? I tried installing an IPA with a BinaryArchive collected only from an iPhone 12 on an iPhone 13, but the BinaryArchive didn't take effect. I also saw that Apple supports App Thinning. However, it seems that resources in the Asset Catalog cannot be accessed via URL, and creating an MTLBinaryArchive requires a URL. Is it possible for MTLBinaryArchive to be distributed through App Thinning? The WWDC 2022 video also mentioned using the -Os optimization flag to reduce size. Can this give an estimate of how much compression it would achieve? Are there any methods to solve the BinaryArchive size issue without impacting performance?
Replies
0
Boosts
1
Views
114
Activity
Mar ’25
How do I control a SwiftUI TextField with a game controller?
I've coded a text-adventure game in SwiftUI. (My game has no graphics or sound effects.) My app already supports keyboard navigation; I would like to add support for game controllers on iPhone. I can't figure out how to do it. I especially can't see any way to allow controller users to enter text in a TextField. I've read https://developer.apple.com/documentation/gamecontroller/supporting-game-controllers and it's all about button events. There's no reference to SwiftUI at all in that documentation, or any input-method editing at all. The only mention of "keyboard" is about treating the keyboard itself as if it were a game controller providing button events. How do I implement this?
Replies
0
Boosts
0
Views
120
Activity
Feb ’26
Threadgroup configuration for tile shading
Hello! I have a question about how thread groups work with tile shading. When running "traditional" compute, I get to choose both thread group size and the grid size. However, when using tile shading kernel I only have dispatchThreadsPerTile method - this controls how many threads will be ran in each tile. So far so good, but what about thread groups? The examples in video "Tile Shading on A11" seem to suggest that there will be only one thread group per tile. In the video, [[thread_index_in_threadgroup]] is called "local_id" and it is used to access the image block. I assume this is the default configuration. So when one does the following: Creates MTLRenderPassDescriptor with tileWidth set to W and tileHeight set to H Fires up the tile shading kernel using dispatchThreadsPerTile with MTLSize size = { W, H, 1 } I understand that the result is 1-to-1 mapping between the tile "pixels" and kernel threads. Now, what I would like to do is to have more than one thread group there. I want this for performance reasons: I have a certain compute kernel which I know executes very well with small thread group size. In fact, { 32, 1, 1 } seems to be the fastest. My understanding is that even if I set tile size to 16x16, and so I am executing 256 threads there, there will only be one SIMD group active in a thread group. Meaning that this SIMD group has to execute 8 times over the tile. Is it possible somehow? Or perhaps the limitations of the API are pointing at the limitations of hardware itself, and if I want to execute with SIMD group sized thread groups I have to use "traditional" compute encoder? Will be grateful for help. Michał
Replies
0
Boosts
0
Views
89
Activity
Mar ’25
Does VisionOS have the equivalent of ARView.physicsOrigin?
I'm trying to scale the physics of a scene without changing the apparent size to avoid the low-speed zeroing-out of motion that the physics simulation does. I found a technique for using separate simulation and physics roots in the docs, but it relies on ARView, which VisionOS doesn't have. This seems more elegant than scaling absolutely everything with shared root -- any chance I'm just failing in my searches to find the equivalent functionality?
Replies
0
Boosts
0
Views
12
Activity
2h
'__abort_with_payload' from CompositorNonUI on visionOS 26.2 (device + simulator, Omniverse streaming)
I am developing a custom app for Apple Vision Pro using Compositor Services to stream content from NVIDIA Omniverse. The app is based on: https://github.com/NVIDIA-Omniverse/apple-configurator-sample Environment: Device: Apple Vision Pro OS Version: visionOS 26.2 Xcode Version: 26.2 The Issue: The application crashes hard (__abort_with_payload) in "libsystem_kernel.dylib" on Task 6 immediately after initialization. This appears to be a deliberate abort triggered by the compositor, not a typical crash. The issue occurs on both physical device and simulator. Important detail: The console output shows a specific CLIENT BUG assertion. By checking the metadata of the warning, I found that it is related to "Library: CompositorNonUI". Relevant console output before abort: Missed 'FrameLimiter' target of 90.0 Hz running compositor services to get IPD, FOV, etc fence tx observer 14f27 timed out after 0.600000 fence tx observer bc1b timed out after 0.600000 BUG IN CLIENT: For mixed reality experiences please use cp_drawable_compute_projection API
Replies
0
Boosts
0
Views
138
Activity
Jan ’26