Created attachment 134033
backtrace of vulkaninfo crash
While testing Vulkan support for my 390x on git master, I ran across the following issue.
Running vulkaninfo crashes and this line shows up in the journal:
traps: vulkaninfo trap divide error ip:7f9777251563 sp:7fff7aa100e8 error:0 in libvulkan_radeon.so[7f97771ed000+1a5000]
I also ran git bisect, which indicates that this issue was introduced by commit 180c1b924e1ed3a2918fad9c5cbb653524de8233.
Attached is a backtrace from the crash, which occurs in radv_pipeline_scratch_init in src/vulkan/radv_pipeline.c. The only divide-by-zero that looks possible in that function is on line 763, if pipeline->shaders[i]->config.num_vgprs is zero.
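To illustrate the suspected failure mode, here is a minimal sketch of a guarded division. It is not the code in radv_pipeline.c: the helper name waves_per_simd and the budget of 256 VGPRs are assumptions made for this example only.

```c
#include <stdint.h>

/* Hypothetical helper, NOT the actual Mesa code. It computes how many
 * waves fit when each wave needs num_vgprs registers, assuming a budget
 * of 256 VGPRs (an assumption for this sketch). */
static uint32_t waves_per_simd(uint32_t num_vgprs)
{
    const uint32_t vgpr_budget = 256; /* assumed value */

    /* An unguarded "vgpr_budget / num_vgprs" raises SIGFPE on x86 when
     * num_vgprs is 0, which matches the "divide error" trap in the journal.
     * Returning 0 lets the caller detect a bogus shader config instead. */
    if (num_vgprs == 0)
        return 0;

    return vgpr_budget / num_vgprs;
}
```

With a guard like this, a shader config that reports zero VGPRs shows up as a zero wave count the caller can check, rather than as a crash.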
I'm running the 4.13.0 kernel on NixOS.
Author: Bas Nieuwenhuizen <email@example.com>
Date: Wed Aug 16 09:09:56 2017 +0200
ac/nir: Add shader support for multiviews.
It uses an user SGPR to pass the view index to the shaders, except
for the fragment shader where we use layer=view (which comes in
handy when we want to do the NV ext that allows us to execute pre-FS
stages once instead of per view).
Reviewed-by: Dave Airlie <firstname.lastname@example.org>
I can't reproduce. Can you do a crashing run with RADV_DEBUG=shaders,shaderstats and then upload the stdout+stderr?
If num_vgprs is really 0, I'd think something is seriously wrong.
Strangely, I can no longer reproduce the original error. This is true even when booting the exact same system configuration (same kernel, mesa, and all other libraries and executables tracked by NixOS). This is surprising given how consistently the bisect seemed to be working a few days ago. In fact, I'm finally able to run SteamVR with my 390x.
I'll go ahead and close this issue, and reopen it if I'm ever able to reproduce the crash consistently again. Thanks anyway.
I had the same problem on a previously working system, with a very similar looking backtrace:
Thread 1 "vulkaninfo" received signal SIGFPE, Arithmetic exception.
0x00007ffff62aa2f3 in radv_pipeline_scratch_init (pipeline=pipeline@entry=0x555555a66c50, device=<optimized out>,
device=<optimized out>) at ../../../../src/amd/vulkan/radv_pipeline.c:117
I noticed that running vulkaninfo as root, or as another user worked.
Just deleting the file ~/.cache/radv_builtin_shaders fixed it here.
I'm not sure whether the file simply got corrupted because I killed a Vulkan app abruptly, or whether it had something to do with switching between 64-bit and 32-bit apps and drivers.
(I'm not entirely sure how to regenerate the file to recreate the bug either; just running vulkan-smoketest doesn't seem to suffice.)