Summary: | [BXT,BSW] arb_gpu_shader_fp64 causes gpu hang | | |
---|---|---|---|
Product: | Mesa | Reporter: | Mark Janes <mark.a.janes> |
Component: | Drivers/DRI/i965 | Assignee: | Mark Janes <mark.a.janes> |
Status: | RESOLVED FIXED | QA Contact: | Intel 3D Bugs Mailing List <intel-3d-bugs> |
Severity: | major | | |
Priority: | medium | CC: | ben, siglesias |
Version: | git | | |
Hardware: | x86-64 (AMD64) | | |
OS: | Linux (All) | | |
Whiteboard: | | | |
i915 platform: | | i915 features: | |
Bug Depends on: | | | |
Bug Blocks: | 96253 | | |
Description
Mark Janes
2016-05-17 20:56:31 UTC
(In reply to Mark Janes from comment #0)
> When fp64 was enabled, corresponding piglit tests cause gpu hang on bxt and
> bsw:
>
> 2016-05-17T11:37:58,456330-0700 [drm] GPU HANG: ecode 8:0:0x85dffffb, in
> shader_runner [23184], reason: Ring hung, action: reset
> Hanging Test:
> piglit.spec.arb_gpu_shader_fp64.uniform_buffers.fs-dvec4-uniform-array-direct-indirect

Hi Mark, I don't think we have BXT or BSW hardware here, so I am afraid we would need some help from Intel :(

Also, I gather from your report that there are multiple tests producing hangs like this on these two platforms?

(In reply to Iago Toral from comment #1)
> (In reply to Mark Janes from comment #0)
> > When fp64 was enabled, corresponding piglit tests cause gpu hang on bxt and
> > bsw:
> >
> > 2016-05-17T11:37:58,456330-0700 [drm] GPU HANG: ecode 8:0:0x85dffffb, in
> > shader_runner [23184], reason: Ring hung, action: reset
> > Hanging Test:
> > piglit.spec.arb_gpu_shader_fp64.uniform_buffers.fs-dvec4-uniform-array-direct-indirect
>
> Hi Mark, I don't think we have BXT or BSW hardware here so I am afraid we
> would need some help from Intel :(
>
> Also I get from your report that there are multiple tests producing hangs
> like this on these two platforms?

Would it be possible to get a list of the tests that cause a hang? If they are related to a specific feature of fp64, it would help us narrow down the problem to something more specific.

Two tests fail on bsw, related to this bug, which are not covered by the set of tests with arb_gpu_shader_fp64 in the name:
- piglit.spec.arb_tessellation_shader.execution.dmat-vs-gs-tcs-tes
- piglit.spec.arb_tessellation_shader.execution.dvec3-vs-tcs-tes

I re-enabled fp64, and encountered gpu hang on:
piglit.spec.arb_gpu_shader_fp64.uniform_buffers.fs-dvec4-uniform-array-direct-indirect

For this run, only one gpu hang occurred.

The 'chv-fixes' branch of my tree may fix this.

https://cgit.freedesktop.org/~kwg/mesa/commit/?h=chv-fixes

I don't have a Braswell or Broxton system to test with me, so I haven't tested these at all. They do make one of the hanging tests happier in the simulator.

(In reply to Kenneth Graunke from comment #5)
> The 'chv-fixes' branch of my tree may fix this.
>
> https://cgit.freedesktop.org/~kwg/mesa/commit/?h=chv-fixes
>
> I don't have a Braswell or Broxton system to test with me, so I haven't
> tested these at all. They do make one of the hanging tests happier in the
> simulator.

Thanks Ken! We finally got a couple of BSW systems today, so we will be able to look into this. We'll check with your branch too and let you know the results.

I wrote a patch that fixes the GPU hang in BSW:

https://github.com/samuelig/mesa/commit/1b0350a566f5f7d23f1d96226b9dc8f85aff0d30

It is included in my wip/siglesias/chv-fixes branch, which you can clone by running this command:

$ git clone -b wip/siglesias/chv-fixes https://github.com/samuelig/mesa.git

This branch includes Ken's patches.

Mark, can you run it in the CI to see if it fixes the GPU hang in BXT and doesn't add regressions on other generations? I ran piglit on BSW and there are no GPU hangs and, on BDW, there are no regressions compared to current master.

Samuel's branch resolves BSW/BXT gpu hangs, and doesn't generate other regressions.

Beyond the hang, there are nearly 60 fp64 piglit tests that do not pass on BSW or BXT. For example:

piglit.spec.arb_gpu_shader_fp64.execution.conversion.vert-conversion-explicit-bvec2-dvec2

Standard Output
bin/shader_runner /tmp/build_root/m64/lib/piglit/generated_tests/spec/arb_gpu_shader_fp64/execution/conversion/vert-conversion-explicit-bvec2-dvec2.shader_test -auto
piglit: debug: Requested an OpenGL 3.2 Core Context, and received a matching 4.3 context
Probe color at (0,2)
Expected: 0 255 0 255
Observed: 3 252 0 255

I need guidance as to whether these test failures should gate the 12.0 release or should be written up in a separate bug.
full list of failing tests:

piglit.spec.arb_gpu_shader_fp64.execution.conversion.frag-conversion-explicit-bool-double
piglit.spec.arb_gpu_shader_fp64.execution.conversion.frag-conversion-explicit-bvec2-dvec2
piglit.spec.arb_gpu_shader_fp64.execution.conversion.frag-conversion-explicit-bvec3-dvec3
piglit.spec.arb_gpu_shader_fp64.execution.conversion.frag-conversion-explicit-bvec4-dvec4
piglit.spec.arb_gpu_shader_fp64.execution.conversion.geom-conversion-explicit-bool-double
piglit.spec.arb_gpu_shader_fp64.execution.conversion.geom-conversion-explicit-bvec2-dvec2
piglit.spec.arb_gpu_shader_fp64.execution.conversion.geom-conversion-explicit-bvec3-dvec3
piglit.spec.arb_gpu_shader_fp64.execution.conversion.geom-conversion-explicit-bvec4-dvec4
piglit.spec.arb_gpu_shader_fp64.execution.conversion.vert-conversion-explicit-bool-double
piglit.spec.arb_gpu_shader_fp64.execution.conversion.vert-conversion-explicit-bvec2-dvec2
piglit.spec.arb_gpu_shader_fp64.execution.conversion.vert-conversion-explicit-bvec3-dvec3
piglit.spec.arb_gpu_shader_fp64.execution.conversion.vert-conversion-explicit-bvec4-dvec4
piglit.spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-mixed-shader
piglit.spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-shader
piglit.spec.arb_gpu_shader_fp64.shader_storage.layout-std430-fp64-mixed-shader
piglit.spec.arb_gpu_shader_fp64.shader_storage.layout-std430-fp64-shader
piglit.spec.arb_gpu_shader_fp64.uniform_buffers.fs-double-uniform-array-direct-indirect
piglit.spec.arb_gpu_shader_fp64.uniform_buffers.gs-double-uniform-array-direct-indirect
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat2 array
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat2 arrays_of_arrays
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat2 separate
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat2x3 array
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat2x3 arrays_of_arrays
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat2x3 separate
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat2x4 array
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat2x4 arrays_of_arrays
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat2x4 separate
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat3 array
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat3 arrays_of_arrays
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat3 separate
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat3x2 array
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat3x2 arrays_of_arrays
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat3x2 separate
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat3x4 array
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat3x4 arrays_of_arrays
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat3x4 separate
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat4 array
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat4 arrays_of_arrays
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat4 separate
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat4x2 array
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat4x2 arrays_of_arrays
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat4x2 separate
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat4x3 array
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat4x3 arrays_of_arrays
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat4x3 separate
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple double array
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple double arrays_of_arrays
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple double separate
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dvec2 array
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dvec2 arrays_of_arrays
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dvec2 separate
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dvec3 array
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dvec3 arrays_of_arrays
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dvec3 separate
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dvec4 array
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dvec4 arrays_of_arrays
piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dvec4 separate

(In reply to Mark Janes from comment #9)
> Beyond the hang, there are nearly 60 fp64 piglit tests that do not pass on
> BSW or BXT.
> [...]
> I need guidance as to whether these test failures should gate the 12.0
> release or should be written up in a separate bug.

OK, I am going to take a look at them and tell you what to do with them.

Thanks!

(In reply to Samuel Iglesias from comment #11)
> (In reply to Mark Janes from comment #9)
> > Beyond the hang, there are nearly 60 fp64 piglit tests that do not pass on
> > BSW or BXT.
> > > [...]
> > I need guidance as to whether these test failures should gate the 12.0
> > release or should be written up in a separate bug.
>
> OK, I am going to take a look at them and tell you what to do with them.
>
> Thanks!

We have written a couple of patches to fix these tests:

$ git clone -b wip/siglesias/chv-fixes https://github.com/samuelig/mesa.git

The fixes are specific to Cherryview, so they might be failing in BXT. There is one patch from Kenneth that did not land master but it is still in my branch. With these patches I got 0 piglit regressions on BSW and BDW and the remaining fp64 failed tests on BSW are fixed.

Mark, Would you mind testing them in CI system? Our plan is to land them in master before the 4th release candidate (planned on Friday, AFAIK), so they will be part of the final 12.0 release.

Thanks!
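For triage, the failing-test names listed earlier in this bug can be grouped by the path component that follows arb_gpu_shader_fp64. The following is only a minimal Python sketch: the helper name group_by_subsuite and the short sample list are hypothetical and are not part of piglit or the Mesa CI; the sample names are copied from the list above.

```python
from collections import Counter


def group_by_subsuite(test_names, extension="arb_gpu_shader_fp64"):
    """Count test names by the dotted component right after `extension`.

    Hypothetical helper for illustration only; piglit itself does not
    provide this function.
    """
    counts = Counter()
    for name in test_names:
        parts = name.split(".")
        if extension not in parts:
            continue  # ignore names from other extensions
        idx = parts.index(extension)
        if idx + 1 < len(parts):
            counts[parts[idx + 1]] += 1
    return counts


# A few names copied from the failing-test list in this bug.
sample = [
    "piglit.spec.arb_gpu_shader_fp64.execution.conversion.frag-conversion-explicit-bool-double",
    "piglit.spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-shader",
    "piglit.spec.arb_gpu_shader_fp64.uniform_buffers.fs-double-uniform-array-direct-indirect",
    "piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat2 array",
    "piglit.spec.arb_gpu_shader_fp64.varying-packing.simple dmat2 separate",
]

if __name__ == "__main__":
    for subsuite, count in sorted(group_by_subsuite(sample).items()):
        print(subsuite, count)
```

Applied to the full list, a grouping like this makes it easier to see that the failures cluster in a few areas (conversions, shader storage, uniform buffers, and varying packing) rather than being spread across all of fp64.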
(In reply to Samuel Iglesias from comment #12)
> (In reply to Samuel Iglesias from comment #11)
> > (In reply to Mark Janes from comment #9)
> > > Beyond the hang, there are nearly 60 fp64 piglit tests that do not pass on
> > > BSW or BXT.
> > > > > [...]
> > > I need guidance as to whether these test failures should gate the 12.0
> > > release or should be written up in a separate bug.
> >
> > OK, I am going to take a look at them and tell you what to do with them.
> >
> > Thanks!
>
> We have written a couple of patches to fix these tests:
>
> $ git clone -b wip/siglesias/chv-fixes https://github.com/samuelig/mesa.git
>
> The fixes are specific to Cherryview, so they might be failing in BXT. There
> is one patch from Kenneth that did not land master but it is still in my
> branch. With these patches I got 0 piglit regressions on BSW and BDW and the
> remaining fp64 failed tests on BSW are fixed.
>
> Mark, Would you mind testing them in CI system? Our plan is to land them in
> master before the 4th release candidate (planned on Friday, AFAIK), so they
> will be part of the final 12.0 release.
>
> Thanks!

If they are failing in BXT too, you can test this branch instead:

$ git clone -b wip/siglesias/chv-bxt-fixes https://github.com/samuelig/mesa.git

If fp64 tests pass in BXT with the second branch, I will update the patches before pushing them to master (possibly sending a v2 first), because I have already sent them for review to save time.

All fp64 tests on bxt/bsw are fixed by the branch:

wip/siglesias/chv-bxt-fixes

thanks!

(In reply to Mark Janes from comment #14)
> All fp64 tests on bxt/bsw are fixed by the branch:
>
> wip/siglesias/chv-bxt-fixes
>
> thanks!

The following two patches landed master:

bdab572 i965/fs: indirect addressing with doubles is not supported in CHV/BSW/BXT
0177dbb i965/fs: Fix single-precision to double-precision conversions for CHV/BSW/BXT

There is still one patch from Kenneth that did not land master: "i965: Fix multiplication of immediates on Cherryview/Broxton."

The final patch has been merged.
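When checking later CI logs for a recurrence of the hang reported at the top of this bug, the kernel log line can be matched mechanically. The sketch below is a hedged Python example: the regular expression is derived only from the single "[drm] GPU HANG" line quoted in this report, so other kernel versions may format the message differently, and the helper name parse_gpu_hang is hypothetical.

```python
import re

# Pattern based solely on the one log line quoted in this bug report;
# this is an assumption, not a documented kernel log format.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[0-9a-fx:]+), "
    r"in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return a dict describing a GPU hang line, or None if it does not match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])  # the pid is numeric in the quoted line
    return info


# The log line from the original report.
LINE = (
    "2016-05-17T11:37:58,456330-0700 [drm] GPU HANG: ecode 8:0:0x85dffffb, "
    "in shader_runner [23184], reason: Ring hung, action: reset"
)

if __name__ == "__main__":
    print(parse_gpu_hang(LINE))
```

Keeping the ecode as an opaque string avoids guessing at the meaning of its individual fields, which this report does not explain.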