bisected to: a9031bf9b55602d93cccef6c926e2179c23205b4 Author: Andrii Simiklit <andrii.simiklit@globallogic.com> i965/batch: avoid reverting batch buffer if saved state is an empty There's no point reverting to the last saved point if that save point is the empty batch, we will just repeat ourselves. CC: Chris Wilson <chris@chris-wilson.co.uk> Fixes: 3faf56ffbdeb "intel: Add an interface for saving/restoring the batchbuffer state." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107626 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Previously, this test failed within 3s with: i965: Failed to submit batchbuffer: No space left on device
Because the build spun forever, Mesa CI timed the build out. Unfortunately, this was reported as a success :( The CI test results for this patch were invalid.
Reverting the patch fixes the test, without any other regressions shown in CI.
Andrii, I found the issue. It is not specific to g965. In the blorp code, on a retry, the check_aperture_failed_once variable will be reset back to false since we clear things before the retry. This small patch added to your change fixes it. Another idea could be to preset check_aperture_failed_once before the retry label. Can you re-submit the patch with this, or a similar change? diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c b/src/mesa/drivers/dri/i965/genX_blorp_exec.c index fd9ce93c6c..7b0f77af25 100644 --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c @@ -309,7 +309,7 @@ retry: intel_batchbuffer_require_space(brw, 1400); brw_require_statebuffer_space(brw, 600); intel_batchbuffer_save_state(brw); - check_aperture_failed_once = intel_batchbuffer_saved_state_is_empty(brw); + check_aperture_failed_once |= intel_batchbuffer_saved_state_is_empty(brw); brw->batch.no_wrap = true; #if GEN_GEN == 6
I reverted the original patch, fixing the regression.
(In reply to Mark Janes from comment #0) > bisected to: > > a9031bf9b55602d93cccef6c926e2179c23205b4 > Author: Andrii Simiklit <andrii.simiklit@globallogic.com> > > i965/batch: avoid reverting batch buffer if saved state is an empty > > There's no point reverting to the last saved point if that save point is > the empty batch, we will just repeat ourselves. > > CC: Chris Wilson <chris@chris-wilson.co.uk> > Fixes: 3faf56ffbdeb "intel: Add an interface for saving/restoring > the batchbuffer state." > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107626 > Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> > Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> > > > Previously, this test failed within 3s with: > i965: Failed to submit batchbuffer: No space left on device Yes you are right. It my mistake :( Sorry for that. Thanks a lot for your work. (In reply to Jordan Justen from comment #3) > Andrii, > > I found the issue. It is not specific to g965. In the blorp code, on a > retry, the check_aperture_failed_once variable will be reset back to > false since we clear things before the retry. This small patch added > to your change fixes it. > > Another idea could be to preset check_aperture_failed_once before > the retry label. > > Can you re-submit the patch with this, or a similar change? > > diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c > b/src/mesa/drivers/dri/i965/genX_blorp_exec.c > index fd9ce93c6c..7b0f77af25 100644 > --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c > +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c > @@ -309,7 +309,7 @@ retry: > intel_batchbuffer_require_space(brw, 1400); > brw_require_statebuffer_space(brw, 600); > intel_batchbuffer_save_state(brw); > - check_aperture_failed_once = intel_batchbuffer_saved_state_is_empty(brw); > + check_aperture_failed_once |= > intel_batchbuffer_saved_state_is_empty(brw); > brw->batch.no_wrap = true; > > #if GEN_GEN == 6 You are right. Thanks a lot for your fix. I will send updated patch shortly.
I investigated why I missed this regression on my local machine despite I checked this patch several times under piglit and found out that this test passes on my machine even with my original patch. So I guess it depends on some additional things. asimiklit@asimiklit-pc:~/projects/piglit$ bin/tex3d-maxsize -auto Probe tolerance: 0.023438, 0.023438, 0.023438, 0.023438 GL_MAX_3D_TEXTURE_SIZE = 2048 Actual max 3D texture size for GL_INTENSITY8: 1024 x 1024 x 1024 (1024 MB) Testing 1024 x 1024 x 256 GL_INTENSITY8 (256 MB) texture Testing 1024 x 1024 x 512 GL_INTENSITY8 (512 MB) texture Testing 1024 x 1024 x 1024 GL_INTENSITY8 (1024 MB) texture Actual max 3D texture size for GL_RGBA8: 512 x 512 x 1024 (1024 MB) Testing 512 x 512 x 256 GL_RGBA8 (256 MB) texture Testing 512 x 512 x 512 GL_RGBA8 (512 MB) texture Testing 512 x 512 x 1024 GL_RGBA8 (1024 MB) texture Actual max 3D texture size for GL_RGBA32F: 256 x 512 x 512 (1024 MB) Testing 256 x 512 x 128 GL_RGBA32F (256 MB) texture Testing 256 x 512 x 256 GL_RGBA32F (512 MB) texture Testing 256 x 512 x 512 GL_RGBA32F (1024 MB) texture PIGLIT: {"result": "pass" } So I will check my updated patch on Intel CI before sent.
(In reply to asimiklit from comment #6) > I investigated why I missed this regression on my local machine despite I > checked this patch several times under piglit and found out that this test > passes on my machine even with my original patch. So I guess it depends on > some additional things. > > asimiklit@asimiklit-pc:~/projects/piglit$ bin/tex3d-maxsize -auto > Probe tolerance: 0.023438, 0.023438, 0.023438, 0.023438 > GL_MAX_3D_TEXTURE_SIZE = 2048 > Actual max 3D texture size for GL_INTENSITY8: 1024 x 1024 x 1024 (1024 MB) > Testing 1024 x 1024 x 256 GL_INTENSITY8 (256 MB) texture > Testing 1024 x 1024 x 512 GL_INTENSITY8 (512 MB) texture > Testing 1024 x 1024 x 1024 GL_INTENSITY8 (1024 MB) texture > Actual max 3D texture size for GL_RGBA8: 512 x 512 x 1024 (1024 MB) > Testing 512 x 512 x 256 GL_RGBA8 (256 MB) texture > Testing 512 x 512 x 512 GL_RGBA8 (512 MB) texture > Testing 512 x 512 x 1024 GL_RGBA8 (1024 MB) texture > Actual max 3D texture size for GL_RGBA32F: 256 x 512 x 512 (1024 MB) > Testing 256 x 512 x 128 GL_RGBA32F (256 MB) texture > Testing 256 x 512 x 256 GL_RGBA32F (512 MB) texture > Testing 256 x 512 x 512 GL_RGBA32F (1024 MB) texture > PIGLIT: {"result": "pass" } > > > So I will check my updated patch on Intel CI before sent. The updated patch is delayed due to testing problems.
(In reply to asimiklit from comment #7) > (In reply to asimiklit from comment #6) > > I investigated why I missed this regression on my local machine despite I > > checked this patch several times under piglit and found out that this test > > passes on my machine even with my original patch. So I guess it depends on > > some additional things. > > > > The updated patch is delayed due to testing problems. I hope my comment ("It is not specific to g965") is not delaying things. What I meant is that "in theory" it could happen on any system. But, due to other factors, we only currently know of a reproducible case on g965. So, I would not hold off on re-posting the patch just because you cannot reproduce the bug on a newer machine. -Jordan
(In reply to Jordan Justen from comment #8) > (In reply to asimiklit from comment #7) > > (In reply to asimiklit from comment #6) > > > I investigated why I missed this regression on my local machine despite I > > > checked this patch several times under piglit and found out that this test > > > passes on my machine even with my original patch. So I guess it depends on > > > some additional things. > > > > > > > The updated patch is delayed due to testing problems. > > I hope my comment ("It is not specific to g965") is not delaying > things. What I meant is that "in theory" it could happen on any system. > But, due to other factors, we only currently know of a reproducible > case on g965. > > So, I would not hold off on re-posting the patch just because you > cannot reproduce the bug on a newer machine. > > -Jordan I just wanted to see "Build # XX - Successful!" from Intel CI for my updated patch to be 100% sure that there are no regressions. But on Friday I was unable to do it due to troubles with a 'mesa-ci.01.org' (I didn't have an ability to see results of Intel CI) and I was unable to get "Success" even for latest mesa master. Today I started one more test run: "global_logic - Build # 37 - Building!" for latest mesa master and when I receive 'Success' I will run it for my updated patch.
I pushed v4 of Andrii's patch: commit b787dcf57b7298868ce9b6885a827d57a6127ba1 Author: Andrii Simiklit <andrii.simiklit@globallogic.com> Date: Mon Nov 5 09:48:26 2018 +0200 i965/batch: avoid reverting batch buffer if saved state is an empty It appears to abort (as expected) with tex3d-maxsize on g965, rather than hanging the test: https://mesa-ci.01.org/jljusten/builds/215/results/9151567
(In reply to Jordan Justen from comment #10) > I pushed v4 of Andrii's patch: > > commit b787dcf57b7298868ce9b6885a827d57a6127ba1 > Author: Andrii Simiklit <andrii.simiklit@globallogic.com> > Date: Mon Nov 5 09:48:26 2018 +0200 > > i965/batch: avoid reverting batch buffer if saved state is an empty > > It appears to abort (as expected) with tex3d-maxsize on g965, > rather than hanging the test: > > https://mesa-ci.01.org/jljusten/builds/215/results/9151567 Thanks a lot)
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.