Bug 99354

Summary: [G71] "Assertion `bkref' failed" reproducible with glmark2
Product: Mesa
Reporter: Michele Ballabio <barra_cuda>
Component: Drivers/DRI/nouveau
Assignee: Nouveau Project <nouveau>
Status: RESOLVED FIXED
QA Contact: Nouveau Project <nouveau>
Severity: normal    
Priority: medium    
Version: 13.0   
Hardware: x86 (IA32)   
OS: Linux (All)   
Attachments: glmark2 backtrace, glmark2 apitrace

Description Michele Ballabio 2017-01-10 22:01:48 UTC
Created attachment 128875
glmark2 backtrace

[G71] "Assertion `bkref' failed" reproducible with glmark2

Using glmark2 I can trigger the assertion
"glmark2: pushbuf.c:238: pushbuf_krel: Assertion `bkref' failed."
quite easily. I'm not sure if this is a duplicate of other bugs.

The fastest way to reproduce seems to be:

./glmark2 -b buffer:columns=200:interleave=false:update-dispersion=0.99:update-fraction=0.5:update-method=subdata:buffer-usage=static --run-forever

Changing some parameters in that command line still triggers the assertion,
but it usually takes a bit longer. The command above triggers it for me
within 2.5 seconds (usually less), while for example the following command

./glmark2 -b buffer:columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map --run-forever

can take as long as a minute to trigger it.
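
Since the time to failure varies between runs, it can help to wrap the reproducer in a small script that enforces a time limit and reports whether the assertion appeared. The following is only a sketch: run_until_assert is a hypothetical helper (not part of glmark2, Mesa or libdrm), and it assumes timeout(1) from GNU coreutils is available.

```shell
#!/bin/sh
# Hypothetical helper, not part of glmark2, Mesa or libdrm.
# Runs a command for at most LIMIT seconds and reports whether the
# bkref assertion message appeared in its combined output.
# Usage: run_until_assert LIMIT COMMAND [ARGS...]
# Returns 1 if the assertion was seen, 0 if not, 2 on setup failure.
run_until_assert() {
    limit=$1
    shift
    log=$(mktemp) || return 2
    start=$(date +%s)
    # Assumes GNU coreutils timeout(1). The exit status is ignored on
    # purpose: an abort() and a timeout both end with a non-zero status.
    timeout "$limit" "$@" >"$log" 2>&1 || true
    end=$(date +%s)
    # -F matches the message as a fixed string, so the backtick and
    # the quote character are taken literally.
    if grep -qF "Assertion \`bkref' failed" "$log"; then
        echo "assertion hit after $((end - start))s"
        rm -f "$log"
        return 1
    fi
    echo "no assertion within ${limit}s"
    rm -f "$log"
    return 0
}

# Example (commented out so that sourcing this file has no side effects):
# run_until_assert 120 ./glmark2 -b buffer:columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map --run-forever
```

The reported time has one-second resolution and includes process startup, so it is only a rough indication of how quickly a given parameter set hits the assertion.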

I've used glmark2 at commit f413c5b423250b4fde8f95639ad368d5b02c5b9b.

Software versions:
xf86-video-nouveau-1.0.13
mesa-13.0.3
libdrm-2.4.74
xorg-server-1.19.0

Hardware: nVidia Quadro FX 1500

Backtrace attached.
Comment 1 Michele Ballabio 2017-01-10 22:05:45 UTC
Created attachment 128876
glmark2 apitrace

This is an apitrace of glmark2 hitting the bkref assertion. I've used apitrace at commit 849af14927410b5cd8a9597a39bd7f636ae54644.
Comment 2 Ilia Mirkin 2017-01-10 22:20:15 UTC
Does running the apitrace in question cause the assert to happen?
Comment 3 Michele Ballabio 2017-01-10 22:44:56 UTC
(In reply to Ilia Mirkin from comment #2)
> Does running the apitrace in question cause the assert to happen?

Yes it does:

apitrace replay apitrace.out

[...]

glretrace: pushbuf.c:238: pushbuf_krel: Assertion `bkref' failed.
apitrace: warning: caught signal 6
16922: error: caught an unhandled exception
glretrace+0x2b391d
glretrace+0x2b3116
glretrace+0x2b132c
linux-gate.so.1+0xb3b
/lib/libc.so.6: gsignal+0xbd
/lib/libc.so.6: abort+0x16c
/lib/libc.so.6+0x27eb4
/lib/libc.so.6: __assert_fail+0x57
/usr/lib/libdrm_nouveau.so.2: pushbuf_krel+0x3ce3: /tmp/libdrm-2.4.74/nouveau/pushbuf.c:238
/usr/lib/libdrm_nouveau.so.2: nouveau_pushbuf_reloc+0x44: /tmp/libdrm-2.4.74/nouveau/pushbuf.c:746
/usr/lib/xorg/modules/dri/nouveau_dri.so: PUSH_RELOC+0x541f38: ./nv30/nv30_winsys.h:26
/usr/lib/xorg/modules/dri/nouveau_dri.so: nv30_transfer_copy_data+0x544701: nv30/nv30_transfer.c:740
/usr/lib/xorg/modules/dri/nouveau_dri.so: nouveau_transfer_write+0x528d0c: /tmp/mesa-13.0.3/src/gallium/drivers/nouveau/nouveau_buffer.c:211
/usr/lib/xorg/modules/dri/nouveau_dri.so: nouveau_buffer_transfer_unmap+0x52976e: /tmp/mesa-13.0.3/src/gallium/drivers/nouveau/nouveau_buffer.c:544
/usr/lib/xorg/modules/dri/nouveau_dri.so: u_transfer_unmap_vtbl+0x4526ad: util/u_transfer.c:154
/usr/lib/xorg/modules/dri/nouveau_dri.so: pipe_transfer_unmap+0x452321: ./util/u_inlines.h:457
/usr/lib/xorg/modules/dri/nouveau_dri.so: u_default_buffer_subdata+0x4523eb: util/u_transfer.c:35
/usr/lib/xorg/modules/dri/nouveau_dri.so: pipe_buffer_write+0x235686: ../../src/gallium/auxiliary/util/u_inlines.h:345
/usr/lib/xorg/modules/dri/nouveau_dri.so: st_bufferobj_subdata+0x235836: state_tracker/st_cb_bufferobjects.c:132
/usr/lib/xorg/modules/dri/nouveau_dri.so: _mesa_buffer_sub_data+0x5ff8d: main/bufferobj.c:1804
/usr/lib/xorg/modules/dri/nouveau_dri.so: _mesa_BufferSubData+0x60015: main/bufferobj.c:1818
glretrace+0x142273
glretrace+0x93096
glretrace+0x84afc
glretrace+0x86cfc
glretrace+0x86c6f
glretrace+0x84e46
glretrace+0x85013
glretrace+0x85a7f
/lib/libc.so.6: __libc_start_main+0x106
glretrace+0x81ba0: ../sysdeps/i386/start.S:115
apitrace: info: taking default action for signal 6
Comment 4 Ilia Mirkin 2017-01-10 22:46:41 UTC
Cool thanks! I think a similar issue has been reported before, and I stared at that code with little success. Having a reproducible issue that I can poke at should definitely make success (in at least understanding, if not fixing) much more likely.
Comment 5 Ilia Mirkin 2017-01-11 01:58:15 UTC
While I was not able to hit the issue by replaying your trace, I grabbed glmark2, and with

MESA_GL_VERSION_OVERRIDE=2.0 ./build/src/glmark2 -b buffer:columns=200:interleave=false:update-dispersion=0.99:update-fraction=0.5:update-method=subdata:buffer-usage=static

(built with the x11-gl target) it indeed fails on my NV34. Will start digging.
Comment 6 Ilia Mirkin 2017-01-11 03:12:00 UTC
I have been unable to trigger the issue locally with the following mesa patch:

https://patchwork.freedesktop.org/patch/132414/

Please try to see if this also resolves your problems.
Comment 7 Michele Ballabio 2017-01-11 14:35:33 UTC
I've applied your patch on top of mesa 13.0.3 (there was a conflict in
nvc0_query_hw.c, but it doesn't matter with my hardware).

I've been running
gdb --args ./glmark2 -b buffer:columns=200:interleave=false:update-dispersion=0.99:update-fraction=0.5:update-method=subdata:buffer-usage=static --run-forever
for 30 minutes now, so I guess this issue is fixed.

Thank you, Ilia.

Changing status to resolved-fixed.
