Bug 17905 - [G45] -intel git freezes X after login (on gigabyte GA-EG45M-DS2H with Intrepid)
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: git
Hardware: Other All
Importance: high normal
Assignee: Wang Zhenyu
QA Contact: Xorg Project Team
Keywords: NEEDINFO
Depends on:
Blocks: intel-2.5
Reported: 2008-10-04 13:24 UTC by martin
Modified: 2008-10-19 19:50 UTC (History)


xorg log with info about overflow in ring buffer or whatever (32.88 KB, text/plain)
2008-10-04 13:24 UTC, martin
gdb bt for EXA+ModeDebug freeze (13.29 KB, text/x-log)
2008-10-10 08:15 UTC, martin
dri spam in dmesg for EXA+ModeDebug freeze on login (122.16 KB, text/x-log)
2008-10-10 08:18 UTC, martin
xorg.log for EXA+ModeDebug freeze on login (103.42 KB, text/x-log)
2008-10-10 08:19 UTC, martin
libdrm/xf86-intel GIT masters with EXA+DRI_false+ModeDebug (XORG LOG) (51.55 KB, text/x-log)
2008-10-12 04:38 UTC, martin
libdrm/xf86-intel GIT masters with EXA+DRI_false+ModeDebug (DMESG) (40.65 KB, text/x-log)
2008-10-12 04:38 UTC, martin
libdrm/xf86-intel GIT masters with EXA+DRI_false+ModeDebug (GDB BT SHOWING SEGV) (16.04 KB, text/x-log)
2008-10-12 04:39 UTC, martin

Description martin 2008-10-04 13:24:28 UTC
Created attachment 19371 [details]
xorg log with info about overflow in ring buffer or whatever

Both 2.4.1 (as packaged by Ubuntu in Intrepid) and 2.5.96 from git head flash several times (the kind of flashing seen during mode setting) and then freeze Xorg about 1 second after login (same thing with DRI disabled). If I set AccelMethod to XAA it starts and _mostly_ works fine (unless I launch, for instance, GIMP; then Xorg freezes again, 100% reproducibly, even in XAA mode).

While I've not seen it myself I've heard on IRC that some other users actually got G45 machines running ubuntu with 2.4.1 intel driver semi-working. No freeze, just strangely colored lines as reported here:

All of the people who commented on the Launchpad bug have different motherboards from mine, though. My motherboard is a Gigabyte GA-EG45M-DS2H.

In my xorg log I see what looks, to my untrained eye, like a filled-up ring buffer (the X server giving up because it waited more than 2 seconds and had nowhere to put the data that did not fit inside the ring buffer). I will attach the xorg log.
Comment 1 Gordon Jin 2008-10-04 23:49:31 UTC
Thanks for reporting. But this seems to be a duplicate of bug #17235.

*** This bug has been marked as a duplicate of bug 17235 ***
Comment 2 martin 2008-10-09 10:15:47 UTC
I can still repro this bug and because bug 17235 was closed due to "no repro" I will reopen this one. Also note that I got a very different stacktrace/xorg.log and also this bug repros _every_ time I login to X.
Comment 3 martin 2008-10-09 10:22:15 UTC
Gordon, if you need additional information please specify exactly what config you want me to run and I will try it and then attach stacks/xorg.log etc.

Currently, X still freezes using git head 2.4.97 taken from here:
git-clone git://anongit.freedesktop.org/git/xorg/driver/xf86-video-intel
For all other components I'm using what is in Ubuntu Intrepid right now (should be semi-recent stuff afaik).

If I go "NoAccel" it does not freeze. I'm currently using XAA which works if I only use terminal+firefox. If I launch gimp or openoffice then even X in XAA mode freezes (but this is most likely another bug because it's a 100% CPU spin).

PS. I'm using an LCD flat panel (SyncMaster 2253BW from Samsung) connected through DVI.
Comment 4 Gordon Jin 2008-10-09 18:47:56 UTC
OK. Let's stick with the xf86-video-intel master tip.

Zhenyu, we now have a G45-freeze reporter who is capable of using git tip. Please work with him to find the root cause. We also have Intrepid but are not able to reproduce the freeze.
Comment 5 Wang Zhenyu 2008-10-09 19:04:46 UTC
Keith has fixed this non-dri bug in upstream drm/xf86-video-intel, please test against that.
Comment 6 martin 2008-10-10 08:13:14 UTC
Okay, I just git pulled the following repo:
git-clone git://anongit.freedesktop.org/git/xorg/driver/xf86-video-intel
(at the time I got it keithp had two commits at the top and the tip was the one with description "For non-DRM, add NOOPs after BATCH_BUFFER_START to verify completion").
I did not change kernel, drm, mesa or anything from what is already in intrepid.

For xorg.conf I used AccelMethod EXA and ModeDebug true (and no other options in xorg.conf in this test run).

X.org still froze directly at login. I logged in through ssh and saw a tiny amount of CPU activity in dd, klogd and syslog. X itself was not taking any CPU.

I will attach a gdb session where I did "bt, bt full, c, CTRL-C, bt, bt full" and it shows basically that X is stuck doing ioctl like this:

#0  0x00007fe9c7325a17 in ioctl () from /lib/libc.so.6
#1  0x00007fe9c5f00c43 in drmIoctl (fd=10, request=1074029637, arg=0x7fffd14f9cf0) at xf86drm.c:183
#2  0x00007fe9c5f00ccb in drmCommandWrite (fd=10, drmCommandIndex=<value optimized out>, data=0x7fffd14f9cf0, size=18446744073709551615)
    at xf86drm.c:2343
#3  0x00007fe9c5c821a8 in I830Sync (pScrn=0x1e46a70) at i830_accel.c:214
#4  0x00007fe9c5222f6c in exaWaitSync (pScreen=0x1e78f50) at ../../exa/exa.c:1051
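
The `request=1074029637` value in frame #1 can be decoded to see which ioctl X is blocked in. The following is a minimal Python sketch, not part of the original report; the helper name `decode_ioctl` is hypothetical, and the field widths assume the generic Linux `_IOC` layout used on x86 (8-bit number, 8-bit type, 14-bit size, 2-bit direction).

```python
def decode_ioctl(request):
    """Split a Linux ioctl request number into its _IOC fields.

    Assumes the generic layout used on x86:
    bits 0-7 nr, bits 8-15 type, bits 16-29 size, bits 30-31 direction.
    """
    return {
        "nr": request & 0xFF,
        "type": chr((request >> 8) & 0xFF),
        "size": (request >> 16) & 0x3FFF,
        "dir": (request >> 30) & 0x3,  # 1 means userspace writes data to the kernel
        "hex": hex(request),
    }


fields = decode_ioctl(1074029637)  # value taken from gdb frame #1
print(fields)
# {'nr': 69, 'type': 'd', 'size': 4, 'dir': 1, 'hex': '0x40046445'}
```

The hex form 0x40046445 and the command number 69 (0x45) agree with the `cmd=0x40046445, nr=0x45` entries in the dmesg output quoted in this comment, so the gdb trace and the kernel log appear to describe the same 4-byte write-direction DRM call.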

Further, since the CPU activity seemed to suggest that there was some logging going on I looked at dmesg (will attach that as well) and it was completely filled with repeating entries like this:

[ 1249.152162] [drm:drm_ioctl] pid=12668, cmd=0x40046445, nr=0x45, dev 0xe200, auth=1
[ 1249.152164] [drm:i915_wait_irq] irq_nr=2836 breadcrumb=2800
[ 1249.172134] [drm:drm_ioctl] ret = fffffffc
[ 1249.172160] [drm:drm_ioctl] pid=12668, cmd=0x40046445, nr=0x45, dev 0xe200, auth=1
[ 1249.172163] [drm:i915_wait_irq] irq_nr=2836 breadcrumb=2800
[ 1249.192134] [drm:drm_ioctl] ret = fffffffc
[ 1249.192160] [drm:drm_ioctl] pid=12668, cmd=0x40046445, nr=0x45, dev 0xe200, auth=1
[ 1249.192163] [drm:i915_wait_irq] irq_nr=2836 breadcrumb=2800
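
To check whether the log really repeats the same wait, the excerpt above can be summarized mechanically. This is a small Python sketch under stated assumptions: the regular expressions only match the two line shapes shown above, the helper names `to_signed32` and `summarize` are hypothetical, and `ret` is assumed to be a 32-bit two's-complement value printed in hex.

```python
import re

# Lines copied from the dmesg excerpt above.
DMESG = """\
[ 1249.152164] [drm:i915_wait_irq] irq_nr=2836 breadcrumb=2800
[ 1249.172134] [drm:drm_ioctl] ret = fffffffc
[ 1249.172163] [drm:i915_wait_irq] irq_nr=2836 breadcrumb=2800
[ 1249.192134] [drm:drm_ioctl] ret = fffffffc
[ 1249.192163] [drm:i915_wait_irq] irq_nr=2836 breadcrumb=2800
"""

WAIT_RE = re.compile(r"i915_wait_irq\] irq_nr=(\d+) breadcrumb=(\d+)")
RET_RE = re.compile(r"drm_ioctl\] ret = ([0-9a-f]{8})")


def to_signed32(hex_text):
    """Interpret an 8-digit hex string as a signed 32-bit integer."""
    value = int(hex_text, 16)
    return value - (1 << 32) if value & 0x80000000 else value


def summarize(log_text):
    """Return the distinct (irq_nr, breadcrumb) pairs and distinct return codes."""
    waits = {(int(a), int(b)) for a, b in WAIT_RE.findall(log_text)}
    rets = {to_signed32(r) for r in RET_RE.findall(log_text)}
    return waits, rets


waits, rets = summarize(DMESG)
print(waits)  # {(2836, 2800)}
print(rets)   # {-4}
```

A single distinct pair means the requested sequence number (2836) stays ahead of the reported breadcrumb (2800) on every retry, and each call returns -4, which is -EINTR on Linux. One plausible reading is that the wait keeps being interrupted and re-issued while the breadcrumb never advances.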

Finally, I also saved (and will attach) xorg.log even though I couldn't see anything super interesting in there (but you might!).
Comment 7 martin 2008-10-10 08:15:12 UTC
Created attachment 19567 [details]
gdb bt for EXA+ModeDebug freeze
Comment 8 martin 2008-10-10 08:18:08 UTC
Created attachment 19568 [details]
dri spam in dmesg for EXA+ModeDebug freeze on login
Comment 9 martin 2008-10-10 08:19:44 UTC
Created attachment 19569 [details]
xorg.log for EXA+ModeDebug freeze on login
Comment 10 Wang Zhenyu 2008-10-11 08:35:03 UTC
Keith also made fixes in drm, so you have to pull drm from git. Just build libdrm and test with it for now.
Comment 11 martin 2008-10-12 04:36:47 UTC
I've deleted /lib/libdrm* and /usr/local/libdrm* and then ran "sudo ldconfig".
Then I pulled git://anongit.freedesktop.org/git/mesa/drm and from that root I did:
./autogen.sh --prefix=/usr
sudo make install

(Note: I didn't copy any .ko files since [keithp told me that] the .ko files shipping with libdrm are out of date. I also did NOT pull this tree: http://git.kernel.org/?p=linux/kernel/git/anholt/drm-intel.git;a=summary and I assume it's okay to ignore those bits since I'm using DRI=false, right?)

After building libdrm as described above I verified that there was a recently written version of libdrm.so* written into /usr/lib/libdrm.so* (seems to have worked afaik).

The libdrm "git log" at this point certainly had keithp and eric's fixes from last week plus four (mostly freebsd) fixes at the HEAD.


After that I pulled git://anongit.freedesktop.org/git/xorg/driver/xf86-video-intel
 and built it using:
./autogen.sh --prefix=/usr
sudo make install


Then I configured EXA+DRIfalse+ModeDebug and rebooted. I'm attaching xorg_log+dmesg+gdb trace showing a SEGV (even though it's on the error handling path triggered by some lockup).
Comment 12 martin 2008-10-12 04:38:20 UTC
Created attachment 19598 [details]
libdrm/xf86-intel GIT masters with EXA+DRI_false+ModeDebug (XORG LOG)
Comment 13 martin 2008-10-12 04:38:52 UTC
Created attachment 19599 [details]
libdrm/xf86-intel GIT masters with EXA+DRI_false+ModeDebug (DMESG)
Comment 14 martin 2008-10-12 04:39:35 UTC
Created attachment 19600 [details]
libdrm/xf86-intel GIT masters with EXA+DRI_false+ModeDebug (GDB BT SHOWING SEGV)
Comment 15 martin 2008-10-12 04:42:35 UTC
Again please let me know if you need more/different info (or if I didn't get libdrm installed correctly). It's a shame that the loaded libdrm version number isn't printed into xorg_log :-(
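
One way to see which libdrm file a running X server has actually loaded, even though the log does not print it, is to read the process's memory map on Linux. The sketch below is an illustration only: the helper names `mapped_libraries` and `libraries_for_pid` are hypothetical, the sample addresses and path are made up, and reading `/proc/<pid>/maps` for another user's process may require root.

```python
import os


def mapped_libraries(maps_text, needle="libdrm"):
    """Return sorted unique file paths in /proc/<pid>/maps text whose basename contains needle."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [path]
        parts = line.split(None, 5)
        if len(parts) == 6 and needle in os.path.basename(parts[5]):
            paths.add(parts[5].strip())
    return sorted(paths)


def libraries_for_pid(pid, needle="libdrm"):
    """Read /proc/<pid>/maps (Linux only) and list matching libraries."""
    with open(f"/proc/{pid}/maps") as handle:
        return mapped_libraries(handle.read(), needle)


# Illustrative input only; these addresses and the path are invented.
SAMPLE = (
    "7f0000000000-7f0000009000 r-xp 00000000 08:01 1234 /usr/lib/libdrm.so.2\n"
    "7f0000209000-7f000020a000 rw-p 00009000 08:01 1234 /usr/lib/libdrm.so.2\n"
    "7fff00000000-7fff00015000 rw-p 00000000 00:00 0 [stack]\n"
)
print(mapped_libraries(SAMPLE))  # ['/usr/lib/libdrm.so.2']
```

Calling `libraries_for_pid` with the PID of the X server and comparing the reported path and its modification time against the freshly installed file would indicate whether the new build is the one in use.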

On a positive note, using libdrm/xf86-intel GIT masters now allows me to do VT switching in XAA mode (I couldn't do that before). Xorg still freezes when I launch GIMP or OpenOffice in XAA mode (I haven't filed a bug on that because I assume XAA is lower priority).
Comment 16 Michael Fu 2008-10-13 02:33:43 UTC
Martin, what exact distribution are you testing? I tried the Ubuntu 8.10 Beta (build date is 8th Oct 2008) on a GA-EG43M-DS2H (not 5) and it works fine... Could you burn a live CD and test with it, so that we can make sure the SW environment we use to reproduce this bug is the same?

Comment 17 martin 2008-10-13 08:13:51 UTC
Good idea Michael. To test this I burned the following image to disk and booted from it:

It froze on startup just like my main configuration (which is Intrepid bleeding edge, installing updates as they come). This narrows it down to the HW difference (I have G45 and you have G43, right?)

Beyond software and hardware there are also the BIOS settings, even though I assume they are very unlikely to be the culprit. In my BIOS I currently have:

"On board VGA" set to "Enable if no ext PEG"
(and I don't have any graphics card in the machine beyond the on-board G45)

and I also got:
"On-chip Frame Buffer Size" set to "32MB+2MB for GTT".
(I've never tried to boot with any other settings; these are the factory defaults, I think).
Comment 18 Michael Fu 2008-10-14 00:44:27 UTC
Since this bug is also reproducible in the XAA case, I'm removing the EXA tag...

Martin, please check out Keith's post on intel-gfx@lists.freedesktop.org.
Comment 19 martin 2008-10-14 11:58:30 UTC
Great news: when I apply the intel-agp patch Keith posted to intel-gfx recently I can actually log in to my machine in EXA mode. The patch I applied was posted with the subject line "[Intel-gfx] G45 BIOS mis-initializes stolen GTT PTEs"
Comment 20 Eric Anholt 2008-10-14 15:54:20 UTC
This needs to be retested against current drm-intel-next kernel and master xf86-video-intel.
Comment 21 Gordon Jin 2008-10-16 06:05:07 UTC
Martin, the patch on drm-intel-next is a bit different from Keith's original one. So your retesting is important for this fix. Thanks!
Comment 22 martin 2008-10-18 10:17:53 UTC
I cloned drm-intel-next today and then I git-format-patched just this single commit:

I then applied that patch (using "patch -p1 < file.patch") onto the current ubuntu intrepid intel-agp.ko and that was sufficient for me to be able to boot into EXA.

I had no special stuff in xorg.conf except ModeDebug and I also got direct rendering, I was able to launch glxgears, gimp, frets on fire and some other random apps. So far so good.

I tried (but didn't succeed, due to a lack of understanding of how all this stuff really works) to build the full config with:
masters of xf86intel,libdrm.so,i915.ko,drm.ko,intel-agp.ko etc
I think the problem was that I tried to build the modules against my ubuntu .27 kernel and there was a bunch of GEM stuff missing in that kernel. So I got a bunch of errors like these:
/home/mnemo/src/intel_driver/drm-intel/drivers/gpu/drm/drm_gem.c: In function ‘drm_gem_init’:
/home/mnemo/src/intel_driver/drm-intel/drivers/gpu/drm/drm_gem.c:74: error: ‘struct drm_device’ has no member named ‘object_name_lock’
/home/mnemo/src/intel_driver/drm-intel/drivers/gpu/drm/drm_gem.c:75: error: ‘struct drm_device’ has no member named ‘object_name_idr’

Is it possible to build eric's tree without GEM, I mean is there a define I can use to disable it or something?

Otherwise, if I must build with GEM should I build the kernel itself with all modules and then install it into GRUB in order to test it or how is this stuff usually done?
Comment 23 Wang Zhenyu 2008-10-19 19:50:33 UTC
Thanks for testing! I think that agp patch is the only sensible one for this bug, and it clearly fixes your problem. I'm closing this one.

Yeah, gem kernel needs some other changes to build, and now it's easier as it's in upstream kernel now, so just pull linus's tree and build it.
