Created attachment 13941 [details]
2.6.24 kernel's .config
On my Asus P5E-VM desktop (Intel G35, ICH9R), switching to a virtual console
(ctrl-alt-f1) doesn't work once the machine has been suspended to RAM. In that
case, the monitor displays "no input signal" and stays black (until I
switch back to X11 via ctrl-alt-f7).
lspci -vvvxxxx output differs before and after the machine was suspended.
Many of the graphics card's registers differ (see attached lspci outputs), while
they stay consistent across reboots (and conversely, the lspci output always
looks the same after an s2r).
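For anyone comparing the two lspci dumps by hand, here is a minimal sketch (not part of the original report) that lists the devices whose config-space hex rows changed between two dumps. It assumes the usual `lspci -xxxx` layout: a device header line such as `00:02.0 VGA compatible controller: ...` followed by hex rows such as `50: 00 00 30 00 ...`, with no PCI domain prefix. The function name `changed_pci_devices` and the file names are hypothetical.

```shell
#!/bin/sh
# changed_pci_devices BEFORE AFTER
# Print the bus address of every device that has at least one
# config-space hex row differing between the two lspci dumps.
changed_pci_devices() {
    awk '
        # Device header, e.g. "00:02.0 VGA compatible controller: ..."
        /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-9a-f] / { dev = $1; next }
        # Hex dump row, e.g. "50: 00 00 30 00 ..."
        /^[0-9a-f]+: / {
            key = dev " " $1
            # First file: remember the row.
            if (FNR == NR) { before[key] = $0; next }
            # Second file: report each changed device once.
            if ((key in before) && before[key] != $0 && !(dev in seen)) {
                seen[dev] = 1
                print dev
            }
        }
    ' "$1" "$2"
}
```

Usage would be along the lines of `changed_pci_devices lspci_before.txt lspci_after.txt`; a plain `diff -u` of the two files shows the same information in more detail.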
Other symptoms I have on the machine when it has been s2r :
- If I stop X11 (ie. /etc/init.d/gdm stop), the console won't properly
restore. X11 won't restart either (ie. /etc/init.d/gdm restart from ssh).
Looks like the problem in bz #14218
- The console is completely garbled when I reboot (and the reboot seems
frozen at some point).
All of these work fine if the machine hasn't been suspended before
(switching to console works, reboot works, stopping/restarting xorg works).
The only way I found to fix the problem is to reboot (reset) the machine.
After a non working ctrl alt f1, those new lines appeared in Xorg.0.log :
(II) AIGLX: Suspending AIGLX clients for VT switch
(II) intel(0): xf86UnbindGARTMemory: unbind key 0
(II) intel(0): xf86UnbindGARTMemory: unbind key 1
(II) intel(0): xf86UnbindGARTMemory: unbind key 2
(II) intel(0): xf86UnbindGARTMemory: unbind key 3
Asus P5E-VM HDMI motherboard (Intel G35 + ICH9R based)
Belinea 101920 LCD screen (19'' 4/3, with DVI and VGA inputs)
Software versions:
ubuntu hardy heron (testing version)
kernel - 2.6.24, running 32bit
xf86-video-intel - git head from today (January 25th 2008)
xserver - 7.3 / 1.4.1 (Ubuntu provided git 20080118 snap)
mesa - 7.0.2
consolekit - 0.2.3
libdrm2 - 2.3.0
libxrandr - 1.2.2
Attached Xorg.0.log, lsmod, xorg.conf, kernel .config and :
- lspci -vvvxxxx before suspending
- lspci -vvvxxxx after suspending
Created attachment 13942 [details]
Created attachment 13943 [details]
Created attachment 13944 [details]
lspci -vvvxxxx before suspend-to-ram
Created attachment 13945 [details]
lspci -vvvxxxx after suspend-to-ram
Created attachment 13946 [details]
Created attachment 13947 [details]
Yeah, this is a known issue. The fix is to use updated (i.e. from git) DRM bits that suspend/resume VGA state in addition to graphics state. Can you give that a try and make sure it works for you?
I updated drm and libdrm to git head, it didn't fix the issue.
I only updated drm kernel modules (drm.ko and i915.ko) and
libdrm (and already have a fairly recent xf86-video-intel git
snap). Should I also update the mesa lib or xserver ?
I investigated through my distro suspend scripts to reproduce
(and test several options) manually. Worth noting:
A pure and simple "echo mem > /sys/power/state" won't work
at all. I mean, when resumed, the screen remains black even
under X (with my distro scripts, only the vt remains black
but the X session and display are properly restored). To
get the display back under xorg, I must do a "vbetool post".
I guess this is not worth a bug report, since it is handled by
the quirks in vendors'/distros' suspend-resume scripts.
Doing "vbetool vgamode set 3" during resume (my distro's scripts
try to do this), either with or without the latest git drm, outputs
"Function not supported".
Same error message when I do a "vbetool vbestate restore <
/var/lib/acpi-support/vbestate" during resume (this file,
/var/lib/acpi-support/vbestate is generated before suspend
with a "vbetool vbestate save" without error message).
I don't know if they completely fail, but those two commands are
not useful here (they don't improve the "console broken" situation,
and removing them does not prevent the xorg display from restoring properly),
even if they are executed by default by my distro scripts (that's
part of pm-utils, that handles suspend-resume on Fedora and Ubuntu).
Doing a "vbetool dpms on" at resume didn't help either.
I also tried several "sysctl -w kernel.acpi_video_flags=x"
(setting it to 0, 1, 2 and 3 before suspend) with no success.
Also tested, without success (same "broken console" problem) :
- a 2.6.24 kernel without framebuffer (with and without drm from git)
- a 2.6.22 kernel with default included drm
- XAA instead of EXA
- with a DVI attached monitor instead of VGA
- not loading (blacklisting) drm and i915 modules
So, the minimal suspend-resume script to reproduce the problem
here (and without breaking x11 after resume) is :
echo -n mem > /sys/power/state
vbetool post </dev/tty0
This may not be a suspend/resume problem per se then. We've had reports of mode setting in general being flaky on some of these types of machines, maybe the timing is just right at suspend time to hit that bug most or all of the time.
The updated DRM bits (with suspend/resume hooks) are supposed to eliminate the need for vbetool stuff in your suspend/resume scripts. If possible, it would be good if you could get intel_reg_dumper output (it's in src/reg_dumper in the xf86-video-intel tree) from before the suspend and then after the resume, possibly from a network console. Since X is switched away from before suspend, you should capture state prior to the suspend but after doing a VT switch to a text terminal. Then on resume, try to capture it again before you try to switch back to X.
Created attachment 13993 [details]
intel_reg_dumper's output just before suspending
Created attachment 13994 [details]
intel_reg_dumper's output just after suspending
Created attachment 13995 [details]
intel_reg_dumper's output final, after resumed and vbetool post
Created attachment 13996 [details]
dmesg with drm debug=1, and doing a x11, suspend, resume vbetool post cycle
I did 3 intel_reg_dumper's dumps (while running git's drm.ko and
i915.ko): one just before the suspend, one just at resume, and the
last (wasn't asked for but...) after a vbetool post. This means:
intel_reg_dumper > ~/regdump_just_before
echo -n mem > /sys/power/state
intel_reg_dumper > ~/regdump_just_after
vbetool post </dev/tty0
intel_reg_dumper > ~/regdump_final_after_post
Strangely the dump just after resume is identical to the final dump
after vbe post (although this post made enough of a difference to
restore the x11 display back).
While at it, also booted and loaded drm with debug=1. Then did a
classic gdm start, suspend, resume, vbetool post. Output in the
last attached dmesg file.
Your comment about a possible race condition reminded me that once,
when I tried resuming without the vbetool post quirk, I saw the
xorg session display resume just a tiny fraction of a second before
the screen went black for good.
I'm ccing Hong. Not sure if it's related to a weird blanking screen bug "resolved" by touching/reading all regs again...
Created attachment 14182 [details] [review]
Re-enable pipes on resume
Can you give this patch a try? It should apply to the git version of DRM and correctly re-enable your pipes (I noticed in the reg dumper output that they were disabled).
(In reply to comment #16)
> Created an attachment (id=14182) [details]
> Re-enable pipes on resume
> Can you give this patch a try? It should apply to the git version of DRM and
> correctly re-enable your pipes (I noticed in the reg dumper output that they
> were disabled).
Well done, this patch is a net improvement!
It does not fix the console brokenness after resuming, but it obsoletes the need for the "vbetool post" workaround (which was needed to get the X11 display back).
With this patch applied, I can suspend & resume with just a pure "echo -n mem > /sys/power/state" and no other quirk at all, that's impressive.
Hm, now I wonder if you're seeing 14236... can you get some pre- and post-resume register dumps now that you're running the patched DRM? I'm curious what differences there are that might account for your corruption. A screenshot or photo would also be nice.
Created attachment 14199 [details]
intel_reg_dump before suspend, running patched drm
Created attachment 14200 [details]
intel_reg_dump after resume, running patched drm
Created attachment 14201 [details]
commented dmesg logs from drm (debug=1) during switch to vt and back
Yes I've seen #14236. The main reason why I opened a new bug: I
don't have any problem after hibernation (the reporter of bug #14236
says he has a similar problem both after s2r and s2d). Suspend-to-disk
(hibernation) works here, and doesn't break VT.
For the screenshot: that would be just a boring black screen; I have
no display distortion/corruption at all; when I switch to console
after a suspend-resume cycle, the screen behaves exactly like it does
when I shutdown the computer or pull off the wire: it blanks, writes
out "No input connection" for a few seconds, and remains blank.
I looked at the differences in dmesg logs (with drm debug=1) when
switching to console, and when switching back to X11, with both a sane
system (that hasn't been suspended before), and a broken-console system.
The logs are almost totally identical; the only visible difference shows
up when I switch back to xorg: this, on a sane/working system:
[drm:drm_unlocked_ioctl] pid=5482, cmd=0x4018641b, nr=0x1b, dev 0xe200, auth=1
[drm:drm_unlocked_ioctl] ret = -22
becomes this, on a previously suspended system:
[drm:drm_unlocked_ioctl] pid=5482, cmd=0x4004644d, nr=0x4d, dev 0xe200, auth=1
[drm:drm_unlocked_ioctl] pid=5482, cmd=0x40446440, nr=0x40, dev 0xe200, auth=1
Attached the compiled and commented relevant parts of dmesg. It probably
does not matter, but just in case...
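As a reading aid for the cmd= values in those log lines, here is a small sketch that splits an ioctl number into its fields, assuming the generic Linux encoding used on x86 (bits 31:30 direction, 29:16 argument size, 15:8 type, 7:0 number). The function name `decode_ioctl` is hypothetical. Type 0x64 is the character 'd', which DRM uses as its ioctl base, and driver-specific DRM ioctls start at number 0x40, so 0x40 and 0x4d would be i915 driver calls while 0x1b is a core DRM call.

```shell
#!/bin/sh
# decode_ioctl CMD
# Split an ioctl command number into direction, size, type and nr,
# following the generic Linux _IOC layout (assumed, as on x86).
decode_ioctl() {
    cmd=$(( $1 ))
    printf 'dir=%d size=%d type=0x%02x nr=0x%02x\n' \
        $(( (cmd >> 30) & 0x3 )) \
        $(( (cmd >> 16) & 0x3fff )) \
        $(( (cmd >> 8) & 0xff )) \
        $(( cmd & 0xff ))
}

decode_ioctl 0x4018641b   # sane system
decode_ioctl 0x4004644d   # previously suspended system
decode_ioctl 0x40446440   # previously suspended system
```

A direction of 1 means userspace writes the argument to the kernel. Which specific DRM calls these numbers map to is left to the drm.h and i915_drm.h headers of the tree in use.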
The register dumps look strange, it's as if the VGA registers are being completely clobbered... are you sure you're running DRM modules from git as of today (I just checked in a couple of fixes)? Or maybe your suspend/resume scripts are doing a 'vbetool post' or similar? Doesn't look like you have any fb drivers builtin or loaded...
Stuff like this:
-(II): CR00: 0x5f
-(II): CR01: 0x4f
-(II): CR02: 0x50
-(II): CR03: 0x82
-(II): CR04: 0x55
-(II): CR05: 0x81
-(II): CR06: 0xbf
-(II): CR07: 0x1f
+(II): CR00: 0x00
+(II): CR01: 0x00
+(II): CR02: 0x00
+(II): CR03: 0x80
+(II): CR04: 0x00
+(II): CR05: 0x00
+(II): CR06: 0x00
+(II): CR07: 0x00
definitely shouldn't happen in the latest code, since it explicitly saves and restores these registers. But differences like these would definitely explain your VT corruption on resume.
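A rough sketch for pulling this kind of difference out of two intel_reg_dumper outputs automatically. It assumes every register line has the `(II): NAME: VALUE` shape shown in the excerpt above; the function name `changed_regs` is hypothetical and not part of the driver tree.

```shell
#!/bin/sh
# changed_regs BEFORE AFTER
# Print "NAME: old -> new" for each register whose value differs
# between two register dump files.
changed_regs() {
    awk '
        {
            # Drop the "(II):" log prefix, then split at the first colon.
            sub(/^\(II\):[ \t]*/, "")
            n = index($0, ":")
            if (n == 0) next
            name = substr($0, 1, n - 1)
            value = substr($0, n + 1)
            sub(/^[ \t]+/, "", value)
            # First file: remember the value.
            if (FNR == NR) { before[name] = value; next }
            # Second file: report registers that changed.
            if ((name in before) && before[name] != value)
                printf "%s: %s -> %s\n", name, before[name], value
        }
    ' "$1" "$2"
}
```

With the dumps from the earlier comments this would be called as `changed_regs /tmp/reg_dump_patched_drm_before_susp /tmp/reg_dump_patched_drm_after_susp`.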
The attached dumps were done with the drm git tip from a few hours ago
(before your last two commits, I'm at 76748efae2f51409813eeb6b91b783c73cb2845e) +
the attached patch. I didn't use vbetool (thanks to your patch).
Also I'm not 100% sure I've disabled all the necessary things to remove
framebuffer totally from kernel. So I attach my .config for verification
(that's for a vanilla 2.6.24).
I'll update my drm to latest git and repost the register dumps in a few minutes.
Created attachment 14203 [details]
2.6.24 kernel .config (if I didn't mess, framebuffer should be disabled)
Created attachment 14204 [details]
reg dump before suspend, drm git from now + patch (git 6f19473191ae543fcc199d252c5865c0734d38ad)
Created attachment 14205 [details]
reg dump after resume, drm git from now + patch (git 6f19473191ae543fcc199d252c5865c0734d38ad)
Just for the record, the two latest register dumps (attachments 14204 and 14205)
are generated with a 2.6.24 vanilla kernel (compiled with the .config in attachment 14203 [details]), and using git tip drm as of now
(6f19473191ae543fcc199d252c5865c0734d38ad) plus the patch attached to this bug.
I did exactly this to dump the registers (from an xterm) :
intel_reg_dumper > /tmp/reg_dump_patched_drm_before_susp
echo -n mem > /sys/power/state
intel_reg_dumper > /tmp/reg_dump_patched_drm_after_susp
Created attachment 14207 [details] [review]
Save/restore MGGC register
Given that the VGA registers don't seem to be restored, I wonder if VGA routing on your bridge is disabled for some reason...
Can you try out this patch?
Created attachment 14208 [details]
lspci -xxx, after recompiling drm with the two attached patches and suspend+resuming
lspci -xxx with the two patches applied on top of today drm git tip, and after suspending and resuming.
attachment 14207 [details] [review] didn't fix the problem.
intel_reg_dumper outputs (before suspend and after resume) are identical to the previously attached versions.
Hm, no, it looks like we can't really save/restore that register w/o resetting the chip altogether, since its RO status is controlled by the SMRAM reg. But the fact that pre-suspend its value is 0x0030 and post-resume it's 0x0002 makes it seem like the BIOS did something bad...
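To make the two MGGC values easier to read, here is a small sketch that decodes them. The bit layout is an assumption taken from my reading of the GMCH datasheets for this chipset family (bit 1 = IVD, "IGD VGA Disable"; bits 6:4 = GMS, the pre-allocated graphics memory select) and should be checked against the G35 datasheet. The function name `decode_mggc` is hypothetical.

```shell
#!/bin/sh
# decode_mggc VALUE
# Decode the two MGGC fields of interest.
# ASSUMED layout: bit 1 = IVD (IGD VGA Disable), bits 6:4 = GMS.
decode_mggc() {
    val=$(( $1 ))
    printf 'ivd=%d gms=%d\n' \
        $(( (val >> 1) & 0x1 )) \
        $(( (val >> 4) & 0x7 ))
}

decode_mggc 0x0030   # value seen before suspend
decode_mggc 0x0002   # value seen after resume
```

Under that assumed layout, the post-resume value has the VGA disable bit set and the memory select field cleared, which would fit the symptom of a dead text console.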
Created attachment 14209 [details] [review]
Read & write VGA regs via MMIO instead of port I/O
Ok, here's a crazy and totally untested patch. It may cause resume to just hard hang, but it may also give you a console back (I doubt it'll still work though).
I already told the results to James Barnes directly, but for
the record, and in case we or someone else would take a look
at this bug later: the above patch (attachment 14209 [details] [review]) prevents
the system from suspending (when trying to suspend, I'm left
with a perpetual blinking cursor on a black terminal).
So, being unable to suspend, I can't tell if it helps to resume.
Something else, if the motherboard BIOS may be the culprit: I
use the latest Asus provided BIOS as of now (version 0405).
There's also a BIOS "Repost video on S3 resume" option but that
seems totally ineffective for this problem (and was ineffective
to solve the need for "vbetool post on resume" before this other
bug was fixed by James Barnes with the now committed attachment 14182 [details] [review]).
Updating summary. Would be interesting to find out if other G35 users have the same problem. If they do, it might be a bug in the Intel provided BIOS bits for G35 based systems, rather than an Asus specific problem.
Also, I reported this issue to Asus, since it really looks like the MGGC GMCH register is set to the wrong value on resume, disabling VGA access entirely.
For what it's worth, this seems to be an issue for another platform as well
One reports that it's working with an older processor, but not with new. Don't know much, but I would like to attribute that to a hardware or bios bug.
Ah, interesting, thanks for the link. Sounds like it may be a BIOS issue (possibly related to a certain CPU in the board). Hope Asus finds & fixes it soon...
According to the above link, this similar bug occurs on Vista with E8400 Wolfdale CPUs. I have a Kentsfield, Intel Core 2 Quad Q6600 revision G0.
I can also confirm this bug on:
Celeron 420 (stepping 1), Asus P5E-VM HDMI (Bios rev 0301)
Using intel driver version 2.2.1-, on
Kernel 2.6.24-ARCH #1 SMP PREEMPT, x86_64
(II) Module intel: vendor="X.Org Foundation"
compiled for 18.104.22.168, module version = 2.2.1
Module class: X.Org Video Driver
ABI class: X.Org Video Driver, version 2.0
Also, after triggering the bug something freaky happened to me after playing with screen rotation.
Instead of rotating the screen back, my computer seemed to black out. No video output in X, no video in console. I could tell it switched, because my num-lock status changed.
I tried to trigger a reboot, but nothing seemed to happen. (My computer is virtually silent, and it's almost impossible to hear it work.) After around a minute of waiting, I still didn't have VGA output, so I powered it down, and did a cold boot.
STILL, VGA output was not restored. (I'm not sure it even reached POST.) I tried to press F1 (that's what I usually do, since the BIOS halts and complains that my system does not have a master IDE drive). All this to no avail.
I was unable to get any life from it until I removed the main power cable and replugged it. After that the machine booted up, but all CMOS data was gone. Clock, but not date, was also reset. I'm not too keen on trying to reproduce this, but I will report it to Asus, after I get some sleep.
The BIOS being cleared could be the result of not being able to POST a few times and loading fail-safe defaults. Me pressing F1 blindly might have approved it. Or maybe it was restored through Asus's safety net for BIOS corruption. I have no idea.
well, can I mark this as NOTOURBUG now?
Yeah, probably. I've been talking with Asus about it and they seem to have an idea of what's going on, but I don't know if they've released an update to fix the problem yet. Ben, have you checked their website recently? Do you still see this problem with the latest BIOS bits?
Asus hasn't released any new BIOS version since January; I'm already using the latest one (0405).
And yes, it seems clear now that this is a plain NOTOURBUG (esp. since we know Windows has similar problems with this motherboard).
Thanks for your help in hunting this bug (and also, thank you for fixing another bug in the process (i.e. the drm patch obsoleting the need for "vbetool post")).
New bios 0503 is supposed to fix this, I'll attempt the upgrade now and test it for linux.
Wish me luck! =)
It did not seem to help me at all. I lost vga outside X, and did not get it back until after a cold boot. :(