Description
Émeric Maschino
2006-08-04 11:55:32 UTC
Created attachment 6456 [details]
xorg.conf file
Created attachment 6457 [details]
Xorg.0.log file
Created attachment 6458 [details]
glxinfo output
Created attachment 6459 [details]
Output of strace glxinfo revealing the failed ioctl call
We need to find out which ioctl is failing. The kernel output might give a hint if you set /sys/module/drm/parameters/debug to 1 before running glxinfo.

Created attachment 6487 [details]
Output of strace glxinfo with /sys/module/drm/parameters/debug set to 1
Thanks, but I only see strace output in there, not kernel output (as obtainable with dmesg).

Created attachment 6501 [details]
Kernel output obtained with dmesg
(In reply to comment #7)
> Thanks, but I only see strace output in there, not kernel output (as
> obtainable with dmesg).

Sorry, I didn't know what information you were looking for.

BTW, having /sys/module/drm/parameters/debug set to 1 doesn't change the kernel output produced by dmesg. Am I missing something trivial here? Does this kernel output contain the information you were expecting?

(In reply to comment #9)
> BTW, having /sys/module/drm/parameters/debug set to 1 doesn't change the
> kernel output produced by dmesg. Am I missing something trivial here? Does
> this kernel output contain the information you were expecting?

No, did you set it before running glxinfo? If so, does reading the file return 1 after setting it?

Created attachment 6508 [details]
glxinfo kernel output with /sys/module/drm/parameters/debug set to 1
I have the strong impression that the kernel log buffer isn't big enough to
hold all of the debug output that gets produced...
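For reference, here is a minimal shell sequence for what is being asked above (enable the DRM debug parameter, verify it, then reproduce and capture the kernel output). This is a sketch, assuming a sysfs-enabled kernel and root privileges:

  # as root
  echo 1 > /sys/module/drm/parameters/debug
  cat /sys/module/drm/parameters/debug   # should now print 1
  glxinfo > /dev/null                    # exercise the DRI setup path
  dmesg > glxinfo-drm-debug.log          # capture the kernel output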
(In reply to comment #10)
> No, did you set it before running glxinfo? If so, does reading the file
> return 1 after setting it?

Well, hum. I simply forgot to invoke glxinfo once the debug parameter was set to 1. Hope the output will be helpful this time.

(In reply to comment #12)
> Hope the output will be helpful this time.

Yes, thanks. It looks like the failing ioctl is a red herring - it's the setversion ioctl, which only succeeds for the X server, but failure is ignored. It looks like the real problem is that the X server thinks the device is on PCI domain 1 whereas the kernel thinks it's on domain 0. Changing bug fields to reflect this. If you could try xorg-server 1.1 from X.Org 7.1, this problem might be fixed there with some luck.

It may be possible to work around this in the X server DRI module though, I'll attach a test patch.

Created attachment 6560 [details] [review]
Possible workaround

This patch for the X server dri module might serve as a workaround. Arguably, it should really refuse to enable the DRI in this case though.

> Yes, thanks. It looks like the failing ioctl is a red herring - it's the
> setversion ioctl, which only succeeds for the X server, but failure is ignored.
> It looks like the real problem is that the X server thinks the device is on PCI
> domain 1 whereas the kernel thinks it's on domain 0. Changing bug fields to
> reflect this. If you could try xorg-server 1.1 from X.Org 7.1, this problem
> might be fixed there with some luck.
Unfortunately, this bug is still present with X.org 7.1.1 and xorg-server 1.1.1.
Well, at least this is how the packages are numbered with current Fedora Core
Rawhide. xdpyinfo and glxinfo seem to confirm these release numbers.
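To illustrate the idea behind the workaround in comment #14: this is a hedged sketch, not the actual attachment 6560 patch (which modifies the X server's dri module internals); open_drm_with_domain_fallback() is an invented name, and only drmOpen() and the "pci:" bus-ID format come from libdrm.

  #include <stdio.h>
  #include <xf86drm.h>

  /* Try to open the DRM device with the domain the X server believes
   * in; if the kernel (which uses a different domain numbering, per
   * comment #13) rejects that bus ID, retry with domain 0. */
  static int open_drm_with_domain_fallback(int domain, int bus,
                                           int dev, int func)
  {
      char busid[32];
      int fd;

      snprintf(busid, sizeof(busid), "pci:%04x:%02x:%02x.%d",
               domain, bus, dev, func);
      fd = drmOpen(NULL, busid);
      if (fd < 0 && domain != 0) {
          /* Suspected domain mismatch: fall back to domain 0. */
          snprintf(busid, sizeof(busid), "pci:%04x:%02x:%02x.%d",
                   0, bus, dev, func);
          fd = drmOpen(NULL, busid);
      }
      return fd;
  }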
(In reply to comment #14)
> Created an attachment (id=6560) [edit]
> Possible workaround
>
> This patch for the X server dri module might serve as a workaround.
> Arguably, it should really refuse to enable the DRI in this case though.

I've tried it with X.org 7.1.1 as shipped with Fedora Core Rawhide. Great job: glxinfo now reports "Direct rendering: Yes". Many thanks :-)

I don't know if this is related, but with your patch, glxinfo and glxgears report that visual 0x4b isn't supported. BTW, glxgears gives me the following output:

libGL warning: 3D driver claims to not support visual 0x4b
3452 frames in 5.0 seconds = 690.264 FPS

Two comments. First, I don't know if DRI is *really* working, since these numbers seem a bit low. Second, these two lines are the only ones that are displayed. Shortly after, the system locks hard: the screen is frozen (though I can still move the mouse) and there is heavy activity on the HDD. Unfortunately, I can't access the system remotely to diagnose what's going wrong.

Anyway, thanks for your time and consideration.

> libGL warning: 3D driver claims to not support visual 0x4b
> 3452 frames in 5.0 seconds = 690.264 FPS
>
> Two comments. First, I don't know if DRI is *really* working, since these
> numbers seem a bit low.

The first line (which is reported in another entry but harmless, BTW) wouldn't be there if it wasn't using direct rendering.

> Second, these two lines are the only ones that are displayed. Shortly
> after, the system locks hard, [...]

Make sure you're running current xf86-video-ati git, a stability fix for some R300 family cards went in there only recently.

(In reply to comment #17)
> Make sure you're running current xf86-video-ati git, a stability fix for
> some R300 family cards went in there only recently.

OK, I'll look at this. But I presume it would be preferable to open a separate bug to track stability problems.

BTW, back to the initial problem. What's the "status" of the patch you proposed? Will it be integrated into mainstream, or is it a "quick and dirty" hack that also requires deeper changes elsewhere in the code and thus won't be integrated as is?

(In reply to comment #18)
> BTW, back to the initial problem. What's the "status" of the patch you
> proposed? Will it be integrated into mainstream, or is it a "quick and
> dirty" hack that also requires deeper changes elsewhere in the code and
> thus won't be integrated as is?

As I said, it's just a workaround, and IMO the X server should really refuse to enable the DRI in the first place in this situation, as it can't assume the different PCI IDs refer to the same device. The real fix would be to make the X server's PCI domain numbering consistent with the kernel's. With some luck, the pci-rework branch will take care of this. Adding Ian Romanick to the CC list.

Can you try again without the workaround but with the DRM from current git? Commit 205c573e449b38d759273f6a51eb8c1131585ece might have an impact on this.

(In reply to comment #20)
> Can you try again without the workaround but with the DRM from current git?
> Commit 205c573e449b38d759273f6a51eb8c1131585ece might have an impact on
> this.

Thanks for the update. Tried it without the workaround, but the permission problem is still there. Do you need some kind of backtrace with the git changes applied? Just let me know.

BTW, I just noticed that current openSUSE FACTORY 10.2 Alpha4Plus X.org 7.1.0 doesn't exhibit the permission issue but states that DRI is disabled (although it's listed in the Modules section of the /etc/X11/xorg.conf file).
Current Debian Etch X.org 7.0 and Fedora Core Rawhide X.org 7.1.1 still have the permission issue.

(In reply to comment #21)
> Do you need some kind of backtrace with the git changes applied?

Just the usual X server log file and kernel output would be nice for a start.

Created attachment 7076 [details]
dmesg.log with DRM git patch 205c573e449b38d759273f6a51eb8c1131585ece applied
Created attachment 7077 [details]
Xorg.0.log file with DRM git patch 205c573e449b38d759273f6a51eb8c1131585ece applied
(In reply to comment #22)
> Just the usual X server log file and kernel output would be nice for a
> start.

Sorry for the delay. Please have a look at attachments id 7076 and 7077.

That workaround patch makes it work for me (Gentoo xorg-server 1.1.1-r1). If you need any information off me please just ask :)

(In reply to comment #26)
> That workaround patch makes it work for me (Gentoo xorg-server 1.1.1-r1).
> If you need any information off me please just ask :)

Just to clarify the situation, Tim was talking about the workaround id=6560, not about the DRM git patch 205c573e449b38d759273f6a51eb8c1131585ece. So this issue is unfortunately still present :-(

Bugzilla Upgrade Mass Bug Change
NEEDSINFO state was removed in Bugzilla 3.x, reopening any bugs previously listed as NEEDSINFO.
- benjsc
fd.o Wrangler

This bug is solved for me (w/ xorg-server 1.4.0.90) by attachment #14066 [details] [review] from bug #14326, which I think is the correct fix for this problem, so marking this bug as a dupe.

*** This bug has been marked as a duplicate of bug 14326 ***

*** Bug 15404 has been marked as a duplicate of this bug. ***

I think this was incorrectly resolved as a duplicate. The X server side should be fixed with pciaccess; now it's probably up to the DRM to use a real implementation of drm_get_pci_domain() instead of hardcoding it to 0.

Created attachment 15767 [details]
log from drm compiled without hardcoded domain
It seems you are right, as there is no crash when I replace the hardcoded domain 0 with pci_domain_nr() in drmP.h. But GL still doesn't work, as for some reason it doesn't add the required visuals.
[morgoth6@pegasos ~]$ glxinfo
name of display: :0.0
Error: couldn't find RGB GLX visual
visual x bf lv rg d st colorbuffer ax dp st accumbuffer ms cav
id dep cl sp sz l ci b ro r g b a bf th cl r g b a ns b eat
----------------------------------------------------------------------
0x21 24 tc 1 0 0 c . . 0 0 0 0 0 0 0 0 0 0 0 0 0 None
0x22 24 dc 1 0 0 c . . 0 0 0 0 0 0 0 0 0 0 0 0 0 None
0x6f 32 tc 1 0 0 c . . 0 0 0 0 0 0 0 0 0 0 0 0 0 None
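A minimal sketch of the change described in the comment above (the real patch is attachment 15774; the exact signature in the drmP.h of that era may differ, so treat this as illustrative). pci_domain_nr() and CONFIG_PCI_DOMAINS are real kernel facilities:

  #include <linux/pci.h>

  /* Derive the PCI domain from the kernel instead of hardcoding 0, so
   * the DRM bus ID matches what the X server sees on multi-domain
   * machines such as ia64 boxes. */
  static inline int drm_get_pci_domain(struct pci_dev *pdev)
  {
  #ifdef CONFIG_PCI_DOMAINS
          return pci_domain_nr(pdev->bus);
  #else
          return 0;       /* previous hardcoded behaviour */
  #endif
  }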
(In reply to comment #33)
> It seems you are right, as there is no crash when I replace the hardcoded
> domain 0 with pci_domain_nr() in drmP.h.

Can you provide a patch for that? Unfortunately, this change will probably break older X servers that hardcoded the domain to 0, so a full solution may require some DRM interface versioning magic.

> Error: couldn't find RGB GLX visual

That's a different issue which should be fixed now in Mesa Git.

Sure. It was just a quick hack to see if this would fix the problem here.

Created attachment 15774 [details] [review]
quick drm domain patch

True. With the recent Mesa patch, GL works just fine here, only the number of visuals decreased a lot:

visual x bf lv rg d st colorbuffer ax dp st accumbuffer ms cav
 id dep cl sp sz l ci b ro r g b a bf th cl r g b a ns b eat
----------------------------------------------------------------------
0x21 24 tc 0 32 0 r y . 8 8 8 8 0 24 0 0 0 0 0 0 0 None
0x22 24 dc 0 32 0 r y . 8 8 8 8 0 24 0 0 0 0 0 0 0 None
0x6f 32 tc 0 32 0 r . . 8 8 8 8 0 24 0 0 0 0 0 0 0 None

Hi,

I'm currently running Debian GNU/Linux Testing "Squeeze" on my hp workstation zx6000 sporting an ATI FireGL X1 graphics adapter. I've upgraded the X Window System with the (Debian Experimental?) X.org 7.4~5, X server 1.5.99.902 (i.e. 1.6.0 RC2), Mesa 7.3.1 and open source radeon driver 6.11.0 packages available on the Debian FTP mirrors, and DRI is back again on ia64/Itanium :-)

The bad news is that it's highly unstable. I mean, you can query information with glxinfo without a problem, but don't try a GL screensaver or play with glxgears: you'll lock your system hard within seconds. I'm now trying to find out how to provide valuable debug information to the X.org/Debian developers. It would be nice to make 3D hardware acceleration on ia64/Itanium a reality again.

Thank you for the work accomplished until now.

Émeric

(In reply to comment #38)
> The bad news is that it's highly unstable. I mean, you can query
> information with glxinfo without a problem, but don't try a GL screensaver
> or play with glxgears: you'll lock your system hard within seconds.

Does it work better with different values for Option "AGPMode", or with Option "BusType" "PCI"?

(In reply to comment #39)
> Does it work better with different values for Option "AGPMode", or with
> Option "BusType" "PCI"?

Yes, indeed! Downgrading (well, performance-wise there isn't that big a difference) to AGP 2x (rather than the default AGP 4x) fixed the issue. I'm now getting ~2790 fps with glxgears.

Adding BusType "AGP" with the default AGP 4x setting makes the system more stable... for roughly a minute. I'm getting ~2800 fps with glxgears, but the system will eventually lock hard.

With the AGP Fast Writes option enabled, X can't be started at all and the system locks hard, even in AGP 2x mode.

Both XAA and EXA acceleration architectures work properly.

Many thanks to all the people involved in this great job.

Émeric

I've gone ahead and added an AGP quirk for your system: a7f465f73363fce409870f62173d518b1bc02ae6

Hello Alex,

(In reply to comment #41)
> I've gone ahead and added an AGP quirk for your system:
> a7f465f73363fce409870f62173d518b1bc02ae6

Thank you, but could you remove this AGP quirk, please? Here are the reasons.

"Downgrading" to AGP 2x or even AGP 1x only makes the problem appear later. I mean, rather than locking the system within seconds, it will take several minutes, but the system will eventually lock. Well, it doesn't really lock in fact; I was mistaken in my previous post.
I had some free time to perform tests since then. At AGP 2x/1x, simple GL applications (glxgears, GL screensavers and even Quake 2) "usually" run without a problem. I say "usually" because if you enable shadows in Quake 2 (gl_shadows variable set to 1 in config.cfg), you will experience the issue that I will now describe.

Independently of AGP speed, serious GL applications like the SPECviewperf 7.1.1 suite (which needs to be recompiled for Linux ia64) completely flood the system within seconds. It's not hard-locked as I thought initially. Indeed, I can ssh into it, and the top command reveals that the Xorg process eats all the CPU (and sometimes more, with a whopping 320% CPU utilization peak!). At this stage, I can't restart the X server locally or kill it remotely, and a reboot is welcome.

Is there something I can try to help figure out the cause of this outrageous CPU utilization?

Thanks,

Émeric
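For anyone landing here with similar symptoms: the options discussed in the last few comments go into the Device section of /etc/X11/xorg.conf. The identifier and values below are only an example reflecting what was tried above; the option names are the radeon driver's documented ones:

  Section "Device"
      Identifier "ATI FireGL X1"
      Driver     "radeon"
      Option     "AGPMode" "2"         # 1, 2 or 4; 2x delayed the lockups here
      Option     "BusType" "PCI"       # or "AGP"; forces the bus type used
      # Option   "AGPFastWrite" "off"  # fast writes hard-locked this machine
  EndSection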