After switching from 6.8.2 to 6.99.99.902 on an x86_64 machine, every attempt to start the X server ends with:

Fatal server error: Caught signal 11. Server aborting

and I am reduced to using only the vesa driver, which still works. This is a regression from the 6.8.2 radeon driver, which worked on the same hardware without issues. Comparing old logs from a working setup with the current logs, the crash seems to happen at the point where monitor detection should take place. A full sample log and a config file (generated mostly by system-config-display from Fedora rawhide) are attached.

This bug was first reported as https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=173439
Created attachment 3874 [details] a sample log file from an attempt to start X server with radeon driver
Created attachment 3875 [details] a configuration file used

I really do not know what "(Secondary)" in 'BoardName' means. This was generated. One would think that it should not matter.
Does Option "NoDDC" in the radeon Device section work around the problem? It would be great if you could run the server inside gdb and get a backtrace. Beware that you can't do this from the same machine.
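For reference, a minimal sketch of where the option suggested above would go in xorg.conf. The Identifier value here is a hypothetical placeholder, not taken from the attached config; only the Driver and Option lines come from this report.

```
Section "Device"
    Identifier "Videocard0"    # hypothetical name; keep the one already in your config
    Driver     "radeon"
    Option     "NoDDC"         # skip DDC monitor probing
EndSection
```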
> Does Option "NoDDC" in the radeon Device section work around the problem?

It definitely changes what happens, although the results are still problematic. Visually the whole screen goes blank, the keyboard is dead, and after a remote login I find an unkillable X server process pegged at around 99% CPU. The only way to restore video and sanity is 'shutdown -r now'.

I attach below a sample log from a run with the NoDDC option and also, for comparison, a log from a working setup with 6.8.2. They do not seem to be drastically different, but the results are not the same.

OTOH when NoDDC allows me to get far enough I see:

(WW) RADEON: No matching Device section for instance (BusID PCI:1:0:1) found

and that is even after I added an explicit BusID in a Device section (either PCI:1:0:1 or PCI:1:0:0), or even with another Device section for "Videocard1" and a matching Screen section. The layout of the video hardware on the PCI bus happens to be:

-[0000:00]-+-00.0  VIA Technologies, Inc. VT8385 [K8T800 AGP] Host Bridge
           +-01.0-[0000:01]--+-00.0  ATI Technologies Inc R300 AD [Radeon 9500 Pro]
           |                 \-00.1  ATI Technologies Inc R300 AD [Radeon 9500 Pro] (Secondary)

(plus a number of other devices). The card does have a digital output, but I have only a small analog monitor hooked up to it.

> It would be great if you could run the server inside gdb and get a backtrace.
> Beware that you can't do this from the same machine.

I am not really sure how to do something of that sort from another machine. Does it have to be x86_64 too?
Created attachment 3889 [details] X server log after NoDDC option was added
Created attachment 3890 [details] log from a working setup with Version 6.8.2
(In reply to comment #4)
> > Does Option "NoDDC" in the radeon Device section work around the problem?
>
> It definitely changes what happens, although the results are still problematic.
> Visually the whole screen goes blank, the keyboard is dead, and after a remote
> login I find an unkillable X server process pegged at around 99% CPU.

Could be DRI related, try disabling it.

> OTOH when NoDDC allows me to get far enough I see:
>
> (WW) RADEON: No matching Device section for instance (BusID PCI:1:0:1) found

You can ignore this, the secondary function is just there for multihead to work properly with some versions of Windows.

> > It would be great if you could run the server inside gdb and get a backtrace.
> > Beware that you can't do this from the same machine.
>
> I am not really sure how to do something of that sort from another machine.

The usual way is to ssh in.

> Does it have to be x86_64 too?

No, it doesn't matter what kind of machine it is.
>> I am not really sure how to do something of that sort from another machine.
> The usual way is to ssh in.

Ah, a misunderstanding. One does not run gdb on another machine; one simply must not run it from a console login when trying to start X.

OK, here is what gdb has to say when "NoDDC" is not in use:

Program received signal SIGSEGV, Segmentation fault.
xf86DoEDID_DDC2 (scrnIndex=0, pBus=0x7fef70) at xf86DDC.c:221
221         VDIF_Block =
(gdb) l
216     #ifdef DEBUG
217         if (!tmp)
218             ErrorF("Cannot interpret EDID block\n");
219         ErrorF("Sections to follow: %i\n",tmp->no_sections);
220     #endif
221         VDIF_Block =
222             VDIFRead(scrnIndex, pBus, EDID1_LEN * (tmp->no_sections + 1));
223         tmp->vdif = xf86InterpretVdif(VDIF_Block);
224
225         return tmp;
(gdb) bt
#0  xf86DoEDID_DDC2 (scrnIndex=0, pBus=0x7fef70) at xf86DDC.c:221
#1  0x00002aaaab8b409c in RADEONDisplayDDCConnected (pScrn=0x7fc350, DDCType=DDC_VGA, port=0x7f5ff0) at radeon_driver.c:1029
#2  0x00002aaaab8b540f in RADEONQueryConnectedMonitors (pScrn=0x7fc350) at radeon_driver.c:2062
#3  0x00002aaaab8c038c in RADEONPreInit (pScrn=0x7fc350, flags=Variable "flags" is not available.
) at radeon_driver.c:4871
#4  0x000000000045fce1 in InitOutput (pScreenInfo=0x6cad40, argc=1, argv=0x7fffff8cb5a8) at xf86Init.c:612
#5  0x0000000000432a88 in main (argc=1, argv=0x7fffff8cb5a8, envp=Variable "envp" is not available.
) at main.c:372

Sure enough, the line in question dereferences 'tmp', which happens to be NULL, while it is clear from the code that this is not expected. This happens on the fourth call of xf86DoEDID_DDC2, where 'EDID_block' prints as (unsigned char *) 0x7fb050 "" and 'xf86InterpretEDID()' then indeed returns NULL. On the three preceding calls 'EDID_block' is consistently NULL, so the whole function immediately returns NULL as well.

> Could be DRI related, try disabling it.

Good guess. If I add "NoDDC" and comment out the Load "dri" line in xorg.conf, then I can start the server using the radeon driver.
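The failure mode described above (a parser that may return NULL, followed by an unguarded dereference) can be illustrated with a small standalone sketch. This is not the X server code and not the actual fix: every name prefixed with mock_ or MOCK_ is hypothetical, the two-byte "block" layout is invented for the example, and the 128-byte block length is assumed to stand in for EDID1_LEN.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed block length; stands in for EDID1_LEN in this sketch. */
#define MOCK_EDID1_LEN 128

/* Hypothetical stand-in for the parsed monitor description. */
struct mock_monitor {
    int no_sections;    /* number of extension sections that follow */
};

/* Hypothetical stand-in for an EDID interpreter. It returns NULL when the
 * block is missing or starts with a zero byte, mimicking the empty block
 * seen on the failing call. The layout (byte 0 = valid flag, byte 1 =
 * section count) is invented for this example. */
static struct mock_monitor *mock_interpret_edid(const unsigned char *block)
{
    struct mock_monitor *mon;

    if (block == NULL || block[0] == 0)
        return NULL;

    mon = malloc(sizeof(*mon));
    if (mon == NULL)
        return NULL;
    mon->no_sections = block[1];
    return mon;
}

/* Guarded caller: returns the number of bytes a follow-up read would
 * request, or -1 when the block could not be interpreted. The NULL check
 * happens before any use of mon->no_sections. */
static int mock_followup_length(const unsigned char *block)
{
    struct mock_monitor *mon = mock_interpret_edid(block);
    int len;

    if (mon == NULL) {
        fprintf(stderr, "Cannot interpret EDID block\n");
        return -1;
    }

    len = MOCK_EDID1_LEN * (mon->no_sections + 1);
    free(mon);
    return len;
}
```

Calling mock_followup_length() with a NULL pointer or with a block whose first byte is zero takes the early-return path instead of crashing; a non-empty block goes on to compute the follow-up read length.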
Created attachment 3944 [details] for completeness - log from a server starting with "NoDDC" and no dri loaded

Should I open a new bug about DRI, or is this not needed?
dupe, already fixed. *** This bug has been marked as a duplicate of 4859 ***
If this is already fixed in 6.8.2, then why does it reappear in 6.99.99.902? If you know that 'xf86InterpretEDID()' is allowed to return NULL, then the fix is indeed obvious.

What about my question on a DRI ticket?
(In reply to comment #11)
> If this is already fixed in 6.8.2, then why does it reappear in 6.99.99.902?

Can you verify that CVS HEAD doesn't have the fix and reopen this bug if it doesn't?

> What about my question on a DRI ticket?

Isn't there one about that already?