Bug 5341

Summary: System locks up when starting X w/ DRI on a Radeon RV370 (X300)
Product: DRI
Reporter: Bernhard Rosenkraenzer <bero>
Component: DRM/other
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: high
CC: airlied, benh, bjencks, glisse
Version: XOrg git
Hardware: x86 (IA32)
OS: Linux (All)
Whiteboard:
Attachments (description, flags):
drm debug messages up to the point of the crash (flags: none)
drm debug messages with gart_info.bus_addr (flags: none)
new full debug messages from FreeBSD from serial console (flags: none)
Fix current DRM on FreeBSD (flags: none)

Description Bernhard Rosenkraenzer 2005-12-14 21:52:40 UTC
Using DRM modules from DRM CVS and Xorg and Mesa from CVS (all taken as of  
yesterday), the system locks up when starting X if DRI is enabled. 
 
The same DRM/Xorg/Mesa combination works perfectly on a Radeon Mobility 9600 
M10. 
 
PCI ID of graphics card triggering the lockup: 1002:5b60, Subsystem 174b:0500 
The same card with the same Xorg/Mesa works nicely (but without 3D) if I move 
the radeon.ko kernel module out of the way.
Comment 1 Aapo Tahkola 2005-12-15 02:13:04 UTC
Does changing option GARTSize to 16, 32 or 64 affect anything?
Also check that EnablePageFlip isn't true...
Comment 2 Bernhard Rosenkraenzer 2005-12-15 03:06:21 UTC
Same effect with GARTSize 16, 32 and 64 and an explicit Option 
"EnablePageFlip" "false". 
 
Last couple of lines from 
 
mount -o remount,sync / 
Xorg -verbose 9 &>X.log 
 
: 
 
(II) RADEON(0): [DRI] installation complete 
(II) RADEON(0): [drm] Added 32 65536 byte vertex/indirect buffers 
(II) RADEON(0): [drm] Mapped 32 vertex/indirect buffers 
(II) RADEON(0): [drm] dma control initialized, using IRQ 10 
(II) RADEON(0): [drm] Initialized kernel GART heap manager, 13369344 
(II) RADEON(0): Direct rendering enabled 
[HANGS] 
Comment 3 Benjamin Herrenschmidt 2005-12-15 10:58:48 UTC
Does it work if you edit radeon_driver.c, function RADEONSetFBLocation() and
comment out those 2 lines:

    OUTREG(RADEON_MC_FB_LOCATION, mc_fb_location);
    OUTREG(RADEON_MC_AGP_LOCATION, mc_agp_location);

The code that "calculates" those values is totally bogus imho and may conflict
with what the DRM is doing. I'll try to come up with a proper patch later, but
it would be interesting if that is the cause of your problem.
Comment 4 Bernhard Rosenkraenzer 2005-12-15 22:33:41 UTC
Removing those 2 lines doesn't change anything.  
  
The same problem also occurs on a different brand X300 (PCI ID 1002:5b60, 
Subsystem 196d:1070). Both machines showing this problem here are Athlon64 
boxes running in 32bit mode. 
Comment 5 Markus Niemistö 2005-12-15 23:18:22 UTC
I guess your card is a PCI express card? I have exactly the same problem with
PCI express X600 (5b62), but I am running 64-bit version of Linux on Athlon64.
Comment 6 Bernhard Rosenkraenzer 2005-12-16 00:16:42 UTC
Yes, all X300/X600/... cards are PCI Express. 
 
The driver works perfectly on the AGP cards (at least as far as the 9600 is 
concerned). 
Comment 7 Michel Dänzer 2005-12-16 01:12:36 UTC
FWIW, no problems here with an X550 (1002:5b60 / 174b:1490) with 64-bit
X11R6.9RC2 and the rest from CVS.
Comment 8 Gernot Pansy 2005-12-16 03:14:25 UTC
Same on Xpress 200M (RV370 based). I think the lockup only happens with cards
that don't have memory on board and so have to share system RAM.
Comment 9 Benjamin Herrenschmidt 2005-12-16 07:52:59 UTC
Ok, let's try another one. In RADEONSetFBLocation(), comment out this one:

    OUTREG (RADEON_BUS_CNTL, bus_cntl | RADEON_BUS_MASTER_DIS);
Comment 10 Markus Niemistö 2005-12-17 00:01:39 UTC
I tried commenting that line both with and without the previous two from comment
#3 and had no luck at all.

However, I found where the system crashes. On my system the crash
occurs when DRM tries to zero out the pci-gart table in function
drm_ati_pcigart_init on line 186 in ati_pcigart.c.
Comment 11 Dave Airlie 2005-12-17 10:52:01 UTC
Does your card have any onboard RAM?

I'm thinking I need to make some changes to the GART allocation for PCIE for those
types of cards.
Comment 12 Markus Niemistö 2005-12-17 19:45:34 UTC
My X600 has 128 MB DDR memory onboard.
Comment 13 Dave Airlie 2005-12-30 13:15:16 UTC
Can you attach a DRM log? To enable debug output: echo 1 > /sys/module/drm/parameters/debug

You might need a serial console to get it all, though...
Comment 14 Markus Niemistö 2006-01-01 00:48:45 UTC
Created attachment 4206
drm debug messages up to the point of the crash
Comment 15 Dave Airlie 2006-01-02 12:51:10 UTC
(In reply to comment #14)
> Created an attachment (id=4206)
> drm debug messages up to the point of the crash
> 

Can you change the DRM_DEBUG in drivers/char/drm/radeon_cp.c 
DRM_DEBUG("Setting phys_pci_gart to %p %08lX\n", dev_priv->gart_info.addr,
dev_priv->pcigart_offset);
to also print out dev_priv->gart_info.bus_addr?

I'm wondering if there is some issue there... 
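
A minimal user-space sketch of what the extended message could look like, assuming nothing beyond the format string quoted above. The struct `gart_info_sketch`, the function `format_gart_debug`, and the "bus_addr" label are hypothetical names for illustration; the real change would simply add a third argument to the existing DRM_DEBUG call, and the real field is a kernel DMA address type rather than a plain unsigned long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's gart_info; only the two fields
 * mentioned in this comment are modelled. */
struct gart_info_sketch {
    void *addr;             /* CPU-side address of the GART table */
    unsigned long bus_addr; /* bus (DMA) address of the same table */
};

/* Formats the original debug line plus the extra bus address.
 * Returns the snprintf result (characters that would be written). */
int format_gart_debug(char *buf, size_t len,
                      const struct gart_info_sketch *gart,
                      unsigned long pcigart_offset)
{
    assert(buf != NULL && gart != NULL);
    return snprintf(buf, len,
                    "Setting phys_pci_gart to %p %08lX bus_addr %08lX\n",
                    gart->addr, pcigart_offset, gart->bus_addr);
}
```

Printing the bus address next to the CPU address makes it possible to see from the log whether the value handed to the card is plausible, which is the question raised here.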
Comment 16 Markus Niemistö 2006-01-02 20:27:23 UTC
Created attachment 4214
drm debug messages with gart_info.bus_addr

(In reply to comment #15)
> Can you change the DRM_DEBUG in drivers/char/drm/radeon_cp.c 
> DRM_DEBUG("Setting phys_pci_gart to %p %08lX\n", dev_priv->gart_info.addr,
> dev_priv->pcigart_offset);
> to also print out dev_priv->gart_info.bus_addr?
> 
> I'm wondering if there is some issue there...

Here you go
Comment 17 Markus Niemistö 2006-01-28 05:05:36 UTC
I got my X600 working under Linux/i386 with Ben's latest patches. I haven't
tried it without them yet. However, DRI still doesn't work under FreeBSD/amd64.
I had my serial console attached and got quite a lot of debug data. It crashes in
radeon_cp_init_ring_buffer() with a general protection fault. I'll try to compile
a debug kernel with all the debug options enabled and find out the exact location.
Comment 18 Markus Niemistö 2006-01-28 05:08:27 UTC
Created attachment 4488
new full debug messages from FreeBSD from serial console
Comment 19 Don Wilde 2006-02-01 05:41:30 UTC
I'm afraid Ben's three patches did _not_ help my FreeBSD-6.0-STABLE (updated 
1/30/06) system with an X300 PCIe Dell dual-head. I've also tried killing the 
HyperThreading, the second head, etc., to no avail. This, on the standard X.org 
6.9 as installed through ports. I remade xorg-libraries after patching the 
radeon_driver.c file. 
 
(--) PCI:*(1:0:0) ATI Technologies Inc RV370 5B60 [Radeon X300 (PCIE)] rev 0, 
Mem @ 0xd0000000/27, 0xdfde0000/16, I/O @ 0xdc00/8, BIOS @ 0xdfe00000/17 
(--) PCI: (1:0:1) ATI Technologies Inc RV370 [Radeon X300SE] rev 0, Mem @ 0xdfdf0000/16
 
The X log does not get written when it locks up. What else can I post that 
would help? 
Comment 20 Markus Niemistö 2006-02-01 06:01:15 UTC
Created attachment 4523
Fix current DRM on FreeBSD

The virtual field of struct drm_sg_mem_t was not initialized in drm_sg_alloc
which caused crashes. This patch addresses this issue and also makes current
DRM compile on FreeBSD.
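
The class of bug described here can be illustrated with a simplified user-space sketch. This is not the attached patch: `sg_mem_sketch`, `sg_alloc_sketch`, and `sg_free_sketch` are hypothetical names, `virt` stands in for the `virtual` field, and `calloc` stands in for the kernel allocator. The point is only that the allocator must set the pointer field before returning, because later code dereferences it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the scatter/gather record. */
struct sg_mem_sketch {
    void  *virt;   /* stands in for the "virtual" field */
    size_t pages;  /* number of pages backing the buffer */
};

/* Allocates the record and its backing buffer. The record is zeroed and
 * virt is set explicitly, so a caller never sees a garbage pointer. */
struct sg_mem_sketch *sg_alloc_sketch(size_t pages, size_t page_size)
{
    struct sg_mem_sketch *entry;

    assert(pages > 0 && page_size > 0);

    entry = calloc(1, sizeof(*entry));
    if (entry == NULL)
        return NULL;

    entry->virt = calloc(pages, page_size);
    if (entry->virt == NULL) {
        free(entry);
        return NULL;
    }
    entry->pages = pages;
    return entry;
}

/* Releases the buffer and the record; accepts NULL. */
void sg_free_sketch(struct sg_mem_sketch *entry)
{
    if (entry == NULL)
        return;
    free(entry->virt);
    free(entry);
}
```

If the assignment to `virt` were missing and the record were not zeroed, any later write through that pointer (such as clearing a table stored in the buffer) would touch an arbitrary address, which matches the kind of crash reported earlier in this thread.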
Comment 21 Markus Niemistö 2006-02-01 06:09:57 UTC
(In reply to comment #19)
> I'm afraid Ben's three patches did _not_ help my FreeBSD-6.0-STABLE (updated 
> 1/30/06) system with an X300 PCIe Dell dual-head. I've also tried killing the 
> HyperThreading, the second head, etc., to no avail. This, on the standard X.org 
> 6.9 as installed through ports. I remade xorg-libraries after patching the 
> radeon_driver.c file. 

Try the patch I attached with CVS versions of DRM and xorg. I also needed to
apply two of Ben's patches (radeon-memmap-7.0-2.diff and
radeon-memmap-drm-3.diff). The patch fixes an issue that only exists with
non-agp radeon cards.
  
> The X log does not get written when it locks up. What else can I post that 
> would help? 

I was only able to get the vital debugging information via a serial console.
Just compile the kernel with DDB enabled and attach another computer with a
null-modem cable. There is more help on this subject in the FreeBSD Developers'
Handbook. Also, don't forget to set sysctl hw.dri.0.debug to 1 to see the DRM
debug messages.
Comment 22 Don Wilde 2006-02-01 07:04:16 UTC
(In reply to Comment #21) I'm afraid my g-world firewall is blocking my direct 
CVS access. I'll have to set up a redirect through my outside server as I do 
for CVSup. Thanks for the rapid response, Markus! I understand what I need to 
do and will do it.  
 
Just one q: Where will I find Ben's memmap patches? 
Comment 23 Markus Niemistö 2006-02-01 18:07:06 UTC
(In reply to comment #22)
> Just one q: Where will I find Ben's memmap patches? 
Sorry, I thought I had mentioned this. You can find the patches at
http://gate.crashing.org/~benh/

Comment 24 Ross Vandegrift 2006-02-03 05:31:50 UTC
(In reply to comment #8)
> same on Xpress 200M (RV370 based). i think the lock only happens with cards 
> that didn't have memory on board  and so have to share the ram. 

I think I may be seeing this same issue.  I'm running Xorg/Mesa/dri/drm from CVS
today, plus Ben's Radeon memmap-3 patch.  Everything works well if I don't
enable DRI.

In order to get the radeon DRM module to recognize my chip, I had to add this to
the drm_pciids.txt file:
0x1002 0x5955 CHIP_RV350|CHIP_IS_IGP "ATI Radeon XPRESS 200M"

From searching the net, RV350 seems to be the most appropriate choice
(CHIP_RV370 doesn't appear to exist), but I could be wrong.

X starts to a blank screen and freezes.  Can't switch VTs, NumLock/CapsLock
don't work, and I can't kill X.

However, the box isn't locked hard.  ssh sessions to the laptop still work as
normal and I can reboot.  chvt hangs if I try to switch back to a text console.

I enabled DRM_DEBUG as described in the thread, and my dmesg gets spammed with this:

[drm:radeon_do_cp_idle] 
[drm:drm_ioctl] ret = fffffff0
[drm:drm_ioctl] pid=5587, cmd=0x6444, nr=0x44, dev 0xe200, auth=1
[drm:radeon_cp_idle] 
[drm:radeon_do_cp_idle] 

There was some init stuff at the beginning, but it scrolled out of the buffer
very quickly.
Comment 25 Dave Airlie 2006-09-21 20:50:33 UTC
No idea what the state of this bug is, modulo the addition of a comment about
something completely unrelated.

Author, please reopen if you still have problems with the latest code.
Comment 26 Benjamin Close 2008-01-11 02:36:41 UTC
Bugzilla Upgrade Mass Bug Change

NEEDSINFO state was removed in Bugzilla 3.x, reopening any bugs previously listed as NEEDSINFO.

  - benjsc
    fd.o Wrangler
Comment 27 Alex Deucher 2008-01-11 12:10:14 UTC
Closing. Please reopen if you are still having problems with a more recent version of the driver.
