Description
Tobias Diedrich
2011-03-06 16:14:19 UTC
Created attachment 44394 [details]
Compiled gpu state and config details

May be moot because of the revert_6bda10d152735c22baf1dcd92937420b4b0a359a_fence_pipelining_disable patch. I had this applied because of the bisect result in https://bugs.freedesktop.org/show_bug.cgi?id=34584 to see whether it helps in my case here.

The other i915 patches I have applied in this dump are:

# i915_invalid_unfenced_alignment
Index: linux-2.6.38-rc7/drivers/gpu/drm/i915/i915_gem.c
===================================================================
--- linux-2.6.38-rc7.orig/drivers/gpu/drm/i915/i915_gem.c 2011-03-06 23:13:21.150307624 +0100
+++ linux-2.6.38-rc7/drivers/gpu/drm/i915/i915_gem.c 2011-03-06 23:13:53.198944328 +0100
@@ -1411,6 +1411,7 @@
 	    obj->tiling_mode == I915_TILING_NONE)
 		return 4096;
 
+	return i915_gem_get_gtt_size(obj);
 	/*
 	 * Older chips need unfenced tiled buffers to be aligned to the left
 	 * edge of an even tile row (where tile rows are counted as if the bo is

# i915_drm_debug
Index: linux-2.6.38-rc7/drivers/gpu/drm/i915/i915_drv.c
===================================================================
--- linux-2.6.38-rc7.orig/drivers/gpu/drm/i915/i915_drv.c 2011-03-07 01:31:21.787178130 +0100
+++ linux-2.6.38-rc7/drivers/gpu/drm/i915/i915_drv.c 2011-03-09 21:47:07.333890303 +0100
@@ -280,6 +280,8 @@
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
+	printk(KERN_ERR "i915_drm_freeze()\n");
+
 	drm_kms_helper_poll_disable(dev);
 
 	pci_save_state(dev->pdev);
@@ -295,6 +297,7 @@
 		drm_irq_uninstall(dev);
 	}
 
+	printk(KERN_ERR "i915_save_state()\n");
 	i915_save_state(dev);
 
 	intel_opregion_fini(dev);
@@ -309,6 +312,8 @@
 {
 	int error;
 
+	printk(KERN_ERR "i915_suspend()\n");
+
 	if (!dev || !dev->dev_private) {
 		DRM_ERROR("dev: %p\n", dev);
 		DRM_ERROR("DRM not initialized, aborting suspend.\n");
@@ -340,6 +345,8 @@
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int error = 0;
 
+	printk(KERN_ERR "i915_drm_thaw()\n");
+
 	if (drm_core_check_feature(dev, DRIVER_MODESET)) {
 		mutex_lock(&dev->struct_mutex);
 		i915_gem_restore_gtt_mappings(dev);
@@ -378,6 +385,8 @@
 {
 	int ret;
 
+	printk(KERN_ERR "i915_resume()\n");
+
 	if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
 		return 0;
 
@@ -515,7 +524,7 @@
 	}
 	dev_priv->last_gpu_reset = get_seconds();
 	if (ret) {
-		DRM_ERROR("Failed to reset chip.\n");
+		DRM_ERROR("Failed to reset chip (gen=%d, ret=%d).\n", INTEL_INFO(dev)->gen, ret);
 		mutex_unlock(&dev->struct_mutex);
 		return ret;
 	}
@@ -596,11 +605,14 @@
 	struct drm_device *drm_dev = pci_get_drvdata(pdev);
 	int error;
 
+	printk(KERN_ERR "i915_pm_suspend()\n");
+
 	if (!drm_dev || !drm_dev->dev_private) {
 		dev_err(dev, "DRM not initialized, aborting suspend.\n");
 		return -ENODEV;
 	}
 
+	printk(KERN_ERR "switch_power_state: %d (switch_off=%d)\n", drm_dev->switch_power_state, DRM_SWITCH_POWER_OFF);
 	if (drm_dev->switch_power_state == DRM_SWITCH_POWER_OFF)
 		return 0;
 
@@ -619,6 +631,8 @@
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct drm_device *drm_dev = pci_get_drvdata(pdev);
 
+	printk(KERN_ERR "i915_pm_resume()\n");
+
 	return i915_resume(drm_dev);
 }
 
@@ -627,6 +641,8 @@
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct drm_device *drm_dev = pci_get_drvdata(pdev);
 
+	printk(KERN_ERR "i915_pm_freeze()\n");
+
 	if (!drm_dev || !drm_dev->dev_private) {
 		dev_err(dev, "DRM not initialized, aborting suspend.\n");
 		return -ENODEV;
@@ -640,6 +656,8 @@
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct drm_device *drm_dev = pci_get_drvdata(pdev);
 
+	printk(KERN_ERR "i915_pm_thaw()\n");
+
 	return i915_drm_thaw(drm_dev);
 }
 

# i915_debug_test
Index: linux-2.6.38-rc7/drivers/gpu/drm/i915/i915_irq.c
===================================================================
--- linux-2.6.38-rc7.orig/drivers/gpu/drm/i915/i915_irq.c 2011-03-09 01:10:52.312479773 +0100
+++ linux-2.6.38-rc7/drivers/gpu/drm/i915/i915_irq.c 2011-03-09 01:23:34.436161397 +0100
@@ -954,6 +954,16 @@
 		}
 	}
 
+#define I810_PGETBL_CTL 0x2020
+#define I810_PGE_ERR 0x2024
+#define I965_PGETBL_CTL2 0x20C4
+	{
+		u32 pgetbl_ctl = I915_READ(I810_PGETBL_CTL);
+		printk(KERN_ERR " PGETBL_CTL: 0x%08x\n", pgetbl_ctl);
+		u32 pgetbl_ctl2 = I915_READ(I965_PGETBL_CTL2);
+		printk(KERN_ERR " PGETBL_CTL2: 0x%08x\n", pgetbl_ctl2);
+	}
+
 	if (eir & I915_ERROR_MEMORY_REFRESH) {
 		u32 pipea_stats = I915_READ(PIPEASTAT);
 		u32 pipeb_stats = I915_READ(PIPEBSTAT);

However, I'm pretty sure that either the i915_invalid_unfenced_alignment patch or the commit revert makes it crash much less often with 2.6.38-rc7/rc8. (I have currently upgraded to -rc8 without patches and it has already crashed again.)

Created attachment 44395 [details]
Compiled crashlog for basically unpatched 2.6.38-rc8
[ 2353.144057] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[ 2353.148057] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -11 (awaiting 223878 at 223869, next 223879)
[ 2354.652005] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[ 2354.660008] [drm:i915_reset] *ERROR* Failed to reset chip.
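For reading the i915_do_wait_request line above, here is a minimal sketch in C that pulls the numbers out of the message and reports how far the awaited seqno was ahead of the current one. The struct and function names (wait_report, parse_wait_report, requests_behind) are hypothetical and not part of the driver, and the reading of "awaiting X at Y, next Z" as "waiting for seqno X while the ring was at seqno Y" is an assumption based on the wording of the message. The return value -11 corresponds to -EAGAIN on Linux/x86.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields of the "i915_do_wait_request returns ..." message (hypothetical struct). */
struct wait_report {
	int ret;               /* negative errno returned by the wait, e.g. -11 */
	unsigned int awaiting; /* seqno the caller was waiting for (assumed meaning) */
	unsigned int at;       /* seqno reported as current (assumed meaning) */
	unsigned int next;     /* next seqno to be handed out (assumed meaning) */
};

/* Returns 1 if the line contains a complete wait report, 0 otherwise. */
static int parse_wait_report(const char *line, struct wait_report *out)
{
	const char *p;

	assert(line != NULL && out != NULL);

	/* Skip the timestamp and "[drm:...] *ERROR*" prefix. */
	p = strstr(line, "i915_do_wait_request returns");
	if (p == NULL)
		return 0;

	return sscanf(p,
		      "i915_do_wait_request returns %d (awaiting %u at %u, next %u)",
		      &out->ret, &out->awaiting, &out->at, &out->next) == 4;
}

/*
 * Distance between the awaited seqno and the current one.
 * Unsigned subtraction keeps the result correct across a counter wrap.
 */
static unsigned int requests_behind(const struct wait_report *r)
{
	return r->awaiting - r->at;
}
```

Applied to the line quoted above, this gives ret = -11 and a distance of 9 (223878 - 223869), which under the stated assumption means the awaited request was nine seqnos ahead of the ring when the hang was declared.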
Created attachment 44416 [details]
Gpu hang while browsing in chrome, X was stuck at 100% cpu.
Created attachment 44417 [details]
Another one while I was on this very bugtracker.
BTW, these include the i915_error_state.
Created attachment 44418 [details]
Again, browsing in chrome
I'm going to try running with 6bda10d152735c22baf1dcd92937420b4b0a359a reverted again.
Created attachment 44560 [details]
20110315222921 gpu hang while browsing
Created attachment 44561 [details]
20110316001333 gpu hang while browsing
Created attachment 44562 [details]
20110316213815 crash while browsing in chrome
Created attachment 44563 [details]
20110317230553 gpu hang while browsing
Created attachment 44564 [details]
20110317233448 gpu hang while browsing, X using 100% cpu
X was stuck and Ctrl+Alt+Fn didn't work; I needed to log in remotely to dump state and reboot.
20110317233448.i915gm.crashlog.tar.gz:
20110317233448.i915gm.crashlog/config.gz
20110317233448.i915gm.crashlog/quilt_top.txt
20110317233448.i915gm.crashlog/patches/
20110317233448.i915gm.crashlog/patches/nbd-sectsize
20110317233448.i915gm.crashlog/patches/config
20110317233448.i915gm.crashlog/patches/enable_ccache
20110317233448.i915gm.crashlog/patches/series
20110317233448.i915gm.crashlog/i915_error_state.txt
20110317233448.i915gm.crashlog/dmesg_boot.txt
20110317233448.i915gm.crashlog/dmidecode.txt
20110317233448.i915gm.crashlog/lspci_nxxx.txt
20110317233448.i915gm.crashlog/lspci.txt
20110317233448.i915gm.crashlog/dmesg_now.txt
20110317233448.i915gm.crashlog/xorg.conf
20110317233448.i915gm.crashlog/Xorg.0.log
20110317233448.i915gm.crashlog/intel_reg_dumper.txt
20110317233448.i915gm.crashlog/intel_gpu_dump.txt
20110317233448.i915gm.crashlog/dpkg.txt
20110317233448.i915gm.crashlog/uname.txt
20110317233448.i915gm.crashlog/intel_stepping.txt
Created attachment 44571 [details] [review]
Fix tiling corruption

Please try this patch.

The patch doesn't apply cleanly against any of 2.6.38-rc7, -rc8 or .38 proper. I'll try wiggling it into 2.6.38 manually.

Created attachment 44596 [details] [review]
Fix tiling corruption (v2.6.38)

This is the 2.6.38 (as opposed to linus/master) variant. Alternatively you can pull it from drm-intel-staging.

Created attachment 44597 [details] [review]
Fix tiling corruption (wiggled for v2.6.38)

You accidentally attached the same patch again. ;) Here is my wiggled version; the compile just finished, I'll try booting it now.

Created attachment 44598 [details]
20110318224655 gpu hang while browsing in chrome, X at 100% cpu
20110318224655.i915gm.crashlog.tar.gz contents:
20110318224655.i915gm.crashlog/config.gz
20110318224655.i915gm.crashlog/quilt_top.txt
20110318224655.i915gm.crashlog/patches/
20110318224655.i915gm.crashlog/patches/i915_fix_tiling_corruption_from_pipelined_fencing.txt
20110318224655.i915gm.crashlog/patches/nbd-sectsize
20110318224655.i915gm.crashlog/patches/config
20110318224655.i915gm.crashlog/patches/enable_ccache
20110318224655.i915gm.crashlog/patches/series
20110318224655.i915gm.crashlog/i915_error_state.txt
20110318224655.i915gm.crashlog/dmesg_boot.txt
20110318224655.i915gm.crashlog/dmidecode.txt
20110318224655.i915gm.crashlog/lspci_nxxx.txt
20110318224655.i915gm.crashlog/lspci.txt
20110318224655.i915gm.crashlog/dmesg_now.txt
20110318224655.i915gm.crashlog/xorg.conf
20110318224655.i915gm.crashlog/Xorg.0.log
20110318224655.i915gm.crashlog/intel_reg_dumper.txt
20110318224655.i915gm.crashlog/intel_gpu_dump.txt
20110318224655.i915gm.crashlog/dpkg.txt
20110318224655.i915gm.crashlog/uname.txt
20110318224655.i915gm.crashlog/intel_stepping.txt
Hmm. Try making this change to xf86-video-intel:

diff --git a/src/intel_memory.c b/src/intel_memory.c
index 64dfd8e..e805ff1 100644
--- a/src/intel_memory.c
+++ b/src/intel_memory.c
@@ -307,5 +307,6 @@ void intel_set_gem_max_sizes(ScrnInfoPtr scrn)
 	gp.value = &value;
 	gp.param = I915_PARAM_HAS_RELAXED_FENCING;
 	ret = drmIoctl(intel->drmSubFD, DRM_IOCTL_I915_GETPARAM, &gp);
-	intel->has_relaxed_fencing = ret == 0;
+	//intel->has_relaxed_fencing = ret == 0;
+	intel->has_relaxed_fencing = 0;
 }

Created attachment 44599 [details]
20110318231206 gpu hang while browsing in chrome, X still usable, but slow.
20110318231206.i915gm.crashlog.tar.gz contents:
20110318231206.i915gm.crashlog/config.gz
20110318231206.i915gm.crashlog/quilt_top.txt
20110318231206.i915gm.crashlog/intel_gpu_gather_info.sh
20110318231206.i915gm.crashlog/patches/
20110318231206.i915gm.crashlog/patches/i915_fix_tiling_corruption_from_pipelined_fencing.txt
20110318231206.i915gm.crashlog/patches/nbd-sectsize
20110318231206.i915gm.crashlog/patches/config
20110318231206.i915gm.crashlog/patches/enable_ccache
20110318231206.i915gm.crashlog/patches/series
20110318231206.i915gm.crashlog/i915_error_state.txt
20110318231206.i915gm.crashlog/dmesg_boot.txt
20110318231206.i915gm.crashlog/dmidecode.txt
20110318231206.i915gm.crashlog/lspci_nxxx.txt
20110318231206.i915gm.crashlog/lspci.txt
20110318231206.i915gm.crashlog/dmesg_now.txt
20110318231206.i915gm.crashlog/xorg.conf
20110318231206.i915gm.crashlog/Xorg.0.log
20110318231206.i915gm.crashlog/intel_reg_dumper.txt
20110318231206.i915gm.crashlog/intel_gpu_dump.txt
20110318231206.i915gm.crashlog/dpkg.txt
20110318231206.i915gm.crashlog/uname.txt
20110318231206.i915gm.crashlog/intel_stepping.txt
Rebuilt xserver-xorg-video-intel with this change and verified it's active:
[ 422.050] (WW) intel(0): has_relaxed_fencing is 1, forcing to 0.
Created attachment 44619 [details]
20110319194019 gpu hang while browsing in chrome.
Took longer to trigger this time.
Created attachment 44642 [details]
20110320225050 Kicking stuck wait on render ring while using MPlayer, driver recovered.
20110320225050.i915gm.crashlog.tar.gz contents:
20110320225050.i915gm.crashlog/config.gz
20110320225050.i915gm.crashlog/quilt_top.txt
20110320225050.i915gm.crashlog/intel_gpu_gather_info.sh
20110320225050.i915gm.crashlog/patches/
20110320225050.i915gm.crashlog/patches/i915_fix_tiling_corruption_from_pipelined_fencing.txt
20110320225050.i915gm.crashlog/patches/nbd-sectsize
20110320225050.i915gm.crashlog/patches/config
20110320225050.i915gm.crashlog/patches/enable_ccache
20110320225050.i915gm.crashlog/patches/series
20110320225050.i915gm.crashlog/i915_error_state.txt
20110320225050.i915gm.crashlog/dmesg_boot.txt
20110320225050.i915gm.crashlog/dmidecode.txt
20110320225050.i915gm.crashlog/lspci_nxxx.txt
20110320225050.i915gm.crashlog/lspci.txt
20110320225050.i915gm.crashlog/dmesg_now.txt
20110320225050.i915gm.crashlog/xorg.conf
20110320225050.i915gm.crashlog/Xorg.0.log
20110320225050.i915gm.crashlog/intel_reg_dumper.txt
20110320225050.i915gm.crashlog/intel_gpu_dump.txt
20110320225050.i915gm.crashlog/dpkg.txt
20110320225050.i915gm.crashlog/uname.txt
20110320225050.i915gm.crashlog/intel_stepping.txt
Actually, I just found that it didn't recover: video playback always triggered the "Kicking stuck wait on render ring" again after this until I did a suspend-to-memory/resume cycle. Note that I'm explicitly using the video overlay (XvPreferOverlay==1 and vo=xv:port=86); last time I tried the textured video adapter it was still inferior, as it regularly had frame swapping/tearing issues.
Created attachment 44732 [details]
Kicking stuck wait on render ring while playing video in MPlayer
Created attachment 48063 [details] [review]
Fix unfenced alignment on pre-g33

I think this is the final nail in the coffin...

commit e28f87116503f796aba4fb27d81e2c3d81966174
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Jul 18 13:11:49 2011 -0700

    drm/i915: Fix unfenced alignment on pre-G33 hardware

    Align unfenced buffers on older hardware to the power-of-two object
    size. The docs suggest that it should be possible to align only to a
    power-of-two tile height, but using the already computed fence size is
    easier and always correct. We also have to make sure that we unbind
    misaligned buffers upon tiling changes.

    In order to prevent a repetition of this bug, we change the interface to
    the alignment computation routines to force the caller to provide the
    requested alignment and size of the GTT binding rather than assume the
    current values on the object.

    Reported-and-tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36326
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: stable@kernel.org
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Keith Packard <keithp@keithp.com>
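To illustrate what "align unfenced buffers to the power-of-two object size" means in practice, here is a minimal standalone sketch in C. It is not the kernel code: the function names and the is_g33 flag are hypothetical, and the minimum fence sizes (1 MiB on gen3, 512 KiB on gen2) are an assumption about how the driver computes the fence region on pre-gen4 hardware. It is the same idea as the experimental i915_invalid_unfenced_alignment patch earlier in this report, which returned i915_gem_get_gtt_size(obj) as the alignment.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Smallest power-of-two fence region that covers `size` bytes.
 * Assumption: fence regions start at 1 MiB on gen3 and 512 KiB on gen2.
 */
static uint32_t fence_size_pre_gen4(int gen, uint32_t size)
{
	uint32_t fence = (gen == 3) ? 1024u * 1024u : 512u * 1024u;

	assert(gen == 2 || gen == 3);
	assert(size <= (1u << 31)); /* keeps the doubling loop from overflowing */

	while (fence < size)
		fence <<= 1;
	return fence;
}

/*
 * Alignment for a buffer that is bound without a fence register.
 * Untiled buffers, gen4+ and G33 only need page alignment; older chips
 * get the power-of-two fence size, as the commit message describes.
 */
static uint32_t unfenced_alignment(int gen, int is_g33, uint32_t size, int tiled)
{
	if (gen >= 4 || is_g33 || !tiled)
		return 4096;
	return fence_size_pre_gen4(gen, size);
}
```

Under these assumptions, a 3 MiB tiled buffer on a gen3 chip that is not a G33 would be aligned to 4 MiB, while the same buffer untiled, or on gen4 and later, would only need 4096-byte alignment.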