| Summary: | Xorg start fails with misleading log entries: Module [...] does not have a [...] data object. | ||||||
|---|---|---|---|---|---|---|---|
| Product: | xorg | Reporter: | Knut Petersen <Knut_Petersen> | ||||
| Component: | Server/DDX/Xorg/dlloader | Assignee: | Adam Jackson <ajax> | ||||
| Status: | RESOLVED INVALID | QA Contact: | Xorg Project Team <xorg-team> | ||||
| Severity: | normal | ||||||
| Priority: | medium | CC: | daniel | ||||
| Version: | git | Keywords: | patch | ||||
| Hardware: | x86 (IA32) | ||||||
| OS: | Linux (All) | ||||||
| Whiteboard: | 2011BRB_Reviewed | ||||||
| i915 platform: | i915 features: | ||||||
Hmm, it seems like you're starting with LD_BIND_NOW or RTLD_NOW enabled, which isn't supported. Is that the case?

No. set | grep LD shows none of the LD/RTLD variables.
cu,
knut

There are two places where you can ask for LD_BIND_NOW-style behaviour: in the ld.so environment (quite difficult with suid executables, actually) and at ld time itself. I suspect that if you run 'readelf -a foo.so | grep NOW' against one of your compiled modules you'll see something like:
0x00000018 (BIND_NOW)
0x6ffffffb (FLAGS_1) Flags: NOW
which means you've put '-z now' into your ldflags. Don't do that.

No. There is no -z in LDFLAGS.
No. 'readelf -a foo.so | grep NOW' does not find "NOW".
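For reference, the readelf check suggested above can be wrapped in a small helper. This is only a sketch: `has_bind_now` is a hypothetical name (it is not part of binutils or Xorg), and it assumes the GNU `readelf -d` output format quoted in this thread.

```shell
# Hypothetical helper: reads `readelf -d` output on stdin and succeeds if
# the object requests immediate binding, either through a DT_BIND_NOW entry
# or through a NOW flag in its FLAGS / FLAGS_1 entries.
has_bind_now() {
    grep -Eq '\(BIND_NOW\)|\((FLAGS|FLAGS_1)\).*NOW'
}
```

It could be used as `readelf -d /usr/lib/xorg/modules/extensions/libglx.so | has_bind_now && echo "immediate binding requested"`. A module linked without '-z now' should produce no output.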
I did not install a new glibc, binutils or something like that.
Xorg is built using the following script. I believe it is ok.
export PREFIX=/usr
export PKG_CONFIG_PATH=$PREFIX/lib/pkgconfig
export PATH=$PREFIX/bin:$PATH
export ACLOCAL="aclocal -I $PREFIX/share/aclocal"
export LD_LIBRARY_PATH=$PREFIX/lib
export PYTHONPATH=$PREFIX/lib/python2.7/site-packages
export CFLAGS="-v -O3 "
util/modular/build.sh $PREFIX --modfile modules_to_build --autoresume built-modules.txt \
--confflags "--enable-kdrive --with-dri-drivers=i915 --disable-gallium --localstatedir=/var"
A full new build after make clean, make realclean, git reset --hard does not help.
ltrace shows that dlopen is called with flags 257. That is ok.
vsnprintf("(II) Loading /usr/lib/xorg/modules/extensions/libglx.so\n", 1024, "(II) Loading %s\n", 0xbf8166f8) = 56
fwrite("(II) Loading /usr/lib/xorg/modules/extensions/libglx.so\n", 56, 1, 0x8220d28) = 1
dlopen("/usr/lib/xorg/modules/extensions/libglx.so", 257) = NULL
dlerror() = "/usr/lib/xorg/modules/extensions/libglx.so: undefined symbol: DRIGetDrawableInfo"
snprintf("(EE) Failed to load %s: %s\n", 1024, "%s%s%s", "(EE)", " ", "Failed to load %s: %s\n") = 27
clock_gettime(1, 0xbf816240, 0x79732064, 0x6c6f626d, 0x5244203a) = 0
sprintf("[ 71701.622] ", "[%10.3f] ", ...) = 13
fwrite("[ 71701.622] ", 13, 1, 0x8220d28) = 1
vsnprintf("(EE) Failed to load /usr/lib/xorg/modules/extensions/libglx.so: /usr/lib/xorg/modules/extensions/libglx.so: undefined symbol: DRIGetDrawableInfo\n", 1024, "(EE) Failed to load %s: %s\n", 0xbf8166f8) = 145
fwrite("(EE) Failed to load /usr/lib/xorg/modules/extensions/libglx.so: /usr/lib/xorg/modules/extensions/libglx.so: undefined symbol: DRIGetDrawableInfo\n", 145, 1, 0xb7537560(EE) Failed to load /usr/lib/xorg/modules/extensions/libglx.so: /usr/lib/xorg/modules/extensions/libglx.so: undefined symbol: DRIGetDrawableInfo
) = 1
fwrite("(EE) Failed to load /usr/lib/xorg/modules/extensions/libglx.so: /usr/lib/xorg/modules/extensions/libglx.so: undefined symbol: DRIGetDrawableInfo\n", 145, 1, 0x8220d28) = 1
__strdup(0x822b9b0, 0xbf8167ec, 0xbf8167e8, 0xbf81677c, 0) = 0x822b480
strchr("glx", '.') = NULL
asprintf(0xbf81678c, 0x81e8905, 0x8229c58, 0xbf81677c, 0) = 13
dlsym(NULL, "glxModuleData") = NULL
dlopen(NULL, 257) = 0xb7837900
dlsym(0xb7837900, "glxModuleData") = NULL
snprintf("(EE) LoadModule: Module %s does not have a %s data object.\n", 1024, "%s%s%s", "(EE)", " ", "LoadModule: Module %s does not have a %s data object.\n") = 59
clock_gettime(1, 0xbf816270, 1, 0x8220d28, 1) = 0
sprintf("[ 71701.647] ", "[%10.3f] ", ...) = 13
fwrite("[ 71701.647] ", 13, 1, 0x8220d28) = 1
vsnprintf("(EE) LoadModule: Module glx does not have a glxModuleData data object.\n", 1024, "(EE) LoadModule: Module %s does not have a %s data object.\n", 0xbf816728) = 71
fwrite("(EE) LoadModule: Module glx does not have a glxModuleData data object.\n", 71, 1, 0xb7537560(EE) LoadModule: Module glx does not have a glxModuleData data object.
) = 1
fwrite("(EE) LoadModule: Module glx does not have a glxModuleData data object.\n", 71, 1, 0x8220d28) = 1
snprintf("(II) UnloadModule: "%s"\n", 1024, "%s%s%s", "(II)", " ", "UnloadModule: "%s"\n") = 24
clock_gettime(1, 0xbf816230, 71, -1, 0xb7536ff4) = 0
sprintf("[ 71701.660] ", "[%10.3f] ", ...) = 13
fwrite("[ 71701.660] ", 13, 1, 0x8220d28) = 1
vsnprintf("(II) UnloadModule: "glx"\n", 1024, "(II) UnloadModule: "%s"\n", 0xbf8166ec) = 25
fwrite("(II) UnloadModule: "glx"\n", 25, 1, 0x8220d28) = 1
snprintf("(II) Unloading %s\n", 1024, "%s%s%s", "(II)", " ", "Unloading %s\n") = 18
clock_gettime(1, 0xbf816210, -1, 0xbf816230, 3) = 0
sprintf("[ 71701.669] ", "[%10.3f] ", ...) = 13
fwrite("[ 71701.669] ", 13, 1, 0x8220d28) = 1
vsnprintf("(II) Unloading glx\n", 1024, "(II) Unloading %s\n", 0xbf8166c8) = 19
fwrite("(II) Unloading glx\n", 19, 1, 0x8220d28) = 1
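The value 257 passed to dlopen() in the trace above is 0x101. Assuming the glibc/Linux flag values (RTLD_LAZY = 0x001, RTLD_NOW = 0x002, RTLD_GLOBAL = 0x100; other platforms may differ), that decodes to RTLD_LAZY | RTLD_GLOBAL, so lazy binding was indeed requested. A small sketch for decoding such a value, where `decode_dlopen_flags` is a hypothetical helper:

```shell
# Hypothetical helper: print the names of the flags set in a numeric
# dlopen() mode. The bit values are the glibc/Linux ones (an assumption);
# only the three flags discussed in this thread are decoded.
decode_dlopen_flags() {
    mode=$1
    names=""
    if [ $(( mode & 0x001 )) -ne 0 ]; then names="$names RTLD_LAZY"; fi
    if [ $(( mode & 0x002 )) -ne 0 ]; then names="$names RTLD_NOW"; fi
    if [ $(( mode & 0x100 )) -ne 0 ]; then names="$names RTLD_GLOBAL"; fi
    echo "${names# }"
}

decode_dlopen_flags 257   # prints: RTLD_LAZY RTLD_GLOBAL
```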
libdri etc. do provide the symbols required for libglx. Xorg finds them
and loads them after libglx failed. ldconfig -p | grep /usr/lib/xorg does show all the required libraries. According to the man page, dlopen should load those libraries, shouldn't it?!
I'd suspect a problem with ld*, but why is only Xorg module loading broken? What is so special about Xorg? I am perplexed.
cu,
Knut
> libdri etc do provide the symbols required for libglx. Xorg finds them
> and loads them after libglx failed. ldconfig -p | grep /usr/lib/xorg does show
> all the required libraries. According to the man page dlopen should load those
> libraries, shouldn't it?!
Yes, it should.
To use libglx as an example, the only reference it makes to DRIGetDrawableInfo is as a function call:
glx/glxdri.c: retval = DRIGetDrawableInfo(pScreen, drawable->base.pDraw, index, stamp,
These are _normally_ resolved lazily (i.e., when called) by the dynamic loader. However, if you force symbols to be resolved before they're all available, dlopen will fail. That's why I keep asking about -z now: that's the thing that changes function call resolution from lazy to up-front.
You _must_ be getting that behaviour from somewhere. Check the Xorg binary itself. Check for wrapper scripts. Check your OS for security policy changes (-z now lets you do some additional security hardening).
A full extra verbose build log does not show -z now.
After exporting LD_DEBUG and LD_DEBUG_OUTPUT I got "log".
grep "relocation" log shows
14821: relocation processing: /lib/libgpg-error.so.0 (lazy)
14821: relocation processing: /lib/libc.so.6 (lazy)
14821: relocation processing: /lib/librt.so.1 (lazy)
14821: relocation processing: /lib/libm.so.6 (lazy)
14821: relocation processing: /usr/local/lib/libXdmcp.so.6 (lazy)
14821: relocation processing: /usr/local/lib/libXau.so.6 (lazy)
14821: relocation processing: /lib/libz.so.1 (lazy)
14821: relocation processing: /usr/lib/libfontenc.so.1 (lazy)
14821: relocation processing: /usr/lib/libfreetype.so.6 (lazy)
14821: relocation processing: /usr/lib/libXfont.so.1 (lazy)
14821: relocation processing: /usr/lib/libpixman-1.so.0 (lazy)
14821: relocation processing: /lib/libpthread.so.0 (lazy)
14821: relocation processing: /usr/lib/libpciaccess.so.0 (lazy)
14821: relocation processing: /lib/libdl.so.2 (lazy)
14821: relocation processing: /lib/libgcrypt.so.11 (lazy)
14821: relocation processing: /lib/libdbus-1.so.3 (lazy)
14821: relocation processing: /usr/lib/libhal.so.1 (lazy)
14821: relocation processing: Xorg (lazy)
14821: relocation processing: /lib/ld-linux.so.2
14821: relocation processing: /lib/libgcc_s.so.1 (lazy)
14821: relocation processing: /usr/lib/xorg/modules/extensions/libextmod.so (lazy)
14821: relocation processing: /usr/lib/xorg/modules/extensions/libdbe.so (lazy)
14821: relocation processing: /usr/lib/xorg/modules/extensions/libglx.so (lazy)
14821: relocation processing: /usr/lib/xorg/modules/extensions/librecord.so (lazy)
14821: relocation processing: /usr/lib/libdrm.so.2 (lazy)
14821: relocation processing: /usr/lib/xorg/modules/extensions/libdri.so (lazy)
14821: relocation processing: /usr/lib/xorg/modules/extensions/libdri2.so (lazy)
14821: relocation processing: /usr/lib/libdrm_intel.so.1 (lazy)
14821: relocation processing: /usr/lib/xorg/modules/drivers/intel_drv.so (lazy)
Everything is processed lazily, as it should be.
Here is the section related to searching of DRIGetDrawableInfo during processing of libglx:
14821: symbol=DRIGetDrawableInfo; lookup in file=Xorg [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/libhal.so.1 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libdbus-1.so.3 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libgcrypt.so.11 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libdl.so.2 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/libpciaccess.so.0 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libpthread.so.0 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/libpixman-1.so.0 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/libXfont.so.1 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/libfreetype.so.6 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/libfontenc.so.1 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libz.so.1 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/local/lib/libXau.so.6 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/local/lib/libXdmcp.so.6 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libm.so.6 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/librt.so.1 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libc.so.6 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libgpg-error.so.0 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/ld-linux.so.2 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/xorg/modules/extensions/libextmod.so [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/xorg/modules/extensions/libdbe.so [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/xorg/modules/extensions/libglx.so [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libdl.so.2 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libm.so.6 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/librt.so.1 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libc.so.6 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/ld-linux.so.2 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libpthread.so.0 [0]
14821: /usr/lib/xorg/modules/extensions/libglx.so: error: symbol lookup error: undefined symbol: DRIGetDrawableInfo (fatal)
dlopen does not look at libdri.so.
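The same conclusion can be drawn mechanically from the LD_DEBUG log. The following is a sketch under the assumption that the log uses the "symbol=...; lookup in file=..." line format shown above; `files_searched_for` is a hypothetical helper name.

```shell
# Hypothetical helper: reads an LD_DEBUG log on stdin and prints, one per
# line and without duplicates, every file that was searched for the symbol
# named in $1. The symbol name is assumed to be a plain C identifier.
files_searched_for() {
    sed -n "s/.*symbol=$1; *lookup in file=\([^ ]*\).*/\1/p" | LC_ALL=C sort -u
}
```

For example, `files_searched_for DRIGetDrawableInfo < log | grep libdri` printing nothing would confirm that libdri.so was never consulted.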
readelf shows four needed shared libs for libglx.so.
I don't know exactly how the linker finds symbols in the various libraries, but: shouldn't there be a "NEEDED" entry for libdri.so in libglx.so?
Dynamic section at offset 0x5dee8 contains 27 entries:
Tag Type Name/Value
0x00000001 (NEEDED) Shared library: [libdl.so.2]
0x00000001 (NEEDED) Shared library: [libm.so.6]
0x00000001 (NEEDED) Shared library: [librt.so.1]
0x00000001 (NEEDED) Shared library: [libc.so.6]
0x0000000e (SONAME) Library soname: [libglx.so]
0x0000000c (INIT) 0xe818
0x0000000d (FINI) 0x4ee28
0x00000004 (HASH) 0x138
0x6ffffef5 (GNU_HASH) 0x57c
0x00000005 (STRTAB) 0xefc
0x00000006 (SYMTAB) 0x63c
0x0000000a (STRSZ) 2090 (bytes)
0x0000000b (SYMENT) 16 (bytes)
0x00000003 (PLTGOT) 0x5dff4
0x00000002 (PLTRELSZ) 16 (bytes)
0x00000014 (PLTREL) REL
0x00000017 (JMPREL) 0xe808
0x00000011 (REL) 0x18b0
0x00000012 (RELSZ) 53080 (bytes)
0x00000013 (RELENT) 8 (bytes)
0x00000016 (TEXTREL) 0x0
0x0000001e (FLAGS) TEXTREL STATIC_TLS
0x6ffffffe (VERNEED) 0x1840
0x6fffffff (VERNEEDNUM) 2
0x6ffffff0 (VERSYM) 0x1726
0x6ffffffa (RELCOUNT) 3970
0x00000000 (NULL) 0x0
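To list only the NEEDED entries of a module, the dynamic section can be filtered. A minimal sketch, assuming the GNU `readelf -d` layout shown above; `needed_libs` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `readelf -d` output on stdin and prints the
# library name from each (NEEDED) entry, one per line, in file order.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p'
}
```

With the section above as input this prints libdl.so.2, libm.so.6, librt.so.1 and libc.so.6. A typical call would be `readelf -d /usr/lib/xorg/modules/extensions/libglx.so | needed_libs`.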
cu,
Knut
(In reply to comment #6)
> I don't know exactly how the linker finds symbols in the various libraries,
> but: shouldn't there be a "NEEDED" entry for libdri.so in libglx.so?

If these were actual shared libraries, then yes, but these are loadable modules which rely on the symbols being found at runtime in either the loading program or the other objects it has already dlopen'ed.

That said, the Solaris packages do include a patch to add that dependency, as part of our checking that all symbols are resolvable at build time (-z defs):
http://src.opensolaris.org/source/xref/x-cons/xnv-clone/open-src/xserver/xorg/dixmods-deps.patch

If that was useful to other platforms, I'd be happy to contribute it upstream, similar to the recently submitted
http://patchwork.freedesktop.org/patch/7209/

Created attachment 51740
fix of longstanding error handling bug in the module loader

We could argue about the error message...

I think we should give a hint to Fred Foobar how he could fix the problem on his system. But as dlopen() never should fail, we also should ask for a bug report.

cu,
Knut

Please send your patch to xorg-devel for review.

(In reply to comment #8)
> Created attachment 51740
> fix of longstanding error handling bug in the module loader
>
> We could argue about the error message...
>
> I think we should give a hint to Fred Foobar how he could fix the problem on
> his system. But as dlopen() never should fail, we also should ask for a bug
> report.
>
> cu,
> Knut

Someone else fixed the 2nd bug. The main problem still exists in current git master: Xorg fails to start without a manually created Section "Module", even after I installed a fresh version of openSuSE.

Knut

This came up again on intel-gfx@, and it does indeed appear to be a toolchain issue:
http://lists.freedesktop.org/archives/intel-gfx/2012-July/019079.html

The solution is pretty simple:
=============================================
Never ever include -v or --verbose in CFLAGS!
=============================================
Why? Because otherwise there will be some extra output on stderr during the -fPIC test compile executed from configure, and that output causes the build system to erroneously assume that -fPIC does not work. Hence the xorg parts that normally would be built with -fPIC will be built without that flag. The resulting Xorg server will fail to start with the normal configuration setup, as lazy resolution is assumed but impossible. It will work perfectly if you add a suitable Section "Module" that loads all necessary modules in the right order.

I think the test for "-fPIC" support is fundamentally broken and should be fixed. Or would it be better to check for -v and --verbose in CFLAGS?

cu,
Knut
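One way the check for -v and --verbose proposed above could look, as a sketch only: `cflags_has_verbose` is a hypothetical helper and does not exist in util/modular/build.sh or in any configure script. It only looks for the two exact flags named in this thread.

```shell
# Hypothetical helper: succeeds if the flag string in $1 contains -v or
# --verbose as a separate word. $1 is deliberately left unquoted in the
# loop so that it is split on whitespace.
cflags_has_verbose() {
    for flag in $1; do
        case "$flag" in
            -v|--verbose) return 0 ;;
        esac
    done
    return 1
}
```

A build wrapper could then refuse to continue, for example with `if cflags_has_verbose "$CFLAGS"; then echo "remove -v/--verbose from CFLAGS" >&2; exit 1; fi`.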
About a week ago I recompiled a fresh git tree of Xorg. Building the tree succeeded with
export PREFIX=/usr
export PKG_CONFIG_PATH=$PREFIX/lib/pkgconfig
export PATH=$PREFIX/bin:$PATH
export ACLOCAL="aclocal -I $PREFIX/share/aclocal"
export LD_LIBRARY_PATH=$PREFIX/lib
export PYTHONPATH=$PREFIX/lib/python2.7/site-packages
export CFLAGS="-v -O3 "
util/modular/build.sh $PREFIX --modfile modules_to_build --autoresume built-modules.txt \
--confflags "--enable-kdrive --with-dri-drivers=i915 --disable-gallium --localstatedir=/var --with-gnu-ld"
but Xorg did not start:
/usr/bin/Xorg
X.Org X Server 1.11.0
Release Date: 2011-08-26
X Protocol Version 11, Revision 0
Build Operating System: Linux 3.0.4-main i686
Current Operating System: Linux linux-iffr 3.0.4-main #8 PREEMPT Sat Sep 24 16:21:19 CEST 2011 i686
Kernel command line: root=/dev/hda2 acpi_enforce_resources=lax drm.debug=0x0 3
Build Date: 25 September 2011 10:49:38PM
Current version of pixman: 0.23.5
Before reporting problems, check http://wiki.x.org to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting, (++) from command line, (!!) notice, (II) informational, (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Mon Sep 26 07:15:12 2011
(==) Using config directory: "/etc/X11/xorg.conf.d"
(==) Using system config directory "/usr/share/X11/xorg.conf.d"
(EE) Failed to load /usr/lib/xorg/modules/extensions/libglx.so: /usr/lib/xorg/modules/extensions/libglx.so: undefined symbol: DRIGetDrawableInfo
(EE) LoadModule: Module glx does not have a glxModuleData data object.
(EE) Failed to load module "glx" (invalid module, 0)
(EE) Failed to load /usr/lib/xorg/modules/drivers/intel_drv.so: /usr/lib/xorg/modules/drivers/intel_drv.so: undefined symbol: vgaHWFreeHWRec
(EE) LoadModule: Module intel does not have a intelModuleData data object.
(EE) Failed to load module "intel" (invalid module, 0)
(EE) Failed to load /usr/lib/xorg/modules/drivers/vesa_drv.so: /usr/lib/xorg/modules/drivers/vesa_drv.so: undefined symbol: shadowUpdatePacked
(EE) LoadModule: Module vesa does not have a vesaModuleData data object.
(EE) Failed to load module "vesa" (invalid module, 0)
(EE) Failed to load /usr/lib/xorg/modules/drivers/fbdev_drv.so: /usr/lib/xorg/modules/drivers/fbdev_drv.so: undefined symbol: fbdevHWProbe
(EE) LoadModule: Module fbdev does not have a fbdevModuleData data object.
(EE) Failed to load module "fbdev" (invalid module, 0)
(EE) No drivers available.
Fatal server error:
no screens found

I checked all the files and made sure that all were the fresh versions. I found no sign of a build or install problem. I had a look at the failed modules with ldd and nm, and everything looked fine. The symbols reported as undefined definitely were undefined, but all the "LoadModule: Module XXX does not have a XXX data object" messages were definitely wrong.

I tried to find some hints on the web. Google showed thousands of hits for "LoadModule: Module" "does not have a" "data object". I found nothing that helped. There was no reply to a question posted to the intel-gfx mailing list.

Yesterday I had the time to look at the problem again. I searched for the modules/libraries that defined the unresolved symbols, found them in various xorg modules, instructed Xorg to load those modules, found additional unresolved symbols, repeated the process described above, and finally Xorg was functional again.

The solution to my problem is a new file 10-modules.conf in an xorg.conf.d directory that contains the following section.
Section "Module"
Load "dri"
Load "dri2"
Load "vgahw"
Load "fb"
Load "xaa"
Load "int10"
Load "vbe"
Load "shadowfb"
Load "shadow"
Load "fbdevhw"
EndSection

Bug 1:
======
The following excerpt from the Xorg log (without the config file mentioned above) shows that Xorg knows that it is necessary to load glx, dri and dri2. It loads those modules in the specified order, but it is necessary to load dri and dri2 first, otherwise loading of glx fails.
(II) "extmod" will be loaded by default.
(II) "dbe" will be loaded by default.
(II) "glx" will be loaded by default.
(II) "record" will be loaded by default.
(II) "dri" will be loaded by default.
(II) "dri2" will be loaded by default.
(II) Loading /usr/lib/xorg/modules/extensions/libextmod.so
(II) Module extmod: vendor="X.Org Foundation" compiled for 1.11.0, module version = 1.0.0
(II) Loading /usr/lib/xorg/modules/extensions/libdbe.so
(II) Module dbe: vendor="X.Org Foundation" compiled for 1.11.0, module version = 1.0.0
(II) Loading /usr/lib/xorg/modules/extensions/libglx.so
(EE) Failed to load /usr/lib/xorg/modules/extensions/libglx.so: /usr/lib/xorg/modules/extensions/libglx.so: undefined symbol: DRIGetDrawableInfo
(EE) LoadModule: Module glx does not have a glxModuleData data object.
(II) Unloading glx
(EE) Failed to load module "glx" (invalid module, 0)
(II) Loading /usr/lib/xorg/modules/extensions/librecord.so
(II) Module record: vendor="X.Org Foundation" compiled for 1.11.0, module version = 1.13.0
(II) Loading /usr/lib/xorg/modules/extensions/libdri.so
(II) Module dri: vendor="X.Org Foundation" compiled for 1.11.0, module version = 1.0.0
(II) Loading /usr/lib/xorg/modules/extensions/libdri2.so
(II) Module dri2: vendor="X.Org Foundation" compiled for 1.11.0, module version = 1.2.0

BTW: I have been using the git tree of Xorg since March 2011, because the openSuSE 11.4 xorg is broken for i915GM hardware. I never needed a Section "Module" configuration entry. So either something screwed up autoloading of required Xorg modules about two weeks ago, or I screwed up my Xorg configuration files by accident.

Bug 2
=====
All those "LoadModule: Module ABC does not have a XYZ data object" messages are definitely wrong and misleading.

cu,
Knut
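The first step of the manual procedure described above (collecting the unresolved symbols from the server log) can be sketched as follows. `undefined_symbols` is a hypothetical helper, and the pattern assumes the "undefined symbol: NAME" wording of the (EE) lines quoted in this report, with one message per line.

```shell
# Hypothetical helper: reads an Xorg log on stdin and prints each symbol
# that was reported as undefined, one per line, sorted and without
# duplicates. Lines without "undefined symbol:" are ignored.
undefined_symbols() {
    sed -n 's/.*undefined symbol: \([A-Za-z_][A-Za-z0-9_]*\).*/\1/p' | LC_ALL=C sort -u
}
```

For example, `undefined_symbols < /var/log/Xorg.0.log` would list DRIGetDrawableInfo, fbdevHWProbe, shadowUpdatePacked and vgaHWFreeHWRec for the log above. Each name can then be looked up in the installed modules, for instance with GNU nm's `-D --defined-only` options, to decide which Load lines the Section "Module" needs.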