Since Fedora updated to Beignet 1.2, all my Einstein@Home OpenCL tasks have been marked invalid. This is with Intel(R) HD Graphics 5500 (Broadwell GT2), kernel 4.7.4-200.fc24.x86_64, libdrm 2.4.70-1.fc24 and llvm 3.8.0-1.fc24.
Beignet 1.1 worked fine. I bisected the regression to the following commit:
27a95c6678e3e6426ad1bc7927d92f1488fe884c is the first bad commit
Author: Guo Yejun <firstname.lastname@example.org>
Date: Mon Feb 29 08:08:06 2016 +0800
enable FP_CONTRACT on as default, and implemented with MAD
According to OpenCL spec, FP_CONTRACT is on as default, while LLVM/clang
just enabled it at http://reviews.llvm.org/D14200 at Nov 2015. So we still
need set this option explicitly now.
The implementation is hardware MAD instruction whose accuracy is enough
Passed test: contractions of conformance test
Signed-off-by: Guo Yejun <email@example.com>
Reviewed-by: Ruiling Song <firstname.lastname@example.org>
Reverting this commit on top of current Beignet git master solves the problem. The results are not completely off, but they differ enough from those of other platforms to fail the Einstein@Home task validation.
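To illustrate how contraction alone can shift a result, here is a minimal sketch in double precision. It emulates a fused multiply-add with exact rational arithmetic and compares it to the separately rounded form. The inputs are chosen for illustration and are not taken from the application, and whether Beignet's MAD path rounds exactly like this emulation is an assumption.

```python
from fractions import Fraction


def mul_add_unfused(a: float, b: float, c: float) -> float:
    """Multiply, round to a double, then add and round again."""
    return a * b + c


def mul_add_fused(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add: exact arithmetic, one final rounding."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # Fraction -> float conversion is correctly rounded


# Illustrative inputs: the exact product a * b is 1 - 2**-54, which lies
# halfway between two doubles and rounds up to 1.0 in the unfused form.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

print("unfused:", mul_add_unfused(a, b, c))
print("fused:  ", mul_add_fused(a, b, c))
```

With these inputs the unfused form loses the product's low-order bit and returns zero, while the fused form keeps it and returns -2**-54. Neither is wrong by itself, but a validator that compares against results from platforms without contraction may see such differences add up.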
You can reproduce this either by installing the BOINC client, attaching to the Einstein@Home project and setting the preferences to allow OpenCL work for Intel CPUs, or by testing with the standalone app. The complete sources are here: https://einstein.phys.uwm.edu/download/brp-src-release.zip. However, it is quite hard to get them working, so for convenience I'll attach a compiled test app here. Just run runtest and check the results file against the results-1.52-intel_gpu file. It runs for around 16 minutes here on my Broadwell GT2.
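For the comparison step, a rough sketch of a line-by-line check is below. It assumes both files are plain text, which the thread does not confirm, and the helper name is made up for illustration. A textual diff is also stricter than the project's validator presumably is, so any reported difference only shows that the outputs are not identical.

```python
import difflib
from pathlib import Path


def diff_results(actual_path, reference_path):
    """Return unified-diff lines between two text files (empty if identical)."""
    actual = Path(actual_path).read_text(errors="replace").splitlines()
    reference = Path(reference_path).read_text(errors="replace").splitlines()
    return list(
        difflib.unified_diff(
            reference,
            actual,
            fromfile=str(reference_path),
            tofile=str(actual_path),
            lineterm="",
        )
    )
```

After runtest finishes, calling `diff_results("results", "results-1.52-intel_gpu")` from the test directory returns an empty list when the two files match exactly.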
Created attachment 127140
Created attachment 127148
I tried it on my Skylake machine with Ubuntu 16.04 and observed that the results differ depending on whether FP_CONTRACT is on or off.
I'm also curious why the math accuracy makes such a difference. Could you narrow down the application to a simple one? I'm not sure if it is a driver issue, or if the application algorithm itself is accuracy sensitive.
As a workaround, you can try turning FP_CONTRACT off in the OpenCL kernel.
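In OpenCL C this is done with the standard `#pragma OPENCL FP_CONTRACT OFF` directive, which can be placed at file scope ahead of the kernels. A minimal sketch of patching a kernel source string before it is handed to the compiler follows. The helper name and the example kernel are hypothetical and not taken from the Einstein@Home sources, where the pragma could equally be added to the kernel file by hand.

```python
# Standard OpenCL C pragma that disables floating-point contraction.
FP_CONTRACT_OFF = "#pragma OPENCL FP_CONTRACT OFF\n"


def with_fp_contract_off(kernel_source: str) -> str:
    """Return the kernel source with contraction disabled at file scope."""
    if kernel_source.startswith(FP_CONTRACT_OFF):
        return kernel_source  # already patched, do not add the pragma twice
    return FP_CONTRACT_OFF + kernel_source


# Hypothetical kernel, for illustration only.
EXAMPLE_KERNEL = """\
__kernel void scale_add(__global const float *x,
                        __global float *y,
                        const float k)
{
    size_t i = get_global_id(0);
    y[i] = k * x[i] + y[i];  /* candidate for contraction into a MAD */
}
"""

print(with_fp_contract_off(EXAMPLE_KERNEL))
```

The pragma stays in effect until another FP_CONTRACT pragma or the end of the program source, so a single line at the top covers every kernel in that source.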
> I'm also curious why the math accuracy makes such a difference. Could you
> narrow down the application to a simple one? I'm not sure if it is a driver
> issue, or if the application algorithm itself is accuracy sensitive.
By narrowing down the application, do you mean narrowing it down to a specific kernel? To tell the truth, I'm mostly a user of the application and don't have much idea what is going on. In my understanding it should be some sort of FFT, but I have no idea how accuracy sensitive it is. I'll try to get more info from someone who actually understands the code, and I'll have another look at the source code to see if I can narrow this down.
BTW, is there some switch or debug variable to dump the compiled kernels in a human-readable format, so I can actually see how the produced instructions differ with and without FP_CONTRACT?
You can find these in docs/Beignet/Backend.mdwn.
In your case, I think OCL_OUTPUT_LLVM_BEFORE_LINK=1 is enough.
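A small sketch of running the test app with that variable set is below. The helper names and the log file name are made up for illustration. The thread does not say where Beignet writes the dump, so this captures both stdout and stderr as a precaution.

```python
import os
import subprocess


def beignet_dump_env(base=None):
    """Copy an environment and enable the LLVM IR dump named in this thread."""
    env = dict(os.environ if base is None else base)
    env["OCL_OUTPUT_LLVM_BEFORE_LINK"] = "1"
    return env


def run_with_llvm_dump(command, log_path):
    """Run command with the variable set, saving stdout and stderr to log_path."""
    with open(log_path, "wb") as log:
        completed = subprocess.run(
            command,
            env=beignet_dump_env(),
            stdout=log,
            stderr=subprocess.STDOUT,
        )
    return completed.returncode
```

For example, `run_with_llvm_dump(["./runtest"], "llvm-dump.log")` could be run once against a Beignet build with the commit and once against a build with it reverted, and the two logs compared.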
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/beignet/beignet/issues/75.