Bug 15334

Summary: Radeon EXA + Sampling off the edge of REPEAT_NONE sources
Product: xorg Reporter: Owen Taylor <otaylor>
Component: Driver/RadeonAssignee: xf86-video-ati maintainers <xorg-driver-ati>
Status: RESOLVED FIXED QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium CC: leio, npeninguy
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
Patch fixing repeating transformed sources
none
Trial patch to use border color on R100/R200
none
pycairo test case
none
Expected output from pycairo test case
none
r100 output of pycairo test case
none
OpenGL test case showing same hw problem
none
Expected output from GL test case
none
R100 output of GL test case none

Description Owen Taylor 2008-04-03 00:02:54 UTC
Created attachment 15640 [details] [review]
Patch fixing repeating transformed sources

REPEAT_NONE (The default pixman repeat mode) is being handled in
the Radeon EXA code render acceleration code incorrectly ... the
current behavior is pretty much what REPEAT_PAD is supposed to
do, while REPEAT_NONE should give alpha=0 pixels when sampling
off the edge.

This is easy to do using the border color feature of the hardware
when the source pixmap has alpha, but since the border color
register is in the same format as the source pixmap, I don't
how to solve the issue in the no-alpha source case, so the
attached patch just falls back.

Notes:

 - The same issue is almost certainly present for R100/R200,
   but lacking both experience and hardware in that area I
   haven't tried to address it.

 - The handling of untransformed sources may not be right
   if the upper layers don't clip the rendered area to the
   source bounds. (cairo does this on the client side, so
   my test program didn't reveal this one way or the other
   and I don't want to dig through the code right now.)

   The correct fix if that isn't happening would be to add
   that clipping, since it's a significant optimization,
   and a common case where you don't want to fall back to
   software.

 - It would be theoretically possible to detect that the
   sampled area is entirely within the source and avoid
   the fallback for non-alpha sources but it would be hard
   and I think it's a rare case not worth worrying about.
  
 - The names for the clamp modes in radeon_reg.h are 
   not at obvious to me:

#       define R300_TX_CLAMP_WRAP                       0
#       define R300_TX_CLAMP_MIRROR                     1
#       define R300_TX_CLAMP_CLAMP_LAST                 2
#       define R300_TX_CLAMP_MIRROR_CLAMP_LAST          3
#       define R300_TX_CLAMP_CLAMP_BORDER               4
#       define R300_TX_CLAMP_MIRROR_CLAMP_BORDER        5
#       define R300_TX_CLAMP_CLAMP_GL                   6
#       define R300_TX_CLAMP_MIRROR_CLAMP_GL            7

   CLAMP_GL seems to correspond to GL_CLAMP_TO_BORDER
   which I believe was added later in GL and not to GL_CLAMP.
   The corresponding #defines in Mesa make more sense to me:

#       define R300_TX_REPEAT                    0
#       define R300_TX_MIRRORED                  1
#       define R300_TX_CLAMP_TO_EDGE             2
#       define R300_TX_MIRROR_ONCE_TO_EDGE       3
#       define R300_TX_CLAMP                     4
#       define R300_TX_MIRROR_ONCE               5
#       define R300_TX_CLAMP_TO_BORDER           6
#       define R300_TX_MIRROR_ONCE_TO_BORDER     7

   But I don't understand this well enough to say if the 
   correspondence to GL is really exact enough to merit those
   names. The r500 docs document the values as:

   00 - Wrap (repeat)
   01 - Mirror
   02 - Clamp to last texel (0.0 to 1.0)
   03 - MirrorOnce to last texel (-1.0 to 1.0) 
   04 - Clamp half way to border color (0.0 to 1.0)
   05 - MirrorOnce half way to border color (-1.0 to 1.0)
   06 - Clamp to border color (0.0 to 1.0)
   07 - MirrorOnce to border color (-1.0 to 1.0)
 
  Anyways, a digression...
Comment 1 Michel Dänzer 2008-04-03 00:17:11 UTC
A possible better solution would be to transform the destination coordinates such that no rasterization occurs outside of the non-repeated texture area (and to use the hardware scissor to prevent rasterization outside of the provided destination rectangle).
Comment 2 Owen Taylor 2008-04-03 07:17:51 UTC
I see how a scissor could be used in the case where the image of the
source is an exact pixel-aligned rectangle:

 - No transformation
 - Integer translation + 90 degree rotation
 - Arbitrary translation + arbitrary scale + FILTER_NEAREST
   (avoiding off-by-ones on the edges could be hard for this)

Do you see it as being more general than that?
Comment 3 Michel Dänzer 2008-04-03 09:04:10 UTC
(In reply to comment #2)
Ignore the scissor for now - it would just be to prevent rendering outside of the composite operation's destination rectangle.

The idea is to transform the destination vertices such that their texture coordinates coincide with the corners of the texture. Or does the spec require that all pixels of the destination rectangle do get rasterized?

Either way, I didn't think of non-linear filtering...

Another possibility might be to handle this in the shader?

It's a pity the R300 3D engine seems to require the texture to have alpha for your solution to work... AFAICT this might not be the case on R200 at least, as the border colour registers always take a 32 bit ARGB value there.
Comment 4 Owen Taylor 2008-04-03 10:53:11 UTC
If I take a small image (say a 1x1 pixmap) and scale it up with REPEAT_NONE
and bilinear filtering, the right result according to the RENDER spec is not
a sharp-edge square, it's a blurry dot.

(This means that using REPEAT_NONE with a transform is seldom desirable,
but...) So, I don't think an approach of modifying
the output vertices will work.

(Also, with some modes, such as SOURCE, you do have to render all pixels)

The two things I can think of that would work:

 - Copy the source texture to a temporary with alpha
 - As you say, use a shader program.

From what I saw browsing the the register header file you are probably
right about R100/R200.
Comment 5 Michel Dänzer 2008-04-03 23:20:28 UTC
Another crazy idea: Modulate the alpha channel with an additional alpha-only texture that samples 0.0 at the border and 1.0 otherwise. This might not even require actual storage.

Anyway, I guess we can use this patch for now and worry about alternatives later if it turns out to be important in practice. If nobody beats me to it, I'll integrate it when I get around to it.

I do see some minor issues in the patch though: I think the comment should say 'alpha=0' rather than just 'alpha', and I'm not sure the driver should use pixman macros directly.
Comment 6 Owen Taylor 2008-04-04 21:52:05 UTC
Hmm, it took me drawing some pictures and working out the different cases
for edges and corners, but the "crazy idea" is actually exact.

(It's not obvious because modulation with alpha and bilinear filtering
can't generally be swapped  .. it works here because at the edges where
we are modulating with alpha the color is constant)

As far as I can see, other than creating a temporary alpha surface the
size of the source (which seems prohibitive) the main win of the idea
is making it simpler to implement the correct behavior with a shader ...
instead of having to implement the bilinear filtering from scratch
in the shader, you just have to sample the texture normally than 
multiply it by, say:

 clamp(0.5 + (s * width), 0, 1) *
 clamp(0.5 + ((1-s) * width), 0, 1) *
 clamp(0.5 + (t * height), 0, 1) *
 clamp(0.5 + ((1-t) * height), 0, 1)

But back to more immediately to this patch - how would you prefer to see
the test for "source has alpha" be done? I chose to use the pixman macros
because AFAIK they are as public as the pixman format constants that we use
in this file and the test was simple and compact that way.

(Yes, that should be alpha=0, I'll do a new rev of the patch that fixes
up both at once.)
Comment 7 Michel Dänzer 2008-04-07 00:58:37 UTC
(In reply to comment #6)
> As far as I can see, other than creating a temporary alpha surface the
> size of the source (which seems prohibitive) [...]

The nifty part is I don't think we need any additional storage. :) Something like

R300_EASY_TX_FORMAT(ZERO, ZERO, ZERO, ONE, X8)

should do the trick I think.


> But back to more immediately to this patch - how would you prefer to see
> the test for "source has alpha" be done?

Use PICT_FORMAT_A() instead of PIXMAN_FORMAT_A() (see /usr/include/xorg/picture.h). If you don't mind, I can push the patch with those two changes.
Comment 8 Owen Taylor 2008-04-07 07:26:36 UTC
> The nifty part is I don't think we need any additional storage. :) Something
> like
> 
> R300_EASY_TX_FORMAT(ZERO, ZERO, ZERO, ONE, X8)
> 
> should do the trick I think.

So the idea is you just point TXOFFSET to some random piece of memory
(the start of the source texture, say), and it won't matter because
the data doesn't get used, or maybe doesn't even get read at all?

If that works, that's nifty indeed.

> > But back to more immediately to this patch - how would you prefer to see
> > the test for "source has alpha" be done?
> 
> Use PICT_FORMAT_A() instead of PIXMAN_FORMAT_A() (see
> /usr/include/xorg/picture.h). If you don't mind, I can push the patch with
> those two changes.

Sure, if you want to do that, it would appreciated.
Comment 9 Roland Scheidegger 2008-04-07 08:42:44 UTC
(In reply to comment #8)
> > The nifty part is I don't think we need any additional storage. :) Something
> > like
> > 
> > R300_EASY_TX_FORMAT(ZERO, ZERO, ZERO, ONE, X8)
> > 
> > should do the trick I think.
> 
> So the idea is you just point TXOFFSET to some random piece of memory
> (the start of the source texture, say), and it won't matter because
> the data doesn't get used, or maybe doesn't even get read at all?
It won't get used, but I'd suspect it still will get read (so yes the texture address must be valid for the gpu to read).
Comment 10 Michel Dänzer 2008-04-07 09:24:50 UTC
(In reply to comment #8)
> So the idea is you just point TXOFFSET to some random piece of memory
> (the start of the source texture, say), and it won't matter because
> the data doesn't get used, or maybe doesn't even get read at all?

Exactly.


> > If you don't mind, I can push the patch with those two changes.
> 
> Sure, if you want to do that, it would appreciated.

Done, thanks.
Comment 11 Owen Taylor 2008-04-13 08:20:29 UTC
On the R300_EASY_TX_FORMAT(ZERO, ZERO, ZERO, ONE, X8) idea:

Isn't the basic problem we are trying to solve here that a texture format 
of R300_EASY_TX_FORMAT(X, Y, Z, ONE, W8Z8Y8X8) doesn't allow us to have 
alpha in the border color?

By analogy I don't think we are going to have any more luck 
getting alpha in a the border color for a texture format of
R300_EASY_TX_FORMAT(ZERO, ZERO, ZERO, ONE, X8).

(The docs describe the swizzling fields of TX_FORMAT as "Specifies 
swizzling for each channel _at the input of the pixel shader_",
so that also implies that it happens after texture sampling/filtering.)
Comment 12 Owen Taylor 2008-04-13 08:56:44 UTC
Created attachment 15878 [details] [review]
Trial patch to use border color on R100/R200

Here's a patch that tries using the border color on R100/R200 (taking
the advantage of the fact that it is always ARGB in a fixed format.)
However, it fails on R100 because the hardware doesn't seem smoothly blend
with the border color using linear interpolation, but rather to 
abruptly just use the border color for pixels near the edge.

I'll attach a test that demonstrates this with expected and actual
output.

I reproduced the same behavior using an OpenGL program, so I don't
think it's misprogramming of the hardware, though obviously some
missing element could be shared between Mesa and the 2D driver.

I have not yet tested on R200. The easy test would be to try the
GL program, which I'll attach as well. Compile with:

 gcc -g -Wall -o gl-border-test gl-border-test.c `pkg-config --cflags --libs gl glu` -lglut

And run as:

 ./gl-border-test +linear
Comment 13 Owen Taylor 2008-04-13 09:01:18 UTC
Created attachment 15879 [details]
pycairo test case

This test case will not work properly with either the current cairo
(bug 15349) or the current X server (bug 15369). If someone finds that
the GL test case works on R200, I'll come up with a more constrained
test case that doesn't trigger the other bugs along with a new driver patch.
Comment 14 Owen Taylor 2008-04-13 09:02:16 UTC
Created attachment 15880 [details]
Expected output from pycairo test case
Comment 15 Owen Taylor 2008-04-13 09:02:46 UTC
Created attachment 15881 [details]
r100 output of pycairo test case
Comment 16 Owen Taylor 2008-04-13 09:04:41 UTC
Created attachment 15882 [details]
OpenGL test case showing same hw problem

As the leftover comment at the top shows, this was derived from the
Mesa demo program trispd.c.
Comment 17 Owen Taylor 2008-04-13 09:05:15 UTC
Created attachment 15883 [details]
Expected output from GL test case
Comment 18 Owen Taylor 2008-04-13 09:05:41 UTC
Created attachment 15884 [details]
R100 output of GL test case
Comment 19 Mart Raudsepp 2008-04-13 20:29:38 UTC
(In reply to comment #18)
> Created an attachment (id=15884) [details]
> R100 output of GL test case

R200 gets an identical output from the GL test case
Comment 20 Roland Scheidegger 2008-04-14 08:01:53 UTC
(In reply to comment #11)
> By analogy I don't think we are going to have any more luck 
> getting alpha in a the border color for a texture format of
> R300_EASY_TX_FORMAT(ZERO, ZERO, ZERO, ONE, X8).
> 
> (The docs describe the swizzling fields of TX_FORMAT as "Specifies 
> swizzling for each channel _at the input of the pixel shader_",
> so that also implies that it happens after texture sampling/filtering.)
Yes, looks like you're right.
Comment 21 Roland Scheidegger 2008-04-14 08:08:17 UTC
(In reply to comment #12)
> Created an attachment (id=15878) [details]
> Trial patch to use border color on R100/R200
> 
> Here's a patch that tries using the border color on R100/R200 (taking
> the advantage of the fact that it is always ARGB in a fixed format.)
> However, it fails on R100 because the hardware doesn't seem smoothly blend
> with the border color using linear interpolation, but rather to 
> abruptly just use the border color for pixels near the edge.
Yes, I'm pretty sure this is a r100+r200 hardware limitation, apparently the hw won't do bilinear filtering but use the border color without filtering for these pixels if the sampling center is outside 0.0-1.0. You could use mesa's tests/texwrap to see what all modes produce. Maybe such a border color sampling was asked for (or is at least fully compliant) for d3d.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.