My FOSDEM 2022 talk: Fun with border colors in Vulkan
FOSDEM 2022 took place this past weekend, on
February 5th and 6th. It was a virtual event for the second year in a row, but
this year the Graphics
devroom made a comeback and I participated in it with a talk titled
“Fun with border
colors in Vulkan”. In the talk, I explained the context and origins behind the
VK_EXT_border_color_swizzle
extension that was published last year and in
which I’m listed as one of the contributors.
Big kudos and a big thank you to the FOSDEM organizers one more year. FOSDEM is arguably the most important free and open source software conference in Europe and one of the most important FOSS conferences in the world. It’s run entirely by volunteers, doing an incredible amount of work that makes it possible to have hundreds of talks and dozens of different devrooms in the span of two days. Special thanks to the Graphics devroom organizers.
For the virtual setup, one more year FOSDEM relied on Matrix. It’s great because at Igalia we also use Matrix for our internal communications and, thanks to the federated nature of the service, I could join the FOSDEM virtual rooms using the same interface, client and account I normally use for work. The FOSDEM organizers also let participants create ad-hoc accounts to join the conference, in case they didn’t have a Matrix account previously. Thanks to Matrix widgets, each virtual devroom had its corresponding video stream, which you could also watch freely on their site, embedded in each of the virtual devrooms, so participants wanting to watch the talks and ask questions had everything in a single page.
Talks were pre-recorded and submitted in advance, played at the scheduled times, and Jitsi was used for post-talk Q&A sessions, in which moderators and devroom organizers read aloud the most voted questions in the devroom chat.
Of course, a conference this big is not without its glitches. Video feeds from presenters and moderators were sometimes cut automatically by Jitsi allegedly due to insufficient bandwidth. It also happened to me during my Q&A section while I was using a wired connection on a 300 Mbps symmetric FTTH line. I can only suppose the pipe was not wide enough on the other end to handle dozens of streams at the same time, or Jitsi was playing games as it sometimes does. In any case, audio was flawless.
In addition, some of the pre-recorded videos could not be played at the scheduled time, resulting in a black screen with no sound, due to an apparent bug in the video system. It’s worth noting all pre-recorded talks had been submitted, processed and reviewed prior to the conference, so this was an unexpected problem. This happened with my talk and I had to do the presentation live. Fortunately, I had written a script for the talk and could use it to deliver it without issues by sharing my screen with the slides over Jitsi.
Finally, as a possible improvement point for future virtual or mixed editions, is the fact that the deadline for submitting talk videos was only communicated directly and prominently by email on the day the deadline ended, a couple of weeks before the conference. It was also mentioned in the presenter’s guide that was linked in a previous email message, but an explicit warning a few days or a week before the deadline would have been useful to avoid last-minute rushes and delays submitting talks.
In any case, those small problems don’t take away the great online-only experience we had this year.
Transcription
Another advantage of having a written script for the talk is that I can use it to provide a pseudo-transcription of its contents for those that prefer not to watch a video or are unable to do so. I’ve also summed up the Q&A section at the end below. The slides are available as an attachment in the talk page.
Enjoy and see you next year, hopefully in Brussels this time.
Slide 1 (Talk cover)
Hello, my name is Ricardo Garcia. I work at Igalia as part of its Graphics Team, where I mostly work on the CTS project creating new Vulkan tests and fixing existing ones. Sometimes this means I also contribute to the specification text and other pieces of the Vulkan ecosystem.
Today I’m going to talk about the story behind the “border color swizzle” extension that was published last year. I created tests for this one and I also participated in its release process, so I’m listed as one of the contributors.
Slide 2 (Sampling in Vulkan)
I’ve already started mentioning border colors, so before we dive directly into the extension let me give you a brief introduction to sampling operations in Vulkan and explain where border colors fit in that.
Sampling means reading pixels from an image view and is typically done in the fragment shader, for example to apply a texture to some geometry.
In the example you see here, we have an image view with 3 8-bit color components in BGR order and in unsigned normalized format. This means we’ll suppose each image pixel is stored in memory using 3 bytes, with each byte corresponding to the blue, green and red components in that order.
However, when we read pixels from that image view, we want to get back normalized floating point values between 0 (for the lowest value) and 1 (for the highest value, i.e. when all bits are 1 and the natural number in memory is 255).
As you can see in the GLSL code, the result of the operation is a vector of 4 floating point numbers. Since the image does not have alpha information, it’s natural to think the output vector may have a 1 in the last component, making the color opaque.
If the coordinates of the sample operation make us read the pixel represented there, we would get the values you see on the right.
It’s also worth noting the sampler argument is a combination of two objects in Vulkan: an image view and a sampler object that specifies how sampling is done.
Slide 3 (Normalized Coordinates)
Focusing a bit on the coordinates used to sample from the image, the most common case is using normalized coordinates, which means using floating point values between 0 and 1 in each of the image axis, like the 2D case you see on the right.
But, what happens if the coordinates fall outside that range? That means sampling outside the original image, in points around it like the red marks you see on the right.
That depends on how the sampler is configured. When creating it, we can specify a so-called “address mode” independently for each of the 3 texture coordinate axis that may be used (2 in our example).
Slide 4 (Address Mode)
There are several possible address modes. The most common one is probably the one you see here on the bottom left, which is the repeat addressing mode, which applies some kind of module operation to the coordinates as if the texture was virtually repeating in the selected axis.
There’s also the clamp mode on the top right, for example, which clamps coordinates to 0 and 1 and produces the effect of the texture borders extending beyond the image edge.
The case we’re interested in is the one on the top left, which is the border mode. When sampling outside we get a border color, as if the image was surrounded by a virtually infinite frame of a chosen color.
Slide 5 (Border Color)
The border color is specified when creating the sampler, and initially could only be chosen among a restricted set of values: transparent black (all zeros), opaque white (all ones) or the “special” opaque black color, which has a zero in all color components and a 1 in the alpha component.
The “custom border color” extension introduced the possibility of specifying arbitrary RGBA colors when creating the sampler.
Slide 6 (Image View Swizzle)
However, sampling operations are also affected by one parameter that’s not part of the sampler object. It’s part of the image view and it’s called the component swizzle.
In the example I gave you before we got some color values back, but that was supposing the component swizzle was the identity swizzle (i.e. color components were not reorder or replaced).
It’s possible, however, to specify other swizzles indicating what the resulting final color should be for each of the 4 components: you can reorder the components arbitrarily (e.g. saying the red component should actually come from the original blue one), you can force some of them to be zero or one, you can replicate one of the original components in multiple positions of the final color, etc. It’s a very flexible operation.
Slide 7 (Border Color and Swizzle pt. 1)
While working on the Zink Mesa driver, Mike discovered that the interaction between non-identity swizzle and custom border colors produced different results for different implementations, and wondered if the result was specified at all.
Slide 8 (Border Color and Swizzle pt. 2)
Let me give you an example: you specify a custom border color of 0, 0, 1, 1 (opaque blue) and an addressing mode of clamping to border in the sampler.
The image view has this strange swizzle in which the red component should come from the original blue, the green component is always zero, the blue component comes from the original green and the alpha component is not modified.
If the swizzle applies to the border color you get red. If it does not, you get blue.
Any option is reasonable: if the border color is specified as part of the sampler, maybe you want to get that color no matter which image view you use that sampler on, and expect to always get a blue border.
If the border color is supposed to act as if it came from the original image, it should be affected by the swizzle as the normal pixels are and you’d get red.
Slide 9 (Border Color and Swizzle pt. 3)
Jason pointed out the spec laid out the rules in a section called “Texel Input Operations”, which specifies that swizzling should affect border colors, and non-identity swizzles could be applied to custom border colors without restrictions according to the spec, contrary to “opaque black”, which was considered a special value and non-identity swizzles would result in undefined values with that border.
Slide 10 (Texel Input Operations)
The Texel Input Operations spec section describes what the expected result is according to some steps which are supposed to happen in a defined order. It doesn’t mean the hardware has to work like this. It may need instructions before or after the hardware sampling operation to simulate things happen in the order described there.
I’ve simplified and removed some of the steps but if border color needs to be applied we’re interested in the steps we can see in bold, and step 5 (border color applied) comes before step 7 (applying the image view swizzle).
I’ll describe the steps with a bit more detail now.
Slide 11 (Coordinate Conversion)
Step 1 is coordinate conversion: this includes converting normalized coordinates to integer texel coordinates for the image view and clamping and modifying those values depending on the addressing mode.
Slide 12 (Coordinate Validation)
Once that is done, step 2 is validating the coordinates. Here, we’ll decide if texel replacement takes place or not, which may imply using the border color. In other sampling modes, robustness features will also be taken into account.
Slide 13 (Reading Texel from Image)
Step 3 happens when the coordinates are valid, and is reading the actual texel from the image. This immediately implies reordering components from the in-memory layout to the standard RGBA layout, which means a BGR image view gets its components immediately put in RGB order after reading.
Slide 14 (Format Conversion)
Step 4 also applies if an actual texel was read from the image and is format conversion. For example, unsigned normalized formats need to convert pixel values (stored as natural numbers in memory) to floating point values.
Our example texel, already in RGB order, results in the values you see on the right.
Slide 15 (Texel Replacement)
Step 5 is texel replacement, and is the alternative to the previous two steps when the coordinates were not valid. In the case of border colors, this means taking the border color and cutting it short so it only has the components present in the original image view, to act as if the border color was actually part of the image.
Because this happens after the color components have already been reordered, the border color is always specified in standard red, green, blue and alpha order when creating the sampler. The fact that the original image view was in BGR order is irrelevant for the border color. We care about the alpha component being missing, but not about the in-memory order of the image view.
Our transparent blue border is converted to just “blue” in this step.
Slide 16 (Expansion to RGBA)
Step 6 takes us back to a unified flow of steps: it applies to the color no matter where it came from. The color is expanded to always have 4 components as expected in the shader. Missing color components are replaced with zeros and the alpha component, if missing, is set to one.
Our original transparent blue border is now opaque blue.
Slide 17 (Component Swizzle)
Step 7, finally the swizzle is applied. Let’s suppose our image view had that strange swizzle in which the red component is copied from the original blue, the green component is set to zero, the blue one is set to one and the alpha component is not modified.
Our original transparent blue border is now opaque magenta.
Slide 18 (VK_EXT_custom_border_color)
So we had this situation in which some implementations swizzled the border color and others did not. What could we do?
We could double-down on the existing spec and ask vendors to fix their implementations but, what happens if they cannot fix them? Or if the fix is impractical due to its impact in performance?
Unfortunately, that was the actual situation: some implementations could not be fixed. After discovering this problem, CTS tests were going to be created for these cases. If an implementation failed to behave as mandated by the spec, it wouldn’t pass conformance, so those implementations only had one way out: stop supporting custom border colors, but that’s also a loss for users if those implementations are in widespread use (and they were).
The second option is backpedaling a bit, making behavior undefined unless some other feature is present and designing a mechanism that would allow custom border colors to be used with non-identity swizzles at least in some of the implementations.
Slide 19 (VK_EXT_border_color_swizzle)
And that’s how the “border color swizzle” extension was created last year. Custom colors with non-identity swizzle produced undefined results unless the borderColorSwizzle feature was available and enabled. Some implementations could advertise support for this almost “for free” and others could advertise lack of support for this feature.
In the middle ground, some implementations can indicate they support the case, but the component swizzle has to be indicated when creating the sampler as well. So it’s both part of the image view and part of the sampler. Samplers created this way can only be used with image views having a matching component swizzle (which means they are no longer generic samplers).
The drawback of this extension, apart from the obvious observation that it should’ve been part of the original custom border color extension, is that it somehow lowers the bar for applications that want to use a single code path for every vendor. If borderColorSwizzle is supported, it’s always legal to pass the swizzle when creating the sampler. Some implementations will need it and the rest can ignore it, so the unified code path is now harder or more specific.
And that’s basically it. Sometimes the Vulkan Working Group in Khronos has had to backpedal and mark as undefined something that previous versions of the Vulkan spec considered defined. It’s not frequent nor ideal, but it happens. But it usually does not go as far as publishing a new extension as part of the fix, which is why I considered this interesting.
Slide 20 (Questions?)
Thanks for watching! Let me know if you have any questions.
Q&A Section
Martin: The first question is from "ancurio" and he’s asking if swizzling is implemented in hardware.
Me: I don’t work on implementations so take my answer with a grain of salt. It’s my understanding you can usually program that in hardware and the hardware does the swizzling for you. There may be implementations which need to do the swizzling in software, emitting extra instructions.
Martin: Another question from "ancurio". When you said lowering the bar do you mean raising it?
I explain that, yes, I meant to say raising the bar for the application. Note: I meant to say that it lowers the bar for the specification and API, which means a more complicated solution has been accepted.
Martin: "enunes" asks if this was originally motivated by some real application bug or by something like conformance tests/spec disambiguation?
I explain it has both factors. Mike found the problem while developing Zink, so a real application hit the problematic case, and then the Vulkan Working Group inside Khronos wanted to fix this, make the spec clear and provide a solution for apps that wanted to use non-identity swizzle with border colors, as it was originally allowed.
Martin: no more questions in the room but I have one more for you. How was your experience dealing with Khronos coordinating with different vendors and figuring out what was the acceptable solution for everyone?
I explain that the main driver behind the extension in Khronos was Piers Daniell from NVIDIA (NB: listed as the extension author). I mention that my experience was positive, that the Working Group is composed of people who are really interested in making a good specification and implementations that serve app developers. When this problem was detected I created some tests that worked as a poll to see which vendors could make this work easily and what others may need to make this work if at all. Then, this was discussed in the Working Group, a solution was proposed (the extension), then more vendors reviewed and commented that, then tests were adapted to the final solution, and finally the extension was published.
Martin: How long did this whole process take?
Me: A few months. Take into account the Working Group does not meet every day, and they have a backlog of issues to discuss. Each of the previous steps takes several weeks, so you end up with a few months, which is not bad.
Martin: Not bad at all.
Me: Not at all, I think it works reasonably well.
Martin: Specially when you realize you broke something and the specification needs fixing. Definitely decent.