Year-end donations round, 2020 edition

Posted on .

There’s no doubt 2020 has been a special year, often for the worse. Due to the ongoing global pandemic, many people have lost their jobs, have fewer healthcare options, or have had and will continue to have trouble getting food and feeding their families. And this is on top of many other ongoing problems, which have only gotten worse this year. For those of us lucky enough to have something to spare, donating to NGOs and adding our grain of sand to the pile is critical.

On a personal level, I donate to several “classic” NGOs, so to speak. On a professional level, at Igalia we collaborate with a wide variety of NGOs, in some cases on very specific projects.

However, at the end of the year I always like to make a small round of personal donations to projects and organizations which are also important for our digital lives on a daily basis. This year I’ve selected the following ones:

  • EFF has done a superb job, as usual. Apart from their crucial defense of civil liberties and digital rights, we owe them many things you may be using every day. Let’s not forget they were involved in starting the Let’s Encrypt project and created Privacy Badger and HTTPS Everywhere, among other tools. This year, they went well beyond the call of duty by representing the current maintainers of youtube-dl and helping get the project restored on GitHub.

  • Signal is, to me, an essential tool I use every day to communicate with my friends and family. It’s a free and open source software project providing an easy-to-use messaging application that pushes the state of the art in end-to-end encryption for text messages and audio and video calls.

  • Internet Archive is, to me, another essential project. Somewhat connected in spirit to youtube-dl, it’s playing a critical role in cultural preservation and providing free access to millions of works of art and science.

  • Free Software Foundation Europe promotes free software in the European Union, also running campaigns to increase its use in public administrations and their computers, as well as attempting to encourage publicly-funded projects to be released as free software.

I didn’t donate to Wikipedia this year because I prefer to chip in when they run their fundraising campaigns. In the past I’ve also donated to Mozilla but I understand it may be a bit controversial. The best thing you can do for Mozilla and for the open web is to keep using and promoting Firefox, in my humble opinion.

In addition, I encourage you to donate to small free and open source software projects you may be using every day, where the impact of many small donations can be significant. I was about to donate money to uBlock Origin, but they politely reject donations in their README file. However, maybe you develop software professionally on Windows and happen to use WinSCP very frequently, for example. Just think of the free software projects you use every day. Some of them probably don’t have large corporate sponsors behind them. They may also offer support contracts as their main revenue source, and those could be useful for your employer.

Embedding YouTube videos without making your site fatter

Posted on . Updated on .

Making this site lighter and improving load times for my readers has been a priority for some years. I’ve stopped embedding web fonts, I’ve started using Unicode icons instead of relying on Font Awesome and I’ve also started loading Disqus comments on demand, which also has a positive impact on the privacy of anyone reading these pages.

However, on a few occasions I’ve wanted to embed a YouTube video in one of the posts, and I had never realized this can heavily impact page sizes and load times. Take, for example, the following HTML document.

<html>
<head><title>Embedded YouTube Video</title></head>
<body>
<iframe
    width="1280"
    height="720"
    src="https://www.youtube.com/embed/ck7utXYcZng"
    frameborder="0"
    allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
    allowfullscreen>
</iframe>
</body>
</html>

The iframe code you see above is copied almost verbatim from the code YouTube gives you when you right-click on the video and select “Copy embed code”. If you store that document locally and open it with Firefox with its network inspection tool active, you’ll discover it attempts to load, as of the time this text is being written, around 1.84MB of data, and that’s with uBlock Origin blocking some additional requests. The largest piece of that is YouTube’s base JavaScript library.

Firefox network inspection tool showing the base YouTube JavaScript library weighing 1.46MB

On the one hand, it’s likely many people already have that file, and some others, in their browser cache. On the other hand, I don’t feel comfortable making that assumption and throwing my hands up. This prompted me to look for a way to embed YouTube videos without loading so much data by default, and it turns out other people have already found solutions to the problem, which I’ve slightly tuned and am re-sharing here. Instead of using the previous iframe code, I use something slightly more convoluted.

<html>
<head><title>Embedded YouTube Video</title></head>
<body>
<iframe
    title="Video: Why Can't You Download Videos on YouTube? How a 20-Year-Old Law Stops youtube-dl Users AND Farmers"
    height="720"
    width="1280"
    frameborder="0"
    allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
    allowfullscreen
    src="https://www.youtube.com/embed/ck7utXYcZng"
    srcdoc="
        <style>
            * {
                padding: 0;
                margin: 0;
                overflow: hidden;
            }
            html, body {
                height: 100%;
            }
            img, span {
                /* All elements take the whole iframe width and are vertically centered. */
                position: absolute;
                width: 100%;
                top: 0;
                bottom: 0;
                margin: auto;
            }
            span {
                /* This mostly applies to the play button. */
                height: 1.5em;
                text-align: center;
                font-family: sans-serif;
                font-size: 500%;
                color: white;
            }
        </style>
        <!-- The whole frame is a link to the proper embedded page with autoplay. -->
        <a href='https://www.youtube.com/embed/ck7utXYcZng?autoplay=1'>
            <img
                src='https://img.youtube.com/vi/ck7utXYcZng/hqdefault.jpg'
                alt='Video: Why Cant You Download Videos on YouTube? How a 20-Year-Old Law Stops youtube-dl Users AND Farmers'
            >
            <!-- Darken preview image laying this on top. Also makes the play icon stand out. -->
            <span style='height: 100%; background: black; opacity: 75%'></span>
            <!-- Play icon. -->
            <span>&#x25b6;</span>
        </a>"
></iframe>
</body>
</html>

The first few lines are almost identical up to the src attribute. I’ve only added the title attribute for accessibility reasons. Special care needs to be taken if the video title contains double or single quotes. The src attribute contains the classic URL, but it’s only used as a fallback for browsers that do not support the srcdoc attribute that starts on the next line. srcdoc allows you to specify an inline document that will be used instead of loading the frame from an external URL. Support for it is widespread nowadays. As you can see, the embedded inline document contains a style element followed by an a element pointing to the real embedded iframe, only this time with the autoplay parameter set to 1, so the video will start playing immediately when the frame is loaded.

The provided style sheet makes sure the link fills the entire embedded iframe, so clicking anywhere on it loads the video and starts playing it. Inside the link you’ll find three items. The first one is the video thumbnail in YouTube’s “high quality” version, which is actually lightweight and gets the job done as the background image. My guess is the name predates HD and FHD content. On top of that image I’ve placed a span element with a black background color and 75% opacity that, again, fills the whole iframe and darkens the background image, making the play button stand out. Finally, another span element is laid out on top of those, containing the Unicode character for a right-pointing triangle in a large font. This serves as the aforementioned play button and gives readers the visual cue they need to click on the video to start playing it.

With those changes and for this particular video, the browser only needs to load around 9KB of data. You can see what it looks like below.

Edit 2020-12-08: The autoplay parameter is ignored by YouTube on mobile, so users will need to tap twice to watch the video: once to click on your link and load the iframe, and once more to start playing the video from it. Still, I think saving almost 2MB by default is worth the double tap. On an iPad I only need to tap once, so it depends on the exact device and what’s considered “mobile”.

Origins of the youtube-dl project

Posted on . Updated on .

As you may know, as of the time this text is being written youtube-dl’s repository at GitHub is blocked due to a DMCA takedown letter sent to GitHub on behalf of the RIAA. While I cannot comment on the current maintainers' plans or ongoing discussions, in light of the claims made in that letter I thought it would be valuable to put in writing, as the project creator and initial maintainer, an account of the first years of youtube-dl.

Copper thieves

All good stories need at least one villain, so I have arbitrarily chosen copper thieves as the villains of the story that set in motion what youtube-dl is today. Back in 2006 I was living in a town 5 to 10 kilometers away from Avilés, which is itself a small city or town in northern Spain. While people in Avilés enjoyed some nice infrastructure and services, including cable and ADSL Internet access, the area I lived in lacked those advantages. I was too far away from the telephone exchange to enjoy ADSL, and copper thieves had been stealing copper wires along the way to it for years, causing telephone service outages from time to time and making the telephone company replace those wires with weaker and thinner ones, knowing they would likely be stolen again. This had been going on for several years at that point.

This meant my only choice for home Internet access until then had been a dial-up connection with a 56k V.90 modem. In fact, connection quality was so poor I had to limit the modem to 33.6 kbps mode so the connection would at least be stable. Actual download speeds rarely surpassed 4 KB/sec. YouTube was gaining popularity then, to the point that it was purchased by Google at the end of that year.

Up all night to get some bits

Watching any YouTube video on the kind of connection I described above was certainly painful, as you can imagine. Any video that was moderately big would take ages to download. For example, a short 10 MB video would take, if you do the math, 40 minutes to download, making streaming impossible. A longer and higher-quality video would take several hours and render the connection unusable for other purposes while you waited for it to be available, not to mention the possibility of the connection being interrupted and having to start the download process again. Now imagine liking a specific video a lot after watching it and wanting to watch it a second or third time. Going through that process again was almost an act of masochism.

This situation made me interested in the possibility of downloading the videos I was trying to watch: if the video was interesting, having a copy meant I could watch it several times easily. Also, if the downloader was any good, maybe the download process could be resumed if the connection was interrupted, as it frequently was.

At the time, there were other solutions to download videos from YouTube, including a quite popular Greasemonkey script. By pure chance, none of the few I tested were working at the time, so I decided to explore the possibility of creating my own tool. And that is, more or less, how youtube-dl was born. I made it a command-line program so it would be easy for me to use, and wrote it in Python because that was easy thanks to its extensive standard library, with the nice side effect that it would be platform independent.

An Ethereal start

The initial version of the program only worked for YouTube videos. It had almost no internal design whatsoever because it was not needed. It did what it had to do as a simple script that proceeded straight to the point. The line count was merely 223, with only 143 being actual lines of code, 44 comments and 36 blank lines. The name was chosen out of pure convenience: youtube-dl was an obvious name, hard to forget, and it could be intuitively typed as “Y-O-U-TAB” in my terminal.

Having been using Linux for several years at that point, I decided to publish the program under a free software license (MIT for those first versions) just in case someone could find it useful. Back then, GitHub did not exist and we had to “make do” with SourceForge, which had a somewhat tedious form you needed to fill in to create a new project. So, instead of going to SourceForge, I quickly published it on the web space my Internet provider gave me. While not usual today, it was common back then for ISPs to give you an email address and some web space you could upload files to using FTP. That way, you could have your own personal website on the net. The first version ever made public was 2006.08.08, although I had probably been using the program for a few weeks at that point.

To create the program, I studied what the web browser was doing when watching a YouTube video using Firefox. If I recall correctly, Firefox didn’t yet have the development tools it has today to analyze network activity. Connections were mostly plain HTTP, and Wireshark, known as “Ethereal” up to that year, proved invaluable to inspect the network traffic coming in and out of my box when loading a YouTube video. I wrote youtube-dl with the specific goal of doing the same things the web browser was doing to retrieve the video. It even sent out a User-Agent string copied verbatim from Firefox for Linux, as a way to make sure the site would send the program the same version of the video web pages I had used to study what the web browser was doing.

In addition, YouTube used Adobe Flash for the player back then. Videos were served as Flash Video files (FLV), and all this meant a proprietary plugin was required to watch them in the browser (many will remember the dreaded libflashplayer.so library), which would have made any browser development tools useless. This proprietary plugin was a constant source of security advisories and problems. I used a Firefox extension called Flashblock that prevented the plugin from being loaded by default and replaced plugin-based embedded content in web pages with placeholder elements containing a clickable icon, so content would be loaded only on demand and the plugin library was not used unless requested by the user.

Flashblock had two nice side effects apart from making the browsing experience more secure. On the one hand, it removed a lot of noisy and obnoxious ads from many web pages, which could also be a source of security problems when served by third parties. On the other hand, it made it easier to analyze how videos were being downloaded by the video player. I would wait until the video page had finished loading completely and then start logging traffic with Wireshark just before clicking on the embedded video player placeholder icon, allowing it to load. This way, the only traffic to analyze was related to the plugin downloading the video player application and the application itself downloading the video.

It’s also worth noting the Flash Player plugin back then was already downloading a copy of those videos to your hard drive (they were stored in /tmp under Linux) and many users relied on that functionality to keep a copy of them without using additional tools. youtube-dl was simply more convenient because it could retrieve the video title and name the file more appropriately in an automated way, for example.

Ahh, fresh meat!

The Flash Player plugin was eventually modified so videos wouldn’t be so easily available to grab. One of the first measures was to unlink the video file after creating it, so the i-node would still exist and be available to the process using it (until it was closed) while keeping the file invisible from the file system point of view. It was still possible to grab the file by using the /proc file system to examine the file descriptors used by the browser process, but with every one of those small steps youtube-dl became more and more convenient.

Like many free and open source enthusiasts back then, I used Freshmeat to subscribe to new releases of projects I was interested in. When I created youtube-dl, I also created a project entry for it on that website so users could easily get notifications of new releases and a change log listing new features, fixes and improvements. Freshmeat could also be browsed to find new and interesting projects, and its front page contained the latest updates, which usually amounted to only a few dozen a day. My guess is that’s how Joe Barr (rest in peace), an editor for linux.com, found out about the program and decided to write an article about it back in 2006. Linux.com was a bit different then and I think it was one of the frequently visited sites for Linux enthusiasts, together with other classics like Slashdot or Linux Weekly News. At least, it was for me.

From that point on, youtube-dl’s popularity started to grow and, from time to time, I got emails thanking me for creating and maintaining the program.

Measuring buckets of bits

Fast forward to the year 2008. youtube-dl’s popularity had kept growing slowly, and users frequently asked me to create similar programs to download from more sites, a request I had granted a few times. It was at that point that I decided to rewrite the program from scratch and make it support multiple video sites natively. I had some simple ideas that would separate the program internals into several pieces. To simplify, the most important parts were these: one would be the file downloader, common for every website, and another would be the information extractors: objects (classes) containing code specific to a video site. When given a URL or pseudo-URL, the information extractors would be queried to find out which one could handle that type of URL, and then asked to extract information about that video or list of videos, with the primary goal of obtaining the video URL or a list of video URLs with available formats, together with some other metadata such as the video titles.

I also took the chance to switch version control systems and change where the project would be hosted. At that moment, Git was winning the distributed version control systems war for open source projects, but Mercurial also had a lot of users and, having tested both, I decided I liked it a bit more than Git. I started using it for youtube-dl and moved the project to Bitbucket, which was the natural choice. Back then, Bitbucket could only host Mercurial repositories, while GitHub only hosted Git repositories. Both were launched in 2008 and were a breath of fresh air compared to SourceForge. The combination of per-user project namespaces (i.e. the name of your project did not have to be globally unique, only unique among your own projects) with distributed version control systems meant you could publish your personal projects on either of the two sites in a matter of minutes. In any case, migrating the project history to Git and moving the project to GitHub was still a couple of years in the future.

When rewriting the project I should have taken the chance to rename it, no doubt, but I didn’t want to confuse existing users and kept the name in an effort to preserve the little popularity the program had.

The technological context at home also changed a bit that year. Mobile data plans started to gain traction and, at the end of that year, I got myself a 3G modem and data plan that, for the first time, allowed me to browse the web at decent speeds. In any case, that didn’t make me stop using youtube-dl. I was paying 45 euros a month and the monthly data cap was 5GB. Connection speed was finally great but, doing the math, I could only use an average of around 150MB a day, which meant I had to be selective when using the network and avoid big downloads if possible. youtube-dl helped a lot by preventing me from downloading large video files multiple times.

Episode: a new home

Some time later, at the end of 2009, I moved and finally started living with my girlfriend (now my wife and the mother of my two children) in Avilés. For the first time, I started accessing the Internet using the type of connection and service that had been the standard for many of my friends and family for many years. I remember it was a 100/10 Mbps (down/up) cable connection with no monthly cap. That change definitely marked a turning point in how often I used youtube-dl and how much attention I paid to the project.

Not much later, I finally moved it to Git and GitHub, once the market had spoken and both tools were clearly the way to go. YouTube also started experimenting with HTML5 video, even if it wouldn’t become the default option until around 2015. By 2011 I had been working a full-time job as a software engineer for several years and, in general, I was not eager to get home to code a bit more, tuning youtube-dl or implementing the most popular feature request, which I was probably not going to use personally.

In the second half of 2011 I was in the middle of another important personal software project and decided to step down as the youtube-dl maintainer, knowing I hadn’t been up to the task for several months. Philipp Hagemeister had proved to be a great coder and had some pending pull requests in GitHub with several fixes many people were interested in. I gave him commit access to my youtube-dl repo and that’s mostly the end of the story on my side. The project’s Git master branch log shows a continuous stream of commits from me until March 2011, then a jump to August 2011 to merge a fix by Philipp. Since then, there has been a single clerical commit from me, in 2013, to change rg3.github.com to rg3.github.io in the source code, which was needed when GitHub moved user pages from USERNAME.github.com to USERNAME.github.io in order to, if I recall correctly, avoid security problems with malicious user web pages being served from the official github.com domain.

While I was basically no longer involved as a developer of youtube-dl, for years the official project page kept sitting under my username at https://github.com/rg3/youtube-dl and https://rg3.github.io/youtube-dl/. I only had to show up when Philipp or other maintainers asked me to give commit access to additional developers, like Filippo Valsorda at the time or Sergey, one of the current maintainers. Unfortunately, in 2019 we had a small troll problem in the project issue tracker and only project owners were allowed to block users. This finally made us move the project to a GitHub organization to which everyone with commit access was invited (although not everyone joined). The GitHub organization has allowed project maintainers to act more freely without me having to step in for clerical tasks every now and then.

I want to reiterate my most sincere thanks to the different project maintainers over these years, who greatly improved the code, created an actual community of contributors around it and made the project immensely more popular than it was when I stepped down almost 10 years ago, serving the needs of thousands of people along the way.

Offline and free

I’d like to stress one more time that the purpose of youtube-dl as a tool has barely changed over its 14 years of existence. Before and after the RIAA’s DMCA letter was received, many people have explained how they use youtube-dl with different goals in mind.

For me, it has always been about offline access to videos that are already available to the general public online. In a world of mobile networks and always-on Internet connections, you may wonder if that’s really needed. It must be, I guess, if Netflix, Amazon, Disney and HBO have all implemented similar functionality in their extremely popular streaming applications. For long road trips, trips abroad (especially with kids), the underground, airplanes, or places with poor connectivity or metered connections, having offline access to that review, report, podcast, lecture, piece of news or work of art is incredibly convenient.

An additional side effect of youtube-dl is online access when the default online interface is not up to the task. The old proprietary Flash plugin was not available for every platform and architecture. Nowadays, web browsers can play video but may sometimes not take advantage of efficient GPU decoding when it’s available, wasting large amounts of battery power along the way. youtube-dl can be combined with a native video player to make playing some videos possible and/or efficient. For example, mpv includes native youtube-dl support. You only need to feed it a supported video site URL and it will use youtube-dl to access the video stream and play it without storing anything on your hard drive.

The default online interface may also lack accessibility features, may make content navigation hard for some people or lack color blind filters that, again, may be available from a native video player application.

Last, but not least, tools like youtube-dl allow people to access online videos using only free software. I know there are not many free, libre and open source software purists out there. I don’t even consider myself one, by a long shot. Proprietary software is ever-present in our modern lives, served to us every day in the form of vast amounts of JavaScript code for our web browsers to run, with many different and varied purposes and not always in the best interest of users. GDPR, with all its flaws and problems, is a testament to that. Accessing online videos using youtube-dl may give you a peace of mind that incognito mode, uBlock Origin or Privacy Badger can only barely approach.

My participation in XDC 2020

Posted on . Updated on .

Filed under: igalia

The 2020 X.Org Developers Conference took place from September 16th to September 18th. For the first time, due to the ongoing COVID-19 pandemic, it was a fully virtual event. While this meant that some interesting bits of the conference, like the hallway track, catching up in person with some people and doing some networking, were not entirely possible this time, I have to thank the organizers for their work in making the conference an almost flawless event. The conference was livestreamed directly to YouTube, which was the main way for attendees to watch the many different talks. freenode was used for the hallway track, with most discussions happening in the ##xdc2020 IRC channel. In addition, ##xdc2020-QA was used by attendees wanting to add questions or comments at the end of each talk.

Igalia was a silver sponsor of the event and we also participated with 5 different talks, including one by yours truly.

My talk about VK_EXT_extended_dynamic_state was based on my previous blog post, but it includes a more detailed explanation of the extension as well as notes on how the extension was created. I took advantage of the possibility of using pre-recorded videos for the conference, as I didn’t fully trust my kids not to interrupt me in the middle of the talk. In the end I think it was a good idea and, from the presenter’s point of view, I also found that using a script and following it strictly (to some degree) prevented distractions and made the talk a bit shorter and more to the point, because I tend to beat around the bush when talking live. You can watch my talk in the embedded video below.

Slides for the talk are also available and below you can find a transcript of the talk.

<Title slide>

Hello, my name is Ricardo García, I work at Igalia as part of its Graphics team and today I will be talking about the extended dynamic state Vulkan extension. At Igalia I was involved in creating CTS tests for this extension and also in reviewing the spec when writing those tests, in a very minor capacity. This extension is pretty simple and very useful, and the talk is divided into two parts. First I will talk about the extension itself and then I’ll reflect on a few bits about how this extension was created that I consider quite interesting.

<Part 1>

<Extension description slide>

So, first, what does this extension do? Its documentation says:

VK_EXT_extended_dynamic_state adds some more dynamic state to support applications that need to reduce the number of pipeline state objects they compile and bind.

In other words, as you will see, it makes Vulkan pipeline objects more flexible and easier to use from the application point of view.

<Pipeline diagram slide>

So, to give you some context, this is [the] typical graphics pipeline representation in many APIs like OpenGL, DirectX or Vulkan. You’ve probably seen variations of this a million times. The pipeline is divided in stages, some of them fixed-function, some of them programmable with shaders. Each stage usually takes some data from the previous stage and produces data to be consumed by the next one, apart from using other external resources like buffers or textures or whatever. What’s the Vulkan approach to represent this process?

<Creation structure slide>

Vulkan wants you to specify almost every single aspect of the previous pipeline in advance by creating a graphics pipeline object that contains information about how every stage should work. And, once created, most of these pipeline parameters or configuration cannot be changed. As you can see here, this includes shader programs, how vertices are read and processed, depth and stencil tests, you name it. Pipeline objects are heavy objects in Vulkan and they are hard to create. Why does Vulkan want you to do that? The answer has always been this keyword: “optimization”. Giving all the information in advance gives more chances for every current or even future implementations to optimize how the pipeline works. It’s the safe choice. And, despite this, you can see there’s a pipeline creation parameter with information about dynamic state. These are things that can be changed when using the pipeline without having to create a separate and almost identical pipeline object.

<New dynamic states slide>

What the extension does should be pretty obvious now: it adds a bunch of additional elements that can be changed on the fly without creating additional pipelines. This includes things like primitive topology, front face vertex order, vertex stride, cull mode and more aspects of the depth and stencil tests, etc. A lot of things. Using them if needed means fewer pipeline objects, fewer pipeline cache accesses and simpler programs in general. As I said before, it makes Vulkan pipeline objects more flexible and easier to use from the application point of view, because more pipeline aspects can be changed on the fly when using these pipeline objects instead of having to create separate objects for each combination of parameters you may want to modify at runtime. This may make the application logic simpler and it can also help when Vulkan is used as the backend, for example, to implement higher level APIs that are not so rigid regarding pipelines. I know this extension is useful for some emulators and other API-translating projects.

<New commands slide>

Together with those it also introduces a new set of functions to change those parameters on the fly when recording commands that will use the pipeline state object.

<Pipeline diagram slide>

So, knowing that and going back to the graphics pipeline, the obvious question is: does this impact performance? Aren’t we reducing the number of optimization opportunities the implementation has if we use these additional dynamic states? In theory, yes. In practice, it depends on the implementation. Many GPUs and Vulkan drivers out there today have some pipeline aspects that are considered “dynamic” in the sense that they are easily changed on the fly without a perceptible impact in performance, while others are truly important for optimization. For example, take shaders. In Vulkan they’re provided as SPIR-V programs that need to be translated to GPU machine code and creating pipelines when the application starts makes it easy to compile shaders beforehand to avoid stuttering and frame timing issues later, for example. And not only that. As you create pipelines, you’re telling the implementation which shaders are used together. Say you have a vertex shader that outputs 4 parameters, and it’s used in a pipeline with a fragment shader that only uses the first 2. When creating the pipeline the implementation can decide to discard instructions that are only related to producing the 2 extra unused parameters in the vertex shader. But other things like, for example, changing the front face? That may be trivial without affecting performance.

<Part 2>

<Eric Lengyel tweet slide>

Moving on to the second part, I wanted to talk about how this extension was created. It all started with an “angry” tweet by Eric Lengyel (sorry if I’m not pronouncing it correctly) who also happens to be the author of the previous diagram. He complained on Twitter that you couldn’t change the front face dynamically, which happens to be super useful for rendering reflections, and pointed to an OpenGL NVIDIA extension that allowed you to do exactly that.

<Piers Daniell reply slide>

This was noticed by Piers Daniell from NVIDIA, who created a proposal in Khronos. That proposal was discussed with other vendors (software and hardware) that chimed in on aspects that could be or should be made dynamic if possible, which resulted in the multi-vendor extension we have today.

<RADV implementation slide>

In fact, RADV was one of the first Vulkan implementations to support the extension thanks to the effort by Samuel Pitoiset.

<Promoters of Khronos slide>

This whole process got me thinking Khronos may sometimes be seen from the outside as this closed silo composed mainly of hardware vendors. Certainly, there are a lot of hardware vendors but if you take the list of promoter members you can see some fairly well-known software vendors as well, and API usability and adoption are important for both groups. There are many people in Khronos trying to make Vulkan easier to use even if we’re all aware that’s somewhat in conflict with providing a lower level API that should let you write performant applications.

<Khronos Contributors slide>

If you take a look at the long list of contributor members, that’s only shown partially here because it’s very long, you’ll notice a lot of actors from different backgrounds as well.

<Vulkan-Docs repo slide>

Moreover, while Khronos and its different Vulkan working groups are far from an open source project or community, I believe they’re certainly more open to contributions than many people think. For example, the Vulkan spec is published in a GitHub repo with instructions to build it (the spec is written in AsciiDoc) and this repo is open for issues and pull requests. So, obviously, if you want to change major parts of Vulkan and how some aspects of the API work, you’re going to meet opposition and maybe you should be joining Khronos to discuss things internally with everyone involved in there. However, while an angry tweet was enough for this particular extension, if you’re not well-known you may want to create an issue instead, exposing your use case and maybe with other colleagues chiming in on details or supporting your proposal. I know for a fact issues created in this public repo are discussed in periodic Khronos meetings. It may take some weeks if people are busy and there’s a lot of things on the table, but they’re going to end up being discussed, which is a very good thing I was happy to see, and I want to put emphasis on that. I would like Khronos to continue doing that and I would like more people to take advantage of the public repos from Khronos. I know the people involved in the Vulkan spec want to make the text as clear as possible. Maybe you think some paragraph is confusing, or there’s a missing link to another section that provides more context, or something absurd is allowed by the spec and should be forbidden. You can try a reasoned pull request for any of those. Obviously, no guarantees it will go in, but interesting in any case.

<Blend state tweet slide>

For example, in the Twitter thread I showed before, I tweeted a reply when the extension was published and, among a few retweets, likes and quoted replies I found this very interesting Tweet I’m showing you here, asking for the whole blend state to be made dynamic and indicating that would be game-changing for some developers and very interesting for web browsers. We all want our web browsers to leverage the power of the GPU as much as possible, right? So why not? I thought creating an issue in the public repo for this case could be interesting.

<Dynamic blend state issue slide>

And, in fact, it turns out someone had already created an issue about it, as you can see here.

<Tom Olson reply slide>

And in this case, in this issue, Tom Olson from ARM replied that the working group had been discussing it and it turns out in this particular case existing hardware doesn’t make it easy to make the blend state fully dynamic without possibly recompiling shaders under the hood and introducing unwanted complexity in the implementations, so it was rejected for now. But even if, in this case, the reply is negative, you can see what I was mentioning: the issue reached the working group, it was considered, discussed and the issue creator got a reply and feedback. And that’s what I wanted to show you.

<Final slide>

And that’s all. Thanks for listening! Any questions maybe?

The talk was followed by a Q&A section moderated, in this case, by Martin Peres. In the text below RG stands for Ricardo Garcia and MP stands for Martin Peres.

RG: OK…​ Hello everyone!

MP: OK, so far we do not have any questions. Jason Ekstrand has a comment: "We (the Vulkan Working Group) have had many contributions to the spec".

RG: Yeah, yeah, exactly. I mean, I don’t think it’s very well known but yeah, indeed, there are a lot of people who have already contributed issues, pull requests and there have been many external contributions already so these things should definitely continue and even happen more often.

MP: OK, I’m gonna ask a question. So…​ how much do you think this is gonna help layering libraries like Zink because I assume, I mean, one of the big issues with Zink is that you need to have a lot of pipelines precompiled and…​ is this helping Zink?

RG: I don’t know if it’s being used. I think I did a search yesterday to see if Zink was using the extension and I don’t remember if I found anything specific so maybe the Zink people can answer the question but, yeah, it should definitely help in those cases because OpenGL is not as strict as Vulkan regarding pipelines obviously. You can change more things on the fly and if the underlying Vulkan implementation supports extended dynamic state it should make it easier to emulate OpenGL on top of Vulkan. For example, I know it’s being used by VKD3D right now to emulate DirectX 12 and there’s a emulator, a few emulators out there which are using the extension because, you know, APIs for consoles are different and they can use this type of extensions to make code better.

MP: Agree. Jason also has another comment saying there are even extensions in flight from the Mesa community for some windowing-system related stuff.

RG: Yeah, I was happy to see yesterday…​ I think it was yesterday, well, here at this XDC that the present timing extension pull request is being handled right now on GitHub which I think is a very good thing. It’s a trend I would like to [see] continue because, well, I guess sometimes, you know, the discussions inside the Working Group and inside Khronos may involve IP or whatever so it’s better to have those discussions sometimes in private, but it is a good thing that maybe, you know, there are a few extensions that could be handled publicly in GitHub instead of the internal tools in Khronos. So, yeah, that’s a good thing and a trend I would like to see continue: extensions discussed in public.

MP: Yeah, sounds very cool. OK, I think we do not have any question…​ other questions or comments so let’s say thank you very much and…​

RG: Thank you very much and let me congratulate you for…​ to the organizers for organizing XDC and…​ everyone, enjoy the rest of the day, thank you.

MP: Thank you! See you in 13m 30s for the status of freedesktop.org’s GitLab cloud hosting.

Regarding Zink, at the time I’m writing this, there’s an in-progress merge request for it to take advantage of the extension. Regarding the present timing extension, its pull request is on GitHub and you can also watch a short talk from Day One of the conference. I also mentioned the extension being used by VKD3D. I was specifically referring to the VKD3D-Proton fork.


VK_EXT_extended_dynamic_state released for Vulkan

Posted on . Updated on .

Filed under: igalia

A few days ago, the VK_EXT_extended_dynamic_state extension for Vulkan was released and included for the first time as part of Vulkan 1.2.145. This is a pretty interesting extension that makes Vulkan pipelines more flexible and practical for many use cases. At Igalia, I was involved in getting this extension out the door as the author of its VK-GL-CTS tests and, in a very minor capacity, by reviewing the spec text and contributing a couple of small fixes to it.

Vulkan pipelines

The purpose of this Vulkan extension is to make Vulkan pipelines less rigid by allowing certain values to be set dynamically when you use the pipeline, instead of being set in stone when the pipeline is created. For those less familiar with Vulkan, pipelines are among the “heaviest” objects in the API. Vulkan typically has compute and graphics pipelines. For this extension, we’ll be talking about graphics pipelines. A pipeline object, when created, contains a lot of information about what the GPU needs to do when rendering a scene or part of a scene: how triangle vertices need to be read from memory, the number of textures, buffers and images that will be used, parameters for color blending operations, depth and stencil tests, multisample antialiasing, viewports, etc.
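
To give an idea of how much is baked in at creation time, here is a partial C sketch of what filling the graphics pipeline creation structure looks like. It is not a complete program: the shader stages, pipeline layout, render pass and every sub-structure referenced below are assumed to have been prepared elsewhere.

/* Partial sketch: almost every aspect of the pipeline is declared up front. */
VkGraphicsPipelineCreateInfo pipelineInfo = {
    .sType               = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
    .stageCount          = 2,                   /* Vertex and fragment shaders. */
    .pStages             = shaderStages,        /* Built from SPIR-V modules. */
    .pVertexInputState   = &vertexInputState,   /* How vertices are read from memory. */
    .pInputAssemblyState = &inputAssemblyState, /* Primitive topology. */
    .pViewportState      = &viewportState,      /* Viewports and scissors. */
    .pRasterizationState = &rasterizationState, /* Cull mode, front face, polygon mode. */
    .pMultisampleState   = &multisampleState,   /* Multisample antialiasing. */
    .pDepthStencilState  = &depthStencilState,  /* Depth and stencil tests. */
    .pColorBlendState    = &colorBlendState,    /* Color blending parameters. */
    .pDynamicState       = &dynamicState,       /* The few aspects allowed to change later. */
    .layout              = pipelineLayout,
    .renderPass          = renderPass,
    .subpass             = 0,
};
VkPipeline pipeline;
vkCreateGraphicsPipelines(device, VK_NULL_HANDLE, 1, &pipelineInfo, NULL, &pipeline);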

Vulkan, being a low-overhead API that tries to help you squeeze as much performance as possible out of a GPU, wants you to specify all that information in advance so implementations (GPU plus driver) have better chances of optimizing the process, both at pipeline creation time and at runtime. Every time you “bind a pipeline” (i.e. set it as the active pipeline for future commands) you’re telling the implementation how everything should work, and that is usually followed by commands telling the GPU to draw lots of geometry using the previous parameters.

Creating a pipeline may also involve compiling shaders to native GPU instructions. Shaders are “small” programs that run on the GPU when the rendering process reaches a programmable stage. When a GPU is drawing anything, the drawing process is divided in stages. Each stage takes a number of inputs both directly from the previous stage and as external resources (buffers, textures, etc), and produces a number of outputs to be directly consumed by the next stage or as side effects in external resources. Some of those stages are fixed and some are programmable with user-provided shader programs. When these shaders are not so small, compiling and optimizing them to native GPU instructions takes some time. Usually not a very long time, but every millisecond counts when you only have 16 of them to draw the next frame in order to achieve 60 frames per second. Stuff like this is what drove the creation of the ACO shader compiler for the Mesa RADV driver and it’s also why some drivers hash shader contents and use a shader cache to check if that exact shader has been compiled before. It’s also why Vulkan wants you to create pipelines in advance if possible. Otherwise, if you realize you need a new pipeline in the middle of preparing the next frame in an action game, the pipeline creation process may make the game stutter at that point due to the extra processing time needed.

Vulkan gives you several possibilities to alleviate the problem. You can create every pipeline you may need in advance. This is one of the most effective approaches but may involve a good number of pipelines due to the different possible combinations of pipeline parameters you may want to use. Say you want to vary 7 different parameters independently from each other with two possible values each. That means you have to create 128 different pipelines and manage them in your application. Another option is using a pipeline cache that will speed up creation of pipelines identical or similar to other ones created in the past. This lets you focus only on the pipeline variants you need at a given point in time. Finally, Vulkan gives you the possibility of changing a few pipeline parameters on the fly instead of giving them fixed values at pipeline creation time. This is the dynamic state inside the pipeline.
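
As a small illustration of the pipeline cache option, the sketch below shows the basic C calls involved. The device handle and the pipelineInfo structure are assumed to exist; a real application would typically also save the cache contents to disk between runs with vkGetPipelineCacheData and feed them back through pInitialData.

/* Create an initially empty pipeline cache. It could also be pre-seeded with
 * data saved from a previous run of the application. */
VkPipelineCacheCreateInfo cacheInfo = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_CACHE_CREATE_INFO,
};
VkPipelineCache cache;
vkCreatePipelineCache(device, &cacheInfo, NULL, &cache);

/* Passing the cache here lets the implementation reuse work done for identical
 * or similar pipelines created earlier. */
VkPipeline pipeline;
vkCreateGraphicsPipelines(device, cache, 1, &pipelineInfo, NULL, &pipeline);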

Dynamic state and VK_EXT_extended_dynamic_state

Dynamic state helps on top of everything I mentioned before. It makes your application logic easier by not having to deal with so many different variations, and it reduces the total number of times you may have to create a new pipeline, which may decrease initialization time, pipeline cache sizes and accesses, state changes and game stuttering. VK_EXT_extended_dynamic_state, when available and as its name implies, extends the number of pipeline elements that can be part of that dynamic state. It adds states like the cull mode, front face, primitive topology, viewport with count, scissor with count (previously, viewports and scissors could be changed dynamically, but not their counts), vertex input binding stride, depth test enablement and writes, depth comparison operation, depth bounds test enablement, and stencil test enablement and operations. That’s a pretty large set of new dynamic elements.
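
As a rough sketch of what this looks like in practice (assuming the extension has been enabled on the device and, depending on how you load Vulkan, that the vkCmdSet*EXT entry points have been retrieved with vkGetDeviceProcAddr), the pipeline declares the extra dynamic states when it is created and the actual values are set later while recording commands:

/* At pipeline creation time: declare which aspects will be provided dynamically. */
const VkDynamicState extraDynamicStates[] = {
    VK_DYNAMIC_STATE_CULL_MODE_EXT,
    VK_DYNAMIC_STATE_FRONT_FACE_EXT,
    VK_DYNAMIC_STATE_PRIMITIVE_TOPOLOGY_EXT,
    VK_DYNAMIC_STATE_DEPTH_TEST_ENABLE_EXT,
};
VkPipelineDynamicStateCreateInfo dynamicState = {
    .sType             = VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO,
    .dynamicStateCount = sizeof(extraDynamicStates) / sizeof(extraDynamicStates[0]),
    .pDynamicStates    = extraDynamicStates,
};

/* Later, while recording the command buffer: set the actual values on the fly
 * instead of baking them into separate pipeline objects. */
vkCmdSetCullModeEXT(cmdBuffer, VK_CULL_MODE_BACK_BIT);
vkCmdSetFrontFaceEXT(cmdBuffer, VK_FRONT_FACE_CLOCKWISE); /* E.g. flipped to render a reflection. */
vkCmdSetPrimitiveTopologyEXT(cmdBuffer, VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST);
vkCmdSetDepthTestEnableEXT(cmdBuffer, VK_TRUE);

The dynamicState structure above is the one referenced from the pDynamicState member of the graphics pipeline creation parameters.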

The obvious question that follows is whether using so many dynamic elements decreases performance, in the sense that it may reduce the optimization opportunities the implementation has because some details about the pipeline are not known in advance. The answer is that this really depends on the implementation. For example, in some implementations the cull mode or front face may be set in a register right before drawing operations, and there’s no practical difference between setting them when the pipeline is bound and setting them dynamically before a large set of drawing commands.

I’ve measured the impact of enabling every new dynamic state in a simple GPU-bound Vulkan program that displays a rotating model on screen and I haven’t noticed any performance impact with the NVIDIA proprietary driver and a GTX 1070 card, but your mileage may vary. As usual, measure before deploying.

VK_EXT_extended_dynamic_state can also help when Vulkan is used as the backend to implement other higher level APIs which are not as rigid as Vulkan itself and in which some drawing parameters can be changed on the fly, being up to the driver to implement those changes as efficiently as possible. We’re talking about OpenGL, or DirectX up to version 11. As you can imagine, it’s an interesting extension for projects like DXVK and it can help improve the state of Linux gaming through Wine and Proton.

Origins of VK_EXT_extended_dynamic_state

The story about how this extension came to be is also interesting. It all started as a reaction to an “angry” tweet by Eric Lengyel in which he lamented that he had to create two separate pipelines just to change the front face or winding order of triangles when rendering a reflection. That prompted Piers Daniell from NVIDIA to start a multivendor effort inside Khronos that resulted in VK_EXT_extended_dynamic_state. As you can read in the extension summary, several companies were involved: AMD, Arm, Broadcom, Google, Imagination, Intel, NVIDIA, and Valve.

For that reason, this extension is also one of the many success stories from the Khronos Group, a forum in which hardware and software vendors, big and small, participate designing and standardizing cross-platform solutions for the graphics industry. Many different points of view are taken into account when designing those solutions. If you look at the member list you’ll see plenty of known logos from hardware manufacturers and software developers, including companies making widely available game engines.

In this case an angry tweet was enough to spark an effort, but that’s not the ideal situation. You can propose specification improvements, extensions or new ideas using the Vulkan Docs repository. An issue could be enough and, for small changes, a pull request can be even better.