Gaming, anti-aliasing, lights and performance

Posted on . Updated on .

It’s been a few months since my last game review and I wanted to write about the latest batch of games I’ve been playing and how they use different anti-aliasing and lighting techniques to improve the way they look on screen.

Doom

A few months ago I played Doom, the 2016 reboot. It was one of the most praised games that year and I can certainly see its virtues. However, I think gamers in general, and PC gamers in particular, were blinded by its brilliant technology and performance across different systems and forgot a bit about the somewhat basic gameplay.

To me, Doom is a solid 8.5, but it’s far from deserving a GOTY award. It’s fun, it’s replayable and it’s divided into maps that are themselves divided, for the most part, into arenas where you fight hordes of monsters. This simple concept, coupled with a simple plot, makes it easy to enjoy the game in short sessions, clearing arena after arena. For obsessive-compulsive completionists like me, this division makes the game less addictive, which is arguably a good thing compared to other games where there’s always a pending secondary quest, a question mark on the map or a new item to get. Doom provides great replay value and a quite difficult ultra nightmare mode with permadeath. I intended to complete the game in that mode but, despite making steady progress in my playthroughs, I dropped out because it was going to take too long and I wasn’t enjoying the ride.

How does Doom manage to run so well compared to other games? The underlying technology is fantastic, but I’d add that it’s not an open-world game and it doesn’t have vegetation or dozens of NPCs to control. Maps are full of small rooms and corridors, and even the semi-open spaces are relatively simple. In-game 3D objects don’t abuse polygon counts; they rely on good textures and other effects to look good. Textures themselves are detailed but not excessively so, and they’re used wisely: interactive elements get more detailed textures while ornamental items get simpler ones. In general, it performs well for the same reasons Prey, for example, also performs well.

Doom offers several interesting anti-aliasing options, many of them cheap and effective for this specific game, including several with a temporal anti-aliasing (TAA) component. TAA offers the best-quality anti-aliasing in modern games, as of the time I’m writing this, if used well. It can be cheap and very effective, but the drawbacks may include ghosting and an excessively blurred image.

I experienced TAA extensively for the first time when I played Fallout 4 some months ago (I reviewed it here). In this game, the ghosting effect of TAA is very noticeable the first time you’re out in the open and move among trees but, still, it’s a small sacrifice to make compared to a screen full of visible “jaggies”. I believe TAA is the best option when playing Fallout 4 despite the ghosting and the increased image blurriness.

In Doom, it may or may not be the best option. The lack of vegetation and trees and its relatively simple geometry mean the game doesn’t look bad at all without any temporal component, if you don’t mind a few jaggies here and there, but I still recommend enabling some form of TAA if you can. If you find the image too blurry, try to compensate with the in-game sharpening setting. Ghosting is barely noticeable. It happens when enemies move very quickly right in front of the camera and, most notably, when Samuel Hayden appears on screen, around his whole body and, in particular, his fingers and legs. Disabling all temporal components is the only way to see Hayden as he was intended to look. The ghosting effect around him is so visible that, the first time I saw it, I thought it was a deliberate, if weird, artistic effect. Fortunately, both ghosting situations happen once in a blue moon, which is why I still recommend TAA for this game.

Batman: Arkham Knight

I also played Arkham Knight and it’s a good game: plenty of challenges for a completionist, but a bit repetitive. As I said when I reviewed Arkham Origins, I still think Asylum is the best in the series. Arkham Knight features graphical improvements, a darker plot that brings the game closer to Nolan’s film trilogy, and too many gadgets. The number of key combinations and enemy types reaches an overwhelming level, and you have to add the Batmobile on top of that. Don’t get me wrong, it’s a solid 8, and its performance problems at launch are mostly moot now thanks to software changes and the arrival of the GeForce 10 series, which handles the game reasonably well. However, it’s not a game I see myself replaying in the future.

Arkham Knight’s urban environments do not need temporal anti-aliasing that much, which is good because the game doesn’t have that option and still manages to look reasonably good. The game suffers from small performance problems when you enable every graphics option and ride in the Batmobile or glide high above the city, but frame rate drops are not severe.

Rise of the Tomb Raider

Finally, a few days ago I finished playing Rise of the Tomb Raider. I liked it the same way I liked the 2013 reboot, with a score of 8 or 8.5, although other reviewers have been more positive. Some elements have been improved and others have been made worse. The plot, atmosphere and characters are better. Keyboard and mouse controls have received more attention, and many game mechanics have been expanded without making them too complicated. On the other hand, completing the game to 100% is now harder but not more fun. The series is starting to join the trend of adding more collectibles and things to find just to make the player spend more time playing, without actually making the game more fun or rewarding. Still, I completed the game to 100%.

With my GTX 1070 on a 1080p60 monitor the game runs perfectly fine with everything turned up to 11, except when it does not. There are a few places in the game where the frame rate tanks, sometimes for no obvious reason. One of the most noticeable places I remember was an underground tunnel with relatively simple geometry where I was able to make it drop to 42 FPS.

The way to fix that is very simple. The first thing to go is VXAO, an ambient occlusion technique. Dropping it to HBAO+ or simply “On” gives back a good number of frames. In general, ambient occlusion improves the overall look of the image. It’s heavily promoted by NVIDIA but, in my opinion, the difference between its basic forms and the most expensive ones doesn’t have a dramatic impact on image quality. If you have the power to run a game maxed out, be my guest. If you’re trying to find a compromise, ambient occlusion should be one of the first things to go.

To get back more frames you can also turn shadow quality down a notch or two. In Rise of the Tomb Raider, the difference between “very high” and “high” is noticeable but not dramatic, and going down to “high” avoided many frame rate drops for me.

As opposed to ambient occlusion, I personally find shadow quality to have a very perceptible visual impact in most games. Reducing shadow quality normally means shadows look blockier and weirder and, if they move, they tend to do so in leaps instead of smoothly. I distinctly remember playing Max Payne 3 some years ago (that uninterruptible movie with some interactive segments interleaved, if you recall), and the main game menus featured a close-up of Max Payne with visible jaggy shadows across his cheek that drove me nuts (why would game developers choose to have that shadow there in such a controlled situation?).

Contrary to both previous games, Rise of the Tomb Raider features lots of vegetation, trees, small details and shiny surfaces, but it doesn’t offer a temporal anti-aliasing option. The result is a picture full of jaggies at times that, no doubt, will bother a few gamers.

In general, recent games tend to include more geometry, vegetation and lighting details that exacerbate every aliasing problem. At the same time, they have more detailed textures that you don’t want to blur, especially when you’re looking at something up close. This is why, in many recent games, FXAA, the poster child of cheap and effective post-processing anti-aliasing, is a bad option. It’s very cheap, but it doesn’t do a very good job and it blurs the image a lot. If you’re playing, let’s say, Fallout 3 (a 2008 title, time flies!), FXAA is an excellent choice. The relatively simple geometry, compared to today’s games, makes FXAA effective at removing jaggies, and the lack of detail means textures don’t look blurrier than usual with it.

Moving forward in time, Crysis 3 was one of the first mainstream games to feature SMAA prominently, another form of post-processing anti-aliasing that is very fast on modern cards and improved the situation a bit. It attempts to fix jaggies like FXAA does, but without blurring textures. It’s very cheap, although it does cost a bit more than FXAA and is not as easily injected from outside the game (FXAA can generally be activated from NVIDIA’s control panel and used to improve the look of many old games that don’t support it directly). SMAA did a good job for many years and was my preferred choice for a long time. I still choose it depending on the game.

These days, omitting a TAA option can be a significant mistake for certain titles like Rise of the Tomb Raider. In contrast, I’ve just started playing Mankind Divided, which offers a TAA option, and graphically it’s quite impressive (except for the way people look, a bit DeusEx-y if you know what I mean). Its developers did make some incomprehensible decisions when choosing settings for the predefined quality levels. In my opinion, you should fine-tune most parameters by hand, reading some online guides and experimenting. They also included a very visible option to turn MSAA on without any warning about its performance cost.

In any case, TAA in Mankind Divided is complemented by an optional sharpening filter that prevents excessive blurring if you choose to activate it. Granted, the game doesn’t feature much vegetation, but it looks remarkably good. Both features complement each other rather well and give you a sharp image mostly free of notable jaggies.

The cost of TAA varies with the specific implementation. In general, it costs more than FXAA and SMAA but is very far from costing as much as other less-effective techniques like MSAA.

Roadmap for libbcrypt

Posted on . Updated on .

libbcrypt (previously just called “bcrypt”) is a small project I created some time ago when I was looking for a C or C++ library implementing the bcrypt password hashing algorithm and found no obvious choices that met my criteria. The most commonly referenced implementation is the one provided by Solar Designer, but it lacks good documentation and you need to dive into the source code to find out which headers to include and which functions to use from the whole package. So I did just that and built a small, better-documented wrapper around it, just for fun. The goal was to lower the barrier to entry for anyone wanting to use bcrypt in C or C++. Python, Ruby, Go and JavaScript have other implementations available.

What I did was simple: I wrote a well-documented header file with a minimal set of functions, inspired by one of the Python implementations, plus a single implementation file that, combined with Solar Designer’s code, produces a static library you can link into your programs. Later, I fixed some silly coding mistakes I had made, despite its small size, and forgot about it.
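To give an idea of the kind of interface I’m talking about, here’s a minimal sketch of such a wrapper header. The names and signatures are illustrative only; they don’t necessarily match the library’s actual API:

    /* Illustrative wrapper interface, not necessarily the real libbcrypt header. */
    #ifndef BCRYPT_SKETCH_H
    #define BCRYPT_SKETCH_H

    #define BCRYPT_HASHSIZE 64  /* room for the salt or hash string plus the final NUL */

    /* Generate a random salt with the given work factor (e.g. 12).
       Returns 0 on success, nonzero on error. */
    int bcrypt_gensalt(int work_factor, char salt[BCRYPT_HASHSIZE]);

    /* Hash a NUL-terminated password with the given salt.
       Returns 0 on success, nonzero on error. */
    int bcrypt_hashpw(const char *password, const char salt[BCRYPT_HASHSIZE],
                      char hash[BCRYPT_HASHSIZE]);

    /* Check a password against a previously generated hash.
       Returns 0 if they match, nonzero otherwise. */
    int bcrypt_checkpw(const char *password, const char *hash);

    #endif /* BCRYPT_SKETCH_H */

The whole point is that a user can read this one header, call three functions and never have to look at the underlying crypt_blowfish sources.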

The project, as of the time I’m writing this, has 95 stars and 35 forks on GitHub (not many, but more than others) and not long ago I realized it’s one of the first Google search results when looking for a “bcrypt library”. So it seems my small experiment has been promoted and I have to honor a social contract!

In the last couple of weeks, I’ve been spending a few minutes almost every day polishing the library, improving its documentation, reading other people’s code and documentation, and adding some functionality. You can take a look at the results in the project’s future branch. Summary of changes from master:

  • The main implementation has been changed from a static to a dynamic library, so the implementation can be updated if a problem is found without recompiling the programs that use it. I use -fvisibility=hidden to hide internal symbols and speed up dynamic linking. A static library is also provided, just in case you need it.

  • The function to generate salts has been changed from reading /dev/urandom to calling getentropy (see the sketch after this list). That means the library will probably only compile on a modern Linux and maybe OpenBSD, and this is the main reason these changes are still not merged into the master branch. No disrespect to the BSDs, but let’s be practical: Linux is the most widely used Unix-like system for servers, and getentropy, introduced by OpenBSD, is simply better than /dev/urandom because it’s simpler and safer to use, can be used inside a chroot, etc. With Linux now implementing it, there are not many reasons to use anything else.

  • I have added a manpage documenting everything better and emphasizing the implementation’s 72-byte limit on password length.

  • I have added functions and documentation explaining the rationale for pre-hashing passwords before sending them to bcrypt, which partially works around the previous limitation (also illustrated in the sketch after this list).

  • There’s now an install target.

  • I have added a pkg-config file to ease finding out compilation and linkage flags in the installed library.

  • Tests have been moved into their own file and made a bit more practical.
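To illustrate the getentropy and pre-hashing points above, here’s a minimal, self-contained sketch. It assumes glibc >= 2.25 (for getentropy) and OpenSSL (for SHA-256), and the bcrypt_hashpw name mentioned in the final comment is the hypothetical one from the header sketch above, not necessarily the library’s real API:

    /* prehash_sketch.c -- compile with something like: gcc prehash_sketch.c -lcrypto
       Assumes glibc >= 2.25 (getentropy) and OpenSSL (SHA256). Illustrative only. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>       /* getentropy() on glibc >= 2.25 and OpenBSD */
    #include <openssl/sha.h>  /* SHA256() */

    /* Fill a buffer with random bytes suitable for a salt.
       getentropy() is limited to 256 bytes per call, which is plenty here. */
    static int random_salt(unsigned char *buf, size_t len)
    {
        return getentropy(buf, len);  /* 0 on success, -1 on error */
    }

    /* Work around bcrypt's 72-byte password limit by pre-hashing: SHA-256 the
       password, then hex-encode the digest so the result is a fixed-length,
       NUL-free 64-character string that fits comfortably under 72 bytes. */
    static void prehash_password(const char *password, char out_hex[65])
    {
        unsigned char digest[SHA256_DIGEST_LENGTH];
        SHA256((const unsigned char *)password, strlen(password), digest);
        for (int i = 0; i < SHA256_DIGEST_LENGTH; ++i)
            sprintf(out_hex + 2 * i, "%02x", digest[i]);
    }

    int main(void)
    {
        unsigned char salt[16];
        char hexed[65];

        if (random_salt(salt, sizeof salt) != 0) {
            perror("getentropy");
            return 1;
        }
        prehash_password("a password that could be longer than 72 bytes...", hexed);

        /* The 64-character hex string would now be passed to the bcrypt wrapper,
           e.g. bcrypt_hashpw(hexed, generated_salt, hash); */
        printf("pre-hashed password: %s\n", hexed);
        return 0;
    }

Hex encoding is used here instead of the raw digest on purpose: bcrypt implementations treat the password as a NUL-terminated string, so raw digest bytes containing zeros would silently truncate it.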

I’ll merge these changes into master after things calm down for a while, after I process any feedback I receive (if any), and after Red Hat, Debian and Ubuntu have a stable or long-term support release with glibc >= 2.25, which introduced getentropy. The next Ubuntu LTS will have it, the next Debian stable will have it and RHEL 8 will probably have it.

I may also try to package the library for Fedora, which eventually should make it land in Red Hat. I’m not a Fedora packager yet and this may be a good use case to try to become one and learn about the process.

If anyone’s interested in the project, please take a look at the future branch, comment here, open issues on GitHub, mail me, etc. Any feedback is appreciated because this was just a small experiment and I’m not a user of my own library. Also, I don’t recall ever publishing a shared library before, so if I’m doing something wrong and you have experience with that, feedback is appreciated too.

What about Argon2?

Argon2 is another password hashing algorithm, the winner of the Password Hashing Competition in 2015. The competition sought an algorithm that would be better than bcrypt and scrypt. Its official implementation is released under the terms of the CC0 license, it works under Linux and many other platforms, and it builds shared and static libraries featuring both high-level and low-level functions. In other words, Argon2 already has a pretty good official, easy-to-use API and implementation.

In my opinion, if you want to use Argon2, you should be using its official library or libsodium. The latter has packages for most distributions and systems. Argon2 is the best option if you want to move away from bcrypt, but there is no need to do it as of the time I’m writing this. The benefits are mostly mathematical and theoretical. Argon2 is much better, but bcrypt is still very secure.

Argon2 has three parameters that control the amount of time, memory and threads used to hash the password, as well as a command-line tool to experiment and find good values for those parameters in your use case. The libsodium documentation also has a guide to help you choose the right values.
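As a reference point, this is roughly what using the official library looks like. It assumes the reference implementation’s argon2id_hash_encoded and argon2id_verify entry points and linking with -largon2; the parameter values are placeholders to experiment with, not recommendations:

    /* argon2_sketch.c -- compile with something like: gcc argon2_sketch.c -largon2
       Sketch using the reference Argon2 library; parameter values are placeholders
       you should tune with the argon2 command-line tool for your hardware. */
    #include <stdio.h>
    #include <string.h>
    #include <stdint.h>
    #include <argon2.h>

    int main(void)
    {
        const char *password = "correct horse battery staple";
        unsigned char salt[16] = {0};  /* use real random bytes in practice */

        uint32_t t_cost = 3;           /* iterations (time) */
        uint32_t m_cost = 1 << 16;     /* memory in KiB (64 MiB) */
        uint32_t parallelism = 1;      /* threads/lanes */

        char encoded[128];             /* receives the "$argon2id$..." string */

        int rc = argon2id_hash_encoded(t_cost, m_cost, parallelism,
                                       password, strlen(password),
                                       salt, sizeof salt,
                                       32,                 /* raw hash length */
                                       encoded, sizeof encoded);
        if (rc != ARGON2_OK) {
            fprintf(stderr, "hashing failed: %s\n", argon2_error_message(rc));
            return 1;
        }
        printf("encoded hash: %s\n", encoded);

        /* Verification only needs the encoded string and the candidate password. */
        rc = argon2id_verify(encoded, password, strlen(password));
        printf("verify: %s\n", rc == ARGON2_OK ? "match" : "no match");
        return 0;
    }

The encoded string already records the variant, version, parameters and salt, so the parameters can be raised later for new hashes without breaking verification of old ones.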

The official library only contains the password hashing functions and leaves out details like generating a random salt or pre-hashing the password to make processing time independent of password length. A few quick tests with the argon2 command-line tool from the official implementation revealed a small, almost insignificant difference in processing time between a small 5-byte password and a large 1MB one, so I conclude Argon2 doesn’t need pre-hashing, although I don’t know the underlying details. You can also choose the size of the generated password hash.

If you’re using libsodium, it includes several functions to generate high-quality random bytes, needed for salts. A plain call to getentropy on Linux should also be trivial.
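As a small illustration, generating salt bytes with libsodium looks roughly like this, using sodium_init and randombytes_buf and linking with -lsodium:

    /* salt_sodium_sketch.c -- compile with something like: gcc salt_sodium_sketch.c -lsodium
       Minimal sketch: generate random bytes suitable for a salt with libsodium. */
    #include <stdio.h>
    #include <sodium.h>

    int main(void)
    {
        unsigned char salt[16];

        if (sodium_init() < 0) {  /* must be called once before any other libsodium function */
            fprintf(stderr, "libsodium could not be initialized\n");
            return 1;
        }

        randombytes_buf(salt, sizeof salt);  /* fills the buffer with random bytes */

        for (size_t i = 0; i < sizeof salt; ++i)
            printf("%02x", salt[i]);
        printf("\n");
        return 0;
    }
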

What about scrypt?

If you’re currently using scrypt you already have an implementation, and if you’re not using it yet but you’re considering using it in the future, you could skip it and jump straight to Argon2 (see the previous section). It’s a better option, in my opinion.

If you insist on using scrypt, there’s already a libscrypt project that’s packaged for Debian, Fedora, Arch Linux and others. It takes most of its code from the official scrypt repository to create a library separate from the official command-line tool.

The library also covers obtaining random data to generate salts (it uses /dev/urandom), but not password pre-hashing. As with Argon2, a small test with a custom command-line tool revealed a very small, almost insignificant difference in processing time between a small and a very large password, so I conclude again that no password pre-hashing is needed for libscrypt.

Saving 800KB per page with a bit of JavaScript

Posted on . Updated on .

I know the title is almost clickbait, but I wanted to write about a small change in the generated HTML code for post pages in this blog. It prevents another 800KB of JavaScript code and data from being loaded per page, as of the time I’m writing this. The long explanation can be found below, but the short explanation follows: I added a button at the bottom of every post page to load comments. Until that button is pressed, the additional Disqus JavaScript code, which embeds comments in the page, will not be loaded and run. Before the modification, comments would load automatically. That additional code and data weighs around 800KB (it was 1.2MB when I decided to write about this a few days ago) even for a page with zero comments. Most of it does not change over time for recurring visitors and can be cached by the browser, but it resides in a subdomain related to my Disqus account under disqus.com, so I guess the browser has to download it at least once for every domain you visit with Disqus comments.

Background

This blog is ultimately hosted in my personal web space at FastMail. I’ve mentioned FastMail several times in the past, especially when I blogged about their CardDAV support. FastMail handles all my personal email for around 40€ a year. In addition to email, they also provide a few other services, like my contact list and my calendar (both accessible from my phone through the standard apps).

More importantly, FastMail also gives me a fair chunk of web space to upload files to. I can upload them through their web interface or using WebDAV, and I can upload HTML files and decide to publish them in a chosen domain, subdomain and path under my control. In any case, only static files are supported. FastMail is not a dedicated web hosting company. You can’t run PHP or CGI scripts, and you can’t touch the web server in any significant way. For example, I can’t configure a TLS certificate.

Disqus

Without dynamic content you can’t have user comments and, without user comments, what’s a blog? Well, in my case it would be exactly the same because most posts have zero comments, haha! Anyway, Disqus is probably the easiest solution to implement comments on a static site. They have free accounts and you only have to include a small piece of JavaScript code at the bottom of every page you want to insert comments in, together with a specially marked element (usually an empty div) that will be replaced with an iframe containing the comments.

That small piece of JavaScript will, in turn, load several larger scripts from Disqus that examine your page and replace the designated element with the embedded comments. They take into account the web page you’re running the script from so as to load and store the right set of comments. Pretty clever and simple from the user’s standpoint: the comments are hosted and stored at Disqus while your static site lives happily in your own domain. The problem is that those scripts are pretty big. As I explained previously, I aim for my site to be available and readable from slow connections. I had checked that my post pages were light, but I didn’t realize they were dynamically loading so much data.

A nice side effect of the new “Load comments” button is that, if you visit this blog with uBlock Origin or a similar add-on installed, you’ll see post pages don’t need to have any element blocked until you press the button. This means users with or without an ad blocker will benefit from increased privacy when visiting the blog.

Cloudflare

While we’re at it, I put Cloudflare in front of the blog. I know we probably shouldn’t rely on a few CDNs to serve half the contents of the web, but Cloudflare was the easiest way for me to add TLS support to this blog. You may have noticed the (probably) green padlock in the address bar. The site is now served over HTTPS, scoring an A+ in the SSL Server Test from SSL Labs. I’m using Cloudflare in its Flexible mode, which means Cloudflare retrieves my content over plain HTTP but caches it and serves it to you over HTTPS. As I don’t host any service with personal information here, this mode simply means visitors get added privacy at no cost thanks to the opportunistic encryption. I also get some peace of mind knowing that if any page hosted here is ever hit with a lot of traffic (legit or not), I won’t reach the strict FastMail bandwidth limits, which would make the whole site unavailable.

If you ever comment, your experience should also be marginally better. Previously, it was an HTTP page with an embedded HTTPS comments iframe, but what you saw in the address bar was the plain-text connection. As you logged in through Disqus, Google, Facebook or whatever account you may have been using to comment, it wasn’t crystal clear that it was a secure connection. With the site now served over HTTPS, the absence of a mixed content warning should inspire more confidence. By the way, thanks to Mozilla for including an insecure login warning in Firefox. It’s a bit of a shame that searching for “firefox insecure password warning” mostly leads you to pages explaining how to disable it.

Kudos to Cloudflare for providing free accounts and making the integration process incredibly easy. You only have to create the account and tell them about the domain you want them to proxy. Then, you review the entries they import from your existing name servers, adding things they may have missed, removing some entries and deciding what Cloudflare will cache, and finally you switch your existing name servers to theirs. For simple cases like mine your site will experience no downtime. I did the switch in about one hour, including reviewing the DNS entries, reading a bit of documentation and flipping some switches in the Cloudflare control panel.

Edit: I forgot to mention the site is also available over IPv6 as another nice side effect of using Cloudflare.

Now running Fedora 27

Posted on .

When I upgraded to Fedora 26 I mentioned it had been the smoothest Fedora upgrade I had experienced, but Fedora 27 has broken that record. It was short and completely painless, so there’s nothing more to say. Congratulations to everyone working on Fedora for their stellar job!

New Intel Coffee Lake CPUs

Posted on .

I’ve been reading and watching several reviews and benchmarks covering the new Intel Coffee Lake processors released on October 5th. Here’s my opinion on them.

First of all, congratulations to Intel for providing more competition in the CPU space, which is always good for us consumers. Reviewers have mainly focused on the i7-8700k and the i5-8400, as that’s mostly what Intel provided them with. Availability for both is going to be limited in these first months, and the only motherboards available will have the Z370 chipset and will be expensive until lower-end chipsets and cheaper motherboards are released in 2018. This issue affects the whole Coffee Lake lineup, so it has to be taken into account for now.

The i7-8700k is like an i7-7700k with two more cores and four more threads. This means it’s now able to tie or surpass the Ryzen 7 1700 in some multithreaded tests and sit right behind it in others. That’s actually pretty good, and in thread-limited scenarios its single-thread performance is much better thanks to its superior IPC and clock frequencies. Also, its strong floating point performance and AVX2 support make it a clear winner in isolated video encoding tests using x264 and similar software (i.e. when encoding video without gaming at the same time). It is, however, more expensive, requires paying for a separate cooler and uses more power than the R7 1700. Of course, it’s a clear winner for gaming.

The i5-8400 is a nice answer to the Ryzen 5 1600 in some scenarios. It clearly wins in gaming benchmarks. The two extra cores and threads are a welcome upgrade in the i5 line and pave the way for game developers to use more threads and distribute CPU load better. It features some very interesting turbo clock rates while keeping the overall TDP low. The Ryzen 5 1600, however, is more than enough in most 60FPS scenarios and still wins in many multithreaded benchmarks.

Another i5 in the line, the i5-8600k, basically obsoletes the i7-7700k at a lower price, because apparently 6 cores with 6 real threads are able to perform almost as fast as 4 cores with hyperthreading. To me, however, the i5-8600k is way less interesting because it costs over 70 dollars more than the i5-8400 and needs a separate cooler.

Which CPU would I get?

Tough call. Let’s examine different situations. Take into account I may be very bad at judging these things.

Gaming at 120Hz or more? Intel is your best bet. i7-8700k if you’re streaming or multitasking while gaming and can afford it, or the i5-8400 if just gaming or on a tight budget. Maybe wait to see if Intel will offer any middle ground and if it’s worth it. i5-8500? i5-8600? But, initially, the i5-8400 is a very affordable gaming beast.

Gaming at 60Hz and with production workloads benefiting heavily from CPU multithreading? Let’s say virtual machines or heavily-multithreaded applications. I’d probably stick with the Ryzen 7 1700 thanks to its current price/performance ratio, but the i7-8700k is also very nice if your particular productivity programs show it being superior, and maybe if you want to put more emphasis on the gaming part of the equation. The i7 will definitely cost more and require a separate cooler, but has been shown to perform better in some productivity cases.

Gaming at 60Hz with some multitasking and multithreading? Tough call, but I think I’d stick with the Ryzen 5 1600 for now. That may change in the future as more games are released and cheaper motherboards for Intel become available, showing how the i5-8400 performance evolves. The Ryzen 5 1600 has a better cooler, will cost around the same, and uses the AM4 socket which will probably be around for longer. Again, if you have a specific use case showing the i5-8400 to be better in benchmarks, then buy that.

Strictly gaming at 60Hz? Any of the previous two. This is mostly my situation and I’d still pick the R5 1600 for now, because the multithreaded performance will be there if I ever need it. I may be wrong, though, and maybe the gaming industry will take more time to adapt and some future games will make the R5 1600 struggle for some reason. There are a few games where the R5 1600 struggles a bit now. The i5-8400 is probably the conservative choice as shown by gaming benchmarks, but it really needs cheap motherboards available. Picking the R5 also means going with the market underdog and encouraging competition.

Strictly gaming at 60Hz with specific titles that require good single thread performance or make the R5 1600 struggle and bottleneck the GPU to keep the framerate above 60? i5-8400 with a future cheap motherboard, or a Kaby Lake CPU.

Minimizing CPU cost as much as possible for a cheap gaming rig? The i3-8100 looks nice on paper, but I’d wait for more desktop CPUs to be announced and released. We’ll see what the new Pentium-class processors have to offer. As things stand right now, I wouldn’t bother with the Ryzen 3 line. Maybe go down to the Pentium G4560 and spend more on the GPU?

Office desktop with no gaming? Go with an Intel Pentium, either Kaby Lake (the Pentium G4560 is awesome) or wait for Coffee Lake Pentiums with cheap motherboards and Raven Ridge APUs. The integrated GPU is essential to save costs, and Intel’s provided cooler is more than adequate for Pentiums. Don’t consider Ryzen unless you have a very specific use case for a separate low-power GPU like the GeForce GT 1030 or RX 550. Keep an eye on motherboard and power supply costs.