Color Weirdness

This page explains the key things I know about rendering color and anti-aliasing, and where CSS & the HTML canvas (among other things) differ from what you might want.

Chapter 0: What is color?

As a foundation, I believe it is useful to know how color and light work. Color is the perception humans (and other animals) use to distinguish different frequencies of light. It's not a perfect perception though; rather, it's heavily biased towards distinguishing particular frequencies of light, while other frequencies are completely invisible (e.g. ultraviolet light). There are two main forms of light, which you might've learned about in school: emission and absorption. These two forms of light are easily distinguished by their spectra; see below.

Understanding the exact details of this isn't too important, just know that emission light is important when talking about displays that emit light, like a monitor, and absorption light is important when talking about displays that reflect light, like what comes out of a printer. I'll only be discussing emission light here. To detect light, humans have 4 types of detectors: rods, which specifically work in dim light and don't detect color, and 3 types of cones which do detect color. If you want to know more about that, see Wikipedia. Color perception is somewhat subjective, so there have been some attempts to model "typical perception", notably CIE 1931, CIELUV, and CIELAB.

Chapter 1: How is color displayed?

Color on a (modern) display is generally created via a square pixel which is divided into thirds: the left-most third is red, the middle third is green, and the right-most third is blue. The specifics beyond that are somewhat variable, and there are several other pixel layouts, but for convenience, I'll assume RGB. That means that a given display, in a sense, has 3x as many pixels as advertised, which is worth taking advantage of if possible.

Below is an animation of two rectangles going left and right. Both are rendered using 1-bit per channel color, but the bottom rectangle takes advantage of the sub-pixels (assuming RGB). As a result, the bottom rectangle's animation should look 3x as smooth.

If you're using a non-RGB or rotated monitor, the above sub-pixel version may actually look worse. The demos on this page will look much better if you remedy that. Here's what the above should look like at the pixel level:

You might've noticed that the bottom version doesn't quite look 3x as smooth as the top one. That's because the green sub-pixel is brighter than the red sub-pixel, which is brighter than the blue sub-pixel, so the frames where a green sub-pixel changes are much more significant than frames where a blue sub-pixel changes. The reason the sub-pixels aren't all the same brightness has to do with the complexities of how human color perception works.

Most pixels these days aren't just 1-bit per channel. If they're 100% sRGB, then they're 8-bits per channel, and if they're not 100% sRGB, they probably pretend to be. Here's a demo with a color-picker:



sRGB is weird though. Its values are non-linear with respect to luminance, as opposed to linear-RGB which, as the name suggests, is linear. This will be important later.

Chapter 2: What is Anti-aliasing?

Anti-aliasing is where you use color blending at the edge of shapes to smooth out edges. It's extremely useful in making digital graphics look smooth. It's somewhat difficult to do well though, and that's the main reason I'm writing this. There are four main types of anti-aliasing, shown below after an aliased baseline for comparison:

Aliased

No anti-aliasing is simply referred to as aliased. The result is generally lots of jaggies.


Supersampling

Standard supersampling treats each pixel as if the RGB are all in the same spot. For anti-aliasing, each pixel takes on the color of the average of several samples from within the area the pixel represents. How the average of multiple colors is calculated depends on the color space used; see Chapter 3. The samples used often form a square grid (e.g. 2x2, 3x3 etc.), though not in Monte Carlo based rendering like path tracing.
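Here's a minimal sketch of grid supersampling, assuming a filled circle as the shape and an n×n grid of samples per pixel. The function names (`inside_circle`, `supersample_pixel`) are made up for illustration. The result is a coverage fraction, which would then be used to blend the shape's color with the background.

```python
def inside_circle(x, y, cx, cy, r):
    """Return True if the point (x, y) lies inside the circle."""
    return (x - cx) ** 2 + (y - cy) ** 2 <= r * r


def supersample_pixel(px, py, cx, cy, r, n=4):
    """Estimate how much of pixel (px, py) is covered by the circle.

    Samples are taken at the centers of an n x n grid of cells inside the
    pixel; the return value is the fraction of samples that hit the circle.
    """
    hits = 0
    for j in range(n):
        for i in range(n):
            sx = px + (i + 0.5) / n  # sample position inside the pixel
            sy = py + (j + 0.5) / n
            if inside_circle(sx, sy, cx, cy, r):
                hits += 1
    return hits / (n * n)


# A pixel on the right edge of a radius-3 circle centered at (5.5, 5.5)
print(supersample_pixel(8, 5, 5.5, 5.5, 3.0))
```

More samples give a better estimate of the true coverage, at the cost of n×n shape tests per pixel.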


Sub-pixel supersampling

Similar to standard supersampling, but taking into account sub-pixels. There are two main reasons to prefer standard supersampling: sub-pixel supersampling is more computationally expensive for minimal improvement, and if the sub-pixel layout is different from what was expected, sub-pixel supersampling is actually worse.


Pixel-perfect

Similar to standard supersampling, pixel-perfect rendering treats each pixel as if the RGB are all in the same spot. Pixel-perfect rendering is what you get if you take the limit of supersampling as the number of samples approaches infinity. It's not often practical to do pixel-perfect rendering, but for simple shapes like circles on a flat color, it's not too hard to derive.


Sub-pixel-perfect

Sub-pixel-perfect rendering is the same as pixel-perfect rendering but taking into account sub-pixels. This is the absolute highest quality possible on a monitor and has the same drawbacks as sub-pixel supersampling. Fonts are usually rendered at this quality.
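To make the difference concrete, here's a rough sketch for the simplest possible shape: a vertical bar spanning from x0 to x1. It assumes an RGB stripe layout where each sub-pixel occupies exactly one third of the pixel's width; the function names are illustrative only.

```python
def interval_overlap(a0, a1, b0, b1):
    """Length of the overlap between the intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))


def pixel_coverage(px, x0, x1):
    """Exact coverage of pixel column px, treating the pixel as one spot."""
    return interval_overlap(px, px + 1, x0, x1)


def subpixel_coverage(px, x0, x1):
    """Exact coverage of each sub-pixel of pixel column px by the bar.

    Assumes red is the left third, green the middle third, and blue the
    right third. Returns (r, g, b) coverage fractions from 0.0 to 1.0.
    """
    coverages = []
    for k in range(3):
        left = px + k / 3
        right = px + (k + 1) / 3
        # Scale by 3 so a fully covered sub-pixel reports 1.0
        coverages.append(min(1.0, interval_overlap(left, right, x0, x1) * 3))
    return tuple(coverages)


# A bar whose left edge sits halfway through pixel 0
print(pixel_coverage(0, 0.5, 2.0))     # one value for the whole pixel
print(subpixel_coverage(0, 0.5, 2.0))  # separate values for R, G, and B
```

With the whole-pixel version the edge pixel is a uniform gray; with the sub-pixel version red stays off, green is half covered, and blue is fully covered, which places the edge more precisely on an RGB display.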


Chapter 3: How does color blending work?

Color blending is used to smoothly interpolate from one color to another. In the case of anti-aliasing, what the average of your samples looks like is different depending on the color space used for color blending, so using the right color space when doing color blending is important. Unfortunately, color blending in the right color space seems to get overlooked a lot. Generally speaking, color blending should be done in a linear color space. This is because values in a non-linear color space aren't proportional to the amount of light they represent, so doing math, like linear interpolation, on them gives (essentially) incorrect results. Unfortunately, color blending is often done in sRGB, which is not a linear color space.

Here's a demo of linear interpolation from black to white in sRGB (a non-linear color space) vs linear-RGB (a linear color space):

non-linear (sRGB)

linear (linear-RGB)

You might notice that the non-linear one looks more even. That's because human perception is better at distinguishing small differences in dark colors than small differences in bright colors. This makes sense; it's far more useful to be able to tell the difference between 0.01 and 0.02 than it is to be able to tell the difference between 0.51 and 0.52 (0 is black and 1 is white).

Human perception also changes depending on environmental brightness. If you hover over the gradients, the rest of the screen will go black and, if you're also in a dark room, your perception should adjust to the darkness & perceive the dark colors as brighter even though the gradient doesn't change (unless you have an adaptive brightness monitor or adaptive brightness enabled).

To compensate for human perception, sRGB (among other color spaces) makes the difference in brightness between dark colors smaller than the difference in brightness between bright colors. This makes it so that disproportionately more of the color space represents dark colors. How much a color space compensates for this is referred to as gamma, and the process of correcting for this is called gamma correction. Gamma often refers to an exponent applied to the luminance of the colors. sRGB is generally referred to as having a gamma of about 2.2, although the sRGB curve isn't actually a pure power curve. The gamma correction formula for sRGB (converting a linear value to an sRGB value) is:

if value <= 0.0031308:
  12.92 * value
else:
  1.055 * value^(1/2.4) - 0.055

A linear color-space is one where the gamma is 1. Here's a demo of linear interpolation in various color spaces, both linear and non-linear:


sRGB

linear-RGB aka CIE 1931 RGB

HSL / HSV

linear-HSL / linear-HSV

true HSL

true linear-HSL

true HSV

true linear-HSV

Of these color spaces, linear-RGB (aka CIE 1931 RGB) is the most accurate in terms of actual color blending. For that reason, it's usually a good color space to do pixel-level color blending in. Interpolation in other color spaces can be useful for artistic reasons.
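As a small sketch of what blending in linear-RGB means in practice: decode the sRGB values to linear light, interpolate, then encode back to sRGB. `linear_to_srgb` follows the sRGB formula given earlier; `srgb_to_linear` is its standard inverse, which isn't spelled out above. The other names are illustrative.

```python
def srgb_to_linear(v):
    """Decode an sRGB channel value (0..1) to linear light."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


def linear_to_srgb(v):
    """Encode a linear-light channel value (0..1) as sRGB."""
    if v <= 0.0031308:
        return 12.92 * v
    return 1.055 * v ** (1 / 2.4) - 0.055


def lerp(a, b, t):
    """Plain linear interpolation between a and b."""
    return a + (b - a) * t


def blend_srgb(c0, c1, t):
    """Interpolate directly on the sRGB values (the non-linear approach)."""
    return tuple(lerp(a, b, t) for a, b in zip(c0, c1))


def blend_linear(c0, c1, t):
    """Decode to linear-RGB, interpolate, then re-encode as sRGB."""
    return tuple(
        linear_to_srgb(lerp(srgb_to_linear(a), srgb_to_linear(b), t))
        for a, b in zip(c0, c1)
    )


black, white = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
print(blend_srgb(black, white, 0.5))    # midpoint of the sRGB values
print(blend_linear(black, white, 0.5))  # midpoint of the emitted light
```

The halfway point between black and white comes out as an sRGB value of 0.5 in the first case and roughly 0.735 in the second, which is the value that actually emits half as much light as white.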

Chapter 4: Canvas color blending & anti-aliasing

Although linear-RGB is a more appropriate color space to do color blending in, sRGB is often more convenient, which I'm guessing is why CSS & the HTML Canvas use sRGB for color blending. This has some immediate negative side effects, the most obvious one being how it does anti-aliasing.

The HTML Canvas' anti-aliasing looks especially bad when straight edges are animated at fractional pixels per frame. The result is a kind of pulsating effect, which you don't get when using linear-RGB.
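One way to see where the pulsating comes from is to add up the light a moving shape emits. The sketch below assumes a 1-pixel-wide white bar on a black background, shifted right by a fraction of a pixel, so it covers (1 - offset) of one pixel and offset of the next. `emitted_light` is a made-up name for illustration; `srgb_to_linear` is the standard inverse of the sRGB formula.

```python
def srgb_to_linear(v):
    """Decode an sRGB channel value (0..1) to linear light."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


def linear_to_srgb(v):
    """Encode a linear-light channel value (0..1) as sRGB."""
    if v <= 0.0031308:
        return 12.92 * v
    return 1.055 * v ** (1 / 2.4) - 0.055


def emitted_light(offset, blend_in_linear):
    """Total linear light emitted by the two pixels the bar overlaps."""
    total = 0.0
    for coverage in (1.0 - offset, offset):
        if blend_in_linear:
            # Blend white over black in linear light, then encode for display
            stored = linear_to_srgb(coverage)
        else:
            # Blend white over black directly on the sRGB values
            stored = coverage
        total += srgb_to_linear(stored)  # what the display actually emits
    return total


for offset in (0.0, 0.25, 0.5, 0.75):
    print(offset, emitted_light(offset, False), emitted_light(offset, True))
```

With linear-RGB blending the total stays at 1 for every offset. With sRGB blending it drops to roughly 0.43 when the bar is halfway between two pixels, so the bar gets dimmer and brighter as it moves.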

sRGB anti-aliasing (CSS & Canvas)


linear-RGB anti-aliasing (pixel-perfect)


linear-RGB anti-aliasing (sub-pixel-perfect)


Chapter 5: Dithering

Dithering isn't that important most of the time, but there are some key situations where it becomes very valuable. Specifically, dithering is useful for slow gradients. Here's an example of the artifacting that happens (without dithering) on slow gradients:

To emphasize the effect, the top half is 6-bits per channel, but you should still be able to see it in the lower half which is standard 8-bits per channel. If you hover over the image, the rest of the page will go black, which should make it easier to see. If the bottom half still looks smooth to you, try turning your screen brightness up.

Dithering is a technique of selectively varying the color of pixels so as to simulate a higher bit-depth than is actually available. There are lots of dithering algorithms, including ones for arbitrary palettes (for which I highly recommend reading this article). However, for performance reasons, and because the improvement from better algorithms using error diffusion is barely noticeable, it's generally preferred to use an ordered dithering algorithm for dithering at 8+ bits per channel. That generally means one of:

Cluster-dot Halftone

Bayer

Void-and-Cluster

You can also have an arbitrarily large dithering matrix, but for the above I used an 8x8 matrix for Cluster-dot Halftone & Bayer and a 64x64 matrix for Void-and-Cluster. Each of these three has its own strengths and weaknesses.

Cluster-dot Halftone tries to clump pixels of the same color together, which can be a good thing in some situations; notably it's useful in printing because ink tends to glob together. It's also useful if the image being rendered might be downscaled, because the downscaling is less likely to ruin the dither. It is the coarsest of these three though.

Bayer is, arguably, the easiest to make, plus it's the most intuitive. It also is, historically, the most common dithering method. It can sometimes be a good choice stylistically, but the very rigid structure it has can sometimes cause undesirable artifacting.

Void-and-Cluster is based off blue noise, so it looks the least structured of these three, which can be good or bad, but generally it makes the smoothest gradients of these three.

Chapter 6: How do you make Void-and-Cluster?

There don't seem to be very many resources on how to make a Void-and-Cluster dither matrix, so I'll just go ahead and provide that here. I used this paper as a reference, but I use a slightly different method for generating the initial bitmap, which I've found produces better results.

An ordered dithering matrix is generally a square matrix that includes every integer from 0 to N×N - 1, where N×N is the size of the matrix. As an example, these are 2x2 and 4x4 Bayer dithering matrices:

0 2
3 1

 0  8  2 10
12  4 14  6
 3 11  1  9
15  7 13  5

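To use a dithering matrix, you add (matrix[X mod N][Y mod N] + 0.5) / (N×N) - 0.5, where X is the x-coordinate and Y is the y-coordinate, to each channel of each pixel's color before rounding the color to a renderable color. The offset lies between -0.5 and 0.5, so the channel value should be expressed in units of one renderable step (e.g. 0 to 255 for 8 bits per channel) when the offset is added.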
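Here's a short sketch of both halves: building a Bayer matrix with the usual recursive construction (which reproduces the 2x2 and 4x4 matrices above), and applying the offset formula to one channel. The function names and the `levels` parameter are illustrative assumptions.

```python
import math


def bayer_matrix(n):
    """Build an n x n Bayer matrix (n must be a power of two)."""
    m = [[0]]
    while len(m) < n:
        # Each quadrant is the previous matrix times 4, plus 0, 2, 3, or 1
        top = [[4 * v for v in row] + [4 * v + 2 for v in row] for row in m]
        bottom = [[4 * v + 3 for v in row] + [4 * v + 1 for v in row] for row in m]
        m = top + bottom
    return m


def dither_channel(value, x, y, matrix, levels=256):
    """Quantize a channel value (0..1) to one of `levels` renderable steps."""
    n = len(matrix)
    # Offset between -0.5 and 0.5, in units of one renderable step
    offset = (matrix[x % n][y % n] + 0.5) / (n * n) - 0.5
    step = math.floor(value * (levels - 1) + offset + 0.5)  # round to nearest
    return min(levels - 1, max(0, step))


matrix = bayer_matrix(4)
# Dither a flat 25% gray down to 1 bit per channel over one 4x4 tile
tile = [[dither_channel(0.25, x, y, matrix, levels=2) for x in range(4)]
        for y in range(4)]
for row in tile:
    print(row)
```

In the 1-bit example, 4 of the 16 pixels in the tile end up on, so the tile averages out to the original 25%.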

To make a Void-and-Cluster dithering matrix, the process is roughly:
  1. Create a bitmap of points roughly evenly distributed
  2. Find the biggest cluster
  3. Remove the point at the center of the biggest cluster
  4. Set the dither matrix at the coordinate where the point was removed to the number of points remaining
  5. Repeat 2-4 until there are no points left
  6. Revert back to the original bitmap
  7. Find the biggest void
  8. Add a point at the center of the biggest void
  9. Set the dither matrix at the coordinate where the point was added to the number of points that were present before it was added
  10. Repeat 7-9 until the grid is filled

For finding clusters and voids, a Gaussian blur with a diameter equal to the matrix size and wrapping around both horizontally and vertically is generally used. The brightest and darkest points after applying the blur are the clusters and voids. You can cache the blurred version of the bitmap and just add/subtract the difference for each point added/removed, which helps a lot with performance. You can try various starting numbers of points in the initial bitmap and various matrix sizes. Here is some JavaScript code that generates a Void-and-Cluster dither matrix.
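Separately from the JavaScript mentioned above, here's a compact Python sketch of the ten steps. It is not my exact method: the initial bitmap here uses a common relaxation approach (keep moving the point in the tightest cluster to the biggest void until nothing changes), and `sigma`, `points`, `seed`, and all function names are assumptions chosen for illustration.

```python
import math
import random


def make_kernel(n, sigma):
    """Gaussian weight for every wrapped (dx, dy) offset on an n x n grid."""
    return [[math.exp(-(min(dx, n - dx) ** 2 + min(dy, n - dy) ** 2)
                      / (2 * sigma * sigma))
             for dx in range(n)] for dy in range(n)]


def splat(energy, kernel, x, y, sign):
    """Add (+1) or subtract (-1) one point's blur from the cached blurred map."""
    n = len(energy)
    for j in range(n):
        for i in range(n):
            energy[j][i] += sign * kernel[(j - y) % n][(i - x) % n]


def find(energy, bitmap, occupied, pick):
    """pick=max over points finds the biggest cluster;
    pick=min over empty cells finds the biggest void."""
    n = len(energy)
    _, x, y = pick((energy[y][x], x, y)
                   for y in range(n) for x in range(n)
                   if bitmap[y][x] == occupied)
    return x, y


def void_and_cluster(n=8, points=6, sigma=1.5, seed=0):
    rng = random.Random(seed)
    kernel = make_kernel(n, sigma)
    bitmap = [[False] * n for _ in range(n)]
    energy = [[0.0] * n for _ in range(n)]
    # Step 1: random points, relaxed until roughly evenly distributed
    for idx in rng.sample(range(n * n), points):
        bitmap[idx // n][idx % n] = True
        splat(energy, kernel, idx % n, idx // n, +1)
    for _ in range(10 * n * n):  # iteration cap as a safety net
        cx, cy = find(energy, bitmap, True, max)
        bitmap[cy][cx] = False
        splat(energy, kernel, cx, cy, -1)
        vx, vy = find(energy, bitmap, False, min)
        bitmap[vy][vx] = True
        splat(energy, kernel, vx, vy, +1)
        if (vx, vy) == (cx, cy):  # the point went back where it was: stable
            break
    matrix = [[0] * n for _ in range(n)]
    # Steps 2-5: remove points from a copy, ranking each by points remaining
    work = [row[:] for row in bitmap]
    work_energy = [row[:] for row in energy]
    remaining = points
    while remaining > 0:
        x, y = find(work_energy, work, True, max)
        work[y][x] = False
        splat(work_energy, kernel, x, y, -1)
        remaining -= 1
        matrix[y][x] = remaining
    # Steps 6-10: back on the original bitmap, fill voids until full
    placed = points
    while placed < n * n:
        x, y = find(energy, bitmap, False, min)
        matrix[y][x] = placed  # rank = points present before this one
        bitmap[y][x] = True
        splat(energy, kernel, x, y, +1)
        placed += 1
    return matrix


for row in void_and_cluster(8):
    print(" ".join(f"{v:2d}" for v in row))
```

This favors readability over speed; for something like a 64x64 matrix you'd want a faster blur update and cluster/void search than these nested loops.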

Before moving on, I just want to briefly mention that temporal dithering is another potentially useful form of dithering. The main downside of temporal dithering is that it may cause an undesirable flickering effect.

Chapter 7: Image scaling

The last thing I want to touch on is image scaling, in particular upscaling. Image scaling is often done using bilinear interpolation, which is fine, except the interpolation between pixel colors is often done in sRGB, not linear-RGB.

Nearest neighbor scaling

normal (sRGB) bilinear scaling

linear-RGB bilinear scaling

There are, of course, many other image scaling algorithms, for example bicubic interpolation, but the main point I wanted to make is that using a non-linear color space when doing color interpolation causes ugly artifacts. As for downscaling, that goes back to the same things as anti-aliasing.
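As a rough sketch of the difference, here's bilinear sampling of a grayscale image stored as rows of sRGB values from 0 to 1, with a switch for which space the interpolation happens in. The function names and the `in_linear` flag are illustrative; `srgb_to_linear` is the standard inverse of the sRGB formula.

```python
import math


def srgb_to_linear(v):
    """Decode an sRGB channel value (0..1) to linear light."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


def linear_to_srgb(v):
    """Encode a linear-light channel value (0..1) as sRGB."""
    if v <= 0.0031308:
        return 12.92 * v
    return 1.055 * v ** (1 / 2.4) - 0.055


def bilinear(image, x, y, in_linear=True):
    """Sample `image` at fractional pixel coordinates (x, y).

    x must be in [0, width - 1] and y in [0, height - 1].
    """
    height, width = len(image), len(image[0])
    x0, y0 = math.floor(x), math.floor(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    tx, ty = x - x0, y - y0
    # Choose whether the math happens on linear light or on raw sRGB values
    decode = srgb_to_linear if in_linear else (lambda v: v)
    encode = linear_to_srgb if in_linear else (lambda v: v)
    p00, p10 = decode(image[y0][x0]), decode(image[y0][x1])
    p01, p11 = decode(image[y1][x0]), decode(image[y1][x1])
    top = p00 + (p10 - p00) * tx
    bottom = p01 + (p11 - p01) * tx
    return encode(top + (bottom - top) * ty)


# A 2x2 image with a black column next to a white column
image = [[0.0, 1.0],
         [0.0, 1.0]]
print(bilinear(image, 0.5, 0.0, in_linear=False))  # sRGB interpolation
print(bilinear(image, 0.5, 0.0, in_linear=True))   # linear-RGB interpolation
```

Halfway between the black and white columns, the sRGB version returns 0.5 while the linear-RGB version returns roughly 0.735, so the sRGB result is noticeably darker than it should be.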


View code on GitHub