I hope this is a good place to discuss this, since the topic is kind of related to Epic’s choice to use ACES.
My studio is developing a title on UE 5.2.1, and I am doing a deep dive into the engine’s HDR pipeline, trying to understand exactly what each pixel value manipulation does.
I am stuck at the very last step of UE’s generalized tone mapping, which uses the ACES outTransform containing an SSTS step.
My main confusion is: what is the input unit of the SSTS? I understand that the output unit is an absolute linear value in nits, so with the default Mid-15/Max-1000 parameters the result is good enough to hand to the swapchain back buffer after the final target encoding transfer. The input is what confuses me.
I made a naive chart to help me understand the curve.
0.18 (input) mapping perfectly to 15 (nits, output) is a good start. This makes me think the input should be the scene-linear render result. But what about its unit? The output is very clearly a display unit in nits. Does the input also have a clear definition, or does it depend more or less on how the renderer is set up, with the ACES transform just making it look good enough on a display?
For example, while 0.18 -> 15 is not hard to understand, what about 10 -> 516? What exactly does an input of 10 represent for 516 nits to be the expected output?
A follow-up question is about the choice of mid brightness. The default 1000/2000 presets use 15, but what about other monitors with lower peak brightness, like 400/600? Is there a formula here?
Input is relative scene exposure, i.e. linear.
To see the S-shape, I usually plot the x-axis as log2(x)-log2(0.18). This puts 0.18 at 0, and the units are then relative stops. There are many ways, though, to scale or stretch how the x-axis or y-axis is presented, by converting the units and log scaling.
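For anyone who wants to see that axis conversion concretely, here is a minimal Python sketch. The function name `stops_from_mid_gray` is made up for illustration, and this only covers the x-axis relabeling described above; it does not implement the SSTS curve itself.

```python
import math

MID_GRAY = 0.18  # scene-linear mid gray, the input the thread sees mapped to 15 nits


def stops_from_mid_gray(x: float) -> float:
    """Return log2(x) - log2(0.18): exposure in stops relative to mid gray."""
    if x <= 0.0:
        raise ValueError("scene-linear input must be positive for a log axis")
    return math.log2(x) - math.log2(MID_GRAY)


# Relabel a few scene-linear inputs from the thread as relative stops.
for value in (0.18, 0.36, 10.0):
    print(f"{value:>5} -> {stops_from_mid_gray(value):+.2f} stops")
```

On this axis 0.36 sits one stop above mid gray, and an input of 10 sits roughly 5.8 stops above it, which is one way to read the "10" from the question: a relative exposure, not an absolute quantity.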
If you need more guidance, I can try to dig up some of my old plots. There’s also a dated, but perhaps relevant post here where the SSTS concept was being demonstrated.
No, there is no formula, though theoretically there should be. This is one of the known inconsistencies that the Output Transform WG is correcting in ACES2. ACES2 will adjust the curve in a smooth and predictable fashion as peak luminance changes, so this is built into the formula and will be a smooth continuum, such that you could enter any luminance within the parameters and get a default mapping that “makes sense” compared to the other settings.
Great info, Scott. The big picture is getting clearer to me. Regarding linear values and output nits, I just want to check whether I understand correctly now:
Linear value: generally describes how bright an exposed scene is in a mathematical way. For example, 0.36 in linear should be twice as bright as 0.18 in linear. It is not strictly tied to a real optical unit.
Output display value: typically what we call display brightness, in nits or cd/m^2. It is an optical unit that can be measured in the real world. However, due to the non-linear nature of the human visual system, just doubling the nits won’t make the eye feel that brightness has doubled, and that’s where all the curves kick in. E.g., for 0.36 to look twice as bright as 0.18, its nits should be more than doubled.
It is worth taking care not to conflate “brightness” with colour cognition attributes. Further, stimulus is not a simple mapping to cognition.
Further, when it comes to articulations that lead to a cognition of colour, the idea of “intensity” of the cognition is slippery. Thinking purely in a stimulus-defined specification of luminance units such as nits is gravely misleading. For example, BT.709 “blue” carries an equivalent force of “intensity” to BT.709’s “yellow” when balanced, despite BT.709’s “blue” being of significantly lower luminance.
TL;DR: Exert extreme caution when thinking about colour cognition. Also note that “nits” has nothing to do with the channel by channel mechanic when dealing with any value off of the achromatic axis; it’s all random mumbo jumbo.
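To put a number on the BT.709 blue versus yellow luminance gap mentioned above, here is a small Python sketch using the standard BT.709 luminance weights for linear RGB. The helper name `relative_luminance` is illustrative only, and the result describes stimulus luminance alone; it says nothing about how intense either colour appears.

```python
# BT.709 luminance weights for linear-light R, G, B.
BT709_WEIGHTS = (0.2126, 0.7152, 0.0722)


def relative_luminance(rgb):
    """Weighted sum of linear R, G, B giving relative luminance (white = 1.0)."""
    return sum(weight * channel for weight, channel in zip(BT709_WEIGHTS, rgb))


blue = relative_luminance((0.0, 0.0, 1.0))    # pure BT.709 blue primary
yellow = relative_luminance((1.0, 1.0, 0.0))  # red + green at full level

print(f"blue:   {blue:.4f}")
print(f"yellow: {yellow:.4f}")
print(f"yellow / blue: {yellow / blue:.1f}x")
```

Full blue comes out at about 7% of white's luminance and yellow at about 93%, which is why a luminance-only reading of "intensity" misses what the post above is describing.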