ACES 2 in Game Engines: Variable reference/max luminance

I’m working on adding HDR output support to the Godot game engine and want to ensure that everything is set up to correctly implement ACES 2 in the future, either as an add-on or integrated into the core of the engine.

I am relatively new to the world of HDR output, so please point out if I may have any incorrect assumptions or understandings! Also, I happen to be at GDC next week and would love to chat in person if that might be easier :slight_smile:

Typically a game engine will obtain reference and maximum luminance of the display from the OS.

The reference luminance is used as the “white” (1.0) value for mapping SDR content to the dynamic range of the display. This means that OS windows will render within the reference luminance range and a game’s GUI is expected to be within this range as well. The reference luminance is often exposed to the player, which allows the player to adjust the brightness of the image if they find that the default OS reference luminance is not suitable for their game use. BT.2408 recommends this reference luminance to be about 203 nits, so games may default to 200 nits as a player-facing value. I noticed that my Windows computer with a Sony TV reports 240 nits as its reference luminance.

The maximum luminance is ignored in a scenario where there is no tonemapping and values are clipped or tonemapped by the display (the scene referred values are passed directly to the display). When tonemapping, the maximum luminance is used to determine how the “shoulder” of the tone curve should behave. This is primarily relevant to input values above middle grey (0.18), and most relevant to values above 1.0 (reference luminance). Again, this luminance value can be configured by the user in most games, because the best setting may depend on how their display tonemaps its input values.

With these two values, the range of output values can easily be determined as [0, (max_luminance / ref_luminance)], which would give a range up to 5.0 in the reasonable scenario of ref_luminance = 200 and max_luminance = 1000. This also means that SDR content is simply [0, 1] because ref_luminance and max_luminance are both 200 nits.
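
As a quick sketch of that calculation (illustrative numbers and names, not tied to any particular API):

// Reported by the OS, in nits (example values only).
const float ref_luminance = 200.0;  // "SDR reference white"
const float max_luminance = 1000.0; // display peak luminance

// 1.0 in the output corresponds to ref_luminance, so the usable range is
// [0, max_luminance / ref_luminance] = [0, 5.0] in this example.
// On an SDR display, ref and max are equal and the range collapses to [0, 1].
const float max_output = max_luminance / ref_luminance;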

So this sort of variable dynamic range determined by reference and maximum luminance gives the player reasonable control over the brightness of their game and also allows them to adjust the HDR behaviour of their specific display.

Onto ACES 2: I see that the source is provided as CTL files. Thanks! I’ve been able to run the tool and reproduce the test images exactly, which gives me a great starting point for making a simplified approximation of ACES 2 that is suitable for a game engine that must run on web and mobile platforms with very little processing budget left over for tonemapping.

EDIT:

After reviewing some of yesterday’s meeting, I understand the following approach is incorrect, as these constants should not be modified directly and new optimizations are yet to make their way to the CTL reference implementation. My mistake! Please feel free to disregard the following original text:

Regarding variable reference/max luminance, I expect I can simply modify the following constants to see what the behaviour should be with different reference and max luminance values:

TSParams:

const float n_r // normalized white in nits (what 1.0 should be)
const float c_d // output luminance of 18% grey (in nits)

General constants:

const float peakLuminance // luminance the tone scale highlight rolloff will target in cd/m^2 (nits)

Is this correct?

To test my understanding, I tried the following: Change the reference and maximum luminance values for Output.Academy.Rec709-D65_100nit_in_Rec709-D65_sRGB-Piecewise.ctl from 100/100 to 200/200:

const float n_r = 200.0;
const float c_d = 10.013 * (n_r / 100.);
const float peakLuminance = 200.;

This should result in the same image data in the linear [0, 1] range. When mapping to the screen, a 200 nit reference luminance would be used to multiply the values into nits. But when I tried this, I noticed the resulting TIFF file appears brighter. Maybe I’m just misinterpreting the TIFF data? Or is there something else about this that I’m not understanding?

Thanks!

I understand now that documentation is on the way to address how to implement this sort of behaviour correctly. Please don’t rush! I believe that you have other important deadlines coming up that most definitely take priority :slight_smile:

Here is a bit more detail on the APIs that I’m working with: Use DirectX with Advanced Color on high/standard dynamic range displays - Win32 apps | Microsoft Learn.

So let’s say Windows is operating with an HDR display and a 200 nit “SDR reference white level”, which is a reasonable value, according to my personal setup, the BT.2408 recommendation, and the Microsoft documentation.

If my game is only outputting SDR, this means the value sent to the display will be as follows:

  1. linear scene referred → nonlinear ACES (via ACES ODT)
  2. nonlinear ACES → nonlinear sRGB (via piecewise sRGB EOTF)
  3. nonlinear sRGB → nonlinear ACES (via the inverse piecewise sRGB EOTF)
  4. nonlinear ACES → scaled nonlinear ACES (multiplied by “SDR reference white level”)
  5. scaled nonlinear ACES → PQ signal (or equivalent HDR EOTF)

So if we take this example of a 200 nit “SDR reference white level” and look at an 18% scene referred value, it would look something like this:

  1. 0.18 → approximately 0.10 (or 10 nits) [nonlinear ACES]
  2. 0.10 → 0.38 [nonlinear sRGB]
  3. 0.38 → 0.10 [nonlinear ACES]
  4. 0.10 → 20 nits [scaled nonlinear ACES]
  5. 20 nits → output via PQ or equivalent HDR EOTF

From my personal experience, this provides an image on an HDR display that looks similar to the image I would see on the same display when it is operating in SDR mode. Additionally, this scaling based on the “SDR reference white level” is entirely handled by the operating system and is outside of the control of the developer in the case of an SDR app.
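
To make the scaling explicit, here is a rough sketch (CTL-style pseudo-code) of the math I believe the OS applies in steps 4 and 5 when compositing an SDR app into a PQ signal. The PQ constants are the standard ST 2084 ones; the structure and names are illustrative only:

// ST 2084 (PQ) inverse EOTF: absolute luminance in nits -> PQ signal value.
float pq_encode(float nits)
{
    const float m1 = 2610.0 / 16384.0;
    const float m2 = 2523.0 / 4096.0 * 128.0;
    const float c1 = 3424.0 / 4096.0;
    const float c2 = 2413.0 / 4096.0 * 32.0;
    const float c3 = 2392.0 / 4096.0 * 32.0;
    float y = nits / 10000.0;
    float y_m1 = pow(y, m1);
    return pow((c1 + c2 * y_m1) / (1.0 + c3 * y_m1), m2);
}

// 18% grey example: ~0.10 after the SDR ODT and inverse sRGB EOTF (step 3),
// scaled by a 200 nit "SDR reference white level" (step 4), then PQ encoded (step 5).
float display_linear = 0.10;
float nits = display_linear * 200.0; // ~20 nits
float pq_signal = pq_encode(nits);   // value sent to the display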

From the meeting recording, I understand that in HDR, the ACES ODT will usually map 18% scene referred to around 15 nits instead of 10 because of the higher dynamic range.

With the goal of using ACES to produce a similar appearance on both SDR and HDR displays, I would expect that a video game that is rendering 18% scene referred would provide an HDR output value of around 30 nits when used with a 200 nit “SDR reference white level”. (Or 15 nits with 100 nit white, or 22.5 nit for 150 nit white, or 37.5 nit for 250 nit white, etc.) And, of course, this has an impact on the maximum luminance because the low end has been scaled upwards.

When writing your documentation, I would be interested to know how to correctly configure ACES to work with this sort of OS-controlled variable “SDR reference white level” and maximum luminance, but again, no rush! Of course, it’s entirely possible that I am mistaken in my understanding of how things work and what is most desirable with real-world consumer hardware/operating systems. I am quite new to this, after all.

I am not really familiar with how HDR is dealt with on Windows. However, I believe that the linear_scale_factor parameter would be the appropriate one to use for what you want to achieve.

If I understand you correctly, you are saying that if a display reports 200 nits for reference white, then PQ encoded data sent to it will be displayed at twice the absolute nit value produced by the reference PQ EOTF. So in that case, SDR is displayed with 20 nit mid grey, and you want 1000 nit HDR displayed with the same mid grey brightness. By default, the 1000 nit ACES 2.0 HDR Output Transforms put mid grey at 14.5 nits (see this table) but your display will show the PQ encoded result twice as bright, so 29 nits. You therefore need to use a linear scale factor of 20 / 29 = 0.6897 to match SDR and HDR.

You can see an example of the use of linear_scale_factor (which is set to 1.0 in most Output Transforms) in this transform for Dolby Cinema. The ACES 2.0 tone curve for a peak white of 225 nits is used, with a linear_scale_factor of 0.48, producing a peak white output of 108 nits (225 * 0.48).

The idea is that to produce a valid ACES 2.0 Output Transform, you should only vary the parameters in the macro level CTL files, like the 108 nit one above. If you dig into the sub-functions in the library code, and start varying the parameter values there, you are diverging from the intent of ACES 2.0.

EDIT: Corrected my linear scale factor calculation to use 20 instead of 10.


Thanks!

It was my understanding that the difference between middle grey on SDR and HDR was an intentional part of ACES, so this doesn’t sound like something that I want to change.

Instead, I’m curious how I get from around 14.5 nits to 29 nits, and how the maximum luminance should then be adjusted. If I simply multiply by 200 / 100, then my maximum luminance would be 2000 nits, which is more than the user’s display can handle.

So then, would I need a linear_scale_factor of 200 / 100 = 2.0 and peakLuminance of 1000 / linear_scale_factor = 500 nits?

(edit: fixed math typo)

Apologies then if I have misunderstood. Are you saying that the system in fact displays PQ encoded material at the intended absolute nit value, and for displaying SDR with a peak brightness of 200 nits you, the developer, need to take the 100 nit SDR value, linearise it, double the linear value and then PQ encode it?

In that case, since SDR then produces a mid grey at 20 nits, you could use an HDR rendering with peak white set to 690 nits, producing a mid grey of 13.8 nits, and a linear_scale_factor of 1.449. This would result in mid grey matching SDR, and a peak of 1000 nits, since 13.8 * 1.449 = ~20 and 690 * 1.449 = ~1000.

(The peak values in my spreadsheet are just selected examples. You can change one of them to a different value to see what the resulting mid-grey is. That is how I came up with the value of 690)

since 13.8 * 1.449 = ~20

Hmm. Again, I’m not sure that I want HDR to match mid grey of SDR exactly; this seems against the intended design of ACES. Instead, I want the experience of switching between SDR mode and HDR mode on a consumer TV using Windows to be similar to using a 100 nit SDR reference monitor and a (???) nit HDR reference monitor.

Are you saying that the system in fact displays PQ encoded material at the intended absolute nit value

Yes, I believe this is correct, in spite of the system displaying SDR content at a variable nit value based on “SDR reference white level”, which might be around 200 or so nits depending on the display that is connected to the computer.

and for displaying SDR with a peak brightness of 200 nits you, the developer, need to take the 100 nit SDR value, linearise it, double the linear value and then PQ encode it?

If I wanted to display SDR content in my HDR game, then yes I would need to take a [0, 1] range SDR value, multiply it by “SDR reference white level” to convert it to nits, and then PQ encode it.

So my goal is not to match SDR grey to HDR grey. Instead, because Windows produces a PQ signal for SDR content with a variable peak nit value (not fixed at 100 nits, but instead based on the display), I want my HDR content to similarly respect this variable “reference” nit value. I believe this would give me a similar final experience of comparing an SDR reference monitor that is fixed at 100 nits with an HDR reference monitor that is also fixed at a specific nit value.

Now that I’ve had the chance to discuss this, it sounds like this goal is actually quite simple to achieve with ACES. I believe it would be as follows:

// Provided by the operating system as described here:
// https://learn.microsoft.com/en-us/windows/win32/api/wingdi/ns-wingdi-displayconfig_sdr_white_level
// Optionally controlled by the player through game settings.
sdr_reference_white_level = ??? // in nits

// Provided by the operating system as described here:
// https://learn.microsoft.com/en-us/windows/win32/api/dxgi1_6/ns-dxgi1_6-dxgi_output_desc1
// Optionally controlled by the player through game settings.
max_luminance = ??? // in nits

// This new constant simply represents that the video game uses
// Output.Academy.Rec709-D65_100nit_in_Rec709-D65_sRGB-Piecewise.ctl
// for SDR output.
aces_sdr_reference_white_level = 100 // in nits

// The following are configuration constants in the HDR ODT:
linear_scale_factor = sdr_reference_white_level / aces_sdr_reference_white_level
peakLuminance = max_luminance / linear_scale_factor

And, if I’m understanding correctly, this would give me a final range of [0, max_luminance] and the ACES behaviour between SDR and HDR would be correct, even though Windows automatically scales all SDR apps to sdr_reference_white_level when outputting HDR PQ.
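
For reference, the raw values from those two Windows structures need a small unit conversion first: per the Microsoft documentation, DISPLAYCONFIG_SDR_WHITE_LEVEL reports SDRWhiteLevel as a fixed-point multiple of 80 nits, while DXGI_OUTPUT_DESC1 reports MaxLuminance directly in nits. Roughly (an untested sketch):

// DISPLAYCONFIG_SDR_WHITE_LEVEL.SDRWhiteLevel: nits = (SDRWhiteLevel / 1000) * 80
sdr_reference_white_level = SDRWhiteLevel / 1000.0 * 80.0 // in nits

// DXGI_OUTPUT_DESC1.MaxLuminance is already in nits
max_luminance = MaxLuminance // in nits

// With, say, SDRWhiteLevel = 2500 (200 nits) and a 1000 nit panel, the pseudo-code
// above gives linear_scale_factor = 2.0 and peakLuminance = 500 nits.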

Is this the recommended way to handle this behaviour in Windows and other operating systems that automatically scale SDR apps to higher-than-100-nits levels when outputting HDR PQ signals?

As I said, I’m not familiar with Windows HDR, but @doug_walker 's suggestion for Mac HDR displays in Reference Mode (which does pin SDR white at 100 nits) was to linearly scale the display light value of HDR so HDR and SDR mid grey WERE comparable. This was just a suggestion, and the effect of it has not been tested.

If you have a display which shows SDR at anything other than 100 nits, you can’t really compare it with the experience of looking at ACES on reference SDR and HDR monitors. If the SDR is “gained up” you need to also do something similar to the HDR, otherwise the HDR will appear dimmer than SDR (although with more highlight range). ACES 2.0 mid grey does vary with peak brightness by design. But it does so relative to 10 nit mid grey for 100 nit SDR. Change the SDR and you break that assumption.

My 690 nit proposal was one option. But it was theoretical, not tested. What the appropriate approach is for non-reference monitors will have to be researched.

BT.2408-8 recommends changing either the SDR or HDR monitor’s luminance in order to make the brightness below reference white more comparable when the two are to be seen side by side.


OK, thanks! This makes sense.

I will not be starting work on an approximation for the Godot game engine until all simplifications and optimizations have been finalized in ACES 2 and documentation has been finished.

I will keep my finger on the pulse to see if this sort of thing is researched in the future, as it would be good to know what the recommendation should be for systems with HDR output that treat SDR content with a variable maximum nit value.

One other “feature” of the approach that I proposed is that the player or game developer could simply set the sdr_reference_white_level to 100 and the max_luminance to 1000 to exactly reproduce an ODT like Output.Academy.P3-D65_1000nit_in_P3-D65_sRGB-Piecewise.ctl. This way, the exact original ACES behaviour is still fully accessible if desired.

Thanks again!

I think there is a misunderstanding here.
The 200 nit reference white in HDR comes from a broadcast reality: it acknowledges the fact that all SDR TVs show peak white at at least 200 nits, rather than the 100 nits defined in ITU BT.2035.
I don’t think this is a useful path for game development.
In a game you can ask the users about their surround, viewing conditions and preferences by showing suitable calibration images.
If you want to use ACES, I would use the vanilla HDR rendering assuming a dim surround. Then, if the user wants to make it brighter, you implement that as a simple gain in display referred.
You need to lower the tone mapping peak accordingly to make room for the post-tone-mapping gain. So an overall brighter image compresses highlights more.


The 200 nit reference white in HDR comes from a broadcast reality: it acknowledges the fact that all SDR TVs show peak white at at least 200 nits, rather than the 100 nits defined in ITU BT.2035.

Yes, I definitely added confusion by using 200 nits as an example. I should have written SDRWhiteLevelInNits instead of 200 nits. My mistake.

Then, if the user wants to make it brighter, you implement that as a simple gain in display referred.

I believe this is exactly what the linear_scale_factor in ODT CTL files is.

In pseudo-code, I believe that you’re suggesting the following?

// Optionally controlled by the player through game settings.
// Defaults to 100 nits.
player_reference_luminance = ??? // in nits

// Provided by the operating system as described here:
// https://learn.microsoft.com/en-us/windows/win32/api/dxgi1_6/ns-dxgi1_6-dxgi_output_desc1
// Optionally controlled by the player through game settings.
max_luminance = ??? // in nits

// This new constant simply represents that the video game
// uses 100 nits reference luminance for ACES ODTs
aces_reference_white_level = 100 // in nits

// The following are configuration constants in the HDR ODT:
linear_scale_factor = player_reference_luminance / aces_reference_white_level
peakLuminance = max_luminance / linear_scale_factor

In your suggestion, I did not see mention of how to handle ACES’s peakLuminance, so I left this as I had initially proposed. Am I understanding your suggestion correctly?

ACES is not tested against dynamic mapping. But the tone curve part will work as you mentioned.
You should be able to get any diffuse white to peak white tone mapping ratio (within reasonable limits).
I am not sure about the other components though (I was not involved there and I do not fully understand the design).

If you are looking into optimising (simplifying) the code for realtime games, you could reformulate the tone curve parameters to do what you want. It should not be difficult.

Hi, I have a question. What TRC are you using to convert SDR content into the HDR linear color space? Windows HDR support currently uses piecewise sRGB to convert into the linear blending space (actually it should use a power 2.2 function; this seems to be a bug), and this causes the dark parts of SDR content to become washed out.

Things aren’t finalized yet, but the engine works in linear encoding for most things, so some “SDR” content already exists in linear space, ready to be encoded in PQ (or equivalent).

The correct thing to do, as you know, would be to piecewise sRGB encode this sRGB SDR content and then re-linearize it using a 2.2 power function… but the performance cost of doing this on every pixel will likely not be acceptable for most games.
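
To be explicit, the per-pixel operation I mean is something like this (a sketch using the standard piecewise sRGB encoding; the function names are just mine):

// Piecewise sRGB encoding (inverse EOTF): linear -> sRGB signal value.
float linear_to_srgb(float x)
{
    return (x <= 0.0031308) ? 12.92 * x : 1.055 * pow(x, 1.0 / 2.4) - 0.055;
}

// Re-linearize the signal the way a pure power 2.2 display would show it,
// giving the "as displayed on an sRGB monitor" look in display light.
float simulate_sdr_display(float linear_in)
{
    float signal = linear_to_srgb(linear_in);
    return pow(signal, 2.2);
}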

If we were starting from scratch, maybe we could do this during import… but I’m not sure how annoying this would make things for existing 2D art workflows…

I need to think more about what the right tradeoffs for this will be…

(Also, I don’t actually know what TRC stands for, so I assume you mean a transfer function.)

If you are dealing with a linear render result in the engine, I think it can be directly encoded to PQ/scRGB. For sRGB textures, I think they should be decoded using a power 2.2 function into linear rendering space? I can’t imagine a case that needs to do a runtime linear → piecewise → power 2.2 conversion.

Well, a game needs to support both HDR and sRGB SDR output with an in-game setting to toggle between the two modes. So simply doing this at import would require duplicating all sRGB SDR textures in the game for HDR and SDR modes and the ability to swap them out at any point… it’s this runtime switching between SDR and HDR modes that makes me think it would be necessary to perform this nonsense at runtime to “simulate” the SDR look in HDR output.

I believe that other engines have their GUI/HUD rendering with an entirely different buffer format that is layered on top of HDR buffers, so their solution to all of this might be slightly different. In Godot, 2D is much more of a first-class feature, and many games, even with HDR rendering buffers, are 2D games, so our approach is slightly different.

I might talk a bit with the rendering folks on the project to see what their thoughts are.

But I will admit, I’m not sure why Windows’ approach is “wrong”: say you have scene referred linear values and you piecewise encode them into sRGB code values for use by a display. To get back to scene referred values, you should apply the inverse piecewise transfer function, just like Windows does. So in this way, Windows’ approach is correct.

The issue comes down to the reference display being a part of the sRGB standard, I believe. It makes it tricky to know, in a general way, if you’re trying to get back to original linear scene referred values or if you’re trying to get to the intended look of the displayed image, but then converted back to linear values. :face_with_spiral_eyes:

Anyway, a bit off topic. Sorry that all of this is likely entirely unhelpful…

For sRGB textures, I think they should be decoded using a power 2.2 function into linear rendering space?

Oh, sorry, for this case: no, this doesn’t make sense because in SDR mode this would mean the 2.2 linearization of piecewise sRGB would happen twice: once at import and then again on the display.

That is, unless for sRGB SDR output you encode with an inverse 2.2 power function before sending to the display, instead of doing a piecewise sRGB encoding? I am thinking aloud here, but maybe this is the obvious solution that every other game engine uses and I had just never thought of it, since that’s not how Godot has done it. Or maybe this makes no sense at all. At some point in the next few weeks I might sit down and actually try this out, as this will be important for our HDR output support.

Check this: GitHub - dylanraga/win11hdr-srgb-to-gamma2.2-icm: Transform Windows 11's virtual SDR-in-HDR curve from piecewise sRGB to Gamma 2.2

Edit: some extra info here: An implementation bug in ACM/HDR · Issue #32 · dantmnf/MHC2 · GitHub

Yes, this is my opinion. This way we can get an optically accurate linear render space and reduce unneeded gamma conversions. This should work correctly when using Display P3 for the display too, since it actually also follows sRGB’s reference display behavior.

This would mean any sRGB PBR colour (albedo) textures must be 2.2 power linearized before lighting calculations are performed with them. Is this the way that things are normally handled in the industry? Are “standard” sRGB albedo textures that people generally produce intended to be interpreted this way before lighting calculations are performed on them instead of using the piecewise linearization function?

I thought about this again. Textures should probably still use the sRGB curve, since creators won’t directly display them as pictures but render them with models, and the tools they use to create textures are more likely to use the sRGB curve than a 2.2 curve.