Where should mid-gray end up through an Output Transform?

Well, hello again :wink: It looks like this thread is becoming my favorite one… As requested, I share my naive question about midgray here, hoping that you may find it relevant. As explained before, I am here to make sure team CG is heard… :stuck_out_tongue:

Main question is about midgray. Currently 0.18 scene-referred pegs at 0.10 in display linear light, because of theaters, right ? Do we want to stick to that ? I guess it brings another question : should movie theaters be the main target of the Output Transforms ? With Netflix and 5G being here ?

There have been already some proper answers about this topic. Here are a few :


This seems like a reasonable point to debate. It’s not just because it’s a theatre, but because that’s where SDR content in a theatre is usually placed. In an HDR world it might make sense to move it up … frankly, we can move it per output though so it’s not a super important decision.


and currently the Dolby 108-nit OT places gray at 7.2 nits, while 1000+nit OTs place it at 15 nits. at one point the dolby one was tested at 10 nits, which made it hold to the 10% of peak white that we have in SDR (48:4.8). however, at a certain point that 10% ratio breaks down - you don’t want it at 100 nits for 1000 nit display! additionally, i had to point out that the dolby 108 nit odt only really added a stop of headroom, and by moving gray up a whole stop, you were not really gaining anything in the highlights - hence the 7.2, which allowed for a little more oomph by default but still left a little more headroom for highlights
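
As a quick sanity check on the numbers quoted above, here is a minimal Python sketch that expresses each gray placement as a fraction of peak and in stops. The peak/gray pairs come from the quote; the helper name `stops` and the variable names are purely illustrative, and the "1000-nit" entry assumes exactly 1000 nits of peak for the "1000+ nit" transforms.

```python
import math

def stops(low_nits, high_nits):
    """Number of photographic stops (doublings) from low_nits up to high_nits."""
    return math.log2(high_nits / low_nits)

# (peak white, mid-gray) in nits, taken from the quote above.
placements = {
    "SDR cinema": (48.0, 4.8),
    "Dolby 108-nit": (108.0, 7.2),
    "1000-nit HDR": (1000.0, 15.0),  # assumption: exactly 1000 nits peak
}

for name, (peak, gray) in placements.items():
    print(f"{name}: gray = {gray / peak:.1%} of peak, "
          f"{stops(gray, peak):.2f} stops below peak")

# Extra peak headroom of the 108-nit display over 48-nit SDR cinema.
headroom_gain = stops(48.0, 108.0)
# How much of that headroom is spent by lifting gray from 4.8 nits.
gray_lift_10 = stops(4.8, 10.0)
gray_lift_72 = stops(4.8, 7.2)
print(f"headroom gain: {headroom_gain:.2f} stops")
print(f"gray lift to 10 nits: {gray_lift_10:.2f} stops")
print(f"gray lift to 7.2 nits: {gray_lift_72:.2f} stops")
```

Under these assumptions the 108-nit display gains a bit over one stop of peak, gray at 10 nits would use up roughly all of it, and gray at 7.2 nits uses a little over half a stop, which is consistent with the headroom argument in the quote.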


certainly the BBC team have talked about HLG diffuse white being typically a stop higher than SDR


BBC recommend 203-nit diffuse white, partly I think so HDR doesn’t look darker than SDR when you switch a consumer TV. ITU-R BT.2408 formalises that.


Kind of have the same remark as to whether we still want to have SDR theater as the golden target other OTs should match. I think this was discussed when we talked about preserving intent vs. using the medium to its full extent. I guess with more technically derived OTs these kinds of questions are less relevant.

I am just curious because ACES is now a standard in the CG industry and not only in the movie industry. What about folks watching a TV/Youtube/Vimeo show on their mobile, tablets or screens at home ? Is there such a thing as a completely dark theater (most of them in Paris have ambient lighting coming from the floor/stairs and emergency exits for security reasons…) ?

I have more questions than answers for sure… I guess OT are tricky because they are not only mathematical but also perceptual. More questions will come about :

  • Surround : a pure power function pushes midgray, right ? Is this questionable for imagery ?
  • Refactor existing VS independent new beginning ? I am curious to see if there is agreement on this one. What should we do about per channel lookup and the infamous sweeteners of the RRT (glow module, red modifier and global desaturation) ?
  • max(RGB). I have often been told that it should not be used as a tone-mapper but more of a 2 stage solution, which would include gamut mapping… We are looking at a norm that literally delivers the chromaticity directly at the same energy level, right ?
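
On the surround bullet above, a minimal Python sketch of what a pure power function does to midgray. It assumes luminance normalized so that 1.0 is display peak. The exponent 0.9811 is the dark-to-dim value I recall from the ACES 1.x ODT code, so treat it as an assumption; the function name is hypothetical.

```python
# Assumed exponent for dark-to-dim surround compensation (recalled from
# ACES 1.x; not confirmed anywhere in this thread).
DIM_SURROUND_GAMMA = 0.9811

def surround_power(y_normalized, gamma=DIM_SURROUND_GAMMA):
    """Apply a pure power function to normalized luminance (0..1)."""
    return y_normalized ** gamma

mid_gray_dark = 0.10
mid_gray_dim = surround_power(mid_gray_dark)

# An exponent below 1 lifts every value strictly between 0 and 1,
# while black (0.0) and peak (1.0) stay where they are.
print(f"mid-gray: {mid_gray_dark:.4f} -> {mid_gray_dim:.4f}")
print(f"black:    {surround_power(0.0):.4f}")
print(f"peak:     {surround_power(1.0):.4f}")
```

So with any exponent below 1 the end points are pinned and everything in between moves up, midgray included, which is the "push" the bullet refers to.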

Thanks for your help and insight !



I was re-reading the great ACES_RAE document and having talked to a few supervisors of several animation studios in Europe/Canada, I thought it might be interesting to highlight a few things.

In the ACES RAE document, it is written that :

It is not uncommon to hear people saying they do not like the cumulative effects: crushing effect
on shadows, and heavy highlight roll off, with too much look for a rendering that should be average.

It is the number one thing that I am hearing when I give a hand about ACES: that it is very dark and contrasted. So I thought that pegging 0.18 at 0.18 in display linear light on an SDR display would help with this (rather than 0.10 at the moment).

If I shoot a grey card out in the wild, then I take it home and display it on our iPad or Television, I would expect the middle grey to be pegged at 18% to be honest.
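
To put a number on that proposal, here is a small Python sketch comparing midgray at 0.10 and at 0.18 of display linear light. It assumes an sRGB display with the piecewise encoding; the function and variable names are just for illustration.

```python
import math

def srgb_encode(linear):
    """Piecewise sRGB encoding (inverse EOTF) of display linear light in 0..1."""
    if linear <= 0.0031308:
        return 12.92 * linear
    return 1.055 * linear ** (1.0 / 2.4) - 0.055

current_gray = 0.10   # where 0.18 scene-referred lands today
proposed_gray = 0.18  # the proposed placement

# Size of the shift in photographic stops.
shift_in_stops = math.log2(proposed_gray / current_gray)

print(f"shift: {shift_in_stops:.2f} stops")
print(f"code value at 0.10: {srgb_encode(current_gray):.3f}")
print(f"code value at 0.18: {srgb_encode(proposed_gray):.3f}")
```

Under these assumptions the move is a bit less than one stop, and it takes the encoded value from roughly 0.35 to roughly 0.46.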

Another interesting thing that is mentioned in the ACES_RAE pdf is about Epic Games and Unreal, showing that the spread of ACES is way outside of the cinema/theater scope. For example, the AR/VR examples are often used to justify the need of invertibility. So I imagined the same arguments could be used for pegging mid-gray at 18% on an SDR display.

This is why my question was also about the use of ACES in the CG industry, outside of the movie industry and if we are aiming at an average/universal output transform.

Hope this makes things clearer,



The first question to ask those people is whether, when they made that assessment, they were working with a display and environment calibrated accordingly. The sRGB ODT mandates using the Piece-Wise EOTF and not Gamma 2.2, along with a Dim Surround. Only when those conditions are met is it really possible to say that it exhibits too much contrast, with which I agree btw!
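
To illustrate why the display EOTF matters here, a minimal Python sketch comparing the piecewise sRGB EOTF with a pure 2.2 power function for the same code values. The sample code values are arbitrary and the function names are mine.

```python
def srgb_piecewise_eotf(v):
    """Piecewise sRGB EOTF: encoded value (0..1) to display linear light."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4

def gamma22_eotf(v):
    """Pure power-law EOTF with exponent 2.2."""
    return v ** 2.2

# Arbitrary sample code values from deep shadows to midtones.
for v in (0.05, 0.1, 0.2, 0.5):
    piecewise = srgb_piecewise_eotf(v)
    gamma = gamma22_eotf(v)
    print(f"code {v:.2f}: piecewise {piecewise:.5f}, "
          f"gamma 2.2 {gamma:.5f}, ratio {gamma / piecewise:.2f}")
```

On a display that actually follows Gamma 2.2, the low code values come out noticeably darker than the piecewise curve intends while the midtones are close, so an image encoded for the piecewise EOTF will look more contrasty in the shadows before the Output Transform is even to blame.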

There is no rule stating that a camera should expose an 18% gray card like so; most cameras and their light meters are calibrated with different constants and thus will not produce the same image for the same scene. This is actually one of the core discussion topics of the IDT working group.

I don’t know if this is comparable though, invertibility in the cited AR context is required from a technical standpoint, it has not much to do with aesthetics really.

The differences between SDR Theatrical Exhibition under Dark Viewing Conditions and HDR on a phone display watched under Average Viewing Conditions are so vast that there is no way a universal output transform would work for them. The underlying model could be the same though, e.g. SSTS.



Hey @Thomas_Mansencal , thanks for your answer.

Yes, it is the first thing I checked with them : display and surround. I do remember your numerous answers on this topic on acescentral and I also remember that you agreed with the too much contrast statement. :wink:

I forgot to mention that one issue mentioned by many studios is that producers like to review on their ipads. Sometimes in their car on their way to work. Hence my example. But I also get your point on IDT.

Another thing that made me think about this is the recent announcement from Warner Bros Studios to release on HBO Max their entire 2021 slate of movies. I thought that it was worth mentioning in this conversation.

Would it be possible to spell out the acronyms when you use them ? It is just that I am not very familiar with all the acronyms used and even a google search is not helping in this case. Does it stand for Synchrosqueezing transforms? :wink:

I have more questions than answers obviously and I am happy to discuss and learn on these topics.


SSTS stands for Single Stage Tone Scale, and is the tone mapping function introduced in the 1.1 HDR Output Transforms, replacing the RRT + ODT two stage transform from previous versions, which is still used for SDR Output Transforms.

Provided they don’t do any colour critical review work when that is the case, it should be all right. He could have had his sunglasses on at the time…

I certainly can but where do I start, where do I stop? I used Input Device Transform (IDT) above :slight_smile:




Thank you both for your answers. That’s really appreciated !

yes, this is my first post ever to aces central.
(couldn’t resist - middle grey is my favorite topic.)
(it is also, for the moment, my last post to aces central.)

the mapping of 0.18 in scene space to 0.10 in sdr output referred space
has a long history. in film, a “perfect” 18% grey onset exposure
resulted in a “LAD” (laboratory aim density) on the negative, and this
was printed to a “LAD” of approximately 1.00 over base on the print.
as density == log10(1/transmittance), a density of 1.00 allowed exactly
(wait for it…) 1/10 of the light through the film print -
thus resulting in 0.10 of peak luminance in the theatre.
this 10% peak luminance worked out fine for sdr rec709 home video as well.
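
the density arithmetic above as a minimal python sketch (function names are mine, just for illustration):

```python
import math

def transmittance_from_density(density):
    """Fraction of light passed by film of the given optical density."""
    return 10.0 ** (-density)

def density_from_transmittance(transmittance):
    """Optical density: log10 of the reciprocal of transmittance."""
    return math.log10(1.0 / transmittance)

# LAD patch on the print: density of about 1.00 over base.
lad_transmittance = transmittance_from_density(1.00)
print(f"density 1.00 passes {lad_transmittance:.2f} of the light")

# Round trip back to density as a consistency check.
print(f"back to density: {density_from_transmittance(lad_transmittance):.2f}")
```

each extra 0.30 of density costs about one stop, since 10 ** -0.30 is close to 0.5.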

one of the most amazing confirmations of this being a “good choice”
had to do with the STEM material - this was a bunch of material shot by
ASC DP’s as an early test for the assessment of digital cinema projection.
this wasn’t shot and printed with grey charts - it was lots of different
setups (day/night/warm/cool/etc) that were all graded to “look really good”.

a few years later, lars borg did an analysis of the digital cinema output
referred STEM material - and he calculated that the average linear floating
point RGB pixel values for the ENTIRE stem material was 0.11 0.09 0.11.

  1. this is really frakking close to 0.10
  2. it’s slightly magenta because the DCI white point is slightly green -
    so this green bias was graded out - just by eye - so that the
    results “look really good”.

so that target of 0.10 peak luminance for traditional output referred
displays happens to be…well…let’s say “correct”.
(and the fact that we’re calling 0.18 reflectance the “average scene
luminance” is another talk for another time…)

now we come to the horror^H^H^H^H^H^H joy which we call “HDR”:

  1. there is no standard “peak luminance” - you can get so-called HDR
    displays that peak anywhere from 200 to 3000 nits. and the image is
    supposed to “look right” on all of these.
  2. the creative community is split - one half wants their HDR to look
    basically the same as their SDR with some extra highlight (and occasionally
    shadow) detail, while the other half wants to flaunt and revel in
    the total HDR sandbox (“if you’re gonna give me that space to work in,
    i’m gonna take full advantage of it!”)

so we have a conundrum. the “7.2” and “15” nit targets mentioned by scott
dyer elsewhere in this discussion were essentially a compromise between
the two creative strategies outlined above. and this may be the best we
can do right now - arrive at a “compromise” that will allow the creatives
to get to where they want to go without too much effort.

hopefully, HDR displays will coalesce around some nominal peak luminance,
and then we would be able to discuss “Where should mid-gray end up through
an Output Transform?” for the moment, we should keep 10% of peak luminance
for SDR. maybe once we have a large number of HDR releases of different
subject matter to analyze, lars borg can repeat his analysis and we can
thus determine where the creatives think mid grey should end up in HDR.



Welcome here Joshua,

It is great to have you around and what a great entrance! Thanks for this great post!



Indeed, such a great post ! Thanks for sharing that with us.

I guess what I can’t wrap my head around is if people watch AR/VR or play video games in dark theaters ? Just kidding… But worth debating at least. At least I have learned something today. :wink:

Thanks everyone !


Picking this back up as I’m not sure it was fully concluded and it’s interesting to consider.

To tie a bow on one of Chris’ original concerns, namely how dark and contrasty ACES outputs look: it has been discussed here and in other areas that this likely stems more from the RRT implementation (which by the sounds of things will likely change in the next iteration of ACES) and less from the ODT. While this was the likely cause of the behavior he was seeing, the conversation led down an interesting road.

I don’t deal much with HDR deliverables at the moment, so most of this is theoretical for me (so take it with a grain of salt). It was mentioned somewhere about the diffuse white of HDR being higher than SDR and how this may affect the mid-grey level of HDR in comparison to SDR. From what I remember reading a bit ago, when they were forming HLG a study revealed that most consumer displays were actually being used much brighter than the 100-nit standard, and so the 203-nit diffuse white was a better match for “real world” SDR. I’m pretty sure the original PQ spec was based around the 100-nit SDR standard, but eventually a PQ 203-nit variant was also created that more closely tracked HLG, and appears to be the operating standard now, at least according to BT.2408.

BT.2408 establishes an 18% grey card/mid-grey value at 38% for both PQ and HLG; is there a reason we wouldn’t stick with this in the output transform?
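
For reference, a minimal Python sketch of where those percentages come from. It assumes, as I read BT.2408, that the 18% grey card nominally sits at 26 nits and HDR reference white at 203 nits, and for HLG it assumes a 1000-nit display with a system gamma of 1.2 and zero black level. The PQ and HLG constants are the published ST 2084 / BT.2100 ones; the function names are mine, and the HLG helper is only valid for achromatic (gray) signals.

```python
import math

# SMPTE ST 2084 (PQ) constants.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32

def pq_signal(nits):
    """PQ inverse EOTF: absolute luminance in nits to signal in 0..1."""
    y = (nits / 10000.0) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2

# BT.2100 HLG OETF constants.
A = 0.17883277
B = 0.28466892
C = 0.55991073

def hlg_gray_signal(nits, peak_nits=1000.0, system_gamma=1.2):
    """HLG signal for a gray patch shown at `nits` on the assumed display.

    For achromatic signals the inverse OOTF reduces to a simple power
    function, which is why this shortcut only holds for grays.
    """
    scene = (nits / peak_nits) ** (1.0 / system_gamma)
    if scene <= 1.0 / 12.0:
        return math.sqrt(3.0 * scene)
    return A * math.log(12.0 * scene - B) + C

for label, nits in (("18% grey card", 26.0), ("reference white", 203.0)):
    print(f"{label} at {nits:.0f} nits: "
          f"PQ {pq_signal(nits):.1%}, HLG {hlg_gray_signal(nits):.1%}")
```

With these assumptions the grey card lands at about 38% signal in both systems, and reference white at about 58% PQ and 75% HLG. Note that 26 nits on a 1000-nit display is well below the 10%-of-peak rule used for SDR.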

As far as display brightness (peak luminance) is concerned for HDR, I think we’ll be chasing our tails if we get too concerned with specific display brightnesses outside of established standards or “best practices” (e.g. 1,000 nits peak for HLG is a common reference point). Theoretically any displays not meeting the standard should employ some type of tone (and gamut) mapping, which is outside of our control as the content creators (and admittedly some devices do this better than others). PQ takes this a step further and allows scene (or shot) metadata, theoretically to allow some “creative intent” input to the tone mapping process.

The concept of surround luminance is a whole other topic (and much more complex than I know about), but again, this compensation has to happen at the display device itself and is outside our control. I am perhaps just unaware, but I don’t know of any deliverable or metadata system that would allow differing settings from the content side based on surround illuminance of the display device (e.g. delivering a different “grade” of a film based on the device’s indicated viewing surround environment).

I think remapping based on ambient lighting conditions is very important, and this is precisely what Dolby is trying to accomplish with Dolby Vision IQ. It’s still in its infancy but it sounds promising. I haven’t seen any IQ-specific controls in the DoVi metadata, but presumably that could be included in a future revision.
It would also make sense for mobile devices which all have light sensors built in already. I just learned my LG CX has it built in, but I didn’t notice before because it’s not available in Cinema mode.