Output Transform Tone Scale

Ah, I see where the confusion comes from. For me, HDR reference white is not peak nits but diffuse white (I used to call this paper white, but I switched after reading BT.2408-3). Basically, it is the output value of ssts(1.0). I do not consider the inverse EOTF (or the OOTF in the case of HLG) at this point. Instead, my only consideration is where I want this to land in absolute terms, and I can validate that by doing a straight InversePQ(value / 10000) without any further processing.

Here’s the quote from which I take my definition:

The reference level, HDR Reference White, is defined in this Report as the nominal signal level of a 100% reflectance white card. That is the signal level that would result from a 100% Lambertian reflector placed at the centre of interest within a scene under controlled lighting, commonly referred to as diffuse white. There may be brighter whites captured by the camera that are not at the centre of interest, and may therefore be brighter than the HDR Reference White.

Graphics White is defined within the scope of this Report as the equivalent in the graphics domain of a 100% reflectance white card: the signal level of a flat, white element without any specular highlights within a graphic element. It therefore has the same signal level as HDR Reference White, and graphics should be inserted based on this level.
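As a rough sketch of that validation step (not taken from any of the transforms discussed here), the PQ inverse EOTF from SMPTE ST 2084 can be written directly in Python. The function names are made up for illustration, and 203 cd/m² is the nominal HDR Reference White level from BT.2408; any other target luminance works the same way.

```python
# SMPTE ST 2084 (PQ) constants.
M1 = 2610.0 / 16384.0
M2 = 2523.0 / 4096.0 * 128.0
C1 = 3424.0 / 4096.0
C2 = 2413.0 / 4096.0 * 32.0
C3 = 2392.0 / 4096.0 * 32.0


def pq_inverse_eotf(nits):
    """Absolute luminance in cd/m^2 -> PQ signal in [0, 1]."""
    y = max(nits, 0.0) / 10000.0  # normalise to the 10000 nit PQ ceiling
    ym = y ** M1
    return ((C1 + C2 * ym) / (1.0 + C3 * ym)) ** M2


def pq_eotf(signal):
    """PQ signal in [0, 1] -> absolute luminance in cd/m^2."""
    vp = max(signal, 0.0) ** (1.0 / M2)
    num = max(vp - C1, 0.0)
    den = C2 - C3 * vp
    return 10000.0 * (num / den) ** (1.0 / M1)


if __name__ == "__main__":
    # Where does a chosen diffuse white land in the PQ signal?
    for nits in (100.0, 203.0, 1000.0, 10000.0):
        print(nits, round(pq_inverse_eotf(nits), 4))
```

With these constants, 203 nits lands at roughly 58% PQ signal and 100 nits at roughly 51%.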

Only for achromatic R=G=B. For everything else it’s a skew.

I agree 100% that it’s a skew, since we evaluate it per-channel, and that brighter saturated colours have it worse. The stated goal, though, is to have a float3 vector which gives the expected output in absolute luminance if the display gamut were AP1. Given that we still need to perform the AP1->display gamut conversion, it’s more of a rough idea, but it remains useful enough that I’ve been able to fake an SDR version from the PQ output of the HDR transform without inverting everything. I only used PQ->Linear, a few rescale, clip and gamma operations, and Bt2020->Bt709, followed by more gamma adjustments (sRGB monitor curve folded into the lot). Why? XBox GameDVR obligé :slight_smile:

Just a quick post to say that I’ve updated the “ToneCompress” nuke node in my post above.

  • I successfully worked out the math for the inverse transform. (Making progress, huh? :smiley: Couldn’t have done that a year ago!)
  • I fixed a bug with the blue channel

Anyone tried fitting the current HDR transforms with the Michaelis-Menten inspired curve?

This weekend I’ve spent some time more deeply examining the math of the Michaelis-Menten style compression function that @daniele posted. I have a few useful observations and discoveries which I will share here.

I think it is useful to split apart the compression function into discrete components. I keep going back to the simplicity of my first prototype NaiveDisplayTransform. In that prototype it was very easy to calculate a gamut volume compression as display referred output approaches display maximum, because the “shoulder” compression was an isolated operator. In pseudocode:

norm = max(r,g,b)
shoulder_compress = compress(norm)
factor = 1 - shoulder_compress / norm
factor = pow(factor, bias)

factor is then the factor for the lerp towards 1.0 in the rgb ratios. Super simple, and looks better than the hacks I’ve been doing in my last versions of the OpenDisplayTransform where I am using some power bias of the compressed norm as the factor.

But for this approach to work, shoulder compression must be separated from the other components of the compression.
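A minimal Python sketch of the pseudocode above. It assumes a simple Reinhard x / (x + 1) as a stand-in for the isolated shoulder (the prototype's actual compress function is not shown in this post), and it assumes the compressed norm is what rescales the desaturated ratios at the end. Function names are illustrative only.

```python
def compress(x):
    """Stand-in shoulder compression: simple Reinhard (an assumption)."""
    return x / (x + 1.0)


def path_to_white(rgb, bias=1.0):
    """Desaturate towards white as the norm gets compressed.

    rgb  : non-negative scene-linear triplet
    bias : power applied to the lerp factor, as in the pseudocode
    """
    norm = max(rgb)
    if norm <= 0.0:
        return (0.0, 0.0, 0.0)

    shoulder = compress(norm)
    # 0 where the shoulder does nothing, approaching 1 as compression grows.
    factor = (1.0 - shoulder / norm) ** bias

    # RGB ratios relative to the norm, then lerp each ratio towards 1.0.
    ratios = [c / norm for c in rgb]
    mixed = [r + (1.0 - r) * factor for r in ratios]

    # Re-apply the compressed norm to get display-referred values.
    return tuple(m * shoulder for m in mixed)


if __name__ == "__main__":
    print(path_to_white((0.01, 0.0, 0.0)))    # dark red stays saturated
    print(path_to_white((1000.0, 0.0, 0.0)))  # very bright red heads to white
```

Because the factor is derived only from the shoulder, swapping in a different shoulder function changes the path to white without touching contrast or toe.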

So! I started digging in to the math of Michaelis-Menten. Daniele’s compression curve is a combination of 4 main things:

  • A normalization factor (basically a multiply).
  • A shoulder compression to accomplish highlight intensity rolloff.
  • A power function to adjust contrast and do surround compensation.
  • A toe compression to do flare compensation.

Shoulder
The shoulder compression function is the first thing I started digging into. While implementing my Hill-Langmuir sigmoid compression function, I remember reading in the Wikipedia article about it that the Hill-Langmuir equation is a special case of a rectangular hyperbola.

In fact, when n=1 in the Hill-Langmuir equation, the function is a rectangular hyperbola. In its simplest form, a rectangular hyperbola is the function f\left(x\right)=\frac{1}{x}

When n=1 in the Hill-Langmuir equation above, f\left(x\right)=\frac{x}{x+1}. Does this look familiar? Yes, it’s a “simple Reinhard” compression function, which is a hyperbola that is offset such that it passes through the origin.

What is super super interesting about these curves, however, is how they look on an x-log, y-linear plot. Spoiler: It’s a sigmoid!
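A short derivation of why this happens, using the substitution x = e^{u} so that u is the natural log of x (the symbol u is introduced here only for illustration):

```latex
f(x) = \frac{x}{x+1}, \qquad x = e^{u}
\quad\Longrightarrow\quad
f\left(e^{u}\right) = \frac{e^{u}}{e^{u}+1} = \frac{1}{1+e^{-u}}
```

The right-hand side is the standard logistic function of u, which is why the curve plots as a sigmoid on a log x axis. The same substitution applied to \frac{x^{n}}{x^{n}+1} gives \frac{1}{1+e^{-nu}}, so n only changes the steepness of that sigmoid.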

Here is a desmos plot showing this:

Toe
So what happens in the Hill-Langmuir equation above when n>1? We are increasing the strength of the toe compression. I didn’t understand this until seeing it split apart in Daniele’s compression curve, but the function is a skewed parabola: f\left(x\right)=\frac{x^{2}}{x+a}

I got really fascinated by this function because I’ve never really seen parabolic functions used in image processing before, and they have a number of very interesting qualities:

  • Exponential (parabolic?) increase in compression as the x value approaches the vertex of the parabola.
  • Pretty much linear beyond a certain distance from the vertex.
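A small numerical sketch of those two properties, using the function exactly as written above. Algebraically, x^2 / (x + a) = x - a + a^2 / (x + a), so far from the origin it approaches the line x - a, while near the origin the output falls off much faster than the input. The function name and the value of a are illustrative assumptions.

```python
def toe(x, a):
    """Toe compression f(x) = x^2 / (x + a), for x >= 0 and a > 0."""
    return x * x / (x + a)


if __name__ == "__main__":
    a = 0.01
    for x in (0.001, 0.01, 0.1, 1.0, 100.0):
        # Print the output next to the asymptotic line x - a.
        print(x, toe(x, a), x - a)
```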

Parabolic Compression Function
So I spent a few days reading about and playing around with Conic Sections and Parabolas, which I haven’t really investigated since high school many years ago. One super interesting form of this function is as follows: f\left(x\right)=\sqrt{2cx+\left(j^{2}-1\right)x^{2}}, where j is the eccentricity and c is the slope.

Depending on the eccentricity, it smoothly transitions from an ellipse, to a parabola, to a hyperbola. Pretty crazy! :smiley:
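To make the transition concrete, here is a small Python sketch that evaluates the function above and names the conic for a given eccentricity. The helper names are made up for illustration; the classification itself (j < 1 ellipse, j = 1 parabola, j > 1 hyperbola) is the standard one.

```python
import math


def conic(x, c, j):
    """Upper branch of y^2 = 2*c*x + (j^2 - 1)*x^2."""
    radicand = 2.0 * c * x + (j * j - 1.0) * x * x
    if radicand < 0.0:
        # For j < 1 the ellipse is closed, so large x falls outside it.
        raise ValueError("x is outside the curve for this eccentricity")
    return math.sqrt(radicand)


def conic_type(j):
    """Name the conic section for eccentricity j >= 0."""
    if j < 0.0:
        raise ValueError("eccentricity must be non-negative")
    if math.isclose(j, 0.0, abs_tol=1e-12):
        return "circle"
    if math.isclose(j, 1.0):
        return "parabola"
    return "ellipse" if j < 1.0 else "hyperbola"


if __name__ == "__main__":
    for j in (0.0, 0.5, 1.0, 2.0):
        print(j, conic_type(j), conic(1.0, 1.0, j))
```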

Then I got to thinking… (Can you tell I’m really good at getting side-tracked?). Maybe this parabolic function could work really well as a compression function. Back in the gamut mapping virtual working group, the compression function which I liked the look of the most was actually the log compression function. Something about having a more linearly increasing slope over the compressed area, to distribute compressed values more evenly, helped preserve more tonality in affected regions.

This is the desmos plot of that log function

Of course the problem was that there was no closed-form solution for solving for the y=1 intersection (at least not that I could find at the time).

So I spent a little while investigating if there could be a way to create a parabolic compression function which operated in a similar way. It took me a while but I figured out the math to do it. I’ll link desmos plots of my process below in case anyone is interested in this stupidly nerdy stuff:

And finally, in simplified form, with solution for intersection constraint:

f\left(x\right)=\left\{x\ge t:\ c\sqrt{x-t+\frac{c^{2}}{4}}-c\sqrt{\frac{c^{2}}{4}}+t\right\}

where c is the calculated scale factor based on some constraint coordinate which the function must pass through: c=\frac{\left(1-t\right)}{\sqrt{\left(l-t\right)-\left(1-t\right)}}, l is the x coordinate at y=1 that the compression function must pass through, and t is the threshold at which compression starts.
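Here is a direct Python transcription of the two equations above, as a sketch for checking the intersection constraint numerically. The only thing added is the identity branch below the threshold, which is assumed from "t is the threshold at which compression starts"; function names and example values are illustrative.

```python
import math


def parabolic_scale(t, l):
    """Scale factor c so that the curve passes through (l, 1)."""
    if not (t < 1.0 < l):
        raise ValueError("requires t < 1 < l")
    return (1.0 - t) / math.sqrt((l - t) - (1.0 - t))


def compress_parabolic(x, t, l):
    """Parabolic compression: identity below t, square-root curve above."""
    if x < t:
        return x  # assumed: values below the threshold are untouched
    c = parabolic_scale(t, l)
    q = c * c / 4.0
    return c * math.sqrt(x - t + q) - c * math.sqrt(q) + t


if __name__ == "__main__":
    t, l = 0.8, 1.5
    for x in (0.5, 0.8, 1.0, 1.2, 1.5):
        print(x, compress_parabolic(x, t, l))
```

With t = 0.8 and l = 1.5, an input of 1.5 maps to 1.0, and the slope at the threshold is 1, so the curve joins the linear section without a kink.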

I literally just figured this out so I haven’t really tested it yet, but I’m super curious to see how it works for gamut mapping distance compression.

As usual here it is as a nuke node as well:
CompressParabolic.nk (2.7 KB)

I’ll stop rambling about nerd stuff now. Just wanted to share some of the things I’ve been up to in case anyone is interested or finds this useful! :slight_smile:


Why do you need to split this all up?

I suppose I do not have a good reason to split all of the different pieces up into individual operations (besides this making it easier for me to understand more completely).

I guess the only valid reason is to get access to the shoulder compression in isolation from the other steps for the “path to white” factor, which I mentioned above. Although perhaps this is invalid as well, if I’m missing something obvious. (Quite possible).


Several years ago now I had investigated a few of these functions for use as a tonescale (although you have presented quite a few more than I remember). The problem I always encountered was that if I adjusted one part of the curve, too much of the rest of the curve changed. So, for example, if I had the position and slope of mid-gray where I wanted it but then tweaked the shoulder ever so slightly, it would affect both the midpoint and max point slightly. I wanted exact control over each key point that I cared about - and I didn’t want to set a value where I liked it and then need to go back and retweak the parameters if I changed a different value. It was infuriating and felt like pure trial and error curve tuning.

This is one of the reasons I eventually resorted to the SSTS, which is two B-splines joined intelligently at the mid point. It offered me exactly the control over the key points I wanted - the min, mid, and max locations and slopes, as well as the “sharpness” of the bend between min-mid and mid-max.

I’m saying this mainly for context, not to say that B-splines are the only solution. I have been thoroughly enjoying your posts and explorations into other curve options, @jedsmith. If you can find a solution that works and is simpler than the SSTS, that will be fantastic.


Wouldn’t this create a “kink” though? I don’t have an HDR monitor to check, but I was curious about it.


Also, I am linking this related topic here:

Hope this helps,
Chris


Cool stuff, keep it coming!

Quick remark: We did play quite a bit with hyperbolic ones, e.g. tanh. The hyperbola is one of the 4 conic sections: those with eccentricity > 1.

Hi,

I semi-randomly came across this while looking through literature: Hyperbola tone mapping

Cheers,

Thomas


After last week’s meeting, I was curious to see if I could solve @daniele’s tonal compression function for arbitrary mid grey and diffuse white intersections.

After a lot of trial and error (mostly error), I managed something.

  • d_{0} and d_{1} map input middle grey x to output middle grey y.
  • w_{0} and w_{1} map input white x to output white y.
  • p adjusts contrast.
  • t_{0} adjusts toe / shadow / flare compensation.
  • s_{x} and s_{y} are the input and output domain scales for the intersection points.

If t_{0} = 0, the input → output mapping is exact. If t_{0}>0, the input → output mapping will be changed slightly, depending on where the toe pivot p_{t} is set.
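The equations themselves are not reproduced in this post, so the following is only a rough sketch of what solving for two intersections can look like. It assumes a curve of the form y = s_y (x / (x + s_x))^p with no toe (t_0 = 0); that form, the function names, and the example values are all assumptions for illustration and may differ from the actual Nuke node.

```python
def solve_scales(d0, d1, w0, w1, p):
    """Solve s_x, s_y so y = s_y * (x / (x + s_x))**p hits (d0, d1) and (w0, w1).

    Taking the 1/p power of both constraints makes them linear in
    s_x and s_y**(1/p), which gives a closed-form solution.
    """
    D = d1 ** (1.0 / p)
    W = w1 ** (1.0 / p)
    denom = w0 * D - d0 * W
    if denom <= 0.0 or W <= D:
        raise ValueError("no valid solution for these intersection points")
    sx = w0 * d0 * (W - D) / denom
    sy = (D * (d0 + sx) / d0) ** p
    return sx, sy


def tone_compress(x, sx, sy, p):
    """Assumed toe-less curve: input scale sx, output scale sy, contrast p."""
    return sy * (x / (x + sx)) ** p


if __name__ == "__main__":
    # Hypothetical values: 0.18 -> 0.10 and 16.0 -> 1.0 with contrast 1.2.
    sx, sy = solve_scales(0.18, 0.10, 16.0, 1.0, 1.2)
    print(sx, sy)
    print(tone_compress(0.18, sx, sy, 1.2), tone_compress(16.0, sx, sy, 1.2))
```

Under these assumptions both intersections are met exactly, which matches the t_{0} = 0 case described above; adding a toe afterwards is what shifts them.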

I tried for a long time to solve for the intersections with toe pivot adjustment integrated, but the equations become unmanageably complex and beyond my capabilities. With the compression and toe equations separate, they are both much simpler.
I was able to solve the toe equation for 2 intersects at d_{1} and w_{1}; however, this causes undesirable behavior, because as you increase t_{0}, values between d_{1} and w_{1} increase.

I settled on a compromise where the toe compression function pivots around a single value p_{t}, which can be set to taste. If it’s set at 1.0, d_{0} gets a bit darker when t_{0} is increased. If it’s set to 0.1, w_{1} gets a bit brighter when t_{0} is increased.

I’m pretty sure there is a better and/or simpler way to do these things, but I’m just doing what I can with my current math ability. I would be happy to hear any suggestions!

As usual, here’s a Nuke node implementation as well, for testing:
ToneCompress_v2.nk (2.8 KB)

I’ve included a few “eye matched” non-scientific presets for SDR and a few different flavors of HDR.


Great work as ever, @jedsmith. I’ve been experimenting with using that tone scale node in Nuke, combined with the @doug_walker / @garydemos Weighted Yellow Power Norm to create my own version of a naive DRT (nothing against yours – I just wanted to build one myself so I understood exactly how it worked and how to invert it). I’ve also built a K1S1 emulating LMT for it, but not having an HDR monitor, I need somebody else to judge how it looks in HDR.

I have to admit, I have not fully wrapped my head around how it works. It seems currently that the HDR presets all have the same contrast and toe values, which has the effect of making the shadows increasingly crushed as peak luminance rises. But this can be compensated for by changing the toe values.


Power norms do not work.

I do have an HDR monitor, and I’d love to have a look…

I’ve been doing a lot of my own experiments as well with various permutations of Jed’s / Daniele’s / Doug’s / Troy’s / my own stuff too. Nothing too exciting to share.

I’ve also been comparing the HDR and SDR variants of ACES-1 vs TCam2 vs IPP2 vs ALF2 on all sorts of imagery; fire, for instance, looks pretty different in ACES, compared to others.

Yes, Daniele suggested that the toe and contrast parameters could be controlled for by a higher-level model. I’ve only dabbled briefly with it, but I like that it seems to maintain its appearance for different viewing conditions…

Anecdotally, to my eye, the Weighted Yellow Power Norm looks almost exactly like a max(R,G,B) norm, except with slightly darker blues, which serves to almost comparatively ‘denoise’ renders [of plates with grain / noise, not of CG]… maybe I’m not looking at the “right” mixtures, but the WYPN seems to smooth textures in a way max(R,G,B) does not (i.e., with noisy blue channels), and so far I’m digging it.

Norms

To visualize what different norms are actually doing I’ve found it useful to plot hue sweeps.
MaxRGB keeps its shape.

Vector Length / Euclidean distance forms a smooth curve through the secondaries, but the cusp at the primaries is sharp.

Power norm is pretty wonky and really depends on what power you use. This is with 2.38 :confused:

And sweeps in the same order:

Here’s a nuke setup with a bunch of norms and the sweep if anyone wants to play.
plot_norm_sweep.nk (20.6 KB)
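For anyone without Nuke, here is a rough Python sketch of the same idea: evaluate a few norms over a fully saturated hue sweep and compare the values. The norm definitions below are the plain textbook forms and are an assumption; the Nuke setup may weight or normalise them differently, and the Weighted Yellow Power Norm is not included.

```python
import colorsys
import math


def norm_max(rgb):
    """max(R, G, B)."""
    return max(rgb)


def norm_euclidean(rgb):
    """Vector length / Euclidean distance from black."""
    return math.sqrt(sum(c * c for c in rgb))


def norm_power(rgb, p=2.38):
    """Unweighted power norm; assumes non-negative components."""
    return sum(c ** p for c in rgb) ** (1.0 / p)


def hue_sweep(steps=12):
    """Fully saturated, full-value RGB triplets around the hue circle."""
    return [colorsys.hsv_to_rgb(i / steps, 1.0, 1.0) for i in range(steps)]


if __name__ == "__main__":
    for rgb in hue_sweep():
        print(
            tuple(round(c, 3) for c in rgb),
            round(norm_max(rgb), 3),
            round(norm_euclidean(rgb), 3),
            round(norm_power(rgb), 3),
        )
```

In this sweep max(R,G,B) stays constant, while the other two norms rise from 1.0 at the primaries to a maximum at the secondaries.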

Updated ToneCompress Formulation

I found the intersection constraints and behavior in my last post to be a bit inelegant. I kept working on the problem and I’ve come up with something which I think is better.

ToneCompress_v3.nk (4.1 KB)

Changes

I’ve made a couple of changes to my previous approach.

  • Move toe/flare adjustment before shoulder compression in order to avoid changing peak white intersection. This greatly simplifies the calculations for the peak white intersection constraint.
  • Simplify parameters a bit.
  • Remove constraint on output domain scale. This lets us scale the whole curve. This will change middle grey though. Curious to hear thoughts on this. Is it okay for the mid-grey intersection constraint to be altered when adjusting output domain scale?
  • Change shoulder compression function to a piecewise hyperbolic curve with linear section. After a bunch of experimenting, I’ve decided that I like the look of keeping a linear section in the shoulder compression. There is a bit more… skin sparkle.
    Here are a couple of pictures to show what I mean.
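The exact formulation lives in the Nuke node, so what follows is only a generic sketch of a piecewise hyperbolic shoulder with a linear section, not the author's curve. It assumes identity below a threshold t and a Reinhard-style hyperbola above it that approaches a limit asymptotically; both parameter names are hypothetical.

```python
def shoulder_piecewise(x, t, limit):
    """Linear below t, hyperbolic above t, approaching `limit`.

    t     : threshold where compression starts (assumed parameter)
    limit : asymptote of the compressed output, limit > t (assumed)
    """
    if limit <= t:
        raise ValueError("limit must be greater than t")
    if x <= t:
        return x  # linear section: values pass through unchanged
    span = limit - t
    d = x - t
    # Hyperbola with slope 1 at d = 0 and asymptote `span` as d grows.
    return t + d / (1.0 + d / span)


if __name__ == "__main__":
    for x in (0.5, 0.8, 1.0, 2.0, 100.0):
        print(x, shoulder_piecewise(x, 0.8, 1.0))
```

The slope is 1 on both sides of the threshold, so the linear section joins the hyperbola smoothly, and everything below the threshold keeps its original contrast.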


If you change the order of operations, you give up a scene-referred exposure control.

Not sure I would want to give this up.
Also the flare is more display referred I would say.
What was your motivation to change the order?


Hey @daniele, thanks for the feedback!
I hear what you are saying about the flare adjustment being more display referred, and it does make sense.

My motivation was that I was fixated on getting a precise solve for a middle-grey intersection constraint. The math for this solve became more and more complex the later in the chain the flare adjustment was, so I thought it was a decent workaround to put it earlier.

I believe there is still a scene-referred exposure control with this approach (in that desmos graph I linked, adjusting g_1 is a scale on input x). It’s quite possible I’m missing something here.

Based on my testing the max value I would probably use for the toe/flare adjustment in SDR is about 0.01.

At this point I am on the verge of removing the toe from the middle grey intersection constraint and putting it after the shoulder compression, with a pivot at 1.0. It makes the whole thing a lot simpler.

Next week I’m probably going to come back around to the original version you posted and finally understand why you designed it the way you did :stuck_out_tongue: