Output Transform Tone Scale

After last week’s meeting, I was curious to see if I could solve @daniele’s tonal compression function for arbitrary mid grey and diffuse white intersections.

After a lot of trial and error (mostly error), I managed something.

  • d_{0} and d_{1} map input middle grey x to output middle grey y.
  • w_{0} and w_{1} map input white x to output white y.
  • p adjusts contrast.
  • t_{0} adjusts toe / shadow / flare compensation.
  • s_{x} and s_{y} are the input and output domain scales for the intersection points.

If t_{0} = 0, the input → output mapping is exact. If t_{0}>0, the input → output mapping will be changed slightly, depending on where the toe pivot p_{t} is set.
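For anyone following along without Nuke, here is a minimal sketch of how two intersection constraints can be solved exactly when the toe is disabled (t_{0} = 0). It assumes a compression curve of the form f(x) = s_y * (x / (x + s_x))^p, which may differ from the formulation in the attached node. The function names and the example values are placeholders for illustration, not presets from the thread.

```python
def solve_scales(d0, d1, w0, w1, p):
    """Solve s_x and s_y so that f(d0) = d1 and f(w0) = w1.

    Assumed curve (toe disabled): f(x) = s_y * (x / (x + s_x)) ** p.
    This form is an assumption for illustration only.
    """
    # Remove the power so the two constraints become linear in k = s_y ** (1 / p).
    u = d1 ** (1.0 / p)
    v = w1 ** (1.0 / p)
    denom = w0 * u - d0 * v
    if denom <= 0.0:
        raise ValueError("constraints are not compressive enough for this curve")
    k = u * v * (w0 - d0) / denom
    s_x = d0 * (k - u) / u
    s_y = k ** p
    return s_x, s_y


def compress(x, s_x, s_y, p):
    """Evaluate the assumed compression curve for x >= 0."""
    return s_y * (x / (x + s_x)) ** p


# Arbitrary example: grey 0.18 -> 0.10, white 16.0 -> 1.0, contrast 1.2.
s_x, s_y = solve_scales(0.18, 0.10, 16.0, 1.0, 1.2)
print(compress(0.18, s_x, s_y, 1.2), compress(16.0, s_x, s_y, 1.2))
```

With both scales solved this way, the curve passes through the grey and white points to within floating point error, which is the "exact" case described above.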

I tried for a long time to solve for the intersections with the toe pivot adjustment integrated, but the equations become unmanageably complex and beyond my capabilities. With the compression and toe equations separate, they are both much simpler.
I was able to solve the toe equation for 2 intersections at d_{1} and w_{1}. However, this causes undesirable behavior, because as you increase t_{0}, values between d_{1} and w_{1} increase.

I settled on a compromise where the toe compression function pivots around a single value p_{t}, which can be set to taste. If it’s set at 1.0, d_{0} gets a bit darker when t_{0} is increased. If it’s set to 0.1, w_{1} gets a bit brighter when t_{0} is increased.
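The pivot behavior described above can be sketched as follows. This assumes a toe of the form y^2 / (y + t_0), rescaled so that the pivot value maps to itself. That form is an assumption for illustration and is not necessarily what the Nuke node does; the function name is hypothetical.

```python
def toe_with_pivot(y, t0, p_t):
    """Toe compression that leaves the pivot value p_t unchanged.

    Assumed form: y**2 / (y + t0), rescaled by (p_t + t0) / p_t.
    Expects y >= 0, t0 >= 0 and p_t > 0.
    """
    if t0 == 0.0:
        return y  # toe disabled: identity
    return (y * y / (y + t0)) * ((p_t + t0) / p_t)


# Pivot at 1.0: values below the pivot (e.g. 0.1) get darker.
print(toe_with_pivot(0.1, 0.01, 1.0))
# Pivot at 0.1: values above the pivot (e.g. 1.0) get brighter.
print(toe_with_pivot(1.0, 0.01, 0.1))
```

In this sketch everything below the pivot is darkened and everything above it is brightened, which is why the choice of p_{t} decides whether grey or white drifts when t_{0} is raised.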

I’m pretty sure there is a better and/or simpler way to do these things, but I’m just doing what I can with my current math ability. I would be happy to hear any suggestions!

As usual, here’s a Nuke node implementation as well, for testing:
ToneCompress_v2.nk (2.8 KB)

I’ve included a few “eye matched” non-scientific presets for SDR and a few different flavors of HDR.

Great work as ever, @jedsmith. I’ve been experimenting with using that tone scale node in Nuke, combined with the @doug_walker /@garydemos Weighted Yellow Power Norm to create my own version of a naive DRT (nothing against yours – I just wanted to build one myself so I understood exactly how it worked and how to invert it). I’ve also built a K1S1 emulating LMT for it, but not having an HDR monitor I need somebody else to judge how it looks in HDR.

I have to admit, I have not fully wrapped my head around how it works. It seems currently that the HDR presets all have the same contrast and toe values, which has the effect of making the shadows increasingly crushed as peak luminance rises. But this can be compensated for by changing the toe values.

Power norms do not work.

I do have an HDR monitor, and I’d love to have a look…

I’ve been doing a lot of my own experiments as well with various permutations of Jed’s / Daniele’s / Doug’s / Troy’s / my own stuff too. Nothing too exciting to share.

I’ve also been comparing the HDR and SDR variants of ACES-1 vs TCam2 vs IPP2 vs ALF2 on all sorts of imagery; fire, for instance, looks pretty different in ACES, compared to others.

Yes, Daniele suggested that the toe and contrast parameters could be controlled for by a higher-level model. I’ve only dabbled briefly with it, but I like that it seems to maintain its appearance for different viewing conditions…

Anecdotally, to my eye, the Weighted Yellow Power Norm looks almost exactly like a max(R,G,B) norm, except with slightly darker blues, which serves to almost comparatively ‘denoise’ renders [of plates with grain / noise, not of CG]… maybe I’m not looking at the “right” mixtures, but the WYPN seems to smooth textures in a way max(R,G,B) does not (i.e., with noisy blue channels), and so far I’m digging it.

Norms

To visualize what different norms are actually doing I’ve found it useful to plot hue sweeps.
MaxRGB keeps its shape.


Vector Length / Euclidean distance forms a smooth curve through the secondaries, but the cusp at the primaries is sharp.
[Image: norm_euclidean]
Power norm is pretty wonky and really depends on what power you use. This is with 2.38 :confused:

And sweeps in the same order:

Here’s a Nuke setup with a bunch of norms and the sweep if anyone wants to play.
plot_norm_sweep.nk (20.6 KB)
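For readers without Nuke, here is a rough sketch of the same idea in Python: evaluate a few norms over a sweep of fully saturated hues. The power norm here is a plain L^p norm, which is an assumption; the power norm and the Weighted Yellow Power Norm discussed in this thread may be defined differently. All function names are hypothetical.

```python
import colorsys
import math


def norm_max(rgb):
    """max(R, G, B) norm."""
    return max(rgb)


def norm_euclidean(rgb):
    """Vector length / Euclidean distance from the origin."""
    return math.sqrt(sum(c * c for c in rgb))


def norm_power(rgb, p=2.38):
    """Plain L^p norm (assumed definition, for non-negative RGB)."""
    return sum(c ** p for c in rgb) ** (1.0 / p)


def hue_sweep(steps=12):
    """Fully saturated, full-value RGB triplets around the hue circle."""
    return [colorsys.hsv_to_rgb(i / steps, 1.0, 1.0) for i in range(steps)]


for rgb in hue_sweep():
    print(rgb, norm_max(rgb), norm_euclidean(rgb), norm_power(rgb))
```

Plotting each norm against hue gives a quick view of how differently the norms treat primaries and secondaries.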

Updated ToneCompress Formulation

I found the intersection constraints and behavior in my last post to be a bit inelegant. I kept working on the problem and I’ve come up with something which I think is better.

ToneCompress_v3.nk (4.1 KB)

Changes

I’ve made a couple of changes to my previous approach.

  • Move toe/flare adjustment before shoulder compression in order to avoid changing the peak white intersection. This greatly simplifies the calculations for the peak white intersection constraint.
  • Simplify parameters a bit.
  • Remove constraint on output domain scale. This lets us scale the whole curve. This will change middle grey though. Curious to hear thoughts on this. Is it okay for mid-grey intersection constraint to be altered when adjusting output domain scale?
  • Change shoulder compression function to a piecewise hyperbolic curve with linear section. After a bunch of experimenting, I’ve decided that I like the look of keeping a linear section in the shoulder compression. There is a bit more… skin sparkle.
    Here are a couple of pictures to show what I mean.
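To illustrate the last point in the list above, here is a minimal sketch of a piecewise shoulder that stays linear below a threshold and rolls off hyperbolically above it. This is one common way to build such a curve and is not necessarily the formula used in ToneCompress_v3; the function and parameter names are hypothetical.

```python
def shoulder(x, threshold, limit):
    """Piecewise shoulder: linear below threshold, hyperbolic above.

    The hyperbolic section approaches `limit` asymptotically and joins
    the linear section with matching value and slope at `threshold`.
    Expects threshold < limit.
    """
    if x <= threshold:
        return x  # linear section, untouched
    span = limit - threshold
    over = x - threshold
    return threshold + span * over / (over + span)


# Values below 0.8 pass through; values above are compressed toward 1.0.
for x in (0.5, 0.8, 1.0, 4.0, 100.0):
    print(x, shoulder(x, 0.8, 1.0))
```

Because the slope at the join is 1, there is no visible kink where the linear section hands over to the compression.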

If you change the order of operation you give up a scene referred exposure control.

Not sure I would want to give this up.
Also the flare is more display referred I would say.
What was your motivation to change the order?

Hey @daniele, thanks for the feedback!
I hear what you are saying about the flare adjustment being more display referred, and it does make sense.

My motivation was that I was fixated on getting a precise solve for a middle-grey intersection constraint. The math for this solve became more and more complex the later in the chain the flare adjustment was, so I thought it was a decent workaround to put it earlier.

I believe there is still a scene-referred exposure control with this approach (in that Desmos graph I linked, adjusting g_1 is a scale on input x). It’s quite possible I’m missing something here.

Based on my testing the max value I would probably use for the toe/flare adjustment in SDR is about 0.01.

At this point I am on the verge of removing the toe from the middle grey intersection constraint and putting it after the shoulder compression, with a pivot at 1.0. It makes the whole thing a lot simpler.

Next week I’m probably going to come back around to the original version you posted and finally understand why you designed it the way you did :stuck_out_tongue:

Really fascinating stuff! Would you mind posting a link to the EXR image from @ChrisBrejon so I can try it out in your Nuke scripts?

Interesting, what is the reasoning?

You would model flare in the scene-referred domain differently. Also, flare in the scene-referred state is a “per shot” adjustment. A global display glare/flare is typically a fixed operation per viewing condition (you could also do better with more complex image-dependent models). That particular form is also quite display driven, because it does not model flare in a physically correct way but maintains shadow information. So we rate detail over accuracy.

Given I brought up the point in the last meeting, and given several birdies have asked that I post the question here, I figured I’d bump up the recent revival of this thread.

First, a salient quote from Jed Smith:

Even in this valuable post of Jed’s there is an implicit assumption of what “tone” is. These sorts of a priori assumptions will doom any attempt at modelling. What is this particular version of “tone”?

Further down this ladder, I’d like to highlight a quote that @Alexander_Forsythe made in Slack, when I, being a buffoon, kept repeating the question “What is tone?”

I’ll have to look in the usual places to see if it’s already defined but I’ll take a shot off the top of my head. Take with a grain of salt.

Tone mapping: the intentional modification of the relationship between relative scene luminance values and display luminance values usually intended to compensate for limitations in minimum and maximum achievable luminance levels of a particular display, perceptual effects associated with viewing environment differences between the scene and the reproduction, and preferential image reproduction characteristics. Tone mapping may be achieved through a variety of means but is quantified by its net effect on relationship between relative scene luminance values and display luminance values.

That is, if this definition of “tone” is acceptable, does anything that uses the term “tone” (EG: “tone mapping”, “Simple Stage Tone Scale”) actually perform the implied mechanic? Does a “tone map” actually “map tones”?

None of the functional formula examples thus far map “tones” according to this definition. Either this should be considered a show stopper, or further interrogation is required.

Given the above, if we revisit this issue cited by Forsythe, we can ask some further questions…

A few questions:

  1. Is there a conflation between chroma and tone in some of these nuanced discussions?
  2. If there is, how are the two related?
  3. If there is a relationship, can anyone answer, in a clear statement, what “Film magic” did regarding tone, and this nuanced interaction?
  4. Given tone’s importance here, and specifically the “shape” of classical film negative to print to print through resultant density, is everyone confident in the evaluation of film density plots being 100% correlated to emission of light from a fixed chromaticity display?
  5. If not, is it feasible to consider how film density plots correlate to light transmission?
  6. Further along, is the correlation between film dye density and transmission related to discussions of chroma, as per Forsythe’s question?

TL;DR: Perhaps we should be avoiding discussions of “bleaching” or “desaturation” and interrogate precisely what film was doing, and more importantly, what underlying perceptual (?) mechanic was it facilitating that makes the rendition of imagery successful in the medium, and how is it related to “tonality”? With respect to gamut mapping “tones”, the underlying definitions and potential mechanics should be clearly identified to evaluate whether or not any mechanic is actually doing what it purports to achieve.

A while back @Alexander_Forsythe suggested it would be wise to include a discussion of a Jones diagram such as the following. It seems related to this discussion, and to why the “s” curve is even a part of the discussion around “tonality”.


Image from here.

Apologies to anyone who feels this is pedantry. From my dumbass vantage, I cannot see how we can solve any design problem without firmly locating what the problem is, in clear and precise terms. Even given the large amount of discussion on this subject thus far, I’m unconvinced we have definitions clear enough to write any algorithm for.

Ok, so we are talking about display-flare then, might be worth clarifying for people casually reading the thread. As you say it is also very much viewing conditions dependent, how do you quantify it?

Cheers,

Thomas

As I was re-reading that, I could not help but think about the DRT impact on Lookdev. You are effectively doing shading work and adjusting specular response here! :slight_smile:

Could we aim at something like the Filmic from Troy? With several “Looks” for contrast?

Seven contrast base looks for use with the Filmic Log Encoding Base. All map middle grey 0.18 to 0.5 display referred. Each has a smooth roll off on the shoulder and toe. They include:

  1. Very High Contrast.
  2. High Contrast.
  3. Medium High Contrast.
  4. Base Contrast. Similar to the sRGB contrast range, with a smoother toe.
  5. Medium Low Contrast.
  6. Low Contrast.
  7. Very Low Contrast.

Chris

Name a single processing that doesn’t? Default per channel tosses in a grade that varies per colour space, per display, but few seem to comment on that?

Perhaps this is indicative of an entire thought process that isn’t extending from first principles…

Encode for display and adjust exposure… :slight_smile:

That feels like the right track!

So why doesn’t any example provided anywhere on this site aim at that ground truth?

Happy to take a stab at helping us speak a common language.

I think the terms “Tone Mapping” and “Tone Mapping Operator” largely come from the computer graphics side of the coin, whereas in historical terminology the topic as a whole is referred to as “Tone Reproduction” and the “Tone Reproduction Curve”. The former is largely used in the context of Image to Image “mapping”: HDR EXR to SDR JPEG, for example. The latter is largely used in reference to Scene to Image “reproduction”, where both the Scene and resulting Image have “Tone” or “Tone Scale”, but the objective quality metric is the “reproduction” of “Tone” from one domain to another.

I’m sure you could point out many examples of each term used in an alternative manner, but that’s generally how I would look at it.

The term “Tone” itself is not scientific. There is a reason it does not show up in any color appearance models, nor in the CIE Vocabulary (e-ILV | CIE): it is exclusive to the media/imaging community.

I would place the origin of this concept in our community/domain first to Hurter and Driffield and their pioneering work on sensitometry, and later to L.A. Jones for his investigation of Tone Reproduction across the full imaging chain. I’ll leave it to the reader to explore the works of these individuals.

Specifically with respect to Jones and his 1920 “On the theory of tone reproduction, with a graphic method for the solution of problems” (https://doi.org/10.1016/S0016-0032(20)92118-X). He defines the problem space as “…the extent to which it is possible by the photographic process to produce a pictorial representation of an object which will, when viewed, excite in the mind of the observer the same subjective impression as that produced by the image formed on the retina when the object itself is observed…”. He then goes on to say “The proper reproduction of brightness and brightness differences…is of preeminent importance…”. Brightness does have a definition in the color science community, to quote Fairchild “…visual sensation according to which an area appears to emit more or less light”. Jones utilises an explicit simplifying assumption, which I believe is largely implicit in most discussions of “Tone Curves” or “Tone Mapping Operators”. That is, that all objects in the scene are non-selective, and the image forming mechanism is also non-selective. Non-selective meaning it absorbs/reflects all incident electromagnetic energy equally. He uses this simplification because “under such conditions values of visual brightness are directly proportional to photographic brightness.”

There are of course deviations in perceived brightness of the scene and reproduction with respect to many factors, reflective selectivity being one of them, that break the simplifying assumption. However, the simplifying assumption is used both for convenience, and to focus analysis on the primary principal component of Image Reproduction, or more specifically Tone Reproduction. With the added assumption that the dominant illuminant in the scene and the image viewing environment is the adopted illuminant, you could call this problem space more specifically “Neutral Scale Tone Reproduction”.

All that to say, “Tone Reproduction” is largely focused on the reproduction of Scene brightness (and relative brightness) of non-selective objects onto a non-selective Image medium.

I would argue that the implicit assumption (of neutrality / non-selectivity) is generally held by most that discuss the topic, and that the reference to “Film” is significantly more vague and less defined than “Tone”. “Film” is an incredibly varied and complex quantum mechanical piece of technology with over a century’s worth of innumerable manifestations. “Film” has done many things, from imaging my teeth at the dentist, to helping prove Einstein’s general theory of relativity, to being scratched with a knife by an animator and projected.

We would do well to define “Film” with the same rigour as “Tone”, because I don’t believe the objective here is to emulate a single batch of a single brand of photo-chemical film processed on a certain day.
