RGB Saturation Gamut Mapping Approach and a Comp/VFX Perspective

That said, and to play devil's advocate a little bit, it is probably entirely viable to have official LMTs with hardcoded parameters for the operator and for a particular input device.

Now jumping back to the other side of the fence: the problem with that approach is that it steps on the IDT's toes, and this is simply not great from an atomicity and separation of concerns standpoint: it should be the responsibility of the IDT to deliver colours in AP0 or AP1.

At this stage of the work (and to me), the gamut healing/medicina/compress operator's responsibility is to guarantee that whatever is fed to the RRT or rendering processes is within the working space and spectral locus, nothing more, nothing less. It is our guardian, allowing image processing to continue with sane values.

Cheers,

Thomas

What’s in the original image?
I’m curious about the capture’s original shadows etc.
How does the original look if just wholly desaturated a bit towards some mid gray?

The current proposals intentionally shift the hues.
Would it look better with a non-shifting method?
Best, Lars

I’m going to be a bit verbose here… bear with me. :bear:

Say for example we have an Arri AlexaWideGamut source image that has had the current 3x3 matrix IDT applied to bring the image into an ACEScg working gamut. Say there are out of gamut values in this RGB space.

Say we use the current proposed gamut compression algorithm with equal max distance parameter values (say 0.2). This process will result in each of the RGB components being compressed equally.

If we process this gamut compressed image and view it through the ACES RRT+ODT, and compare it to the stock Arri image display pipeline, there absolutely will be apparent hue shifts. This is because the IDT portion of this processing pipeline has introduced a difference in the RGB ratios, because the chromaticity coordinates of the source gamut (AWG) are not all at an equal distance from the target gamut (AP1).

Therefore, the max distance parameter needs to be biased according to the source gamut.
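
To make that concrete, here is a minimal sketch (plain numpy, made-up pixel values) of the per-channel distance measure the proposal compresses — the same `diff / ach` normalisation that appears in the numpy snippet further down this thread. With identical per-channel limits, every source gamut is treated the same way, even though, as noted above, different source primaries sit at different distances from AP1:

```python
import numpy as np

def distance(rgb):
    # Per-channel distance from the achromatic axis: d = (max(rgb) - rgb) / max(rgb).
    # In-gamut values give 0 <= d <= 1; negative components push d above 1.
    ach = np.max(rgb, axis=-1, keepdims=True)
    return (ach - rgb) / ach

# A hypothetical out-of-gamut ACEScg pixel coming from a wider camera gamut:
rgb = np.array([0.4, -0.05, 0.02])
print(distance(rgb))  # -> approx [0., 1.125, 0.95]
```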

I think that this absolutely depends on the circumstance in which this tool is used.

For gamut compression applied before RRT+ODT, I absolutely agree. A good set of default parameter settings that work well for the most common source images, and are not exposed to the user, would be essential here.

For gamut compression applied in a VFX pipeline by a vendor, this is absolutely not the case. The more parameters and customization the better in this circumstance. The gamut compress operator is likely to be customized for the specific needs of a show, likely shot on a specific set of digital cinema cameras.

For gamut compression in the DI, parameterization should be less technical and more artistically driven. Tweak the apparent hue and saturation after display rendering transform, make it look good. In this circumstance parameters with a good set of defaults would be essential.

Moving forward, I think it might be important to consider which circumstances the work we are doing is targeted at. I haven't heard that discussed much so far in this working group.

With all that said, here is a proposal for a default set of max distance values. I put together a set of max distance values in ACEScg from a variety of digital cinema camera source gamuts. Note that this is based on the assumption that there are no out of gamut values in the camera vendor’s source gamut, which may or may not be the case.

| Gamut | Max Distance R | Max Distance G | Max Distance B |
|---|---|---|---|
| Arri AlexaWideGamut | 1.075553775 | 1.218766689 | 1.052656531 |
| DJI D-Gamut | 1.07113266 | 1.1887573 | 1.065459132 |
| BMD WideGamutGen4 | 1.049126506 | 1.201927185 | 1.067178249 |
| Panasonic VGamut | 1.057701349 | 1.115383983 | 1.004894257 |
| REDWideGamutRGB | 1.059028029 | 1.201209426 | 1.24509275 |
| Canon CinemaGamut | 1.087849736 | 1.210064411 | 1.166528344 |
| GoPro Protune Native | 1.038570166 | 1.138519049 | 1.227653146 |
| Sony SGamut | 1.054785252 | 1.149565697 | 1.003163576 |
| Sony SGamut3.Cine | 1.072079659 | 1.198700786 | 1.026392341 |
| Max | 1.087849736 | 1.218766689 | 1.24509275 |
| Average | 1.059741957 | 1.172203864 | 1.111424208 |
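
For reference, here is a rough sketch of how numbers like these could be derived, assuming the colour-science Python package (the colourspace names depend on your version) and assuming the maximum distance is reached at the source primaries. The actual helper in the tool, and the chromatic adaptation it uses, may differ, so treat the output as ballpark only:

```python
import numpy as np
import colour

def max_distances(source="ALEXA Wide Gamut", target="ACEScg"):
    # Convert the pure source primaries into the target working space...
    src = colour.RGB_COLOURSPACES[source]
    dst = colour.RGB_COLOURSPACES[target]
    primaries = colour.RGB_to_RGB(np.identity(3), src, dst)
    # ...and take the per-channel maximum of the inverse-RGB-ratio distance.
    ach = np.max(primaries, axis=-1, keepdims=True)
    return np.max((ach - primaries) / ach, axis=0)

print(max_distances())  # in the ballpark of the Arri AlexaWideGamut row above
```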

With a few outliers (e.g. REDWideGamutRGB), there are some common trends here. With some padding, a sane set of default max distance values might be something in the realm of 0.09, 0.24, 0.12 (expressed as distance beyond the gamut boundary, i.e. limits of 1.09, 1.24 and 1.12). These numbers were arrived at by looking at both the averages and maxima of the above distances, as well as by evaluating different settings on the source imagery that we have available to work with.

All of the test images so far visually subjectively “look pretty good” with these settings.

I know what @Thomas_Mansencal is going to say - “we should not care about how it looks at this stage, the only thing we should do is compress all values into gamut” - but what if it looks bad after the view transform is applied? We can’t really go back and fix it. And again, I think this statement depends on the context in which this tool is applied.

Please correct me if any of my assumptions are wrong and I’m curious to hear thoughts on this! :slight_smile:

Those aren’t necessarily the max distances if you include the noisy shadows. Here are the ACEScg values (no gamut mapper) for a particularly bad pixel in a dark area of Fabián Matas’ ALEXA Mini nightclub shot:

>>> import numpy as np
>>> RGB_ACEScg = [-0.00624, -0.00199, 0.00006]
>>> ach = np.max(RGB_ACEScg)
>>> diff = ach - RGB_ACEScg
>>> diff
array([ 0.0063 ,  0.00205,  0.     ])
>>> diff_norm = diff / ach
>>> diff_norm
array([ 105.        ,   34.16666667,    0.        ])

When max(rgb) is very small, normalising the distance by dividing by it produces very large numbers. I assume this is the logic behind the shadow roll-off. But this means that negatives remain in the noise floor, requiring them to be dealt with in comp using another approach. Indeed, if all three channels are negative, the normalised distance is negative (because you are dividing by ach = max(r, g, b), which is still negative), so no roll-off ever gets applied there.
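
For completeness, here is the same calculation for a hypothetical pixel where all three channels are negative (made-up values, continuing the snippet above):

>>> RGB = np.array([-0.00624, -0.00199, -0.00006])   # all three channels negative
>>> ach = np.max(RGB)                                 # still negative: -6e-05
>>> (ach - RGB) / ach                                 # -> approx [-103., -32.17, -0.]

The normalised distances come out negative, so (as noted above) no roll-off or compression gets applied and the negatives pass straight through.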

Is this a problem? It certainly needs discussing. It doesn't produce the coloured artefacts that are the most obvious issue we are combatting. But negative values are always potentially problematic, particularly if they are not immediately obvious.

This is similar to what I showed with the fire image during last night’s meeting. But it’s useful to show that it’s still an issue with a high end camera, not just a mid-range one like the FS-7.


In my experience, negative values in dark grainy areas are very common. This is something compositors are used to dealing with, and generally speaking these negative values are not problematic (unless a compositor is inexperienced and does not know how to deal with it properly).

Yes exactly. In my humble opinion, negative values in grainy areas below a certain threshold should not be considered as out of gamut colors or even colors at all. If the max pixel value of an rgb triplet is 0.000006, should this really be considered a color?

Another question I have been wondering about is exposure agnosticism. How important is this feature, and why? The shadow rolloff approach used to reduce the invertibility problems caused by very high distances in dark areas of grain does introduce a small change in the behavior of the algorithm when exposure is adjusted. However, this change is not excessive or even significant in the testing I have done. I would be curious to hear other opinions on this …
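
To illustrate why (a hedged sketch in plain numpy; the roll-off below is a made-up toy, not the actual implementation): the distance itself is exposure invariant, because scaling the pixel scales numerator and denominator equally, whereas any roll-off keyed on the absolute value of max(rgb) necessarily is not:

```python
import numpy as np

def distance(rgb):
    ach = np.max(rgb, axis=-1, keepdims=True)
    return (ach - rgb) / ach

def rolled_off_distance(rgb, shadow=0.03):
    # Toy shadow roll-off: fade the distance out as max(rgb) approaches zero.
    ach = np.max(rgb, axis=-1, keepdims=True)
    weight = np.clip(ach / shadow, 0.0, 1.0)    # depends on the absolute level...
    return weight * (ach - rgb) / ach           # ...so exposure no longer cancels out

rgb = np.array([0.4, -0.05, 0.02])
print(distance(rgb), distance(rgb * 16))                         # identical: exposure cancels
print(rolled_off_distance(rgb), rolled_off_distance(rgb / 100))  # not identical any more
```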

It is certainly important to look at images, but it has to be done as the last step IMHO. Here is, slightly doctored, what I was writing on Slack last week:

I’m taking the exercise with three principles:

  • The outliers we are trying to bring inside the working space are naturally narrow-band and extremely saturated sources, so I want to preserve that quality at all costs; we should not be taking too many creative decisions at this stage, and I don't think it is the purpose of this group.
  • I would like the solution to be defect-free, as if we were doing a plane with very smooth curvature. I was playing with qualifiers the other day, and curves with a kink can produce defects when grading.
  • It has to be elegant; the RGB method is extremely elegant in that sense.

I think the first point can be assessed without any visual tests: if the compression function we pick reduces colour purity too much, it is not good.

The second point is a bit harder: you take it from a mathematical standpoint and enforce/strive for C2 continuity (as much as possible), which Tanh/Arctan or a custom spline can do but not the others, as you get a kink with them.
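
To make the kink point concrete, here is a toy numerical check (my own illustrative parameterisation with identity below a threshold and an asymptote at 1.0, not the actual operator code): the second derivative of a simple Reinhard-style curve jumps at the threshold, while the tanh-based one changes smoothly.

```python
import numpy as np

THR = 0.8  # illustrative threshold

def reinhard(d):
    # Identity below THR, Reinhard-style roll-off above, asymptote at 1.0.
    return np.where(d < THR, d, THR + (d - THR) / (1.0 + (d - THR) / (1.0 - THR)))

def tanh_curve(d):
    # Identity below THR, tanh roll-off above, asymptote at 1.0.
    return np.where(d < THR, d, THR + (1.0 - THR) * np.tanh((d - THR) / (1.0 - THR)))

def second_derivative(f, x, h=1e-4):
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2

for f in (reinhard, tanh_curve):
    print(f.__name__, second_derivative(f, THR - 1e-3), second_derivative(f, THR + 1e-3))
# reinhard: curvature jumps from ~0 to about -10 across the threshold (the kink),
# tanh_curve: curvature stays essentially continuous (~0 to ~-0.05).
```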

The last point is easy: less code, fewer operations, and simple design primes :slight_smile:

Now for the context(s): there are many, and it is entirely viable to use the Gamut Mapping operator in pipelines other than the strict ACES one, especially because it is agnostic. However, in the context of ACES, where the transformations are immutable and without user control, we unfortunately don't have too much choice.

This resonates with this post I just made here: Notice of Meeting - ACES Gamut Mapping VWG - Meeting #15 - 6/4/2020 - #6 by Thomas_Mansencal.

Let’s assume that we have determined good thresholds for the various cameras and we ship a dozen LMTs for them. In one year, the IDT Virtual Working Group proposes new IDTs that bring values into AP1 or extremely close to it. We would have to ship another set of Gamut Mapping LMTs. The subsequent year, the RRT & ODT Virtual Working Group proposes changes that make the parameterisation of the Gamut Mapping operator sub-optimal, and we then have a dozen LMTs to update in a backwards-incompatible way.

This also supports the reasoning that while it is critical to perform visual assessment, it should certainly not be our single prime metric, because what looks great today might simply not tomorrow.

Cheers,

Thomas

I don’t disagree that in an ideal world, giving options to tweak and customize per shot, per plate even, to our heart’s desire is the way to fix all gamut issues. I agree that with this tool and some tweaking, you can get really great results across the board, especially with different cameras.

However, it isn’t this algorithm’s job to handle / compensate for disparate sources. That’s the IDT’s job. Only in the IDT can we truly solve per-camera/sensor/source gamut issues. In an ACES scene-referred world, where data handed to the algorithm is in an AP0 container, our best bet is to be as agnostic as possible. As stated in the original proposal for this group and in our progress report, we acknowledge that this isn’t a one-size-fits-all problem, and the work we do can only really help 80-90% of generalized use cases.

We are stuck in the in-between, between known input and known output. Our work plays a crucial role in helping ease the path of the future RRT/ODT group, and could possibly be rendered irrelevant by future improvements in the IDT space. I acknowledge this isn't as flexible as we would probably like, but here we are. I'll also echo that the replies on the post @Thomas_Mansencal mentioned are super relevant here.


I agree. In the context of an immutable OCIO implementation for example, you would want the best default values to handle the most common scenarios.

However, I believe there is good reason to optimize for the most common scenarios.

As I outlined in my post above, data that has been transformed into an ACES 2065-1 container from different digital cinema cameras will not be the same, especially in the values that are out of gamut. There will be out of gamut values with different maximum distances and biases.

I think it would be worth taking a look at the current most common digital cinema cameras in use by the industry, and making a set of default max distance limit values that work well on average with data from these cameras.

I believe that doing something like this will give us better looking results than compressing equally in all directions from half-float inf to the gamut boundary.

Quick development update:
I made a few additional changes to the gamut-compress master branch.

  • On the Nuke node and the Nuke Blinkscript Node:

    • Add helper functions for calculating max distance limits given a source gamut or an input image.
    • Set default distance limit values according to the average of popular digital cinema cameras.
  • As suggested a couple of meetings ago, I modified the threshold parameter to be adjustable for each RGB component.
  • On the Resolve DCTL I’ve fixed a bug with the shadow rolloff parameter not working.

  • Thanks to some excellent help from @Jacob I’ve added a Fuse version of the GamutCompress tool for Blackmagic Fusion. It works in Fusion Studio and in Resolve Lite or Resolve Studio’s Fusion page. No watermark, fully functional.

  • For the record, both the Nuke node and the Fuse are fully functional in Nuke Non-Commercial, and Blackmagic Resolve Lite.


The problem with picking the average for a given set of cameras is that at best you cover the average cases, those close to the center of the data distribution. All the cases on the far side of the distribution will not work, and we are somehow back to square one with users complaining, etc. If anything, we should only consider the outermost outliers here, but we still won't be in a position where we can guarantee that all the data is within AP1 ∩ Spectral Locus.

Maybe the solution for immutable pipelines is to have an infinite (HalfFloat) operator that gives the guarantee that data ⊂ (AP1 ∩ Spectral Locus), and another one optimised for, say, something like FilmLight E-Gamut that does not give that guarantee but yields better results for the common cases. People could blend between the two outputs if they wanted… :slight_smile:

Cheers,

Thomas

If using tanh compression you don’t need to go as far as infinity. Strictly tanh(x) tends to 1 as x tends to infinity. But for practical purposes the tanh of any value above 19 is 1.0.


ArcTanh[1] is Infinity
Next value down in half float (do we get 12 bits of precision in half float, or 11?):
ArcTanh[1. - 2^-12] is merely 4.5
ArcTanh[1. - 2^-11] is 4.15
Thus, in practice we can only invert to 4.5. This surprises me, as that's not very much.
Is that right?
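
A quick numpy sanity check of the same thing (half float has 11 significant bits, so the largest representable value below 1.0 is 1 - 2^-11):

```python
import numpy as np

largest_below_one = np.float16(1.0) - np.float16(2.0 ** -11)  # biggest half float < 1.0
print(largest_below_one)                                      # 0.9995
print(np.arctanh(np.float64(largest_below_one)))              # ~4.16
print(np.arctanh(np.float64(1.0 - 2.0 ** -12)))               # ~4.5, the figure quoted above
```

So if the tanh output ends up stored in half float, anything much beyond ~4.5 collapses to exactly 1.0 and can no longer be inverted, which is where that figure comes from.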

Shouldn’t we be evaluating the full compression function here instead of one of its component? How can we compare anything otherwise?

We should, and I think 4.5 is then an overestimate.

I ran some numbers, to be double-checked because I’m tired.

I’m generating the Float16 domain (HFD) above 1 and finding the point at which the difference between a compressed value and the next one is 0. tanh is looking a bit sad here; not that the other ones are great either. atan is a good compromise:

argmax(f(HFD[x]) - f(HFD[x + 1]))

| Threshold | tanh | atan | simple |
|---|---|---|---|
| 0.0 | 6.58594 | 7376.0 | 23168.0 |
| 0.1 | 6.02734 | 6636.0 | 14744.0 |
| 0.2 | 5.46875 | 5900.0 | 13104.0 |
| 0.3 | 4.91016 | 3650.0 | 8108.0 |
| 0.4 | 4.35156 | 3130.0 | 6952.0 |
| 0.5 | 3.61914 | 1844.0 | 5792.0 |
| 0.6 | 3.0957 | 1476.0 | 3278.0 |
| 0.7 | 2.57227 | 783.0 | 1738.0 |
| 0.8 | 1.97852 | 369.5 | 820.0 |
| 0.9 | 1.48926 | 93.0625 | 205.625 |
| 1.0 | 1 | 1 | 1 |

https://colab.research.google.com/drive/1f-5A-u7hqklDHYoWNLsQVSsy8MCUk6Js?usp=sharing

PS: Keep in mind that no particular distance limit is set (e.g. tanh defaults to inf), and that the threshold values are not really comparable between the functions; tanh and atan can go lower than simple.


Interesting. This certainly has serious implications for invertibility. The advantages of the C2 continuity of tanh need to be balanced against the fact that it becomes non-invertible so soon.

15 posts were split to a new topic: Gamut mapping compression curves

Yeah, atan is the next one that is C2 continuous and maintains saturation to a high level.

Hey @JamesEggleton,

It should be trivial to add them in the notebook, everything runs online!

I believe that preservation of “color purity” or “saturation” is a by-product of the threshold, not the compression function.

Tanh happens to maintain saturation at a high level because the initial slope is steep.

However, if the threshold is set with the intention of preserving high saturation, the same result can be achieved with the simple Reinhard compression function.

Here’s an example using the beautiful lambertian sphere :smiley:


This is the original with out of gamut values rendered through the ACES Rec.709 view transform.


Here’s the same image with tanh compression, with a threshold value of 0.4.


And the same image with Reinhard compression with a smaller threshold of 0.21. Note that the color purity of the blue sphere is pretty similar to the tanh compression curve.

(As a side-note if you look closely at the left cyan blob you can make out the more sudden transition of the Reinhard curve. I’m not sure if this is a by-product of the lack of C2 continuity or just the more abrupt transition of the curve from linear to compressed.)


And finally here’s a plot comparing tanh and reinhard.

Note that distance here is a parameter that specifies the distance beyond 1.0 to compress to 1.0. So a value of 0.24 compresses a distance of 1.24 to the gamut boundary.
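
For reference, here is a minimal sketch of what that parameterisation looks like: an illustrative Reinhard-style curve that leaves everything below the threshold untouched and maps the limit (1.0 + distance) exactly onto the gamut boundary. This is my own illustrative formulation of the idea, not necessarily the exact curve in the tool:

```python
import numpy as np

def reinhard_compress(dist, threshold=0.2, distance=0.24):
    # Identity below `threshold`; above it, roll off so that a distance of
    # 1.0 + `distance` (here 1.24) lands exactly on 1.0, the gamut boundary.
    thr, lim = threshold, 1.0 + distance
    x = dist - thr
    compressed = thr + 1.0 / (1.0 / x + 1.0 / (1.0 - thr) - 1.0 / (lim - thr))
    return np.where(dist < thr, dist, compressed)

d = np.array([0.1, 0.8, 1.0, 1.24, 1.5])
print(reinhard_compress(d))
# -> approx [0.1, 0.71, 0.85, 1.0, 1.15]; the limit 1.24 lands exactly on the
#    boundary, while distances beyond the limit still end up outside it.
```

In this sketch, anything past the chosen limit still ends up outside the boundary, just less far out than it started.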

I am following this conversation with great interest. It is all amazing work you guys do.
While it is important to solve the bit with the 1D mapping of RGB ratios, it is not the last bit.
We need to remember that the neat trick of constructing RGB ratios might have reduced the 3 dimensional problem to 1D at first sight, but I am afraid there are more things to explore.

Here are a few questions or thoughts:

  • How does the Gamut Mapping algorithm handle motion graphics with data that is very different to natural image statistics?
  • What does a red logo look like if the whole logo is out of gamut along the primary vector (but still with some scatter)?
  • How well can we pull a key of saturated greens (slightly moved towards cyan because of non-ideal white balance) if we apply the gamut compression early in the pipeline?
  • What happens with feathers of wrongly premultiplied saturated text (yes, this happens sometimes)?
  • How well can you grade through such a process? If I crank up naive saturation before, does it react well?

This might help us to answer the questions about order, mandatoriness (is this a word?), etc.

My thoughts on the 1D mapping:
On natural images all of the proposed shaper functions might work. I think a bit of overshoot is actually not a problem; I would rather give the next process block a bit of flesh to work with before flattening any data. If you leave a bit of topology in the data, it is easy for other steps to further sand it down. It is much harder to unflatten flat areas.
