Debating CAMs

I do agree that in the end colorimetry is just bullshit and it is all a hack. But then, one of the design requirements was to create a “simple” algorithm. And this current CAM Output Transform is arguably one of the most complex pieces of software engineered by man. Ever.

Joking aside, I really appreciate the work and effort, but I still wonder whether the complexity-to-visual-result ratio is worth it, and what this CAM really brings to the table.

Again, the discussion is interesting. So let's keep going! :wink:

Regards,
Chris

1 Like

It is not, by any means. You should dive into the Unreal Engine codebase for some perspective.

The Hellwig et al. (2022) CAM is not adding a lot of complexity; we are talking about roughly 3-4 times the line count of a colour model like ICtCp. ZCAM and CIECAM16 are in the same ballpark. The source of complexity is all the tweaks we are bolting on around the models: tone curve, chroma compression, gamut mapping, etc.

As the main advantage of the CAM is the viewing-conditions management, it could be argued that if we don't use it we should drop the CAM entirely, but that would be dismissing the fact that its perceptual uniformity is good. Another appeal of this specific model is the lightness correlate that accounts for the Helmholtz-Kohlrausch effect (HKE). The model is also mathematically simple and invertible, cf. Hunt's model for comparison.

If we pedal back to some of the requirements, something that was pointed out by many people, yourself included @ChrisBrejon, is that the horrible™ hue skews should be removed. We also talked at length about the desire for highlight desaturation, i.e. a path to white. Those two points can be handled by a good perceptually uniform model (or colour appearance model) with low effort, because brightness/lightness are decorrelated from hue and chroma.
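
To make that concrete, here is a minimal numpy sketch of the idea, which has nothing to do with the actual ACES 2.0 chroma compression: Oklab (using Björn Ottosson's published matrices) stands in for the Hellwig model, and the threshold / strength roll-off is entirely made up. Because hue is its own correlate, scaling chroma alone drives a bright value toward white without rotating its hue:

```python
import numpy as np

# Oklab matrices as published by Björn Ottosson; Oklab stands in here for the
# Hellwig et al. (2022) correlates, and the roll-off below is made up.
M1 = np.array([[0.4122214708, 0.5363325363, 0.0514459929],
               [0.2119034982, 0.6806995451, 0.1073969566],
               [0.0883024619, 0.2817188376, 0.6299787005]])
M2 = np.array([[0.2104542553, 0.7936177850, -0.0040720468],
               [1.9779984951, -2.4285922050, 0.4505937099],
               [0.0259040371, 0.7827717662, -0.8086757660]])

def linear_srgb_to_oklab(rgb):
    return M2 @ np.cbrt(M1 @ rgb)

def oklab_to_linear_srgb(lab):
    return np.linalg.inv(M1) @ (np.linalg.inv(M2) @ lab) ** 3

def path_to_white(rgb, threshold=0.8, strength=4.0):
    """Compress chroma as lightness rises, holding the hue angle constant."""
    L, a, b = linear_srgb_to_oklab(rgb)
    C, h = np.hypot(a, b), np.arctan2(b, a)
    t = np.clip((L - threshold) / (1.0 - threshold), 0.0, 1.0)
    C /= 1.0 + strength * t  # hypothetical roll-off, purely illustrative
    return oklab_to_linear_srgb(np.array([L, C * np.cos(h), C * np.sin(h)]))

# A bright, saturated scene-linear yellow drifts toward white, hue unchanged.
print(path_to_white(np.array([4.0, 3.5, 0.2])))
```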

No one EVER said that this is the perfect approach, and no one said that CAMs are a silver bullet. @Alexander_Forsythe and I were talking about their immaturity and lack of spatial modelling circa 2016 on Slack, nothing to see really…

It is worth remembering that we had a few simpler DRT candidates, but the author retracted them by adopting a license that is incompatible with the Academy and ASWF, so it was natural to fall back on another candidate. If anyone has a simpler and better display rendering transform to offer, then this is the right place and time to do so. Everybody will be super grateful!

Cheers,

Thomas

4 Likes

Not possible. There is no “perceptual uniformity” without considering the spatiotemporal articulation.

You are conflating authorship control with the mechanics of pictures.

TCAMv2, for example, utilized a chromaticity-linear approach (clip distortions notwithstanding) to assert the control aspect, not for the mechanics of picture making.

The assumption is that a CAM can be used to form a picture. Seems like a broad leap.

Some pretty hard revisionist history going on here…

Cheers,
T

Both of these questions are predicated on a certain orthodoxy. “Scene to display”, for example, might be considered as “We have meaningful data, and we seek to reveal it”, which loops back to the tautology of “The role of a camera is to present the stimulus as measured”. This is the precise error Judd, Plaza, and Balcom predicated their work upon.

If folks believe that a discrete-sample-based approach using tristimulus colourimetry can be used for an “appearance” model, it comes down to a very boolean series of questions.

When we look at a high-intensity, pure coloured laser projected onto a wall, does it appear white? If the answer is no, and the discrete-sampling “CAM” predicts such an appearance, then the model is broken. The way that colours attenuate in a picture does not correlate to any visual cognition model.

When we look at a Caucasian face in standard ecological cognition contexts, does it appear “more yellow” and “more pale”? If the answer is no, and the discrete-sampling “CAM” approach manifests this, then the model is broken. The way that Caucasian skin is presented in a picture does not correlate to any visual cognition model.

The idea that a picture is present in open domain tristimulus is an a priori error of logic; the stimulus that we look at in a picture is formed and shaped by the mechanics at work in the picture formation chain. Everything from a well-engineered per-channel mechanic (e.g. Harald Brendel's / ARRI's work), to an inverse 2.2 EOTF encoding, to more detailed efforts, creates something wholly new that is not present in the camera or render colourimetric triplets. Specifically, the happy accident of crosstalk from the per-channel mechanics of dyes and additive light is the thing we would be wise to be actively analyzing more deeply.
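
To put a number on that happy accident, here is a tiny sketch; the Reinhard-style shoulder is only a placeholder for whatever per-channel curve one prefers. Applied independently to each channel, it compresses the channel ratios of a saturated triplet toward 1:1:1, creating desaturation and a hue drift that exist nowhere in the source colourimetric triplet:

```python
import numpy as np

def per_channel_shoulder(rgb):
    """A generic Reinhard-style shoulder applied to each channel independently;
    a stand-in for any per-channel curve, not a specific DRT."""
    return rgb / (1.0 + rgb)

src = np.array([8.0, 2.0, 0.1])   # bright, saturated scene-linear orange
out = per_channel_shoulder(src)

print(src / src.max())            # ~[1.000, 0.250, 0.013]
print(out / out.max())            # ~[1.000, 0.750, 0.102] — ratios pulled toward 1:1:1
```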

I would suggest folks take a critical look at the above questions. It comes down to an either-or scenario:

  1. That a picture is nothing more than emulating appearances of stimuli. If this were even remotely correct, we should expect an intense green laser to appear “white”, or Caucasian skin to appear pale and toward the Tritan confusion line through achromatic in day to day visual cognition.
  2. If we reject the idea that the above effects manifest in standard ecological cognition, yet we can see them manifesting in some model, then the model cannot be behaving as an appearance model, as these effects do not occur in cognition as they do in pictures.

If any of these models were of any utility, one would think that the “simple” problem of achieving an appearance match between Display P3 and BT.709 pictures would be the perfect application of such.

Sadly, every single one of these self-professed “appearance” models provides no solution to this “simple” problem either.

2 Likes

Of course there is; in a complete form, certainly not. Perceptually uniform spaces enable better control over hue paths: one can saturate / desaturate a sky or a colourful light with a hue path that appears more linear. Spaces like IPT et al. have been designed to do exactly that, irrespective of what you think.

We are not using the CAM to render the picture…

Do you think I would be saying that without any data point? You weren’t even there at the time, how could you even know?

troy_s — 2017-01-15 19:20 — has joined #general

This leads me to something more interesting to discuss: given you seem to know better than everybody else, how about, for once, you formally propose your solution to render pictures so that the group can evaluate it?

Is that it?

Cheers,

Thomas

2 Likes

So we agree it is complex. Cool.

To be exact, our comment was: hue skews should be either removed or properly engineered, and not be an accidental result.

Well, there was a slide around September 2021 stating that “ACES is science” to introduce the CAMs.

Maybe it would be worth thinking about why we lost “the author” along the way (and a few brilliant people). Maybe the gaslighting is not helping…

There was this approach from back in the day. Maybe there is more to it than meets the eye.

By looking at the #aces Slack archive, where you stated in 2016:

Regarding the RRT, do you think it would be worth touching base with the Academy and let them know we don’t like it? :smile:

I have to say that it is a bit sad that any comment or criticism towards ACES generates such petty debate. A shame really, because I cannot help but think about the missed opportunity of this VWG.

Again thanks for the hard work and all,
Chris

Sometimes you have to do complex things to get somewhere. Wētā FX's pipeline and processes are complex, but that is what empowered us to do Avatar. Manuka is most certainly more complex than any other renderer out there, but there are reasons for that.

I don't see a problem with stating that CAMs are science; they are part of advanced colorimetry, so I'm not sure what you are trying to say. If it is that the science is incomplete, it certainly is, and this is what makes the field exciting.

Regarding OpenDRT, again I'm not sure what you are talking about. The only thing I remember is Jed expressing that he did not feel it would be right to have OpenDRT licensed under terms compatible with those of the Academy, given how much Daniele inspired its design, so he decided to adopt GPL-3.0 instead. It is a decision that I respect. There is also no animosity in my statement. I don't think anyone in the TAC or the ACES leadership has any grudge against Jed; I certainly don't.

WRT your last paragraph, two things:

  • We did let the Academy know that we did not like the RRT, invited Alex and Scott, and produced the RAT document, which has been the basis of ACES 2.0.
  • You did not ask us whether it was appropriate to quote conversations from the colour-science Slack workspace. The #aces channel has been made public only very recently, and I never got agreement from all the participants for all the history to be public. Please never do that again.

Thomas

3 Likes

Of course there is not. The elasticity of our visual cognition is perhaps deluding us? The reason this is fundamentally impossible is that our visual cognition is a fields-first cognition loop. There are countless examples of the influence of fields, including quite a number of models that provide such a transducer stage, post fields analysis. Citations as per some of the names already offered, and others if anyone sees any use.

If one truly believes that a field-agnostic metric is of any service, one merely needs to look at examples of how the spatiotemporal articulation is a primary stage in our visual cognition, cascading upwards and receiving feedback downwards in the reification-of-meaning process. I have found no better demonstration than Adelson's Snakes.



Given we know from these demonstrations that the spatiotemporal articulation fields are incredibly low in the order stack, we can also hypothesize that the fields and visual cognition will shift with a shift in spatiotemporal dimensions.

Some conclusions one might draw from these demonstrations:

  • Visual cognition, such as the reification of lightness, clearly has field relationships as a primary driver in the reification process, and there is a bit of research suggesting that cognition-based instability is also present.
  • The idea of the transducer, as well as amplification, becomes apparent in the field relationships. This has implications for HDR, for example: the R=G=B peak output of the last diamond set is often cognized as exceeding the tristimulus magnitude of the display in terms of lightness reification.

The last question is in relation to the following demonstration, posted by @priikone, which I believe is the ARRI Reveal system:

We should be able to predict that, at a given quantisation level of the signal, we can induce the cognition of “edge” or “other”. The picture in this case is expressed in some unit of colourimetric-adjacent magnitudes, relative to the observer. Indeed it seems feasible that at some scales the signal is discretized, and an aliasing may be more or less cognized:

Indeed we see a repeating patterned relationship with the classic Cornsweet, Craik, and O’Brien demonstration.

Observers of the picture may cognize:

  1. A “left looks lighter than right” reification.
  2. A “dip” immediately adjacent to the ramps, aka “Mach” band.

If we attempt to attach a metric to this, we might harness luminance metrics. While it can be stated that the transduction / amplification mechanism makes no singular luminous efficacy unit function feasible, for a high-level analysis it seems at least applicable.

At specific octaves, we are able to at least get a semblance of visual cognition reification probability at the “full screen” dimension. For example, at low and middling high frequencies, and assuming a simplistic calibration to something like a 12” diagonal at 18-24” viewing:


We can at least see some degree of hope for a practical utility to help guide our analysis of the signal, one that may aid in locating regions where cognitive scission / segmentation has a higher probability of occurring.
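
As one hedged illustration of what such an analysis could look like (this is not any of the models cited above, just a band-pass decomposition over a luminance proxy), a difference-of-Gaussians response per octave flags regions of high local contrast where scission is more probable; the sigmas are placeholders that would need calibrating to the display dimensions and viewing distance discussed:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

BT709_LUMA = np.array([0.2126, 0.7152, 0.0722])  # crude transduction proxy

def octave_band_energy(rgb_image, sigmas=(1, 2, 4, 8, 16, 32)):
    """Difference-of-Gaussians magnitude per octave over a luminance proxy.

    Large responses flag regions where edge / scission cognition is more
    probable; an analysis aid only, not a model of visual cognition."""
    luma = rgb_image @ BT709_LUMA
    bands = [np.abs(gaussian_filter(luma, s) - gaussian_filter(luma, 2.0 * s))
             for s in sigmas]
    return np.stack(bands)  # shape: (octaves, height, width)

# Usage sketch: flag the top percentile of responses in each band.
# bands = octave_band_energy(linear_image)
# hot = bands > np.percentile(bands, 99, axis=(1, 2), keepdims=True)
```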

For the inquisitive, visual fields play a tremendous role in the reification of colour in our cognition, for which the aforementioned fields-first frequency analysis can provide some insight and predictive capability. There is likely a direct line from this to some of the incredible complexity in the formed picture from Red XMas, for example. The discs in the following are of equal tristimulus magnitude, and follow the patterns outlined above in the transduction / amplification concept.


Broad conclusions:

  • Fields first thinking should be at the forefront of analysis.
  • A general consistency of the viewing field in terms of spatiotemporal dimensions is likely mandatory for evaluating “smoothness” of fields.

This is not what I think. I am a dilettante idiot buffoon who reads vastly wiser and more experienced minds. I don't have an original thought in my body.

It strikes me that the claims of such systems are bogus. That’s just my pure hack opinion. Folks are free to evaluate the evidence and believe what they want.

Not a single tristimulus triplet, in terms of colourimetry as it exists in the open domain data buffer, is ever presented in the form of a spatiotemporal articulation. Not a single one. If we were to apply some colourimetric measurement to the thing we are looking at (the picture / image) versus the colourimetric data in the EXRs, we would find that new samples have been formed.

The whole discussion of hue flights and the attenuation of chroma? That’s a byproduct of the crosstalk from the per channel model, not the higher level lofty idea of the model. It is an accident.

Maybe this is how you personally approach understanding. I do not. I’ve been openly saying forever that I don’t even understand what “tone” means, and I’ve tried to be diligent in exploring concepts and understanding without cleaving to the orthodoxy. So let me be clear:

I have no ### idea how visual cognition works.
I consider picture-texts a higher order of complexity above the basic ecological cognition of moving a body through space.

What I do believe, is that much of the iron fisted beliefs that orbit in some small circles do not afford any shred of veracity under scrutiny.

I have proposed what I have believed to be the best paths for a long while now: attempt to enumerate the rates of change and model them according to a specific metric that holds a connection to the ground truth in question. Try to hook the map (the metric) up to the territory (the specific thing attempted to be measured).

Curves, for example…

The basic mechanics of a curve in a per-channel model are far from “simple” (a small probing sketch follows the list):

  1. A curve does not hold a connection to reified lightness, yet it is analysed as such. It holds a direct link to a metric of luminance in exactly one edge case of R=G=B when applied on a channel by channel basis.
  2. A curve adjusts purity in terms of rates of change, depending on the engineering of the three channels, in the output colourimetry.
  3. A curve adjusts the rates of change of the flights of axial colourimetric angle.
  4. A curve adjusts the intensity in a non-uniform manner, depending on the origin triplet.
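
A small probe of those four mechanics, under loud assumptions: the Reinhard-style shoulder, the BT.709 luminance weights, the (max − min)/max purity proxy and the opponent-axis hue angle are all illustrative stand-ins, not anyone's actual DRT or metrics:

```python
import numpy as np

def curve(x):
    """A generic per-channel shoulder, applied channel by channel."""
    return x / (1.0 + x)

def luminance(rgb):
    return rgb @ np.array([0.2126, 0.7152, 0.0722])   # BT.709 weights

def purity(rgb):
    return (rgb.max() - rgb.min()) / rgb.max()

def hue_angle(rgb):
    # Simple opponent-axis hue proxy, in degrees.
    r, g, b = rgb
    return np.degrees(np.arctan2(np.sqrt(3) * (g - b), 2 * r - g - b)) % 360

probes = {
    "high purity":     np.array([4.0, 0.2, 0.1]),
    "middling purity": np.array([4.0, 2.0, 1.0]),
    "low purity":      np.array([4.0, 3.6, 3.2]),
    "equal energy":    np.array([4.0, 4.0, 4.0]),
}

for name, rgb in probes.items():
    out = curve(rgb)
    print(f"{name:>16}: "
          f"Y {luminance(rgb):5.2f} -> {luminance(out):4.2f}  "
          f"purity {purity(rgb):4.2f} -> {purity(out):4.2f}  "
          f"hue {hue_angle(rgb):6.1f} -> {hue_angle(out):6.1f} deg")
```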

Some questions I believe deserve due diligence:

  1. When considering a triplet of “high purity”, how does the transformed result relate to the curve's rates of change of the above three metrics? “Middling purity”? “Low purity”?
  2. When considering the above three broad classes of “purities”, how do the above track in relation to the equal-energy case, and at what spatiotemporal frequencies?
  3. Are there known methods that can be used to broadly analyse, predict, and estimate where visual discontinuities exist? Could they be leveraged to make predictions in relation to the given curves at given spatiotemporal frequencies?
  4. In the case of negative lobed values, are there broad trends that can be used to coerce those values into the legitimate domain prior to picture formation? Are there “rules” that should be established here? Why?

Of course not.

The reason behind the link you posted is to try and get a handle on how the seemingly simple per-channel mechanics of forming pictures are actually incredibly complex. I have used the experiments and the basic mechanics to glean insight into the surprisingly complex interactions when projected as colourimetry, and tried to get a better understanding of how these rates of change interact with our picture cognition.

I think we all can, or should aspire to, be far better at kindling our understanding of pictures.

5 Likes

I beg to differ…

All those spatial effects are great and well known but:

  • How many pictures do you see every day looking like those that you posted?
  • What makes you think that the mechanics for a complex visual field are exactly the same as with your simple examples? What you are showing here are extreme outliers in a standard distribution of pictures as we author them in the entertainment industry. The effects that you highlighted are mixed, averaged together and dependent on so many factors that it becomes extremely hard to point them out in a specific picture. Can you, for example, identify and show them with precision in the Blue Bar picture?
  • How do you apply the learnings to picture rendering? Which model should we be using?

Because we are pragmatic, need to move forward, and because those critical questions won't be solved anytime soon, the VWG is working with colour models that give better control over hue, among other things.


Would the appearance of the Red Christmas Lights picture be affected by a blue surround or a patch in its centre? I'm certain that it would be. Would the appearance of an overexposed blue sky “desaturated” through that model be affected by a bright purple patch around it? Of course it would! Here is a photograph of yours of such a sky:


Thomas

PS: That last one was admittedly snarky, see it as fair game :slight_smile:

2 Likes

For historical purposes, as I was replying and you deleted your post (which is quite a habit, I must admit):

The proposition is not boolean, I was describing the opposite: spatially induced effects have an infinite range of magnitudes. Those magnitudes form a standard distribution, and it turns out that you actually picked the most extreme outliers as examples.

I then proceeded to take one of those and showed that perceptual uniformity is still a thing, even under the strongest spatial induction, but you still dismiss it, which is quite baffling. No one with normal vision would say that the Oklab and IPT gradients look less perceptually uniform than the CIELab or HSV ones. Do their overall hues change because of the purple induction? Yes, they certainly do.

I asked you to highlight areas in the Blue Bar image where spatial induction has magnitudes similar to your examples. I’m genuinely curious if they can be identified with precision and what should be done with them.

Again, no one denies that spatio-temporally induced effects are important, but I (and plenty of others) gave up on modelling them years ago because it is the hardest problem in vision. The current models (or their extensions), e.g. iCAM06 and Retinex, are not exactly successful either and introduce objectionable artefacts, e.g. haloing. I tend to leave this stuff to researchers while following their work very closely.


From a pure complexity standpoint, we are talking about easily an order (or orders) of magnitude more code, so if the 50-60 lines of Hellwig et al. (2022) are “one of the most complex pieces of software engineered by man. Ever”, well… hold my beer :slight_smile:.

Ultimately, photographers, artists and colorists have always done better work than any spatio-temporal model or algorithm.

This brings back fond memories of when local tone-mapping operator halos were all the rage:

2 Likes

The colour-science Slack is private; ACESCentral is public. I started to write my answer, then you deleted yours. It is a familiar pattern of yours that has wasted my time on numerous occasions; I decided to finish writing this time.

Hello, let's see if we can keep this conversation going in a respectful manner. Thanks!

Hello again, please do not distort or misuse my statements. In my original answer, I was talking about the whole Output Transform, not just the CAM model. And I even mentioned that this was a joke. I don't understand why you keep coming at me about this.

I have watched every single meeting of the OT VWG and my overall sensation is that the complexity involved is getting in the way. I agree that complexity is not an issue per se; it becomes an issue when we cannot handle it.

As it happens, I worked on Avatar (at Framestore) and I also worked at Weta Digital (War for the Planet of the Apes). I would argue that a thousand artists working 60-80+ hour weeks are what allowed those movies to be delivered. I would never say, for instance, that Glimpse “saved” The Lego Movie; Max Liani did.

But again, it does not really matter if things are complex or simple. In the end, those are “bait” words and just a matter of perspective. So you're right: what matters is whether the output transform is working or not… But it strikes me that we go from “ACES is science” to layers of “tweaking” and no one stops and asks “hey, are we going in the right direction?”

Surely I must not be the only one thinking this…

Maybe you should not have generated and shared the archive then? I also agree with Troy that even on a public forum, a post deleted by its author should be respected. Don't you think?

Regards,
Chris

1 Like

We are all tremendously thankful you did.

Manuka was not used on the first movie; it did not exist. People were doing the hours they wanted, and every minute of work has always been paid, which is not the case in London, for example. This is not the right place to debate this anyway.

It is getting personal, as always, and out of hand, as it so often does. Let's then get a bit more personal and give some context to people here.

When I start to reply to Troy on ACESCentral and he deletes his post, again, as he has done numerous times, what do you think I should do? I could let it go, as I always have; the thing is that this time I did not. I wanted to make my point heard, and it needed the context to make sense.

For what it's worth, I read almost all the posts on ACESCentral and Troy is the only person here I have ever seen deleting his replies.

You are bringing up Slack again, so let's dive in there. Do you know why I blocked post editing on our Slack? Maybe not, so read this: we had a disagreement in the past that led him to insult me on one of our channels, writing all sorts of colourful words, editing them, changing them for more, and ultimately deleting his posts. It was very frustrating for me because I unfortunately don't have access to those, as we do not pay for the instance. This could be solved easily, albeit in a costly way. Suffice to say that because of his behaviour we talked about kicking him out. We did not have to, as he excluded himself for a while. I think I can find an email he sent saying he would take some time off and that he loved us. He then came back one day as if nothing had happened. Have I been more cautious and more reactive with him since then? Of course! Does it show? For sure: we have had a lot of disagreements over Slack, Twitter and here. I enjoy them most of the time, except when they start leaning toward personal attacks and insults; we are wandering into that territory now.

Let's finish on the archive: it is not public and has never been; it is purposely password protected. That password, I give to members when they request it. If it were public, it would be on the colour-science website. It is unbelievable that I have to explain that, doubly so when you are also a recipient of this email:

1 Like

No worries. I still think that these messages are useful and that interesting info is shared.

In the end, debates are what make us progress and learn.

A shame it got personal and “colorful” (pun intended).

Have a nice week-end everybody !
Chris

Nothing to disagree about here, which is the reason we have made the channel public from now on.

To get back to the CAMs, I could be wrong, but isn't Pomfort's LiveGrade offering a CAM-based DRT? @Alexander_Forsythe: I think we discussed that last year, no?

1 Like

To get back to the CAMs, I could be wrong, but isn't Pomfort's LiveGrade offering a CAM-based DRT? @Alexander_Forsythe: I think we discussed that last year, no?

I think Pomfort has integration with Colorfront.

Colorfront has something about “Using the Human Perceptual Model for Multiple Display Mastering”

1 Like

Thanks @cameronrad, I will read it, but I think that it is what I was looking for / alluding to!

Is it correct that the ACES 2 DRT will most likely be able to smoothly handle colors within AP1, but not AP0?

Hi,

This is lacking a lot of implementation details, but it seems like they do use it for adapting the image to various display targets, not so much for rendering, which is more like the intended usage for a CAM.

Cheers,

Thomas

1 Like