Both of these questions are predicated on a certain orthodoxy. “Scene to display”, for example, might be read as “We have meaningful data, and we seek to reveal it”, which loops back to the tautology of “The role of a camera is to present the stimulus as measured”. This is the precise error Judd, Plaza, and Balcom predicated their work upon.
If folks believe that a discrete sample based approach using tristimulus colourimetry can be used for an “appearance” model, the question comes down to a very boolean series of questions.
When we look at a high intensity, pure coloured laser projected onto a wall, does it appear white? If the answer is no, and the discrete sampling “CAM” predicts such an appearance, then the model is broken. The ways that colours attenuate in a picture do not correlate to any visual cognition model.
When we look at a Caucasian face in standard ecological cognition contexts, does it appear “more yellow” and “more pale”? If the answer is no, and the discrete sampling “CAM” approach manifests this, then the model is broken. The way that Caucasian skin is presented in a picture does not correlate to any visual cognition model.
The idea that a picture is present in open domain tristimulus is an a priori error of logic; the stimulus that we look at in a picture is formed and shaped by the mechanics at work in the picture formation chain. Everything from a well engineered per channel approach (EG: Harald Brendel’s / ARRI’s work), to an inverse 2.2 EOTF encoding, to more detailed efforts, creates something wholly new that is not present in the camera or render colourimetric triplets. Specifically, the happy accident of crosstalk from the per channel mechanics of dyes and additive light is the thing we would be wise to be actively analyzing more deeply.
I would suggest folks take a critical look at the above questions. It comes down to an either-or scenario:
That a picture is nothing more than emulating appearances of stimuli. If this were even remotely correct, we should expect an intense green laser to appear “white”, or Caucasian skin to appear pale and toward the Tritan confusion line through achromatic in day to day visual cognition.
If we reject the idea that the above effects manifest in standard ecological cognition, and that we can see them manifesting in some model, the model cannot be behaving as an appearance model, as these effects do not occur as they do in pictures.
If any of these models were of any utility, one would think that the “simple” problem of achieving an appearance match between Display P3 and BT.709 pictures would be the perfect application of such.
Sadly, every single one of these self professed “appearance” models provides no solution to this “simple” problem either.
Of course there is, though certainly not in a complete form. Perceptually uniform spaces enable better control over hue paths: one can saturate / desaturate a sky or a colourful light with a hue path that appears more linear. Spaces like IPT et al. have been designed to do exactly that, irrespective of what you think.
We are not using the CAM to render the picture…
Do you think I would be saying that without any data point? You weren’t even there at the time, how could you even know?
troy_s (2017-01-15 19:20): has joined #general
This leads me to something more interesting to discuss: Given you seem to know better than everybody else, how about, for once, you formally propose your solution to render pictures so that the group can evaluate it?
Sometimes you have to do complex things to get something. The Wētā FX pipeline and processes are complex, but that is what empowers us to do Avatar. Manuka is most certainly more complex than all the renderers out there, but there are reasons for that.
I don’t see a problem with stating that CAMs are science, they are part of advanced colorimetry so I’m not sure what you are trying to say. If it is that the science is incomplete, it certainly is and this is what makes that field exciting.
Regarding OpenDRT, I’m not sure again what you are talking about. The only thing I remember is Jed expressing that he was not feeling that it would be right to have OpenDRT licensed with a license compatible with that of the Academy given how much Daniele inspired its design thus he decided to adopt GPL3 instead. It is a decision that I respect. There is also no animosity in my statement. I don’t think anyone in the TAC or ACES leadership has any grudge against Jed, I certainly don’t.
WRT your last paragraph, two things:
We did let the Academy know that we did not like the RRT, invited Alex and Scott, and produced the RAT document which has been the basis of ACES 2.0.
You have not asked us if it was appropriate to quote conversations from the colour-science Slack workspace. The #aces channel has been made public only very recently and I never got agreement from all the participants for all the history to be public. Please never do that again.
Of course there is not. The elasticity of our visual cognition is perhaps deluding us? The reason this is fundamentally impossible is that our visual cognition is a fields-first cognition loop. There are countless examples of the influence of fields, including quite a number of models that provide such a transducer stage, post fields analysis. Citations as per some of the names already offered, and others if anyone sees any use.
If one truly believes that a field-agnostic metric is of any service, one merely needs to look at examples as to how the spatiotemporal articulation is a primary stage in our visual cognition, cascading upwards and receiving feedback downwards in the reification of meaning process. I have found no better demonstration than those of Adelson’s Snakes.
Given we know from these demonstrations that the spatiotemporal articulation fields are incredibly low in the order stack, we can also hypothesize that the fields and visual cognition will shift with a shift in spatiotemporal dimensions.
Some conclusions one might draw from these demonstrations:
Visual cognition, such as the reification of lightness, clearly has field relationships as a primary driver in the reification process, and a bit of research suggests that instability based on cognition is also present.
The idea of the transducer as well as amplification becomes apparent in the field relationships. This has implications for HDR, for example. IE: The R=G=B peak output of the last diamond set is often cognized as exceeding the tristimulus magnitude of the display in terms of lightness reification.
The last question is in relation to the following demonstration, posted by @priikone, which I believe is the ARRI Reveal system:
We should be able to predict that, at a given quantisation level of the signal, we can induce the cognition of “edge” or “other”. The picture in this case is expressed in some unit of colourimetric-adjacent magnitudes, relative to the observer. Indeed it seems feasible that at some scales, the signal is discretized, and an aliasing may be more or less cognized:
A “dip” immediately adjacent to the ramps, aka “Mach” band.
If we attempt to provide a metric to this, we might harness luminance metrics. While it can be stated that the transduction / amplification mechanism makes no singular luminous efficacy unit function feasible, for a high level analysis, it seems at least applicable.
At specific octaves, we are able to at least get a semblance of visual cognition reification probability at the “full screen” dimension. For example, at low and middling high frequencies, and assuming a simplistic calibration to something like a 12” diagonal at 18-24” viewing:
We can at least see some degree of hope of a practical utility to help guide our analysis of the signal, one that may aid in locating regions where cognitive scission / segmentation has a higher probability of occurring.
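To make the idea above a little more concrete, here is a minimal sketch, not the analysis behind the demonstrations in this post. Everything in it is an assumption chosen for illustration: BT.709 luminance weights as the metric, a one-dimensional R=G=B ramp as the signal, and a difference of Gaussians at octave-spaced widths as a crude band-pass. It only shows that a quantised ramp leaves a band-limited luminance residue where a continuous ramp leaves essentially none; it says nothing about where a viewer would actually cognize an edge.

```python
import numpy as np

# Assumption: BT.709 luminance weights stand in for "a luminance metric".
BT709_WEIGHTS = np.array([0.2126, 0.7152, 0.0722])

def luminance(rgb):
    """Weighted sum of linear RGB along the last axis."""
    return rgb @ BT709_WEIGHTS

def gaussian_kernel(sigma):
    """Normalised 1D Gaussian kernel truncated at 4 sigma."""
    radius = int(np.ceil(4 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def blur(signal, sigma):
    """Gaussian blur with reflected borders; output length equals input length."""
    k = gaussian_kernel(sigma)
    padded = np.pad(signal, len(k) // 2, mode="reflect")
    return np.convolve(padded, k, mode="valid")

def band_response(signal, sigma):
    """Difference of Gaussians one octave apart (sigma and 2 * sigma)."""
    return blur(signal, sigma) - blur(signal, 2 * sigma)

def peak_response(signal, sigma):
    """Largest band response, ignoring a border margin affected by padding."""
    margin = int(np.ceil(4 * 2 * sigma))
    return np.abs(band_response(signal, sigma)[margin:-margin]).max()

def quantise(x, levels):
    """Round values in [0, 1] to a given number of evenly spaced levels."""
    return np.round(x * (levels - 1)) / (levels - 1)

ramp = np.linspace(0.0, 1.0, 1024)
rgb = np.stack([ramp, ramp, ramp], axis=-1)   # R=G=B ramp

continuous = luminance(rgb)
coarse = luminance(quantise(rgb, 8))    # 8 levels: wide, tall steps
fine = luminance(quantise(rgb, 64))     # 64 levels: narrow, short steps

for sigma in (2, 4, 8):                 # octave-spaced widths, in samples
    print(sigma,
          peak_response(continuous, sigma),
          peak_response(coarse, sigma),
          peak_response(fine, sigma))
```

The widths are in samples; tying them to a viewing condition (the 12” diagonal at 18-24” mentioned above, for example) would need a pixels-per-degree calibration that this sketch does not attempt.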
For the inquisitive, visual fields play a tremendous role in the reification of colour in our cognition, for which the aforementioned fields-first frequency analysis can provide some insight and predictive capability. There is likely a direct line between this and some of the incredible complexity in the formed picture from Red XMas, for example. Discs in the following are equal tristimulus magnitudes, and follow the patterns outlined above in the transduction / amplification concept.
Fields first thinking should be at the forefront of analysis.
A general consistency of the viewing field in terms of spatiotemporal dimensions is likely mandatory for evaluating “smoothness” of fields.
This is not what I think. I am a dilettante idiot buffoon that reads vastly wiser and experienced minds. I don’t have an original thought in my body.
It strikes me that the claims of such systems are bogus. That’s just my pure hack opinion. Folks are free to evaluate the evidence and believe what they want.
Not a single tristimulus triplet in terms of colourimetry as it exists in the open domain data buffer is ever presented in the form of a spatiotemporal articulation. Not a single one. If we were to apply some colourimetric measurement between the thing we are looking at (the picture / image), versus the colourimetric data in the EXRs, there are new samples formed.
The whole discussion of hue flights and the attenuation of chroma? That’s a byproduct of the crosstalk from the per channel model, not the higher level lofty idea of the model. It is an accident.
Maybe this is how you personally approach understanding. I do not. I’ve been openly saying forever that I don’t even understand what “tone” means, and I’ve tried to be diligent in exploring concepts and understanding without cleaving to the orthodoxy. So let me be clear:
I have no ### idea how visual cognition works. I consider picture-texts a higher order of complexity above the basic ecological cognition of moving a body through space.
What I do believe, is that much of the iron fisted beliefs that orbit in some small circles do not afford any shred of veracity under scrutiny.
I have proposed what I have believed to be the best paths for a long while now; attempt to enumerate the rates of change and model them according to a specific metric that holds a connection to the ground truth in question. Try to hook the map (the metric) up to the territory (the specific thing attempted to be measured).
Curves, for example…
The basic mechanics of a curve in a per channel model is far from “simple”:
A curve does not hold a connection to reified lightness, yet it is analysed as such. It holds a direct link to a metric of luminance in exactly one edge case of R=G=B when applied on a channel by channel basis.
A curve adjusts purity in terms of rates of change, depending on the engineering of the three channels, in the output colourimetry.
A curve adjusts the rates of change of the flights of axial colourimetric angle.
A curve adjusts the intensity in a non-uniform manner, origin triplet depending.
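The four points above can be poked at with a small sketch. This is an illustration under loud assumptions, not anyone’s shipping transform: `curve` is a hypothetical `x / (x + 1)` compression, the origin triplet is arbitrary, and `purity_proxy` and `angle_proxy` are crude RGB-relative stand-ins rather than colourimetric purity or angle. The only claim is directional: applied channel by channel, the curve leaves R=G=B achromatic while it drags a high purity triplet toward achromatic and swings its angle as intensity rises.

```python
import numpy as np

def curve(x):
    """Hypothetical compressive curve x / (x + 1), applied per channel."""
    return x / (x + 1.0)

def purity_proxy(rgb):
    """Crude proxy: 0 when R=G=B, approaching 1 as one channel dominates."""
    return (rgb.max() - rgb.min()) / rgb.max()

def angle_proxy(rgb):
    """Crude RGB-relative hue angle in degrees (not a colourimetric angle)."""
    r, g, b = rgb
    return np.degrees(np.arctan2(np.sqrt(3.0) * (g - b), 2.0 * r - g - b))

origin = np.array([1.0, 0.2, 0.05])    # arbitrary "high purity" origin triplet
exposures = [1.0, 4.0, 16.0, 64.0]     # scaling alone leaves both proxies unchanged

formed = [curve(origin * e) for e in exposures]
purities = [purity_proxy(f) for f in formed]
angles = [angle_proxy(f) for f in formed]

for e, f, p, a in zip(exposures, formed, purities, angles):
    print(f"x{e:>4}: formed={np.round(f, 3)} purity~{p:.3f} angle~{a:.1f}")

# The one edge case where the curve maps cleanly: R=G=B stays R=G=B.
neutral = curve(np.array([0.18, 0.18, 0.18]) * 16.0)
print("neutral:", neutral)
```

Swapping in a different curve or origin triplet changes the rates at which the two proxies move, which is the “origin triplet depending” part.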
Some questions I believe deserve due diligence:
When considering a triplet of “high purity”, how does the transformation to the result relate to the curve’s rates of change of the above metrics? “Middling purity”? “Low purity”?
When considering the above three broad classes of “purities”, how do the above track in relation to the equal-energy case, and at what spatiotemporal frequencies?
Are there known methods that can be used to broadly analyse, predict, and estimate where visual discontinuities exist? Could they be leveraged to make predictions in relation to the given curves at given spatiotemporal frequencies?
In the case of negative lobed values, are there broad trends that can be used to coerce those values into the legitimate domain prior to picture formation? Are there “rules” that should be established here? Why?
Of course not.
The reason behind the link you posted is to try and get a handle on how the seemingly simple per-channel mechanics of forming pictures are actually incredibly complex. I have used the experiments and the basic mechanics to glean insight into the surprisingly complex interactions when projected as colourimetry, and tried to get a better understanding of how these rates of change interact with our picture cognition.
I think we all can, or should aspire, to be far better at kindling our understanding of pictures.
All those spatial effects are great and well known but:
How many pictures do you see everyday looking like those that you posted?
What makes you think that the mechanics for a complex visual field are exactly the same as with your simple examples? What you are showing here are extreme outliers in a standard distribution of pictures as we author them in the entertainment industry. The effects that you highlighted are mixed, averaged together, and dependent on so many factors that it becomes extremely hard to point them out on a specific picture. Can you, for example, identify and show them with precision on the Blue Bar picture?
How do you apply the learnings to picture rendering? Which model should we be using?
Because we are pragmatic, need to move forward and because those critical questions won’t be solved anytime soon, the VWG is working with colour models that give better control over hue and some more.
Would the appearance of the Red Christmas Lights picture be affected by a blue surround or a patch in its centre? I’m certain that it would be. Would the appearance of an overexposed blue sky “desaturated” through that model be affected by a bright purple patch around it? Of course it would be! Here is a photograph of yours of such a sky:
PS: That last one was admittedly snarky, see it as fair game
The proposition is not boolean, I was describing the opposite: spatially induced effects have an infinite quantity of magnitudes. Those magnitudes form a standard distribution, and it turns out that you actually picked the most extreme outlier examples.
I then proceeded to take one of those and showed that perceptual uniformity is still a thing, even under the strongest spatial induction, but you still dismiss it, which is quite baffling. No one with normal vision would say that the Oklab and IPT gradients look less perceptually uniform than the CIELab or HSV ones. Do their overall hues change because of the purple induction? Yes, they certainly do.
I asked you to highlight areas in the Blue Bar image where spatial induction has magnitudes similar to your examples. I’m genuinely curious if they can be identified with precision and what should be done with them.
Again, no one denies that spatio-temporal induced effects are important, but I (and plenty of others) gave up on modelling them years ago because it is the hardest problem in vision. The current models (or their extensions), i.e. iCAM06, Retinex, are not exactly successful either and introduce objectionable artefacts, e.g. haloing. I tend to leave this stuff to researchers while following their work very closely.
From a pure complexity standpoint, we are talking about easily order(s) of magnitude more code, so if the 50-60 lines of Hellwig et al. (2022) is “one of the most complex piece of software engineered by man. Ever”, well… hold my beer.
Ultimately photographers, artists and colorists have always done better work than any spatio-temporal model or algorithm.
This brings back fond memories of when local tonemapping operator halos were all the rage:
The colour-science Slack is private, ACESCentral is public. I started to write my answer then you deleted yours. It is a familiar pattern of yours that wasted my time on numerous occasions, I decided to finish writing this time.
Hello, let’s see if we can keep this conversation going in a respectful manner. Thanks !
Hello again, please do not deform or misuse my statements. In my original answer, I was talking about the whole Output Transform, not just the CAM model. And I even mentioned that this was a joke. I don’t understand why you keep coming at me about this.
I have watched every single meeting of the OT VWG and my overall sensation is that the complexity involved is getting in the way. I agree that complexity is not an issue per-se, it becomes an issue when we cannot handle it.
So it happens that I worked on Avatar (at Framestore) and I also worked at Weta Digital (War for the Planet of the Apes). I would argue that it is 1000 artists working 60-80+ hours that allowed those movies to be delivered. I would never say, for instance, that Glimpse “saved” the Lego Movie; Max Liani did.
But again, it does not really matter if things are complex or simple. In the end, those are “bait” words and just a matter of perspective. So you’re right. What matters is if the output transform is working or not… But it strikes me that we go from “ACES is science” to layers of “tweaking” and no one stops and asks “hey, are we going in the right direction?”
Surely I must not be the only one thinking this…
Maybe you should not have generated and shared the archive then ? I also agree with Troy that even on a public forum, a deleted post by an author should be respected. Don’t you think ?
Manuka was not used on the first movie, it did not exist. People were doing the hours they wanted, and every minute of work has always been paid, which is not the case in London for example. This is not the right place to debate this anyway.
It is getting personal, as always, and out of hand, like quite often. Let’s then get a bit more personal and give some context to people here.
When I start to reply to Troy on ACEScentral and he deletes his post, again, and has done so numerous times, what do you think I should do? I could let it go, and I have always done that, but this time I did not. I wanted to make my point heard and it needed the context to make sense.
For what it’s worth, I read almost all the posts on ACEScentral and Troy is the only person here I have ever seen deleting his replies.
You are bringing up Slack again, so let’s dive in there then. Do you know why I blocked post editing on our Slack? Maybe not, so read this: We had a disagreement in the past that led him to insult me on one of our channels, writing all sorts of colourful words, editing, changing them for some more, and ultimately deleting his posts. It has been very frustrating for me because I unfortunately don’t have access to those as we do not pay for the instance. This could be solved easily, albeit in a costly way. Suffice to say that because of his behaviour we talked about kicking him. We did not have to as he excluded himself for a while. I think I can find an email he sent saying he would take some time off and that he loved us. He then came back one day as if nothing had happened. Have I been more cautious and more reactive with him since then? Of course! Does it show? For sure, we have had a lot of disagreements over Slack, Twitter and here. I enjoy them most of the time, except when they start leaning toward personal attacks and insults, and we are wandering into that territory now.
Let’s finish on the archive: It is not public and has never been, it is password protected on purpose. That password, I give it to the members when they request it. If it was public, it would be on the colour-science website. It is unbelievable that I have to explain that, doubly so when you are also a recipient of this email:
This is lacking a lot of implementation details, but it seems like they do use it for adapting the image to various display targets, not so much for rendering, which is more like the intended usage for a CAM.
One of the goals we have been pursuing (and arguably a rod we have made for our own backs) is that we are trying to gracefully handle values outside of AP1 and AP0, as many real world production camera IDTs will land values outside of those domains.
Both the infamous Blue Bar and Red-Xmas position meaningful picture information outside of both the spectral locus, and in many cases AP0.
Many things would be a lot easier if we simply said “sod it, if your data lands outside the locus, then that’s an IDT problem, not a DRT problem”
Many of the “hacks” we’ve had to implement, like the modified CAM primaries, are specifically about handling these values when they sit in places that are not physically plausible (at least by the normal definition of what data in an ACES AP0 frame is meant to mean).
Wouldn’t it actually be better to focus on the AP0 gamut at best, or even AP1?
Whatever out-of-working-AP1-gamut values a camera has, they are always put back into a working gamut by an RGC now, and hopefully by better IDTs in the future. I’m for sure infinitely far from being the best colorist as an artist, but on the technical side of color I’m relatively good for a colorist (not compared to all of you here of course, but more educated about the technical side than a lot of good, and some famous, colorists). But I still can’t tell you, without checking first, which grading operations would or wouldn’t break out-of-gamut colors. So it’s a must to deal with them at the first step. At least this is what I teach my occasional students in private color grading lessons. And it’s incredibly rare for them to do anything but a straight conversion by a 3x3 matrix, or to even be aware of out-of-gamut colors as a thing that is not specific to Alexa-to-ACES police lights. Using offset over LogC3 and believing it’s identical to Exposure in RAW is another popular misunderstanding. I have almost eradicated the latter among colorists from post-USSR countries by strongly promoting “gain, offset, gamma wheels in linear” for the last couple of years, but the former (out-of-gamut colors as a thing for almost ANY 3x3 conversion) is still a mystery to them. And they are usually really good artists working on big movies, with salaries 2-10 times bigger than mine. Another example is how I tried to explain to a relatively famous Nuke instructor(!) that by default the Color Space in a Read node is actually just a curve and does nothing with primaries. He strongly believed that it also converts primaries to some special Nuke internal color space.
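For anyone who wants to see the “out-of-gamut colors as a thing for almost ANY 3x3 conversion” point in numbers, here is a minimal sketch. It assumes the commonly published AP0 to AP1 matrix (values quoted from memory, so check them against the ACES documentation before relying on them) and a made-up saturated green that is perfectly valid, all positive, in AP0. Nothing here is specific to Alexa or to police lights.

```python
import numpy as np

# Assumption: the widely published ACES AP0 -> AP1 matrix.
# Each row sums to 1, so R=G=B is preserved by the conversion.
AP0_TO_AP1 = np.array([
    [ 1.4514393161, -0.2365107469, -0.2149285693],
    [-0.0765537734,  1.1762296998, -0.0996759264],
    [ 0.0083161484, -0.0060324498,  0.9977163014],
])

def convert(rgb, matrix):
    """Straight 3x3 conversion of one RGB triplet, no gamut handling at all."""
    return matrix @ rgb

# Hypothetical saturated green: valid (all positive) as an AP0 triplet.
saturated_ap0 = np.array([0.05, 0.90, 0.05])
saturated_ap1 = convert(saturated_ap0, AP0_TO_AP1)

# A neutral survives the same matrix untouched.
neutral_ap0 = np.array([0.18, 0.18, 0.18])
neutral_ap1 = convert(neutral_ap0, AP0_TO_AP1)

print("saturated AP0 -> AP1:", saturated_ap1)   # first channel goes negative
print("neutral   AP0 -> AP1:", neutral_ap1)
print("negative channels:", saturated_ap1 < 0)
```

The same thing happens with any matrix that maps a larger set of primaries into a smaller one; the matrix itself never clips or compresses anything.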
Sorry for the long text, but I often see here on the forum how people expect users to have way more technical knowledge than they really have.
So I can’t expect anybody to be careful with those fragile negative out-of-gamut colors during a grading session (especially one paid per hour). It’s faster to deal with them at the first step and later to stick within a working gamut. If someone uses offset over log and creates tons of negative values, it’s on them. These negative values created by the offset wheel contain neutral colors as well, but you don’t add soft-clipping for zero saturation neutral colors to the shadows in the DRT to protect from it. And also Show looks are often baked into LUTs that usually clamp at 0 and 1 anyway.
So my opinion is to stick to the working gamut for the DRT. It’s maybe like 10% of projects where there will be at least one pixel outside of the working gamut. Because the rest use a Show LUT that clamps everything to AP1 ACEScct.
I see a lot more benefits for nice looking images and for the pipeline in developing better IDTs instead of a DRT that is able to handle out-of-gamut colors.
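A small sketch of the offset-over-log point, for illustration only. It assumes ACEScct as the log encoding (constants as I recall them from the ACEScct specification, upper clamp omitted), an arbitrary dark neutral of 0.01, and an arbitrary offset of -0.15; none of these numbers come from a real grade. It shows that an offset in log can push a perfectly neutral shadow below zero in linear, while a gain in linear cannot, and that a clamp at 0 simply removes the result.

```python
import numpy as np

# Assumption: ACEScct constants as commonly published.
X_BRK = 0.0078125
Y_BRK = 0.155251141552511
A = 10.5402377416545
B = 0.0729055341958355

def lin_to_acescct(lin):
    """Linear to ACEScct: linear segment below X_BRK, log2 segment above."""
    lin = np.asarray(lin, dtype=float)
    log_part = (np.log2(np.maximum(lin, X_BRK)) + 9.72) / 17.52
    return np.where(lin <= X_BRK, A * lin + B, log_part)

def acescct_to_lin(cct):
    """ACEScct to linear (upper clamp omitted for brevity)."""
    cct = np.asarray(cct, dtype=float)
    return np.where(cct <= Y_BRK, (cct - B) / A, 2.0 ** (cct * 17.52 - 9.72))

shadow = np.array([0.01, 0.01, 0.01])   # hypothetical dark neutral, linear
offset = -0.15                          # hypothetical offset wheel move, in log

# Offset applied over the log encoding, then decoded back to linear.
after_offset = acescct_to_lin(lin_to_acescct(shadow) + offset)

# "Exposure" as a gain in linear: three stops down, never crosses zero.
after_gain = shadow * 2.0 ** -3

# What a LUT that clamps at 0 would leave of the offset result.
clamped = np.clip(after_offset, 0.0, None)

print("offset over log:", after_offset)   # negative, and still R=G=B
print("gain in linear: ", after_gain)
print("after 0 clamp:  ", clamped)
```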
By the way, what’s the point of AP0 now? AP0 is not used anywhere except storing the source or graded images and exchanging between some departments. But AP1 EXRs can contain negative values as well.