Hi Alex,
thanks for the answer. I am aware of the definitions of the RRT and OCES, but I am struggling to understand them.
Let’s examine the definition in more detail:
So the “perceptual factors” sound very much like “colour appearance phenomena” to me: “factors” we need to account for in order to make the image appear to an observer as if they were perceiving the scene directly.
The “scene” is a very broad concept which needs more definition, in my opinion. In our own image appearance matching tests, we found that the definition of the scene was one of the biggest factors when deriving display rendering transforms (more below).
The definition of the Reference Display is not clear to me either. (I know the words… but what do they mean?)
From documents/LaTeX/TB-2014-002/appendixA.tex
“RRT (Reference Rendering Transform) – Converts the scene-referred ACES2065-1 colors into colorimetry for an idealized cinema projector with no dynamic range or gamut limitations.”
Or from documents/LaTeX/P-2013-001/introduction.tex:
“…Although it is possible to use the Output Color Encoding Specification (OCES) encoding for image storage, it is primarily a conceptual encoding, a common ‘jumping-off point’ for Output Device Transforms (ODTs)…”
Hmm… that does not clarify things for me in practice.
Is the reference display an ideal three-primary display?
Here, “extreme wide gamut” is a bit ambiguous.
Why is a cinema-like environment coded into the Reference Viewing Condition? This might not be ideal if, say, I want to build a perfect pipeline for VR.
(For me, an ideal display would have many more primaries, and all “three-component” approaches would be obsolete anyway.)
Also, how do you verify the isolated performance of the RRT if no such display exists to the present day?
“…Required to convert…” does not really specify the intent of the RRT. Is its intent to match appearance, to satisfy a handful of expert viewers, or something else? And is its performance measure an objective or a subjective one?
Assuming you put colour appearance matching in the ODTs:
I have doubts that you can derive and visually verify “colour appearance models” if the source space is an abstract one (or ambiguously defined). If you encounter discrepancies in colour matching, it is hard to debug: where does the error come from, the RRT or the CAM in the ODT? (A small sketch of this problem follows.)
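To make that concrete, here is a minimal Python sketch (the curves are made-up stand-ins, not the actual ACES CTL transforms) of why an abstract intermediate is hard to verify: two different RRT/ODT pairs can produce identical display output while disagreeing about the OCES values in between, so an end-to-end appearance match alone cannot tell you which half of the pipeline is at fault.

```python
import numpy as np

# Hypothetical stand-ins for an RRT and an ODT -- NOT the actual ACES transforms.
def rrt_a(aces):
    return aces / (aces + 0.18)                    # toy tone compression

def odt_a(oces):
    return np.clip(oces, 0.0, 1.0) ** (1.0 / 2.4)  # toy display encoding

# A second, equally plausible pair: fold a gain into the "RRT"
# and undo it again in the "ODT".
GAIN = 2.0

def rrt_b(aces):
    return GAIN * rrt_a(aces)

def odt_b(oces):
    return odt_a(oces / GAIN)

aces = np.array([0.02, 0.18, 1.0, 8.0])

# Identical on the display...
assert np.allclose(odt_a(rrt_a(aces)), odt_b(rrt_b(aces)))

# ...yet the intermediate "OCES" values differ by a factor of two, and there
# is no physical reference display on which to measure which one is "right".
print(rrt_a(aces))
print(rrt_b(aces))
```

Without a measurable reference display, the split of responsibilities between the two stages is a convention rather than something you can debug against.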
The Scene:
For a CAM, “the scene” could be just another viewing condition. So the problem of transforming from a specific scene-referred image state to a specific display-referred image state could be (and can be) solved by one suitable model alone, I suppose.
I write “specific scene” because there is strong evidence that such a model produces the best results if it alters its parameters based on the specific scene and the specific display viewing condition. A scene with less dynamic range (because of the configuration of the lighting on set) might need a different set of tone-mapping and gamut-mapping parameters than a scene with a lot of dynamic range; a toy sketch of this idea follows.
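As a toy illustration of that point (the function name, the curve shape and the heuristic mapping from dynamic range to contrast are all inventions for this post, not a proposal): a single tone curve whose parameters are derived from the measured scene dynamic range, so a flat interview setup and a high-contrast exterior get different mappings.

```python
import numpy as np

def scene_adaptive_tonemap(rgb, scene_min, scene_max, mid_grey=0.18):
    """Toy tone curve whose parameters depend on the scene's dynamic range.

    rgb       -- scene-referred linear values
    scene_min -- darkest relevant scene exposure (linear)
    scene_max -- brightest relevant scene exposure (linear)
    """
    # Measure the scene dynamic range in stops.
    dr_stops = np.log2(scene_max / scene_min)

    # Made-up heuristic: more scene dynamic range -> stronger compression.
    # In a real model this mapping would be derived from appearance matching
    # tests against the specific scene and display viewing conditions.
    contrast = 1.0 + dr_stops / 20.0

    # Naka-Rushton style curve; mid-grey always maps to 0.5.
    x = np.maximum(rgb, 0.0) ** contrast
    g = mid_grey ** contrast
    return x / (x + g)

rgb = np.array([0.05, 0.18, 1.0, 4.0])

# A ~6 stop scene and a ~14 stop scene get different renderings:
print(scene_adaptive_tonemap(rgb, scene_min=0.05, scene_max=3.2))
print(scene_adaptive_tonemap(rgb, scene_min=0.01, scene_max=160.0))
```

The display side would need the same treatment (peak luminance, surround, and so on); I left it out to keep the sketch short.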
In motion pictures, the situation is relaxed because we have a skilled colourist who can alter the image state with additional operations (this is one aspect of colour grading), but there might be other use cases.
So one fundamental question might come down to: what is the “golden” reference scene for a static colour management framework?
The RRT:
To me, we really should open the discussion about the RRT if we consider increasing complexity in the ODTs.
In many related disciplines, complexity comes with a great penalty in computation cost, manufacturing cost, etc., so complexity should always be avoided. I think such a mindset would be good for the development of the “next ACES”. Every line of code in the image processing pipeline really needs to show a clear benefit. I have doubts that we will see real-time implementations of the actual RRT+ODT code if we increase complexity rather than decrease it.
Removing or redesigning the RRT really is a good opportunity to reduce complexity.
I hope some of this makes sense.
best
Daniele