A single fixed Output Transform vs a choose-your-own approach

Thanks, that’s much appreciated. Food for thought!

Chris

The LMT is scene linear to scene linear, and the RRT is scene linear to display linear if I’m not mistaken.

Having just read through Ed’s paper (that Alex posted), I can see the value of a single RRT, and I would also keep Chris’ four steps segregated and not combine the RRT & TCT:

The RRT would provide an initial tonescale and gamut mapping and, as the name implies, is the “reference” for all other display transforms (it’s the “golden image”, despite not being actually visible; it’s purely theoretical as no current or even near-future display technology could accurately show it). This is the scene linear to display linear transform, and there is only one in this case. It needs to have established specifications for the theoretical display it is transforming to; we could propose, for example, something like 10,000-nit peak, D65 white, AP1 (or maybe BT.2020) primaries, dark surround.

Transforms for “real world” displays (BT.709, Rec.2100 PQ 1k, DCI-P3, etc.) are based on this theoretical display, so the TCT for each of these displays is distinct, mapping “down” (to a smaller gamut and/or dynamic range) from the theoretical display (display linear → display linear).

Both the RRT and TCT are potentially modular. You can exchange the single RRT for a different renderer (K1S1-esque, etc.) and it would affect all outputs equally. Or you could swap out an individual TCT if desired. This would accommodate the display-level LMT (I think it was Thomas who proposed it); except instead of an in-line LMT for a particular display, it would simply be a different TCT with the desired alterations.
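To sketch the shape of that in code (the function bodies below are placeholders, not real transform math; the numbers just echo the example specs above):

```python
def rrt(scene_linear_ap0):
    # scene linear -> display linear for the theoretical reference
    # display (e.g. 10,000 nit peak, D65, AP1, dark surround);
    # placeholder tonescale only, not ACES math
    return 10000.0 * scene_linear_ap0 / (scene_linear_ap0 + 10.0)

def tct_rec709(reference_nits):
    # display linear -> display linear: map "down" from the reference
    # display to a 100 nit BT.709 target; placeholder clamp only
    return min(reference_nits, 100.0)

# Swapping the single rrt() changes every output equally; swapping an
# individual tct_*() only changes that one display path.
rec709_nits = tct_rec709(rrt(0.18))
```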

Unless I’m thinking about this differently than Chris, one clarification for the Miro board: in Chris’ drawing, the display linear line should sit between the RRT and the TCT, since after the RRT you are in display linear space.

This is a nice summary of the status quo of ACES :-).

Don’t get me wrong, this is conceptually all understandable, and this theory harmonises digital and film pipelines quite well, but I am wondering whether the concept of an RRT really brings us something in practice.
And there are successful pipelines out there that work nicely without it.

I think I finally understood what you meant when you said I am focusing on displays rather than transforms! I re-watched the meeting #8 recording and realized that I misunderstood what you said about the fork…

It only took me six days to figure something out! Yay!

So, beware, I am going to trash my Miro board (but we have the screenshot if we ever need to come back to the RRT+TCT version).

Chris

If I’m understanding you correctly, the question is whether there is one “master” rendering, or whether there are multiple renderings, defined by some type of grouping (by viewing conditions or display capabilities)? Maybe this is alternatively stated as there being an “intermediate” transform (in the case of a single RRT) versus sets of direct scene to viewing condition (display) transforms. I think that’s what you’re getting at?

For ease of description I’m going to describe this as two-stage (RRT to TCT) vs. single-stage (scene to viewing condition).

I’m going to make a start at a pros/cons list to help (at least me) think through this:

Two-stage pros:

  • Single core renderer is easy to update/revise (ACES 2.1, 2.2, etc)
  • Single core renderer is easy to swap for alternate renderings
  • Specific displays or viewing conditions can be added (TCTs) without re-analyzing/re-creating a rendering algorithm

Two-stage cons:

  • Two-step process is potentially less cleanly invertible
  • Requires better metadata tracking of the RRT version and TCT versions
  • Cannot take advantage of a potential future display that has greater capabilities than the “reference”/theoretical display of the RRT

Single-stage pros:

  • A more “pure” transform from scene values to display values
  • Likely more easily invertible
  • Can add new viewing conditions at any time

Single-stage cons:

  • Multiple groups of renderings have to be created and maintained for every rendering “look”
  • Rendering sets may be incomplete if developed for specific uses

I’m sure there are more things to add to the list, but that’s what I’ve got off the top of my head for now.

EDIT: Daniele, you also at one point talked about a “sliding scale” rendering where there is, for instance, a 100-nit render and a 5,000-nit render, and anything required in between is calculated from a merger of these renders. Is there a way that fits into one or both of these scenarios, or is this a third type of approach?

As someone who’s happy that ACES is simple to use and gives pretty pictures out of the box (most of the time), I may be misunderstanding some of the intentions of having different, and custom, rendering transforms. I share some of the concerns raised in the TAC meeting about it as well. So here are some of my (randomish) questions and concerns about this.

  • I think there should be a better understanding of why there is a need for custom rendering transforms. As @jzp said in some meeting, in 99% of the shows they’re dealing with, the RRT/ODT will be inverted. In other words, the shows that need to be finished in ACES weren’t actually using ACES, so people just have to deal with it. I think this has been well established as a fact of life, but I don’t think it has been established what the reasons are why this happens so often (anyone know?). Maybe there’s something that the ACES Project/Academy could do to help with this situation (other than/in addition to the proposed model(s))?
  • Is there a danger that post facilities, and software vendors, start providing their own rendering transforms, and suddenly there are 15 slightly different versions of effectively the same thing? What’s the point of ACES at that point?
  • How would this be presented to the end user? What new settings would be expected to be exposed to the end user (such as a colorist) in these types of models?
  • What is the benefit of custom rendering transforms for these end users? Right now with ACES we have simplicity and consistency. Doesn’t that go away with custom renderings? Or at least things will get more complicated, with more things to keep track of.
  • Since everyone always mentions the ARRI K1S1, I’m going to as well. To me the ARRI K1S1 is not a rendering transform from the point of view of ACES. It’s a look, an LMT. From the ACES perspective it should be possible to have an LMT that has an equivalent (not identical) look to the K1S1 with RRT+ODT (no inverting). If this isn’t possible, then that’s a real ACES problem, and it should be addressed by the working group.
  • What’s the role of LMTs with different rendering transforms? Does it then mean that, in practice, LMTs would be tied to a specific rendering transform? If you change the rendering the look will also change. Could the LMT even break under a different rendering? And in this case then, what is the value of LMTs? I thought the value of LMTs is that you develop a look and it will look the same or sufficiently similar regardless of where you output it.

Hello,

happy to answer a couple of these questions to the best of my abilities. :wink:

If I understood correctly, the main reason why is “studio requirements”, i.e. a project has to be ACES compliant, but will not use the ACES Output Transform. This was brought up by Daniele and Josh Pines in meeting #5 (at minute 35:43).

The point of ACES is to provide a Vanilla Transform (as has always been the case) that can be used out-of-the-box. But if an expert user needs to swap the Output Transform for any reason, he/she should be able to do it easily.

I don’t expect new settings, to be honest. But I could be completely wrong. The important thing is that ACES would still work out-of-the-box.

I don’t think it will complicate things. ACES would still work out-of-the-box for all users. But if someone needs flexibility for whatever reason, his/her life will be much simpler. :wink:

Hum, I do think the ARRI K1S1 is a DRT. There is a comparison of several DRTs in the Dropbox paper. You can see the main differences, strengths and weaknesses of each system: ACES, ARRI, RED, FilmLight… ACES does come with a look as well.

That’s a possibility. I have posted some images comparing ACES 1.2 to RED IPP2 in this thread. If you look at the saturated render of Louise, you will see some gamut clipping with the ACES Output Transform.

To me, the beauty of this idea is that it is not mutually exclusive with the Vanilla ACES Output Transform (an expression used by Thomas in meeting #2 at 35:09). So it is a win-win, in my opinion, both for non-expert users (such as myself) and expert users such as Daniele or Joshua.

Happy to discuss,
Chris

Just wanted to point out that, in a sense, this is what the “SSTS” in the ACES 1.2 HDR Output Transforms is doing. The tone scale in those transforms automatically adjusts based on what is essentially an interpolation between a curve that is similar to the current “SDR” system tone scale and one that I designed for the “wide-open RRT tone scale” (aka the 0.0001 to 10,000 nit range). Anything that falls in between, including 1000-nit, 2000-nit, and 4000-nit, is calculated using some interpolation between those two “endpoint” curves.
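Very roughly, the idea is something like this (the placeholder curves and the log-domain blend here are a toy illustration of endpoint interpolation, not the actual SSTS spline math):

```python
import numpy as np

def sdr_curve(x):
    # stand-in for a ~100 nit system tone scale (not the real curve)
    return 100.0 * x / (x + 0.6)

def hdr_curve(x):
    # stand-in for the "wide-open" 0.0001-10,000 nit tone scale
    return 10000.0 * x / (x + 12.0)

def interpolated_tonescale(x, peak_nits):
    # weight by where the target peak sits between the endpoint peaks
    # (in log10 luminance), then blend the two curves in log10 as well
    t = (np.log10(peak_nits) - 2.0) / (4.0 - 2.0)  # 100 nits -> 0, 10,000 nits -> 1
    t = np.clip(t, 0.0, 1.0)
    return 10.0 ** ((1.0 - t) * np.log10(sdr_curve(x)) + t * np.log10(hdr_curve(x)))

# a 1000-nit target lands halfway between the endpoints in log space
print(interpolated_tonescale(0.18, 1000.0))
```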

I have no idea if that is the “right” way to handle tone mapping, but it seemed to behave quite well in all the tests I tried. More importantly, it’s a behavior that is intuitive and makes sense.

I think tone mapping-wise, this is a good approach. If our images were black-and-white, maybe we’d be done! In color land, the manner in which we boost image saturation where needed, while maintaining hue and also keeping colors neatly within our display gamuts, is going to be the part where we really need to focus our efforts on defining what behavior makes the most sense within the architecture we are building…

Digression over. Please resume discussing the system architecture now :slightly_smiling_face:

If it’s working (which it sounds like it is), then it seems like a great approach. That keeps us from having to generate a bunch of discrete transforms, and also makes the system more flexible (and robust) as you can just update the endpoints if required, instead of updating every transform. Quite interesting!

As for the topic of multiple renderers, one thing it accomplishes is formalizing a way to allow vendor (or I suppose even facility) renderers. Currently that’s being accomplished by applying an inverse RRT and the desired renderer inside the LMT space, which is less than desirable. Regardless of how good our vanilla render looks, there will be people who for one reason or another want/need to use another renderer (TCAM, IPP2, K1S1). The idea of keeping the system flexible has merit (with certain caveats of course). It’s a bit of a Pandora’s box, but for instance there could be a “path to white” renderer and a “gamut clip” renderer. There was a brief discussion on having a “federated” group of renderers that is managed, as opposed to letting it be wide open, but that discussion seems to have been tabled for now.

To clarify a little what I wrote earlier, what I was proposing above (based somewhat on Ed’s paper) was an RRT that renders out to a defined (albeit theoretical) display; I need to revisit the details, but I’m pretty sure this is a departure from how the current OCES system is designed.

Thanks guys!

Just wanted to mention, on the architecture topic, that I found this old message by Daniele from the Gamut Mapping VWG:

I think Gamut Mapping and compression is an essential part of a successful image processing pipeline. It is needed in a different context at different parts of the processing stack […]. So further gamut compression might be needed to prepare scene-referred data for further display rendering (very close to the end of the scene-referred stage (maybe within an LMT)). […] After display rendering, we might need further gamut compression to produce the final image on a given display gamut. However, if the scene-referred data is already sensible, gamut mapping on the display side is not that big of a problem in our experience.

And since it is not the first time I have heard about two-stage gamut mapping, I was wondering if this should be reflected in the diagram/Miro board.
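For anyone who has not dug into that VWG’s work, a toy distance-based compressor in that spirit might look something like this (the curve shape and parameters are purely illustrative, not the actual Reference Gamut Compression):

```python
import numpy as np

def compress_gamut(rgb, threshold=0.8):
    # distance of each component from the achromatic axis, normalized
    # so that d = 1 lands on the gamut boundary
    rgb = np.asarray(rgb, dtype=float)
    ach = np.max(rgb)
    if ach == 0.0:
        return rgb
    d = (ach - rgb) / abs(ach)
    # leave d <= threshold untouched; squeeze everything above it
    # asymptotically back under 1.0 (i.e. inside the gamut)
    s = 1.0 - threshold
    t = np.maximum(d - threshold, 0.0) / s
    cd = np.where(d <= threshold, d, threshold + s * t / (1.0 + t))
    return ach - cd * abs(ach)

# a saturated value with a negative component gets pulled into gamut
print(compress_gamut([1.0, -0.2, 0.1]))
```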

Chris

PS: acescentral is full of these little hidden gems that I find fascinating to dig up from time to time… :wink:


I was thinking more about the reasons why, given these studio requirements, shows don’t end up using ACES more often, why they don’t do lookdev in ACES or monitor ACES on-set, etc. It is a pipeline after all, and those using it shouldn’t have these issues later in post. If productions don’t want to use or can’t use ACES, then the ACES Project/Academy might want to know more about the reasons.

Sure, it’s a DRT, but my viewpoint was that in ACES it perhaps shouldn’t have to be. In my mind it should be possible to get there (equivalently) with an LMT through RRT+ODT, be it a K1S1 or IPP2 equivalent. Sure, we can’t do everything that a 3D LUT does, but the RRT+ODT shouldn’t be so restrictive (or have such flaws) that one can’t get close enough.

As far as the look of ACES goes, one of the considerations listed in the working group Dropbox is to impart less of a look, to encourage LMT development. I think that’s a good idea.


I’ve been thinking about this more and more lately; and I do wonder whether a single RRT is a necessary constraint in the age of technologies like OCIO, AMF, and CLF.

OCIO-v2 lets one unambiguously define, communicate, and procedurally traverse graphs mapping ACES2065-1 to arbitrary output targets via arbitrary intermediate ViewTransforms on demand, i.e. architectural analogs of the following style of pipeline:

An OCIO config is capable of defining and procedurally traversing the graph containing all paths from AP0 colorimetry → output light via the one (of few) mastering-target-referred ViewTransforms demanded by one (of many) display devices / encodings.

OCIO is capable of representing that path equally well as shader code or Academy CLF.
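For instance, with the OCIO v2 Python bindings, resolving and evaluating one such path can look like this (the config path and the display/view names below are placeholders; any given config will define its own):

```python
import PyOpenColorIO as OCIO

# load some ACES config (placeholder path)
config = OCIO.Config.CreateFromFile("config.ocio")

# resolve the full graph traversal from ACES2065-1, through whatever
# ViewTransform the config assigns, to one (display, view) pair
processor = config.getProcessor(
    "ACES2065-1", "sRGB - Display", "ACES 1.0 - SDR Video",
    OCIO.TRANSFORM_DIR_FORWARD)

# the same processor can be evaluated on the CPU, emitted as GPU
# shader code, or (feature-pending, as noted above) baked out to CLF
cpu = processor.getDefaultCPUProcessor()
print(cpu.applyRGB([0.18, 0.18, 0.18]))
```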

Alternatively, Academy AMF sidecar data provides a means to describe specific transform chains for per-clip / per-RRT-version / per-output-transform permutations; which, in turn, can also be baked to CLF via OCIO (feature-pending).

On one end, this doesn’t necessarily preclude the use of a single RRT; and on the other end, it doesn’t preclude the use of multiple hypothetical Academy and/or vendor-provided DRT “families”.

This is a really long-winded way of saying it, but ALLS I’M SAYIN IS: as brilliant as a single-RRT architecture is, viable, open, Academy-sponsored (directly or indirectly) means for alternatives have emerged over the past decade, congruent with how ACES is often ab/used in production; and if we can agree and take for granted that we already have (and use) the necessary technology for specifying and generating program-specific / output-specific transforms, the entire world is our oyster.

Of course, I’m conveniently sidestepping the problem of tracking LMTs to the DRT families / master-variants to which they refer, depending on what we decide to do with ViewTransforms; but we’re still talking finite permutations of manageable complexity.

The entire world. Our oyster. Just saying.


couldn’t agree more.


I’m going to need some further details on the World → Oyster conversion math being used here. But in principle I agree.


You might find this to be pedantry, but it seems worth clarifying: I think that there should be a single Reference Rendering Transform (RRT); if you have multiple RRTs then none of them is really the Reference, right? The SSTS is modular enough that it should be able to cater for all the possible dynamic ranges, and I don’t see a reason why, once we have settled the Hue Invariance topic, we should have more than a single shipping RRT. It would be a kind of failure, showing that we, as a group, could not come to an agreement. There will be compromises made, and some people will not be happy (I’m also happy :smiley: to not be happy :frowning: ), but if we don’t make compromises we will end up with y = x for the RRT.

Should we arrive at that point, the TAC should be involved to inform the preferred direction, and in the eventuality of TAC disagreement, the Leadership should make the hard decision for the project.

Obviously, this does not preclude the system from accepting other Rendering Transforms (RTs). It is, after all, a block with known inputs and outputs, so with the proper metadata tracking there should be no showstopper issues for people to swap it out with whatever they want.

Cheers,

Thomas


I wasn’t really arguing for or against any particular approach; I just wanted to point out that there are options available to us that didn’t really exist during the development of ACES-1.0, and that a single, static RRT needn’t necessarily be the written-in-stone prerequisite it once was, and we shouldn’t let such a constraint get in the way of exploring all options – especially at this stage. In other words, if we arrived at something we were happy with that violated the single-RRT constraint, it’s definitely a solvable problem we can kick over to the TAC.

Let’s get pedantic! :cowboy_hat_face: I think we’re in agreement (you and I anyway) that there should be a single Reference – to me, the question is whether the emphasis is on RRT or the RRT. If we agree that the goal is to manifest a single rendering intent (by some definition) across devices and viewing conditions, does a one-size-fits-all transform serve us or fight against us?

I think the SSTS is a brilliant, elegant mechanism that seems to get us where we need to go, pending critical evaluation by the rest of this VWG; and you could be absolutely right, that once we’ve come to a consensus about Hue Invariance, other considerations will snap into place under a single shipping RRT.

At the same time, I’d like to explore biologically-inspired mechanisms for tonemapping, although I don’t know a whole lot about them myself. I’m assuming a parametric Michaelis–Menten function couldn’t possibly work like the SSTS; but it would still serve the greater purpose of transforming scene light to a display-appropriate reference render.
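For reference, the basic Michaelis–Menten response (and the Naka–Rushton variant, which adds an exponent for contrast) is simple enough to sketch; the parameter values here are arbitrary:

```python
def michaelis_menten(x, v_max=1.0, k_m=0.18):
    # saturable response: ~linear for x << k_m, flattening toward v_max
    return v_max * x / (k_m + x)

def naka_rushton(x, v_max=1.0, semi_sat=0.18, n=1.2):
    # generalization with an exponent n controlling the steepness ("contrast")
    return v_max * x**n / (x**n + semi_sat**n)
```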

There’s also a lot of conceptual overlap with the question of how / whether to allow master-/output-target referred LMTs.

At the beginning of Output Architecture VWG meeting #9, Alex Fry brought up whether it’s possible to have an LMT do the dirty work of adding the hue skews resulting from channel-independent tonescale lookups to a hue-invariant RRT; and I think it was pretty quickly agreed that a single LMT wouldn’t suffice for all outputs, due to the Something Something effect eliciting qualitatively differing hue appearances as a function of luminance (Helmholtz-Kohlrausch? I can’t remember the exact problem).

  • Wouldn’t the same issues apply to contributions of most LMTs?
  • Is this something we can / should control for with a CAM?

At the end of the day, Thomas, you’re also right about not being able to please everyone. Some dead guy once said something along the lines of, “the best compromise is that which leaves everyone unhappy.” But I also think we can afford to lean a little into architectural complexity (i.e. permutation hell). Bonus points if architectural complexity serves to lessen algorithmic / mathematical complexity elsewhere.

Sidebar: can we come up with a different acronym (for internal use) for these sorta-kinda LMTs that only do what they’re supposed to do for a limited subset of outputs?

I propose LQT (Look Quasi-Transform) cuz it’s visually similar to, yet clearly distinguishable from, “LMT”. On the other hand, that might not be a great idea, because it implies “quasi-transform” is actual terminology, and not something I just made up while typing this.

Hey, just to clarify a bit on what I was trying to express on tonight’s meeting:

  1. If we have an easy-to-invert RRT, and everyone knows it can be inverted and there are user-friendly tools to do it, so we no longer have the “I don’t like the ACES look” problem, we should be pretty well off. To me the problem today is that it’s difficult and it takes a color scientist to hack it.
  2. Just for flexibility, it would be nice to have 2 stages of “look”, one for “grading” and one for “print look”. People will use it in different ways, or just not use it, but it would be very useful to emulate a good ol’ DI workflow with the film print emulation independent from the grading underneath, which you would carry independently, so for example the VFX guys would know that one part is a “show look” or “sequence look” and another part is the “shot grading” (see the sketch after this list). I struggle a bit with finding good names for those; “grading look transform” and “render look transform” maybe?
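Something like this, in pipeline order (all names made up, purely to illustrate the separation):

```python
# placeholder "looks": a per-shot grade and an independent print look
def shot_grade(rgb):
    # e.g. a simple exposure trim set by the colorist per shot
    return [c * 1.1 for c in rgb]

def print_look(rgb):
    # e.g. a show-wide film print emulation, carried independently
    return [c ** 1.02 for c in rgb]

def drt(rgb):
    # stand-in display rendering transform
    return [min(max(c, 0.0), 1.0) for c in rgb]

# the grading stage travels with the shot; the print look travels with
# the show, and either can be swapped without touching the other
out = drt(print_look(shot_grade([0.18, 0.18, 0.18])))
```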

Also, something I may not have articulated too well when talking about stuff happening in the display space is the capability to do things like old film tinting or classic crunchy telecine looks. It’s a bit like in sound: you have the mixing part and the mastering part, stuff that happens on the “rendered” part. Certainly challenging if you want to unbake the cookie, like trying to make a surround version of a stereo master without going back to the original stems, but at least the possibility should be there for those who want it. I had so much fun with my friends doing classic movie restoration at Eclair; they even had some boxes made with arc lamps to see how to match the original while split-viewing with the digital projector… If Netflix, which financed the restoration of Abel Gance’s Napoléon, one of the most complex movies ever made, wants to save the result in ACES, they would need all of that :slight_smile: Netflix et «Napoléon», les fiancés de l'Empire – Libération


And here are a couple of good examples of stuff that we should be able to fit in: https://filmcolors.org/
