A single fixed Output Transform vs a choose-your-own approach

Splitting this topic from @ChrisBrejon’s original post:

What I don’t understand about this is, why do we (the ACES team) have to make this possible? Can’t people already do this? I mean, if you want to use K1S1 or RED IPP2 or TCAM, then there’s nothing prohibiting people from just using them. Why are people trying to shoehorn these looks into an ACES workflow? And if they are so hell bent on using one of the other renders, then why not just use them? Why are they fighting to use system X within ACES when that system X exists already in parallel and can just be used instead? I expect there are reasons for this, and I’d like to hear them. I think they will inform the work that this group decides to do.

For example, if the reason is anything like “well ACES doesn’t let me get to this particular color” or “I like the way system X looks better” or “ACES doesn’t let me do whatever”, then I feel we should be looking at fixing those hindrances rather than just resorting to a metadata tracking system (which IMO always fails eventually). If we can make ACES easier to use (dare I even say “a pleasure” to use?), then will that instead help some of those detractors be OK with ACES? There are things that are broken in ACES - we know this. So let’s fix them such that those easy excuses and reasons are gone.

One argument for allowing “choose your own” output transform as a part of ACES could be that we already allow flexibility in working space choices, which must be tracked, so if we’re already doing it for a different system component, what’d be the harm in allowing it for the rendering transform?

My view has always been that with ACES we strongly recommend you use ACEScct (I personally want ACEScc and ACESproxy deprecated - for simplicity). You transcode camera files to ACES, you work in ACEScct, and you render in ACEScg. Three colorspaces is enough. They’re all there for a reason. Having options for a working space is unnecessary complication, imo. But that’s probably a topic for another thread.

Remember the original name of ACES was the Image Interchange Framework - and it stemmed out of the desire to unambiguously encode, exchange, and archive creative intent. Metadata and AMF and all the other “stuff” that goes alongside ACES2065-1 are great for making it useful in production, but the idea is that even if all that other digital “stuff” is lost, a “negative” would still remain in a color encoding that is standardized and not just a “camera-space-du-jour”. We could theoretically make a new “print” of those ACES2065-1 files and have a pretty good idea of what the movie was “supposed to look like”.

Finally, if we build support into ACES for swapping in existing popular renders, what do we then do when other new renders come out? Where do we stop with what we do or don’t support? Do we support output-referred workflows, too? Rec. 709? HLG? Pretty soon it becomes too much. I personally want to see the system simplified, not added to. (I didn’t even want to add the CSC transforms, because suddenly it becomes “why isn’t this camera or that camera included?” instead of “oh, I can just use ACEScct for every project? I don’t have to reinvent my workflow for the next show I do that uses a different camera?”)

Final point I’ll make is that it is the charter of this group to construct a new Output Transform, not to rearchitect the entire ACES framework. So let’s fix the broken stuff and see if there’s still such a need for ACES to be expanded in scope. I really think it can deliver, as long as we fix the stuff that doesn’t currently work as well as we had hoped.


Who is ACES for, and what does it do for them?


Let me comment on some of this:

As Troy mentioned, I think it is counterproductive to start a “we vs. them” mentality in an open software project.
I think there is only a “we” (the media and entertainment industry), that’s it.
And we should do what is best for us.

I think it would help the industry to have an easy interoperability system to communicate the pipeline for production. Also, some studios are demanding an ACES pipeline, even if the parties in the loop are not comfortable with it. Having a flexible output pipeline would make all parties happy.

I think you cannot find one output transform which satisfies the needs of all productions in the present and (more importantly) in the future. How could you foresee the future?
A live-action movie needs something different from a hand-animated anime movie.
This is a fundamental concept of any natural system.
Without mutation and without diversification you have no evolution, no innovation, no progress. I think a joint effort like this one should empower innovation instead of discarding it.

It is not about fixing…

The same is true for a unified working space. Why should we do all operations in ACEScct? Maybe I want to do a CAT in LMS, a Photoshop blend mode in another space, and then a saturation operation in a CAM-ish space.
In the mid-term future, the concept of a working space will be obsolete.
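For what it’s worth, the “CAT in LMS” example above is easy to sketch. Here is a minimal, hypothetical illustration (von Kries scaling through the published Bradford cone-response matrix - not any specific product’s implementation) of an operation that naturally wants its own space rather than the working space:

```python
import numpy as np

# Bradford cone-response matrix (XYZ -> LMS), widely published
M = np.array([[ 0.8951,  0.2664, -0.1614],
              [-0.7502,  1.7135,  0.0367],
              [ 0.0389, -0.0685,  1.0296]])

def cat_bradford(xyz, white_src, white_dst):
    """Von Kries chromatic adaptation, performed as a diagonal scale in LMS."""
    lms_src, lms_ws, lms_wd = M @ xyz, M @ white_src, M @ white_dst
    return np.linalg.inv(M) @ (lms_src * lms_wd / lms_ws)

# D65 -> D50 adaptation of a neutral grey (illustrative values)
d65 = np.array([0.95047, 1.0, 1.08883])
d50 = np.array([0.96422, 1.0, 0.82521])
grey = 0.18 * d65                      # 18% grey under D65
print(cat_bradford(grey, d65, d50))    # lands on 18% grey under D50
```

The diagonal scaling is only meaningful in the cone-like LMS basis; done directly on the working-space RGB channels it would be a different (and worse-behaved) operation, which is the point being made here.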

I really feel sorry for what I am writing now, but the ST2065-1 file format is far from unambiguous. If I give you a ST2065-1 file, you don’t know if it was made with the ut33, 0.1.1, 0.2.1, 0.7.1, 1.0.3 or 1.1 version of ACES - all of which will produce quite a different version of the movie.
So the argument at the forefront of defending the single output transform actually proves itself unachievable, and just sets the wrong incentives for argument’s sake.
Is it a bad thing that we have an ambiguous archive file format? I don’t think so. It is good to have ST 2065 and nail down a few parameters for the encoding. It is a better DSM, that’s it and it is great.
There will be different display renderings as time passes. You need to archive those display renderings along with the image material; this is the only way. Having a unified, industry-agreed way of specifying and archiving those would be a real winner.

What I am really afraid of for future generations is that they will restore severely limited ST2065-1 files, because we rendered hacky LMTs into our masters which go to Rec. 709 and then back again, just because the actual system was not flexible enough. This is ethically and morally the wrong approach, in my opinion. And we are steering right at it with a single output system.

It is a challenge to design a meta-framework, and it needs a lot of thinking. But I think it is the right task to fry our brains on.

If we come to the conclusion that a flexible output transform system is the best thing to do, I think that is a valid outcome for a VWG. We still need a very robust vanilla ACES transform.

I hope some of this makes sense.


This is very true and something that AMF is meant to solve.

I originally asked because it seems to be a fundamental design question about audience. There are a bunch of things that leave me scratching my head, not the least of which are the ground truths folks in this thread have talked extensively about. @sdyer did a good job of enclosing “hue” in scare quotes in the ODT paper, for example, and for good reason.

I see a larger and perhaps more fundamental issue here with some of the concepts. Loosely, for the sake of a “high level” survey of image formation, we might be able to break down the philosophical ground truths as follows.

Creative Image Formation

  1. Absolute “Intent”. The reference golden image buffer is considered the idealized output and “fully formed”.
    Image formation design:
  • Where the medium values are larger, render precisely with respect to chromaticity at the output medium.
  • Where the medium values are smaller, clip precisely with respect to chromaticity at the output medium.
  2. Creative “Intent”. The reference golden image buffer is considered the idealized entry point, targeting an idealized output medium, fully formed relative to the medium in question.
    Image formation design:
  • Form the image in nuanced and different ways with respect to the output medium, but maintain the “intention” of the mixtures in the light encoding.
  • Where a medium is a smaller volume, try to preserve the chromaticity or “hue” and “chroma” intention / relative distances, and render accordingly. Note that the determination of whether to form “hue” and “chroma” is relative to either a chromaticity model or a perceptual model - part of setting the ground truths in a clear manner, as above.
  • Where a volume is larger, use less compression with respect to the creative intention ratios of the entire volume.
  3. Media-relative “intent”. The reference image buffer is considered an entry point and the final image formation of the “golden image” is negotiated in conjunction with the output medium.
    Image formation design:
  • Render differently for each output medium. The “creative intention” may shift in the negotiation of the optimal image output. For example, a highly saturated blue in the entry image buffer might be gamut compressed slightly to heavily saturated blue in terms of chroma for one HDR medium, a less saturated chroma for a smaller HDR output, and perhaps rendered completely achromatic for something such as SDR.

A case can be made for any of the three as completely valid.
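To make option 3 a little more concrete, here is a minimal, hypothetical sketch of a media-relative chroma negotiation (the function name, threshold, and Reinhard-style rolloff are all illustrative assumptions, not anything from the ACES transforms): the same saturated source value is compressed differently against each output medium’s limit.

```python
def compress_chroma(chroma, medium_limit, threshold=0.75):
    """Soft-compress chroma toward a medium's maximum representable chroma.

    Hypothetical Reinhard-style rolloff: values below threshold * limit pass
    through unchanged; values above are asymptotically squeezed to the limit.
    """
    t = threshold * medium_limit
    if chroma <= t:
        return chroma
    overshoot = chroma - t
    room = medium_limit - t
    return t + room * overshoot / (overshoot + room)

source_chroma = 1.2  # a highly saturated blue, beyond every medium below
for medium, limit in [("large HDR", 1.1), ("small HDR", 0.9), ("SDR", 0.5)]:
    print(medium, round(compress_chroma(source_chroma, limit), 3))
```

The smaller the medium, the more the “creative intention” shifts in the negotiation - which is exactly what distinguishes option 3 from option 2.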

Loosely, ACES seems to lean toward 3., but technical issues exacerbate some problems. For example, per-channel lookups are incapable of manifesting the “intention” of a “tone” curve when they skew the values such that the resulting luminance sums are radically different across the volume. The same happens for “hue” and “chroma”. The reason the design seems close to 3. stems from the varying ranges of the shapers for the HDR image formation transform ranges.

Some folks seem aligned with 2., where the creative “intention” of the golden image reigns supreme. At risk of putting words in people’s mouths, I believe this is a similar path that @daniele and @joachim.zell seem to be advocating for. Feel free to chime in and let me know if I’ve absolutely misread your vantages here @daniele and @joachim.zell.

I asked the superficial-appearing question because I firmly believe it cuts to the philosophical ground-truth upon which all technical decisions are based, and no “solution” is likely going to satisfy any party until this core design goal is clarified.

It doesn’t seem easy to design “solutions” to hard technical problems without first tackling a clarification of the image formation design philosophy.


Thanks for this important consideration @Troy_James_Sobotka.

Your approach 2 is generally applicable to many use cases. And I agree that RGB tone mapping is always related to the end medium, because the hue skew is tied to the tone mapping. So the dynamic range of the monitor dictates how yellow orange flames become; it will be different for different outputs. This was maybe OK 10 years ago, when all output media had similar dynamic range. Re-evaluating it today, it is a no-go in my opinion.
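The per-channel skew is easy to demonstrate numerically. A minimal sketch, using a generic Reinhard-style curve as a stand-in for any per-channel “tone” lookup (the curve and the flame values are illustrative assumptions): the chromaticity ratios of a saturated orange collapse toward achromatic as exposure rises, and where that collapse happens depends on the curve - i.e. on the dynamic range of the target medium.

```python
def tone(c):
    # generic per-channel curve, a stand-in for any RGB tone lookup
    return c / (1.0 + c)

def ratios(rgb):
    # chromaticity ratios relative to the largest channel
    m = max(rgb)
    return [round(v / m, 3) for v in rgb]

flame = [1.0, 0.25, 0.02]   # a saturated orange, scene-linear RGB
for stops in (0, 2, 4):
    exposed = [v * 2 ** stops for v in flame]
    print(stops, ratios([tone(v) for v in exposed]))
```

As exposure increases, the green channel gains on the red and the orange skews toward yellow, then white - and a curve with a different shoulder (a different output medium) would skew it at a different point.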

But I would go even one step further and say that an ideal mapping strategy needs to maintain the storytelling intent. This is a much harder target. In order to tell the same story on different media you might need all sorts of output translations.

I guess this is another argument for a flexible output pipeline.

I hope some of this makes sense.


Scott, I’m with you that I don’t like the idea of an archival format that requires the use of a sidecar metadata file in order to decode correctly. It is hard to say with confidence that every production that uses (and archives with) ACES today is still going to have all the pieces together 20 years (50 years?) from now. I hope I’m wrong.

That being said, when I glanced at it briefly, I believe part of the AMF VWG’s scope is consideration for archival purposes, so it appears that the overall architecture is moving forward with AMF being a part of the archive process, if I’m understanding that correctly. Frankly, as Daniele mentioned, I’ll blame some of that on how much the output transforms have changed already, and restoring a file with a different version of the transform than it was originally mastered in could yield quite different results in some cases. This will be exacerbated in ACES 2.x if the “RRT” becomes more neutral; anything encoded prior to 2.x will need to be handled differently.

As such, if we consider that having an AMF will be an integral part of the process and will include the output transform(s), I’m having a hard time convincing myself why this can’t become more of an open format, even though I personally like the simplicity of a single set of ACES-defined transforms. What I’m hearing is that people want to use other renderers simply because they like the look of them better.

Now considering that the output transforms (and therefore the math behind them) would be openly accessible in the AMF, it is debatable whether vendors would want to make their proprietary algorithms publicly available. That being said, if someone wanted to roll their own output transform, I don’t see the harm in it.

I would say we are only responsible for the ACES-defined transforms (aka “default” transforms), and any other renderers are the responsibility of those that create them (vendor, studio, individual, etc), kind of like an IDT. I guess this indicates the ability of a user to add output transforms (can we call them ODTs please? It’s so much shorter to type, ha) to their system independently.

A topic probably suited for the AMF team, but I’ll mention it here: if multiple output transforms are included in an AMF (let’s say DCI, Rec709, and an HDR version), can we/do we indicate which version is the “golden image” for future reference?

Please please please, don’t remove ACEScc from ACES! This is the only log space that lets you do white balance and exposure corrections (using offset) almost identically to multiplication (gain) in linear gamma. That is impossible in the shadows with any other log, including ACEScct, because of its toe. And the only two reasons I finally switched to an ACES pipeline are ACEScc (which works without artifacts when it’s based on DCTL instead of Resolve’s built-in ACES) and the amazing gamut compress DCTL. So this is now the best pipeline I can imagine (and I hope it becomes even better soon, if the Gamut Compress algorithm becomes part of the IDT). Before I switched to ACES, my pipeline was based on making corrections in linear gamma to make the most physically correct adjustments, but it forced me to add too many nodes, because I can’t add saturation in linear gamma without introducing artifacts and having far-from-perfect luma weighting.
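The offset/gain equivalence described here can be verified numerically. A sketch using the ACEScc and ACEScct encoding functions as published in S-2014-003 and S-2016-001: in ACEScc’s pure-log segment an offset is exactly a gain in linear light, while ACEScct’s linear toe breaks the equivalence in the shadows.

```python
import math

def acescc(x):
    # ACEScc encode (S-2014-003); pure log2 for x >= 2**-15
    if x <= 0:
        return (-16 + 9.72) / 17.52
    if x < 2 ** -15:
        return (math.log2(2 ** -16 + x * 0.5) + 9.72) / 17.52
    return (math.log2(x) + 9.72) / 17.52

def acescct(x):
    # ACEScct encode (S-2016-001); linear toe below 0.0078125
    if x <= 0.0078125:
        return 10.5402377416545 * x + 0.0729055341958355
    return (math.log2(x) + 9.72) / 17.52

gain = 2.0                         # +1 stop in linear light
offset = math.log2(gain) / 17.52   # the equivalent ACEScc offset

# in ACEScc's log segment, offset and gain are interchangeable
for x in (0.002, 0.18, 1.0):
    assert abs(acescc(x * gain) - (acescc(x) + offset)) < 1e-12

# in ACEScct the same offset does NOT match a gain inside the toe
x = 0.002
print(acescct(x * gain), acescct(x) + offset)  # noticeably different
```

This is why an offset on ACEScc behaves like a RAW white balance or an aperture change, and an offset on ACEScct does not once values fall into the toe.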

That’s great feedback, thanks @meleshkevich. When we released ACES 1.0 it only had ACEScc, which we thought had advantages. But it was a bit too far a leap for some to adjust to, and it broke some existing tools - or at least behaved differently enough that people reacted negatively against it - hence ACEScct. I am glad to hear that someone is still using it and finding it useful.


Thank you for your answer! I think ACEScc is also very important because it finally lets CDL corrections be physically correct. And AP1, for some reason (maybe accidentally), is very good for WB, based on my limited tests with a color checker and comparisons to different LMS spaces.
Many colorists I know think that an offset over any log, and especially ACEScct, is identical to RAW WB or aperture on the lens. I think if the fact that this is true only for ACEScc were mentioned somewhere, probably in the ACES documentation, it would make people use ACEScc more often.
Sorry, I’m not trying to tell you what to do - I’m just a random colorist, and of course you know better what to do with ACES. I hope I didn’t say something inappropriate. I’m just trying to make the whole image pipeline better and more intuitive for everyone by sort of promoting ACEScc. And my far-from-perfect English definitely doesn’t help me with that, and makes me sound a bit strange, I guess :slight_smile:
