Validity of ACES criticisms?

Hello Brian and welcome,

I have been pretty vocal on this forum about ACES and some limitations, for example in this post.

Just like you, back in 2017, I had no clue about colour, as you can tell from my first post. It is partly thanks to this community that I started, and still continue, this amazing journey through colour. To answer your question briefly, I would say that yes, these complaints have practical validity. Let me give a few examples:

  • Imagine a system that clips shadow values in SDR at 0.002.
  • Imagine a full CG character such as a red plumber lit by the sun… And his hat goes “Dorito” orange.
  • Imagine a full CG light saber where the colour is an ACEScg primary and the character looks flat.
  • Imagine a blue rabbit or sphere that goes purple when lit.

I think it goes back to Vlad’s point: there is a lack of overall color science knowledge in the CG industry. Learning about colour science has made me a better lighting artist, and now that I have seen and experienced the “notorious 6” first hand, I cannot go back to ACES 1.X. I have set up some training sessions at the studio, so every artist knows what the “notorious 6” are and how to identify them.

It took me 15 years of experience to learn about “display transforms” and in my quest about colour, I have talked to many Lighting/Compositing/CG/VFX Supervisors and realized that most of them do not have a good understanding of what happens when the “image data” is rendered for a display device. So I agree with Joseph when he says that “color management in motion pictures is still in its infancy”.

I would say that the hue shifts/skews themselves are “not necessarily” the issue (as Alex explained). The issue is that with ACES, you will always get the same ones. So, maybe your different projects have different artistic needs but ACES will prevent you from having different behaviours. Your orange tones will always become “rat piss yellow” with values above 1 for instance.

As Derek and Thomas have explained, part of the issue comes from per-channel. We could add that ARRI K1S1 is a per-channel display transform that has been used massively and with great success. So, in reality, the issue is not per-channel itself but how it is implemented (as explained brilliantly in this post by Jed). You can also have a look at this cool post about per-channel by Alex Tardif.
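
To make the per-channel point concrete, here is a minimal Python sketch. It assumes a simple Reinhard-style curve `x / (1 + x)` as a stand-in tone scale (this is not the ACES or ARRI K1S1 curve), and the function names are hypothetical. It applies the curve to R, G and B independently and reports the HSV hue of an orange as exposure rises; the hue drifts from orange towards yellow, which is the kind of skew discussed above.

```python
import colorsys


def tone_curve(x: float) -> float:
    """Stand-in tone scale (simple Reinhard); NOT the actual ACES curve."""
    return x / (1.0 + x)


def per_channel_hue(rgb, exposure):
    """Scale a scene-linear RGB triplet, tone map each channel independently,
    and return the HSV hue of the result in degrees."""
    mapped = [tone_curve(c * exposure) for c in rgb]
    hue, _sat, _val = colorsys.rgb_to_hsv(*mapped)
    return hue * 360.0


orange = (1.0, 0.5, 0.0)  # hypothetical scene-linear orange (HSV hue of 30 degrees)
for exposure in (0.25, 1.0, 4.0, 16.0):
    print(f"exposure x{exposure:>5}: hue = {per_channel_hue(orange, exposure):.1f} deg")
```

The brighter channel is compressed more than the dimmer one, so the R:G ratio shrinks towards 1:1 and the hue creeps towards yellow (60 degrees) as exposure goes up. A different curve would skew by a different amount, but any per-channel curve of this shape skews in the same direction.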

As always, the devil is in the details. I don’t know what the scope of your projects is, whether you display in SDR or HDR, or what your requirements are. Many artists who just want an “automatic tone mapper” use ACES because it has been implemented in most DCC software (ACES being like the lowest common denominator here) and over the years they got used to the “notorious 6 aesthetics”. But once you have seen differently, there is no going back.

About ACES productions, I would be very careful because it has been said by different people in several meetings that the ACES Output Transforms were inverted most/all of the time. So I will stick to what I know: ACES is mostly used for exchanging files and archival (unless proven otherwise).

Finally, about openness and transparency, I wish the TAC had let Daniele speak a year ago to present the Meta framework idea. And I also wish the TAC had discussed in their latest meeting the loss of their main contributor. I think that is a big deal. Sure, this community has a voice here on this forum and this is great. But is this voice being heard? I am not sure.

I hope my criticism is constructive. As Vlad explains, part of the issue is the “ACES evangelist” (I used to be one of them) and as a community, we need to dig deeper and expand the knowledge, so that artists can make better decisions and not just follow a trend.

Regards,
Chris

On the subject of ICC, I’d be curious to hear folks’ thoughts on Adobe Substance 3D Painter’s new ACE color system which is based on ICC and has ACEScg as one of its working spaces.

Maybe Adobe will save us. They know color and they know how to get stuff done, so maybe!

In the meantime, it seems like the problems that Christophe describes are long-standing and unresolved and there’s no alternative standard to switch to. I’m happy with my ACES workflow, but the specter of Something being Wrong is troubling, for sure.

I’m looking forward to the official release of ACES 2.0 (is that what it will be called?). The candidates are looking very good, especially C (hint hint :stuck_out_tongue:).

EDIT: As @ChrisBrejon points out below, these were mistakenly done with an older version of the Candidates OCIO config, so the current Candidate A does not in fact have these issues shown below.

Regarding those skews (blue to magenta for example), here’s an image from @ChrisBrejon viewed in ACES 1.2 where you can see this pretty clearly:

Here’s that image with Candidate A. Not much change as far as those colors go.

Here’s Candidate B. The skews are gone.

Same with candidate C.

So of the candidates, only B & C address these color skews. If these were a “must have” (personally, I feel similar to @ChrisBrejon that they are) then that disqualifies A for me.

Just for funsies, here’s ACES 1.2 with the Gamut Compression applied from ACES 1.3. Not perfect, but way better on those blues not going so magenta.

Very nice! Candidate C is :heart_eyes:

6 posts were split to a new topic: OT Candidates discussion

Until they fix the shift to magenta, the 8-bit ProRes on export, and the incorrect colors with ProRes source files from Alexa cameras on import, I personally doubt it.
I’m tired of instructing VFX artists how to deal with all these bugs so that they can send me high bit depth files back without any change in colors.

It has per-channel tone mapping (mostly of the highlights), then a primaries conversion of the tone-mapped image, then a linear-to-Rec.709 (not gamma 2.4) encoding. So how it affects out-of-gamut colors depends on the source gamut.

Joseph, respectfully I have to disagree with your analogy between ICC PCS and the ACES system.

The PCS is very much inherently lossy, whereas ACES 2065-1 re-encodes the scene colorimetry using a standard color encoding and set of viewing conditions. It does not impose a reference medium the way ICC PCS does. This has a huge practical effect.

The ICC PCS reference medium is based on the colorimetry of a reflection print with a very low dynamic range. This is devastating to scene referred images and output to HDR devices. Again, there are various strategies that people use to try to work around the issue, but the system is inherently limited, and trying to use it with image-state-based workflows and HDR is a bit of a square peg in a round hole. The only analogy I can think of would be if ACES 2065-1 were based on the colorimetry of an image as displayed on a traditional broadcast display with a dynamic range of ~300:1. That would be highly problematic for modern motion picture, television and gaming workflows.

ICC Max is a different beast that I can’t speak to.

Below is a quote from the ICC white paper on ICC profiles with camera images [1].

ICC color management workflows generally assume that the colorimetry
expressed in the PCS is of a [color-rendered] picture, and not of a scene. There
is currently no mechanism to indicate that the colorimetry represented in the PCS
by a camera profile is relative scene colorimetry. Even if there were, use of the
PCS to contain relative scene colorimetry is not fully compatible with current ICC
workflows, which assume color rendering has been performed. This distinction is
especially important with respect to highlight reproduction. Many scenes contain
highlights that are brighter than the tone in the scene that is reproduced as white
in a picture. An important part of the color rendering process is selection of the
tone in the scene that is considered “edge of white”, and graceful compression of
brighter tones to fit on the reproduction medium (between the “edge of white”
tone and the medium white).
An ICC working group has been formed to attempt to address issues with the use
of ICC profiles in digital photography applications, but at present progress is
difficult. Even if improved characterization targets (such as narrow-band emissive
targets) and profiling tools are introduced, colorimetric intents will still be
illumination specific, and perceptual intents will optimally be scene-specific.
Some argue that scene-to-picture color rendering should be restricted to in-
camera processing and camera raw processing applications, and correction of
color rendering deficiencies limited to image editing applications.

[1] https://www.color.org/ICC_white_paper_17_ICC_profiles_with_camera_images.pdf

I guess we are just disagreeing on terminology. Every aspect of a PCS is implemented and defined in ACES and is used in the display pipeline. OCES has a preferential color manipulation, reference viewing condition, whitepoint, defined dynamic range and gamut. The specifications defining AMPAS are different than the ones made for ICC, but functionally it serves the same purpose in color management.

I’m pointing this out more to show that ACES follows established and proven aspects of successful color management systems.

As you noted, other parts of ACES, such as IDTs, are different.

I’ve only kept aware of IccMax and haven’t seen how it works in practice.

The parts that interested me are that ICC max extends the ability of ICC to use spectral data as well as material properties such as fluorescence. The ability to handle scene referred data is added by defining the conditions of the environment. So for an image with a 10,000 nit D65 white point, the white value is still 1,1,1, but that is defined in reference to the 10,000 nit D65 white. If not defined, it defaults back to the historic specifications. There is a lot more in there as well.

I’m not stating the new capacity is the same as ACES, but it’s not an apples to oranges comparison.

https://www.color.org/specification/ICC.2-2019.pdf

OCES is a deprecated term. It’s not used in the HDR output transforms or in the candidate ACES 2.0 transforms. If you consider OCES an equivalent of ICC PCS, in the ACES system there’s still a way into it for scene referred data. That’s not the case with ICC PCS. ICC PCS reference medium is a least common denominator. OCES is the complete opposite and is intended to encapsulate all current and future displays.

In ACES the flow is always more information down to less information. With ICC PCS you’re sometimes forced to go from more information, to less information, to a location where you wish you had access to the original but PCS tossed it away.

I’m not trying to diminish ICC, but it’s a hammer when the motion picture industry needs a Torx screwdriver. You might be able to hit that Torx screw hard enough with the hammer to kinda get it to stick, but it’s not the right tool for the job.

All that has happened, again, because the actual mechanic is per channel, is punt the medium down the road.

The image formed is medium to medium, and no image constancy exists.

Sorry Troy. Not following. What’s “per channel?” This isn’t about the rendering. The reference medium in ICC is really just a definition of dynamic range limitations into which the colorimetry must fit.

Feeling like we are getting way off topic. I may split this thread into its own topic.

Are you sure of your setup? This is what I have:

Thanks !

Chris

Of course it is about the rendering.

I think it is worth drawing attention to the idea that ICC protocols are essentially about replicating imagery. In ACES, there’s no clear acknowledgement that there is an image, despite the fact that it is already happening.

From the standpoint of the working space, there are fit colourimetry values. Call those W. They have singular definitions.

Then, the rendering comes along and throws all of that out, and there ends up being an implicit medium; ACES primaries, with colourimetry that is completely distorted from the origin colourimetry. That’s the per channel, and that is the picture. Were we in an ICC chain, that’s the thing we’d be replicating.

This isn’t a bad thing; it just draws attention to the fact that there is always a medium, from the standpoint of what people see. In this case, it just happens that there are accidental distortions between every medium, so the result ends up being arbitrary. This is why HDR and SDR look different, and why SDR to SDR looks different.

Going off the deep end about colourimetry in the working space is nonsense when none of it matters due to the arbitrary distortions happening on the output. Literally none of it, except achromatic values, makes it out as the colourimetry stands in the origin.

Good catch Chris! Indeed I was mistakenly using an older OCIO (rev007) so Candidate A was… something else. I think it was at the time just ACES 1.x with the MM tone scale and removed sweeteners. I guess that’s what you get for doing blind testing! :crazy_face:

Everyone: Please disregard all my comments about Candidate A in this thread.

Ah cool, that lines up pretty much exactly with what I observed in my limited tests. A is neutral and B and C are more saturated to varying degrees. All look to be an improvement over ACES 1.2, which is exciting!

Troy … lots to unpack here.

I agree … the primary workflow in ICC starts with an output referred image. The goal is to reproduce that image, as closely as possible, on a different display device (e.g. RGB monitor) or medium (e.g. printed piece of paper) than the original was encoded for.

I’m not sure what this means.

I’m also not sure exactly what you mean here, but I assume you’re taking issue with the fact that camera RGB values are generally transformed into estimates of scene colorimetry, even if that colorimetry is expressed as RGB vs. XYZ or LMS. I’m not sure if you have an issue with that, or if it’s just a statement.

Rendering, by definition, is the process of deviating from the scene’s colorimetry. The transform converts scene referred images to output referred images. All output referred images have an associated display device or reproduction medium … that’s what makes them output referred.

From ISO 22028 Section 4.4

A colour-rendering transform is used to transform a scene-referred image to an output-referred image. Colour-rendering transforms embody the tone and colour reproduction aims of an imaging system, relating the corresponding scene colourimetry to the desired picture colourimetry.
It should be noted that colour-rendering transforms are usually substantially different from identity transforms and accomplish several important purposes including compensating for differences in the input and output viewing conditions, applying tone and gamut mapping to account for the dynamic range and colour gamut of the output medium, and applying colour adjustments to account for preferences of human observers.
NOTE Colour-rendering transforms are typically proprietary and irreversible.

1D lookup tables are often one step in rendering transforms. It is well known they produce hue and saturation deviations from the scene colorimetry on the output device. 1D lookup tables have been used for decades as an intentional, albeit blunt, mechanism to account for the “several important purposes” outlined in the ISO definition of rendering above. It’s not a perfect mechanism, isn’t capable of the fine tuning one might want, but the deviations from scene colorimetry aren’t unintentional.

Ok, so you want to define the output colorimetry of a given rendering transform, on a specific device, to be hero colorimetry?

Yes, ICC starts with an output referred image, intended to be displayed on a particular device or reproduced on a particular medium. That output referred image is transformed to a new output referred image intended to be displayed on a different device or reproduced on a different medium. The ICC “rendering intents” will dictate how gamut mapping between the devices is achieved.

ICC doesn’t do conversion of scene referred images to output referred images.

A major issue for some types of images is that in ICC there’s a “reference medium” and “reference gamut” built into the ICC conversion from the first output referred image to the second. That reference medium has a dynamic range of 288:1 and a black level of 0.3%. So, if you have an output referred image intended to be displayed on a device with a dynamic range of, say, 100,000:1 and want to use an ICC profile to convert it to an output referred image for another HDR device (maybe one capable of a very low black level), you can’t use ICC without playing tricks. The result will be lifted blacks and a compressed dynamic range.
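
For a rough sense of scale, the contrast ratios above can be expressed in photographic stops (log2 of the ratio). The sketch below also shows, under a deliberately naive assumption, what happens to black when a 100,000:1 image is squeezed into a 288:1 medium. The function `fit_to_medium` is hypothetical and simply raises anything below the medium’s black; it is not how an ICC CMM actually maps values.

```python
import math


def stops(contrast_ratio: float) -> float:
    """Dynamic range in stops for a given white:black contrast ratio."""
    return math.log2(contrast_ratio)


def fit_to_medium(relative_luminance: float, medium_ratio: float) -> float:
    """Naive illustration: clamp a relative luminance (white = 1.0) to the
    darkest value a medium with the given contrast ratio can hold."""
    medium_black = 1.0 / medium_ratio
    return max(relative_luminance, medium_black)


pcs_ratio = 288.0        # reference medium ratio quoted in the post
hdr_ratio = 100_000.0    # example HDR display ratio from the post

print(f"reference medium: {stops(pcs_ratio):.1f} stops")
print(f"HDR display:      {stops(hdr_ratio):.1f} stops")

hdr_black = 1.0 / hdr_ratio
lifted = fit_to_medium(hdr_black, pcs_ratio)
print(f"HDR black {hdr_black:.5f} becomes {lifted:.5f} "
      f"({lifted / hdr_black:.0f}x brighter)")
```

That is roughly 8 stops for the reference medium against more than 16 for the display, so about half of the range, counted in stops, has nowhere to go.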

Output referred = medium.
The issue with the PCS reference medium is that there’s a source medium associated with the first output referred image, a destination medium associated with the output referred image you’re trying to create, and in the middle there’s a medium associated with the PCS that’s often smaller than the ones on the ends.

One has a choice when converting between output referred image encodings. Say you have an SDR image you want to reproduce on a HDR device.

You have the choice to either:

A. make the output colorimetry match and look exactly the same on both devices
B. stretch the SDR output referred colorimetry across the HDR output device’s dynamic range
C. use the scene referred image, if available, to re-render a new output referred image taking advantage of the HDR device’s specific capabilities.

No choice is right or wrong, but it’s a choice that needs to be made. Only A. will “match”, but you’re also not taking any advantage of the HDR device’s unique capabilities. Your audience literally won’t even know the displays are different if shown side-by-side.

Assuming the first SDR device’s gamut volume encloses the second’s, the same choices are available here … just less extreme.

If it doesn’t, you need to gamut map between the two if you want them to “match”. I put match in quotes because the devices are different, so they will never strictly match.

I think you’re saying your preference is to reproduce scene colorimetry on an output device. This is a choice one can make.

Wow. Quite a reply. Let’s see…

Yes. Just generalized, open domain tristimulus, of varying quality. Scaled to be uniform with luminance.

I get that. It’s really just going from open domain, to a closed domain. But frankly, that’s where all of the magic is. The open domain tristimulus is still the baseline however. Addressed in greater detail at the end.

EG: A blue car passing under an achromatic cloud should maintain the chromaticity linear attenuation in the gloss of the paint in the working space. If it deviates from that, we end up with a double up in the creative flourishes, where our car paint now represents a sweep of chromaticity angles, as opposed to a single angle, which would compound any corrective or aesthetic flourishes.

That is literally what is seen. There’s no way around this? The critical chromatic attenuation on skin tones, to subtle distortions that may be composed of technical or creative goals. Abney effect versus MacAdam 1958’s chromaticity angle adjustments.

This is where it seems the ICC protocol has some useful facets that might be worth returning to. Specifically, an image author might want their hero colourimetry to expand along appearance precepts to augment chroma or brightness. Or perhaps they may not want that. The notion of a series of authorial “Creative Intentions” could be useful here.

It’s not really a “conversion”, but rather the actual stage that forms the picture. This is why it isn’t included in ICC as the assumption is a fully formed image entry point, that is managed across various mediums.

Agree. I think this could be done better, but the general idea stands.

After all, at the point where a set of per channel curves form the image, we can clearly see that, like it or not, we’ve formed the image and there is an implicit medium. The curves cover some closed range, and we have a unique set of colourimetry for the primaries, yielding precise “new” colourimetry. Technical flourishes for appearance constancy aside (such as veiling glare adjustments or “contrast” for surround), the medium is set at this stage, and the image is clearly formed into it.

It might be worth focusing on this as I’m not sure that covers all of the options? Are there others?

While I don’t think a majority of image authors would want a newly formulated image across different mediums, it is a legitimate creative take, that an author should be permitted to choose.

Loosely, that yields:

  • Reformulate per Medium
  • Replicate from Reference Colourimetry per Medium

In the category of replication, for example, we could enumerate some useful, and optional, facets such as:

  • Appearance of Chroma Expansion. EG: the lightsaber feels more chroma laden in the glow of the EDR version.
  • Appearance of Brightness Expansion. The lightsaber feels brighter, but the core remains white.
  • Other qualia?

I can’t see, for example, Deakins wanting to have a default “always expand brightness” intention, and most likely not “Reformulate my image per medium”. Providing some granularity to the image author as part of the image replication chain might be worth analyzing.

Completely agree. Although the medium should be along lines of appearance percept domains; blind number mashing would yield unwanted appearance results.

Indeed. And these things should be in author control.

Gosh I loathe that boat image every time it pops up.

I would consider several facets here being important:

  • Aesthetic flourishes; skin chromaticity angle rotation, etc.
  • Technical flourishes; Abney / BB, flare, surround, etc.

The baseline image formation likely would need those facets controlled individually?

Examples:

  • Stagecraft walls would likely want the attenuation of the formed image for sun or skies, without the perceptual flourishes to avoid a double up when photographed.
  • A video game may want to have a chromaticity linear formed image for an in-game display, to avoid the double up, also without flare and contrast correction on the in-game formed image.

Keeping these things in a category of creative and technical intent flags allows for an echo of what the ICC protocol considered, but somewhat dropped the ball on. If the most basic formation remains chromaticity tristimulus linear, and the various intents are optional layers, it would seem like the only tenable method to offer the granular control required?
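
One way to picture a “chromaticity linear” baseline is a tone scale driven by a single norm of the RGB triplet rather than by each channel. The sketch below is only an illustration under assumptions: it uses max(R, G, B) as the norm and a simple Reinhard curve as the tone scale, neither of which comes from this thread, and all function names are hypothetical. “Chromaticity” here is the plain sum-normalized RGB triplet, not CIE xy.

```python
def tone_curve(x: float) -> float:
    """Stand-in tone scale (simple Reinhard); an assumption, not a proposal."""
    return x / (1.0 + x)


def chromaticity(rgb):
    """Sum-normalized RGB: unchanged by any uniform scaling of the triplet."""
    total = sum(rgb)
    return tuple(c / total for c in rgb)


def form_per_channel(rgb):
    """Apply the curve to each channel independently."""
    return tuple(tone_curve(c) for c in rgb)


def form_norm_based(rgb):
    """Apply the curve to max(R, G, B) and scale all channels by the same
    factor, so the R:G:B ratios of the input are kept."""
    norm = max(rgb)
    if norm <= 0.0:
        return (0.0, 0.0, 0.0)
    scale = tone_curve(norm) / norm
    return tuple(c * scale for c in rgb)


blue_paint = (0.1, 0.2, 0.8)  # hypothetical scene-linear blue
for exposure in (1.0, 8.0):
    rgb = tuple(c * exposure for c in blue_paint)
    fmt = lambda t: ", ".join(f"{v:.3f}" for v in chromaticity(t))
    print(f"x{exposure}: source [{fmt(rgb)}]  "
          f"norm-based [{fmt(form_norm_based(rgb))}]  "
          f"per-channel [{fmt(form_per_channel(rgb))}]")
```

With the norm-based version the blue keeps a single chromaticity at every exposure, and any attenuation towards white or aesthetic rotation would then be an explicit, optional layer on top. With the per-channel version the chromaticity moves as exposure changes, whether or not anyone asked for it.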

Hello,

A few questions to help people not familiar with bespoke terminology.

What does “generalized” mean in this context? “Varying quality” with respect to the scene?

Can you explain the reason for your open domain and closed domain terminology?

Why closed domain instead of closed range, for example, when there is a transformation implied by the “from … to” in your sentence (and given that you also use closed range)?

By technical and creative flourishes, you mean embellishments / adornments?

Choice C, without specifically taking advantage of the HDR device’s capabilities, should cover all your use cases.

What is a facet? A perceptual effect of the display rendering transform?

What does this mean? There are like ~150 results on Google Search and almost all are yours.

Which definition of qualia are you using, is it equivalent to facets?

What are appearance percept domains? Zero search hits.

You are listing flourishes here but above you listed qualia, are they different?

intents ~= facets ~= flourishes ~= qualia?

With that said I generally agree with (the translation I make of) what you are saying, I honestly think that everybody would: More control of the rendering (what you call formation), as required, is always better than not enough.

Cheers,

Thomas

Because 3x3 fits are generally pretty poor for more chromaticity pure values. I consider the general 3x3 fits a continuum:

  • Reasonably acceptable - at the achromatic axis origin
  • Moderately acceptable - as values move away from the origin, but close
  • Poor - as we move out toward the edges of the locus
  • Nonsense - values fit to beyond the locus

This is also part of the domain confusion here, as a three primary based fit reprojects the locus for the camera “observer”, as below.

And this list is merely the isoluminant chromaticity aspect. We could probably argue that in terms of flicker photometric luminance the values could be considered along a similar scale.

Yes. Because a domain is different from a range. At risk of using a goofy analogy… a domain is a banana, while a range is a count of bananas. We can discuss slicing up the domain of a banana, but speaking of slices outside the banana is nonsense.

Subtle language nuance can lead to incorrect inference, so I try to use language that insulates against these inferences. A closed domain is a bounded limit, with nothing beyond. For example, if we forget that a 3x3 CAT is domain bound, we might consider plotting the results in accordance with the Standard Observer Illuminant E model. If we realize that the Standard Observer is domain bound, we’d also realize that in order to properly project it into a normalized CIE xy projection, we’d have to reproject the locus itself, to account for the entire reprojection of the domain.

In the end, tristimulus is tristimulus in terms of a numerical metric. The domains, on the other hand, matter significantly.

You are quite correct that using “from” and “to” is probably weak, given that I place value on the formed image, and in that light, I do not believe the image exists in the open domain tristimulus, but rather is formed via mechanics.

As for “range”, see above. I believe it is an inappropriate use that can lead to incorrect inferences.

Yes, more or less. A “creative flourish” is something that could be considered strictly “creative”, hence, according to Merriam-Webster “a decorative or finishing detail”. I’d be less inclined to place technical facets under the umbrella term, though.

Is this not covered by the two booleans of “Expand Chroma” and “Expand Brightness”? Are there other perceptual qualia here being overlooked?

Merriam-Webster:

any of the definable aspects that make up a subject (as of contemplation) or an object (as of consideration)

I’m not fond of “Display Rendering Transform”, hence I don’t use it, as it comes with some directionality, and lessens the influence of the image formation chain. Dye density layer crosstalk in subtractive mediums, or even per channel crosstalk in additive systems, is a critical and fundamental component in terms of the formed image / picture. That mechanic, as well as the result, does not rest in the open domain tristimulus, or in the case of actual subtractive creative film, in the spectral energy outside of the camera.

Saying “rendering”, therefore, is both too general and downplays the mechanics of the actual image formation.

I use the term “chroma” in the CIE sense, as I believe that’s the appropriate term for distance to an achromatic adapted axis.

colourfulness of an area judged as a proportion of the brightness of a similarly illuminated area that appears grey, white or highly transmitting.

Many folks erroneously use the term “saturation” here.

Laden is simply “heavy”.

Qualia is a term used in quite a few colour science papers, and has origins that track back a long ways to possibly 1675.

a property (such as redness) considered apart from things having the property
a property as it is experienced as distinct from any source it might have in a physical object.

Generally speaking, it’s a phenomenological term.

Likely redundant with the use of “appearance” and “percept”, but to clearly distinguish as the phenomenological aspect of human sense, not the erroneous idea of an attribute of an object.

percept: an impression of an object obtained by use of the senses

As above.

If the takeaway was simply “more control of the rendering”, then you gravely misinterpreted what was said.

I spoke of breaking down the image formation chain such that components can be isolated, as well as drawing a connection to the ICC protocol, where some of these facets are enumerated.

“Intent” from the ICC idea of “rendering intent”.
“Facets” being components.
“Flourishes” being largely creative embellishments.
“Qualia” being things that are specifically related to sensation in the phenomenological sense.

Hope this helps.
