Help me understand Photography Raw

Hey everyone, I am trying to understand the difference between the scene-linear luminance data I get from “cinema” RAW files (ARRIRAW, R3D or some log encoding) and what I get from photography stills. I think I have now gone through all the programs (rawtoaces, Resolve, Affinity, LibRaw, dcraw etc.) with test files from different cameras and manufacturers, and I don't really understand how the latitude gets distributed. It's like I get a normalized “linear”?

So basically what I am testing is the highest values in my colour channels. Let's take ARRIRAW debayered to ACES from the Alexa 65 with a clipped sun in frame: the highest it goes is around 65. To me that makes sense, as we would have middle grey at around 0.18, so from that I can derive that 65 is approx. 8.5 stops above middle grey. Similar behaviour can be seen across the board with RED, Blackmagic etc.

(maths)
65=0.18*2^X
X= 8.4963
(/maths)
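
The same arithmetic as a small Python helper, handy for checking other peak values. This is only a sketch: the function name `stops_above_grey` is made up for illustration, and 0.18 is assumed as the middle grey reference, as in the maths above.

```python
import math

def stops_above_grey(value, grey=0.18):
    """Return how many stops `value` sits above middle grey (negative if below)."""
    return math.log2(value / grey)

# Peak values mentioned in this post.
print(round(stops_above_grey(65.0), 4))  # ARRIRAW peak with clipped sun
print(round(stops_above_grey(6.0), 2))   # A7R2 peak via rawtoaces
print(round(stops_above_grey(1.0), 2))   # tools that clip just above 1
```

A peak of 1.0 leaves only about two and a half stops above 0.18, which is why a normalized file looks "short" next to cinema footage even if the sensor captured more.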

In the case of photo raw, it makes a big difference how the RAW was developed. Using the purpose-built tool rawtoaces I was able to debayer a Sony A7R2 file and get values up to approximately 6, but most other tools clip at just above 1. There doesn't seem to be less visible dynamic range; it just seems to be normalized, I guess to fit integer-based formats like TIFF, which CR2 for example is based on, IIRC.
Rawtoaces has native support for the A7r2.

When I have to patch something in comp with a photograph from a stills camera (like a clean plate), the values from the stills and cinema cameras are in totally different ranges. Obviously one would then adjust the photograph to match the plate in comp, but is that really the point?

I really hope someone can enlighten me as to why this is and whether it might be on purpose (are raw stills meant to be used as diffuse textures rather than scene-linear?) and, even better, how I can get them to behave like cinema footage.

Please do let me know if I have this backwards!!


You’ll find DSLRs are camera referred raws, which in terms of integer encodings, will always be normalized from 0-100% of the code value range^1. This is typically a device linear range.

The confusing part is that the relative middle grey code value in device linear varies camera to camera, and is a byproduct of the engineering. This means that for any given DSLR encoding, the middle grey value in that range will differ from another camera.

TL;DR: If I’ve understood your question correctly, DSLR raws need to be uniformly scaled via multiply to properly align the middle grey value.

1 - Depending on the photosite saturation level.
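
A minimal sketch of that uniform multiply, assuming you already have linear pixel values and some measurement of where middle grey landed in the still. The function names and the 0.09 measurement are hypothetical, purely for illustration.

```python
import math

def grey_alignment_gain(measured_grey, target_grey=0.18):
    """Single multiplier that moves the measured grey value onto the target."""
    if measured_grey <= 0:
        raise ValueError("measured grey must be positive")
    return target_grey / measured_grey

def apply_gain(pixels, gain):
    """Scale every channel of every pixel by the same factor (a plain exposure shift)."""
    return [[channel * gain for channel in pixel] for pixel in pixels]

measured = 0.09  # hypothetical grey card reading from the developed still
gain = grey_alignment_gain(measured)
print("gain:", gain, "which is", round(math.log2(gain), 2), "EV")

pixels = [[0.09, 0.09, 0.09], [0.5, 0.4, 0.3]]  # made-up linear RGB samples
print(apply_gain(pixels, gain))
```

Because the same factor is applied to all channels, the ratios between pixels are untouched; only the overall exposure anchor moves.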

Thanks for the reply , Troy! Might have cut off some of your post?

Would love to get more info on this, as I can't find anything on “camera referred raws”. So basically the only thing I would have to do in practice is shoot a middle grey chart and adjust to it. Sadly the probability of getting such a thing from set is near zero, so it will basically be a “manually adjust the ingested raw to fit” type of deal.

This would also explain the “1.4 exposure” setting that DJI wants for ingesting their Zenmuse Drone stuff… hmm

And I was probably falsely expecting rawtoaces to be able to scale the values to scene referred by using the metadata and spectral data of its supported cameras. Somehow this can be worked out from bracketed shots when merging HDRIs as well…

Argh, why do they have to make our lives so hard? Just make a stills camera, ARRI/RED/Blackmagic; it can't be that hard to put one of your sensors in a small case with a fast mechanical shutter… Imagine a mirrorless-sized camera with a fast shutter and autofocus that shoots BRAW or something. Why not?

How?

To put this more clearly, imagine two cameras such as a Hasselblad and an entry level Canon. The Hasselblad might encode 14 EV worth of dynamic range, while the Canon 10 EV. Both encode to integer camera / device referred encodings in the 0-XXX range, which in practical terms after photosite saturation and such is accounted for, amounts to identical ranges of 0-100% of the code value range.

Where’s a middle grey? Answer… who knows!

In terms of the device referred encoding, the values vary and they can end up anywhere. If you place middle grey at 18% of the code value range, you hard code the headroom above it at around 2.5 EV (log2(1 / 0.18) ≈ 2.47): 0.18 doubles to 0.36, then to 0.72, and the third doubling to 1.44 already overshoots the 100% normalized code value.
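
To play with that relationship, here is a rough sketch (helper names are hypothetical) that converts between where grey sits in a normalized 0-1 range and how many stops of headroom remain above it.

```python
import math

def headroom_ev(grey_fraction):
    """Stops available between the grey placement and 100% of the code range."""
    return math.log2(1.0 / grey_fraction)

def grey_fraction_for_headroom(stops):
    """Where grey must sit in the 0-1 range to leave `stops` of headroom."""
    return 2.0 ** -stops

for grey in (0.18, 0.09, 0.045):
    print(f"grey at {grey:.3f} of range -> {headroom_ev(grey):.2f} EV of headroom")

for stops in (4, 6, 8):
    print(f"{stops} EV of headroom -> grey at {grey_fraction_for_headroom(stops):.5f} of range")
```

So a camera that really holds 6 stops over grey has to park grey at roughly 1.6% of its normalized range, nowhere near 18%.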

Hence for wider dynamic range cameras, the value is arbitrary.

Typically, digital sensors are absolutely woeful, and you’ll find that the middle grey will only require a 1 to 2 EV adjustment. They really are awful.

Linear encodings are pretty unfortunate. If you think about it, a DSLR at 16 bit linear can encode 16 EV really poorly. At the lowest end, you are at the first EV, with exactly one code value of data, then the next EV you have two, then the next EV four, then eight, and so on. We wouldn’t get into “meaty” denser data with sufficient code values until far higher up, where we care about them less and less.
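
A quick sketch of that counting argument for an idealized integer linear encoding (no black level offset, no noise; the function name is made up). Stop k from the bottom covers code values 2^(k-1) up to 2^k - 1.

```python
def code_values_per_stop(bits=16):
    """Number of distinct code values inside each stop, from darkest to brightest."""
    return [2 ** (stop - 1) for stop in range(1, bits + 1)]

counts = code_values_per_stop(16)
total = sum(counts)  # every non-zero code value of the 16 bit range
for stop, count in enumerate(counts, start=1):
    print(f"stop {stop:2d}: {count:5d} code values ({100.0 * count / total:.3f}% of range)")
```

Half of all code values end up in the single brightest stop, while the bottom few stops share a handful between them.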

The device transfer function a vendor chooses is there for good reason; it’s a valuable compression function. The time of linear raw encodings is likely soon to be relegated to the discard bin of history for this very reason; they are horrifically inefficient, but because digital sensors are so f#%king horrible, we haven’t quite noticed.

At risk of sounding like a broken record, there are way worse offenses we take for standard practice. Gamut mapping is vastly more important than all of this, yet 99% of people turn a blind eye.


I thought rawtoaces knows by knowing the device-specific spectral response, and thus can rebuild the correct scene-linear data, but I probably misunderstood this. Totally agree on the rest you said.
But I'd still see a case for a scene-linear photography camera :smiley:

So basically my hunt for getting proper scene-linear data from photography raws is a dead end then, and the best I can do to import them into an ACES pipeline in practice is to:

  • make sure I am not clipping latitude by checking visually against something like ACR
  • map the gamut correctly, best checked with a colour chart (Nuke mmColorTarget, for example)
  • adjust so middle grey is 0.18
  • if it's an element for compositing or matte painting, adjust the luma range to fit a reference or made-up values, e.g. have a highlight match the main plate if it's comparable, or just multiply to give me numbers close to what I expect.

Is that really all I can do when mixing stills and cinema footage?

Remember that having the spectral compositions of the CFA filters doesn’t mean much; the result is just a dumb math fit, which amounts to a “this is kinda sorta like the values in the reference swatches for these less saturated mixtures, and is totally bollocks error the further out we go”.

The sensor capture is dumber than that; it’s just photosite counts of current.

Newsflash: that’s what you have. It’s just not a super great one, and it sure doesn’t do a terrific job of capturing spectra reliably. It’s just a photon-to-current sensor, and a not great one at that.

Hi @Finn_Jager,

Knowledge of the spectral sensitivities and obtaining a scene-linear response are not related problems. To get a scene-linear response you need to know what the camera response is and how the raw data is stored in the file container, and this is vendor-specific: Canon, for example, tends to store it linearly, while you will find that Nikon does not.

The camera sensors tend to behave linearly except when reaching their saturation level, so in post-production, we do the following:

  • Texture Authoring Imagery: It should never reach sensor saturation, so there is no issue here, if it does, you complain to the photographer that the data is unusable :slight_smile:
  • HDR Imagery for IBL: The parts of the imagery reaching sensor saturation but also the shadow portion are discarded during the merge, so there is no issue here again.
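
As a rough illustration of the discard step in the IBL case, here is a per-pixel sketch of a bracket merge. It assumes linear samples normalized to 0-1 and known exposure times; the thresholds (0.05 and 0.95) and the plain unweighted average are simplifying assumptions, not what any particular merge tool does.

```python
def merge_pixel(values, exposure_times, low=0.05, high=0.95):
    """Merge one pixel across brackets into a relative scene value.

    Samples near sensor saturation (>= high) or buried in the shadows (<= low)
    are discarded; the rest are divided by exposure time and averaged.
    Returns None if no bracket holds a usable sample.
    """
    estimates = [v / t for v, t in zip(values, exposure_times) if low < v < high]
    if not estimates:
        return None
    return sum(estimates) / len(estimates)

# Hypothetical three-shot bracket of the same pixel.
times = [0.01, 0.1, 1.0]    # seconds
values = [0.02, 0.2, 1.0]   # too dark, usable, saturated
print(merge_pixel(values, times))
```

Only the middle bracket contributes here, so the clipped and noisy samples never touch the result.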

If the DSLR imagery is adjusted to produce exposure values similar to those of a motion-picture camera, and the DSLR camera + lens characterisation is correct, the imagery should be really close, within reason.

Cheers,

Thomas


Cheers, this does clear up a lot of stuff for me!

Hmm, I don't quite follow. If I take a single frame with an Alexa I get scene-linear values up until clipping and it all makes sense, but I do not with a photography camera, as you explained. If I could have a stills-camera RED or whatever to take references with on set, wouldn't this get me way closer?

Feel like I am missing something :smiley:

The only part you are missing is that ARRI has carefully measured the currents off the sensor and factored in other values that result in known intensity alignments when they hand the data off to you. They’ve done the work via the metadata etc.

For a majority of DSLR vendors, you get what amounts to a highly massaged set of mosaicked current counts from under each filter, with little more information than that. It’s linear in terms of current / radiometric accumulation more or less, but there is no further means to glean the “magic values” as to where they peg a middle grey. Some vendors vary quite wildly on this as well, for example, Fuji.

To ground this, crack open a DSLR at home and pop a grey card if you have one, and expose for it. Then ingest the raw and check the actual code values in your preferred software. You should be able to get a handle on roughly where the vendor slots the value.
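
If you would rather read the value off in code than eyeball it, something along these lines works once the raw is developed to a linear, 0-1 normalized single-channel image. This is a sketch only: the nested-list image, the patch coordinates and the function name are all hypothetical, and the median is used simply because it shrugs off a few dust or glare pixels on the card.

```python
from statistics import median

def patch_median(image, top, left, height, width):
    """Median of a rectangular patch in a 2D list of normalized linear values."""
    samples = [
        image[y][x]
        for y in range(top, top + height)
        for x in range(left, left + width)
    ]
    return median(samples)

# Synthetic stand-in for a developed raw: bright surround with a grey card in the middle.
image = [[0.5] * 6 for _ in range(6)]
for y in range(2, 4):
    for x in range(2, 4):
        image[y][x] = 0.1

grey = patch_median(image, top=2, left=2, height=2, width=2)
print(f"grey card sits at {grey:.3f} of the normalized range")
```

Repeat that for a couple of bodies and you get a feel for how much the placement moves between vendors.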