The point of scene-referred compositing is that the different elements have not yet been colour rendered, so integration and physically plausible edits are possible.
If you start with video footage that has already been colour graded and colour rendered to a display-referred state, the inverse RRT will not restore a scene-referred state in the true sense.
Also, inverse transforms have extremely steep gradients. The resulting pseudo scene-referred data can be very fragile (meaning that if you start modifying it, it will break apart).
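To put rough numbers on the steep-gradient point, here is a minimal sketch. It uses a simple Reinhard-style curve, y = x / (1 + x), as a stand-in for a display rendering. This is an assumption for illustration only: it is not the RRT, and all function names are made up. The idea it shows should carry over, though: near the top of the display range the inverse curve becomes so steep that a single 8-bit code value step turns into a huge jump in the "recovered" scene values.

```python
def tone_map(x):
    """Toy display rendering (Reinhard-style). A stand-in, NOT the ACES RRT."""
    return x / (1.0 + x)


def inverse_tone_map(y):
    """Analytic inverse of tone_map, valid for 0 <= y < 1."""
    return y / (1.0 - y)


def inverse_gradient(y):
    """Slope of inverse_tone_map at display value y: 1 / (1 - y)^2."""
    return 1.0 / (1.0 - y) ** 2


def quantize_8bit(y):
    """Round a 0..1 display value to the nearest 8-bit code value."""
    return round(y * 255.0) / 255.0


def round_trip_error(scene_value):
    """Scene -> display -> 8-bit quantization -> inverse, absolute error."""
    display = quantize_8bit(tone_map(scene_value))
    return abs(inverse_tone_map(display) - scene_value)


if __name__ == "__main__":
    # Slope of the inverse at a few display values.
    for y in (0.2, 0.5, 0.9, 0.99):
        print(f"display {y:.2f}: inverse slope {inverse_gradient(y):.1f}")

    # Quantization error after the round trip, mid grey vs. a bright highlight.
    for x in (0.18, 30.0):
        print(f"scene {x}: round-trip error {round_trip_error(x):.5f}")
```

With this toy curve, the last two usable 8-bit code values (253 and 254) invert to scene values that are more than 100 units apart, while values around mid grey survive the round trip almost unchanged. Any grade, noise, or compression artefact sitting in those highlights gets amplified in the same way, which is why the inverted data breaks apart as soon as you push it.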
The best place to spend your energy is on educating the people around you and trying to get proper scene-referred data.
I hope this makes sense.