I think what you are describing is an audio book with video illustrations. This wouldn't need to be variable bitrate; still pictures that last the length of each audio scene and match the changing intensity of the audio would be enough (e.g. the audio changes to a disapproving voice, the video changes to a frown, or whatever is appropriate). If that is a correct way to describe what you want, then what you need is something that, in effect, sets an I-frame and holds it for the duration of the particular dialogue, changing to a new frame when the intensity of the audio changes. Hmmm. You would think there would be a way to do this using Avisynth. Instead of using scene changes to set an I-frame, you would use audio cues...
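To make the "audio cues instead of scene changes" idea a bit more concrete, here is a rough sketch in plain Python (not Avisynth, and not something I have run against a real audio book). It assumes the audio is already decoded to a list of mono samples between -1.0 and 1.0. The function names, the half-second window, and the "level doubles or halves" threshold are all my own guesses for illustration, not anything built into Avisynth or an encoder.

```python
import math

def window_rms(samples, sample_rate, window_s=0.5):
    """Return (window size in samples, RMS level of each consecutive window)."""
    size = max(1, int(sample_rate * window_s))
    levels = []
    for start in range(0, len(samples), size):
        chunk = samples[start:start + size]
        levels.append(math.sqrt(sum(x * x for x in chunk) / len(chunk)))
    return size, levels

def find_cue_times(samples, sample_rate, window_s=0.5, ratio=2.0, floor=1e-4):
    """Return the times (in seconds) at which the held picture should change.

    A cue is emitted when a window's level differs from the level of the
    currently held segment by at least `ratio` in either direction.
    `floor` keeps silence from causing a division by zero.
    """
    size, levels = window_rms(samples, sample_rate, window_s)
    if not levels:
        return []
    cues = [0.0]                      # the first picture starts at time zero
    held = max(levels[0], floor)      # level of the segment being held
    for i, level in enumerate(levels[1:], start=1):
        level = max(level, floor)
        if level / held >= ratio or held / level >= ratio:
            cues.append(i * size / sample_rate)
            held = level              # start holding at the new level
    return cues
```

Multiplying each cue time by the video frame rate would give the frame numbers where a new still starts, which is where you would want the I-frames to land. A plain loudness check like this only catches changes in volume, so a shift in tone of voice at the same volume would slip through.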