Without too much acoustical theory....
The sound is always there. Sympathetic frequencies are always octaves of dominant frequencies, either higher or lower. It could possibly be the natural frequency of the engine, or it could be the natural frequency of the entire car.
Since the sound discussed is lower, not higher, I would say that because lower frequencies have longer wavelengths, and most mics, especially in-car ones, aren't designed to pick them up, you won't hear them in-car. Also, since sound from the engine is omnidirectional (sympathetic movement of adjacent air molecules from the top and sides of the engine as it moves through the air), minus the expulsion of the gases from the exhaust, you'll only get what the engine throws forward, plus the vibration of the mic in its mounting. Most of the time, phase cancellation will take care of some frequencies, and what you have left are the dominants, in this case the blender sound of the revving. We can see that, for the most part, the car isn't shaking the camera to pieces, so the natural frequency of the camera + mounting + air scoop combination is out of range of the natural frequency of the engine + monocoque connection. The higher the revs, the higher the frequency, as always, and vice versa.
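To put rough numbers on "higher revs, higher frequency" and "lower frequencies are longer", here is a minimal Python sketch. It rests on assumptions that are mine, not from the video: a four-stroke engine (each cylinder fires once every two revolutions), a speed of sound of 343 m/s, and made-up function names. Treat it as a back-of-envelope illustration, not a model of any particular car.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 C


def firing_frequency_hz(rpm: float, cylinders: int) -> float:
    """Dominant firing frequency, assuming a four-stroke engine."""
    revs_per_second = rpm / 60.0
    # Four-stroke assumption: each cylinder fires once per two revolutions.
    return revs_per_second * cylinders / 2.0


def octave_below_hz(frequency_hz: float) -> float:
    """One octave down is half the frequency."""
    return frequency_hz / 2.0


def wavelength_m(frequency_hz: float) -> float:
    """Wavelength in air: speed of sound divided by frequency."""
    return SPEED_OF_SOUND_M_S / frequency_hz


if __name__ == "__main__":
    # Hypothetical example values, chosen only for illustration.
    dominant = firing_frequency_hz(rpm=18000, cylinders=8)
    lower = octave_below_hz(dominant)
    print(f"dominant: {dominant:.0f} Hz, wavelength {wavelength_m(dominant):.2f} m")
    print(f"octave below: {lower:.0f} Hz, wavelength {wavelength_m(lower):.2f} m")
```

With those hypothetical values the dominant comes out at 1200 Hz and the octave below at 600 Hz, and halving the frequency doubles the wavelength, which is the "longer wave form" part of the argument.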
Since the car is moving forward and the sound of the exhaust is moving backward, and is probably cancelled, you won't hear the exhaust note on an in-car mic unless they put a mic behind the exhaust or are miking an object that resonates at the same frequency (a suspension mounting behind the exhaust, the driver's helmet, maybe...).
In the section of the video mentioned, you're in the middle of a road track, with barriers and natural items (trees, grass, air), as well as surrounded by a huge oval, with the source camera not on the car and at some distance, as it's coming round a corner in a lower-rev situation. The television camera + mic setup can pick up the long waveforms, as they've done a bit of travel, as well as the bounces of the frequencies off the track, off the grass/trees, etc., to make a "whole sound", one with phase modulation due to the same frequency arriving at the hearer, in this case the mic, slightly sooner (straight through the air to the mic) or later (bounced off an object and toward the hearer). You'll also hear the sympathetics that are lower, due to their ability to travel, in conjunction with the original sound of the engine.
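The direct-versus-bounced arrival idea can be sketched in a few lines of Python. This is a simplified two-path picture under my own assumptions: one direct path and one reflection, a speed of sound of 343 m/s, no loss with distance, and a `reflection_gain` parameter I made up to stand in for how much the surface soaks up. Real trackside sound has many more paths, so take it as an illustration of the principle only.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 C


def arrival_delay_s(direct_m: float, reflected_m: float) -> float:
    """Extra time the bounced sound needs compared with the direct sound."""
    return (reflected_m - direct_m) / SPEED_OF_SOUND_M_S


def combined_level(frequency_hz: float, direct_m: float, reflected_m: float,
                   reflection_gain: float = 1.0) -> float:
    """Relative amplitude of direct + reflected sound at the mic.

    2.0 means the two arrivals fully reinforce, 0.0 means they fully cancel
    (both extremes only when reflection_gain is 1.0).
    """
    phase = 2.0 * math.pi * frequency_hz * arrival_delay_s(direct_m, reflected_m)
    real = 1.0 + reflection_gain * math.cos(phase)
    imag = -reflection_gain * math.sin(phase)
    return math.hypot(real, imag)


def first_cancelled_hz(direct_m: float, reflected_m: float) -> float:
    """Lowest frequency whose bounce arrives half a cycle late."""
    return 1.0 / (2.0 * arrival_delay_s(direct_m, reflected_m))


if __name__ == "__main__":
    # Hypothetical distances: 10 m direct, 13.43 m via a bounce.
    for f in (25, 50, 100, 200):
        print(f, "Hz ->", round(combined_level(f, 10.0, 13.43), 3))
```

With those hypothetical distances the bounce arrives about 10 ms late, so roughly 50 Hz cancels while 100 Hz reinforces. Which frequencies drop out depends entirely on where the mic sits relative to the reflecting surfaces, which is why a trackside mic and an in-car mic end up with such different mixes.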
Hope that helps.
cheers,
James