30 Minute Film School V: The Good, the Bad, and the Audio
So the pun in the title made you groan, huh? Well, consider that groan to be inuring you to what is about the most cringe-worthy technical aspect of filmmaking. Audio.
Okay. This episode of 30MFS has been a long time in the making because it is just as hard for me to write about audio as it is to get good audio. First and foremost on my mind is the simple fact that there is no such thing as “good audio”. There is “bad audio”, and “audio you can work with.” A major reason I held off, despite knowing I would have to tackle the subject eventually, was that I wanted enough hands-on experience making and manipulating workable audio that I would not be sending people into the field knowing only that audio sucks a big one (there, now you know what everyone knows), but also knowing how to make it suck a little less (slightly more privileged information).
The thing you need to understand is that audiences are much less forgiving of imperfect audio than they are of imperfect imagery. Just look at the recent release of Meek’s Cutoff, a movie that has very grainy, lo-fi imagery, but sounds fine. The fact is, most aspiring/upcoming filmmakers do not know how badly audio screws you because most aspiring/upcoming filmmakers never see productions with bad audio. But spend one day watching some student shorts shot on even some of the most amazing prosumer or even professional cameras, and it becomes very clear very quickly why even the most forgiving of niche audiences just won’t sit still when all the dialog is clipping.
Theorists and historians bemoan the end of the silent era—which was never really silent anyway—for the freedom of expression that film had, not being tied down to audio. This is a point where I think the entire film world can be peaceful and in agreement—technicians hate audio too, especially audio technicians. If you are ever driving by a location shoot and want to figure out who the sound mixer is, look for the guy in the corner with his hands over his earphones smoking three cigarettes at once and barely able to contain his simmering, red-faced rage. Yeah, that’s the key of the sound department, and he’s hearing everything that’s going wrong with the production in real time, as the production moves on in complete ignorance of it.
Because of the demands of audio, the sound department and the camera department have historically been at odds on set. Without the camera, the film does not exist; but without the sound, it sucks. Rewatch Singin’ in the Rain for some comical references to the issues of early sound synchronization, but realize that even today camera and sound are in a constant shuffle for space on set, with both departments getting in the way of each other. Luckily, cameras, lights, and sound equipment alike have all gotten smaller, with smaller crews needed to work them, so things aren’t as heated anymore. However, when you see that sound mixer guy reaching for a fourth cigarette without having finished the other three still glowing in his mouth, you’ll know some ancient rivalries never die.
Where to Start
Here’s an easy Step 1: Do not use sound from your camera’s internal mic.
That is bad sound.
I’d like to say period, but there is an exception to this rule that I will get to AFTER I’m done explaining why.
First off, the camera itself makes noise, and the mic on the camera cannot help but pick up that noise. As a result, most cameras are designed to be really quiet and the mics are designed to cancel out that noise. But in doing so, they cancel ALL sound in that frequency range, meaning you are not getting your full range of sound. The effect is called squelching, and it is similar to what happens in the noise-cancellation headphones you use: a cancelling signal is ADDED to the input so that you do not hear the noisy part of the output, and everything else in its range goes with it.
Secondly, many of the mics on cameras are cardioid microphones, which are not ideal for film and video production. Which is the perfect place to start discussing microphones in general.
The Three Types of Microphones
There are many types of microphones, but only three you really need to know about at the beginning, and they are differentiated by the area in which they pick up sound.
OMNIDIRECTIONAL microphones are microphones that pick up sound from, as the word implies, any direction. They are what the FBI is using to spy on you wanking off in the bathroom because everyone assumes you are a terrorist. Omnidirectional microphones are meant to capture everything in a small space, and sounds are harder to hear the further away from the microphone they are emitted.
CARDIOID microphones get their name from the shape of the area they cover, which is like a heart. They are the most familiar type of microphone; you see them when Obama is giving a speech or your favorite band is performing a show. The cardioid shape gives clearer, more dynamic sound, especially vocals, at a very short range.
DIRECTIONAL microphones, also known as shotgun microphones, cover an area that is cone-shaped—you point them directly at the source of the sound, in order to pick up mostly that sound and no other.
There are gradations between Omni, Cardioid, and Directional mics, usually things like “supercardioid” and “hypercardioid”. The only weird sounding one is Bi-directional, or Figure of 8 mics, which basically are Cardioids where the heart-shape is constricted in the middle. Honestly, none of that information matters. Directional mics are what you use when you make movies, unless you literally cannot get your hands on one, at which point you have to work with what you have (never fear, there are still things you can do).
Movie crews use directional microphones, but cardioid microphones are typically what is found on prosumer cameras. Why is that? The reason is that the camera uses the heart-shaped sound area to create a false stereo mix on the tape: the “heart” can be divided into left and right. In this way internal microphones on prosumer cameras can be useful as a back-up and as a synchronization device. You can create a relatively dynamic “mix” by turning one channel of audio input to the internal mic, and then attaching an external boom mic (typically through an XLR connection) to the camera and turning the second channel to external. The boom mic gets the detail, and the internal mic gets the ambience. This is basically what most documentary crews do.
“Can my camera do that?”
Most cameras that can create a stereo mix in-camera have XLR connections to retrieve two different inputs of sound. Those cameras that just have a simple audio jack in, the type that look just like headphone jacks, only record one input of sound and you must switch it from either internal mic or external mic through the digital menu.
NOTE: the currently very popular Canon 5D and 7D cameras have a major drawback in this way. In addition to not having an XLR input, adapting an XLR to the typical mic input jack does not change the fact that the 5 and 7Ds record audio through a compression that, easiest way of saying it, corrupts the audio. Thus, those working with a 5D or 7D would do well to have a separate field mixer for audio, which causes two problems: syncing, and syncing.
Yes, I said that twice.
Syncing separately recorded audio
FIRST, recording audio and image separately means they must later be synced together. This means you need a slate.*
SECOND, if you are recording video, which you probably are, and that video is NTSC, which it probably is, then it records at a different rate than audio does. That means the sync will slowly lag during takes. You must drop frames from the video to catch it up.
This I am not going to be able to explain clearly. NTSC video is 29.97 (not 30) frames a second. That .03 difference is where audio and video disagree on “real time”. Thus, audio goes by faster than the video does, and so every so often a frame (roughly one in every thousand, not one in every thirty) has to be dropped from video for sound to catch up. It’s a goddamned hassle, which is another reason why HD video is so popular: not as many syncing issues. In fact, I’ve never really figured out the whole NTSC audio syncing conundrum thing. There are plenty of tutorials about it elsewhere on the Internet. Eat your heart out, I don’t even want to think about it.
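For the curious, the size of that disagreement is easy to put numbers on. The sketch below is only an illustration: it assumes the exact NTSC rate of 30000/1001 frames per second (the figure usually rounded to 29.97) and a hypothetical workflow where something in the chain counts those frames as if there were an even 30 per second. The function and constant names are made up for this example.

```python
from fractions import Fraction

NTSC_FPS = Fraction(30000, 1001)   # the exact rate behind "29.97"
NOMINAL_FPS = Fraction(30)         # what a naive workflow assumes

def drift_in_frames(real_seconds):
    """Frames of disagreement between 30 fps and NTSC after real_seconds."""
    return (NOMINAL_FPS - NTSC_FPS) * real_seconds

# How long until picture and sound are a full frame apart?
seconds_per_frame_of_drift = 1 / (NOMINAL_FPS - NTSC_FPS)

print(float(seconds_per_frame_of_drift))   # seconds per frame of drift
print(float(drift_in_frames(60)))          # drift over a one-minute take
print(float(drift_in_frames(600)))         # drift over a ten-minute take
```

Under those assumptions the two clocks slip by one frame about every 33 seconds, which is close to 18 frames over a ten-minute take. Short takes hide the problem; long interviews do not.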
However, this is worth repeating:
*THIS MEANS YOU NEED A SLATE.
Or, in case you missed that,
Essential Rule #1: GET A GODDAMNED SLATE!
Slates, like storyboards, are misunderstood by beginning filmmakers. They say to themselves, “Why expend the few bucks and the energy to have a slate and slate all my takes? I will beat the Hollywood system and make a more efficient movie and it has nothing to do with feeling silly slating takes or feeling I have it all down in my head at all…” And then they spend literally days worth of work making up for their “efficiency” in post-production trying to sync video and audio off of facial movements, and go nearly insane doing so.
I will actually talk about editing workflow in the next 30MFS. For sound, however, get a slate and a person to slate (that is to say, someone to actually stand in front of the camera before each and every and all and every single take while yelling out the name of the take), even if it really is just some dude (pro’ly you) in a white teeshirt yelling the name of the take and clapping his hands together once instead of one of those crazy striped clapper slates. That clap is essential: IT SYNCS THE MOTHERFUCKIN’ SOUND. If you don’t clap, you’re syncing to lip movement or other arbitrary events.
This is another one of those basic things that somehow people adamantly refuse to follow up on when they are making their earlier attempts at movie production (at any level), and keep justifying over and over and over—but if you aren’t slating your takes, yer doin it wrng. You are literally hopeless and your editor will want to kill you (even if you’re the editor. You will consider suicide. I promise).
Slate your takes, especially with separately recorded audio and video.
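To see why the clap matters so much, here is a minimal sketch of what syncing to it amounts to. It assumes the clap is the loudest thing at the head of both recordings (the camera's scratch audio and the separate recorder's file), and that both are plain lists of samples at the same sample rate. The function names are invented for this example; real editing software does something more robust.

```python
def clap_index(samples):
    """Index of the loudest sample, assumed here to be the slate clap."""
    return max(range(len(samples)), key=lambda i: abs(samples[i]))

def sync_offset_seconds(camera_audio, recorder_audio, sample_rate):
    """Seconds to slide the recorder track later (positive) to line up the claps."""
    return (clap_index(camera_audio) - clap_index(recorder_audio)) / sample_rate

# Toy data: silence with a single spike standing in for the clap.
sample_rate = 48000
camera = [0.0] * 48000
camera[24000] = 0.9      # clap lands half a second into the camera audio
recorder = [0.0] * 48000
recorder[12000] = 0.95   # same clap, a quarter second into the recorder file

offset = sync_offset_seconds(camera, recorder, sample_rate)
print(offset)   # 0.25: slide the recorder track a quarter second later
```

A clap gives you one sharp spike to find in both files. Lip movement gives you nothing of the sort, which is why syncing without a slate eats days.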
Back to XLR
Alright, so, back to where we were before the Canon 5/7D tangent put me on the slate tangent that put me here. “What the hell is XLR?” I don’t know what those letters stand for but I decided purposefully not to look it up because nobody calls it anything but XLR, ever, and everybody calls it XLR, because that’s what it is and XLR it should be, so you should know it as XLR. XLR is an audio cable connector with (typically) 3 pins. It looks like this:
…connected to a cable, usually of differing colors. The more professional gear you work with, the more prevalent this stuff is… everywhere. On the floor, in corners, spooled, unspooled, tangled (bad set organization alert), everywhere. This is because it has to go everywhere, it connects sound to the angry smoking guy in the corner.
Okay, so, this whole thing is probably not going to be 30 minutes. Because we are not remotely finished yet.
All I’ve described is what kind of mics and how they are commonly used, and immediately got sidetracked on a plethora of issues and problems to look forward to with audio. That’s because that’s what audio technical information IS: a few recommendations with an almost unending list of problems that could occur.
Right then, so let’s go back to basics for a bit. You know not to use the in-camera mic by itself, and that you typically want to use a shotgun mic for movie production. Knowing you, you probably have no money and the only mic you have available is a ten dollar cardioid you borrowed from your garage band friends. Nevertheless, you can work with it, and the following will cover practices you need to understand no matter what mic you have, no matter what situation you have, always and forever.
Essential Rule #2: RECORD A GODDAMNED SOUND PRINT.
Here is how it works in the professional movie making world. The first thing recorded is not the wide shot, it’s not the master shot, it’s not any pickups, it’s the sound print. Ideally, the sound mixer and his crew are on set before anyone else is anywhere around so that they can capture “room tone” before the production starts. This is the sound that the room makes by existing. Your silent and heavenly bedroom where you escape to be alone and not let the world in has heating ducts, computers humming, clocks ticking… space itself has sound, which is one of the reasons why sound mixing is so hard: spatial sounds must be maintained between unlike locations, and every location sounds different.
Again, we are in no budget filmmaking territory. We are aware you are shooting in your apartment. So the thing is, you need to unplug the refrigerator. You need to turn off all climate control devices. You need to unplug the television, even if turned off, because some older televisions buzz. You need to carpet your walls. You need to put a heavy blanket over the window. You need to literally be in that location for a week in advance cutting as much noise as possible from the area.
And we’ll still know you recorded this in your apartment.
Basements echo. Attics creak. Cars three blocks down make noise. Airplanes miles in the air make noise. You breathing makes noise. You have to do everything you can to stop that noise, and sometimes you do, in fact, have to wait on a take just to see if that neighbor’s dog will stop barking.
On location instead of on set, wind makes noise. Well, even a breeze makes noise. Hell, AIR makes noise. An ambulance you cannot even hear gets picked up by the microphone. Leaves on trees make noise. You curse the modern industrial world for getting rid of birdsong in urban areas until you realize how much a single bird makes noise.
Wherever you are, your crew makes noise. Their clothes make noise. Their breath makes noise. Their sips of coffee make noise. Their shuffling feet make noise. The scripts in their hands make noise. YOU make noise.*
*Fun story. I once edited this short video that featured a scene in a bathroom with running water. Of course, the audio was terrible. What you do is record the water running, and then you record the scene again without it running. What made me laugh, however, is that the takes of the scene without the running water had the director himself going “pshew pshew pshew pshew” under his breath, meaning I had to mix THAT in with running water to get the rest of the sounds of the scene. We might as well have just recorded the scene with running water and thrown the rest of the sound print away.
It gets worse. Cellphones make noise. “No Shit, Sherlock!” No, I mean silent cellphones make noise. “Duh, I know, that vibration setting is loud.” NO. I mean, nonringing nonreceiving otherwise inactive but turned on cellphones MAKE NOISE.* The radio signals they send out get picked up by microphones and audio cable. Likewise, cellphone towers make noise. Phone lines overhead make noise. The very lights you are using on set make noise. The electrical wiring in your walls makes noise. That’s not paranoia. Microphones operate off of electromagnetism, which means electromagnetic devices affect microphones. Remember that electric currents create magnetic forces, and magnetic forces create electric currents.
*In other words, whether director or crew, do not just turn your cellphone off when you go to work—leave it in your car. Seriously.
Do your microphones pick up all of the above? In a way, yes. Even worse, microphones pick up the sound of THEMSELVES. Boom operators have to wrap audio cable loosely around the boom pole because if it bounces while hanging or drags across the ground, the resulting vibrations create noise. It is ironic, really, because the only thing microphones do not seem to pick up to full clarity is your actor’s voice.
However, having that sound print is important for two reasons. The first is that room tone affects spatial perception in the audience, and must be consistent throughout a scene, so you need the room tone to cover any absences of sound from editing or removing isolated mistakes such as that bench your PA accidentally knocked over in the middle of a take. The second reason is that many audio editing programs can take a recording of that noise and cancel it out of the actual sound track. Again, the more noise there is, the more has to be taken out, taking away from the dynamic range of your actor’s voice, just like squelching. Thus, you want to record the minimum level of ambient noise so that less will have to be taken out. I will explain how to use the sound print in editing in a moment.
Thus, all sound mixers record room tone, a noise print, or a sound print (all interchangeable terms) at the beginning and end of a shooting day. All professional sound mixers record a sound print before and after every scene, and when entering a new space. All actually good professional sound mixers worth the money they are paid record a sound print after every single take.
Essential Rule #3: SEPARATION IS THE GODDAMNED KEY.
You know those silly poofy things they put over boom mics for news crews? Those are called fluffies, and they cut down on the wind entering the microphone. They help increase separation.
You know that pyramidic foam padding your garage band friends stapled all over their walls after they could afford it, replacing all the egg-cartons they had there instead? That foam cuts vibrations entering the room, thus increasing separation.
You know that scene in the Godfather Part 2 where he wraps the gun in a towel as a DIY silencer? That towel increases separation by muffling the noise from its source.
The point is, you need to block off sound, vibrations, and electronics as much as possible. Luckily, film equipment is often working for you, because often the same stuff required to cut out unnecessary light (doors, 4×4 floppies, gorilla tape) also reduces noise somewhat. If you are going to a location shoot, bring a heavy blanket (or if you have professional gear, some extra “furnie pads”)—not to sit on, but for someone to hold off screen of the microphone to block wind.
Separation also has its proximity factor. Let’s put it this way: if workable audio could be captured by simply recording the sound on set and people being very very quiet during takes, it would be done that way to save cost, time, energy, hassle, and stress. Instead boom operators have to be close and tight to the actor’s chest to capture the audio correctly. When a boom is in operation, it is quite literally just out of frame—within millimeters. The boom operator and the camera operator thus have to work together before the take, the boomer lowering the boom until the CO sees it and says, “In frame” and the boomer lifts it up, like, an inch from where it was when the announcement was made. There the boom stays, still and steady (as per the boom operator’s skill level), until the end of the take.
There are omnidirectional microphones known as lavaliers that are used for wide and long shots where booming is impractical due to distance between crew and performers. These microphones have an extremely short range for the purpose of cutting ambient noise, so they are often clipped literally just under a person’s chin. The sound crew and the costuming crew often have to work together to pull this off without the costuming itself causing noise problems, or sometimes the hairdresser has to hide the microphone. I have not seen this for myself but I believe there are classes for the sole purpose of understanding how to style hair around a lavalier microphone.
Separation isn’t solely an issue of ambient noise. Dialog must be recorded separately from background (extras, traffic, etc.). Music must be recorded separately from dialog (even diegetic music such as the band playing in Scott Pilgrim vs. the World while Scott and Ramona talk to each other off stage). Sound effects must be recorded separately from all of that. Foley is its own art, the art of recording sound effects separately and remixing them into sound design as if they happened on stage. Rarely are horses ever heard when filmed; rather, foley artists are what are being heard. That’s the cliché. But take a look at Master and Commander: that movie is famous for having original sound recorded instead of canned or foley sound. Hell, you all know what that one stormtrooper falling off the edge on the Death Star in A New Hope sounds like because you keep hearing that same scream again and again and again and again and again and (and I hate it because it pulls me out of the movie every single time dear God WHY do they keep using that scream when it’s so conspicuous?). Or the sound of a crowd gasping, how it always sounds the same?
So, if you only have a ten dollar cardioid microphone, you can still work to keep separation of sound high and keep the microphone close to its intended sound source. Ambient sounds and effects can be pulled off of free online archives, easily searchable. Record the sound of your actor walking separate from the sound of him talking. That is separation.
Sound editing is hard to “see” the same way visual editing is “seen”. Sound is not edited together the same way, at all. Luckily, these days with color correction and post-production visual effects and popular programs like After Effects, the idea of editing “layers” is becoming more and more familiar. That is how sound is edited: in layers. Separation means your sound is workable. If you have one character coughing at the same time another is talking, and record it on only one sound track, then if one sound source is recorded too high or too low, it affects the other. If you can lower the sound of the cough while increasing the sound of dialog, you can “work with it.” Thus the difference from bad audio.
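The cough example can be put in a few lines of code. This is only a toy sketch: tracks are short lists of made-up sample values, gain changes are given in decibels using the standard amplitude conversion, and `db_to_gain` and `mix` are names invented for this illustration.

```python
def db_to_gain(db):
    """Convert a level change in decibels to an amplitude multiplier."""
    return 10 ** (db / 20)

def mix(tracks):
    """Sum equal-length tracks, each given as a (samples, gain_db) pair."""
    length = len(tracks[0][0])
    out = [0.0] * length
    for samples, gain_db in tracks:
        gain = db_to_gain(gain_db)
        for i, sample in enumerate(samples):
            out[i] += sample * gain
    return out

dialog = [0.10, 0.20, 0.10, 0.00]
cough = [0.00, 0.50, 0.50, 0.00]

# Recorded as separate layers: push the dialog up, pull the cough down.
separate = mix([(dialog, 6.0), (cough, -12.0)])

# Recorded on one track: the cough is baked in, so one gain hits both.
baked = [d + c for d, c in zip(dialog, cough)]
baked_louder = mix([(baked, 6.0)])

print(separate)
print(baked_louder)
```

With separate layers the cough shrinks while the dialog grows. With the baked track, turning up the dialog turns up the cough by exactly the same amount, and there is nothing to work with.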
Essential Rule #4: SET YOUR GODDAMNED LEVELS.
On your audio mixer, or in your camera, or wherever you are recording your audio, is the audio level for the track you are recording, measured in decibels (dB). What dB levels are and how they are measured is another full discussion, but the essential thing to know is that levels that go beyond 10 dB clip.
Clipping is bad. It is the top of the waveform getting chopped off because it goes beyond the range that can be recorded. It is, quite literally, distortion, and you cannot repair it unless you record the track again.
However, lowering the decibel level does not only lower the volume of the sound, but the range of the sound being recorded. You can increase the decibel level in post, but all of the other sound, including ambience, will rise with it, creating no dynamic range and making it all sound like a Merzbow tape. Thus, having your sound track peak at 10 decibels while ambience plays at 2 to 3 decibels, and dialog keeping between 7 and 9, is ideal.
Usually the levels on a typical video monitor or mixer are not numbered; they are graphical. There is a thick line at the top denoting “10”, and anything that goes above it lights up in red. If you are recording audio into the red, you are clipping. You want to lower the audio level until it comes just short of 10 during the high points of the audio about to be recorded.
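If you would rather check a recording after the fact than trust your eyes on the meter, the idea can be sketched like this. Two assumptions to flag loudly: samples are taken to be normalised to the range -1.0 to 1.0, and levels are reported in dBFS, the scale many digital recorders use, where the ceiling is labeled 0 and everything below it is negative (rather than the 0-to-10 bar described above). The function names are hypothetical.

```python
import math

FULL_SCALE = 1.0  # assumption: samples are normalised to -1.0 .. 1.0

def peak_dbfs(samples):
    """Loudest sample expressed in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")
    return 20 * math.log10(peak / FULL_SCALE)

def clipped_samples(samples, ceiling=FULL_SCALE):
    """Count samples sitting at or beyond the recording ceiling."""
    return sum(1 for s in samples if abs(s) >= ceiling)

def record(signal, ceiling=FULL_SCALE):
    """Simulate a recorder: anything past the ceiling is flattened."""
    return [max(-ceiling, min(ceiling, s)) for s in signal]

# A take with headroom, and a take where the actor got loud.
quiet_take = record([0.5 * math.sin(2 * math.pi * i / 100) for i in range(1000)])
hot_take = record([1.5 * math.sin(2 * math.pi * i / 100) for i in range(1000)])

print(peak_dbfs(quiet_take), clipped_samples(quiet_take))
print(peak_dbfs(hot_take), clipped_samples(hot_take))
```

The quiet take peaks around 6 dB under the ceiling with nothing clipped. The hot take peaks at the ceiling with flattened samples on every cycle, and the flattened tops of those waves are gone for good.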
This is where it gets hard. You tell your actor to do a few test lines for sound, you start recording, and then her performance really kicks in and the levels go off into the red. It happens not only a bazillion times, but pretty much every time. Which is why the audio mixer (could be you) turns the level up and down during the first take to compensate to the right level, and you do a second take. Separate takes are not just for performance or visual coverage, you know—more takes are done for sound, really, than for everything else (see once again animosity between camera and sound department).
(Actually, light works the same way on video: light literally clips at a certain intensity, which is where zebra lines come in. To those video artists that have worked with a waveform monitor: sound waves work essentially the same way. You can record lower (video: exposure; audio: level) and increase the gain, but the result is a squashed waveform with murky/grainy/noisy results, which means it is better if you record just under clipping. Unfortunately for sound artists, there are no zebra lines on sound, just the wonderful crackling sound of distortion when things just aren’t working and you’re doing it wrong.)
If you are doing the documentary trick I mentioned above, where you are recording two tracks in-camera via an external and internal microphone, it is better to set the shotgun mic’s levels to just below ten, then reduce the in-camera microphone to a couple-few decibels below that. The reason should be obvious: ambience should be recorded lower than essential dialog et al.
The Horrible Part: in the Editing Suite
Sorry guys, the above? That’s… that’s all I got for the field. Other people, especially professional sound people, can give more advice. I have to move on to the part where you unload all of your sound into your computer and discover that it sounds AWFUL, but have to work with it anyway to make it sound pleasant.
Back to the sound print. The sound print is for you to cover up mistakes that you will have made in sound. Think about it like this: when you and your friends are sitting in a room talking, the room is still making the same sounds it does when nobody is in the room at all, in addition to the sounds you all are making. Space has sound, as mentioned before. So, let’s say your actor places a cup on the table, which causes a loud clap that clips on your soundtrack, so you need to cut it and replace it with the sound of a cup being placed on a table that does not clip. You find a good sound of a cup being placed on a wooden table online, or you recorded it separately. You cut out the sound of the cup in the soundtrack and replace it with the new sound. But it doesn’t “sound right.” In addition to the room tone just disappearing randomly with a click on and off (that click, by the way, is the waveform of the sound starting anywhere but zero, and can be removed by fading in and out, which requires space on the soundtrack lest those fades affect other essential sounds on the track. See? SEEE? Separation.), the new sound, though it does sound like glass-on-wood, just doesn’t seem like THE glass on THE wood. The audio n00b then goes in search of something that sounds like THE glass on THE wood online, or goes back and records it, but it still just doesn’t sound right! That sound does not sound like it happened in that space.
So you mix in the room tone you recorded before the take… you know, that room tone you were so obstinate about recording? Yeah that’s the one. You layer it over the sound of glass on wood and suddenly that glass on wood sounds like THE glass on THE wood in the scene that you shot, even though it was probably done by some foley dude 50 years ago using equipment 55 years older than yours.
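Here is roughly what that patch job looks like when reduced to arithmetic. It is a toy sketch, not what any particular editing program does internally: samples are plain lists of numbers, the fades are simple linear ramps, the room tone is looped to fill the gap, and `fade` and `patch` are names invented for this example.

```python
def fade(samples, fade_len):
    """Linear fade-in and fade-out so the clip starts and ends at zero."""
    out = list(samples)
    n = len(out)
    fade_len = min(fade_len, n // 2)
    for i in range(fade_len):
        gain = i / fade_len
        out[i] *= gain            # ramp up from silence
        out[n - 1 - i] *= gain    # ramp down to silence
    return out

def patch(track, start, replacement, room_tone, fade_len=4):
    """Swap in a faded replacement sound, layered over looped room tone."""
    out = list(track)
    faded = fade(replacement, fade_len)
    for i, sample in enumerate(faded):
        tone = room_tone[(start + i) % len(room_tone)]
        out[start + i] = sample + tone
    return out

room_tone = [0.01, -0.01] * 4            # stand-in for the recorded sound print
track = [0.01, -0.01] * 10               # the take: room tone throughout...
track[8:12] = [1.0, 1.0, -1.0, 1.0]      # ...except where the cup clipped
replacement = [0.6, -0.4, 0.2, -0.1]     # a clean cup sound from elsewhere

patched = patch(track, 8, replacement, room_tone, fade_len=1)
print(patched[8:12])
```

The fades kill the click at the edit points, and the room tone underneath keeps the space from dropping out while the replacement plays.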
The character, after all, has just finished drinking…. Erm… vodka. (Why’d you record a movie about a character drinking water, that’s boring…). Because the character has been drinking vodka, he now has to pee. You record the scene where he goes to the bathroom (because the girl is going to leave while he’s in there, otherwise why would you record a character going to the bathroom, that’s boring…), and upon listening to it, you shudder to find that—the bathroom scene sounds horrible. The sound in there is, to put it simply, different. It’s all echoey and the goddamned fan was on and… ugh. It sounds like a completely different planet even though it’s in the same apartment.
SO, you reduce the gain on most of the sound (luckily, there is absolutely no dialog in this scene), you take the sound print you were so obstinate about recording before the bathroom scene and you cancel it out of the sound print of the action, THEN you layer the room tone from the living room over the track in the bathroom. It still doesn’t sound great, but at least the bathroom sounds like it’s in the same apartment, just with an irritating cheap fan on.
How do you use a sound print to cancel noise? It depends, sadly, on the program you have.
The first step in any program is to highlight a section of sound from the print that has as little noise other than the room tone as possible. For example, if you have a sound print where someone yawns in the middle of it, choose a section of the sound print without the yawn. After that is highlighted,
-In Soundtrack Pro, go to “Process – Noise Reduction – Set Noise Print.” Then highlight the area of the audio you want the noise removed from, and go to “Process – Noise Reduction – Reduce Noise…”. A HUD will appear with the ability to preview the clip, along with several options to adjust the amount that the noise is reduced. I have found typically that the standard options are usually the best: the more noise you reduce, the wetter the sound of the dialog, and the less you reduce, the more noise is retained.
-In Adobe Soundbooth, go to “Processes – Capture Noise Print”. Then highlight the area of the audio you want the noise removed from, and go to “Processes – Reduce Noise…”.
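What those menu commands do under the hood varies by program and I cannot speak for any of them, but the general idea is often described as spectral subtraction: measure how much energy the noise print has at each frequency, then take that much away from the real recording. Below is a deliberately crude sketch of that idea using a slow, naive Fourier transform and non-overlapping frames. Real tools use overlapping windows and smarter smoothing. Every function name here is made up, and the `amount` parameter is a stand-in for the reduction slider.

```python
import cmath
import math

def dft(frame):
    """Naive discrete Fourier transform (fine for tiny frames)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft, returning real samples."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]

def noise_profile(noise_print, frame_len):
    """Average magnitude per frequency bin across the noise print."""
    starts = range(0, len(noise_print) - frame_len + 1, frame_len)
    frames = [noise_print[i:i + frame_len] for i in starts]
    profile = [0.0] * frame_len
    for frame in frames:
        for k, value in enumerate(dft(frame)):
            profile[k] += abs(value) / len(frames)
    return profile

def reduce_noise(audio, profile, amount=1.0):
    """Subtract the noise profile from each frame, keeping the original phase."""
    frame_len = len(profile)
    out = []
    for i in range(0, len(audio) - frame_len + 1, frame_len):
        cleaned = []
        for value, noise in zip(dft(audio[i:i + frame_len]), profile):
            magnitude = max(abs(value) - amount * noise, 0.0)
            cleaned.append(cmath.rect(magnitude, cmath.phase(value)))
        out.extend(idft(cleaned))
    return out

# Toy data: a steady hum as the "room", a lower tone as the "voice".
FRAME = 32
hum = [0.1 * math.sin(2 * math.pi * 4 * t / FRAME) for t in range(128)]
voice = [0.5 * math.sin(2 * math.pi * 2 * t / FRAME) for t in range(128)]
take = [v + h for v, h in zip(voice, hum)]

cleaned = reduce_noise(take, noise_profile(hum, FRAME))
residual = max(abs(c - v) for c, v in zip(cleaned, voice))
print(residual)
```

The toy hum sits on exactly one frequency, so it subtracts out almost perfectly. Real room tone is smeared across the spectrum and overlaps the voice, which is why pushing the reduction harder starts eating the dialog.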
I know it is possible to do the same thing in ProTools, possibly through a separate plug-in, but I have not found it yet. Because I work mostly with Final Cut Pro and Adobe CS5 programs, I do not have a whole lot of need to use ProTools, which strikes me as much better for studio recording, i.e. music, than for filmmaking. That said, ProTools has a lot of nifty ways to manipulate audio that are not available in other programs, and its interface is great for rubber banding (that is, slowing down or speeding up sound instantly with the mouse instead of with selections).
“Can my (insert ‘freeware/off-market program downloaded off the Internet because your friends said it was just as good as anything else’ here) do that?” Beats me, buddy. Look it up on that product’s forums. The thing is, probably. Programs like Audacity, since they are freeware, typically have a devoted base of plug-in programmers who figure out ways of doing those things and also releasing them for free. But you have to put the time into finding the answer on a forum, or the solution in another plug-in.
Final Cut Pro
“Can I do that within Final Cut Pro?” Well, actually, I am not fully sure. If you have the full version of Final Cut Pro, you have Soundtrack Pro, so it’s not worth the time. However, I have the understanding that some Soundtrack Pro tools are available within Final Cut Pro as a plug-in, and I do not know if they are available in, for instance, Final Cut Express. Researching what programs you DO have will help you discover how to use them most effectively for audio editing anyway, so it’s an important thing to do.
There are also a few things you can do within Final Cut Pro to make audio a little more workable and a little less bad. When you double-click on an audio track, the waveforms for that audio track appear in the Viewer window. In that window you can set levels and pan the audio to the left or right (remember that when changing levels on audio that has already been recorded, it affects everything in the recording, meaning dialog increases along with other noise). The important thing to note here is that little diamond shape to the right of both the level and the pan bar. That sets keyframes so that the audio can be manipulated within a single track. That is to say, the entire track will not be leveled up if you keyframe only a section of it.
So ultimately the easiest DIY sound editing you can do within Final Cut Pro is to fade out the sound where there is no dialog, and fade it back up near where the dialog begins and ends. It still sounds choppy doing that, and is a lot of work and a hassle, but it can be done—I did it for a short film once and ultimately it didn’t sound too too bad. It helped that I had a LOT of audio to work with, so could always pick up the best tracks that I could work the most with. Again, record as much audio as you can.
Another important point comes up with these keyframes, though. You also have to mix in a score, and when you simply lay the score over dialog, the score ends up completely covering the dialog and the movie sounds like shit (again: I have been to way too many student shorts festivals. All you need is to go to a single one to know exactly the phenomenon I speak of). Here is another excellent example of perception and the way that your ears lie to you.
When you see a Martin Scorsese movie and he’s running all these wonderful pop songs from the 50s or whatever through his soundtrack while dialog is going on, action is taking place, Joe Pesci is kicking someone on the face, the music sounds as if it is playing at the same volume throughout. It is not. The sound fades down slowly before a really important piece of audio comes in, and then fades back up immediately afterwards—the score, just like the noise print, starts to feel like it belongs in the space the action or dialog is taking place in. Rewatch A New Hope’s cantina scene—Dee-dee dee-dee dee-deedee, deedededillideedeee… That song plays through the entire friggin’ scene, but it’s constantly fading up and down depending on how much attention the viewer is supposed to give it.
Thus, scores and diegetic songs are usually loud during establishing shots, and then slowly get quieter throughout the scene as dialog and interaction between characters take place. Then, high points of drama sometimes are marked by drop-aways of sound, especially ambient sound and scores.
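To make the fade-down-for-dialog idea concrete, here is a small Python sketch of "ducking" a score. It is a simplification under stated assumptions: the dialog start and end times are typed in by hand, the 12 dB drop and the one-second ramp are arbitrary numbers picked for illustration, and the ramp is the same length on both sides (in practice the fade down is usually slower than the fade back up). None of the names come from any particular editing program.

```python
def duck_db(t, dialog_spans, depth_db=-12.0, ramp_s=1.0):
    """Music level offset in dB at time t (seconds).

    0 dB when no dialog is near, depth_db while dialog plays, and a
    straight-line ramp (in dB) over ramp_s seconds on either side.
    """
    level = 0.0
    for start, end in dialog_spans:
        if start <= t <= end:
            distance = 0.0            # inside the line of dialog
        elif t < start:
            distance = start - t      # dialog is coming up
        else:
            distance = t - end        # dialog just finished
        # 1.0 inside the dialog, falling to 0.0 one full ramp away from it
        closeness = max(0.0, 1.0 - distance / ramp_s)
        level = min(level, depth_db * closeness)
    return level

def db_to_gain(db):
    """Convert a dB offset to the multiplier applied to the samples."""
    return 10 ** (db / 20)

# One line of dialog from 5 s to 8 s into the scene.
spans = [(5.0, 8.0)]
music_gain_mid_line = db_to_gain(duck_db(6.0, spans))
```

Multiply the score's samples by `db_to_gain(duck_db(t, spans))` and the music sits down under the line and comes back afterwards, without ever cutting out.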
Actually, there is an intensely informative scene breakdown of sound and how it works on the second disc of The Social Network SE DVD. It details this effect much more in depth than I do. Just as an aside, I have found personally that whatever the quality of a David Fincher movie (from Alien 3 to Se7en, in my own humble spectrum of bad to good), his movies typically include very valuable technical information on their DVD extras. But again, these effects are not that difficult to observe provided you are aware of them and how they work. Now every time music is playing in a movie, diegetic or nondiegetic, you will be acutely aware of its level and how it changes throughout the scene; the same is true of background crowds, waterfall scenes, forests… Lower sound is heightened drama, higher sound is establishing spatial presence.
And honestly, that’s the real fuck of audio. The audience may or may not notice the clipboard the AD left in frame during one scene, or that the actor’s hair changed between takes, depending on how involved or not they are with the story. However, if the audio levels keep changing throughout a scene, no audience can get immersed. Sound has, in fact, severely limited cinema in the way it forces spatial perception in the audience, causing them to lose focus on the story with the simplest pop or click on the soundtrack.
However, when you capture audio you can work with, by definition you work it in such a way that an audience is immersed in whatever movie is taking place. Hell, the same is true of experimental films: there’s a reason why the whole grinding, subbass thing is a cliché, it creates what most audiences call “atmosphere”. Again, even the silent era had a lot of sound, it just wasn’t synchronized recorded sound. Because of such, even the choice NOT to have sound is a major audio editing choice, such as Stan Brakhage’s films.
So, because my next 30MFS is going to be about editing workflow, here’s the last thing I can help you with on how to work with audio:
When movies are edited together, typically the first thing the editor does after the raw footage (dailies) is synced to the raw audio is pick the master shots he wants and lay them unedited end to end in what is called a work print. This work print allows for the sound editor to come in and clean up the audio by fading between sound prints, cleaning out clicks and pops, adjusting levels throughout, and generally mixing various layers of sound together until it all flows together. To get multiple scenes (different spaces) to sound good, you cross fade between them, sometimes over long periods of time. To get the sound dynamic, you constantly raise and lower levels to highlight the most important audio information. To clean out mistakes, you delete sections or sometimes entire clips of audio and resync to a different take; sometimes this is done between words in a single line of dialog. Actually, rewatch Clerks. sometime, and realize that few of those long belabored rants on Star Wars were done in a single take. The entire speech is edited together from multiple takes of audio, some of which were recorded starting mid-speech on a different date from the beginning.
After the first soundtrack edit is completed, THEN that edit gets trimmed and the editor starts putting in the reverse shots, cutaways, pickups, close-ups, matches-on-action, etc. and so on. The reason why is because SOUND dictates continuity more than IMAGE does. The eye, for whatever reason, can handle perspective changes and shortened actions much easier than the ear can handle intense changes in sound or a sound cut short.
Think of a bell. Its ring fades from hearing, but if you cut the clip short, suddenly you recognize its absence even though you couldn’t hear it anymore. You have to fade the bell out, too, even though it seems to have faded out naturally. Also, as alluded to earlier, audio clips that are cut short ALWAYS pop. The reason is that the cut lands where the waveform is at some non-zero amplitude, so the pop is the signal jumping instantly from that amplitude to zero. Whatever the audio clips LOOK like, they ARE NOT LIKE video clips (or film): there are no hard cuts in audio, and you must ALWAYS fade in and fade out audio. The only hard cuts are between audio clips where the waveforms are already faded and the clips are complete.
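If it helps to see why the fade fixes the pop, here is a minimal Python sketch that puts a very short fade on both ends of a clip so the waveform starts and ends at zero. The 10 millisecond default is just a plausible illustration value, not a standard, and the function name is invented for this example; samples are assumed to be floats.

```python
def fade_edges(samples, sample_rate, fade_ms=10.0):
    """Return a copy of the clip with a linear fade-in and fade-out.

    The first and last samples end up at exactly zero, so a cut placed
    there no longer jumps from some amplitude straight to silence.
    """
    fade_len = int(sample_rate * fade_ms / 1000)
    fade_len = min(fade_len, len(samples) // 2)  # never overlap the two fades
    out = list(samples)
    for i in range(fade_len):
        gain = i / fade_len          # 0.0 at the very edge, rising toward 1.0
        out[i] *= gain               # fade in from the start
        out[-1 - i] *= gain          # fade out toward the end
    return out

# A clip chopped while the waveform sits at full amplitude: worst case for a pop.
chopped = [1.0] * 1000
smoothed = fade_edges(chopped, sample_rate=1000)
```

The middle of the clip is untouched; only the few milliseconds at each edge are reshaped, which is usually too short to hear as a fade but long enough to kill the click.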
As the movie is edited, it is usually composited with graphics and other visual effects while foley is being recorded to match the audio that the production missed, a score is being recorded, etc. and so on (a lot of work goes into every part of a movie, no matter what). So now the sound editor has to add, say, spaceship ambience to the spaceship that was never recorded because it was a greenscreen, in a scene where the sounds were recorded on a sound stage so all don’t SOUND like they happened in a spaceship. A foley artist has also made more dramatic spaceship noises, say, of an engine in the interior falling apart, and that has to be edited in and sound like it’s happening in another part of the spaceship that never existed and was also not shot in. And so on and so forth.
A good thing to do would be to check out Video Co-Pilot (www.videocopilot.net) and pay attention to some of the tutorials on sound and sound design. Hell, just watch a few of the guy’s own advertisements for his own website and pay attention to how sound matches with graphics to create a more engaging/immersive effect.
Audio sucks, everybody sucks at it, everybody hates it, and it sucks. BUT, it is workable provided you work with it.
Commonly it is stated that if a production splurges in any way, it should splurge on sound, and that is true to a point. When choosing between $200 for a better field recorder or a better camera, $200 for the field recorder will go further. A $200 more expensive microphone will help you provided you have a camera or a field recorder that takes it (i.e., having XLR ins on the camera), but if not then it’s useless anyway. At some point, a $2000 Sennheiser microphone is not going to do a lot more than an $800 Sennheiser microphone if you’re operating a $400 miniDV camcorder. However, an $800 Sennheiser microphone is indescribably better than a $15 cardioid microphone, only that increase in quality is literally useless and completely negligible if you are trying to record a scene in a wind tunnel while a rock band is playing, during an avalanche. And it’s all useless if you do not record sound prints and spend a lot of time in preproduction working out how you are going to deal with sound design and the issues you will confront: multiple takes? Foley? Documentary style?
So while “splurging on sound” is typically directed to the finance end of the production spectrum, it is also how you should consider budgeting your TIME. Movies constantly get fucked in postproduction, one of the most costly parts of the production and most time consuming, and audio is, frankly, where 90% of the not-quite-consensual sex (i.e., fucking over) takes place. That thirty seconds it takes to tell your crew to stop their yapping and be quiet for a little room tone will save you literally days of raging migr—ahem, audio editing. And when planning for editing, remember that you are probably going to be spending AT LEAST twice as much time manipulating audio as you are doing everything else in your edit. Be prepared.
That’s all I have for you, for now. It’s not enough, because… well, you’ll see. Or hear, really. Yeah. You’ll…. Well, yeah.
It’s good to see you come back to posting in high style, Polaris.
Re: Audio – I had a friend who recently finished a feature film, and this was his biggest headache. His audio guy sort of jacked everything up. My friend had to let him go, hire someone else to fix it, and the film’s audio is still noticeably bad in spots, especially where he had to raise music levels to cover other sins.
I can tell your teeth were clenched as you wrote this. :) But thank you for doing it, very helpful and worth it to those of us who are newbies to this!
Thanks Polaris! I’ll comment further once I actually read all of this!
Wow, I have newfound respect for audio people.
Ahh, the basic philosophy here – isolate sounds, clean them up, and then recombine them with ambient noise to smooth out the whole thing – is something I’ve only gradually discovered on my own during my most recent post-production work, and even then, I couldn’t really articulate it. As long as this was, it was a very precise summary of my own messy experiences so far. Thanks much for giving me the perspective on it.
Really enjoyed this. It seems as the budgets get bigger no one really still cares for the headaches of the audio guys either :D
Great thread, Polaris.
You’re very right about emphasizing the importance of isolating the signal. Two indie films I mixed the sound for had some real problems with wind noise outdoors (not to mention, a couple scenes involving a subway) and it was really difficult to make it listenable. The films were actually pretty good so it was worth the trouble, but man, if I had any control over the actual recording of the sound it would’ve been done a lot differently. They ended up sounding much better than the condition I got them in so I’m rather proud of the result. You really learn a lot when you have to (somehow) resurrect a commercial sound from a wrecked recording.
Plus, there’s that classic technique of masking inequalities in the background noise by having a backup track of just “room noise” (or ‘room tone’ as Polaris mentions, which is probably the actual term now that I think about it, ha). For instance, one particular scene was shot on different days and edited together … the frequencies were all jacked up because of the lighting or furnace or something and I ended up having to loop “silent” moments between the dialogue and layer underneath (as well as separate nearly every line of dialogue to adjust their frequencies individually so that it blended in) so that it didn’t jump around too noticeably. If there would’ve been a “room noise” track, that would’ve been a lot easier. Thankfully, it was only like a 2 minute scene with a fair amount of uncomfortable pauses, so the “room noise,” once layered properly, really made the scene work.
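As a rough picture of the looping trick, here is a Python sketch that stretches a short stretch of room tone to any length by repeating it with a small crossfade at each join, so the loop points do not click. It is only a sketch: the function name is invented, the samples are assumed to be floats, and a straight-line crossfade is used for simplicity.

```python
def loop_with_crossfade(tone, total_len, overlap):
    """Repeat `tone` until it is `total_len` samples long.

    Each repeat overlaps the previous one by `overlap` samples, blending
    the tail of what is already there into the head of the next copy.
    """
    if not 1 <= overlap < len(tone):
        raise ValueError("overlap must be at least 1 and shorter than the tone")
    out = list(tone)
    while len(out) < total_len:
        base = len(out) - overlap
        for i in range(overlap):
            mix = (i + 1) / (overlap + 1)      # share of the incoming copy
            out[base + i] = out[base + i] * (1 - mix) + tone[i] * mix
        out.extend(tone[overlap:])             # rest of the new copy, unblended
    return out[:total_len]

# Two seconds of "tone" at a toy sample rate, stretched to cover a longer scene.
short_tone = [0.2] * 100
bed = loop_with_crossfade(short_tone, total_len=350, overlap=10)
```

A real room tone will still reveal the loop if it contains a recognizable cough or creak, which is one more reason to record a clean one on the day.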
The most important thing, as Polaris mentions, is that the initial recording is clean. Few people are going to spend weeks (even months) on cleaning up audio after-the-fact unless it’s a labour of love and it’ll always be frustrating (no matter how much you love doing it), so being familiar with the mic(s) you’re using, lighting arrangements (if any), the heating/cooling situation, outdoor ‘ambience’, wind, thin walls, etc. is essential and always run tests to hear how something will sound beforehand instead of just going guerrilla (unless you’re doing a documentary or something and it’s unavoidable).
Personally, I’m a huge fan of expert overdubbing, but this is more advanced than just recording it on the spot. However, if you’re filming a scene on a train or bus, noisy car, something similar, overdubbing is the ideal approach.
There will never be a “perfect recording” and this is what attracts me to the craft; working with the flaws which will always be present. One often has to sacrifice ‘presence’ for ‘listenability’ (or, more commonly known, as the war between ‘signal’ and ‘noise’) and unless one’s working with expensive equipment (though, get it if you can get it, heh) the proverbial lamb will have to be slaughtered. That’s essentially what I believe sound mixing is: the art of slaughtering lambs. Always messy.
God, and then I think about how such spaces as Disney Hall were designed so that the acoustics are good. It’s truly a science.
Hi all, a friend emailed me about this post, and wanted to add my 2 cents…
1. Honestly, there’s not just 3 kinds of mics.
Omni, cardioid, hyper-cardioid, super-cardioid, line + gradient (also known as a shotgun mic, or a mic which uses an interference tube, Google MKH416 to see a pic).
They are all directional, except the omni, and they are definitely all used in film.
Omni is used for wires (wireless lav mics). (Why wireless mics are called wires, I have no idea. Also of note though, if you can ever run a physical wire to a lav instead of using wireless, that’s ALWAYS better.)
Cardioid, hypers, and supers are the primary mic for interior dialog by far (Schoeps CMC6 + MK41 hyper cap is the standard on most sets). Why isn’t a shotgun used indoors? If you look at one, you’ll see lots of holes along the tube. These help cancel noise coming from the sides of a mic, and do it very well. They don’t do it evenly at all frequencies though, so when you have a voice picked up by the front of the mic in a reverberant room, the echo of the voice is picked up unevenly and sounds odd. Once a voice is distorted, you can’t fix it. Hyper is the standard. If you have a 2-shot I’d use a cardioid so I didn’t have to move the mic as much to hear both people; for someone very rigid I might use a tighter super-cardioid.
Shotguns are great for outdoors though, assuming you have proper wind protection. “Proper” means a full blimp. Those little dead cats you slide over the end of the mic? Yeah, if it’s windy enough that you ever notice it, your audio is as good as toast, honestly. A great shotgun for noisy outdoor locations that rejects absurd amounts of noise is the Sanken CS-3e. Cheaper and more bullet-proof is the Rode NTG-3 (cheaper and newer than, but modeled after, the famed Sennheiser MKH-416, Hollywood’s gold standard for many years, and still used by most film schools due to being nearly indestructible and having low handling noise for novice boom ops). Even outdoors though, if it’s quiet and you don’t need a shotgun’s noise rejection, then hypers are still great (obviously with more wind protection than indoors). Jeff Wexler was one of the first to really use the Schoeps MK41 outdoors almost exclusively. Obviously he’s the exception, but just as an example.
2. “Room Tone” (never heard sound print in my life) isn’t actually done before people get on set. It’s done last, because you need the sound of everything to match exactly what sounds were on during the shot. Didn’t turn the AC off during the shoot? Sound editor will hate you, it’ll sound bad, but at the very least make sure it’s STILL on when you record room tone. People in the room, equipment, etc. all change how a room sounds, and you want the “room tone” to match exactly how it was while you’re shooting. So it’s always done at the end, and a helpful tip is getting the AD to like you early; they’re the best at putting the fear of God into everyone else to stay quiet while you’re recording room tone. Consistency in the audio trumps everything. If the AC has to turn on and off (big building, sometimes physically impossible to control) you’ll need to record the AC so all shots have it. Ideally, obviously, you want every sound source off. And yes, even the paranoid-seeming ones, as Polaris DiB correctly mentioned. Mics hear everything. If you want your sound editor to love you, record 30 seconds of room tone for each setup. Why? The mic is in a different position in every setup, obviously, and how the mic picks up the room tone changes slightly depending on where the mic is and where it’s pointed. Rarely is there time for this, honestly, but that should give you a better idea of WHY post-production guys want any room tone.
3. The scream everyone knows is the Wilhelm scream. It’s an old stock recording from the early 1950s that was turned into an “in-joke” for sound editors by Ben Burtt, Randy Thom, and a few others. Now everyone notices, and even those original guys don’t like how it’s overused anymore. Basically, studio execs finally caught onto the joke, and now they’re like the unpopular kids that finally think they have a funny line except it’s 20 years old….
Separation stuff all definitely true. Actor walking and talking in the scene and you don’t see his feet? Get carpet and/or he walks barefoot. People sitting in a running car? Keep the car off, add the car noise later. Etc…
4. Setting gain correctly is very important. In digital audio though, 0 is clipping, -0.1 is safe, and everything else is just negative numbers. Cameras may label it differently, but in any pro camera there should be lines to tell you where -12dB and (hopefully) -20dB are. I haven’t run sound directly to a camera in years without a mixer, as that makes everything much nicer. Sound Devices 302 and MixPre are good ones. They send a 1kHz tone to the camera, you set the gain on the camera so that tone matches the camera’s -20dB mark, and then you tape it closed and never let the camera op change it. The mixer then has a bright display to let you know when you’re clipping, easily adjustable knobs, and high-end limiters so that even if an actor starts screaming, you’ll hit a limiter (lose some dynamic range in the voice) but it isn’t clipped (which is un-fixable).
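For anyone who wants to see what "0 is clipping and everything else is negative numbers" means in practice, here is a small Python sketch. It assumes audio stored as float samples where plus or minus 1.0 is digital full scale (a common convention, but an assumption here), and every name in it is made up for the example. It generates a 1kHz line-up tone at -20dB and measures peaks the way a digital meter would.

```python
import math

FULL_SCALE = 1.0  # assumption: float samples, +/-1.0 is the digital ceiling

def peak_dbfs(samples):
    """Peak level in dB relative to full scale (0 is the ceiling)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return -math.inf              # pure digital silence
    return 20 * math.log10(peak / FULL_SCALE)

def reference_tone(level_dbfs=-20.0, freq_hz=1000.0,
                   sample_rate=48000, seconds=1.0):
    """Sine tone at the given level, like a mixer's line-up tone."""
    amplitude = FULL_SCALE * 10 ** (level_dbfs / 20)
    count = int(sample_rate * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(count)]

def is_clipping(samples):
    """True if any sample sits at or beyond full scale."""
    return any(abs(s) >= FULL_SCALE for s in samples)

tone = reference_tone()
print(round(peak_dbfs(tone), 1), is_clipping(tone))
```

A tone peaking at -20 leaves 20dB of room above it before anything hits the ceiling, which is the whole point of lining the camera up to that mark before the actors start shouting.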
Please, if you do noise reduction while you’re sound editing, first get the original to fit as well as possible using crossfades, room tones, etc., copy that to a safe track, and THEN try noise reduction on the OTHER track (meaning you still have an untouched, fully edited clip available). If you don’t, your sound mixer will shoot you. Noise reduction choices, honestly, cannot be made through cheap speakers, in a loud room, or with headphones of any kind on. You need a sound-treated studio with high quality monitors so you can actually hear what noise reduction does to the characters’ dialog. Many times, it’ll sound worse. And if you’re using “free” noise reduction plug-ins, even more so, honestly. Directors are often surprised when they “cleaned” the dialog themselves, and then it gets played at a festival on bigger speakers and the clips where he/she applied noise reduction sound absurdly awful.
If there’s 1 book I suggest someone read on location sound, it’s “Dialogue Editing for Motion Pictures: A Guide to the Invisible Art”. It tells you EXACTLY how dialog editing takes place, some tricks to save yourself, and what is un-fixable. It focuses on Pro Tools (which is what any sound professional in film uses in the US, honestly) but the ideas and workflow can apply to any program. If you know how production tracks are used in post, it helps SO much in understanding the purpose and pitfalls of production audio.
On the set, sound is problem solving. People who choose to work in location sound generally like solving problems and creating off the wall ideas for getting the cleanest dialog. Whether that’s hiding in the trunk of a car, hanging a mic from fishing line, whatever, there’s always a solution. If you want good audio, find one of these people, and not just a PA who “wasn’t doing anything”. If the audio guy doesn’t care about the tracks, no one will. Even when I direct stuff, I still have to be reminded and yelled at by the sound guy about what he needs. It’s just too hard for a director to manage sound alone, when actors and story are (and should be) the director’s focus.
And after you’ve got the production sound to not-suck, then you get to my favorite area where the “art” really comes into play. Sound design. What’s the door slam sound like, what does it say about the moment? Add a squeak to that character’s step? What’s the background like? Maybe it’s a graphic murder and you hear the sound of children laughing in a nearby park to contrast. Cheap apartment with a baby crying in the next room? I think Murch said this originally: visuals are the front door to the audience’s mind, and sound is the back door. You can add so much to a story with sound and the audience will never (consciously) even realize it. They just enjoy the movie more (or get more terrified, thrilled, laugh harder, etc., whatever the case may be).
My usual hangout is the audio boards on a different site which has a ton of stickies that cover WAY more than I have time to get into. I didn’t see a rule against linking to other sites, but I’m posting this link separate just in case a mod needs to remove it. Sorry if I missed it in the rules, just thought it would be helpful for anyone looking for more in-depth advice.
A million thanks for your clarifications and even direct disagreement. This 30MFS has been a very long time in writing for me precisely because it’s the one I felt the least experienced to write, but the one that was being requested of me the most. There is, in fact, a confusing lot of audio technical information in the world, and it’s really hard to know what you’re talking about without a lot of experience. I did not choose to follow through on writing this until I had successfully done enough with audio that turned out with acceptable results, so this is sort of my own graduation from basic audio knowledge as well.
As for the sound print thing, the first time I heard it, a coworker in my production office was going on a huge rant about it and I listened for quite a while not being able to follow along until I clicked and realized he meant “room tone”. I did not adopt that term until I had heard it twice more, and then noticed that “sound print” is typically what audio programs seem to call it. At that point I adopted it. I am also from the US, so…. meh. I’ll go back to room tone, I like that term better.
Also, again I have heard many rants from many audio engineers and especially sound editors about capturing room tone both before and after a production takes place. Hell, the first shoot I was on captured it in the very middle of the scene, which was weird and I could not tell you why (or if it worked any better). Whatever the timing, the point is that it must be captured at some point, preferably near to when the cameras actually roll. I like your suggestion for each separate set-up as opposed to every single shot (which to me is impractical, but apparently sometimes happens).
Odilonvert: “God, and then I think about how such spaces as Disney Hall were designed so that the acoustics are good. It’s truly a science.”
This is why I love my job ;)
Polaris: Thanks again for writing this up. After reading it I passed it along to my friend from school, Gohanto, since I figured he could add some further depth to this thread. Seems like that worked out!
Audio is definitely something that can not be learned in “30 minutes”. However, I think someone trying to make their own film, that doesn’t know anyone in the audio field, could learn enough by reading through this thread that they would at least not suck terribly at capturing audio. At the very least they would avoid the most notorious pitfalls.
The reason these posts are (initially) forum-based (as opposed to Garage Production Notes, like 1 and 2 turned into at Garage’s behest) is to attract conversation, questions, and alternative solutions. I have to admit that I do not do anything so laborious without the conviction that it will help me in some way, and you could call these 30MFS’s as me reworking over all of the things I’ve learned to see if I’ve kept track of them—and hoping that somebody else could offer more information.
They’ve been pretty successful. At the very least I get a few thanks, and that does make it worth it. As I said, next one is going to deal with digital editing workflow, and this is something that has no standard approach, but is something I do every day, so it’ll be interesting to learn what I’m doing wrong from others.
Cool! I love the pragmatic info, PolarisDiB!
“30-Minute Film School” is one of the things that I love about this site. MUBI should publicize them more!
@Anonymouse: working on getting the rest of them up, our schedule for production notes has increased but we will make sure to continue to post them.
@Polarisdib: Thanks for all you do for the community, you deserve a lot of the credit for keeping it going and more!
NOTE: the currently very popular Canon 5D and 7D cameras have a major drawback in this way. In addition to not having an XLR input, adapting an XLR to the typical mic input jack does not change the fact that the 5 and 7Ds record audio through a compression that, easiest way of saying it, corrupts the audio. Thus, those working with a 5D or 7D would do well to have a separate field mixer for audio, which causes two problems: syncing, and syncing.
Love this ^
We work with the Canon t2i which I figure must be the same thing and we’re picking up audio equipment soon. Very good to know.
I have a Canon Vixia HFS200 and a shotgun mic by Rode for video (VideoMic). I figure image equipment for image, and sound equipment for sound. Not that I’m going to become an expert in sound anytime soon, just planning to learn more than I do now for the future…
Thanks polaris. I had horrible audio on my last film. From now on, I realize the importance of workable audio.
This is a pretty good summary of the Sound 1 class I took last semester.
A really nice primer on audio. Well done.
Great introduction to sound.
I would like to add that sound level is measured in decibels, which work on a logarithmic scale. What this means in mixing is that you don’t double the number of decibels to double your level. Turning up 6dB doubles the signal amplitude (perceived loudness takes roughly 10dB to double).
Speaking of 6dB and logarithms, the inverse square law tells you that every time you double your distance from your sound source you lose half your sound pressure (about 6dB). Going from one foot to two feet away costs you those 6dB. So… if you are a boom operator, pay attention to the movement of the microphone.
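The decibel arithmetic is easy to check for yourself. This short Python sketch uses the standard formulas (20 times the log of an amplitude ratio, 10 times the log of a power ratio); the function names are made up for the example, and the distance figure assumes an idealized point source in open space with no room reflections, which no real set will give you.

```python
import math

def amplitude_ratio_to_db(ratio):
    """dB change for a ratio of amplitudes (voltage, sound pressure)."""
    return 20 * math.log10(ratio)

def power_ratio_to_db(ratio):
    """dB change for a ratio of powers."""
    return 10 * math.log10(ratio)

def level_change_for_distance(old_distance, new_distance):
    """dB change at the mic when it moves, for an ideal point source."""
    return -20 * math.log10(new_distance / old_distance)

print(round(amplitude_ratio_to_db(2), 2))            # doubling amplitude: about 6.02
print(round(power_ratio_to_db(2), 2))                # doubling power: about 3.01
print(round(level_change_for_distance(1, 2), 2))     # boom drifts from 1 ft to 2 ft: about -6.02
print(round(level_change_for_distance(1, 4), 2))     # ...and out to 4 ft: about -12.04
```

The last two lines are the boom operator's problem in numbers: a mic that wanders from one foot to four feet during a take has given away about 12dB of dialog against the same background noise.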
Cardioid, omni and hyper refer to the polar pattern of the microphone capsule.
In film sound the most frequent types of microphones (in terms of how they internally work) are dynamic and condenser mics. Dynamic mics do not require power. Condensers do require power. This power goes by 48V, 48 VDC (volts of direct current), or phantom power.
Many microphones have a built-in filter. It is not a wind cut filter. It is a high pass filter. High pass filters cut low frequencies. It is called high pass because it cuts the lows and lets the highs pass through.
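To show what "cuts the lows and lets the highs pass" looks like on actual samples, here is a minimal Python sketch of the simplest digital high pass filter there is (a one-pole design). It is far gentler than the switch on a real microphone or mixer, the 80Hz cutoff is only an illustrative choice, and the function names are invented for the example.

```python
import math

def high_pass(samples, sample_rate, cutoff_hz=80.0):
    """One-pole high pass: rolls off content below cutoff_hz."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = prev_out = 0.0
    for i, x in enumerate(samples):
        # Output follows *changes* in the input, so slow rumble decays away.
        y = x if i == 0 else alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out

def sine(freq_hz, sample_rate=48000, seconds=1.0):
    """Full-level test tone, used here only to try the filter out."""
    count = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(count)]

rumble_peak = max(abs(s) for s in high_pass(sine(20), 48000)[-4800:])
voice_peak = max(abs(s) for s in high_pass(sine(5000), 48000)[-4800:])
```

With these settings the 20Hz rumble comes out at roughly a quarter of its original level while the 5kHz tone is nearly untouched, which is why the switch helps with handling thumps and wind buffeting without being a cure for either.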
Polaris, a few questions and clarification requests:
1. When you talk about bringing a blanket for location shooting, can I assume you mean an outdoor location?
2. When you say “it is better to set the shotgun mic’s levels to just below ten, then reduce the in-camera microphone to a couple-few decibels below that,” do you mean set the SM levels at nine and the in-camera mic to 7? (I realize this isn’t an exact science and that adjustments will need to be made based on a variety of factors.)
3. When you say “you take the sound print you were so obstinate about recording before the bathroom scene and you cancel it out of the sound print of the action, THEN you layer the room tone from the living room over the track in the bathroom. It still doesn’t sound great, but at least the bathroom sounds like it’s in the same apartment, just with an irritating cheap fan on.” how do you still get the sound of the fan to distinguish the bathroom sound from the living room sound?
4. How long should a typical sound print be? Obviously you are going to repeat it as necessary but I’m just wondering how the pros do it. (Do you agree with Gohanto’s 30 second rule?)
5. Is the under 200 dollar boom even worth it or is it not that different from the cardioid mic?
1) No. You might need to industrial staple that blanket to the wall to cut down on ambient noise. Doing so will cost you lots of money in repair costs to whomever’s wall you just industrially stapled a blanket to, but it’s worth it.
2) Yes, that’s what I meant. Read further responses on this thread for more expert opinions on decibel levels, since they have informed me that I have set my levels too high.
3) The fan will create a horrendous ambient noise if it is left on in a bathroom while you are recording in there. If you do the sound reduction I mentioned, you will still have a horrible fan noise that is even more horribly warped… but is at least squelched. Keep in mind that if you recorded dialog during that scene, it’s gone. It’s trash, you either need to rerecord that scene or dub over it. But if you are still wanting “naturalistic” sound and don’t want to be mixing canned sound for ages, then you can play with the sound print and some equalizers for ages instead, resulting in a messy and bad fan noise that you could at least overlay some other ambient white noise over to lessen the effect.
Really, the point of all that is that your audio is trashed and unworkable. But what the hey, there are still things to be done with it anyway.
4) 30 seconds is pretty standard, yeah. A minute is nice. Two minutes isn’t going to give you much more, honestly, and on set time is money, so you do have to keep moving. What you’re really hoping for from that 30 second print is at least five or six seconds of absolutely nobody moving, shuffling, clicking, coughing, humming, wheezing, or seemingly existing. The more you take, the more likely you’ll get that silence, but the law of diminishing returns sez after a certain point people start getting antsy and moving around.
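Hunting for those five or six clean seconds by ear gets old fast, so here is a hedged Python sketch of one way to let the computer suggest where to listen first: slide a window across the room tone recording and keep the stretch with the lowest energy. The function name is made up, samples are assumed to be floats, and quietest does not automatically mean cleanest, so the result still needs a listen.

```python
import math

def quietest_window(samples, sample_rate, window_s=5.0):
    """Find the lowest-energy stretch of the recording.

    Returns (start_time_in_seconds, rms_level) for the best window.
    """
    size = int(window_s * sample_rate)
    if size <= 0 or size > len(samples):
        raise ValueError("window must fit inside the recording")
    energy = sum(s * s for s in samples[:size])
    best_energy, best_start = energy, 0
    for start in range(1, len(samples) - size + 1):
        # Slide one sample: add the one entering, drop the one leaving.
        energy += samples[start + size - 1] ** 2 - samples[start - 1] ** 2
        if energy < best_energy:
            best_energy, best_start = energy, start
    rms = math.sqrt(max(best_energy, 0.0) / size)
    return best_start / sample_rate, rms

# Toy "recording" at 10 samples per second: shuffling, five quiet seconds, a cough.
take = [0.5] * 30 + [0.01] * 50 + [0.5] * 20
start_s, level = quietest_window(take, sample_rate=10)
```

On the toy data the window lands three seconds in, right where the crew finally held still.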
5) Whatever you have to work with, the next step up in audio is probably worth it. Even the ten dollar microphone strapped to a broomstick will provide better audio than the internal microphone of the camera, by nature of the fact that you can get that broomstick in close to the source and thus isolate it from everything else. A boom mic is typically used instead of a cardioid because it isolates dialog, so it should be better. My recommendation ends there because you’d have to be comparing different varieties and brands of mics, most of which I know nothing about, and there are more of them than anything but a technical manual could sum up anyway.
Thanks! These threads are excellent.
very useful info written in a funny and simple style! I love it.