# Audio recording in theory

## General properties of equipment components

• Dynamic range: ratio between the largest and smallest usable signals (the smallest being set by the noise floor)
• Frequency response: ratio of output power to input power across range of frequencies, with constant input power
• Noise
• Weight and size
• Durability
• Power requirements
• Cost

## Cables

• Electrical specifications
• Unbalanced line: noisy
• Two-wire cable: hot, and ground (shield)
• hot - ground = signal + noise
• Advantage: cheap, simple, fine for short distances
• Disadvantage: noise may be introduced via stray electromagnetic fields on long cables
• Balanced line: noise picked up equally by both conductors cancels at the receiver
• Three-wire cable: hot, cold, and ground (shield)
• Signal is encoded on both hot and cold
• hot - ground = signal/2 + noise
• cold - ground = - signal/2 + noise
• (hot - ground) - (cold - ground) = hot - cold = signal
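• Take care: each pin must be wired consistently as hot or cold at both ends. If hot and cold are swapped on one line, its polarity is inverted and signals may cancel when mixed!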
• Understand the issue with a number game:
• A wants to tell C a number, X, by whispering it to B, who passes it on to C. But B always adds a fixed number N (noise), unknown to C. C hears X + N and therefore can't know the original number!
• The trick relies on B adding the same number N to every whispered number.
• So A tells B the number X/2 as well as its negative, -X/2. C gets X/2 + N and -X/2 + N, and subtracts the second from the first: (X/2 + N) - (-X/2 + N) = X. Voilà: X appears!
• Connectors and connections
• Balanced lines are carried by XLR connectors (3 pin/wire)
• Unbalanced lines are carried by quarter inch, miniplug, or RCA connectors (2 contacts)
• Stereo versions add an extra signal, thus 5-6 wires (balanced) or 3-4 wires (unbalanced), depending on whether ground is repeated.
• Thus a stereo quarter-inch plug has three contacts: tip (left), ring (right), sleeve (ground)
• Be sure ground is connected to ground
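
The hot/cold arithmetic above can be checked numerically. The following is a minimal sketch, not a model of any real cable: the function names, the 440 Hz tone, and the 50 Hz hum are all illustrative assumptions, and the noise is assumed to couple identically into both conductors.

```python
import math

def transmit_unbalanced(signal, noise):
    """Unbalanced line: the hot conductor carries signal plus induced noise."""
    return [s + n for s, n in zip(signal, noise)]

def transmit_balanced(signal, noise):
    """Balanced line: +s/2 on hot, -s/2 on cold, same noise on both."""
    hot = [s / 2 + n for s, n in zip(signal, noise)]
    cold = [-s / 2 + n for s, n in zip(signal, noise)]
    # The receiver takes hot - cold, so the common noise term drops out.
    return [h - c for h, c in zip(hot, cold)]

# Assumed test data: a 440 Hz tone and 50 Hz hum, sampled at 8 kHz.
rate = 8000
signal = [math.sin(2 * math.pi * 440 * t / rate) for t in range(80)]
hum = [0.3 * math.sin(2 * math.pi * 50 * t / rate) for t in range(80)]

unbalanced = transmit_unbalanced(signal, hum)
balanced = transmit_balanced(signal, hum)

# Largest deviation from the original signal on each kind of line.
worst_unbalanced = max(abs(u - s) for u, s in zip(unbalanced, signal))
worst_balanced = max(abs(b - s) for b, s in zip(balanced, signal))
print(f"unbalanced error: {worst_unbalanced:.3f}")
print(f"balanced error:   {worst_balanced:.3e}")
```

In this idealized case the balanced error is only floating-point rounding. On a real cable the cancellation is limited by how equally the noise couples into the two conductors.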

## Microphones

The microphone is a kind of transducer, converting pressure waves into electrical signals by means of a diaphragm that vibrates with incident sound waves, generating an electrical current. The output is usually a low voltage (thousandths of a volt), which must be boosted to line level (in the range of about 0.5 volts to 2 volts).
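
To get a feel for how much boosting this implies, the voltage ratio can be expressed in decibels. This is a small sketch; the 2 mV microphone level and 1 V line level are assumed example values chosen from within the ranges mentioned above, and `voltage_gain_db` is an illustrative helper name.

```python
import math

def voltage_gain_db(v_in, v_out):
    """Gain in decibels needed to raise voltage v_in to v_out."""
    return 20 * math.log10(v_out / v_in)

mic_level = 0.002   # assumed microphone output: 2 mV
line_level = 1.0    # assumed line level: 1 V

gain = voltage_gain_db(mic_level, line_level)
print(f"required preamp gain: {gain:.1f} dB")
```

Under these assumptions the preamplifier has to supply roughly 54 dB of gain, which is why any noise it adds at this stage matters so much.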

• Transduction technology
• Condenser (capacitor) microphone (capacitance change)
• Advantage: stronger signal, better frequency response
• Disadvantage: requires power (battery or phantom), more delicate, poor choice for loud sound
• Dynamic microphone (electromagnetic induction, like a reversed speaker).
• Advantage: Simpler, sturdier, fewer moving parts, no power required, good for loud sounds
• Disadvantage: Tend not to have flat frequency response, weak signals
• Piezoelectric microphone (piezoelectric generation)
• Directional pattern
• Omnidirectional
• Cardioid (front and sides)
• Bidirectional
• Hypercardioid
• Shotgun (too conspicuous!)
• Variable (some microphones provide a switch, e.g. to match video zoom)
• Special types
• Lavalier mic
• Wireless mic
• Contact mic
• Stereo mic
• Digital mic: converts straight to digital and plugs into your computer, usually via a USB port. (Convenient, but not the best A/D conversion.)
• Connectors
• XLR (professional): larger, but shielded, balanced, better connection
• TRS (tip-ring-sleeve, both quarter-inch and miniature): not shielded, usually wired unbalanced (a quarter-inch TRS plug can also carry a single balanced signal)
• RCA: not shielded, unbalanced
• USB
• Impedance
• Low: under 600 ohms (most condenser mics are under 200 ohms)
• High: over 10k ohms
• Low impedance is preferred. High impedance will not perform well with longer cables.
• Connect mic to system of equal or higher impedance
• Frequency response
• Plot of dB vs Hertz (input signal), where dB measures the ratio of output to input
• Flat is generally best, but at least match to expected frequency range
• Condenser responses are generally flatter than dynamic
• Dynamic range
• Essentially the ratio of the loudest sound the mic can handle to its noise floor
• Accessories
• Windscreen
• Mic mount (reduces vibration)
• Stand
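
One common way to see why the impedance rule in the list above asks for an input of equal or higher impedance is to treat the mic as a voltage source in series with its output impedance, feeding the input as a load (a simple voltage divider). The sketch below assumes purely resistive impedances, and the name `delivered_fraction` and the example values are illustrative.

```python
def delivered_fraction(source_ohms, load_ohms):
    """Fraction of the mic's open-circuit voltage that appears across the input."""
    return load_ohms / (source_ohms + load_ohms)

mic_ohms = 200  # assumed low-impedance microphone

for load_ohms in (200, 2_000, 10_000):
    fraction = delivered_fraction(mic_ohms, load_ohms)
    print(f"{load_ohms:>6} ohm input: {fraction:.0%} of the mic voltage")
```

With equal impedances half the voltage is lost in the mic itself. As the input impedance rises, nearly all of the signal voltage reaches the preamp.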

## Speakers

• The speaker is the microphone's opposite in the record/playback chain.
• Applications:
• Feedback to informants
• Transcription
• Editing

## Digital recording

• Analog recordings can be subsequently digitized.
• To make a real-time digital recording, the analog mic signal passes through an A/D unit (sometimes contained inside the mic itself) and is stored on digital media.

### Digital audio data rates

The net bit rate for audio is a function of

• bit depth
• sampling rate
• CODEC compression (if any)

Very roughly speaking, remember that an audio CD (uncompressed audio) is around 650 MB; a high quality MP3 version is about a tenth that size, while the highest quality audio is 2x-3x bigger.

Remember to differentiate units (bit rate vs. byte rate).
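
As a rough illustration of how these factors combine, the sketch below computes the uncompressed PCM rate for CD-style audio and converts bits to bytes. The function names are illustrative, the channel count is an extra factor not listed above, and the 128 kbit/s MP3 rate is an assumed example value.

```python
def pcm_bit_rate(sample_rate_hz, bit_depth, channels):
    """Uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bit_depth * channels

def size_bytes(bit_rate, seconds):
    """Storage needed in bytes: 8 bits per byte."""
    return bit_rate * seconds / 8

cd_bit_rate = pcm_bit_rate(44_100, 16, 2)   # CD audio: 44.1 kHz, 16 bits, stereo
cd_byte_rate = cd_bit_rate / 8              # bytes per second
cd_minute = size_bytes(cd_bit_rate, 60)     # bytes per minute of audio

mp3_bit_rate = 128_000                      # assumed MP3 rate: 128 kbit/s
ratio = cd_bit_rate / mp3_bit_rate

print(f"CD bit rate:  {cd_bit_rate} bits/s")
print(f"CD byte rate: {cd_byte_rate:.0f} bytes/s")
print(f"One minute:   {cd_minute / 1e6:.1f} MB")
print(f"CD is about {ratio:.0f}x the size of the assumed MP3")
```

Mixing up the first two printed figures is exactly the bit rate vs. byte rate confusion noted above: they differ by a factor of eight.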

For some sample and illustrative calculations see: File size calculations.

### Recorder type

• Computer to HD
• Dedicated hard drive recorder
• Flash memory (SD cards, etc.) recorder
• Minidisc, Hi Minidisc

### Digitization

• A/D converter: quality matters (an outboard converter is preferable)
• Sampling rate
• Sampling rate should be greater than twice the highest signal frequency
• Aliasing: frequencies above half the sampling rate appear as lower frequencies, because they're undersampled.
• Nyquist frequency (= half the sample rate) is the highest frequency that can be sampled without aliasing problems.
• Therefore A/D converters must low-pass filter the signal before sampling, removing content above the Nyquist frequency.
• Human hearing is limited to about 20 kHz, which means the signal can be filtered to 20 kHz max without much perceptual loss.
• In this case sampling rate must exceed 40 kHz. Typical rates:
• 44.1 kHz (used for CDs)
• 48 kHz
• 96 kHz
• 192 kHz
• Bit depth
• More bits increase dynamic range and decrease quantization noise, but increase file size
• 8 bits (49.93 dB SQNR)
• 16 bits (98.09 dB) (used for CDs)
• 20 bits (122.17 dB)
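
Both ideas in the list above can be sketched in a few lines. `alias_frequency` gives the apparent frequency of a pure tone sampled with no anti-aliasing filter, and `sqnr_db` uses the standard ideal formula for a full-scale sine wave (about 6.02 dB per bit plus 1.76 dB), which reproduces the SQNR figures listed. The function names are illustrative, and the formula assumes an ideal quantizer.

```python
import math

def alias_frequency(f_signal, f_sample):
    """Apparent frequency of a pure tone sampled at f_sample with no filtering."""
    f = f_signal % f_sample
    # Anything above the Nyquist frequency folds back below it.
    return min(f, f_sample - f)

def sqnr_db(bits):
    """Ideal signal-to-quantization-noise ratio for a full-scale sine wave."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

# A 30 kHz tone is above the 22.05 kHz Nyquist frequency of CD audio.
print(alias_frequency(30_000, 44_100))   # folds down to an audible frequency
print(alias_frequency(10_000, 44_100))   # below Nyquist: unchanged

for bits in (8, 16, 20):
    print(f"{bits} bits: {sqnr_db(bits):.2f} dB")
```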

### Setting levels

• Automatic level control
• Limiter
• Manual level control
• Danger: clipping!
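
Clipping can be illustrated with a toy model in which the recorder can only store sample values between -1.0 and +1.0. This is a sketch under that assumption; the names `clip` and `record` and the gain values are illustrative.

```python
import math

def clip(samples, limit=1.0):
    """Hard-limit every sample to the range the recorder can represent."""
    return [max(-limit, min(limit, s)) for s in samples]

def record(signal, gain):
    """Apply input gain, then store what fits in the recorder's range."""
    return clip([gain * s for s in signal])

# Two cycles of a test tone with peak amplitude 1.0.
tone = [math.sin(2 * math.pi * t / 32) for t in range(64)]

safe = record(tone, gain=0.8)   # level set with some headroom
hot = record(tone, gain=2.0)    # level set too high

flattened = sum(1 for s in hot if abs(s) == 1.0)
print(f"peak at safe level: {max(abs(s) for s in safe):.2f}")
print(f"samples flattened at hot level: {flattened} of {len(hot)}")
```

The flattened samples are lost for good: lowering the playback volume afterwards does not restore the original waveform, which is why levels are set with headroom before recording.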

### Uncompressed audio

Use Pulse Code Modulation (PCM) to encode the analog signal

• wav (PC)
• aiff (Mac)

### Lossless compressed audio

• flac

### Lossy compressed audio

• mp3
• ogg
• mp4
• wma

## Considerations when designing a recording system

• Cost
• Cost of equipment
• Cost of repairs
• Cost of supplies, per minute of recording
• Cost of power, per minute (batteries)
• Quality (Signal/noise, distortion)
• S/N ratio
• Dynamic range
• Frequency response
• Other sources of distortion
• Durability
• Durability of equipment (given environmental problems: heat, dust, moisture, light, shock, etc.)
• Stability of the medium, durability of the recording itself
• Feasibility (practicality)
• Portability of recorder (weight & size)
• Compatibility of equipment types (e.g. computer and recorder)
• Conspicuousness (smaller equipment is less conspicuous, but may make poorer recordings), e.g. a shotgun mic may be technically useful but socially useless
• Practicality of supplies (availability)
• Practicality of power source (availability, voltage, etc.)
• Ease of working with the medium (copying, transferring, editing, etc.)
• Amount of storage (tape, HD, etc.) required (tape length, total volume/weight of tape required)
• Duplication ease and generational loss
• Able to handle range of situations (interview, music, inside, outside, etc.)
• Ability to combine with video recording is a plus!