Concepts

Read through the following to get a sense of the terms and concepts, or start with the demos and simulations below if you prefer. I suggest you download the software listed at the end of this article and try the various examples.

NB: Java simulations will not run in Chrome!

Signals

A function maps every value in a domain to a single value in a range.
A signal is a function whose domain and range can be interpreted in terms of physical units, usually mapping time and/or space to some physical quantity. Signals thus include that which we perceive in time and space. For instance, a song could be represented by a signal that maps each moment over a period of 3 minutes to a single value interpreted as a voltage, or pressure. A scene viewed out of a window could be represented by a signal that maps each point on the window to a single value representing the color and intensity of light at that point.
Periodic vs non-periodic signals
- Periodic signal
  - Time domain representation
  - Frequency domain representation: Fourier series demo 1 demo 2
- Non-periodic signal
  - Time domain representation
  - Frequency domain representation: Fourier transform producing a spectrogram. Try examples using Audacity, Sonic Visualiser
Analog vs. digital signals
- Continuous or Analog signal
  - An analog signal is a continuous function of time and/or space into a continuous range.
  - Analog signals may be represented in the physical world, for instance as time varying pressure (in air) or voltage (on a wire)
  - Analog filtering and synthesis
- Discrete signals and Digital signals
  - A discrete signal is a function of discrete time points (e.g. integers); values may be continuous.
  - A digital signal is a discrete signal whose values must also be drawn from a discrete set: a function from discrete time and/or space into a discrete range (e.g. integers)
  - Digital signals tend not to exist in the physical world, at least not at the level of human perception, but result from the need for computer representations of analog signals
  - Digital filtering and synthesis
Mono vs. stereo signals
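Matlab programming example (click on "Try this example" to run your own code). The following code starts from the fundamental sin(t), then sets k to a sequence of odd values from 3 to 101 and adds the corresponding sine waves at those harmonics, each divided by its harmonic number. The result approximates a square wave.

t = 0:0.01:10;         % time axis (range and step size are assumed here)
y = sin(t);            % assumed starting value: the fundamental (first harmonic)
for k = 3:2:101        % odd harmonics 3, 5, ..., 101
    y = y + sin(k*t)/k;
end
plot(t, y);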

Waves

A wave[1] is an oscillation in some physical value (e.g. pressure, voltage) carrying energy through space/time, often carried by a medium and constrained by physics (including physics of the medium)
- The wave's value at a particular point in space (over time) or in time (over space) defines a signal, as illustrated in this demo
- Mechanical waves are carried by an elastic transmission medium, analogous to weights connected by springs
- Electromagnetic waves (e.g. light waves) travel without any medium
- Waves may be transverse or longitudinal demo
- Dimensionality:
  - Of wave: (1D, 2D, 3D).
  - Of wavefront: (0D, 1D, 2D)
Wave amplitude (A) and Power:
- the peak-to-peak amplitude is the difference between the wave's maximum and minimum values.
- the RMS (root mean square) amplitude is the square root of the average of the wave's squared value. Its units are the same as those of the wave itself, e.g. pressure or volts. This is typically what is measured.
- Wave power (P = A^2): waves carry energy; the power (energy per unit time) carried by a wave is proportional to the square of its RMS amplitude. Here is an explanation.
- We measure the "difference" in power between two waves using a logarithmic scale, dB, applied to the power ratio between the two waves, which is the square of the amplitude ratio, A/B
- dB = 10*log(power ratio) = 10*log(A^2/B^2) = 10*2*log(A/B) = 20*log(A/B) [log base 10]
- [recall from high school math that log(A squared) = 2 log (A) ]
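To make the amplitude and dB definitions above concrete, here is a minimal Python sketch. The function names (`rms`, `peak_to_peak`, `db_from_amplitudes`) and the 1000-point sampled sine are my own illustrative choices, not part of any particular audio tool.

```python
import math

def rms(samples):
    """Root mean square: square root of the average of the squared values."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def peak_to_peak(samples):
    """Difference between the maximum and minimum values."""
    return max(samples) - min(samples)

def db_from_amplitudes(a, b):
    """dB difference between two waves with RMS amplitudes a and b."""
    return 20 * math.log10(a / b)

# One full cycle of a sine wave with peak value 1 (assumed test signal).
n = 1000
sine = [math.sin(2 * math.pi * i / n) for i in range(n)]
quiet = [0.5 * x for x in sine]  # same wave at half the amplitude

print(peak_to_peak(sine))                          # close to 2
print(rms(sine))                                   # close to 0.707
print(db_from_amplitudes(rms(sine), rms(quiet)))   # about 6.02 dB
```

Halving the amplitude costs about 6 dB, because the power drops to a quarter.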
Inverse square law: in the absence of boundaries, waves spread. In 3D, the area over which a wave spreads is proportional to the square of the distance from the source. If the power (energy per unit time) is constant, then intensity (power per unit area) is inversely proportional to the square of the distance. What this means: doubling the distance quarters the intensity, so a flashlight's intensity at 8 feet is only a quarter of its intensity at 4 feet.
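The inverse square law can be checked with a few lines of Python. This sketch assumes an idealized point source radiating equally in all directions (a real flashlight is directional, but the ratio between two distances works out the same way); the function name `intensity` and the 1-watt source are hypothetical.

```python
import math

def intensity(power_watts, distance):
    """Power per unit area on a sphere of the given radius around the source."""
    return power_watts / (4 * math.pi * distance ** 2)

near = intensity(1.0, 4.0)   # intensity at 4 feet
far = intensity(1.0, 8.0)    # intensity at 8 feet

ratio = far / near
print(ratio)                    # about 0.25: a quarter of the intensity
print(10 * math.log10(ratio))   # about -6.02 dB
```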
Wave properties
- Wave superposition principle: in a linear medium, waves combine simply by adding their values. Traveling waves will appear to pass through each other unscathed. It's also possible to cancel a wave by adding its inversion (as in noise canceling headphones). See this simple 1D string animation, or play with this ripple tank.
- Reflection: a wave will reflect off a boundary. See animation
- Diffraction: a wave can travel around an obstacle, or spread after passing through a hole. Diffraction effects are greatest when the wavelength is roughly equal to the size of the diffracting object. See animation. If the wavelength is much bigger than the object, the wave will pass around it nearly unaffected; if much smaller, the wave will be blocked. The diffractive property of sound is crucial to its social efficacy, since audible wavelengths are comparable in size to everyday objects; for instance A = 440 Hz has a wavelength of about 2.5 feet. By contrast the wavelength of visible light ranges from about 4*10^-7 to 7*10^-7 meters, because visible light's frequency is very high (more than 430 terahertz); light thus will not visibly diffract except in relation to very narrow objects (or slits).
- Refraction: a wave changes direction when the medium changes. See animation
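The superposition principle in the list above is simple enough to try numerically. The sketch below is only an illustration with made-up sampled waves: it adds two waves point by point, then adds a wave to its own inversion to show complete cancellation (the idea behind noise-canceling headphones).

```python
import math

n = 100  # number of sample points (arbitrary choice)

# Two assumed example waves: 3 cycles and 5 cycles across the window.
wave_a = [math.sin(2 * math.pi * 3 * i / n) for i in range(n)]
wave_b = [0.5 * math.sin(2 * math.pi * 5 * i / n) for i in range(n)]

# Superposition: in a linear medium the values simply add.
combined = [a + b for a, b in zip(wave_a, wave_b)]

# Cancellation: add the inverted copy of wave_a to wave_a itself.
inverted = [-a for a in wave_a]
cancelled = [a + b for a, b in zip(wave_a, inverted)]

print(max(abs(x) for x in combined))   # largest value of the mixed wave
print(max(abs(x) for x in cancelled))  # 0.0: nothing is left
```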
Sound waves
- Sound waves are longitudinal traveling pressure waves through air (the medium) in 3D, with a 2D usually roughly spherical wavefront. A violin string carries a transverse displacement standing wave in 1D. Either kind may or may not be periodic...
- We hear sound roughly from 20 Hz to 20 kHz
Light waves
- Light waves are transverse electromagnetic waves traveling without a medium in 3D, with a 2D wavefront that tends towards spherical (but can also be directed as a coherent beam, e.g. lasers)
- The visible spectrum [2] ranges from about 400 nm to 700 nm, corresponding to about 750-430 THz (speed of light is about 300,000,000 meters/second)

Periodic waves

Periodic waves as periodic signals

Periodic waves repeat their values (whether pressure, for sound, or displacement, for a string) over both space (at a fixed moment in time) and in time (at a fixed point in space). The period over which a wave repeats is called wavelength (in space) and period (in time). The frequency is the number of periods per second, i.e. is the number of times the period divides into one second. Thus:
Period (T) can be measured in seconds; frequency (f = 1/T) is measured in Hertz (Hz), or cycles per second.
Wavelength (L) can be measured in meters or feet.
Phase. At a fixed point in space, the wave goes through its full range of values once every period (T). So it's like a cycle, and we can identify points on the cycle as angles in degrees: for instance, we can call the beginning of the cycle (which is arbitrary) "zero", the halfway point "180", and the quarter point "90". This angle is called phase.
Open Audacity and play with periodic waves (sine, or drawn)...

Wave speed

What is the speed of a periodic wave? Assume the wave is passing a point in space. The entire length of the wave (L) passes the point in T seconds. So the speed of the wave must be L/T, which is the same as L*f.
Wave speed = L/T = L*f
In words: wave speed = wavelength divided by wave period = wavelength times wave frequency.
- Mechanical waves travel at a speed that depends on the medium (e.g. air pressure, temperature, humidity). You can imagine these factors as related to the spacing of weights and the stiffness of the springs connecting them. In room-temperature dry air, sound waves travel a little faster than 1000 ft/sec (about 1115 ft/sec near 20 degrees Celsius).
- Electromagnetic waves travel at a constant speed in a vacuum: about 186,000 miles/second
demo
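The relation wave speed = L/T = L*f can be rearranged to find any one quantity from the other two. Here is a small Python sketch; the function names are my own, the speed of sound is the 1115 ft/sec figure used in this article, and the 1000 Hz tone is just an example.

```python
SPEED_OF_SOUND_FT_S = 1115.0   # figure used in this article
SPEED_OF_LIGHT_M_S = 3.0e8     # approximate

def period(frequency_hz):
    """T = 1/f, in seconds."""
    return 1.0 / frequency_hz

def wavelength(speed, frequency_hz):
    """L = speed / f, in the length unit of the speed."""
    return speed / frequency_hz

def wave_speed(wavelength_value, frequency_hz):
    """speed = L * f."""
    return wavelength_value * frequency_hz

print(period(1000.0))                           # 0.001 seconds
print(wavelength(SPEED_OF_SOUND_FT_S, 1000.0))  # about 1.1 feet
print(wavelength(SPEED_OF_LIGHT_M_S, 430e12))   # about 7e-7 meters (red light)
```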

Standing waves

Traveling waves move continuously through space until stopped by the medium's firm boundary (note that if the boundary isn't firm the wave may propagate beyond it, though transformed e.g. a sound wave in air can pass beyond the air boundary to enter into water or wall)
Boundary conditions: the medium may be infinite. Or it may (more typically) be bounded: the air in a room meets a wall, a string on a stringed instrument meets a bridge.
Wave reflection: at an obstacle or boundary, a wave will reflect and return. If the reflected wave matches the incident wave, you'll observe the phenomenon of resonance, and energy will be retained in the vibrating medium until dissipated. The frequencies at which this happens are called resonant frequencies of the system (medium plus boundaries). For instance, if you tie a jump rope to a doorknob and shake the rope at just the right rate, the whole rope will go into smooth up and down motion, perhaps with nodes (stationary points) in the middle. Or if you sing in a resonant room at just the right frequency, you'll find your voice amplified and echoing even after you stop singing. This is the phenomenon of standing waves.
Check out this demo of standing waves on a string (1D)
Here's another very nice demo, in which the string is modeled as a series of weights and springs - you can play with various string parameters.
Here are standing waves on a circular membrane (like a drum head, 2D)
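A standing wave can be modeled, roughly, as the sum of two identical waves traveling in opposite directions. The Python sketch below assumes idealized sine waves with a wavelength of 2 units and a frequency of 1 Hz (arbitrary values of mine), and ignores the details of how a particular boundary reflects. It shows a node (a point that stays still) and an antinode (a point that swings with double amplitude).

```python
import math

def standing_wave(x, t, wavelength=2.0, frequency=1.0):
    """Sum of an incident wave and an equal wave traveling the opposite way."""
    k = 2 * math.pi / wavelength   # spatial angular frequency
    w = 2 * math.pi * frequency    # temporal angular frequency
    incident = math.sin(k * x - w * t)
    reflected = math.sin(k * x + w * t)
    return incident + reflected

times = [i / 20 for i in range(20)]  # one full period in 20 steps

# Largest displacement seen over time at two positions.
node = max(abs(standing_wave(1.0, t)) for t in times)      # half a wavelength
antinode = max(abs(standing_wave(0.5, t)) for t in times)  # quarter wavelength

print(node)      # essentially 0: this point never moves
print(antinode)  # close to 2: twice the amplitude of either wave
```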

Resonant systems

A physical system comprising a medium with boundary (e.g. a string tied at its endpoints, an air-filled tube closed at one end, a guitar body, a room filled with air, a pendulum) can be characterized by its resonant frequencies, natural modes of vibration, at which standing waves form.
When waves are introduced at a resonant frequency of the system, they reflect off the boundary in such a way as to reinforce incident waves.
In an ideal system, a fixed amount of energy carried by a wave oscillating at the resonant frequency will continue to oscillate, without loss, as a standing wave.
In practice, the resonant frequency is a "sweet spot" at which energy will rapidly build a high amplitude standing wave.
Consider, for instance: a swing (pumping your legs at the right rate makes the swing go higher and higher), a shower stall (singing at the right frequency creates a loud sound).
Such systems naturally filter out non-resonant-frequency energy - thus plucking a guitar string causes it to vibrate at its resonant frequencies (though the manner of plucking may vary the timbre)
As resonance entails energy buildup and increasing amplitude, disaster may ensue (e.g. when bath water spills, or a bridge breaks)
Standing waves are characterized by nodes of near zero displacement over time.
See if you can put this string demo into standing wave resonance.
Some resonance videos featuring cymatics (Chladni plates): [3][4][5][6]
Here's a beautiful example of resonance among metronomes

Frequency analysis

Fourier series: discrete Fourier analysis of periodic waves (represented by signals)
- In some sense the simplest waves are sine waves, because these result from simple harmonic motion (back and forth movement of a particle whose position is constrained by a simple spring, as in air or an idealized string); they even sound simpler than other waves (due to the physics of our auditory system).
- These simple waves are like an alphabet out of which other periodic waves can be constructed, through a mixing process, according to Fourier analysis:
- Fourier analysis: a periodic wave of frequency f can be represented as a sum of an infinite set of simple periodic waves (sine waves), at frequencies f, 2f, 3f, etc. (called harmonics) each at different phases and amplitudes. The first frequency "f" is called the fundamental, or first harmonic. Subsequent frequencies are integer multiples of f; thus 2f is the second harmonic, 3f is the third harmonic, and so forth. The series is called a Fourier series.
- Thus every periodic wave of frequency f can be analyzed into its component frequencies, each at a particular power level, as given by a series of amplitude coefficients: A1 (amplitude of the first harmonic=f), A2 (amplitude of the second harmonic=2f), etc.
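As a rough illustration of Fourier analysis, the Python sketch below estimates the amplitude coefficients A1, A2, A3... of a sampled square wave by correlating one period with sines and cosines at each harmonic. The function name `harmonic_amplitude` and the 2000-sample square wave are assumptions of mine; real software would use a fast Fourier transform instead of this slow direct sum.

```python
import math

def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic, given exactly one period of samples."""
    n = len(samples)
    a = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples)) * 2 / n
    b = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples)) * 2 / n
    return math.hypot(a, b)  # combine cosine and sine parts, discarding phase

# One period of a square wave: +1 for the first half, -1 for the second.
n = 2000
square = [1.0 if i < n // 2 else -1.0 for i in range(n)]

for k in range(1, 6):
    # Odd harmonics come out close to 4/(pi*k); even harmonics are near zero.
    print(k, round(harmonic_amplitude(square, k), 3))
```

The odd harmonics fall off as 1/k, which is the same pattern used in the Matlab example near the top of this page.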
Fourier transform: continuous Fourier analysis of arbitrary waves (represented by signals)
- The same analysis can be performed on an arbitrary (possibly non-periodic) wave, generating what is called a Fourier transform (a density function).
- The Fourier transform tells us how much energy is in each frequency band
- A spectrogram is an example (refer to my Cantometrics presentation - in class only)
- In practice, one uses a sliding window assumed to represent a temporary period to generate a Fourier series. Obviously in this case the amplitude coefficients are varying over time.
- We can represent the Fourier analysis as a three dimensional plot of frequency, time, and amplitude. Sometimes the plot is reduced to two dimensions, when the amplitude is plotted as intensity or color; this is called a spectrogram.
- The length of the window -- call it W -- defines a frequency f = 1/W. This frequency should be low compared to the expected frequencies in the signal. All the frequencies that emerge out of the Fourier analysis must be multiples of this f = 1/W. Obviously if W is bigger we'll get finer grained resolution - the so-called "narrow band" spectrogram.
- Use Audacity or Sonic Visualiser to record and display waves, and compute spectrograms.
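The sliding-window idea can be sketched in Python as a very crude spectrogram: cut the signal into windows, analyze each one, and report the strongest frequency per window. Everything here is an assumption for illustration: the 1000 Hz sampling rate, the 100-sample window (so W = 0.1 seconds and f = 1/W = 10 Hz), the non-overlapping windows, and the function names. Audacity and Sonic Visualiser do the same job far more efficiently.

```python
import math

def dft_magnitudes(frame):
    """Magnitude at each multiple of 1/W, up to half the sampling rate."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(frame))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(frame))
        mags.append(math.hypot(re, im))
    return mags

def peak_frequencies(signal, window, sample_rate):
    """Strongest frequency in each successive (non-overlapping) window."""
    spacing = sample_rate / window  # f = 1/W, in Hz
    peaks = []
    for start in range(0, len(signal) - window + 1, window):
        mags = dft_magnitudes(signal[start:start + window])
        peaks.append(mags.index(max(mags)) * spacing)
    return peaks

SAMPLE_RATE = 1000  # samples per second (assumed)

def tone(frequency_hz, count):
    return [math.sin(2 * math.pi * frequency_hz * i / SAMPLE_RATE) for i in range(count)]

# 0.2 seconds at 100 Hz followed by 0.2 seconds at 200 Hz.
signal = tone(100, 200) + tone(200, 200)
print(peak_frequencies(signal, 100, SAMPLE_RATE))  # [100.0, 100.0, 200.0, 200.0]
```

A longer window would give finer frequency steps but would blur the moment at which the pitch changes.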
Relation of Fourier analysis to standing waves in resonant systems, and to sound waves
- A standing wave in a resonant system is periodic, hence can be described by a Fourier series: a fundamental frequency and harmonics.
- Generally a standing wave with particular harmonics will generate an equivalent traveling sound wave with the same harmonics
- On a string or air column (1 dimension) the fundamental frequency is the lowest possible standing wave frequency, and resonant frequencies are all harmonics of that fundamental.
- See the diagram of harmonics here
- Thus a bowed string vibrating at 440 Hz generates a sound wave with pitch A and timbre corresponding to the string's harmonics.
- Other systems may exhibit more complex series of resonant frequencies. In general the resonances are called partials; each partial above the lowest is called an "overtone". But they will typically not be multiples of a single fundamental frequency when the resonating system is 2 or 3 dimensional (e.g. drums and gongs).
- Here is an overview and simulation of partials, overtones, and harmonics. Listen to the overtone series as sine waves.
- Here is a recipe for a bell sound using a set of partials (demo using Audacity; also show Fourier transform using 4096 point window. Bigger window=lower frequency, provides more frequency resolution.)

Synthesizing and Filtering

Synthesizing by adding sine waves: fourier analysis/synthesis demo
Filtering
- Filtering demo
- vocal cord vibrations filtered by the vocal tract

Aperiodic waves, noise, and the (very important!) Signal/Noise (S/N) ratio

Even musical signals are only periodic with the following caveats:
- Periodicity is approximate (there are always slight variations due to the addition of noise, and changes in amplitude)
- When monophonic (polyphony introduces aperiodicities)
- Over short periods of time (since pitch is typically changing)
The extreme of aperiodicity is randomness, or noise
- Using the Fourier transform we can analyze the frequencies in noise (even though we may not hear them!)
- White noise: flat frequency distribution; power is equal at all frequencies.
- Pink noise: 1/f distribution; power falls off as frequency rises, so higher frequencies are less present.
- Demo: Audacity (generate noise, analyze, and filter), or Filtering demo
Signal/noise ratio is measured in dB: 20 log (S/N), where S and N are the RMS amplitudes (e.g. sound pressures) of the signal and of the noise (see the Loudness section below for this computation)
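Here is a hedged Python sketch of the S/N computation. It builds an assumed signal (a 440 Hz sine, 0.1 seconds at 44.1 kHz) and assumed noise (Gaussian random values with standard deviation 0.01), then applies 20 log (S/N) to their RMS amplitudes. The names `rms` and `snr_db` are illustrative only.

```python
import math
import random

def rms(samples):
    """Root mean square amplitude."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def snr_db(signal, noise):
    """Signal/noise ratio in dB from the RMS amplitudes of each."""
    return 20 * math.log10(rms(signal) / rms(noise))

random.seed(0)          # fixed seed so the run is repeatable
SAMPLE_RATE = 44100
n = 4410                # 0.1 seconds

signal = [math.sin(2 * math.pi * 440 * i / SAMPLE_RATE) for i in range(n)]
noise = [random.gauss(0.0, 0.01) for _ in range(n)]

print(snr_db(signal, noise))  # roughly 37 dB
```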

Musical psychoacoustics: pitch, loudness, timbre, duration, envelope

Note that our perception tends to be logarithmic in relation to physical quantities (pitch, loudness): equal perceptual distances correspond to equal ratios of physical quantities. For instance, each doubling of a physical quantity produces the sensation of the same perceptual step.

An example of such doubling is the octave: four frequencies in the relationship X, 2X, 4X, 8X (e.g. 100, 200, 400, 800) will sound equally spaced.

Basic perceptual attributes of musical sound (and their physical correlates)

(experiment using Audacity)

Pitch (frequency)
Noise (unwanted signal, typically random and aperiodic): S/N ratio
Loudness (amplitude)
Timbre (tone color) (wave shape)
Duration and envelope (RMS amplitude)
Phase: Note that phase is not audible in itself, but is crucial in Fourier series for determining wave shape (and in particular a 180 degree phase shift can produce complete signal cancellation!)
Spatialization: binaural hearing implies two signals, enabling sound spatialization

Pitch

Audible frequency range for good hearing: from 20 Hz to 20 kHz. Demo: try this online tone generator (but you'll need very good speakers to hear the low end)
A4 = 440 Hz (octaves are numbered from C: Middle C = C4)
An interval is defined by two frequencies, f and g
Perceptually equal intervals have equal ratios. Thus the interval between f1 and g1 is heard to be the same as the interval between f2 and g2 when f1/g1 = f2/g2 (and not, as you might expect, when (f1-g1) = (f2-g2))
Octave ratio = 2
Semitone interval frequency ratio (12 semitones to the octave) = twelfth root of 2
Cent interval frequency ratio (100 cents to the semitone) = 1200th root of 2
In music we measure pitch intervals in log units, so that we can compute differences rather than ratios.
Pitch perception is logarithmically related to frequency
Musical pitch interval = log (frequency ratio) (the logarithmic base provides the interval unit; thus log base 2 gives the number of octaves, and log base twelfth root of 2 gives the number of semitones)
To clarify, let's pose and solve these problems:
- What's the interval between two waves of frequency 660 Hz and 440 Hz? We compute the ratio 660/440 = 1.5, then take the log base 2 of this value (approx. 0.585) to express the ratio in octaves. In tempered semitones, compute 12*0.585 = approx. 7.02 semitones, or very nearly a tempered perfect 5th of 7 semitones (1.5 = 3/2 is exactly a perfect fifth in Pythagorean tuning).
- Conversely, what's the frequency ratio corresponding to a tempered major 3rd above A4? This interval contains 4 semitones, or a third of an octave. The ratio is then 2^(1/3) = approx. 1.26. Since A4 is 440 Hz, a major third above is (1.26)* 440 = 554.4 Hz.
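The two worked problems above translate directly into Python. This is a minimal sketch with function names of my own choosing; `math.log2` gives octaves, and multiplying by 12 or 1200 converts to semitones or cents.

```python
import math

def interval_semitones(f, g):
    """Tempered semitones from g up to f: 12 * log2(f/g)."""
    return 12 * math.log2(f / g)

def interval_cents(f, g):
    """Cents from g up to f: 1200 * log2(f/g)."""
    return 1200 * math.log2(f / g)

def transpose(frequency_hz, semitones):
    """Frequency that lies the given number of tempered semitones higher."""
    return frequency_hz * 2 ** (semitones / 12)

print(interval_semitones(660, 440))  # about 7.02 (a perfect 5th)
print(interval_cents(660, 440))      # about 702
print(transpose(440, 4))             # about 554.4 Hz (major 3rd above A4)
```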

Loudness

Loudness is to power as pitch is to frequency, sort of. The physical quantities (power and frequency) aren't exactly the same, and their relationship is a bit complex - but they're relatable. Multiplying power by 10 corresponds to approximately double the loudness, and multiplying frequency by two corresponds to rising an octave.
RMS amplitude gives a kind of average sound pressure (or average voltage, when converted to an electrical signal)
Power is proportional to the square of this RMS amplitude
decibels are a measure of loudness - ten times the number of "Bels". Some fun videos: [7][8][9] and Khan Academy
The number of dB in a ratio = 10 * log (ratio) [base 10]
Here the ratio is taken between the power of one wave, and the power of another wave
Suppose two sound waves have RMS amplitudes A and B, i.e. power A^2 and B^2. Then the number of dB between them is = 10 * log (A^2 /B^2) = 20 * log (A/B)
In other words, if RMS amplitudes are A and B, and A/B = 10, then the dB = 10*log(100) = 20.
And if dB = 10, then that means 10*log(A^2/B^2) = 10, so that A^2/B^2 = 10, i.e. the ratio A/B = square root of 10, approximately 3.2.
Threshold of hearing at 1 kHz: 2×10^-5 = 0.00002 Pa (RMS) (Pa = pascal, the unit of pressure = 1 newton per sq. meter). This value is used to compute sound pressure level (SPL): the SPL of a wave is the number of dB above this threshold. Thus the threshold itself is 20*log (0.00002/0.00002) = 0 dB
A loud sound such as a jackhammer is about 2 Pa, 100,000 times bigger than the threshold. In SPL dB we compute 20*log(100,000) = 20 * 5 = 100 dB.
In other words: whenever the RMS sound pressure goes up by a factor of 10, the SPL increases by 20 dB. Figure out the number of factors of 10, multiply by 20, and you've got your SPL dB.
Raising the volume by 20 dB corresponds to an increase of 100x in energy, but only 10x in RMS sound pressure. Raising the volume by 10 dB means the sound pressure goes up by a factor of about 3.16.
Loudness in dB SPL
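The SPL arithmetic above can be sketched in Python as follows. The reference pressure is the threshold of hearing given in this section; the function names `spl_db` and `pressure_ratio` are my own.

```python
import math

P_REF = 2e-5  # threshold of hearing at 1 kHz, in Pa (RMS)

def spl_db(pressure_pa):
    """Sound pressure level: dB above the threshold of hearing."""
    return 20 * math.log10(pressure_pa / P_REF)

def pressure_ratio(db):
    """Factor by which RMS pressure grows for a given increase in dB."""
    return 10 ** (db / 20)

print(spl_db(2e-5))        # 0.0: the threshold itself
print(spl_db(2.0))         # about 100 dB: the jackhammer example
print(pressure_ratio(20))  # about 10
print(pressure_ratio(10))  # about 3.16
```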

Duration

The length of the periodic wave (so long as it remains within audible range of pitch and loudness). True periodic waves are infinite, but in practice the sound has a beginning and an end, or at least its amplitude fades beyond audibility outside a certain time interval (perhaps due to movement of the auditor).
You can think of the duration determined by a perfectly periodic wave, plus an on/off switch: you turn the switch on, and hear the wave, then turn it off, and hear nothing.
This idea can be generalized: imagine a perfectly periodic wave, plus a volume control: you bring the volume up from zero, modify it, then fade it out.
If we assume simple linear changes at particular points in time, we get the concept of "sound envelope".
The sound envelope is crucial to identification of sound source. On synthesizers, the envelope is often defined by times for ADSR (attack, decay, sustain, release). Here is an explanatory video.
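As an illustration of an envelope built from simple linear changes, here is a Python sketch of an ADSR shape. The parameter names and default times are invented for the example, `note_length` stands for the moment the key is released, and the sketch assumes the note is held at least as long as attack plus decay. Real synthesizers differ in their details.

```python
import math

def adsr(t, attack=0.05, decay=0.1, sustain_level=0.7, release=0.2, note_length=1.0):
    """Envelope gain between 0 and 1 at time t (seconds), using linear segments."""
    if t < 0:
        return 0.0
    if t < attack:                       # attack: rise from 0 to 1
        return t / attack
    if t < attack + decay:               # decay: fall from 1 to the sustain level
        return 1.0 - (1.0 - sustain_level) * (t - attack) / decay
    if t < note_length:                  # sustain: hold steady
        return sustain_level
    if t < note_length + release:        # release: fade from sustain level to 0
        return sustain_level * (1.0 - (t - note_length) / release)
    return 0.0

# Shape a 440 Hz sine with the envelope (the "volume control" idea above).
SAMPLE_RATE = 8000  # low rate, chosen only to keep the example small
samples = [adsr(i / SAMPLE_RATE) * math.sin(2 * math.pi * 440 * i / SAMPLE_RATE)
           for i in range(int(1.2 * SAMPLE_RATE))]
```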

Timbre

Timbre is a kind of "remainder" category, defined as that which differentiates two sounds of identical pitch, volume, and duration.
Timbre is closely related to the waveshape and envelope of a periodic wave.
Try drawing different waveforms in Audacity and hear what happens.
Or modify harmonics in a Fourier series, and see what happens to the waveshape.

Digitization and sampling

Signals can be analog (continuous in time and value) or discrete (countable in time) or digital (countable in time and value)
Converting the physical analog world to the digital computer world, by
- Sampling a signal at discrete moments in time, usually at a fixed rate, called the sampling rate. (CD: 44.1 kHz)
- quantization: Converting each sample to one of a number of discrete values, determined by the bit depth: the number of bits used. The number of values is 2^(number of bits). So with 8 bits we get 256 values, but with 8 more bits (16 bits) we get 65,536 values; add another 8 bits (24 bits) we get 16,777,216 values. Most audio equipment generates 16 or 24 bit samples.
- Codecs perform analog to digital (A/D) and digital to analog (D/A) conversion.
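The sampling and quantization steps above can be sketched in Python. This is a simplified model of my own: samples are assumed to lie between -1 and 1 and are rounded to the nearest of 2^bits evenly spaced levels. Real converters store integer codes and differ in detail.

```python
import math

def levels(bits):
    """Number of distinct values available at a given bit depth."""
    return 2 ** bits

def quantize(x, bits):
    """Round x (assumed to be in [-1, 1]) to the nearest available level."""
    step = 2.0 / (levels(bits) - 1)      # spacing between adjacent levels
    index = round((x + 1.0) / step)      # which level is closest
    return -1.0 + index * step

def sample_sine(frequency_hz, sample_rate, count):
    """Sample a sine wave at discrete moments, at a fixed rate."""
    return [math.sin(2 * math.pi * frequency_hz * i / sample_rate) for i in range(count)]

print(levels(8), levels(16), levels(24))  # 256 65536 16777216

# Quantizing at 3 bits makes the rounding error easy to see.
for x in sample_sine(440, 44100, 5):
    print(round(x, 4), round(quantize(x, 3), 4))
```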
Aliasing problem:
- It is necessary to sample at least twice as fast as the highest frequency in the signal being sampled. This rate is called the Nyquist rate. It should be considered your minimal sampling frequency.
- If you don't sample fast enough you'll get "aliasing" - spurious low frequencies will appear due to undersampling.
- See the diagram at the top of this page for an example.
- The Nyquist frequency is half the sampling rate, and is the maximum frequency that can be represented.
- The Nyquist frequency should be higher than the highest frequency in your signal, i.e. higher than half the Nyquist rate.
- For example: you have a signal whose highest frequency component is 18 kHz. Then the Nyquist rate is 36 kHz. If you sample at 44 kHz your Nyquist frequency is 22 kHz and you're ok. If you sample at only 32 kHz you'll get aliasing.
- For a typical CD, at 44.1 kHz sampling rate, the Nyquist frequency is 22.05 kHz. (This sampling rate is no accident: 22.05 kHz lies just above the roughly 20 kHz upper limit of our hearing.)
- Helpful video on sampling and aliasing
- Usually a digital recorder applies a low-pass filter prior to sampling, in order to avoid aliasing problems.
- Human hearing only goes up to 20 kHz implying a 40 kHz Nyquist rate, but in fact we can distinguish sounds whose only difference is higher frequencies (we just can't hear those frequencies in isolation).
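To see aliasing in numbers, the Python sketch below uses the 18 kHz example from the list above. The function names are mine, and `alias_frequency` assumes a pure sine tone with no low-pass filter before sampling. The final loop checks that an 18 kHz sine sampled at 32 kHz gives exactly the same samples as an inverted 14 kHz sine, which is why the two cannot be told apart.

```python
import math

def nyquist_rate(highest_frequency_hz):
    """Minimum sampling rate for a signal: twice its highest frequency."""
    return 2 * highest_frequency_hz

def nyquist_frequency(sample_rate):
    """Highest frequency a given sampling rate can represent."""
    return sample_rate / 2

def alias_frequency(frequency_hz, sample_rate):
    """Apparent frequency of a sine tone after sampling."""
    folded = frequency_hz % sample_rate
    return min(folded, sample_rate - folded)

print(nyquist_rate(18000))            # 36000
print(alias_frequency(18000, 44000))  # 18000: sampled fast enough
print(alias_frequency(18000, 32000))  # 14000: a spurious lower frequency

for i in range(8):
    true_tone = math.sin(2 * math.pi * 18000 * i / 32000)
    alias = -math.sin(2 * math.pi * 14000 * i / 32000)
    print(round(true_tone, 6), round(alias, 6))  # the two columns match
```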
Aliasing results from undersampling; quantization noise results from limited bit depth:
- aliasing and quantization noise - demonstration video
- more on aliasing
Play with Audacity: examine samples

Homework

In the following recall that:

to solve for X in X^Y = Z take the Yth root of Z (e.g. X^2 = Z implies X = square root of Z). [here ^ indicates a power]
to solve for Y in X^Y = Z take the log of Z base X. To do that, take the log of Z base 10 and divide by the log of X base 10 (or any other base; 10 is not special!)
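If you want to check your arithmetic, the two rules above look like this in Python. The function names and the sample numbers are arbitrary illustrations, not homework answers.

```python
import math

def solve_base(y, z):
    """Solve X^Y = Z for X: take the Yth root of Z."""
    return z ** (1 / y)

def solve_exponent(x, z):
    """Solve X^Y = Z for Y: log of Z base X, via a change of base."""
    return math.log10(z) / math.log10(x)

print(solve_base(2, 9))      # 3.0, since 3^2 = 9
print(solve_exponent(2, 8))  # about 3, since 2^3 = 8

# Any other base gives the same answer; 10 is not special.
print(math.log(8) / math.log(2))  # about 3 again, using natural logs
```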

Recall that western pitches can be designated by an octave number followed by a letter, such that C4 is middle C (octaves run from C to the next higher B). This is called scientific pitch notation. (Here's an easy mnemonic: A440 is A4.) Further divisions can be introduced using cents: a semitone is divided into 100 perceptually equal parts. Thus an octave comprises 12 semitones and 1200 cents.

What frequency ratio corresponds to an octave? a tempered 5th? a semitone? (hint: it's a number X such that X times itself 12 times = X^12 = 2)
What frequency ratio corresponds to a cent? (hint: it's a number Y such that Y times itself 100 times = Y ^ 100 = a semitone frequency ratio)
How many cents are in a whole tone? (remember: cents are hundredths of a semitone)
How many cents are there from 440 Hz to 523 Hz? (hint: first compute the interval in semitones)
Compute A5 in Hz. Create such a pitch in Audacity.
Compute C5 in Hz. Create such a pitch in Audacity.
What pitch is 600 Hz? (hint: compare by taking the ratio to a pitch you know, such as A440)
Derive the frequency, period, and wavelength of A4
(assume dry air at 20 degrees celsius, in which case the speed of sound is 1115 ft/sec). (Remember that wave frequency times wavelength = wave speed.)
What is the frequency (in Hz) and pitch (as note name) of a periodic sound wave with wavelength 6" (six inches)? (again assume the speed of sound = 1115 ft/sec or 340 m/sec)
Compute the SPL in dB of a rock concert if the RMS pressure amplitude is 10 Pa (pascals) [remember the formula for dB difference when you know RMS pressure for two sounds to be A and B: 20 log (A/B), and recall that the threshold of hearing is 2×10^-5 = 0.00002 Pa (RMS)]. Could this SPL be dangerous to your hearing? Consider this table.

Lab

Audacity:
- examine samples
- draw a periodic wave, hear it (100 samples at 44.1 kHz is approximately one period of a 440 Hz = 0.44 kHz wave) (use Loop Play = shift-space)
- record, observe clipping
- edit, analyze, normalize
Sonic Visualiser
Praat: pitch track, etc.