SASS theoretical principles

Most of the info and text below is taken from Billingsley (1986), Bartlett & Billingsley (1990), and Billingsley & Bartlett (1990).

Human spatial hearing

The recognition of spatial hearing in humans is based on the fact that we have two ears (spaced about 16-18 cm apart) that can perceive and analyse sound simultaneously. The human brain can thus compare the signals arriving at the two ears to determine the direction and distance of the incoming sound. This is achieved in our brain by:

  • comparing the phase (position of a sound wave in time) of the sound arriving at each ear: the phase at one ear differs from that at the other ear because of the earlier arrival of the sound wave (speed of sound is 340 m/sec) and because of diffraction caused by the head and outer-ear tissues. Theoretically, phase differences can only be used by the human brain to localize sound up to about 950 Hz [f = c / lambda, c = 340 m/sec; lambda = 36 cm --> double the ear distance], because phase differences become ambiguous once half the wavelength of the perceived sound approaches the distance between the ears (18 cm). However, differences in the arrival time of the sound at each ear still play a large role at higher frequencies, but only for transients, not for continuous sounds. Sounds below about 200 Hz cannot be localized, because the phase differences are too small.

  • comparing the sound pressure levels arriving at each ear: the head, nose and associated tissues of our face shadow and dampen the sound depending on frequency, so the brain can judge the direction of arrival by comparing the sound pressure levels at each ear. For low frequencies (<1500 Hz), the head does not dampen the sound enough to produce usable amplitude differences, so amplitude differences only contribute to spatial hearing above about 1500 Hz.

Which of these principles the human brain uses for spatial perception depends strongly on the frequency of the incoming sound: low frequencies (200-700 Hz) are judged by phase differences, whereas amplitude differences are used above 1500 Hz. In the transition zone (700-1500 Hz), both principles are used; below 200 Hz, spatial recognition is not possible.
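To make these numbers concrete, here is a minimal sketch (plain Python; the function names are mine, not from the cited papers) that computes the phase-ambiguity limit for a given ear spacing and maps a frequency onto the cue bands described above:

```python
# Sketch of the numbers used above: the phase-ambiguity limit for a given
# ear spacing, and the frequency bands in which each localization cue
# dominates. The band edges (200, 700, 1500 Hz) are taken from the text.

SPEED_OF_SOUND = 340.0  # m/sec, as used in the text

def phase_ambiguity_limit(ear_distance_m: float) -> float:
    """Frequency above which interaural phase becomes ambiguous:
    half a wavelength equals the ear distance, i.e. f = c / (2 * d)."""
    return SPEED_OF_SOUND / (2.0 * ear_distance_m)

def dominant_cue(freq_hz: float) -> str:
    """Map a frequency to the localization cue described in the text."""
    if freq_hz < 200.0:
        return "no localization (phase differences too small)"
    if freq_hz < 700.0:
        return "phase differences"
    if freq_hz < 1500.0:
        return "transition zone (phase + amplitude)"
    return "amplitude differences (head shadowing)"

if __name__ == "__main__":
    # 18 cm ear spacing -> 340 / 0.36 = ~944 Hz, i.e. the ~950 Hz above
    print(f"ambiguity limit: {phase_ambiguity_limit(0.18):.0f} Hz")
    for f in (100, 400, 1000, 3000):
        print(f"{f:>5} Hz -> {dominant_cue(f)}")
```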

Construction principle of the SASS

A spatial recording system has, in general, to be as similar as possible to the human head in order to reproduce the phase, time and amplitude differences in the above-mentioned frequency bands. Michael Billingsley and Bruce Bartlett designed such a system, which makes it possible to record the spatial sound perception of the human ears and is headphone AND stereo speaker compatible (Billingsley 1986, Bartlett & Billingsley 1990, Billingsley & Bartlett 1990).

The SASS utilizes two omnidirectional microphone capsules (in my SASS, I use two parallel-wired EM272 for each side) coupled with two wooden boundary plates and an acoustic baffle, which together mimic some of the diffractive and absorptive qualities of the human head. The microphone capsules are spaced 18 cm apart to provide both the time-of-arrival and head-shadow cues essential for accurate localization. The wooden boundaries are arranged at an angle of 110 degrees to give each side of the array a rearward sound acceptance angle of 125 degrees, which corresponds to the characteristics of the human ear. The area between the microphones is filled with a baffle made of acoustic foam (Basotect), which attenuates frequencies above 1000 Hz at a rate of at least 3 dB per inch. Because the baffle attenuates frequencies above 1000 Hz, it creates amplitude differences between the channels in exactly the range where phase differences become ambiguous (phase can only be evaluated unambiguously up to about 950 Hz), so localization above 1000 Hz relies on amplitude differences rather than confusing phase errors.
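As a worked example, the sketch below (my own illustration, not code from the SASS papers) derives the maximum inter-capsule delay from the 18 cm spacing and the minimum baffle attenuation from the "at least 3 dB per inch above 1000 Hz" figure quoted above:

```python
# The two numbers behind the SASS geometry described above: the worst-case
# inter-capsule time-of-arrival difference for 18 cm spacing, and the
# minimum attenuation through the Basotect baffle for a given foam path.
# Names and structure are illustrative only.

SPEED_OF_SOUND = 340.0     # m/sec
CAPSULE_SPACING_M = 0.18   # 18 cm, as in the SASS design
BAFFLE_DB_PER_INCH = 3.0   # minimum attenuation above 1000 Hz (from text)

def max_arrival_delay_us(spacing_m: float = CAPSULE_SPACING_M) -> float:
    """Worst-case delay between capsules, for sound arriving from the side."""
    return spacing_m / SPEED_OF_SOUND * 1e6

def baffle_attenuation_db(path_inches: float, freq_hz: float) -> float:
    """Minimum attenuation through the baffle (zero below 1000 Hz)."""
    return BAFFLE_DB_PER_INCH * path_inches if freq_hz > 1000.0 else 0.0

if __name__ == "__main__":
    print(f"max inter-capsule delay: {max_arrival_delay_us():.0f} us")  # ~529 us
    print(f"attenuation, 4 inch path @ 2 kHz: {baffle_attenuation_db(4, 2000):.0f} dB")
```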

Stereo localization characteristics

1.) At low frequencies (<500 Hz), the SASS produces equal sound amplitude at both mics, with a direction-dependent delay between the channels.

2.) At frequencies between 500 and 1500 Hz, the stereo localization of the SASS is mainly due to time or phase differences between the channels.

3.) At mid frequencies (1.5 to 3 kHz), localization is due to a combination of time and intensity differences.

4.) At high frequencies above 4 kHz, localization is mainly due to intensity differences.

In other words, the localization mechanism of the SASS crosses over from arrival-time differences to intensity differences in the vicinity of 2000 Hz (Bartlett & Billingsley 1990).
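These four bands can be transcribed into a small lookup (a sketch of mine; the band edges and cue descriptions come straight from the list above, and since the text leaves the 3-4 kHz region unspecified, the sketch treats everything from 3 kHz upward as intensity-dominated):

```python
# The four SASS localization bands from the list above, as a small lookup.
# Band edges and cue descriptions are taken directly from the text; the
# 3-4 kHz region is not specified there, so everything above 3 kHz is
# treated as intensity-dominated here.

SASS_BANDS = [
    (500.0,  "equal amplitude at both mics, direction-dependent delay"),
    (1500.0, "mainly time/phase differences between channels"),
    (3000.0, "combination of time and intensity differences"),
]

def sass_localization_cue(freq_hz: float) -> str:
    """Return the dominant localization cue for a given frequency."""
    for upper_edge, cue in SASS_BANDS:
        if freq_hz < upper_edge:
            return cue
    return "mainly intensity differences"

if __name__ == "__main__":
    for f in (300, 1000, 2000, 6000):
        print(f"{f:>5} Hz -> {sass_localization_cue(f)}")
```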

Advantages over other mic arrays for stereo recording

  • Dummy heads can provide very good, near-to-perfect stereo representation of sound. However, when played back over speaker systems, the sound is poorly focused spatially. Also, the stereo representation depends to a large extent on the individual shape of the ear pinnae, which differs considerably between humans. SASS recordings can be played over headphones and speakers and provide a good spatial representation without spectral distortion on both systems.

  • A spaced array uses two or three mics aimed at the desired sound, spaced several centimetres or metres apart. Time differences between the channels create the stereo image. The spaced array has poorly focused imaging, is not mono compatible and has extreme phase differences between channels.

  • A coincident pair (Blumlein, X-Y, Soundfield, M-S) uses non-spaced, angled directional mics. Amplitude differences between the channels create the stereo image. The mics suffer from off-axis coloration and reduced low-end response.

  • A near-coincident array (ORTF, NOS) uses a pair of directional mics angled and spaced a few centimetres apart horizontally. Time and amplitude differences between the channels create the stereo image. Phase cancellation can occur when the channels are summed to mono, and many mics used with this technique have weak low-end response or off-axis coloration.