Published 6 March 2026 - Last update 6 March 2026 - Author: Franck Martin

Sound, Space, Time

Spatial audio, also called immersive audio, is gaining popularity thanks to the processing power of today's computers, phones, and game consoles. Virtual reality depends on sound too: for a virtual world to feel completely immersive, it has to give the brain every cue it expects. Even in an ordinary game, hearing what is happening behind the player can be an advantage.

In the analog world, processing sounds to make them immersive was complicated; in today's digital world, it is just programming. The technology that converts object-based sounds into a binaural rendition, one that makes you believe a sound really is coming from slightly to your right and above you, is increasingly available on consumer-grade equipment.

Spatial Audio is just an evolution of past methods to bring clarity and richness to a sound environment.

When you play two sounds at the same time, they may overlap and cancel each other out. There are several methods to separate them, either while playing or while mixing. Let's review some of these methods.

Separating Sounds by Time

For percussive sounds, one method is simply to play one sound before the other. This is the concept of swing, where the drummer plays slightly ahead of or behind, say, the guitar or the rest of the band. The transient of the drum then happens just before the other sounds. The difference is only a few milliseconds, but it is enough to separate the drums from the rest.
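As a rough illustration of how small that offset is in digital terms, here is a minimal Python sketch. The 48 kHz sample rate, the 5 ms offset, the toy sample values, and the helper names (ms_to_samples, delay, mix) are all assumptions made for the example, not something taken from a particular tool.

```python
SAMPLE_RATE = 48_000  # samples per second (assumed for this sketch)


def ms_to_samples(ms: float, sample_rate: int = SAMPLE_RATE) -> int:
    """Convert a time offset in milliseconds to a whole number of samples."""
    return round(ms * sample_rate / 1000)


def delay(track: list[float], ms: float, sample_rate: int = SAMPLE_RATE) -> list[float]:
    """Delay a track by prepending silence, which makes the track longer."""
    return [0.0] * ms_to_samples(ms, sample_rate) + track


def mix(a: list[float], b: list[float]) -> list[float]:
    """Sum two tracks sample by sample, padding the shorter one with silence."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


# Toy data: a short drum transient and a sustained guitar note (hypothetical values).
drum = [1.0, 0.5, 0.25]
guitar = [0.8, 0.8, 0.8]

late_guitar = delay(guitar, 5)   # 5 ms is 240 samples at 48 kHz
mixed = mix(drum, late_guitar)   # the drum transient now sits alone at the start
```

In the mixed result the first three samples contain only the drum, and the guitar starts 240 samples later, so the two transients no longer land on top of each other.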

Separating Sounds by Volume

This method is more commonly called side chaining. A brief sound triggers compression of another sound, making it quieter for the duration of the first sound's transient.
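The idea can be sketched in a few lines of Python. This is a deliberately simplified ducker, not a real compressor: it uses an instant-attack envelope follower and a fixed gain reduction instead of a ratio, and the threshold, depth, and release values are assumptions chosen to make the toy example readable.

```python
def envelope(trigger: list[float], release: float = 0.5) -> list[float]:
    """Peak follower: rises instantly, then decays by `release` on each sample."""
    level = 0.0
    out = []
    for sample in trigger:
        level = max(abs(sample), level * release)
        out.append(level)
    return out


def duck(signal: list[float], trigger: list[float],
         threshold: float = 0.3, depth: float = 0.25,
         release: float = 0.5) -> list[float]:
    """Scale `signal` by `depth` while the trigger envelope is above `threshold`."""
    env = envelope(trigger, release)
    return [x * (depth if e > threshold else 1.0) for x, e in zip(signal, env)]


# Toy data: a one-sample kick and a steady bass note (hypothetical values).
kick = [0.0, 1.0, 0.0, 0.0, 0.0]
bass = [0.8, 0.8, 0.8, 0.8, 0.8]

ducked = duck(bass, kick)  # the bass dips while the kick envelope is above 0.3
```

Here the bass is turned down for two samples (the kick itself and the first sample of its decay) and returns to full level afterwards.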

Separating Sounds by Frequencies

There are common charts that show the frequency range of each instrument. A composer writing for orchestra needs to know the frequency range of each group of instruments and may ensure that those with overlapping ranges are not playing the same notes.

Frequency chart
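To make the overlap check concrete, here is a small Python sketch. It uses the usual equal-temperament formula with A4 = 440 Hz and MIDI note numbers; the two note ranges are hypothetical parts written for a piece, not the physical limits of any instrument, and the function names are made up for the example.

```python
def midi_to_hz(note: int) -> float:
    """Equal-temperament frequency of a MIDI note number, with A4 (note 69) = 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)


def overlap(range_a: tuple[int, int], range_b: tuple[int, int]):
    """Return the shared (low, high) note range of two parts, or None if they do not overlap."""
    low = max(range_a[0], range_b[0])
    high = min(range_a[1], range_b[1])
    return (low, high) if low <= high else None


# Hypothetical parts, given as (lowest, highest) MIDI note.
cello_part = (36, 57)   # C2 to A3
viola_part = (48, 72)   # C3 to C5

shared = overlap(cello_part, viola_part)        # notes both parts can land on
shared_hz = tuple(midi_to_hz(n) for n in shared)  # the same range in hertz
```

In this example the two parts share the notes from C3 to A3, roughly 131 Hz to 220 Hz, which is where the composer would want to avoid doubling.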

It is also common in a mix to remove some frequencies from certain instruments to carve out space for the voice and make it more intelligible.
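One common way to carve such a space is a peaking equalizer set to cut. The Python sketch below computes biquad coefficients using the peaking-filter formulas from Robert Bristow-Johnson's Audio EQ Cookbook and then checks the resulting gain at a given frequency. The 3 kHz center frequency, the 6 dB cut, and the Q of 1.0 are assumptions picked for illustration, not a mixing recommendation.

```python
import cmath
import math


def peaking_eq(f0: float, gain_db: float, q: float, fs: float = 48_000):
    """Biquad coefficients (b, a) for a peaking filter (Audio EQ Cookbook form)."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b = (1 + alpha * amp, -2 * math.cos(w0), 1 - alpha * amp)
    a = (1 + alpha / amp, -2 * math.cos(w0), 1 - alpha / amp)
    return b, a


def gain_db_at(freq: float, b, a, fs: float = 48_000) -> float:
    """Magnitude response of the biquad at `freq`, in decibels."""
    z1 = cmath.exp(-1j * 2 * math.pi * freq / fs)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20 * math.log10(abs(num / den))


# Hypothetical setting: cut a guitar by 6 dB around 3 kHz to leave room for the voice.
b, a = peaking_eq(f0=3000, gain_db=-6, q=1.0)
cut_at_center = gain_db_at(3000, b, a)   # close to -6 dB
far_below = gain_db_at(100, b, a)        # close to 0 dB, left almost untouched
```

The filter removes energy only around the chosen frequency; far from it, the instrument passes through essentially unchanged.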

Separating Sounds in Space

Stereo

Stereo is not reality. It is a representation of reality. Band members are not placed left and right; they are spread across the stage. Granted, nowadays they are electrically amplified, and the sound you get at a concert comes mainly from two sound sources.

In the early days of stereo, mixing consoles had only three settings: left, center, and right. If you listen to early Beatles albums, the instruments are clearly placed in the left or the right speaker.
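Those three positions can be seen as special cases of a continuous pan control. The Python sketch below uses a constant-power pan law, which is one common choice and only an assumption here; the function name and the -1 to +1 pan scale are made up for the example.

```python
import math


def pan_gains(pan: float) -> tuple[float, float]:
    """Constant-power pan: -1 is hard left, 0 is center, +1 is hard right.

    Returns the (left, right) gains to apply to a mono source.
    """
    theta = (pan + 1) * math.pi / 4  # maps [-1, 1] onto [0, pi/2]
    return math.cos(theta), math.sin(theta)


# The three settings of an early left/center/right switch.
LCR = {name: pan_gains(p) for name, p in (("left", -1.0), ("center", 0.0), ("right", 1.0))}
```

At center, each speaker gets a gain of about 0.707 (roughly 3 dB down), so the total power stays the same as when the source is fully on one side.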

Quadraphonic or source based

Quadraphonic is a system that was never formalized as a single standard. The closest standards you will find are the 5.1 and 7.1 sound systems used in home theaters. You can fit a quadraphonic sound (4 channels) into a 5.1 system; however, the front and back left and right speakers of a 5.1 system are often not placed in a perfect square. The back speakers are called surround speakers because they sit only slightly behind the listener. They are also mostly there to add ambiance in movies and are not necessarily a full sound source. Still, stereo, 5.1, and 7.1 are all source-based systems: you render the sound for the location of your speakers.

What these formats give you is more locations in which to place the instruments, separating them physically so that you are not obliged to use the separation methods described above.
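Fitting four quadraphonic channels into a 5.1 layout amounts to a channel mapping. Here is a minimal Python sketch, assuming the channel order L, R, C, LFE, Ls, Rs (a common order, but an assumption here) and leaving the center and LFE channels silent; the function name is hypothetical.

```python
CHANNELS_5_1 = ("L", "R", "C", "LFE", "Ls", "Rs")  # assumed channel order


def quad_to_5_1(front_left, front_right, rear_left, rear_right):
    """Map four quadraphonic channels onto a 5.1 layout.

    The center and LFE channels have no quadraphonic equivalent,
    so they are filled with silence in this sketch.
    """
    n = len(front_left)
    feeds = (front_left, front_right, [0.0] * n, [0.0] * n, rear_left, rear_right)
    return dict(zip(CHANNELS_5_1, feeds))


# Toy data: two samples per channel (hypothetical values).
out = quad_to_5_1([0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8])
```

The mapping keeps the content intact, but because the surround speakers of a 5.1 room are usually not where quadraphonic rear speakers would be, the perceived positions will shift.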

Object based

This is a more recent format, made possible by the ability of consumer products to process sound digitally. Instead of assigning sounds to speakers, you place them in space as virtual objects that you can animate. The player, using psychoacoustic digital effects, converts this virtual space into a physical one, which could be binaural (headphones), 5.1, or 7.1.4 (home theater). Dolby Atmos is one of the most recent standards in this field.
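The key idea, that the same object is rendered differently depending on the playback layout, can be sketched in Python. This is only a toy 2D renderer using constant-power panning between the two speakers on either side of the object; it is not how Dolby Atmos or a binaural renderer actually works. The class name, the angle convention (0 degrees is front, angles increase clockwise), and the speaker angles are all assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class AudioObject:
    name: str
    azimuth_deg: float  # 0 = front, 90 = right (assumed convention)


def render(obj: AudioObject, speakers_deg: list[float]) -> list[float]:
    """Return one gain per speaker for the object's current position."""
    order = sorted(range(len(speakers_deg)), key=lambda i: speakers_deg[i] % 360)
    azimuth = obj.azimuth_deg % 360
    gains = [0.0] * len(speakers_deg)
    for k in range(len(order)):
        i, j = order[k], order[(k + 1) % len(order)]
        span = (speakers_deg[j] - speakers_deg[i]) % 360 or 360
        offset = (azimuth - speakers_deg[i]) % 360
        if offset <= span:  # the object sits between speakers i and j
            theta = (offset / span) * math.pi / 2
            gains[i], gains[j] = math.cos(theta), math.sin(theta)
            break
    return gains


voice = AudioObject("voice", azimuth_deg=90.0)   # directly to the right
quad_layout = [45.0, 135.0, 225.0, 315.0]        # front-right, rear-right, rear-left, front-left
stereo_layout = [330.0, 30.0]                    # left, right

quad_gains = render(voice, quad_layout)
stereo_gains = render(AudioObject("voice", 0.0), stereo_layout)
```

Animating the object is then just a matter of changing azimuth_deg over time and rendering again; the mix itself never needs to know which speakers exist.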

This system offers more ways to separate instruments: with Dolby Atmos, for instance, you can have up to 128 objects in your virtual space. That said, for consumer products, bandwidth limitations mean the objects are converted into 16 clusters of objects.

128 objects at 48 kHz and 24 bits need 128 × 48,000 × 24 = 147,456,000 bits/s, or 147.456 Mb/s, if sent uncompressed. You would need a serious Internet connection to your home. On a phone you would need at least a 5G connection.
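The same arithmetic can be written as a small Python helper, which also makes it easy to compare the full 128 objects with the 16 clusters mentioned above. It counts uncompressed PCM audio only and ignores position metadata and any codec, which is a simplifying assumption; the function name is made up for the example.

```python
def pcm_bitrate(channels: int, sample_rate_hz: int, bit_depth: int) -> int:
    """Raw PCM bit rate in bits per second, with no compression and no metadata."""
    return channels * sample_rate_hz * bit_depth


full_objects = pcm_bitrate(128, 48_000, 24)  # every object sent separately
clustered = pcm_bitrate(16, 48_000, 24)      # after reduction to 16 clusters

full_mbps = full_objects / 1_000_000       # 147.456 Mb/s
clustered_mbps = clustered / 1_000_000     # 18.432 Mb/s
```

Clustering alone cuts the raw rate by a factor of eight, which shows why it is applied before delivery to consumer devices.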

FAQ

Frequently asked questions about sound, space, and time—spatial audio and separating sounds.

What is spatial audio or immersive audio?
Spatial or immersive audio uses digital processing to place sounds in a virtual space so they appear to come from specific directions. It is an evolution of past methods to bring clarity and richness to a sound environment, and is increasingly available on consumer devices, including for virtual reality and games.
How can you separate overlapping sounds?
You can separate sounds by time (playing one before the other, e.g. swing), by volume (side chaining—briefly reducing one sound when another hits), by frequency (carving frequency ranges so instruments don't clash), or in space (stereo, multichannel, or object-based placement).
What is side chaining?
Side chaining is separating sounds by volume: a brief sound (e.g. a kick) triggers compression on another sound, making it quieter during that transient so the first sound stands out.
What is object-based audio?
Object-based audio places sounds as virtual objects in 3D space that can be animated. The player uses psychoacoustic processing to render this to binaural (headphones), 5.1, or 7.1.4. Dolby Atmos is a leading standard and can use many objects (e.g. up to 128), though consumer delivery often clusters them into fewer channels due to bandwidth limits.
What is quadraphonic or source-based sound?
Quadraphonic uses four channels; it was never fully standardized. The closest common standard is 5.1 or 7.1 home theater. In these source-based systems you render for fixed speaker positions (left, right, center, surround, etc.), giving more physical separation than stereo.