Getting Started with Immersive Audio: What Do You Need?

Looking to take the leap into immersive audio? We break down everything you need to start working with 3D sound.

Immersive audio has become a key part of how people listen in recent years. Whether you’re a music producer, a composer for film and TV, or a sound engineer, you’re probably thinking about stepping into the world of immersive audio if you haven’t already.

If you want to keep up with modern audio standards and future-proof your skill set, it’s vital to equip yourself with the tools and knowledge needed to create and mix immersive sound. In this article, we’ll explain what you need to get started working with immersive audio.

While there are several immersive audio formats, including Auro-3D, Sony 360 Reality Audio and THX Spatial Audio, Dolby Atmos is widely considered to be the industry standard. For that reason, this article will focus on getting started on mixing immersive audio with Dolby Atmos.

What is Immersive Audio?

Immersive audio refers to sound that can accurately place a listener within a three-dimensional space. Unlike conventional surround sound, which positions sound only on the horizontal plane, immersive audio also lets you position sound along the vertical axis.

The advantages of immersive audio don’t stop there. Previous audio formats such as stereo and surround sound are channel-based, meaning each channel has its own dedicated position within a mix. Immersive audio instead uses sound objects, granting far greater levels of control and the ability to move a sound’s position dynamically.

The 3D audio format also uses sound beds, which behave more like traditional audio channels: they can be panned, and each bed channel plays out of a predetermined speaker. Sound beds are placed on bed channels, while sound objects are placed on object channels.

While traditional stereo audio consists of two channels, left and right, a Dolby Atmos mix can consist of up to 128 channels, shared between beds and objects. That headroom is what allows many sounds to be positioned independently within a 3D space.
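To make the bed and object distinction more concrete, here is a minimal Python sketch of a channel budget. It is only an illustration: the class names, the normalized (x, y, z) position, and the assumption that every object is mono and uses exactly one channel are all hypothetical, and none of it reflects how any particular renderer stores its data. The only figure taken from the article is the 128-channel ceiling.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

MAX_CHANNELS = 128  # upper limit quoted in the article


@dataclass
class Bed:
    """A channel-based element tied to fixed speaker positions."""
    name: str
    channels: int  # number of bed channels it occupies


@dataclass
class AudioObject:
    """A single sound that can be positioned freely in 3D space."""
    name: str
    position: Tuple[float, float, float]  # hypothetical normalized (x, y, z)


@dataclass
class Session:
    beds: List[Bed] = field(default_factory=list)
    objects: List[AudioObject] = field(default_factory=list)

    def channels_used(self) -> int:
        # Assumption: each object is mono and takes one channel.
        return sum(bed.channels for bed in self.beds) + len(self.objects)

    def channels_free(self) -> int:
        return MAX_CHANNELS - self.channels_used()


session = Session(
    beds=[Bed("music bed", channels=10)],
    objects=[
        AudioObject("lead vocal", (0.0, 1.0, 0.0)),
        AudioObject("helicopter", (-0.5, 0.2, 1.0)),
    ],
)
print(session.channels_used(), session.channels_free())
```

Moving an object is then just a matter of changing its position over time, whereas a bed stays tied to its speakers.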

Why Does Immersive Audio Matter?

Until relatively recently, immersive audio was reserved for those with high-end home cinema or studio setups. Thanks to Dolby Atmos and Apple Spatial Audio, immersive audio is now accessible to far more consumers. Popular streaming platforms including Apple Music, Apple TV+, Netflix and Disney+ now offer greater levels of audio immersion.

Additionally, music and audio in general can help to elicit an emotion in the listener. The spatiality and fidelity of immersive sound can heighten the level of emotion that can be conveyed, and may therefore improve the overall audio experience. For these reasons, an increasing number of music, film, TV and game audio projects are being created and mixed in immersive audio.

What Do You Need to Get Started with Immersive Audio?

Now that we’ve defined immersive audio and explained why it matters, let’s explore the software and hardware you’ll need to start mixing immersive sound with Dolby Atmos.

A Suitable Digital Audio Workstation

For most modern music production and mixing tasks, you’ll need a suitable digital audio workstation or DAW. The same goes for mixing immersive audio. While you can use most DAWs to mix immersive sound with Dolby Atmos, some DAWs are better suited to the task than others.

That said, if you already have a go-to DAW, you may not want to learn a new DAW for the sake of a slightly more straightforward immersive audio experience. If your chosen DAW doesn’t natively support Dolby Atmos rendering, you can opt to use the external Dolby Atmos Renderer.

A 3D Audio-Capable Speaker System

With DAWs covered, let’s turn our attention to monitor speakers. As we already mentioned, the stereo audio format relies on only two channels, and therefore only requires two speakers to be accurately reproduced. When mixing Dolby Atmos, the recommended minimum is 12 speakers in a 7.1.4 configuration.

This consists of seven ear-height speakers (left, center, right, left-surround, right-surround, left-rear, right-rear), four overhead speakers (left-top-front, right-top-front, left-top-rear, right-top-rear), and one subwoofer (LFE). Of course, you can opt for more extensive 3D audio speaker systems for higher levels of control and resolution.
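The layout name itself encodes the speaker count: ear-height speakers, then subwoofers, then overhead speakers. As a rough sketch of that naming convention, the hypothetical helper below (the names `SpeakerLayout` and `parse_layout` are made up for this example) splits a layout string into its three counts and adds them up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeakerLayout:
    ear_level: int  # speakers at ear height
    lfe: int        # subwoofer (LFE) channels
    overhead: int   # overhead speakers

    @property
    def total(self) -> int:
        return self.ear_level + self.lfe + self.overhead


def parse_layout(name: str) -> SpeakerLayout:
    """Parse a layout name such as "7.1.4" into its three counts."""
    parts = name.split(".")
    if len(parts) == 2:  # e.g. "5.1" has no overhead speakers
        parts.append("0")
    if len(parts) != 3:
        raise ValueError(f"unrecognized layout: {name!r}")
    ear_level, lfe, overhead = (int(part) for part in parts)
    return SpeakerLayout(ear_level, lfe, overhead)


layout = parse_layout("7.1.4")
print(layout.ear_level, layout.lfe, layout.overhead, layout.total)
```

For 7.1.4 this gives seven ear-height speakers, one subwoofer and four overhead speakers, which is where the figure of 12 speakers (and 12 interface outputs) comes from.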

Within the context of immersive audio systems, the model of monitor speakers you opt for doesn’t matter too much. The priority is having the correct number of speakers in the correct configuration. In fact, you don’t even need to have a set of matching monitors to mix 3D sound, but more on that shortly.

If you do want to go for matching monitors, some manufacturers such as IK Multimedia, Neumann and Genelec offer immersive audio monitor packages. These packages contain full 7.1.4 speaker configurations and offer better value than separately purchasing each component.

An Audio Interface with 12+ Output Channels

You might be wondering how you can output audio to the 12 speakers required to create a multichannel mix. One option is to use a suitable audio interface with at least 12 outputs. Depending on your budget and willingness to switch interfaces, this might not be a practical solution.

Additionally, there are some disadvantages to using an audio interface as the heart of your immersive audio mixing setup. Many audio interfaces don’t allow you to adjust the level and timing of each speaker in any given setup. This is an essential part of mixing multichannel audio, as it’s vital to get your speaker balance and timing right in order to mix with accuracy and confidence.

So what’s the alternative to this solution?

Immersive Audio-Capable Monitor Controller Software

While there are physical monitor controllers on the market that promise to streamline the 3D audio workflow, a virtual monitor controller is better suited to those starting their journey into immersive audio. In this section, we’ll highlight some of the advantages of using a virtual monitor controller, such as Ginger Audio Sphere, when getting started with immersive audio.

Increased Audio Interface Flexibility

Instead of using a single audio interface, you can use a virtual monitor controller such as Sphere to connect up to eight audio interfaces simultaneously. This means you can continue to run your primary audio interface as usual, adding a secondary interface to increase your number of outputs when needed. This gives you ongoing flexibility with your immersive audio speaker setup, allowing you to increase the number of speakers in your configuration, as and when you need to.

It’s also worth noting that Sphere is DAW and audio interface-agnostic, and is therefore compatible with software and hardware from all major manufacturers. Sphere is also compatible with the external Dolby Atmos Renderer, so no matter which DAW and interface you already use, Sphere will integrate with your existing setup.

Speaker Calibration and Configuration Functionality

Using a virtual monitor controller such as Sphere offers quick speaker configuration and calibration, something that isn’t always possible with an audio interface on its own. This is an essential part of mixing multichannel audio, as it ensures that each speaker is correctly balanced so that you can mix in complete confidence.

Sphere lets you adjust the level, timing and phase of each speaker independently. You can even apply room correction on a per-speaker basis with your favorite room correction plugin, allowing you to fine-tune the performance of your immersive audio speaker system.
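To give a feel for what level and timing calibration involves, here is a small Python sketch of the general idea. It is not Sphere’s implementation; the function names, the example distances and the SPL readings are all assumptions made for illustration. Speakers closer to the listening position are delayed so that their sound arrives at the same moment as sound from the farthest speaker, and each speaker is trimmed so that it measures the same level at the mix position.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate value for air at about 20 °C


def alignment_delays_ms(distances_m):
    """Delay (in ms) to add to each speaker so all arrivals line up
    with the speaker farthest from the listening position."""
    farthest = max(distances_m.values())
    return {
        name: (farthest - distance) / SPEED_OF_SOUND_M_S * 1000.0
        for name, distance in distances_m.items()
    }


def level_trims_db(measured_spl_db, target_spl_db):
    """Gain change (in dB) that brings each speaker to the target level."""
    return {
        name: target_spl_db - measured
        for name, measured in measured_spl_db.items()
    }


# Hypothetical measurements for three of the twelve speakers.
distances = {"left": 2.0, "right": 2.0, "left-top-front": 1.7}
levels = {"left": 80.0, "right": 79.0, "left-top-front": 77.5}

for name, delay in alignment_delays_ms(distances).items():
    print(f"{name}: delay {delay:.2f} ms")
for name, trim in level_trims_db(levels, target_spl_db=79.0).items():
    print(f"{name}: trim {trim:+.1f} dB")
```

In practice the distances and levels would come from measurements taken at the listening position, and the resulting values would be entered into whichever tool handles your per-speaker calibration.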

Increased Monitor Speaker Flexibility

Sphere’s refined per-speaker calibration functionality offers another benefit when getting started with immersive audio. Being able to independently tune monitors and apply per-speaker corrective EQ means that you can mix and match monitor speakers. As with the ability to combine multiple audio interfaces, this flexibility enables you to add further monitors to your immersive audio speaker setup at any point, without the need for matching your existing monitors.

Integrated Apple Spatial Audio Rendering

As mentioned earlier, Apple’s Spatial Audio can be credited with much of the recent increase in the prominence of binaural sound. Apple’s AirPods range has put Dolby Atmos and binaural audio within reach of more consumers, allowing them to enjoy their favorite music, TV series or film with a greater sense of audio immersion.

With more and more people consuming audio in the Apple Spatial Audio format, it’s important to be able to render and monitor your binaural mixes exactly as your audience will hear them. This is made particularly easy with Sphere, which offers integrated Apple Spatial Audio rendering. You can even simulate Apple Spatial Audio’s head-tracking functionality with Sphere’s built-in head-tracking and HRTF support. For more information on Sphere’s Apple Spatial Audio renderer, and how to use it to create Dolby Atmos mixes, check out our in-depth tutorial.

Download the free 14-day trial of Ginger Audio Sphere now to see how it can improve your multichannel mixing workflow.

FAQs

What makes sound immersive?

Immersive sound and immersive audio refer to audio that can be delivered within a three-dimensional sound environment. Unlike surround sound, immersive sound contains audio that is positioned along the vertical axis.

What do you need for spatial audio?

To listen to music in spatial audio, you’ll need a spatial audio-capable playback device such as compatible headphones, a TV, smart speaker or a soundbar.

Does Spotify support immersive audio?

At the time of writing, Spotify does not support immersive sound. If you’re looking for a more immersive audio experience, both Apple Music and Tidal offer spatial audio thanks to Dolby Atmos compatibility.

About the author
Jake Gill is a journalist, content writer and music producer based in Bristol, UK. Having studied marketing as well as music production, he's gone on to write for some of the industry's leading software developers, instrument manufacturers and publications.