- RX 10 Advanced: $799 USD (regularly $1,199 USD)
- RX 10 Standard: $299 USD (regularly $399 USD)
- RX 10 Elements: $99 USD (regularly $129 USD)
iZotope’s RX application is the gold standard for repairing and enhancing audio. The new version is now Apple silicon native and has a number of features that will make dialog and monolog editing a lot more efficient.
I started my review with a new feature of RX 10 Advanced that, in my opinion, will be a boon for anyone working in the broadcasting and film industries: Text Navigation for dialog editing. Text Navigation transcribes the speech of up to eight speakers and makes each one individually discernible with color-coded “lanes” right above the spectrogram and in a sidebar, which includes a search field.
It allows you to find words so that, for example, you can quickly navigate to that part of a dialog where de-essing is urgent, while leaving the other voice entries alone. It is the most spectacular new functionality of RX 10 Advanced.
The current state of the feature is not flawless, but it does not need to be. The recognized text serves only to chop large files into navigable chunks, based on sounds that closely resemble words. RX misunderstood about 10 percent of iZotope’s own test file, for example. I also dictated part of Wikipedia’s entry on Merlin, the wizard of Arthurian legend, in my best Belgian English, and the transcription engine misunderstood about 12 percent of that.
Even with these inaccuracies, though, the essence of dividing the file into chunks of textual content is kept intact. The search field, with its fuzzy search engine, works unexpectedly well. You enter a couple of characters and the field starts to fill up with possible words. In the example of “Merlin,” all the variations RX had listed turned up after I had entered the first three characters.
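To give an idea of why such a search can surface every variation of a misheard word after only a few characters, here is a minimal Python sketch of fuzzy prefix matching. It is not how RX does it; iZotope does not document the search engine. The function name `suggest`, the `cutoff` value and the sample transcript words (including the misspellings) are all hypothetical, chosen purely for illustration.

```python
from difflib import SequenceMatcher


def suggest(query, words, cutoff=0.6):
    """Return transcript words whose opening characters resemble the query.

    Hypothetical helper: compares the query with the same-length prefix of
    each word and keeps the words that score at or above `cutoff`.
    """
    q = query.lower()
    scored = []
    for word in sorted(set(words)):
        prefix = word.lower()[:len(q)]
        score = SequenceMatcher(None, q, prefix).ratio()
        if score >= cutoff:
            scored.append((score, word))
    # Best matches first; ties are broken alphabetically.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [word for _, word in scored]


# Made-up transcript words, including two plausible mis-transcriptions.
transcript_words = ["Merlin", "Merlyn", "Marlin", "wizard", "Arthur", "legend"]
print(suggest("mer", transcript_words))
```

Because the comparison tolerates one wrong character in a three-character prefix, a mis-transcribed variant that differs in its second letter still shows up in the suggestions, just ranked below the exact prefix matches.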
Speaker identification isn’t 100 percent accurate yet either. While I was alone in the room, RX insisted there were two of us: one identified as myself, with a blue dot next to the ID, and another with a yellow dot next to the ID but no lanes. A phantom speaker like this is far less of a practical problem than a missed speaker would be, and RX doesn’t miss any.
So, with one click on a speaker’s color dot, you select everything they said, and you can apply whatever improvement or fix you like to all of it at once.
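Conceptually, selecting a speaker amounts to collecting every time range attributed to that speaker so that one process can be applied to all of them. The following Python sketch illustrates that idea only; the `Segment` type, the `select_speaker` function and the sample data are assumptions of mine and do not reflect RX’s internals or any scripting interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One transcribed stretch of audio (hypothetical structure)."""
    speaker: str
    start: float  # seconds
    end: float    # seconds
    text: str


def select_speaker(segments, speaker, gap=0.0):
    """Return merged (start, end) ranges for everything one speaker said.

    Segments of the same speaker that are at most `gap` seconds apart
    are joined into a single range.
    """
    ranges = []
    own = sorted((s for s in segments if s.speaker == speaker),
                 key=lambda s: s.start)
    for seg in own:
        if ranges and seg.start - ranges[-1][1] <= gap:
            # Extend the previous range instead of starting a new one.
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], seg.end))
        else:
            ranges.append((seg.start, seg.end))
    return ranges


# Made-up two-speaker dialog.
dialog = [
    Segment("A", 0.0, 2.0, "Merlin was a wizard."),
    Segment("B", 2.0, 3.5, "Of Arthurian legend?"),
    Segment("A", 3.5, 5.0, "Exactly."),
    Segment("A", 5.0, 6.0, "So the story goes."),
]
print(select_speaker(dialog, "A"))
```

A fix such as de-essing would then be applied to each returned range, leaving the other speaker’s passages untouched.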
Sexier Repair Assistant, Dynamic Adaptive De-Hum to fix mobile phone noises and more
The transcription capabilities are spectacular, but speed boosts are always welcome, and RX 10 doesn’t disappoint. It lets you fix problems in a fraction of the time it took with RX 7, RX 8 and even RX 9. To that end, the Repair Assistant plug-in has been rebuilt from the ground up.
As with previous versions, the Repair Assistant (included with RX Elements, Standard and Advanced) uses machine learning to automatically recognize specific problems and propose fixes that you can modify to taste. The new version applies more modules at once, and with better results than even RX 9’s.
In my tests of the new Assistant, voice recordings with a serious de-essing problem were fixed, and also improved in terms of dynamics, noise and reverb, with one mouse click. In contrast to purely AI-driven repair and enhancement solutions on the market, however, you can change the decision of any of the six modules afterwards, either from within the Repair Assistant’s interface or by doing your own thing in the associated modules.
Then there’s the new Dynamic Adaptive Mode in De-Hum (Standard and Advanced). I always wished De-Hum would let me remove constant sounds that interfere with the desired signal, regardless of changes in frequency, and the new mode does exactly that. It automatically eliminates complex noise that changes pitch, such as electromagnetic interference, without sacrificing quality.
Perhaps least interesting from a film audio perspective, an upgraded Spectral Recovery improves upon the quality of the re-synthesized upper frequencies, and can now add missing lower frequencies, too.
This is especially useful for recordings made on mobile phones or other non-studio-grade equipment. Still, for broadcast news this can make the difference between an unintelligible recording and one that is clear enough to understand what interviewees are saying (Advanced only).