You've finished recording your audiobook. The narration is solid, the edits are clean, and now you need to get it through ACX quality control. This guide walks through each mastering step required to turn a raw recording into a file ACX will accept.
Before You Start
Work from the highest quality source you have. If you recorded in WAV or FLAC, use those original files. Don't start from an MP3: the first encoding pass has already discarded information, and encoding again compounds the loss.
Process every chapter through the same mastering chain. Consistency across chapters matters as much as hitting individual spec targets. Listeners notice when chapter 5 sounds different from chapter 6, and ACX reviewers flag it.
Step 1: Format Conversion
Start by getting your files into the right format. ACX requires a 44.1 kHz sample rate and one channel format used consistently across the whole book; this guide targets mono, the usual choice for solo narration. If your DAW recorded at 48 kHz or 96 kHz, resample to 44.1 kHz using a high-quality algorithm. If you recorded in stereo, downmix to mono.
Do format conversion first. Every subsequent processing step depends on the sample rate, and carrying a second channel through the chain when you only need mono doubles the processing and can complicate later steps.
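As a rough illustration of the two conversions, the sketch below averages left and right samples into mono and works out the resampling ratio for a given source rate. The function names are hypothetical, and only the ratio arithmetic is shown; the resampling itself is best left to your DAW or a dedicated band-limited resampler.

```python
from fractions import Fraction

TARGET_RATE = 44100  # Hz, the sample rate named above


def downmix_to_mono(stereo_frames):
    """Average each (left, right) pair into a single mono sample."""
    return [(left + right) / 2 for left, right in stereo_frames]


def resample_ratio(source_rate):
    """Return (up, down) such that source_rate * up / down == TARGET_RATE."""
    ratio = Fraction(TARGET_RATE, source_rate)
    return ratio.numerator, ratio.denominator


mono = downmix_to_mono([(0.5, 0.3), (-0.2, -0.4)])
up, down = resample_ratio(48000)  # (147, 160) for a 48 kHz source
```

A 48 kHz file therefore needs a 147/160 rate change, which is why a proper polyphase or sinc-based resampler is preferable to simply dropping samples.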
Step 2: High-Pass Filter
Apply a high-pass filter around 80 Hz to remove low-frequency rumble. This catches things like building vibration, HVAC rumble, and plosive energy that sits below the useful range of speech. A gentle slope (12-24 dB/octave) works well for spoken word without thinning the voice.
This step is technically optional for ACX compliance, but it makes the subsequent noise floor treatment more effective and generally improves clarity.
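For readers who want to see what such a filter does, here is a minimal sketch of a second-order (12 dB/octave) high-pass at 80 Hz, using the widely published RBJ Audio EQ Cookbook biquad formulas. The function name and the Butterworth-style Q of 0.7071 are choices made for this illustration, not settings taken from any particular plugin.

```python
import math


def highpass_biquad(samples, sample_rate=44100, cutoff_hz=80.0, q=0.7071):
    """Second-order high-pass filter (RBJ cookbook coefficients, direct form I)."""
    w0 = 2 * math.pi * cutoff_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha
    # Normalise every coefficient by a0 so the recurrence needs no division.
    b0 = (1 + cos_w0) / 2 / a0
    b1 = -(1 + cos_w0) / a0
    b2 = b0
    a1 = -2 * cos_w0 / a0
    a2 = (1 - alpha) / a0

    x1 = x2 = y1 = y2 = 0.0  # previous inputs and outputs
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out
```

Running the output through the same function a second time approximates a steeper 24 dB/octave slope, at the cost of slightly more attenuation right at the cutoff.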
Step 3: Noise Reduction
ACX requires a noise floor below -60 dBFS. If your recording environment is quiet, you might already be there. If not, you'll need some combination of noise gating and noise reduction.
A noise gate silences the audio during pauses when the level drops below a threshold. This is effective for cleaning up room tone between sentences. More aggressive noise reduction (spectral subtraction or machine learning based) can reduce continuous background noise, but overuse introduces artifacts that sound hollow or watery.
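The gating idea can be sketched in a few lines. This simplified version works on fixed-size blocks and attenuates quiet blocks rather than muting them; the threshold, reduction amount, and block size are hypothetical values for illustration, and a real gate would add attack and release smoothing to avoid audible steps at block boundaries.

```python
import math


def gate_blocks(samples, threshold_dbfs=-50.0, reduction_db=12.0, block=441):
    """Attenuate blocks whose RMS falls below the threshold; pass louder blocks unchanged."""
    gain = 10 ** (-reduction_db / 20)
    out = []
    for start in range(0, len(samples), block):
        chunk = samples[start:start + block]
        rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
        level = 20 * math.log10(rms) if rms > 0 else float("-inf")
        if level < threshold_dbfs:
            out.extend(s * gain for s in chunk)  # pause: turn the room tone down
        else:
            out.extend(chunk)                    # speech: leave untouched
    return out
```

Reducing pauses by a fixed amount instead of cutting them to zero keeps some room tone between sentences, which tends to sound more natural than a hard mute.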
Measure your noise floor after treatment. Solo a section of silence (where you're not speaking) and check its RMS level. It should read below -60 dBFS.
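The measurement itself is simple arithmetic. The sketch below assumes floating-point samples where full scale is +/-1.0 and uses the plain RMS convention (a full-scale square wave reads 0 dBFS); some meters use a sine-referenced convention that reads about 3 dB higher, so check which one your tool reports. The function names are hypothetical.

```python
import math

NOISE_FLOOR_LIMIT_DBFS = -60.0  # limit quoted above


def rms_dbfs(samples):
    """RMS level in dBFS for float samples with full scale at +/-1.0."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def noise_floor_ok(silence_samples):
    """True when a speech-free section sits below the noise floor limit."""
    return rms_dbfs(silence_samples) < NOISE_FLOOR_LIMIT_DBFS
```

For scale, a section whose samples hover around 0.001 of full scale measures right at -60 dBFS, so the room tone needs to sit below that amplitude.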
Step 4: Loudness Adjustment
This is the most critical step for ACX compliance. Your chapter's RMS level needs to land between -23 and -18 dBFS. Most raw recordings from home studios come in around -30 to -40 dBFS, so you'll need to bring levels up significantly.
If the dynamic range is too wide (quiet whispers followed by loud exclamations), start with gentle compression at a low ratio (2:1 or 3:1) to even things out. Then normalize to bring the average level close to your target. Compression before normalization is usually the right order, because compression changes the average level that normalization works from.
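To make the ratio concrete, here is a small sketch of a compressor's static gain curve: levels below the threshold pass unchanged, and every decibel above it is divided by the ratio. The -20 dBFS threshold is a hypothetical value picked for the example, and attack, release, and make-up gain are left out.

```python
def compressed_level_db(level_db, threshold_db=-20.0, ratio=2.0):
    """Output level in dB for a given input level under a hard-knee compressor."""
    if level_db <= threshold_db:
        return level_db  # below threshold: untouched
    return threshold_db + (level_db - threshold_db) / ratio


whisper = compressed_level_db(-30.0)             # stays at -30.0
exclamation = compressed_level_db(-8.0, ratio=3.0)  # 12 dB over becomes 4 dB over: -16.0
```

In this example the gap between the whisper and the exclamation shrinks from 22 dB to 14 dB, which is the kind of evening-out a low ratio provides.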
Be careful not to over-compress. Audiobook narration should retain natural dynamics. The goal is consistency, not loudness. Aim for an RMS around -20 dBFS, which gives you comfortable margin within the -23 to -18 range.
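The gain needed to reach the target is just the difference between the target and the measured RMS, converted from decibels to a linear multiplier. A minimal sketch, assuming float samples with full scale at +/-1.0 and hypothetical function names:

```python
import math


def rms_dbfs(samples):
    """RMS level in dBFS for float samples with full scale at +/-1.0."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def normalize_to_rms(samples, target_dbfs=-20.0):
    """Scale samples so their RMS lands on target_dbfs; returns (samples, gain_db)."""
    current = rms_dbfs(samples)
    if current == float("-inf"):
        raise ValueError("cannot normalize digital silence")
    gain_db = target_dbfs - current
    gain = 10 ** (gain_db / 20)
    return [s * gain for s in samples], gain_db
```

A chapter measuring -30 dBFS needs +10 dB, which is a linear gain of roughly 3.16. The noise floor rises by the same amount, which is one reason the noise treatment comes before this step.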
Step 5: Peak Limiting
After loudness adjustment, your peaks will be higher. ACX requires true peaks below -3 dBFS. A brickwall limiter with a -3 dBFS ceiling handles this. Set the ceiling and the limiter catches any peak that exceeds it.
True peak limiting is important here, not just sample peak limiting. Inter-sample peaks can exceed the -3 dBFS threshold even when every individual sample is below it. A true-peak-aware limiter accounts for the reconstructed analog waveform between samples.
If the limiter is working hard (reducing more than 2-3 dB frequently), your loudness adjustment in the previous step was too aggressive. Back off the gain and let the limiter do less work.
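The inter-sample peak effect described above can be demonstrated with a short sketch. It estimates the true peak by evaluating a Hann-windowed sinc interpolator at four points per sample. This is a simplified stand-in for the oversampling filters used by real true-peak meters, not a reference implementation, and the function names are hypothetical.

```python
import math


def to_dbfs(value):
    return 20 * math.log10(value) if value > 0 else float("-inf")


def sample_peak_dbfs(samples):
    """Largest absolute sample value, in dBFS."""
    return to_dbfs(max(abs(s) for s in samples))


def true_peak_dbfs(samples, oversample=4, half_width=16):
    """Estimate the peak of the reconstructed waveform between samples."""
    peak = 0.0
    n = len(samples)
    for i in range(n):
        lo = max(0, i - half_width + 1)
        hi = min(n, i + half_width + 1)
        for k in range(oversample):
            t = i + k / oversample  # position between sample i and i + 1
            acc = 0.0
            for j in range(lo, hi):
                x = t - j
                if x == 0.0:
                    acc += samples[j]
                else:
                    sinc = math.sin(math.pi * x) / (math.pi * x)
                    window = 0.5 * (1 + math.cos(math.pi * x / half_width))
                    acc += samples[j] * sinc * window
            peak = max(peak, abs(acc))
    return to_dbfs(peak)


# Worst-case style example: a full-scale tone at a quarter of the sample rate,
# phased so that every sample lands about 3 dB below the waveform's crest.
tone = [math.sin(math.pi / 2 * n + math.pi / 4) for n in range(64)]
sample_peak = sample_peak_dbfs(tone)
true_peak = true_peak_dbfs(tone)
```

Here the sample peak reads about -3 dBFS while the estimated true peak is close to 0 dBFS. Narration rarely behaves this extremely, but the example shows why a sample-peak reading alone can hide an overshoot.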
Step 6: Silence Padding
Each chapter needs 0.5 to 1 second of silence at the beginning and 1 to 5 seconds at the end. This silence should be room tone level, not digital zero. Absolute silence followed by speech sounds unnatural and can trigger noise floor measurement issues.
The simplest approach is to record a few seconds of room tone at the start of each session, then use that as your silence padding. Alternatively, you can use very low-level shaped noise that matches your room tone characteristics.
Crossfade between the silence and the speech content to avoid clicks or abrupt transitions.
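Padding and crossfading can be sketched as follows, assuming you already have a short room tone recording as a list of float samples. The default head and tail lengths are hypothetical picks from within the ranges above, the 10 ms fade is likewise an illustrative value, and looping the room tone is a simplification (a short clip may produce an audible repeat).

```python
def pad_with_room_tone(speech, room_tone, sample_rate=44100,
                       head_s=0.75, tail_s=3.0, fade_s=0.01):
    """Add room-tone head and tail, crossfading into and out of the speech."""
    def tone_at(index):
        # Loop the room tone recording as often as needed.
        return room_tone[index % len(room_tone)]

    head_n = round(head_s * sample_rate)
    tail_n = round(tail_s * sample_rate)
    fade_n = min(round(fade_s * sample_rate), len(speech) // 2)

    body = list(speech)
    for i in range(fade_n):
        g = i / fade_n  # 0 at the edge of the speech, rising toward 1
        body[i] = g * body[i] + (1 - g) * tone_at(head_n + i)
        j = len(body) - 1 - i
        body[j] = g * body[j] + (1 - g) * tone_at(head_n + j)

    head = [tone_at(i) for i in range(head_n)]
    tail = [tone_at(head_n + len(body) + i) for i in range(tail_n)]
    return head + body + tail
```

Because the head and tail are built from recorded room tone, they never contain digital zero, and the linear ramps remove the step where speech meets padding.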
Step 7: MP3 Export
Export as MP3 at 192 kbps with constant bit rate (CBR). This is non-negotiable: VBR averaging 192 kbps will be rejected. Use a reputable encoder. LAME is the standard and produces high-quality results at this bitrate.
After encoding, verify the output. Decode the MP3 and measure its levels again. MP3 encoding can shift peak levels slightly, and a file that was at -3.0 dBFS true peak before encoding might be at -2.8 dBFS after. If this happens, your pre-encoding ceiling needs to be slightly lower.
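The ceiling correction is simple arithmetic: lower the limiter ceiling by however much the encoded peak overshot it, plus a little extra. The sketch below uses a hypothetical 0.1 dB safety margin and a hypothetical function name.

```python
def adjusted_ceiling_db(target_db, limiter_ceiling_db, measured_peak_db,
                        margin_db=0.1):
    """Suggest a new limiter ceiling after measuring the decoded MP3's peak."""
    if measured_peak_db < target_db:
        return limiter_ceiling_db  # encoded file already complies
    # How far the encoder pushed the peak above the ceiling that was set.
    overshoot = measured_peak_db - limiter_ceiling_db
    return target_db - overshoot - margin_db


new_ceiling = adjusted_ceiling_db(-3.0, -3.0, -2.8)  # about -3.3 dBFS
```

The overshoot depends on the content, so it is worth re-measuring after encoding with the new ceiling rather than assuming the same 0.2 dB shift every time.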
Step 8: Final Verification
Before submitting, verify every checkpoint against the actual MP3 file (not the source WAV). Check:
- RMS between -23 and -18 dBFS
- True peak below -3 dBFS
- Noise floor below -60 dBFS
- Sample rate is 44,100 Hz
- Mono audio (1 channel)
- 192 kbps CBR MP3
- Head silence 0.5 to 1 second
- Tail silence 1 to 5 seconds
Do this for every chapter. It's tedious but it prevents rejection, which costs more time than verification.
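If you collect these measurements per chapter, the checklist above can be turned into a small pass/fail routine. The dictionary keys below are hypothetical names for values you would measure with your own tools; the thresholds are the ones listed in this guide.

```python
def acx_failures(m):
    """Return the names of the checks that a chapter's measurements fail."""
    checks = {
        "rms": -23.0 <= m["rms_dbfs"] <= -18.0,
        "true_peak": m["true_peak_dbfs"] < -3.0,
        "noise_floor": m["noise_floor_dbfs"] < -60.0,
        "sample_rate": m["sample_rate_hz"] == 44100,
        "mono": m["channels"] == 1,
        "bitrate": m["bitrate_kbps"] == 192 and m["cbr"],
        "head_silence": 0.5 <= m["head_silence_s"] <= 1.0,
        "tail_silence": 1.0 <= m["tail_silence_s"] <= 5.0,
    }
    return [name for name, passed in checks.items() if not passed]


chapter = {
    "rms_dbfs": -20.1, "true_peak_dbfs": -3.4, "noise_floor_dbfs": -63.0,
    "sample_rate_hz": 44100, "channels": 1, "bitrate_kbps": 192, "cbr": True,
    "head_silence_s": 0.75, "tail_silence_s": 3.0,
}
failures = acx_failures(chapter)  # empty list when every check passes
```

Running this over every chapter and printing any non-empty result gives a quick summary of which files need another pass.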
The Easier Way
Each of these steps requires specific tools, careful parameter choices, and verification. For a 20-chapter audiobook, that's a significant time investment per submission.
ACX Pass automates this entire chain. It runs a 9-step processing pipeline that handles format conversion, filtering, noise treatment, loudness adjustment, peak limiting, silence padding, and MP3 encoding. Every output file is verified against all 8 ACX checkpoints before export, and files that don't pass aren't exported. Instead, you get a clear report of what failed and why.
Skip the manual mastering. Upload your raw chapters and download ACX-compliant MP3s in minutes.