Interfacing an Audio Codec with ESP32 – Part 2

The previous article touched on some basics of interfacing an I2S audio codec with the ESP32. This article outlines the pinout and requirements of a typical audio application built around the ESP32.


Types of audio codecs

In terms of interfacing, there are two main types of audio codecs:

  • Codecs that need an I2S master clock
  • Codecs that do not need an I2S master clock

This difference is significant because not all microcontrollers can generate the master clock (usually called MCLK). The master clock frequency must be an integer multiple of the I2S bit clock frequency, and its jitter must be extremely low for the audio to be of high quality.
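To make the integer-multiple requirement concrete, here is a minimal sketch of the clock arithmetic. It assumes a bit clock of sample rate × bits per sample × channels and an MCLK chosen as a multiple of the sample rate (256 is a common choice, used here as an assumption). The function name and its parameters are made up for illustration.

```python
def i2s_clocks(sample_rate_hz, bits_per_sample, channels=2, mclk_multiple=256):
    """Return (bclk_hz, mclk_hz, mclk_per_bclk) for an I2S stream.

    Raises ValueError when MCLK is not an integer multiple of BCLK.
    """
    bclk_hz = sample_rate_hz * bits_per_sample * channels
    mclk_hz = sample_rate_hz * mclk_multiple
    if mclk_hz % bclk_hz != 0:
        raise ValueError(
            f"MCLK {mclk_hz} Hz is not an integer multiple of BCLK {bclk_hz} Hz"
        )
    return bclk_hz, mclk_hz, mclk_hz // bclk_hz


# 16-bit stereo at 44.1 kHz with MCLK = 256 x sample rate
print(i2s_clocks(44100, 16))  # (1411200, 11289600, 8)
```

With 24-bit stereo at 48 kHz, a multiple of 256 fails this check while 384 passes, which is one reason codecs often accept more than one MCLK ratio.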

An example of an I2S audio device that does not require MCLK is the MAX98357A, while an example of a codec that requires MCLK is the SGTL5000.

In general, a system that uses no MCLK produces lower analog audio quality than a system that uses an MCLK. However, the difference may not be audible in practice.

Connecting an audio codec with ESP32

The ESP32 supports pin function multiplexing and flexible routing of "slow" digital peripheral signals through the GPIO matrix. This simply means that the ESP32 can output the I2S signals on any pin that supports digital IO.
There is one exception, however. The MCLK (I2S master clock) can only be output through a CLK_OUT pin. If you are designing a PCB for an ESP32 audio application, this is something to plan for early.

Here is an example of how you may connect the AudioBit to the ESP32. The MCLK must be connected to GPIO0, but the other pins may be connected elsewhere to suit your application.

Figure: Connecting AudioBit, an audio codec, with ESP32 (for audio playback)

The Nano32 is a great board to start working with, because of its simplicity.

As you can see in the AudioBit pinout, a typical codec has an I2S bus for audio data and an I2C bus for control inputs such as setting the I2S data format, configuring the mixing of sound tracks, adjusting the audio volume, and a LOT of other things. The SGTL5000 on board the AudioBit can do a whole lot in terms of audio processing!

Talking to the codec

Control over I2C

The I2C SDA and SCL lines are used to configure the audio codec. The I2C interface is mandatory and you MUST configure the codec on power-up. Otherwise it will not work, because all of the I2S functions are disabled by default on power-up.
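As a rough illustration of what one configuration write looks like on the wire, the sketch below packs a register write into bytes. It assumes the codec takes a 16-bit register address followed by a 16-bit value, most significant byte first, which is how the SGTL5000 is generally described. The helper name is made up, and the register address 0x0002 and value 0x0021 are illustrative assumptions only, so check the codec datasheet before using them.

```python
import struct


def encode_register_write(reg_addr, value):
    """Pack a 16-bit register address and a 16-bit value, big-endian.

    The result is the payload that follows the I2C device address in a
    single write transaction (assumed framing, verify with the datasheet).
    """
    if not (0 <= reg_addr <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("register address and value must fit in 16 bits")
    return struct.pack(">HH", reg_addr, value)


# Illustrative values only (assumed, not taken from the article)
payload = encode_register_write(0x0002, 0x0021)
print(payload.hex())  # 00020021
```

On real hardware these four bytes would be handed to whatever I2C write function your platform provides, after the codec's device address.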

Note that setting the initial state of the codec can be frustrating and difficult. We have set up the SGTL5000 for basic audio playback using ESP32.

Audio data over I2S

The I2S interface uses the following signals:

  • MCLK - The master clock. This clock drives the audio codec core (the digital parts).
  • LRCLK - Also called WS (word select). It defines which channel the audio data belongs to.
  • BCLK - Also called CLK, SCK or BCK. It is the bit clock.
  • DOUT, DIN - The data output and input pins respectively.
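The role of LRCLK can be sketched by listing the order in which stereo samples leave the data pin. This assumes the standard I2S convention where word select low marks the left channel and high marks the right channel; some codecs support other formats, and the function name is made up for illustration.

```python
def i2s_frames(stereo_samples):
    """Yield (ws_level, sample) pairs in the order they go on the wire.

    Assumes standard I2S framing: WS = 0 for left, WS = 1 for right.
    """
    for left, right in stereo_samples:
        yield 0, left   # left channel word while WS is low
        yield 1, right  # right channel word while WS is high


samples = [(100, -100), (200, -200)]
for ws, word in i2s_frames(samples):
    print(ws, word)
```

Each word is then shifted out one bit per BCLK cycle, so BCLK runs at the word width times two channels times the sample rate.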

The subtle design mistakes

What could go wrong when interfacing an audio codec with the ESP32? A lot of things! Here are some common mistakes to watch out for:

  • For some codecs, MCLK and all I2S signals must have the same phase (all I2S bus signals must rise and fall with the rise and fall of the MCLK).
    On the ESP32, routing signals through the GPIO matrix adds delay, which changes their phase with respect to MCLK. This can result in some terrible audio, although sometimes you may get lucky.
  • The range of MCLK is limited. You cannot just send 40MHz down that pin and expect the codec to work!
  • If you are not playing or recording audio, mute/power down the codec. The BCK line can pick up noise and clock in some garbage which comes straight out the headphones. Ugly!
  • Route I2S lines away from MCLK.
  • Initialize MCLK first, I2S second and I2C last. This is because many codecs will be non-responsive to I2C commands without MCLK.
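The MCLK range check and the bring-up order from the list above can be sketched as a small guard class. This is a model for reasoning about the sequence, not driver code: the class and method names are hypothetical, and the 8 MHz and 27 MHz limits are assumed values that should be replaced with the figures from your codec's datasheet.

```python
class CodecBringUp:
    """Tracks bring-up steps and rejects them when they happen out of order."""

    def __init__(self, mclk_min_hz, mclk_max_hz):
        self.mclk_min_hz = mclk_min_hz
        self.mclk_max_hz = mclk_max_hz
        self.done = []

    def start_mclk(self, mclk_hz):
        # The codec only accepts a limited MCLK range
        if not self.mclk_min_hz <= mclk_hz <= self.mclk_max_hz:
            raise ValueError(f"MCLK {mclk_hz} Hz is outside the codec's range")
        self.done.append("mclk")

    def start_i2s(self):
        if "mclk" not in self.done:
            raise RuntimeError("start MCLK before I2S")
        self.done.append("i2s")

    def configure_over_i2c(self):
        # Many codecs ignore I2C commands until MCLK is running
        if "mclk" not in self.done or "i2s" not in self.done:
            raise RuntimeError("start MCLK and I2S before I2C configuration")
        self.done.append("i2c")


# Assumed limits for illustration; take the real ones from the datasheet
codec = CodecBringUp(mclk_min_hz=8_000_000, mclk_max_hz=27_000_000)
codec.start_mclk(11_289_600)
codec.start_i2s()
codec.configure_over_i2c()
print(codec.done)  # ['mclk', 'i2s', 'i2c']
```

In firmware, the same ordering would apply to the calls that enable the clock output, install the I2S driver, and send the first I2C register writes.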

Still stuck? We can always provide consulting help with audio systems design based on the ESP32 or ESP8266.

An interesting fact

The ESP8266 from Espressif Systems also integrates an I2S peripheral very similar to the one in the ESP32. The ESP8266 I2S is backed by a powerful linked-list DMA engine and can play or record 96 kHz audio with no issues at all. We have tried playing WAV files off a FAT32-formatted SD card (connected to HSPI) and it has performed quite well!

In addition, the ESP8266 contains a separate clock source just for I2S and can output a reliable clock stream from its CLK_OUT pin to support codecs that need MCLK.

A $10 web radio really is possible!
