
In this article, I would like to talk about capturing audio from a microphone. Some of the applications and games we create use the microphone, e.g. for chat in a multiplayer game or for voice control. I will also show how to play back the captured audio, and the steps needed to check whether a microphone is present and what its recording capabilities are. This seemingly complex feature is very simple to implement. So let's get to work.


Let's start


To capture the audio input from a microphone in Unity, you can call the static Start() method of the Microphone class. This method returns an AudioClip that can be played back using an AudioSource in the Scene. However, to avoid exceptions being thrown, the script first verifies that a microphone is present before calling this method, and it also checks the microphone’s audio capture capabilities.


So, here’s our script:

using UnityEngine;  
using System.Collections;  
  
[RequireComponent (typeof (AudioSource))]  
public class MicrophoneCapture : MonoBehaviour   
{  
    //Flag that indicates whether a microphone is connected
    private bool micConnected = false;  
  
    //The maximum and minimum available recording frequencies  
    private int minFreq;  
    private int maxFreq;  
  
    //A handle to the attached AudioSource  
    private AudioSource goAudioSource;  
   
    void Start()   
    {  
        //Check if there is at least one microphone connected  
        if(Microphone.devices.Length <= 0)  
        {  
            //Log a warning message to the console if there isn't
            Debug.LogWarning("Microphone not connected!");  
        }  
        else //At least one microphone is present  
        {  
            //Set our flag 'micConnected' to true  
            micConnected = true;  
  
            //Get the default microphone recording capabilities  
            Microphone.GetDeviceCaps(null, out minFreq, out maxFreq);  
  
            //According to the documentation, if minFreq and maxFreq are zero, the microphone supports any frequency...  
            if(minFreq == 0 && maxFreq == 0)  
            {  
                //...meaning 44100 Hz can be used as the recording sampling rate  
                maxFreq = 44100;  
            }  
  
            //Get the attached AudioSource component  
            goAudioSource = this.GetComponent<AudioSource>();  
        }  
    }  
  
    void OnGUI()   
    {  
        //If there is a microphone  
        if(micConnected)  
        {  
            //If the audio from any microphone isn't being captured  
            if(!Microphone.IsRecording(null))  
            {  
                //If the 'Record' button gets pressed
                if(GUI.Button(new Rect(Screen.width/2-100, Screen.height/2-25, 200, 50), "Record"))  
                {  
                    //Start recording and store the audio captured from the microphone at the AudioClip in the AudioSource  
                    goAudioSource.clip = Microphone.Start(null, true, 20, maxFreq);  
                }  
            }  
            else //Recording is in progress  
            {  
                //If the 'Stop and Play!' button gets pressed
                if(GUI.Button(new Rect(Screen.width/2-100, Screen.height/2-25, 200, 50), "Stop and Play!"))  
                {  
                    Microphone.End(null); //Stop the audio recording  
                    goAudioSource.Play(); //Playback the recorded audio  
                }  
  
                GUI.Label(new Rect(Screen.width/2-100, Screen.height/2+25, 200, 50), "Recording in progress...");  
            }  
        }  
        else // No microphone  
        {  
            //Print a red "Microphone not connected!" message at the center of the screen  
            GUI.contentColor = Color.red;  
            GUI.Label(new Rect(Screen.width/2-100, Screen.height/2-25, 200, 50), "Microphone not connected!");  
        }  
  
    }  
}  

Script explanation

To make our script work, we had to declare a few variables to store data: a boolean, a pair of integers and an AudioSource. The boolean flag tells us whether there’s a microphone attached or not. The two integers (minFreq, maxFreq) store the minimum and maximum available sampling frequencies for audio recording with the default microphone. Lastly, the AudioSource object is responsible for playing back the recorded audio.


The Start() method is basically composed of a single if statement that tests the length of the static string array named devices in the Microphone class. A length equal to zero means that there is no connected microphone. In this case, a warning message is printed to the console. Otherwise, if the length of the devices array is greater than zero, the audio recording and playback variables are initialized.


The initialization begins with a call to the static method GetDeviceCaps() from the Microphone class. This method gets the audio recording capabilities of the device’s microphone. It receives three parameters: a string with the microphone name and two integers for the minimum and maximum frequencies. In the above script, null has been passed as the first parameter because, according to the documentation, “[…] Passing null or an empty string will pick the default device”. If there’s just a single microphone attached to the computer, this won’t be a problem. Nevertheless, to target a specific microphone, you just need to pass its name as this first parameter.
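
To find out which names can be passed as that first parameter, the devices array can be inspected. The following is a minimal sketch, not part of the original script: the MicrophoneLister class is a hypothetical helper component that logs every detected microphone together with the frequencies it reports.

```csharp
using UnityEngine;

// Hypothetical helper component, not part of the original script.
// Attach it to any GameObject to log the detected microphones.
public class MicrophoneLister : MonoBehaviour
{
    void Start()
    {
        // Microphone.devices holds the names of all detected microphones
        foreach (string deviceName in Microphone.devices)
        {
            int minFreq;
            int maxFreq;

            // Query the capabilities of this specific device by its name
            Microphone.GetDeviceCaps(deviceName, out minFreq, out maxFreq);

            Debug.Log(deviceName + ": minFreq=" + minFreq + ", maxFreq=" + maxFreq);
        }
    }
}
```

Any name printed this way can be passed instead of null to the Microphone methods used in the script, as long as the same name is used for every call.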


Still on the GetDeviceCaps() method, the second and third parameters are, respectively, the minimum and maximum recording frequencies supported by the device. The method takes these two parameters with the out modifier, meaning that the method itself assigns minFreq and maxFreq, and the assigned values are available to the caller after the method has finished its execution. Again, quoting the documentation, if the minimum and maximum frequency values are both zero, the microphone supports any recording sampling frequency. In that case, the script sets maxFreq to 44100 Hz.
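
The script above always records at maxFreq. As a variation, the choice of sampling rate can be isolated in a small function that keeps a preferred rate whenever the device allows it. This is only a sketch under that assumption: SampleRatePicker, Choose and the preferred parameter are hypothetical names, and the code is plain C# with no dependency on Unity.

```csharp
using System;

// Hypothetical helper, not part of the original script.
public static class SampleRatePicker
{
    // Picks a sampling rate from the values reported by
    // Microphone.GetDeviceCaps() and a preferred rate in Hz.
    public static int Choose(int minFreq, int maxFreq, int preferred)
    {
        // Both values are zero: the device supports any sampling rate
        if (minFreq == 0 && maxFreq == 0)
        {
            return preferred;
        }

        // Otherwise keep the preferred rate inside the reported range
        return Math.Max(minFreq, Math.Min(maxFreq, preferred));
    }
}
```

For example, SampleRatePicker.Choose(minFreq, maxFreq, 44100) could be passed as the last argument of Microphone.Start() in place of maxFreq.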


As previously mentioned, the Microphone.Start() method is the one responsible for capturing the audio from the microphone. The first parameter takes a string with the name of the microphone we wish to record from. Again, passing null will make Unity pick the default microphone. The second parameter takes a boolean that indicates whether the recording should continue once the length defined by the third parameter is reached. Passing true, as in the above script, makes the audio recording wrap around the length of the audio clip and continue from the beginning. The third parameter is the length, in seconds, of the AudioClip this method returns. The fourth parameter is the sampling frequency of the recording, for which the script passes maxFreq. To put it simply, in this script, if a recording runs for 23 seconds, the final 3 seconds will be recorded over the initial 3 seconds. So, the clip never holds more than 20 seconds of audio in this script. The AudioClip returned by this method is assigned as the AudioClip of the goAudioSource.
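
The arithmetic behind the 23-second example can be sketched in plain C#, independent of Unity. The class and method names below are hypothetical and only illustrate how a looping recording wraps around the clip length.

```csharp
using System;

// Hypothetical helper that illustrates the wrap-around arithmetic.
public static class LoopRecordingMath
{
    // Position inside the clip, in seconds, that is currently being written
    public static int WritePositionSeconds(int elapsedSeconds, int clipLengthSeconds)
    {
        if (clipLengthSeconds <= 0)
        {
            throw new ArgumentOutOfRangeException("clipLengthSeconds");
        }

        return elapsedSeconds % clipLengthSeconds;
    }

    // Seconds of recorded audio the clip holds: never more than its length
    public static int RetainedSeconds(int elapsedSeconds, int clipLengthSeconds)
    {
        return Math.Min(elapsedSeconds, clipLengthSeconds);
    }
}
```

With the values from the script, LoopRecordingMath.WritePositionSeconds(23, 20) is 3, which means the first 3 seconds of the clip have been overwritten, and LoopRecordingMath.RetainedSeconds(23, 20) is 20.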


If there’s at least one microphone and audio is being recorded, the ‘Stop and Play!’ button is rendered instead of the ‘Record’ button. Pressing it stops the audio recording and plays it back. Just like the other static methods of the Microphone class, the End() method takes a string that tells Unity for which microphone the recording should be stopped. It also accepts null as a parameter to select the default microphone.

The above solution works on iOS and Android mobile devices and in Windows and Mac OS standalone applications.