Clap switches are an old automation gimmick from the 1980s: the circuit hears a noise above a given volume (amplitude) and activates.
This project is an attempt to refine that idea, and create a switch that does the same but responds only to a specific sound (frequency).
This was inspired by watching my partner fail miserably at playing South Park - The Stick of Truth.
During battles in the game, there is a timed button press during an attack that increases damage. Timing these button presses was not going well...
The game plays a noise at the time the button press is required, so I started thinking about how that noise could be listened for, and the button press triggered.
Band-pass filters
This system relies on the use of a band-pass filter. There are plenty of explanations elsewhere of what these are and how they work, so I won't reinvent the wheel here.
For the purposes of this project, the key point is that if the input has a frequency between the low and high thresholds, it is allowed through. Frequencies outside of that band are rejected.
These can be built in hardware as circuits, but also in software on regular computers.
Fast Fourier Transform (FFT)
This is a well-known algorithm that, in simplistic terms, takes input over time (such as audio), and breaks it down into the frequencies it's composed of.
I'll admit, my understanding of FFTs is similar to the relationship most people have with their household appliances - know how to use it, but can't really explain how it works under the hood. There's plenty of detailed explanation for the more mathematically inclined.
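As a concrete (if contrived) illustration of what 'converting to the frequency domain' means, here's a small sketch using the FFT from the Apache Commons Math library (the same library used later in this project). It feeds in one frame of a pure 1 kHz tone and reports which frequency bin holds the energy - it's illustrative only, not part of the project code.

import org.apache.commons.math3.complex.Complex;
import org.apache.commons.math3.transform.DftNormalization;
import org.apache.commons.math3.transform.FastFourierTransformer;
import org.apache.commons.math3.transform.TransformType;

public class FftDemo {
    public static void main(String[] args) {
        int n = 1024;               // frame length (Commons Math needs a power of two)
        double sampleRate = 8000.0; // samples per second
        double toneHz = 1000.0;     // frequency of the test tone

        // One frame of a pure 1 kHz sine wave
        double[] samples = new double[n];
        for (int i = 0; i < n; i++) {
            samples[i] = Math.sin(2 * Math.PI * toneHz * i / sampleRate);
        }

        // Forward FFT: time domain -> frequency domain
        FastFourierTransformer fft = new FastFourierTransformer(DftNormalization.STANDARD);
        Complex[] spectrum = fft.transform(samples, TransformType.FORWARD);

        // Find the loudest bin (only the first half of the spectrum is unique)
        int peakBin = 1;
        for (int i = 1; i < n / 2; i++) {
            if (spectrum[i].abs() > spectrum[peakBin].abs()) {
                peakBin = i;
            }
        }
        // Each bin covers sampleRate / n Hz, so this prints roughly 1000 Hz
        System.out.println("Peak at ~" + (peakBin * sampleRate / n) + " Hz");
    }
}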
So the basic concept is this (a rough code sketch of these steps follows the list):
- Pipe the audio input into the FFT
- The FFT converts it into the frequency domain
- Zero out all the values for the frequencies that fall outside of the range we're 'listening' for.
- Do an inverse FFT transform, which turns the frequency domain data back into time-domain (i.e. back to real audio). This gives us sound where everything except for the range we're listening for is muted.
- This can then be passed into a regular 'clap-switch', where we trigger if the volume of the sound is above a given level.
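To make those steps concrete, here's a rough sketch of how they might look using the Apache Commons Math library mentioned in the Software section below. This isn't the project's actual code (that's on GitHub) - the class and method names, and the RMS-based volume check, are just one way of doing it.

import org.apache.commons.math3.complex.Complex;
import org.apache.commons.math3.transform.DftNormalization;
import org.apache.commons.math3.transform.FastFourierTransformer;
import org.apache.commons.math3.transform.TransformType;

public class FrequencySwitch {

    // Steps 1-4: FFT the frame, zero the bins outside [lowHz, highHz],
    // then inverse-FFT back to audio. Frame length must be a power of two.
    static double[] bandPass(double[] frame, double sampleRate, double lowHz, double highHz) {
        FastFourierTransformer fft = new FastFourierTransformer(DftNormalization.STANDARD);
        Complex[] spectrum = fft.transform(frame, TransformType.FORWARD);

        double binWidth = sampleRate / frame.length;
        for (int i = 0; i < spectrum.length; i++) {
            // Bin i and its mirror (length - i) represent the same frequency,
            // so take the distance to whichever end of the array is nearer
            double freq = Math.min(i, spectrum.length - i) * binWidth;
            if (freq < lowHz || freq > highHz) {
                spectrum[i] = Complex.ZERO; // mute this frequency
            }
        }

        // Back to the time domain - everything outside the band is now silent
        Complex[] timeDomain = fft.transform(spectrum, TransformType.INVERSE);
        double[] filtered = new double[frame.length];
        for (int i = 0; i < frame.length; i++) {
            filtered[i] = timeDomain[i].getReal();
        }
        return filtered;
    }

    // Step 5: the 'clap switch' part - trigger if the filtered frame is loud enough
    static boolean triggered(double[] filtered, double threshold) {
        double sumOfSquares = 0;
        for (double s : filtered) {
            sumOfSquares += s * s;
        }
        return Math.sqrt(sumOfSquares / filtered.length) > threshold;
    }
}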
Finding the target frequency
This part of the process can be trickier than it initially seems. Sounds are composed of many frequencies, so it is necessary to select a frequency range that is unique to the target part of the audio.
To start with, I extracted the audio from the above video clip using FFMPEG, and opened it up in Audacity. This initially shows the audio waveform.
Select the area containing the sound, then choose Analyze, Plot Spectrum.
This will show the frequencies that exist within the selection. However, this doesn't give us all the answers. Save the plot (I just took a screenshot and used that). Then select other parts of the audio and repeat the exercise. Then basically it's a case of spot-the-difference, looking for a frequency spike that appears in our target audio but not in the other samples.
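If you'd rather do the spot-the-difference in code than by eye, one approach (again just a sketch, not part of the project code) is to take the magnitude spectra of the target sound and of another part of the clip, and print the bins where the target clearly stands out:

public class SpectrumDiff {
    // 'target' and 'other' are magnitude spectra of the same length
    // (e.g. Complex.abs() of each FFT bin, as in the earlier example),
    // 'binWidthHz' is sampleRate / frameLength, and 'ratio' is how much
    // louder a bin must be in the target before it counts as a candidate
    static void printCandidates(double[] target, double[] other, double binWidthHz, double ratio) {
        for (int i = 0; i < target.length; i++) {
            if (target[i] > other[i] * ratio) {
                System.out.printf("%.0f Hz: target %.3f vs other %.3f%n",
                        i * binWidthHz, target[i], other[i]);
            }
        }
    }
}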
Hardware
To run this switch I'm going to use the Next Thing Co CHIP. This is the same system I used for the TV desk project years ago. Unfortunately these are now discontinued, but there are still many ARM-based SoCs running Linux out there.
Potentially this could be distilled down further onto a smaller microcontroller, although I'd have reservations about how far you could cut the resources before the processing time introduces enough lag to make it too slow to use.
As well as effectively being a 'proper' computer, the CHIP has general purpose IO (GPIO) pins, like most microcontrollers. These provide the interface for the output of the 'clap' switch. In this case, for the sake of example, I'm just hooking up a simple LED that will blink on detection of the given sound.
Although the CHIP does have microphone pins and the ability to switch its video pin from the jack to be an audio input, for the sake of prototyping I found it much easier to just use a cheap USB adapter which has microphone and headphone sockets.
Power comes from a standard USB phone charger.
Control is done via a serial connection to my PC, using Minicom.
Software
As the CHIP runs a full-blown Linux distribution, there's a lot more flexibility in the software that we can use. I ended up using Java and the Apache Commons Math library.
The basic OS was pre-installed; the Java 8 JDK was installed from here.
The Java code listens to the microphone input, and allows the user to load a JSON file containing details of the filter to apply - this made testing and refining the filter easier. The values are the start and end of the frequency range to listen for, and the amplitude threshold at which to trigger the output (this figure can be a bit of trial and error based on how loud the input is, and can vary if the input volume varies).
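Purely as an illustration (the real field names and format are in the repository), such a filter file might look something like this:

{
  "lowFrequencyHz": 2300,
  "highFrequencyHz": 2500,
  "amplitudeThreshold": 0.4
}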
It can also be controlled via the command line to either pass the audio through to the headphones as-is, or post-filter - i.e. you hear what the 'clap' circuit would hear.
Code is on GitHub here.
Configuration
Some configuration was required to enable the GPIO pins to be activated on boot. To do this, the necessary commands (below) are wrapped in the bash script
/etc/init.d/preparegpio.sh:
# Expose GPIO 1023 to userspace via sysfs
echo 1023 > /sys/class/gpio/export
# Set the pin as an output
echo out > /sys/class/gpio/gpio1023/direction
# Start with the output low (LED off)
echo 0 > /sys/class/gpio/gpio1023/value
# Let the non-root 'chip' user toggle the pin
chown chip:chip /sys/class/gpio/gpio1023/value
(Refer to the CHIP docs for what exactly these mean)
This is set to run on boot by adding the below line to /etc/rc.local
sh /etc/init.d/preparegpio.sh
Finally, to blink the LED, the Java code calls out to another shell script. I went this route as I intend to develop the Java code into a more general-purpose audio tool, so didn't want to tie the code too closely to the hardware I'm using for this project.
It also has the benefit that the audio processing doesn't wait for the GPIO operation to complete, thus reducing the lag (a rough sketch of how the Java side can launch the script without blocking follows the script below).
~/triggergpio.sh
# Drive the pin high (LED on)
echo 1 > /sys/class/gpio/gpio1023/value
# Hold for half a second
sleep .5
# Drive the pin low again (LED off)
echo 0 > /sys/class/gpio/gpio1023/value
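For reference, here's a minimal sketch of how the Java side might fire that script without blocking the audio thread - the absolute path and class name are assumptions, not the actual project code:

import java.io.IOException;

public class GpioTrigger {
    // Launch the trigger script and return immediately - there's no waitFor(),
    // so the audio processing loop carries on while the LED blinks
    public static void trigger() {
        try {
            // Assumes the script sits in the 'chip' user's home directory;
            // ProcessBuilder doesn't expand '~', so an absolute path is used
            new ProcessBuilder("sh", "/home/chip/triggergpio.sh").start();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}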
Testing
While it certainly does respond to the input audio as expected, there is definitely some processing lag, as can be seen as the video progresses.
This isn't entirely unexpected, and could be overcome by throwing more computing power at the processing (the CHIP has a 1GHz processor), or possibly by further optimising the code (or porting it to C or similar).
That might be the subject of a follow up project at a later date, but for now, this demonstrates the idea.