IoT Project: OSD335x Powered Alexa Smart Speaker

 

The application engineering team at Octavo Systems is always looking for cool ways to showcase the distinct and creative ways our devices can be used. Recently, we built a tweeting Rubik’s cube solver to showcase some of the IoT and industrial control capabilities of the OSD335x Family of System-in-Package (SiP) devices. It was amazing how easy it was to give the robot a voice and let it talk to the world via social media, so we decided to explore more IoT capabilities that the OSD335x Family can implement without a lot of development overhead.

This time, we wanted to incorporate human interaction into the device. What better opportunity than to explore America’s favorite IoT phenomenon, Alexa? So, we whipped out a BeagleBoard.org® PocketBeagle® and got to work. Our vision for this IoT project is bigger than just implementing Amazon’s Alexa Voice Service on the PocketBeagle, but this is a good place to take a break and showcase a simple, step-by-step way to turn your PocketBeagle into a “man’s best friend” personal assistant. Check out the Hackster.io page for the steps.

The Hardware

PocketBeagle is a tiny $25 Linux PC from our friends at BeagleBoard.org. It’s a great board with a lot of flexibility, based on the Octavo Systems OSD3358-SM System-in-Package device. PocketBeagle easily runs Linux and has 72 exposed pins that provide an extensive list of peripheral interfaces. It is a perfect foundation to build on while avoiding unnecessary clutter. The plan was to start with the PocketBeagle and add plug-and-play, off-the-shelf hardware components for the initial “Alexa” prototype.

The Software

The software decisions on a project often drive the hardware choices. The Linux environment makes it easier to integrate and use the complex software tools needed to implement Alexa. Also, the open-source community can serve as a resource to jump-start the project. For this project, reusing code from Franklin Cooper’s BeagleAlexa Hackster project allowed us to get to the finish line in less than a week!

Linux Sound Architecture

ALSA Software Framework for Audio

Within Linux, ALSA (Advanced Linux Sound Architecture) is a software framework that provides an application interface for audio drivers. It abstracts physical sound cards, presenting each component of a sound card as a device or a sub-device. ALSA exposes its operational interface through device nodes under the /dev/ folder and reports card and device information in the /proc/asound/ folder. Recording and playback devices show up as PCM devices. The amixer command can be used to adjust the available controls for each sound card, such as volume and gain. If multiple audio devices are connected to the processor, the default PCM card can be selected in the configuration file /usr/share/alsa/alsa.conf.
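As a rough illustration of the ALSA interfaces described above, the following Python sketch lists the sound cards reported in /proc/asound/cards and builds an amixer command line for one of them. The helper names (parse_cards, amixer_set_command), the card number, and the control name "Mic" are assumptions made for this example and are not taken from the project; the controls that actually exist depend on the hardware.

```python
import re
from pathlib import Path
from typing import List, NamedTuple


class SoundCard(NamedTuple):
    index: int    # card number as assigned by ALSA
    card_id: str  # short identifier shown in square brackets
    driver: str   # driver name, e.g. "USB-Audio"
    name: str     # human-readable card name


# Matches the first line of each entry in /proc/asound/cards, for example:
#  1 [Device         ]: USB-Audio - USB PnP Sound Device
_CARD_LINE = re.compile(r"^\s*(\d+)\s+\[(.+?)\s*\]:\s+(\S+)\s+-\s+(.*)$")


def parse_cards(text: str) -> List[SoundCard]:
    """Extract one SoundCard per entry; continuation lines are skipped."""
    cards = []
    for line in text.splitlines():
        match = _CARD_LINE.match(line)
        if match:
            index, card_id, driver, name = match.groups()
            cards.append(SoundCard(int(index), card_id, driver, name.strip()))
    return cards


def amixer_set_command(card: int, control: str, percent: int) -> List[str]:
    """Build (but do not run) an amixer command that sets a control level."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return ["amixer", "-c", str(card), "sset", control, f"{percent}%"]


if __name__ == "__main__":
    cards_file = Path("/proc/asound/cards")
    if cards_file.exists():
        for card in parse_cards(cards_file.read_text()):
            print(card)
    # "Mic" is a hypothetical control name; the command is only printed here.
    print(" ".join(amixer_set_command(1, "Mic", 80)))
```

The command is returned as a list so it can be passed to subprocess.run once the card number and control name have been checked against the real hardware.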

PulseAudio for Bluetooth

The intention for this project was to use a generic Bluetooth speaker as the sound interface. To send and receive audio via Bluetooth, we use PulseAudio, a sound server that sits between the hardware and the application. Once PulseAudio is installed on the system, it is responsible for directing audio through channels configured using the pactl command. Another important command is pacmd, which provides options for setting different modes for the Bluetooth interface. Furthermore, application packages like PyAudio for Python can be used to interact with ALSA to make recording and playback easy.

On a side note, due to software dependencies, multiple software packages might need to be installed before an application like PulseAudio will function as intended. The package manager, apt, handles this automatically and installs all of the required packages before it installs the desired application. However, because there are multiple ways to install things like Python libraries, differences between software packages can cause issues on a particular Linux installation.
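To make the pactl routing step more concrete, here is a minimal Python sketch that finds a Bluetooth sink in the output of `pactl list short sinks` and builds the `pactl set-default-sink` command for it. The function names are hypothetical, and the sketch assumes that PulseAudio's Bluetooth module gives its sinks names beginning with "bluez_sink."; it is not the code used in the project.

```python
import subprocess
from typing import List, Optional


def parse_short_sinks(output: str) -> List[str]:
    """Return sink names from `pactl list short sinks` (tab-separated columns)."""
    names = []
    for line in output.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2:
            names.append(fields[1])  # column 0 is the index, column 1 the name
    return names


def find_bluetooth_sink(sink_names: List[str]) -> Optional[str]:
    # Assumption: Bluetooth sinks are named "bluez_sink.<address>...".
    for name in sink_names:
        if name.startswith("bluez_sink."):
            return name
    return None


def set_default_sink_command(sink_name: str) -> List[str]:
    """Build the pactl command that makes sink_name the default output."""
    return ["pactl", "set-default-sink", sink_name]


def route_audio_to_bluetooth() -> bool:
    """Make the first Bluetooth sink the default.

    Requires a running PulseAudio server and a connected speaker,
    so it is defined here but not called.
    """
    listing = subprocess.run(
        ["pactl", "list", "short", "sinks"],
        capture_output=True, text=True, check=True,
    ).stdout
    sink = find_bluetooth_sink(parse_short_sinks(listing))
    if sink is None:
        return False
    subprocess.run(set_default_sink_command(sink), check=True)
    return True
```

Keeping the parsing separate from the subprocess calls means the logic can be checked against captured pactl output on a desktop before it is tried on the board.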

Bluetooth audio devices can be configured to use a couple of different Bluetooth profiles. The A2DP profile allows one-way, high-quality sound transfer from the processor to the speaker, while the Hands-Free Profile (HFP) and Headset Profile (HSP) are used for two-way audio transfer so that audio can be both played and recorded through the device. Unfortunately, the drivers that support HSP and HFP in PulseAudio still have some issues, which can cause the PulseAudio server to break or become unreliable. Therefore, even though many Bluetooth speakers also contain microphones, we will use a separate microphone to ensure reliable audio recording.
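The profile choice can be sketched in Python as a command builder for `pactl set-card-profile`. The profile names "a2dp_sink" and "headset_head_unit" and the "bluez_card.<address>" naming are assumptions based on common PulseAudio setups and may differ between versions; the mode labels, function names, and MAC address in the usage example are hypothetical.

```python
import re
from typing import List

# Assumed PulseAudio profile names; check `pactl list cards` on the target system.
PROFILES = {
    "playback": "a2dp_sink",         # one-way, high-quality audio (A2DP)
    "two_way": "headset_head_unit",  # play and record (HSP/HFP)
    "off": "off",                    # disable audio on the device
}

_MAC = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$")


def bluez_card_name(mac_address: str) -> str:
    """Turn a MAC address like 00:11:22:AA:BB:CC into a PulseAudio card name."""
    if not _MAC.match(mac_address):
        raise ValueError(f"not a MAC address: {mac_address!r}")
    return "bluez_card." + mac_address.upper().replace(":", "_")


def set_profile_command(mac_address: str, mode: str) -> List[str]:
    """Build (but do not run) the pactl command that selects a profile."""
    if mode not in PROFILES:
        raise ValueError(f"unknown mode: {mode!r}")
    return [
        "pactl", "set-card-profile",
        bluez_card_name(mac_address), PROFILES[mode],
    ]


if __name__ == "__main__":
    # Hypothetical speaker address; A2DP playback matches the setup described
    # here, where recording is handled by a separate microphone.
    print(" ".join(set_profile_command("00:11:22:AA:BB:CC", "playback")))
```

Selecting the playback mode keeps the speaker on A2DP, which fits the decision to avoid HSP and HFP and record through a separate microphone.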

Bringing it Together

For fast prototyping, we chose a quality off-the-shelf USB microphone. Although it was a little bigger than we would have liked, it worked really well, and audio recordings were clear and intelligible even from far away. To make sure that the microphone and the Bluetooth+WiFi dongle, which was used to connect to the speaker and the internet, had enough power, we added a powered USB hub. As a side benefit, the hub could also supply power to the PocketBeagle, which meant we needed only one power plug.

PocketBeagle Alexa built with Plug-and-Play peripherals

 

As part of the smart speaker functionality, the project implements a keyword detection mechanism. This opens up a lot of fun human interaction and allows the smart speaker to extract and decipher voice commands and take actions accordingly. We had a lot of fun interacting with our “Dog” (aptly named by Franklin) using some of the 30,000 skills from the Alexa collection. Our next steps on the project will be to focus on adding physical components to implement more smart device functionality. PocketBeagle has many standard Input/Output interfaces that can be used to interact with sensors, actuators and the cloud. For now, we have taken the first step towards a robot butler that we desperately want to do our chores!

We would love to get your feedback on cool IoT projects you’d like to see built with an OSD335x-based board.