In collaboration with Dr. Scott Hawley, Physics Professor at Belmont University, software company Art+Logic has unveiled Vibrary, the first project to come out of its incubator lab. Vibrary uses machine learning to analyse short samples and loops. Its design makes it easy for producers, composers, and musicians to train their own models and classify sounds by genre, feel, or other characteristics defined by the user's needs and preferences.
The open-source AI tool features a helpful interface to make training algorithms accessible to anyone with a computer, internet connection, and a massive sound library.
“Much of the technology involved is straightforward, but what is unique about us is that we built a user-friendly utility that lets people train their own AI,” explains Hawley.
Before Art+Logic embraced the project for its incubator initiative, Hawley had been experimenting with algorithms and audio files for years, a hobby of sorts for the busy physics professor and musician. He dug into the work of top researchers using spectrograms, visual representations of sound, as objects for algorithmic classification. This approach enabled him to find and tag sounds in a massive, confusingly labelled sample and patch library, something producers and composers often struggle with.
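The spectrogram trick can be sketched in a few lines of Python: render a clip as a time-frequency image, and suddenly the well-developed toolbox of image classification applies to audio. The library choices, parameters, and synthetic test tone below are illustrative assumptions, not Vibrary's actual code.

```python
import numpy as np
from scipy.signal import spectrogram

sample_rate = 22050
duration = 1.0  # one-second clip
t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
clip = np.sin(2 * np.pi * 440 * t)  # stand-in for a loaded sample: a pure A440 tone

# Convert the waveform into a time-frequency image.
freqs, times, sxx = spectrogram(clip, fs=sample_rate, nperseg=512)
image = 10 * np.log10(sxx + 1e-10)  # log-power "picture" of the sound

# `image` is now a 2-D array (frequency bins x time frames) that an
# off-the-shelf image classifier could be trained on.
print(image.shape)
```

For the pure tone above, the brightest row of the image sits at the frequency bin nearest 440 Hz, which is exactly the kind of visual signature a classifier learns to pick out.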
He dubbed his experiments Panotti, named for the big-eared people of medieval legend, and eventually he took his prototypes to Nashville’s ASPIRE Research Co-op, a gathering dedicated to audio innovation. He and his fellow researchers worked to improve Panotti.
There was a problem, however. Hawley’s model gave good results, but it was a pain to set up and train. Hawley imagined a better way, one accessible to audio pros, and Art+Logic helped him find it, creating a simple, attractive interface. This interface ensures Vibrary leaps past a major sticking point for many specialised machine learning projects: Domain experts aren’t data scientists, and data scientists may have no clue how domain experts use or perceive the data. Vibrary empowers audio pros to build their own AI without a data-science background.
“Scott’s interface was built in Python, but it wasn’t something the overwhelming majority of users would have been able to configure,” says Jason Bagley, senior software developer at Art+Logic, himself an electronic musician. “We wanted something someone could download and start running immediately. We simplified things, automating a lot of processes. I came up with a user flowchart for training and categorisation. My colleague Daisey Traynham turned the flowchart into an interface that is simple to use, hiding as much of the complexity as possible.”
Data can present another hurdle when non-experts train a machine learning model. “Data scientists have to be careful in creating their data set. If you’re a producer, you don’t think like a data scientist. You may not get why your data are causing model prediction errors,” reflects Hawley. “We had to guide end users toward data and approaches that are helpful to them.” For example, to adequately train a model, users need to tag a large number of files with a shared characteristic to ensure accuracy. “We did a good job of making this difficult step as simple as it could be,” says Bagley.
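The tagging step Hawley describes amounts to building a labelled training set. One common convention, sketched below as a hypothetical example (the folder layout and helper names are assumptions, not Vibrary's actual data format), is to treat folder names as tags and check that each tag has enough examples:

```python
from pathlib import Path

def gather_labelled_files(library_root):
    """Map each tag (sub-folder name) to the audio files inside it."""
    dataset = {}
    for tag_dir in sorted(Path(library_root).iterdir()):
        if tag_dir.is_dir():
            dataset[tag_dir.name] = sorted(tag_dir.glob("*.wav"))
    return dataset

def tags_with_enough_examples(dataset, minimum=50):
    """A model usually needs many examples per tag to generalise well."""
    return [tag for tag, files in dataset.items() if len(files) >= minimum]
```

A check like `tags_with_enough_examples` is the sort of guidance the team means: warning a producer that a tag with only a handful of files will likely cause the prediction errors Hawley mentions.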
After some soul searching, the Art+Logic team decided that the best way to get Vibrary out to users was to make it open source, rather than trying to launch a full-blown commercial product. Bagley and Hawley imagine many use cases for Vibrary far from the studio, including counting bird species by calls in a forest and diagnosing motors or devices by sound. “We knew this could be very useful to people, and we didn’t want to limit the features or define a single direction for the software,” Bagley notes. “As open-source software, we can get it into people’s hands and they can get creative.”