What is Audio Mining?

H. Bliss

Last Modified Date: February 16, 2024

Audio mining is usually used in speech recognition software and music analysis. This technology gives the user the ability to search through speech or music audio that has been analyzed for specific characteristics. When used in speech recognition technology, audio mining identifies spoken words in the audio and puts them in a searchable file. This feature can be useful for students or those in the business world who attend many meetings because it allows the user to more easily browse topical information from speech presentations. This type of analysis can also be used in music to determine characteristics like beats per minute (BPM), musical key, and musical structure, information that is employed to classify music.
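The "searchable file" idea can be illustrated with a small sketch. It assumes a speech recognizer has already produced a list of words with the time at which each was spoken; the function names `build_index` and `search` and the sample transcript are hypothetical and do not come from any particular product.

```python
from collections import defaultdict


def build_index(timed_words):
    """Map each lowercased word to the times (in seconds) at which it was spoken."""
    index = defaultdict(list)
    for word, start_seconds in timed_words:
        index[word.lower()].append(start_seconds)
    return dict(index)


def search(index, query):
    """Return every time the query word was spoken, or an empty list."""
    return index.get(query.lower(), [])


# Hypothetical recognizer output: (word, start time in seconds).
transcript = [
    ("Budget", 12.4), ("review", 12.9), ("starts", 13.3), ("now", 13.6),
    ("the", 95.0), ("budget", 95.2), ("is", 95.6), ("approved", 95.9),
]

index = build_index(transcript)
print(search(index, "budget"))  # [12.4, 95.2]
```

With an index like this, a listener could jump straight to the 12.4-second and 95.2-second marks of a meeting recording instead of replaying the whole file.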

In speech recognition, where the technology is most often used, audio mining is employed to create an acoustic model. An acoustic model lets speech recognition software map sound patterns to words. The model is built by audio mining recordings of spoken phrases and comparing them with text that matches those phrases. The computer then uses this information to recognize words when the user makes sounds similar to those in the acoustic model. An acoustic model is used in combination with a file, commonly called a language model, that tells the speech recognition program what language to interpret and what patterns of words are likely to be spoken in certain sentences and situations.
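The role of the file that describes likely word patterns can be sketched with a toy example. It assumes the acoustic side has narrowed a sound down to two similar-sounding candidates, and it uses simple counts of adjacent word pairs (bigrams) to choose between them. The tiny training corpus and all function names are hypothetical; real speech recognition products use far larger and more sophisticated models.

```python
from collections import Counter


def train_bigrams(sentences):
    """Count adjacent word pairs and how often each word has a successor."""
    pair_counts = Counter()
    word_counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        word_counts.update(words[:-1])            # words that are followed by another word
        pair_counts.update(zip(words, words[1:]))
    return pair_counts, word_counts


def bigram_probability(pair_counts, word_counts, previous, word):
    """Estimate how likely `word` is to follow `previous` (0.0 if `previous` is unseen)."""
    if word_counts[previous] == 0:
        return 0.0
    return pair_counts[(previous, word)] / word_counts[previous]


def pick_word(pair_counts, word_counts, previous, candidates):
    """Choose the candidate that most often follows the previous word."""
    return max(
        candidates,
        key=lambda w: bigram_probability(pair_counts, word_counts, previous, w),
    )


# Hypothetical training text standing in for a much larger corpus.
corpus = [
    "please write a letter",
    "turn right at the corner",
    "please write the report",
    "write a note",
]
pair_counts, word_counts = train_bigrams(corpus)

# "right" and "write" sound alike; the word patterns break the tie.
print(pick_word(pair_counts, word_counts, "please", ["right", "write"]))  # write
```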

Musicians and music listeners can both benefit from audio mining in music. Music software that categorizes music by genre sometimes uses audio mining to organize the music. The process identifies and groups music files that share the sound characteristics typical of a musical genre. Though this technology can make organizing music and finding new music easier, it can misclassify songs that have similar measured characteristics but a different overall sound. Audio analysis software can be useful to musicians, especially composers, because it allows the composer to jump to specific parts of the song structure, including musical key changes and words within the lyrics.
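One of the measured characteristics, beats per minute, can be sketched in a few lines. The example assumes some earlier analysis step has already detected the times at which beats occur; the function name `estimate_bpm` and the sample beat times are hypothetical, and real music analysis software must also detect the beats themselves.

```python
from statistics import median


def estimate_bpm(beat_times):
    """Estimate tempo from beat times (in seconds) using the median gap between beats."""
    if len(beat_times) < 2:
        raise ValueError("need at least two beat times")
    intervals = [later - earlier for earlier, later in zip(beat_times, beat_times[1:])]
    return 60.0 / median(intervals)


# Hypothetical beat times: one beat every half second.
beats = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
print(estimate_bpm(beats))  # 120.0
```

Two very different songs that both have a beat every half second would receive the same value here, which illustrates how grouping by measured characteristics alone can place dissimilar-sounding music together.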

The speech recognition software manufacturer Dragon® sells a program called AudioMining® that transcribes audio files and marks the files so they can be searched for text. Dragon makes computational linguistics software; computational linguistics is the field concerned with processing human language, including speech, by computer. Audio mining, when written as two words, is a general term that refers to analyzing a sound file for a determined set of audio characteristics. Other manufacturers of audio mining software include Nuance® and Nexidia®.

Discussion Comments

miriam98

August 30, 2011

@MrMoody - I think the programs like Dragon’s Audio Mining might be similar to what you’re looking for. It might be able to capture the text portions of the audio at least.

These could serve as the cue points which you’re looking for. The program is geared for text recordings so I don’t know how well it would work for music, but it’s worth a shot.

What I’m really interested in is using audio analysis to create a searchable archive of audio recordings on the Internet. Right now we mainly search for text and some video.

What if all audio could be indexed? I mean not only indexing the title of the audio recording, but also all of the recording itself – as if it were a transcript. That would open up a whole new world of information that could be searched over the Internet.

MrMoody

August 29, 2011

@SkyWhisperer - That’s great. I’d love to see audio analysis used in video editing software. Sometimes I do event recordings, and have to record live music.

I want to be able to sync the music with other parts of the video as well, as if I had multiple cameras shooting the same event from different angles, but all with the same audio track. I’d like the video program to do an audio analysis of the whole file and create cue points for me where I could cut to different scenes and make it sound as if I was still using the same soundtrack.

Right now I can mimic that effect by pulling out the sound as a separate file and laying it down on the audio track, then pasting video clips on top and trying to sync them with the audio. It would be cool if the video program could do that for me though.

SkyWhisperer

August 28, 2011

I tried out speech recognition software a few months ago. I had a simple need. I needed to write a lot and wanted to reduce the amount of typing I had to do by using dictation instead.

I began looking around for various speech recognition programs. I had heard of Dragon NaturallySpeaking, but to my surprise I discovered that Microsoft had bundled its own speech recognition software in the newest release of Windows.

I wondered how well it would work, since it was free. To my surprise, it worked very well. To begin with there was a training session, where the program would put phrases on the screen which I recited into the microphone.

This was the audio mining technology in action; the computer matched my recorded responses to the text, and created a profile for me. Thereafter it was able to recognize my voice when I dictated, with an accuracy of roughly 95%.