Before Maryland was quarantined due to the coronavirus, I was one week away from flying to California to compete in the 2020 IBJJF Pan Championships. I was also teeing up my next blog post to predict the black belt outcomes using our Elo for BJJ model. Both of those plans were cancelled, but my search for more high-quality data on our sport has not ceased. Collecting accurate data can be time-consuming, and I hope to have more for you all soon.
In the meantime, I created a short highlight video of one of my favorite matches from the 2019 IBJJF World Championships using data science. This is how I spend my free time while under lockdown, learning new skills.
Process
I turned to publicly available matches on the IBJJF's YouTube channel for show stoppers from last season. With the extra time at home, I've watched many of these videos, both to study and for entertainment. Ultimately, I chose to analyze the finals match of the Heavyweight division between the number one ranked fighter of 2019 and multi-time world champion, Leandro Lo of NS Brotherhood, and first-year black belt Kaynan Duarte of Atos Jiu Jitsu.
Audio Analysis Using Computer Vision
I could have used movie editing software to cut and splice the highlights of the video. However, I wanted to learn new techniques, and I read a cool article that taught me how to use computer vision to analyze a video. The technique I used is called Simple Speech Analysis. I analyzed the energy, or loudness, of the audio signal to detect key points in the match between Lo and Duarte. For instance, when the crowd cheered or the announcer's voice rose an octave, the audio's energy fluctuated.
Step 1: I extracted an audio file in WAV format from the MP4 video file. I segmented the audio into 5-second intervals for analysis. I took one of the 5-second clips and plotted the audio's energy waves in a graph. Here is what that looks like.
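For readers who want to try the segmentation in Step 1, here is a minimal sketch using only the Python standard library. It assumes the audio has already been extracted from the video and loaded as a list of samples. The function name `split_into_chunks`, the 8 kHz sample rate, and the synthetic tone standing in for the match audio are all illustrative assumptions, not the code I actually ran.

```python
import math

def split_into_chunks(samples, sample_rate, chunk_seconds=5):
    """Split audio samples into consecutive chunks of chunk_seconds each.

    The last chunk may be shorter if the audio length is not an exact
    multiple of the chunk length.
    """
    chunk_len = sample_rate * chunk_seconds
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]

# Synthetic stand-in for the match audio (assumption): 12 seconds of a
# 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
samples = [math.sin(2 * math.pi * 440 * n / sample_rate)
           for n in range(12 * sample_rate)]

chunks = split_into_chunks(samples, sample_rate)
chunk_lengths = [len(c) for c in chunks]
print(len(chunks), chunk_lengths)
```

With 12 seconds of audio this gives two full 5-second chunks and one 2-second remainder.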
Step 2: I computed the short-time energy for each 5-second interval. I put them all together and plotted the energy values in a histogram to understand what the energy pattern looked like. Here is that visualization.
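The calculation in Step 2 can be sketched as follows. I am assuming one common definition of short-time energy here, the mean of the squared sample values in a chunk. The toy sample values, the `bin_width` of 1000, and the function name are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from collections import Counter

def short_time_energy(chunk):
    """Mean of the squared sample values in one chunk (assumed definition)."""
    return sum(s * s for s in chunk) / len(chunk)

# Toy chunks (assumption): a quiet stretch and a loud one.
quiet = [1, -1, 2, -2]
loud = [30, -40, 30, -40]
energies = [short_time_energy(c) for c in (quiet, loud)]

# A rough histogram: count how many chunks fall into each energy bin,
# keyed by the lower edge of the bin.
bin_width = 1000
histogram = Counter(int(e // bin_width) * bin_width for e in energies)

print(energies)
print(sorted(histogram.items()))
```

Squaring the samples means positive and negative swings both count toward loudness, which is why a cheering crowd shows up as a spike.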
From the histogram, we can see that the majority of the audio's energy is below 1000. There are two major spikes in energy, one around the 8500 mark and another around 1800. These are likely two key scoring moments when the crowd went wild.
Step 3: I used a threshold of 450 to find other spikes in the audio file. Altogether, there were 9 short, 5-second clips where major action occurred in the match.
From the table above, we now have the energy level associated with time (in seconds) in the match. The table was created with the peak energy marked as the end time, and the start time calculated as 5 seconds before, to capture the sequence of events leading up to a roaring crowd.
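Here is a minimal sketch of how the thresholding in Step 3 and the start and end times could be computed. It rests on one interpretation that I am labeling as an assumption: the moment a loud chunk begins is treated as the end of the highlight, and the clip starts 5 seconds earlier. The energy values below are made up for illustration and are not the real match data.

```python
def find_highlights(energies, threshold, chunk_seconds=5, lead_seconds=5):
    """Return (start, end, energy) tuples for chunks louder than threshold.

    Assumption: a loud chunk's starting time marks the END of the clip,
    and the clip begins lead_seconds before that.
    """
    clips = []
    for i, energy in enumerate(energies):
        if energy > threshold:
            end = i * chunk_seconds
            start = max(0, end - lead_seconds)
            if end > start:  # skip a loud first chunk with nothing before it
                clips.append((start, end, energy))
    return clips

# Made-up energies for six consecutive 5-second chunks (assumption).
energies = [100, 300, 1800, 200, 8500, 120]
clips = find_highlights(energies, threshold=450)
print(clips)
```

Looking back from the spike, rather than forward, is the design choice that matters: the crowd reacts after the sweep or takedown, so the action itself sits in the seconds before the noise.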