IBM iX and Corporate team project members include: Brian Adams, Aaron Baughman, Karen Eickemeyer, Monica Ellingson, Stephen Hammer, Eythan Holladay, John Kent, William Padget, David Provan, Karl Schaffer, Andy Wismar


+ This content is part 2 of the 4-part series that describes AI Highlights at the Masters Golf Tournament.

Shot Detection

The Masters golf tournament produces several continuous streams of video feeds that broadcast the golf action.  For the Masters this year, we are ingesting four different programs of live video content: Amen Corner, featured groups, holes 15 & 16, and the simulcast coverage.  The video data is analyzed both in motion and at rest to find sequences of frames that could be a golf highlight.  Several markers are used to accumulate evidence for golf clip detection.  Optical Character Recognition (OCR) on the Masters logo and the detection of video fades, dissolves and transitions establish clip boundaries.  The start time and end time are used to segment the stream into a collection of golf shots.  These clips can then be used to measure and rank highlights.
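As a rough illustration of that last step, the sketch below pairs candidate start markers (TV graphic detections) with the next boundary marker (fade, dissolve or cut) to form candidate clips. The pair_markers function and the marker lists are illustrative only, not the production implementation.

# Illustrative sketch: pair each candidate start marker with the first boundary
# marker that follows it, producing (start, end) candidate clips.
def pair_markers(start_times, end_times):
    clips = []
    ends = sorted(end_times)
    for start in sorted(start_times):
        # first boundary marker that occurs after this start time
        end = next((e for e in ends if e > start), None)
        if end is not None:
            clips.append((start, end))
    return clips

# e.g. pair_markers([12.0, 95.0], [54.5, 141.0]) -> [(12.0, 54.5), (95.0, 141.0)]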

OCR Scene Detection

The media director’s job is to blend the content from all 18 holes into one produced feed or set of highlights. The content could be live, or it may be delayed by a few seconds.  In addition, golf-dependent cues are very important for determining clips.  This means that we need a golf-specific set of business rules, models and discovery to clip golf content.  Therefore, we need to approach this differently from a sport like tennis.

A process pulls sections of stream files from adaptive bitrate m3u8 playlists.  When the appropriate bandwidth channel is selected, transport stream files are downloaded and split into image frames and wav files.  Each modality is used to determine a golf shot highlight.  The following code shows an abbreviated method of analyzing a stream.

system("curl --retry-max-time 0 -o $path/stream/$cache{$segment} \"$segment\"");
# check size of downloaded segment
my $size = -s "$path/stream/$cache{$segment}";
if($size < 100000)    {
     print "FOUND EMPTY SEGMENT!!!\n";
     last LOOP0;

}
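Once a segment is on disk, it is split into image frames and a wav file. The following is a minimal sketch of that step in Python, assuming ffmpeg is installed and on the PATH; the paths are placeholders and the exact production tooling may differ.

# A minimal sketch of splitting a downloaded transport stream segment into
# frames and audio using ffmpeg (assumed to be available on the PATH).
import os
import subprocess

segment = "stream/segment_001.ts"    # placeholder path to a downloaded segment
os.makedirs("frames", exist_ok=True)
os.makedirs("audio", exist_ok=True)

# extract one frame per second as JPEG images for the visual models
subprocess.run(["ffmpeg", "-i", segment, "-vf", "fps=1",
                "frames/frame_%04d.jpg"], check=True)

# extract the audio track as a wav file for crowd-noise analysis
subprocess.run(["ffmpeg", "-i", segment, "-vn",
                "audio/segment_001.wav"], check=True)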

 

A good indicator of the start of a golf scene is the presence of the television graphic. Most golf scenes begin with this graphic to identify the player and the situation. The presence of this graphic and the metadata contained within it, such as the player name and the hole number, tell us that a scene is beginning and who is in that scene.

 

Simple color matching is done to determine if the graphic is present.  We scan the video at a rate of 1 frame per second to find the presence of the bright yellow color in the Masters logo. More complex methods could be used to template-match the TV graphic, but detecting the color is very fast and reliable. If the graphic is within an image, the region is cut from the frame for further analysis. OCR is performed on the graphic to recognize the player name, hole and score. The timestamp in the video is also marked as a potential starting time for a golf shot. To reduce the number of false positives, we only mark a candidate start time if a player’s name can be recognized from the graphic.  For example, we would not mark a potential start time if a patron in the crowd is wearing a bright yellow shirt without the presence of a player name.

The TV Graphic detection system was written in Perl using the GD module. The following code snippet shows an example of finding and segmenting the graphic.

use GD;

my $in = 'tiger.png';
my $tolerance = 30;
my $image = GD::Image->new("frames/$in");

# sample a pixel inside the TV graphic region and compare it to the logo yellow
my $index = $image->getPixel(807, 51);
my ($r, $g, $b) = $image->rgb($index);
my $rc = abs($r - 246); # 246,255,11 is the RGB color we're looking for.
my $gc = abs($g - 255);
my $bc = abs($b - 11);
if (($rc < $tolerance) && ($gc < $tolerance) && ($bc < $tolerance)) {
    # graphic detected: perform logic
}

 

For the OCR, we used Tesseract. This was again done for speed. We installed the HEAD version, which turned out to be 4.0.0-beta.1. Tesseract requires files as its input, so small image files were created by breaking up the TV Graphic. This was again done using GD.

# crop the 278x29 player-name region out of the TV graphic
my $nameImage = GD::Image->new(278, 29);
$nameImage->copy($image, 0, 0, 830, 48, 278, 29);

# write the cropped region to disk for Tesseract
open(FILE, ">", "tmp/name.png") or die "cannot write tmp/name.png: $!";
binmode FILE;
print FILE $nameImage->png;
close(FILE);
undef $nameImage;

 

A Tesseract command was run on the system command line to recognize the player name.  The following parameters were fine-tuned for golf.

tesseract tmp/name.png --psm 7 --oem 2 stdout
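Pulling these steps together, a candidate start time is only recorded when the OCR output matches a known player name. The following sketch shows one way that could be done from Python using the pytesseract wrapper; the crop box, the roster set and the candidate_start function are illustrative assumptions rather than the production code.

import pytesseract
from PIL import Image

KNOWN_PLAYERS = {"TIGER WOODS", "SERGIO GARCIA"}     # hypothetical roster lookup

def candidate_start(frame_path, timestamp):
    # crop the 278x29 player-name region of the TV graphic (box is illustrative)
    name_region = Image.open(frame_path).crop((830, 48, 830 + 278, 48 + 29))
    text = pytesseract.image_to_string(name_region, config="--psm 7 --oem 2")
    name = text.strip().upper()
    # only mark a candidate start time when a recognizable player name is present
    return timestamp if name in KNOWN_PLAYERS else None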

Camera Cut, Fade or Dissolve Detection

A scene can consist of multiple shots that require further segmentation.  Typically, if there are multiple shots within a scene, the director changes camera angles multiple times. After a single golf shot is complete, the shot can dissolve into another, or a cross cut can link multiple candidate highlights together.  As a result, we used OpenCV to analyze sequences of images and detect the scene transitions that mark the end of a clip.

Since many of the cuts or video transitions happen after a shot, we look for skyward-looking camera angles followed by a quick transition back to the golfer.  We needed to compare the primary color of sky shots, which combine blue and white, with the color of the camera transition.  If the primary color followed our pattern, we determined a video cut was present.  Within Python, we converted the Red, Green, Blue (RGB) color space to the Hue, Saturation, Value (HSV) color space.  The hue channel summarizes the degree to which each color is present in the frame.

The following code snippet depicts how we determined the primary color of an image.

import cv2

target_image = 'frame001.jpg'
img = cv2.imread(target_image)                        # OpenCV loads images as BGR
hsv_image = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)      # convert BGR to HSV
hue, sat, val = hsv_image[:,:,0], hsv_image[:,:,1], hsv_image[:,:,2]
hist = cv2.calcHist([hue], [0], None, [180], [0, 180])   # histogram over the hue channel
indices = list(range(0, 180))
# sort hue bins by frequency, most common first: s[0][0] is the dominant hue
s = [(x, y) for y, x in sorted(zip(hist, indices), reverse=True)]
dominant_count = hist[s[0][0]]                        # pixel count of the most common hue
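Continuing from the snippet above, the dominant hue can then be checked against a blue band to flag sky-like frames. OpenCV stores hue on a 0-179 scale, where blue falls roughly between 90 and 130; the exact thresholds below are illustrative, not the tuned production values.

# illustrative check: does the dominant hue fall in the blue "sky" band?
dominant_hue = s[0][0]
is_sky_like = 90 <= dominant_hue <= 130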

 

In addition to the primary color, we used brightness to determine if there were multiple shots within one cut.  Each frame is converted to grayscale to measure the brightness of the image.  If the brightness of an image is above a predetermined threshold, we know it is a sky shot.  We also compared the brightness of a series of images to determine if a cut, fade or dissolve occurred in the video.  The following code shows an example of our implementation:

frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # convert the frame to grayscale
brightness.append(cv2.mean(frame_gray)[0])             # mean intensity of the frame
current_brightness = brightness[-1]                    # level 0 to 255
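The frame-to-frame comparison itself is not shown above. A rough sketch of how the running list of brightness values could flag an abrupt cut versus a gradual fade is given below; the thresholds and window size are illustrative assumptions, not the tuned production values.

CUT_JUMP = 60     # sudden frame-to-frame change suggests a hard cut (illustrative)
FADE_DRIFT = 80   # steady drift over a short window suggests a fade or dissolve (illustrative)

def detect_transition(brightness):
    # brightness is the running list of per-frame gray levels (0 to 255)
    if len(brightness) < 2:
        return None
    if abs(brightness[-1] - brightness[-2]) > CUT_JUMP:
        return "cut"
    window = brightness[-10:]
    if len(window) == 10 and abs(window[-1] - window[0]) > FADE_DRIFT:
        return "fade_or_dissolve"
    return None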

 

The ending scene boundary is determined using a combination of methods, including scene change detection, crossfades, and the presence of TV Graphics in subsequent shots.  The candidate clip is sent to the AI Highlights system to be cut further into clips based on business rules around clip length, shot quantity, gesture recognition and crowd noise. For example, the end of a highlight might have loud cheering and a golfer’s fist pump.  Another consideration is that multiple putts could be in the same highlight.  One business rule would allow four shots within a putting sequence.
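As a sketch of how such rules might be applied to a candidate clip, the snippet below checks clip length and shot count; the field names and numeric limits are assumptions for illustration, not the actual production rules.

MAX_PUTT_SHOTS = 4            # e.g. a limit of four shots within a putting sequence
MIN_LEN, MAX_LEN = 10, 120    # hypothetical clip length bounds, in seconds

def passes_business_rules(clip):
    # clip is a dict such as {"duration": 34.0, "shot_count": 2, "is_putting_sequence": True}
    if not (MIN_LEN <= clip["duration"] <= MAX_LEN):
        return False
    if clip.get("is_putting_sequence") and clip["shot_count"] > MAX_PUTT_SHOTS:
        return False
    return True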

The start and stop times are used to clip the scene.  The scene is stored as an mp4 file within Object Storage for playback. A JSON payload is then built with the metadata from the TV Graphic and pushed to a Cloudant job queue for excitement ranking.  The AI Highlights system pulls the JSON job file for instructions on how to retrieve the clipped mp4 file.  Below is an example JSON job file.

{
  "_id": "0089ad52af1743bf0a82acdbee117e6d",
  "_rev": "3-4bd9a39d0bb5806206292f12ff933d85",
  "name": "simulation.mp4",
  "source-assets": {
    "storage-url": "/simulation.mp4"
  },
  "title": "r4_26596_2_1.mp4",
  "platform": "standalone",
  "duration": 34.034,
  "uid": "1522612416058",
  "thumbnails": {
    "image": [
      {
        "width": "1920",
        "value": "/thumb.jpg",
        "height": "1080"
      }
    ]
  }
}
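Because Cloudant exposes a CouchDB-compatible HTTP API, pushing a job document onto the queue can be as simple as a POST to the database URL. The snippet below is a minimal sketch using the requests library; the account URL, database name and credentials are placeholders.

import requests

job = {
    "name": "simulation.mp4",
    "source-assets": {"storage-url": "/simulation.mp4"},
    "title": "r4_26596_2_1.mp4",
    "platform": "standalone",
    "duration": 34.034
}

resp = requests.post(
    "https://ACCOUNT.cloudant.com/highlight-jobs",    # placeholder account and database name
    json=job,
    auth=("API_KEY", "API_PASSWORD")                  # placeholder credentials
)
resp.raise_for_status()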

 

For example, when Sergio Garcia made his final putt to win the 2017 Masters, we would use all of the available information to create a clip.  In the beginning frames, Sergio is putting.  Next, the video cuts to a new angle where the brightness and primary color change.  However, we do not see another television graphic, and the crowd cheering continues.  The entire shot, along with a prolonged celebration, is included within a single clip.


