Behind the code: Billions of precise fantasy football predictions

Reasoning. Emotion. Ambivalence. Information overload. Many difficult decisions bring about discomfort and discord #FantasyFootballFace. However, ESPN and IBM have teamed up to bring a new level of insight to fantasy football team owners: Watson AI. Watson is analyzing millions of documents, videos, and podcasts from thousands of football sources to help you calculate risk and reward, offset personal bias, and incorporate more evidence into your decision making.

The ESPN Fantasy Football with Watson system has been a significant undertaking with many components. This article is the first in an eight-part series that takes you behind each component to show you how we used Watson to build a world-class AI solution.

How we created billions of precise fantasy football predictions with Watson

When it comes to scale and precision, artificial intelligence (AI) with ESPN Fantasy Football provides accurate predictions to millions of users around the world. The raw volume of traffic, many millions of users per day and hundreds of millions of resource retrievals through the IBM Cloud, requires a web acceleration tier within the overall system architecture. The web acceleration tier shields the machine learning pipeline within Watson from the deluge of traffic, which avoids resource contention and nondeterministic outcomes. The origin machines that run the AI pipeline sit behind a series of caches in a content delivery network (CDN). A middle tier updates the origin content with precise boom, bust, player buzz, evidence, and projection trends. Now you can access the enterprise-grade AI with ESPN Fantasy Football throughout the entire football season!

Each day, millions of users access our predictions through a player card. The player card offers many views, such as a compare players feature that places multiple players side by side so that their score distributions and their boom and bust insights can be contrasted. The rich user experience is data intensive on the IBM Cloud.


The volume of resource demand on the IBM Cloud is staggering. Through 3 weeks, the IBM CDN has had 9,284,216,526 hits, an 83.06 percent hit ratio (served from the CDN), and consumed 830.72 TB of bandwidth. The IBM Cloud CDN sustains the demand from the millions of ESPN Fantasy Football users. For this system to work, a series of components works in parallel.
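To put those figures in perspective, the short calculation below derives a few rough numbers from them. It is only a back-of-the-envelope sketch: it assumes that the hit ratio applies uniformly to all 9,284,216,526 hits and that a terabyte is 10**12 bytes, neither of which is stated in the original figures. The variable names are chosen for illustration.

```python
# Figures quoted above for the first 3 weeks of traffic.
total_hits = 9_284_216_526
hit_ratio = 0.8306        # fraction of hits served directly from the CDN
bandwidth_tb = 830.72

# Assumption: the hit ratio applies uniformly to every hit.
served_from_cdn = round(total_hits * hit_ratio)
passed_to_origin = total_hits - served_from_cdn

# Assumption: 1 TB = 10**12 bytes (decimal terabytes).
avg_bytes_per_hit = bandwidth_tb * 10**12 / total_hits

print(f"served from the CDN: {served_from_cdn:,}")
print(f"passed to the origin: {passed_to_origin:,}")
print(f"average payload per hit: {avg_bytes_per_hit / 1000:.1f} KB")
```

Under these assumptions, roughly 7.7 billion hits never reached the origin, about 1.6 billion did, and the average response was on the order of 89 KB.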

When a player is loaded, dynamic JavaScript Object Notation (JSON) data is retrieved through the IBM Cloud CDN with an Object Storage origin. In total, 13 different JSON files that range from video evidence to score distributions are pulled from the edge servers as long as the time-to-live (TTL) value for the content has not expired. If the content has expired, traffic for those pieces of data is routed through the midgress servers to Cloud Object Storage. The content is then sent back through the CDN and cached for other requests.
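The TTL behavior described above can be illustrated with a toy cache. This is a minimal sketch, not the IBM Cloud CDN implementation: the `EdgeCache` class, the `fake_origin` function, the object key, and the 60-second TTL are all hypothetical, and the midgress tier is collapsed into a single call to the origin.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy stand-in for a CDN edge server that caches objects for a fixed TTL."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps object key -> (expiry time, cached body).
        self._store: Dict[str, Tuple[float, bytes]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now < entry[0]:
            # TTL has not expired: serve from the edge.
            self.hits += 1
            return entry[1]
        # Missing or expired: fetch from the origin and cache the result.
        self.misses += 1
        body = self._fetch(key)
        self._store[key] = (now + self._ttl, body)
        return body


# Hypothetical origin that records how often it is contacted.
origin_calls = []

def fake_origin(key: str) -> bytes:
    origin_calls.append(key)
    return b'{"player": "example"}'

# A controllable clock makes the expiry easy to demonstrate.
fake_now = [0.0]
cache = EdgeCache(fake_origin, ttl_seconds=60, clock=lambda: fake_now[0])

key = "players/12345/score_distribution.json"  # hypothetical object key
cache.get(key)       # first request: miss, fetched from the origin
cache.get(key)       # within the TTL: served from the cache
fake_now[0] = 61.0   # advance past the TTL
cache.get(key)       # expired: fetched from the origin again
```

In this run the origin is contacted twice for three requests, which is the effect the CDN has at much larger scale: only expired or never-seen content generates traffic to Cloud Object Storage.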

In addition, the player card skeleton is cached within the CDN. Artifacts such as Hypertext Markup Language (HTML), images, Cascading Style Sheets (CSS), and JavaScript libraries are all retrieved through the CDN. On the browser side, a no-cache header is specified so that the latest content is always retrieved from the CDN rather than from the browser's local cache.


Several Cloud Foundry applications written in Python connect to several Watson services, such as Watson Discovery, and run the AI pipeline. Deep learning, regression, decision tree, and document-to-vector (doc2vec) models work together to produce fantasy football insights and evidence. Over 3,000 news sources provide millions of articles for Watson to read and comprehend. Not only that, but Watson watches football videos and listens to fantasy football podcasts to provide more predictors for the AI pipeline.

The AI pipeline writes to over 20 Db2 tables. The data includes statistical insights and evidence that will be available on the player card. When the analysis of a player is complete, the AI pipeline sends a RESTful GET request to a Node.js application. The Node.js application uses the player and event information to pull insights from Db2 on the IBM Cloud. Through a series of asynchronous calls, the Node.js application generates JSON files and posts them to Cloud Object Storage (COS). In the request header, the access permission, x-amz-acl, is set to public-read so that any player card can read the JSON files. The latest insights become available after the TTL for each piece of content expires or an administrator completes a manual purge.
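The upload step can be sketched as follows. The production application is written in Node.js; this Python version only illustrates the shape of the request. The endpoint, bucket name, object key layout, and insight values are hypothetical placeholders, the use of an HTTP PUT follows the usual S3-compatible convention and is an assumption here, and authentication is deliberately left out. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical endpoint and bucket; the real values are not given in the article.
COS_ENDPOINT = "https://cos.example.com"
BUCKET = "player-insights"


def build_upload_request(
    player_id: str, insight_name: str, insight: dict
) -> urllib.request.Request:
    """Build (but do not send) a request that uploads one insight as JSON."""
    body = json.dumps(insight).encode("utf-8")
    # Hypothetical key layout: one JSON object per player and insight type.
    url = f"{COS_ENDPOINT}/{BUCKET}/{player_id}/{insight_name}.json"
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",  # assumption: S3-style object upload
        headers={
            "Content-Type": "application/json",
            # Makes the object readable by any player card.
            "x-amz-acl": "public-read",
            # Authentication headers are omitted in this sketch.
        },
    )


# Placeholder insight values, for illustration only.
req = build_upload_request("12345", "boom_bust", {"boom": 0.21, "bust": 0.08})
print(req.get_method(), req.full_url)
print(req.header_items())
```

A real uploader would add signed credentials and send each request, for example concurrently for the 13 JSON files that make up one player card.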


Other player information, such as injuries, byes, suspensions, trades, and injured reserve status, is included within the CDN content.

Watson reads, listens to, and watches millions of artifacts and distills them into evidence and insight. At the same time, Watson responds to millions of users who do not sleep! As a result, Watson never sleeps. Maintaining a continuously available enterprise AI system is a difficult yet required proposition. #WinWithWatson

Check back next time as I discuss continuously available football analysis. To find out more, follow Aaron Baughman on Twitter: @BaughmanAaron.

The ESPN Fantasy Football logo is a trademark of ESPN, Inc. Used with permission of ESPN, Inc.