IBM Developer Blog


Join data scientists to develop models focused on forecasting wildfires in Australia for the upcoming wildfire season.


Australia’s worst wildfire season, which burned from July 2019 through March 2020, affected nearly 3 billion animals, estimates Chris Dickman, a professor of ecology at the University of Sydney. We’re asking you to join data scientists in developing models that forecast wildfires in Australia for the upcoming wildfire season, for a chance to win $5K USD. To get you started, we’re releasing historical data sets extracted from the Weather Operations Center Geospatial Analytics component (PAIRS Geoscope).

Overview of challenge

Wildfires are one of the most common forms of natural disaster in some regions, including Siberia, Australia, and parts of the United States such as California. Improving wildfire forecasting is important for a number of reasons:

  1. To prepare and respond
  2. To understand the root causes
  3. To help to mitigate wildfires in the future

At the Digital Developer Conference Data & AI, you’ll hear about the Call for Code Spot Challenge on Wildfires. The objective of this challenge is to forecast wildfires in Australia during the month of February 2021 to better understand the application of machine learning techniques in this domain. We are excited to share an extract from the Weather Operations Center Geospatial Analytics component (PAIRS Geoscope), with some of the data going back to 2005. We also have sessions to help you get started. Do join us.

The sessions in track 4 of the conference, summarized in the table below, will help you get started with the contest.

The conference Slack workspace is at http://digitaldevcon.slack.com. If you need an invitation to the workspace, go to http://ibm.biz/devcon-ai-slack. The workspace has a lab channel, #ddc-ai-competitions, and a help channel, #ddc-ai-help.

Summary of track 4 – Community: Contests and open source

| Talk title | Speakers |
|---|---|
| Introducing the Call for Code Spot Challenge for Wildfires data set and contest | Hendrik Hamann, Chief Scientist for Geoinformatics and PAIRS Geoscope; Omid Meh, Developer Advocate, IBM; Sundar Saranathan, Software Architect, IBM |
| Getting started with the wildfires data set | Margriet Groenendijk, Developer Advocate, IBM |
| Getting started with AutoAI and the wildfires data set | Gregory Bramble, Research Software Engineer, IBM |
| Building and using stacked machine learning models: A proven path to more accurate models | David Carew, Developer, IBM |
| Creating inclusive IT Language: A Fireside Chat | Priyanka Sharma, GM CNCF; Dale Davis Jones, Vice President & Distinguished Engineer, IBM |
| Ways you can get involved in Women in Data Science (WiDS) | Karen Matthys, Executive Director, ICME, Stanford University |
| What’s next in open source data science & AI | Todd Moore, Vice President, IBM; Ibrahim Haddad, Executive Director, LF AI & Data Foundation; Lisa Seacat, Distinguished Engineer, IBM |

The data sets

The task is to predict the size of the fire area in square kilometers (km²) by region in Australia for each day in February 2021.

The regions are:

  • NSW=New South Wales
  • NT=Northern Territory
  • QL=Queensland
  • SA=South Australia
  • TA=Tasmania
  • VI=Victoria
  • WA=Western Australia
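
The prediction target described above amounts to one value per region per day. As a rough sketch of that grid (the function name and the tuple layout are illustrative assumptions, not the contest's required submission format, which is documented in the GitHub readme), the rows can be enumerated with the Python standard library:

```python
from datetime import date, timedelta

# Region codes exactly as listed in this post.
REGIONS = ["NSW", "NT", "QL", "SA", "TA", "VI", "WA"]


def forecast_grid(start, end):
    """Return one (region, day) pair per region for every day in [start, end].

    Hypothetical helper for illustration; the real submission layout may differ.
    """
    n_days = (end - start).days + 1
    days = [start + timedelta(days=i) for i in range(n_days)]
    return [(region, day) for region in REGIONS for day in days]


# February 2021 has 28 days, so the grid has 7 regions x 28 days = 196 rows.
grid = forecast_grid(date(2021, 2, 1), date(2021, 2, 28))
print(len(grid))
```

Each row of this grid would then be paired with a predicted fire area in km².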

To forecast the wildfires, you will be given 5 data sets, extracted from the Weather Operations Center Geospatial Analytics component (PAIRS Geoscope), which you can augment with other open data sets. You will also be given opportunities to try out your predictions before February 2021 in earlier stages of the contest.

Note that there is no hidden data in this contest. You will be predicting wildfires for February 2021 during January 2021. The leaderboard will check how closely your prediction matches with reality.

The data sets and accompanying readme and slides are available via GitHub at https://github.com/Call-for-Code/Spot-Challenge-Wildfires, together with a starter notebook.

  • Land classes Australia by region (static throughout the contest)
  • Normalized vegetation index Australia by region
  • Weather
  • Weather forecasts
  • Wildfires

The data sets will be refreshed throughout the contest at specific times (see the timeline table below). Contestants can incorporate other open data sets into their model preparation.

The prize

The one winner at the top of the final leaderboard on March 1, 2021 (or when IBM declares the contest closed) receives $5K USD.

Registering for the contest

  1. Register for the Call for Code Spot Challenge for Wildfires at https://developer.ibm.com/dwwi/jsp/register.jsp?eventid=cfc-2020-SP-wildfire. If you are working as a team on the Spot Challenge, please make sure each team member is individually registered.

  2. You will then register yourself for the contest leaderboard. If you are working as a team on the Spot Challenge, please make sure each team member is individually registered. You can then form your team on the leaderboard.

You can:

  • See the data sets and sample notebooks at https://github.com/Call-for-Code/Spot-Challenge-Wildfires.
  • Begin making your submissions to the leaderboard. In the initial “Development: Try the platform” stage, you will forecast wildfires for February 2020 to get familiar with the platform.

How to win

Predict the size of the fire area in square kilometers (km²) by region in Australia for each day in February 2021.

Submit your prediction in January 2021 to the public leaderboard during the final stage, “Predict Feb 2021 (Feb 1-28).” You can submit your final predictions from 2021-01-23 to 2021-01-31. If you examine the timeline closely, you will see that the data sets are refreshed again on 2021-01-29, which lets you revise your model during the last days of January.

You will be predicting wildfires for February 2021 during January 2021. In February 2021, the leaderboard will check weekly how closely your prediction matches reality, that is, the actual fires. Please note that NASA will make the actual fire information available in February 2021. The raw data (the actual fires) provided by NASA will be processed in the same way as the training data, as detailed in the readme docs. The overall score is based on two metrics, the mean absolute error (MAE) and the root mean square error (RMSE), between the forecasted and the actual estimated fire area. The total score is weighted 80% towards MAE and 20% towards RMSE.
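
To get a feel for the two metrics before submitting, you can compute them locally. The following is a minimal sketch, assuming the total score is a plain weighted sum of the two errors; this post does not spell out how the leaderboard combines or normalizes them, so treat `weighted_score` as a hypothetical approximation. The sample values are made up for illustration.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def weighted_score(actual, predicted, w_mae=0.8, w_rmse=0.2):
    """ASSUMPTION: a plain 80/20 weighted sum; the leaderboard formula may differ."""
    return w_mae * mae(actual, predicted) + w_rmse * rmse(actual, predicted)


# Made-up fire areas in km² for four region-days (not contest data).
actual = [10.0, 0.0, 5.0, 20.0]
predicted = [12.0, 1.0, 5.0, 14.0]

print(mae(actual, predicted))
print(rmse(actual, predicted))
print(weighted_score(actual, predicted))
```

Because RMSE squares each error, a single badly missed day weighs more heavily in it than in MAE, which is worth keeping in mind for days with very large fires.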

Contest stages and submission timeline

There will be four main contest stages. The first three stages are for practice. The final stage is what the contestants will be measured on.

  • Development: Try the platform – Predict Feb 2020
  • Predict Jan 2021 week 3 (Jan 16-22)
  • Predict Jan 2021 week 4 (Jan 23-29)
  • Predict Feb 2021 (Feb 1-28)

The following table describes the four stages including:

  • When the data is refreshed
  • When you can make submissions
  • The maximum number of submissions you can make

| Data | Refresh | Submissions take place | Contest stage | Max allowed number of submissions |
|---|---|---|---|---|
| Available on 2020-11-10 | Base data – starts between 2005 – 2015 until 2020-10-31 | 2020-11-10 – 2021-01-09 | Development: Try the platform – Predict Feb 2020 | Daily 10, weekly 50, total 100 |
| Available on 2021-01-09 | Refresh data to include up until 2021-01-08 | 2021-01-10 – 2021-01-15 | Predict Jan 2021 week 3 (Jan 16-22) | Daily 5, weekly 35, total 35 |
| Available on 2021-01-15 | Refresh data to include up until 2021-01-14 | 2021-01-16 – 2021-01-22 | Predict Jan 2021 week 4 (Jan 23-29) | Daily 5, weekly 35, total 35 |
| Available on 2021-01-22 | Refresh data to include up until 2021-01-21 | 2021-01-23 – 2021-01-31 | Predict Feb 2021 (Feb 1-28) | Daily 3, weekly 3, total 3 |
| Available on 2021-01-29 | Refresh data to include up until 2021-01-28 | As above | As above | As above |
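
If you want to keep track of which stage is open on a given day, the submission windows in the table can be encoded as a small lookup. This is only a sketch: the `STAGES` list and the `open_stage` helper are hypothetical names, and the two dates in each row are read as the first and last submission day.

```python
from datetime import date

# (stage name, first submission day, last submission day, total submission cap),
# transcribed from the table above.
STAGES = [
    ("Development: Try the platform – Predict Feb 2020", date(2020, 11, 10), date(2021, 1, 9), 100),
    ("Predict Jan 2021 week 3 (Jan 16-22)", date(2021, 1, 10), date(2021, 1, 15), 35),
    ("Predict Jan 2021 week 4 (Jan 23-29)", date(2021, 1, 16), date(2021, 1, 22), 35),
    ("Predict Feb 2021 (Feb 1-28)", date(2021, 1, 23), date(2021, 1, 31), 3),
]


def open_stage(day):
    """Return (stage name, total cap) for the stage accepting submissions on `day`, or None."""
    for name, first, last, total in STAGES:
        if first <= day <= last:
            return name, total
    return None


print(open_stage(date(2021, 1, 25)))
```

Note that the final stage allows only three submissions in total, so it pays to validate a model in the practice stages first.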

Terms and conditions

Please make sure you have agreed to the Participation Agreement for the Call for Code Spot Challenge for Wildfires before you start submitting to this leaderboard. (See the “Registering for the contest” section.)

  • No IBM or Red Hat employees can participate.
  • A contestant can have only one account on the leaderboard and, after the first (Development) stage, can be in only one team.
  • The maximum team size is 5.
  • Teams must be registered on this leaderboard by January 8, 2021.
  • No team mergers are allowed.
  • IBM can restrict the number of teams competing.
  • No sharing of notebooks and models privately between teams unless you make the content available to all.
  • The leaderboard determines the winner on March 1, 2021, or when IBM declares the contest closed.
  • At some point during the contest, an IBM tool such as Watson Studio or AutoAI should be used during the model development, training, and so on.
  • The top 5 contestants on the final leaderboard will be asked to share their notebook on Watson Studio and provide information on the tools they used as well as any other open data sets they incorporated.

We hope that you will join us.