S&P 500 and Bitcoin: community-driven insights with data.world and IBM SPSS Statistics

Cryptocurrencies have gained immense popularity in recent years, with Bitcoin being the first and most popular peer-to-peer cryptocurrency. Investors see Bitcoin as the next “gold” (a safe asset for the future) to invest in because there is only a limited number of Bitcoins. However, the past 12 months have seen both all-time highs and rapid fluctuations for the major indices (DJIA, S&P 500, and so on), gold, and Bitcoin. Using one of the latest IBM SPSS Statistics integrations, we thought it would be interesting to do some basic analysis of factors that affect the price of Bitcoin.

We considered three stock market indices: the Dow Jones Industrial Average (DJIA), the Standard & Poor's 500 Index (S&P 500), and the Nasdaq Composite Index. For this how-to, the S&P 500 is used to determine whether Bitcoin prices are affected by the performance of the index over a period of time. While the DJIA is one of the oldest, best-known, and most frequently used indices, it represents only about a quarter of the value of the entire US stock market. The S&P 500 is a larger and more diverse index, made up of 500 of the most widely traded stocks in the US. Therefore, the S&P 500 is a good indicator of US market sentiment.

To help multiple analysts collaborate by contributing data, analysis, observations, and suggestions for this how-to, data.world and IBM SPSS Statistics were used. data.world is an open, secure, and social data collaboration platform, and it is used in conjunction with IBM SPSS Statistics, a leading statistical platform, to quickly and easily find insights in data. We combined stock quote data with Bitcoin data hosted in a data.world project (uploaded as a dataset), then collaborated with the team to identify areas of interest and deliver insights on how the performance of the S&P 500 affects the Bitcoin price.

Learning objectives

Learn how to use SPSS to do the following:

  • Collaborate with data.world
  • Upload data as a data.world dataset
  • Import data to IBM SPSS Statistics
  • Analyze the data
  • Build a regression model
  • Make a prediction


Prerequisites

  • SPSS Statistics

Estimated time

  • About an hour


Collaborate with the team using data.world

It appears that my colleague Deepak wants to do some research on how the performance of the S&P 500 affects the Bitcoin price.

Data profiling with data.world

Automatically Profile Data

When the data was uploaded to data.world, it was profiled automatically, which gave a cursory view of data quality and outliers in the dataset. This data profiling is shown below.

Data profiling with data.world

Get the extension

The IBM SPSS Statistics extension for data.world is used to import data.world data into IBM SPSS Statistics and conduct the analysis. You can get the extension from the IBM SPSS Predictive Analytics hub, as shown below.

IBM SPSS Statistics Predictive Analytics extensions

Work with the data

To access data.world datasets from within IBM SPSS Statistics, click File > Import Data > Import from Data.World.

IBM SPSS Statistics extension for data.world

Importing data from data.world requires the following information in the dialog box:

  • data.world authentication token (you can get this token from the data.world settings page).
  • dataset URL. (Example: amodu/bitcoin-price-analysis)
  • SQL SELECT statement to access the data. (Example: SELECT b.close, s.price from bitcoin_data b inner join standard_and_poor500 s on cast(b.timestamp as DATE) =

An example of the ‘Import Configuration’ dialog box is shown below.

data.world import configuration
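Before pasting a query into the dialog box, it can help to try the join logic locally. The following is a minimal sketch that uses Python's built-in sqlite3 module rather than data.world itself. The table names and the `close`, `price`, and `timestamp` columns come from the example query above. The `date` column on the S&P table, the SQLite `date()` function standing in for `cast(... as DATE)`, and every sample value are assumptions made purely for illustration.

```python
import sqlite3

# In-memory database with two tiny stand-in tables (placeholder values only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bitcoin_data (timestamp TEXT, close REAL);
CREATE TABLE standard_and_poor500 (date TEXT, price REAL);  -- 'date' is an assumed column
INSERT INTO bitcoin_data VALUES
  ('2018-01-02 23:59:00', 100.0),
  ('2018-01-03 23:59:00', 110.0),
  ('2018-01-06 23:59:00', 120.0);  -- a Saturday, so no S&P quote exists
INSERT INTO standard_and_poor500 VALUES
  ('2018-01-02', 10.0),
  ('2018-01-03', 11.0);
""")

# Reduce the Bitcoin timestamp to a calendar date, then match it to the S&P row.
rows = conn.execute("""
SELECT b.close, s.price
FROM bitcoin_data b
INNER JOIN standard_and_poor500 s
  ON date(b.timestamp) = s.date
ORDER BY s.date
""").fetchall()

for close, price in rows:
    print(close, price)
```

Because this is an inner join, days on which the stock market is closed drop out of the result, so the two columns that reach SPSS Statistics always line up row by row.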

Analyze the data

Now that the required data is available for analysis in SPSS Statistics, you can run a quick Pearson correlation analysis (Analyze > Correlate > Bivariate). The results are:

IBM SPSS Statistics Pearson Correlation

Notice in the figure above that the Pearson correlation between the Bitcoin closing price and the S&P price is 0.883. This value is close to 1, which indicates a strong positive linear correlation between the S&P 500 index and Bitcoin prices.
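For readers who want to see what the Bivariate dialog computes, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The function name `pearson_r` and the sample values are hypothetical. They are placeholders, not the tutorial's dataset, so the result will not match the 0.883 reported by SPSS Statistics.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Placeholder values for illustration only.
sp500 = [2581.0, 2609.0, 2695.0, 2747.0]
bitcoin_close = [7100.0, 8000.0, 9900.0, 11600.0]

r = pearson_r(sp500, bitcoin_close)
print(f"Pearson r = {r:.3f}")
```

The coefficient always lies between -1 and 1. Values near 1 mean the two series rise together almost linearly, and values near -1 mean one falls as the other rises.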

Data science is a team sport, and team members need to share data, findings, and visualizations securely. The figure below shows how we used the data.world collaboration feature to work with the team and determine the next best action.

data.world team collaboration

Build a regression model

To further quantify our observations, we thought it would be a good idea to build a simple regression model in SPSS (Analyze > Regression > Linear) that can help predict Bitcoin prices going forward.

The R² value for the linear regression was 0.779. This implies that about 78 percent of the variation in the Bitcoin price can be explained by the variation in the S&P index value. The results of the regression are shown below.

IBM SPSS Statistics Linear Regression Summary
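As a quick sanity check, in a linear regression with a single predictor the R² value equals the square of the Pearson correlation between the two variables. The short sketch below applies that identity to the correlation of 0.883 reported earlier. The variable names are illustrative only.

```python
r = 0.883            # Pearson correlation reported by the Bivariate analysis
r_squared = r ** 2   # identity holds only for a regression with one predictor

print(f"r^2 = {r_squared:.3f}")
```

The result agrees with the reported R² of 0.779 to within rounding, because the correlation itself was rounded to three decimal places.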

The figure below shows the Scatter Plot (Graphs > Legacy Dialogs > Scatter/Dot > Simple Scatter) we created to graphically present the relationship between Bitcoin close price and the S&P index.

IBM SPSS Statistics Scatter Plot

Using the coefficients of the regression model, we came up with an equation for predicting the Bitcoin Price using the S&P index value.


Now, we can leverage the coefficients from the IBM SPSS Statistics Linear Regression model to predict the Bitcoin price based on the different values of S&P500 as shown in the table below.

Date          S&P 500   Actual Bitcoin price   Predicted Bitcoin price
02-Jan-2018   2,695     14,678                 10,151
08-Feb-2018   2,581     8,259                  7,195
15-Mar-2018   2,747     8,265                  11,500
03-Apr-2018   2,609     7,464                  7,921
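The predicted values in the table lie on a straight line, so the slope and intercept behind them can be approximated by fitting a line through the four S&P 500 and predicted-price pairs. The sketch below does this with ordinary least squares in plain Python. The coefficients it produces are an approximation recovered from the rounded table values, not the actual SPSS Statistics output, and the names `fit_line` and `predict_bitcoin` are hypothetical.

```python
# S&P 500 values and predicted Bitcoin prices copied from the table.
sp500 = [2695.0, 2581.0, 2747.0, 2609.0]
predicted = [10151.0, 7195.0, 11500.0, 7921.0]

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(sp500, predicted)

def predict_bitcoin(sp500_value):
    """Predicted Bitcoin price for a given S&P 500 value (approximate)."""
    return intercept + slope * sp500_value

print(f"slope = {slope:.2f}, intercept = {intercept:.0f}")
for x, p in zip(sp500, predicted):
    print(x, p, round(predict_bitcoin(x)))
```

The fit gives a slope of roughly 25.9 and an intercept of roughly -59,700, and it reproduces each predicted value in the table to within rounding. The negative intercept is a reminder that the line is only meaningful near the range of S&P 500 values in the data.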

After the analyses were completed, we uploaded our insights and remarks to the data.world project, as shown below.

Team collaboration with data.world


What is presented in this tutorial is just the beginning. There are plenty of open datasets on data.world to explore, and SPSS Statistics offers many algorithms and methods for deriving information from them. With data.world and SPSS Statistics together, you can manage data, perform analyses collaboratively, and share results to improve the accuracy and quality of the decision-making process.