Cryptocurrencies have gained immense popularity in recent years, with Bitcoin being the first and most popular peer-to-peer cryptocurrency. Investors see Bitcoin as the next “gold” (a safe asset for the future) to invest in because there is only a limited number of Bitcoins. However, the past 12 months have seen both all-time highs and rapid fluctuations for the major stock indices (DJIA, S&P 500, and others), gold, and Bitcoin. Using one of the latest IBM SPSS Statistics integrations, we thought it would be interesting to do some basic analysis of factors that affect the price of Bitcoin.

We considered three different stock market indices: the Dow Jones Industrial Average (DJIA), the Standard & Poor's 500 Index (S&P 500), and the Nasdaq Composite Index. For this how-to, the S&P 500 is used to determine whether Bitcoin prices are affected by the performance of the index over a period of time. While the DJIA is one of the oldest, best-known, and most frequently used indices, it represents only about a quarter of the value of the entire US stock market. The S&P 500 is a larger and more diverse index, made up of 500 of the most widely traded stocks in the US. Therefore, the S&P 500 is a good indicator of US market sentiment.

We used data.world and IBM SPSS Statistics so that multiple analysts could collaborate by contributing data, analysis, observations, and suggestions for this how-to. data.world is an open, secure, and social data collaboration platform. Here it is used in conjunction with IBM SPSS Statistics, a leading statistical platform for quickly and easily finding insights in data. We combined stock quote data with Bitcoin data hosted in a data.world project (sourced from bitcoincharts.com and uploaded as a data.world dataset), then collaborated with the team to identify areas of interest and deliver insights on how the performance of the S&P 500 (sourced from Investing.com) affects the Bitcoin price.

Learning objectives

Learn how to use SPSS to do the following:

  • Collaborate with data.world
  • Upload data as a data.world dataset
  • Import data to IBM SPSS Statistics
  • Analyze the data
  • Build a regression model
  • Make a prediction

Prerequisites

  • SPSS Statistics
  • data.world

Estimated time

  • About an hour

Steps

Collaborate with the team using data.world

It appears that my colleague Deepak wants to do some research on how the performance of S&P 500 affects the Bitcoin price.

Data profiling with data.world

Automatically Profile Data

When the data was uploaded to data.world, it was profiled automatically, which gave a cursory view of the data quality and outliers in the dataset. This data profiling is shown below.

Data profiling with data.world

Get the data.world extension

The IBM SPSS Statistics extension for data.world is used to import data into IBM SPSS Statistics and conduct the analysis. You can get the data.world extension from the IBM SPSS Predictive Analytics hub as shown below.

IBM SPSS Statistics Predictive Analytics extensions

Work with the data

To access data.world datasets from within IBM SPSS Statistics, click File > Import Data > Import from Data.World.

IBM SPSS Statistics extension for data.world

Importing data from data.world requires the following information in the dialog box:

  • data.world authentication token. (You can get this token from the data.world settings page.)
  • data.world dataset URL. (Example: amodu/bitcoin-price-analysis)
  • data.world SQL SELECT statement to access the data. (Example: SELECT b.close, s.price from bitcoin_data b inner join standard_and_poor500 s on cast(b.timestamp as DATE) = s.date)
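
To make the example SELECT statement concrete, here is a minimal Python sketch of what that inner join does: each Bitcoin row is matched to the S&P 500 row that has the same calendar date, and Bitcoin rows without a matching quote are dropped. The function name and the sample rows are hypothetical and only illustrate the logic. The sketch also assumes one S&P 500 row per date. It is not how data.world or SPSS Statistics runs the query.

```python
from datetime import date, datetime


def inner_join_on_date(bitcoin_rows, sp500_rows):
    """Pair each Bitcoin row with the S&P 500 row for the same calendar date."""
    # Assumption: the S&P 500 table has at most one row per date.
    sp_by_date = {row["date"]: row["price"] for row in sp500_rows}
    joined = []
    for row in bitcoin_rows:
        day = row["timestamp"].date()  # same idea as CAST(b.timestamp AS DATE)
        if day in sp_by_date:  # inner join: skip days with no S&P 500 quote
            joined.append({"close": row["close"], "price": sp_by_date[day]})
    return joined


# Made-up illustrative rows, not the tutorial's dataset.
bitcoin_rows = [
    {"timestamp": datetime(2018, 1, 2), "close": 100.0},
    {"timestamp": datetime(2018, 1, 6), "close": 110.0},  # a Saturday, no quote
    {"timestamp": datetime(2018, 1, 8), "close": 120.0},
]
sp500_rows = [
    {"date": date(2018, 1, 2), "price": 10.0},
    {"date": date(2018, 1, 8), "price": 12.0},
]

pairs = inner_join_on_date(bitcoin_rows, sp500_rows)
for pair in pairs:
    print(pair)
```

Because Bitcoin trades every day and the stock market does not, an inner join like this keeps only the days on which both series have a value.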

An example of the ‘Import Configuration’ dialog box is shown below.

data.world data import configuration

Analyze the data

Now that the required data is available in SPSS Statistics, you can run a quick Pearson correlation analysis (Analyze > Correlate > Bivariate). The results are:

IBM SPSS Statistics Pearson Correlation

Notice in the figure above that the Pearson correlation between the Bitcoin closing price and the S&P 500 price is 0.883. This value is close to 1, which indicates a strong positive correlation between the S&P 500 index and Bitcoin prices.
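
For readers who want to see what the Pearson coefficient measures, the following is a minimal Python sketch of the standard formula (the covariance of the two series divided by the product of their spreads). The function name and the sample values are hypothetical. The tutorial's 0.883 comes from SPSS Statistics, not from this code.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Note: a constant series makes sxx or syy zero, so r is undefined there.
    return sxy / math.sqrt(sxx * syy)


# Made-up values: ys rises with xs, ys_down falls as xs rises.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
ys_down = [8.0, 6.0, 4.0, 2.0]

print(pearson_r(xs, ys))
print(pearson_r(xs, ys_down))
```

A value near 1 means the two series tend to rise together, a value near -1 means one falls as the other rises, and a value near 0 means there is no linear relationship.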

Data Science is a team sport and the team members need to share data, findings and visualizations securely. The figure below shows how we used the data.world collaboration feature to collaborate with the team and determine the next best action.

data.world team collaboration

Build a regression model

To further quantify our observations, we thought it would be a good idea to build a simple regression model in SPSS (Analyze > Regression > Linear) that can help predict Bitcoin prices going forward.

The R² value for the linear regression was 0.779. This implies that about 78 percent of the variation in the Bitcoin price can be explained by the variation in the S&P 500 index value. The results of the regression are shown below.
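
With a single predictor, the R² of a least-squares line equals the square of the Pearson correlation, which is why the two reported figures agree: 0.883 squared is about 0.78. The Python sketch below fits a line by ordinary least squares and computes R² from the residuals. The function name and the sample data are hypothetical, and the code only illustrates the calculation. It does not reproduce the SPSS Statistics output.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by least squares; also return R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared = 1 - (unexplained variation / total variation).
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Made-up points scattered around a rising line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

slope, intercept, r_squared = fit_line(xs, ys)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r_squared:.3f}")

# Squaring the tutorial's reported correlation gives roughly the reported R^2.
print(f"0.883 squared = {0.883 ** 2:.3f}")
```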

IBM SPSS Statistics Linear Regression Summary

The figure below shows the Scatter Plot (Graphs > Legacy Dialogs > Scatter/Dot > Simple Scatter) we created to graphically present the relationship between Bitcoin close price and the S&P index.

IBM SPSS Statistics Scatter Plot

Using the coefficients of the regression model, we came up with an equation for predicting the Bitcoin Price using the S&P index value.

Bitcoin_Predicted_Price = -5.97E4 + 25.93 * S&P500IndexValue

Now, we can leverage the coefficients from the IBM SPSS Statistics Linear Regression model to predict the Bitcoin price based on the different values of S&P500 as shown in the table below.

Date          S&P 500   Actual Bitcoin price   Predicted Bitcoin price
02-Jan-2018   2,695     14,678                 10,151
08-Feb-2018   2,581     8,259                  7,195
15-Mar-2018   2,747     8,265                  11,500
03-Apr-2018   2,609     7,464                  7,921
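
The prediction step can be sketched in Python by plugging the S&P 500 values from the table into the equation above. The function and variable names are hypothetical. One caveat: the equation shows rounded coefficients, so this sketch gives predictions roughly 30 higher than the table's predicted column. A likely reason (an assumption, not confirmed by the source) is that the table was computed with the unrounded intercept.

```python
# Rounded coefficients as printed in the regression equation.
INTERCEPT = -5.97e4
SLOPE = 25.93


def predict_bitcoin_price(sp500_value):
    """Apply the fitted line to one S&P 500 index value."""
    return INTERCEPT + SLOPE * sp500_value


# (date, S&P 500, actual Bitcoin price, predicted price published in the table)
table_rows = [
    ("02-Jan-2018", 2695, 14678, 10151),
    ("08-Feb-2018", 2581, 8259, 7195),
    ("15-Mar-2018", 2747, 8265, 11500),
    ("03-Apr-2018", 2609, 7464, 7921),
]

for day, sp500, actual, published in table_rows:
    predicted = predict_bitcoin_price(sp500)
    error = actual - predicted  # positive when the model under-predicts
    print(f"{day}: predicted={predicted:,.0f} published={published:,} "
          f"actual={actual:,} error={error:,.0f}")
```

Comparing the actual and predicted columns row by row shows how far a single-variable model can miss on any given day, even when the overall fit looks strong.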

After the analyses were completed, we uploaded our insights and remarks to the data.world project, as shown below.

Team collaboration with data.world

Summary

What is presented in this tutorial is just the beginning. There are plenty of open datasets on data.world to explore, and SPSS Statistics offers many algorithms and methods for deriving information from them. Using data.world and SPSS Statistics together, you can manage data, perform analyses collaboratively, and share results to improve the accuracy and quality of decision-making.