Tutorial
By Deepak Rangarao, Amod Upadhye | Published May 7, 2018
Cryptocurrencies have gained immense popularity in recent years, with Bitcoin being the first and most popular peer-to-peer cryptocurrency. Investors see Bitcoin as the next “gold” (a safe asset for the future) to invest in because only a limited number of Bitcoins can exist. However, the past 12 months have seen both all-time highs and rapid fluctuations for the major indices (DJIA, S&P 500, etc.), gold, and Bitcoin. Using one of the latest IBM SPSS Statistics integrations, we thought it would be interesting to do some basic analysis of factors that affect the price of Bitcoin.
We considered three stock market indices: the Dow Jones Industrial Average (DJIA), the Standard & Poor's 500 Index (S&P 500), and the Nasdaq Composite Index. For this how-to, the S&P 500 index is used to determine whether Bitcoin prices are affected by the performance of the index over a period of time. While the DJIA is one of the oldest, best-known, and most frequently used indices, it represents only about a quarter of the value of the entire US stock market. The S&P 500 Index is a larger and more diverse index, made up of 500 of the most widely traded stocks in the US. Therefore, the S&P 500 is a good indicator of US market sentiment.
To help multiple analysts collaborate by contributing data, analysis, observations, and suggestions for this how-to, data.world and IBM SPSS Statistics were used. data.world is an open, secure and social data collaboration platform and is used in conjunction with IBM SPSS Statistics, a leading statistical platform to quickly and easily find insights in data. We leveraged a combination of stock/quote data with Bitcoin data hosted in a data.world project (sourced from bitcoincharts.com and uploaded as a data.world dataset), then collaborated with the team to identify areas of interest and deliver insights on how the performance of S&P 500 (sourced from Investing.com) affects the Bitcoin price.
Learn how to use SPSS to do the following:
It appears that my colleague Deepak wants to do some research on how the performance of S&P 500 affects the Bitcoin price.
When the data was uploaded to data.world, it was profiled automatically, which gave a cursory view of the data quality and of outliers in the dataset. This data profiling is shown below.
The IBM SPSS Statistics extension for data.world is used to import data into IBM SPSS Statistics and conduct the analysis. You can get the data.world extensions from the IBM SPSS Predictive Analytics hub as shown below.
To access data.world datasets from within IBM SPSS Statistics, click File > Import Data > Import from Data.World.
Importing data from data.world requires the following information on the dialog box:
An example of the ‘Import Configuration’ dialog box is shown below.
Now that the required data is available for analysis in SPSS Statistics, you can run a quick Pearson correlation analysis (Analyze > Correlate > Bivariate). The results are:
Notice in the figure above that the value for Pearson Correlation between the Bitcoin Closing Price and the S&P Price is 0.883. This value is close to 1, so there appears to be a strong positive correlation between the S&P 500 index and Bitcoin prices.
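For readers who want to see what the Pearson coefficient measures, here is a minimal Python sketch of the calculation. It is not part of the SPSS workflow described in this tutorial; the function name `pearson_r` and the sample values are made up for illustration and are not the tutorial's data.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative values, NOT the data used in this tutorial.
sp500_close = [2400.0, 2450.0, 2500.0, 2600.0, 2700.0]
bitcoin_close = [2500.0, 4200.0, 4000.0, 9000.0, 11000.0]

r = pearson_r(sp500_close, bitcoin_close)
print(f"Pearson r = {r:.3f}")
```

A value near 1 means the two series tend to rise and fall together along a straight line; it does not by itself show that one drives the other.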
Data Science is a team sport and the team members need to share data, findings and visualizations securely. The figure below shows how we used the data.world collaboration feature to collaborate with the team and determine the next best action.
To further quantify our observations, we thought it would be a good idea to build a simple regression model in SPSS (Analyze > Regression > Linear) that can help predict Bitcoin prices going forward.
The R² value for the Linear Regression was 0.779. This implies that about 78 percent of the variation in the Bitcoin price can be explained by the variation in the S&P 500 index value. The results of the regression are shown below.
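With a single predictor, R² is simply the square of the Pearson correlation: 0.883 squared is roughly 0.78, which matches the reported 0.779 up to rounding. The sketch below shows, under stated assumptions, how a one-variable least-squares fit and its R² can be computed by hand. The helper names `fit_line` and `r_squared` are hypothetical, and the sample values are made up rather than taken from the tutorial's dataset.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(xs, ys, intercept, slope):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Made-up illustrative values, NOT the data used in this tutorial.
sp500_close = [2400.0, 2450.0, 2500.0, 2600.0, 2700.0]
bitcoin_close = [2500.0, 4200.0, 4000.0, 9000.0, 11000.0]

intercept, slope = fit_line(sp500_close, bitcoin_close)
r2 = r_squared(sp500_close, bitcoin_close, intercept, slope)
print(f"intercept = {intercept:.1f}, slope = {slope:.2f}, R^2 = {r2:.3f}")
```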
The figure below shows the Scatter Plot (Graphs > Legacy Dialogs > Scatter/Dot > Simple Scatter) we created to graphically present the relationship between Bitcoin close price and the S&P index.
Using the coefficients of the regression model, we came up with an equation for predicting the Bitcoin Price using the S&P index value.
Bitcoin_Predicted_Price = -5.97E4 + 25.93 * S&P500IndexValue
Now we can use the coefficients from the IBM SPSS Statistics Linear Regression model to predict the Bitcoin price for different values of the S&P 500, as shown in the table below.
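As a rough illustration, the regression equation can be applied directly in a few lines of Python. The coefficients are the rounded values quoted in the equation, so the outputs are approximate; the function name and the three index values are hypothetical and chosen only for the example.

```python
# Rounded coefficients as quoted in the regression equation.
INTERCEPT = -5.97e4
SLOPE = 25.93

def predict_bitcoin_price(sp500_index_value):
    """Apply Bitcoin_Predicted_Price = -5.97E4 + 25.93 * S&P500IndexValue."""
    return INTERCEPT + SLOPE * sp500_index_value

# Hypothetical index values chosen only to illustrate the calculation.
for index_value in (2600, 2700, 2800):
    predicted = predict_bitcoin_price(index_value)
    print(f"S&P 500 = {index_value}: predicted Bitcoin price = {predicted:,.2f}")
```

Because the intercept is large and negative, the line predicts a negative price for index values below about 2,302 (59,700 / 25.93), so the equation is only meaningful close to the range of index values the model was fitted on.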
After the analyses were completed, we uploaded our insights and remarks to the data.world project, as shown below.
What is presented in this tutorial is just the beginning. There are plenty of open datasets on data.world to explore, and SPSS Statistics offers many algorithms and methods for deriving information from them. With data.world and SPSS Statistics together, you can manage data, perform analyses collaboratively, and share results to improve the accuracy and quality of decision-making.