The "Guide to machine learning using cancer data" notebook explains how to apply machine learning to a classification problem on tumor data. It develops a solution in three parts: an intuitive introduction to supervised learning concepts, a basic example of a machine learning model, and a deep dive into model stacking and parameter tuning, two techniques commonly used in practice to improve predictive accuracy.

If you are interested in running this notebook in your own Watson Studio environment, you can get the notebook from GitHub.

5 comments on "Introduction to Machine Learning: Predict Cancer Diagnosis"

  1. Einar Karlsen August 07, 2018

    Hello,
    the notebook uses a data file called ‘cancer_data.csv’ and provides a link to a webpage where the data can be downloaded. However, there is no file with that name in the data set. What data should be used, and what are the proper instructions for accessing it inside the notebook?
    Thanks

    • You can find a link to the data in the “1.2 Defining the task” section of the notebook, and the notebook can be accessed using either of the links above.

  2. Einar Karlsen October 07, 2018

    The link takes me to the UCI “Breast Cancer Wisconsin (Diagnostic) Data Set”. The page contains several files, none of them named ‘cancer_data.csv’. I have used “wdbc.data”, renamed it and uploaded it to COS. However, this file does not contain a header line, whereas ‘cancer_data.csv’ supposedly does, since the notebook prints one out. I can of course add this, but it would be a bit easier to get started if ‘cancer_data.csv’ were available on GitHub together with the notebook.
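
    The workaround described in this comment (adding a header line to “wdbc.data”) can be sketched roughly as follows. This is only an illustration: the column names below are assumptions derived from the feature list in the UCI documentation (an ID, the diagnosis, then the mean, standard error, and worst value of ten measurements), and the header the notebook actually expects may be spelled differently. The function name `add_header` is hypothetical.

```python
import csv

# Ten measurements per cell nucleus, as listed in the UCI wdbc.names file.
FEATURES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave_points", "symmetry",
    "fractal_dimension",
]

# Assumed header: id, diagnosis, then mean / standard error / worst value
# of each feature. The names the notebook expects may be spelled differently.
HEADER = ["id", "diagnosis"] + [
    f"{name}_{stat}" for stat in ("mean", "se", "worst") for name in FEATURES
]


def add_header(src_path="wdbc.data", dst_path="cancer_data.csv"):
    """Copy the headerless UCI file to a CSV that starts with HEADER."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(HEADER)
        for row in csv.reader(src):
            if not row:
                continue  # skip blank lines
            if len(row) != len(HEADER):
                raise ValueError(f"expected {len(HEADER)} fields, got {len(row)}")
            writer.writerow(row)
```

    Calling `add_header()` in the directory that holds “wdbc.data” would write a ‘cancer_data.csv’ next to it. If the notebook refers to columns by name, the header may need adjusting to match.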

  3. Einar Karlsen October 07, 2018

    FYI: The notebook may need the following install command for pydotplus in section 2.1:

    !pip install pydotplus
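
    One way to find out whether that install step is needed is to check for the package before section 2.1 runs. The snippet below is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of the notebook.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None


if not is_installed("pydotplus"):
    # In a notebook cell, the install command from the comment above applies.
    print("pydotplus is missing; run: !pip install pydotplus")
```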

  4. Einar Karlsen October 07, 2018

    The notebook seems to be written for Python 2, where print statements could be written without parentheses. To get it to work with Python 3 (used by Watson Studio), simply add the parentheses wherever a syntax error is reported.
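
    The change this comment describes looks roughly like the following. The variable and its value are placeholders for illustration, not output from the notebook.

```python
# Python 2 only (a SyntaxError under Python 3):
#     print "Accuracy: 0.95"

accuracy = 0.95  # placeholder value for illustration, not a notebook result
message = "Accuracy: {:.2f}".format(accuracy)

# Calling print as a function with a single argument works on Python 3
# and also prints the same text on Python 2.
print(message)
```

    With several comma-separated arguments, `print(a, b)` prints a tuple under Python 2 unless `from __future__ import print_function` appears at the top of the file, so formatting the text into one string first keeps the output the same on both versions.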
