I’m a big fan of CARTO, an online service for cartographic visualization and basic spatial analysis. At its core, CARTO is a hosted, managed PostgreSQL platform that relies heavily on that database’s PostGIS extension. On top of the database, CARTO has built a slick mapping platform, including a map editor, a Python module, and a front-end JavaScript mapping library. Today, I’ll show you how to use CARTO to map the outputs of your Jupyter notebook-based Python analytics work. You’ll be able to show spatial patterns quickly, with a presentation quality that will impress your boss and customers.

CARTO

First, go to CARTO.com and sign up for a new account (if you don’t already have one). After setup, you should find yourself at an empty maps screen that looks something like this:

blank map in CARTO

If you already have an account, this page is your maps dashboard at https://{yourusername}.CARTO.com/dashboard

Import zip code data

We’re going to map some zip code data, so let’s add US zip codes to our account. I’ve created a zip code data file that you can bring into your own workspace.

  1. Click this link: https://ibmanalytics.CARTO.com/u/ibm/tables/ibm.zipcodes/public
  2. On the lower right of the screen click the CREATE MAP button.

    IBM zip code map

    The zip code database will load in your CARTO account, and generate a blank zip code map.

  3. Name the map.
    Go to the top left of the screen and click Edit metadata and name it Zips Map.

Create an on/off column in your zip code table

Here we’ll add a new column that we’ll use to either show or hide the zip code on our map.

  1. On the upper left of the screen, click the left arrow next to the map name.
    arrow

  2. Switch to the datasets view.

    At the top of the screen, click the small arrow beside Maps and choose Your datasets.
    datasets menu

    You are now on a URL that looks like this:

    https://{yourusername}.CARTO.com/dashboard/datasets

  3. Click the data set called carto_query.

  4. On the top left, click the Edit metadata link. Name it zips and click save.

  5. On the bottom right of the screen, click the Add column icon.
    add column icon

  6. Name the new column showme, and set its data type to boolean (by pressing on the word ‘string’ below the column’s name).

  7. Set the display of all zip codes to false.

    On the right side of the screen, click on the SQL icon. Enter this SQL and click Apply query.

    UPDATE zips SET showme = FALSE
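
    The same statement can also be sent programmatically rather than through the editor. As a rough sketch, assuming CARTO’s SQL API endpoint at /api/v2/sql with q and api_key query parameters, the request URL could be built like this. The helper name build_sql_api_url and the placeholder credentials are hypothetical, and nothing is actually sent:

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_sql_api_url(username, api_key, sql):
    """Return a CARTO SQL API URL for one statement (illustrative sketch)."""
    base = "https://{0}.carto.com/api/v2/sql".format(username)
    # urlencode escapes spaces and '=' so the SQL survives as a query parameter
    return base + "?" + urlencode({"q": sql, "api_key": api_key})


# Placeholder credentials; substitute your own username and API key
url = build_sql_api_url("yourusername", "YOUR_API_KEY",
                        "UPDATE zips SET showme = FALSE")
print(url)
```

    Opening a URL like this would run the statement against your account, so treat the API key like a password.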

Tweak the map cartography

The last setup step before we can try some analysis is to tweak the map’s cartographic look so the styling better suits our use case. As is, if we run an analysis that returns only a few zip codes, you won’t be able to see them when looking at a map of the entire United States.

We’ll tweak CARTO’s CSS-based map styling language to say: when looking at a large land area, show selected zip codes (where showme=true) as large dots. But when zoomed in close on a city or region, shade in the actual boundaries of the zip code and don’t show dots.

To make this change:

  1. Return to your map by visiting:

    https://{yourusername}.CARTO.com/dashboard/maps

  2. Click on your Zips Map to open it.

  3. On the right side of the screen, click the CSS icon.

  4. Replace any text you see there with the following CartoCSS:

    /** simple visualization */
    Map{
      buffer-size:256;
    }
    
    #zips{
      marker-fill: #FF6600;
      marker-width: 24.0;
      marker-line-color: #FFFFFF;
      marker-line-width: 2.0;
      marker-line-opacity: 1;
    
      [zoom>=9] {
        polygon-fill: #FF6600;
        marker-width: 0;
        polygon-opacity: 0.7;
        line-color: #FFF;
        line-width: 0.5;
        line-opacity: 1;
      }
    }
    
  5. Press Apply style.

You now see a blank map, which makes sense, because the current value in showme is false for all rows in the zips table. We’ll change that by doing some analysis with Python’s Pandas module in a Jupyter notebook.

Copy the map URL

  1. In CARTO, on the upper left of your screen, click the left arrow to return to your dashboard.

    back to dashboard

  2. Switch to the maps view.

    At the top of the screen, click the small arrow beside Datasets and choose Your maps.

  3. Click Zips Map to open it.

  4. On the upper right of the screen, click the Publish button.

  5. Copy the CartoDB.js URL. You’ll need it in a minute.

    CARTOjs

Copy CARTO account credentials

The notebook you’re about to create also needs access to your CARTO account. Grab your credentials now and copy those too:

  1. On the upper right of the CARTO screen, click your account icon (a randomly generated image).
  2. From the menu that appears, select Your API Keys.
  3. Copy the API key that appears.
  4. Copy your username, which you’ll find in the URL of your CARTO account: https://{yourusername}.CARTO.com/

Set up Analytics Notebook

IBM’s Data Science Experience (DSX) includes Data Sets, a selection of open data sets that you can download and use any way you want. It’s easy to get an account and grab some data. You can also create a notebook to run some analysis online.

Get population data set

We’ll map highly populous zip codes: those that have over 100,000 residents. To do so, we’ll load an open data set containing population info.

  1. Sign in or create a trial account on DSX.
  2. At the top of the screen, click the Data Sets tab.
  3. At the top of the screen, in the Search box, type Demographic. Click the United States Demographic Measures: Population and Age data set.
  4. On the top right of the screen, click the Link button to get a new access key.
  5. Copy the URL that appears. You’ll use it in a minute to load data into your notebook.

Create Python notebook

Tip: If you don’t want to create a notebook and run the commands yourself, you can also just open the notebook in your browser and follow along: https://github.com/ibm-cds-labs/open-data/blob/master/samples/cartodb-notebook.ipynb

Create a new Python notebook in one of two ways:

  • online, hosted by IBM Data Science Experience
    1. Create a new project (or select an existing project).
      1. On the upper left of the DSX screen, click the hamburger menu and choose My Projects.
      2. On the upper right of the project list, click + create project.
      3. Complete fields and click Create.
    2. Add a new notebook (From URL) within the project.
      1. Click + add notebooks.
      2. Click From URL.
      3. Enter notebook name.
      4. Enter notebook URL: https://github.com/ibm-cds-labs/open-data/raw/master/samples/cartodb-notebook.ipynb
      5. Select your Spark Service.
      6. Click Create Notebook.
    3. If prompted, select a kernel for the notebook. The notebook should successfully import.

      When you use a notebook in DSX, you run a cell by selecting it and then clicking the Run Cell (▸ icon) button. If you don’t see the Run Cell button and Jupyter toolbar, go to the toolbar and click the Edit (pencil) icon.

  • or locally using Python and Jupyter
    1. Download this notebook from GitHub: https://github.com/ibm-cds-labs/open-data/raw/master/samples/cartodb-notebook.ipynb. (Copy the text and save as your own .ipynb file.)
    2. Install Python from Anaconda (a free distribution that includes the most common packages).

    3. Launch Jupyter.

      In Terminal, cd to the directory where you downloaded the notebook and type jupyter notebook. Jupyter launches in your browser.

This notebook uses Pandas to read a CSV file for US population by zip code, and selects zip codes where the population is greater than 100,000 persons.
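
As a minimal sketch of that selection step, the snippet below reads a tiny in-memory CSV in place of the real file. The rows and population figures are made up for illustration; only the column names (GEOID, B01001e1) come from the notebook cell shown later in this post.

```python
import io

import pandas as pd

# Made-up sample rows standing in for the real population CSV
csv_text = """GEOID,B01001e1
00601,18000
10025,95000
11368,110000
79936,112000
"""

# Read GEOID as a string so leading zeros in zip codes are preserved
pop_df = pd.read_csv(io.StringIO(csv_text),
                     usecols=["GEOID", "B01001e1"],
                     dtype={"GEOID": str})
pop_df.columns = ["GEOID", "POPULATION"]
pop_df = pop_df.set_index("GEOID")

# Keep only zip codes with more than 100,000 residents
big_zips = pop_df[pop_df["POPULATION"] > 100000]
print(big_zips.index.tolist())
```

Reading GEOID as a string matters because a numeric column would turn a code like 00601 into 601, which would no longer match the zip code in the CARTO table.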

To map the results, we use CARTO’s Python module to execute SQL statements that will change showme to true for those 8 zip codes. That automatically updates the map we created earlier in CARTO.
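
The notebook handles this step with CARTO’s Python module. Purely to illustrate the kind of SQL involved, here is a sketch that builds such an UPDATE statement from a list of zip codes. The helper build_showme_update and the column name zipcode are hypothetical; check your zips table for the column that actually holds the zip code.

```python
import re


def build_showme_update(zip_codes, table="zips", zip_column="zipcode"):
    """Build one UPDATE statement that switches showme on for the given zips."""
    if not zip_codes:
        raise ValueError("need at least one zip code")
    for z in zip_codes:
        # Accept only 5-digit strings so nothing unexpected ends up in the SQL
        if not (isinstance(z, str) and re.fullmatch(r"[0-9]{5}", z)):
            raise ValueError("not a 5-digit zip code: %r" % (z,))
    quoted = ", ".join("'%s'" % z for z in zip_codes)
    return "UPDATE %s SET showme = TRUE WHERE %s IN (%s)" % (
        table, zip_column, quoted)


sql = build_showme_update(["11368", "79936"])
print(sql)
```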

Add credentials

Your notebook needs access to your data, so edit the following entries in your new notebook.

  1. In cell 2, replace os.environ['AE_KEY_AGE'] with the URL of your Population and Age data set. The cell then reads something like:
    pop_df = pd.read_csv("https://console.ng.bluemix.net/data/exchange-api/v1/entries/beb8c30a3f559e58716d983671b65c10/data?accessKey=2dbb61f9aed0ecb65316b1ecadfb6ebb", usecols=['GEOID','B01001e1'], dtype={"GEOID": str} )
    pop_df.columns = ['GEOID','POPULATION']
    pop_df = pop_df.set_index('GEOID')
    pop_df.sample(10)
    
  2. In cell 5, replace placeholder CARTO credentials with your CARTO API key and username.

    carto creds

  3. In the last cell, replace the map URL with the CartoDB.js URL you copied a few minutes ago.

    sample URL:

    https://{yourusername}.CARTO.com/api/v2/viz/{yourmapid}/viz.json
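
If you would rather not paste credentials directly into notebook cells, one option is to read them from environment variables. This is only a sketch: the variable names CARTO_USERNAME, CARTO_API_KEY, and CARTO_MAP_ID and the helper viz_json_url are hypothetical, and the URL pattern simply follows the sample URL above.

```python
import os


def viz_json_url(username, map_id):
    """Fill the sample viz.json URL pattern with a username and map id."""
    return "https://{0}.carto.com/api/v2/viz/{1}/viz.json".format(
        username, map_id)


# Hypothetical variable names; the placeholders are used when they are unset
carto_user = os.environ.get("CARTO_USERNAME", "yourusername")
carto_key = os.environ.get("CARTO_API_KEY", "YOUR_API_KEY")
map_id = os.environ.get("CARTO_MAP_ID", "yourmapid")

print(viz_json_url(carto_user, map_id))
```

Keeping the key out of the notebook file makes it safer to share the notebook later.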

Run the notebook

Run all cells/commands in the notebook, in order. Comments in the notebook and its code explain what each cell does.

Tip: If you plan to play with some data and run through this notebook again, comment out the pip install line in cell 5 on subsequent runs. To do so, insert a # character at the start of that one line. You only need to install on your first run through the notebook.

The map

All the cells build to generate the map in the final cell. That last cell uses JavaScript “magic” to embed the map in the notebook, which draws from the map URL you just entered.

Can’t see the map? At the time of writing, CARTO is using a certificate authority that many browsers don’t recognize. If you can’t see the map, the browser is blocking access to CARTO web resources. To work around this issue, manually set https://libs.cartocdn.com/ as a trusted site in your browser. In Google Chrome, do so by visiting https://libs.cartocdn.com/CARTO.js/v3/3.15/themes/css/CARTO.css. When Chrome tells you that isn’t safe, click Advanced, trust the URL, and load it.

You can see my map here:

https://rajrsingh.carto.com/viz/3e4b46a4-3ed3-11e6-bbbe-0e3a376473ab/public_map

Notice that when you look at the whole country, it seems like there are only 4 locations that meet the criteria (population > 100,000), but this is because we’re using large dots to mark locations at this scale. Zoom in on New York City or Los Angeles and you see multiple zip code boundaries highlighted (NYC has 2, LA has 4, Chicago has 1, and El Paso has 1).

map zoomed to NY

The takeaway

This was an extremely simple analysis, but it illustrates a powerful concept. As long as you can get the data into a Python object, you can map it with CARTO. Our data came from a CSV file, but it could have come from a relational or a NoSQL database. And we didn’t even have to use Pandas; this could have been a Spark-based analysis. The only real limitation is that you don’t want to send huge SQL statements over the Internet. But huge is a relative term. Since the SQL update commands need to happen only once (the map persists until you want to update it), it’s probably acceptable to run multi-megabyte SQL statements that take many minutes to execute. So this is an incredibly flexible way to mix and match best-of-breed cloud services for your work.
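
If a single statement does get too large, one way to keep each request manageable is to split the zip codes into batches and send one UPDATE per batch. The sketch below only builds the statements; the chunked helper, the batch size, the zip codes, and the zipcode column name are all made up for illustration.

```python
def chunked(items, size):
    """Yield consecutive slices of items, each holding at most size elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Ten made-up zip codes, split into batches of four
zip_codes = ["%05d" % n for n in range(10001, 10011)]
statements = [
    "UPDATE zips SET showme = TRUE WHERE zipcode IN (%s)"
    % ", ".join("'%s'" % z for z in batch)
    for batch in chunked(zip_codes, 4)
]
print(len(statements))
```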
