Tutorial

Reading RSS feeds and XML with GraphQL

Convert any RSS feed to GraphQL

By Roy Derks

RSS feeds have been around since the 1990s as a popular way to get updates from a website. Many websites still offer one, and today RSS is becoming the primary way podcasts distribute content across streaming platforms. Unlike most modern APIs, RSS feeds return their data in XML rather than JSON. In this tutorial, we'll look at how to read RSS feeds (or any XML) with GraphQL.

You can find the complete code for this tutorial in our GitHub examples repository, and you can watch the video walkthrough on our YouTube channel.


Notice: StepZen was acquired by IBM in 2023 and renamed to API Connect Essentials. Some artifacts still use the StepZen name.

Exploring an RSS feed

What is an RSS feed exactly? RSS stands for "Really Simple Syndication" and is a format used to deliver frequently updated content, usually news headlines. It's an XML-based format that allows you to subscribe to a website's feed and receive updates whenever there are new posts or podcasts. The RSS feed is a standard way for a website to keep its content in sync with readers' devices.

Lots of news outlets and online publishing platforms have an RSS feed, and most of them are public. This is also the case for Medium, one of the biggest online publishing platforms on the internet. Medium has a REST API, but it requires you to request an API key, so the RSS feed is the easiest way to get updates from Medium. It's simple and fast, and it updates whenever a new post is published on your favorite Medium publication. Every Medium user has their own RSS feed, which looks like this:

https://medium.com/feed/@username

It contains information about the owner of the feed, the author in this case, and all the posts this author has posted to Medium. The documentation for Medium RSS can be found here, which also includes information about getting RSS feeds for Medium publications or tags.

If you don't have a Medium account, you can open the following RSS feed in your browser: https://medium.com/feed/@roy-derks. This feed contains all the blog posts I've posted to Medium and looks like this when you open it:

Medium RSS feed displayed in Chrome

The response is in XML, and the posts can be found in the item elements between the channel opening and closing tags. Each item element contains the post's title, author, categories, URL, and content in HTML. This feed can be read through the browser, an RSS feed reader, or an HTTP request. But in this tutorial, we'll be using GraphQL instead!
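To make the channel and item structure concrete, here is a minimal sketch that reads a feed with Python's standard library. The sample feed is a hypothetical, trimmed-down stand-in for a real Medium feed, and the helper name list_posts is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down feed; a real Medium feed has more elements per item.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Stories by Example Author on Medium</title>
    <item>
      <title>First post</title>
      <link>https://medium.com/@example/first-post</link>
      <category>graphql</category>
      <category>rss</category>
    </item>
    <item>
      <title>Second post</title>
      <link>https://medium.com/@example/second-post</link>
      <category>xml</category>
    </item>
  </channel>
</rss>"""


def list_posts(feed_xml: str) -> list:
    """Return one dict per <item> element found inside <channel>."""
    root = ET.fromstring(feed_xml)
    posts = []
    for item in root.findall("./channel/item"):
        posts.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            # An item can carry several <category> elements.
            "categories": [c.text for c in item.findall("category")],
        })
    return posts


posts = list_posts(SAMPLE_FEED)
for post in posts:
    print(post["title"], post["categories"])
```

Reading the feed this way means writing and maintaining the parsing code yourself, which is the work the GraphQL layer in the rest of this tutorial takes over.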

Connecting to an RSS feed

We'll be using IBM API Connect Essentials (formerly StepZen) to make it possible to connect to the Medium RSS feed using GraphQL. With API Connect Essentials, you can create a GraphQL API for all your existing data sources, including REST or SOAP APIs. An RSS feed is very similar to a SOAP API, as both return a response in XML.

To create a new GraphQL API with API Connect Essentials, you can use the CLI to import a data source or use GraphQL SDL to configure a GraphQL schema. We'll take the second approach, which means you need to create a new directory on your machine and place two files in it: index.graphql and rss.graphql. The first is a configuration file that links to rss.graphql, which contains the connection to the RSS feed.

  • index.graphql

      schema @sdl(files: ["rss.graphql"]) {
        query: Query
      }
    
  • rss.graphql

      type Query {
        getPosts(username: String!): JSON
          @rest(
            endpoint: "https://medium.com/feed/$username"
            headers: [
              { name: "Content-Type", value: "text/xml" }
            ]
            transforms: [{ pathpattern: "[]", editor: "xml2json" }]
          )
      }
    

To create a GraphQL API based on this schema, you must have the API Connect Essentials CLI installed. After installing the CLI and creating an account, you run this command from your terminal or command line:

stepzen deploy

The CLI asks you what you want to call the endpoint of the GraphQL API (in our example, it is api/with-rss). Then, it deploys the GraphQL API to a private endpoint protected with your API Connect Essentials API key.

In the next section, we'll use the GraphQL API to read from the Medium RSS feed.

Querying an RSS feed with GraphQL

We've already created the GraphQL schema and deployed it in the previous section. After running stepzen deploy, the GraphQL API is available at the endpoint that is displayed in your terminal or command line. It looks something like this:

https://YOUR_USERNAME.stepzen.net/api/with-rss/graphql

We'll be using the HTTPS endpoint to explore the GraphQL API using the API Connect Essentials dashboard. In the Explorer, you can paste the following query to get the information from the Medium RSS feed:

query {
  getPosts(username: "@roy-derks")
}

You can change the username value and add your own Medium username.
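Outside the Explorer, the same query can be sent as a plain HTTP POST. The sketch below only builds the request so you can inspect it before sending. The endpoint and API key are placeholders, and the "apikey" Authorization header format is an assumption you should check against the API Connect Essentials documentation. The function names build_request and fetch_posts are made up for this example.

```python
import json
import urllib.request

# Placeholders: use the endpoint printed by `stepzen deploy` and your own key.
ENDPOINT = "https://YOUR_USERNAME.stepzen.net/api/with-rss/graphql"
API_KEY = "YOUR_API_KEY"

QUERY = """
query GetPosts($username: String!) {
  getPosts(username: $username)
}
"""


def build_request(username: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the getPosts query."""
    payload = json.dumps({"query": QUERY, "variables": {"username": username}})
    return urllib.request.Request(
        ENDPOINT,
        data=payload.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed header format for key-protected endpoints.
            "Authorization": f"apikey {API_KEY}",
        },
        method="POST",
    )


def fetch_posts(username: str) -> dict:
    """Send the request and return the decoded GraphQL response."""
    with urllib.request.urlopen(build_request(username)) as response:
        return json.load(response)


request = build_request("@roy-derks")
print(request.get_method(), request.full_url)
```

Calling fetch_posts("@roy-derks") performs the actual network call once the placeholders are replaced with real values.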

In the Explorer, you'll see a result similar to this:

Medium RSS feed displayed in GraphiQL

The response is in JSON instead of the XML response the RSS feed returned earlier. In the GraphQL schema, we've configured the @rest custom directive to transform XML into JSON. This configuration is done within transforms: [{ pathpattern: "[]", editor: "xml2json" }], where xml2json is used for the transformation.
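To get a feel for what an XML-to-JSON transformation involves, here is a simplified sketch in Python. It is not the implementation behind the xml2json editor: it ignores attributes and namespaces, and the function names are made up for this illustration.

```python
import json
import xml.etree.ElementTree as ET


def element_to_value(element):
    """Convert an element to a string (leaf) or a dict of its children."""
    children = list(element)
    if not children:
        return element.text or ""
    result = {}
    for child in children:
        value = element_to_value(child)
        if child.tag in result:
            # Repeated tags (such as several <item> elements) become a list.
            existing = result[child.tag]
            if not isinstance(existing, list):
                result[child.tag] = [existing]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def xml_to_json(xml_text: str) -> str:
    """Serialize the XML document as a JSON string keyed by the root tag."""
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: element_to_value(root)}, indent=2)


sample = (
    "<rss><channel><title>Demo</title>"
    "<item><title>A</title></item><item><title>B</title></item>"
    "</channel></rss>"
)
converted = json.loads(xml_to_json(sample))
print(converted["rss"]["channel"]["item"])
```

The key design point is that repeated sibling elements collapse into a JSON array, which is why the posts show up as a list under item in the response.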

To dynamically select the fields the GraphQL API returns, we need to set a custom response type for the getPosts query. In the final section, we'll use a tool called JSON to GraphQL to generate the response type.

Creating a custom response type

The query getPosts is now returning the response of the RSS feed with the response type JSON. We want to create a custom response type, so we can dynamically select the returned fields. First, we'll limit the fields that are returned as we only need the author information and posts located in the channel element of the RSS feed.

In the @rest configuration, you can add a resultroot field to just return the channel information:

type Query {
  getPosts(username: String!): JSON
    @rest(
      endpoint: "https://medium.com/feed/$username"
      headers: [
        { name: "Content-Type", value: "text/xml" }
      ]
      transforms: [{ pathpattern: "[]", editor: "xml2json" }]
      resultroot: "rss.channel"
    )
}
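Conceptually, resultroot walks a dot-separated path into the converted response and returns what it finds there. A minimal sketch of that idea, using a hypothetical helper select_root and made-up sample data rather than the actual implementation:

```python
def select_root(document: dict, path: str):
    """Follow a dot-separated path such as "rss.channel" into nested dicts."""
    current = document
    for key in path.split("."):
        current = current[key]
    return current


# Made-up stand-in for the JSON produced from the feed.
converted = {
    "rss": {
        "version": "2.0",
        "channel": {"title": "Demo feed", "item": [{"title": "A"}]},
    }
}
channel = select_root(converted, "rss.channel")
print(channel["title"])
```

Everything outside the selected path, such as the version value on the rss element in this sample, is dropped from the result.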

When you run the query from the previous section again, you can see that the GraphQL API returns only the channel information from the Medium RSS feed. We can use the response of the GraphQL API to create the custom response type for the getPosts query.

To convert the JSON to GraphQL, you can use the JSON to GraphQL tool: copy-paste the complete JSON response into the left side of the page, and the GraphQL types for the response are generated on the right side:

Convert JSON types to GraphQL SDL

You can copy the generated GraphQL types into the top of your rss.graphql file.

type Image {
  link: String
  title: String
  url: String
}

type OneItem {
  category: [String]
  creator: String
  encoded: String
  guid: JSON
  link: String
  pubDate: String
  title: String
  updated: DateTime
}

type GetPosts {
  description: String
  generator: String
  image: Image
  item: [OneItem]
  lastBuildDate: String
  link: JSON
  title: String
  webMaster: String
}
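If you're curious how types like these can be derived from a JSON sample, the sketch below shows the basic idea in Python. It is a deliberate simplification and not the algorithm the JSON to GraphQL tool uses: nested objects, empty lists, and nulls all fall back to the JSON scalar, and the sample item is made up.

```python
def graphql_type(value) -> str:
    """Map a sample JSON value to a GraphQL type name (simplified)."""
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "Boolean"
    if isinstance(value, int):
        return "Int"
    if isinstance(value, float):
        return "Float"
    if isinstance(value, str):
        return "String"
    if isinstance(value, list) and value:
        # Assume every entry has the same type as the first one.
        return f"[{graphql_type(value[0])}]"
    # Nested objects, empty lists, and nulls fall back to the JSON scalar here.
    return "JSON"


def generate_type(name: str, sample: dict) -> str:
    """Render a GraphQL object type with one field per key in the sample."""
    lines = [f"type {name} {{"]
    for field in sorted(sample):
        lines.append(f"  {field}: {graphql_type(sample[field])}")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical sample; a real item from the feed has more fields.
sample_item = {
    "title": "First post",
    "category": ["graphql", "rss"],
    "guid": {"isPermaLink": False},
}
sdl = generate_type("OneItem", sample_item)
print(sdl)
```

Because the types are inferred from a single sample, it's worth reviewing the generated SDL by hand before relying on it.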

Then set GetPosts as the response type of the getPosts query in place of JSON, leaving the @rest configuration as it was:

type Query {
  getPosts(username: String!): GetPosts
}

This tells the GraphQL engine which fields are available to query from the RSS feed. Instead of getting all the data from the Medium RSS feed at once, you can now dynamically select which fields to return. For example, if you only want to get the title and creator for the posts, the following query does just that:

query {
  getPosts(username: "@roy-derks") {
    item {
      title
      creator
    }
  }
}
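The effect of a selection set can be pictured as pruning the full response down to the requested fields. The following sketch mimics that behavior on made-up data. It only illustrates the idea and is not how a GraphQL engine executes queries, and select_fields is a hypothetical helper.

```python
def select_fields(data, selection: dict):
    """Keep only the keys named in `selection`; nested dicts are sub-selections."""
    if isinstance(data, list):
        return [select_fields(entry, selection) for entry in data]
    result = {}
    for field, sub_selection in selection.items():
        value = data.get(field)
        if sub_selection and value is not None:
            value = select_fields(value, sub_selection)
        result[field] = value
    return result


# Made-up channel data standing in for the feed response.
channel = {
    "title": "Demo feed",
    "item": [
        {"title": "A", "creator": "Example Author", "link": "https://example.com/a"},
        {"title": "B", "creator": "Example Author", "link": "https://example.com/b"},
    ],
}
# Mirrors the selection set `item { title creator }`; an empty dict marks a leaf field.
selection = {"item": {"title": {}, "creator": {}}}
selected = select_fields(channel, selection)
print(selected)
```

Fields that were not requested, such as link and the channel title in this sample, never reach the client.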

From the GraphQL schema or the GraphiQL IDE, you can see what other fields can be returned. With these steps, you've created a GraphQL API that connects to a Medium RSS feed.

Conclusion

This tutorial explained how to convert an RSS feed (or XML API) to a GraphQL API. We used the Medium RSS feed in our example, but you can use any other API that returns XML, even SOAP APIs.

We would love to hear which RSS feed (or SOAP API) you tried converting. Join our Discord to stay updated with our community.