Tutorial

Developing a gen AI application using IBM Granite Code

Build a meeting summarizer application using Python and Flask and the IBM Granite Code model as the code assistant

Code assistants are all the rage, for good reason. These tools boost productivity by automating repetitive tasks, offering helpful nudges, and saving developers from having to constantly copy-paste code from Stack Overflow (because let’s be honest, we’ve all been there). One of the most exciting developments in this space is the rise of IDE plugins that run AI models locally on your machine. Beyond skipping paid API calls, local models enhance data privacy for companies or individuals that don’t want their code shared with third-party services.

In his tutorial, Build a local AI co-pilot using IBM Granite Code, Ollama, and Continue, Gabe Goodhart showed how to quickly get up and running with an AI-powered development environment using open source building blocks. This tutorial builds on that setup to show how developers can level up their day-to-day work with open source AI.

Prerequisites

Following the steps in Gabe’s earlier article, I configured my environment using the granite3.1-dense:2b model for auto-completion in Continue (an open source AI code assistant IDE plugin), and granite3.1-dense:8b for everything else.

Sample Project: Meeting Summarizer

This tutorial walks you through some practical examples of using Granite Code while building a simple application.

I built a Proof of Concept (PoC) meeting summarizer to tackle the challenge of catching up on lengthy meeting replays. The concept was straightforward: upload meeting transcripts, generate a summary using an AI model, and store it for quick access to key points. For this project, I used Python and Flask to build the API, with Redis as the data store, and relied on the IBM Granite Code model as my code assistant throughout the process.

Steps

Step 1: Designing and building the API

Before starting, I thought about the API I wanted to create. I started up a session with Open WebUI (an open source browser-based AI model chat interface) to iterate on a RESTful API design for my application. I then asked the Granite 3.1 8b model to critique my proposal.

Here was my prompt:

You are a software engineer who is going to critique the design of a REST API I am proposing. Please succinctly critique the structure of the API, its inputs and outputs, and its compliance with RESTful best practices. The purpose of this REST API is for managing meeting transcripts. With this API, a user uploads a meeting transcript. Subsequently, a summary of the transcript is created and stored along with the original transcript and some metadata. Here is the API design that I propose:

Endpoint: /transcripts/
Methods:
  POST:
Purpose: Create a new transcript - Which saves the transcript, its summary and its metadata
Request content: A text file attachment
Response content: An ID of the new transcript record
  GET: 
Purpose: List all available transcripts
Request content: None
Response content: A JSON object with metadata about all transcripts

Endpoint: /transcripts/:transcript_id 
Methods: 
  GET: 
Purpose: Get information about a meeting transcript
Request content: None
Response content: A JSON object with metadata about the transcript
  DELETE: 
Purpose: Delete a meeting transcript
Request content: None
Response content: Just an affirmative response code
  PUT: 
Purpose: Provide an updated transcript in order to produce a new summary
Request content: A text file attachment
Response content: Just an affirmative response code

Granite had some critical feedback for me mixed with a dash of encouragement.

[Screenshot: Granite's critique of the proposed API design]

I accepted its feedback and asked it to generate a scaffolding implementation of the API using Python and Flask.

[Screenshot: my query to generate code, and the resulting code]

I took the generated code and copied it over to a new project in my IDE to begin development.
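For reference, the scaffold looked roughly like this (a minimal sketch of the shape of the generated code, not Granite's exact output; the in-memory dictionary stands in for the Redis store that comes later, and route and field names are illustrative):

```python
import uuid

from flask import Flask, request, jsonify

app = Flask(__name__)

# In-memory stand-in for the Redis data store added later in the project
transcripts = {}

@app.route("/transcripts/", methods=["POST"])
def create_transcript():
    # Accept an uploaded transcript file and store it under a generated ID
    uploaded = request.files.get("file")
    if uploaded is None:
        return jsonify({"error": "No transcript file provided"}), 400
    transcript_id = str(uuid.uuid4())
    transcripts[transcript_id] = {
        "filename": uploaded.filename,
        "transcript": uploaded.read().decode("utf-8"),
        "summary": None,  # filled in once summarization is wired up
    }
    return jsonify({"id": transcript_id}), 201

@app.route("/transcripts/", methods=["GET"])
def list_transcripts():
    # Return metadata (not full text) for all stored transcripts
    return jsonify({tid: {"filename": t["filename"]} for tid, t in transcripts.items()})

@app.route("/transcripts/<transcript_id>", methods=["GET"])
def get_transcript(transcript_id):
    record = transcripts.get(transcript_id)
    if record is None:
        return jsonify({"error": "Not found"}), 404
    return jsonify(record)

@app.route("/transcripts/<transcript_id>", methods=["DELETE"])
def delete_transcript(transcript_id):
    transcripts.pop(transcript_id, None)
    return "", 204

if __name__ == "__main__":
    app.run(debug=True)
```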

Step 2: Fill in some functionality

Now that I have a basic scaffold, it's time to fill in the meat of the functionality. The first obvious task is the piece that actually calls out to an inference endpoint to run the summarization. Because I'm already running Ollama on my laptop (from completing the prerequisite tutorial), I'll hit the same Ollama endpoint to generate the summary.

In my IDE, I'll use the Continue plugin so that everything I'm doing is in one place. I asked my local granite3.1-dense:8b model to generate a method that would summarize the contents of a file using the Ollama API and save the summary in a new file. Here's what the interaction looked like:

[Screenshot: the Continue plugin in my IDE, where I chat with the Granite model to generate code]

As shown, I instructed it to create a summarize() method that takes in a file path and returns a summary of the contents of the file after making a call out to an Ollama API. Since the Ollama API is fairly simple, I pasted in an example request and response so it knew how to structure the API call.
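The method I ended up with was roughly along these lines (a minimal sketch assuming a default local Ollama install; the model name, prompt wording, and file naming are illustrative rather than the exact code Granite produced):

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "granite3.1-dense:8b"

def summarize(file_path: str) -> str:
    """Read a transcript from file_path, summarize it via the Ollama API, and save the summary."""
    with open(file_path, "r", encoding="utf-8") as f:
        transcript = f.read()

    payload = {
        "model": MODEL,
        "prompt": f"Summarize the key points of this meeting transcript:\n\n{transcript}",
        "stream": False,  # return the full completion in a single JSON response
    }
    response = requests.post(OLLAMA_URL, json=payload, timeout=300)
    response.raise_for_status()
    summary = response.json()["response"]

    # Save the summary in a new file next to the original transcript
    with open(file_path + ".summary.txt", "w", encoding="utf-8") as f:
        f.write(summary)
    return summary
```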

Step 3: Generate test cases

Next, I wanted to make sure the summarization worked before getting too far, so I turned to my code assistant to generate a pytest case for uploading and summarizing a new transcript, exercising the create_transcript() function. Here's how the test was generated:

[Screenshot: Granite generating the pytest test case]

The initial test case it developed was fine enough, but it chose to use a FastAPI test client to talk to my Flask server. I asked it to explain that choice, but ultimately asked it to revise the test case to use a Flask test client instead, and it happily obliged.
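The revised test came out in roughly this shape (a sketch that assumes the Flask app object from the scaffold is importable from app.py; the sample data and file names are illustrative):

```python
import io

import pytest

from app import app  # the Flask application from the scaffolding step


@pytest.fixture
def client():
    app.config["TESTING"] = True
    with app.test_client() as test_client:
        yield test_client


def test_create_transcript(client):
    # Upload a small in-memory "transcript" file to the POST /transcripts/ endpoint
    data = {"file": (io.BytesIO(b"Alice: Hi team. Bob: Let's review the roadmap."), "meeting.txt")}
    response = client.post("/transcripts/", data=data, content_type="multipart/form-data")

    assert response.status_code == 201
    assert "id" in response.get_json()
```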

Step 4: Iterating and refining my code

With the basics in place, I continued building out the app by blending my own hand-written code with AI-generated snippets. Beyond code completion and troubleshooting (more on both below), the code assistant was particularly useful for handling boilerplate tasks like generating a simple HTML/JavaScript UI to interact with the APIs and setting up data persistence in Redis.
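The Redis persistence layer it helped me put together looked roughly like this (a sketch assuming a local Redis instance on the default port; key names and function names are illustrative, not the exact generated code):

```python
import json
import uuid

import redis

# Assumes a local Redis instance on the default port
r = redis.Redis(host="localhost", port=6379, decode_responses=True)


def save_transcript(transcript: str, summary: str, metadata: dict) -> str:
    """Persist a transcript record as a Redis hash and return its ID."""
    transcript_id = str(uuid.uuid4())
    r.hset(
        f"transcript:{transcript_id}",
        mapping={
            "transcript": transcript,
            "summary": summary,
            # Hash field values must be flat strings, so nested metadata is serialized to JSON
            "metadata": json.dumps(metadata),
        },
    )
    return transcript_id


def get_transcript(transcript_id: str) -> dict:
    record = r.hgetall(f"transcript:{transcript_id}")
    if record:
        record["metadata"] = json.loads(record["metadata"])
    return record
```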

Code completion

One of the more surprisingly delightful features of an IDE-embedded code assistant is its ability to auto-complete both code and documentation. Throughout the process, as I typed up code or documentation, my code assistant suggested completions that left me wondering whether it had access to my brain (which, thankfully, it does not). Case in point: as I was writing the content of this tutorial in Markdown in my IDE:

[Screenshot: Granite auto-completing the Markdown content of this tutorial in my IDE]

Stop messing with me, Granite! But in all seriousness, from a code perspective, it was quite helpful. For example, when I was trying to add logging to a Python function, it would auto-suggest appropriate messages:

[Screenshot: Granite auto-suggesting a logging message in a Python function]
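To give a flavor of the kind of completion it offered (an illustrative example, not the exact code from the screenshot; the function and message here are made up):

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def store_summary(transcript_id: str, summary: str) -> None:
    # After I typed "logger.", Granite proposed the rest of this logging call
    logger.info("Storing summary for transcript %s (%d characters)", transcript_id, len(summary))
    # ... persistence logic continues here ...
```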

Troubleshooting

My local code assistant was also there to help me debug errors. When my test cases inevitably failed, I could ask it why they failed and how to fix them. Here's an example where I fed it an error stack trace, and it identified that I was inserting wrongly formatted data into Redis and modified the code:

[Screenshot: Granite analyzing the error stack trace]

The culprit code:

[Screenshot: the code that caused the failure]

The amended code, generated by Granite:

[Screenshot: the amended code generated by Granite]
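Your failure will differ from mine, but to illustrate that class of error (an illustrative reconstruction, not the exact code from the screenshots; it assumes a local Redis instance, and the key and data are made up): redis-py raises a DataError if you pass it a nested Python object as a hash field value, and the fix is to serialize the value before storing it.

```python
import json

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

metadata = {"title": "Roadmap sync", "attendees": ["Alice", "Bob"]}

# Before: a nested dict as a field value raises redis.exceptions.DataError,
# because hash field values must be strings, numbers, or bytes
# r.hset("transcript:123", mapping={"metadata": metadata})

# After: serialize the nested structure to JSON before storing it
r.hset("transcript:123", mapping={"metadata": json.dumps(metadata)})
```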

Step 5: Documenting my code

Lastly, I asked the Granite model to generate a README for the application, and it gave me a solid draft to build on. After some minor tweaking, I had a functional and (gasp!) well-documented project. The screenshot shows how Granite generated the README based upon references to the code.

[Screenshot: Granite generating the README from references to the code]

Below, you can also see that the Granite model had indeed auto-formatted the content into Markdown as part of the process.

[Screenshot: the generated README rendered as Markdown]

Key takeaways

Here are the key takeaways from my experience using Granite Code as a local AI assistant:

  1. Granite Code can be run anywhere, but we’re keeping it local: This setup maximizes data privacy, eliminates dependency on third-party APIs, and allows me to work offline.

  2. Fully open source, inside and out: One of the standout features of Granite Code is that it’s fully open source, not just the weights but also the data it's trained on. For developers who value openness and wish to avoid proprietary black-box models, Granite is a rock solid choice. (See what I did there?)

  3. Productivity boost: The productivity gains were real. Whether it was generating boilerplate code, offering intelligent feedback during development, or handling more tedious tasks like debugging and documentation, Granite Code was like having an extra set of (really smart) hands. It auto-completed code, suggested logging messages, generated tests, and even drafted documentation—freeing up mindshare to focus on the bigger picture. Best of all, it cost me zero dollars to use.

So, if you’re a developer curious about local AI assistants, grab your IDE, follow along, and see how much time you can save with a little AI-powered magic on your side. Just be prepared for your assistant to occasionally suggest something eerily on-point, leaving you wondering if it's reading your mind. But don’t worry, it’s just really good at code.

Next steps

Check out more articles and tutorials on Granite models, including Granite Code, on IBM Developer.

View the completed sample Meeting Summarizer project on GitHub.

Try watsonx for free

The Granite models are all available in watsonx.ai.

Build an AI strategy for your business on one collaborative AI and data platform called IBM watsonx, which brings together new generative AI capabilities, powered by foundation models, and traditional machine learning into a powerful platform spanning the AI lifecycle. With watsonx.ai, you can train, validate, tune, and deploy models with ease and build AI applications in a fraction of the time with a fraction of the data. These models are accessible to all, with many no-code and low-code options available for beginners.

Try watsonx.ai, the next-generation studio for AI builders.

Further Reading

Learn more about Granite Code: