For college students, landing a summer internship or a first full-time position is a big deal. But it’s not just about getting any job; it’s about getting the job. With countless qualified students vying for limited openings, how does one stand out? How are candidates getting those highly coveted positions?

Inspired by these questions, Tyler Spagnolo, Shane Hepner, and I created Aspire. Aspire is a tool to help college students obtain their desired internships and entry-level jobs. As Penn State students in search of our own internships, we realized the value in discovering the pathways that led successful students to the jobs we sought, and we wanted to use these insights as a way to define pathways for others.

We entered Aspire into the Nittany AI Challenge, sponsored by the Penn State EdTech Network. The competition challenges teams to develop artificial intelligence-based solutions that improve the student experience at Penn State, solve real-world problems that the university is facing, or generate innovative start-up ideas. In March 2018, we presented a prototype of Aspire for the second round of judging and received funding for the creation of a minimum viable product (MVP), which we are currently working on and will present in September 2018.

Here’s a more in-depth look at Aspire and how we used IBM Watson to develop it.

On the front end, Aspire lets a user search for a specific company and position. It returns the pathways that others took to get that job as well as summary statistics, such as top majors and skills. Conversely, students can explore the database of pathways with filters, such as year or major, to view possible career paths that might be similar to theirs.
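The search and summary behavior described above can be sketched in a few lines of Python. This is an illustration only, not Aspire's actual code: the `Pathway` record, its fields, the helper names `find_pathways` and `summarize`, and the sample data are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Pathway:
    """Hypothetical record of one student's route to a job."""
    company: str
    position: str
    major: str
    year: int
    skills: list = field(default_factory=list)


def find_pathways(pathways, company, position):
    """Return the pathways that ended at the given company and position."""
    return [
        p for p in pathways
        if p.company.lower() == company.lower()
        and p.position.lower() == position.lower()
    ]


def summarize(pathways, top_n=3):
    """Count the most common majors and skills across a set of pathways."""
    majors = Counter(p.major for p in pathways)
    skills = Counter(skill for p in pathways for skill in p.skills)
    return {
        "top_majors": majors.most_common(top_n),
        "top_skills": skills.most_common(top_n),
    }


# Made-up sample data, for illustration only.
SAMPLE = [
    Pathway("Acme", "Software Intern", "Computer Science", 2017, ["Python", "SQL"]),
    Pathway("Acme", "Software Intern", "Computer Science", 2018, ["Python", "Java"]),
    Pathway("Acme", "Software Intern", "Data Science", 2018, ["Python"]),
    Pathway("Globex", "Analyst", "Finance", 2018, ["Excel"]),
]

matches = find_pathways(SAMPLE, "acme", "software intern")
summary = summarize(matches)
print(summary["top_majors"])
print(summary["top_skills"])
```

Filtering by year or major would follow the same pattern as `find_pathways`, with a different condition in the list comprehension.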


To develop Aspire, we needed information on the experiences that helped other Penn State students obtain the jobs we were after, including their involvement in clubs, previous internships, and applicable skills they possessed. We realized that most of this information already appears on a student’s résumé. However, a résumé in PDF format is unstructured data that cannot be queried directly.

This is where Watson comes in. To extract this valuable data from a person’s résumé, we created a machine learning model with Watson Knowledge Studio.

First, we built a Type System with the entity types Education, Organization, Position, DateRange, and Skills. We also included the relations Held_At, which relates a Position to its corresponding Organization, and Held_For, which relates a Position to the appropriate DateRange.
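To make the Type System concrete, here is a minimal Python sketch of the same entity types and relations. It does not reflect how Watson Knowledge Studio stores a type system internally. The `Mention` and `Relation` classes and the validation logic are assumptions made for illustration; only the entity and relation names come from the description above.

```python
from dataclasses import dataclass

# Entity types named in the Type System above.
ENTITY_TYPES = {"Education", "Organization", "Position", "DateRange", "Skills"}

# Each relation maps to (source entity type, target entity type).
RELATION_TYPES = {
    "Held_At": ("Position", "Organization"),
    "Held_For": ("Position", "DateRange"),
}


@dataclass(frozen=True)
class Mention:
    """A span of résumé text labeled with one entity type (hypothetical class)."""
    text: str
    entity_type: str

    def __post_init__(self):
        if self.entity_type not in ENTITY_TYPES:
            raise ValueError(f"Unknown entity type: {self.entity_type}")


@dataclass(frozen=True)
class Relation:
    """A typed link between two mentions (hypothetical class)."""
    relation_type: str
    source: Mention
    target: Mention

    def __post_init__(self):
        expected = RELATION_TYPES.get(self.relation_type)
        if expected is None:
            raise ValueError(f"Unknown relation type: {self.relation_type}")
        actual = (self.source.entity_type, self.target.entity_type)
        if actual != expected:
            raise ValueError(
                f"{self.relation_type} expects {expected}, got {actual}"
            )


# Made-up example annotations.
position = Mention("Software Engineering Intern", "Position")
organization = Mention("Example Corp", "Organization")
held_at = Relation("Held_At", position, organization)
print(held_at.relation_type, "->", held_at.target.text)
```

Checking argument types in `__post_init__` mirrors the constraint that Held_At always links a Position to an Organization, so a reversed or mistyped annotation fails immediately.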

After running a Python script to convert PDF résumés to text files, our team annotated a sample corpus of text résumés, marking instances of the entity types and relations between them. For our initial prototype, we used a small set of 25 résumés to train the machine learning model. Despite using very few training documents, we achieved F1 scores of 0.75 and 0.58 for entities and relations, respectively.
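For readers unfamiliar with the metric, an F1 score is the harmonic mean of precision and recall. The sketch below shows the calculation on made-up annotations. It assumes a prediction counts as correct only when both the text and the entity type match exactly, which may differ from how Watson Knowledge Studio scores a model, and the function name `f1_score` and the sample data are hypothetical.

```python
def f1_score(gold, predicted):
    """Compute F1 from two collections of (text, entity_type) annotations."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of predictions that are correct
    recall = true_positives / len(gold)          # share of gold annotations that were found
    return 2 * precision * recall / (precision + recall)


# Made-up human annotations and model output for one résumé.
gold = {
    ("Penn State", "Education"),
    ("Example Corp", "Organization"),
    ("Intern", "Position"),
    ("Python", "Skills"),
}
predicted = {
    ("Penn State", "Education"),
    ("Example Corp", "Organization"),
    ("Java", "Skills"),
}

score = f1_score(gold, predicted)
print(round(score, 3))  # 0.571
```

Here two of the three predictions are correct (precision 2/3) and two of the four gold annotations are found (recall 1/2), which gives an F1 of 4/7.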

Next, we deployed our machine learning model in Watson Natural Language Understanding, which lets us extract valuable data from a large number of résumés as we develop our prototype into a minimum viable product. Users of our application can then query the structured data produced by Watson.
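As a rough sketch of how extracted relations might become queryable records, the Python below groups Held_At and Held_For relations by position. It is an illustration only, not Aspire's actual code. The sample dictionary is hand-written, and its field names (`relations`, `type`, `arguments`, `text`, `entities`) are an assumption about the general shape of a Natural Language Understanding relations response; check the service documentation before relying on them. The helper `build_experiences` is hypothetical.

```python
from collections import defaultdict

# Hand-written sample; the field names are an assumed response shape.
SAMPLE_RESPONSE = {
    "relations": [
        {
            "type": "Held_At",
            "arguments": [
                {"text": "Software Engineering Intern", "entities": [{"type": "Position"}]},
                {"text": "Example Corp", "entities": [{"type": "Organization"}]},
            ],
        },
        {
            "type": "Held_For",
            "arguments": [
                {"text": "Software Engineering Intern", "entities": [{"type": "Position"}]},
                {"text": "May 2017 - August 2017", "entities": [{"type": "DateRange"}]},
            ],
        },
    ]
}

# Which record field each relation type fills in.
FIELD_FOR_RELATION = {"Held_At": "organization", "Held_For": "date_range"}


def build_experiences(response):
    """Merge relations that share a Position into one record per position."""
    experiences = defaultdict(dict)
    for relation in response.get("relations", []):
        field_name = FIELD_FOR_RELATION.get(relation.get("type"))
        arguments = relation.get("arguments", [])
        if field_name is None or len(arguments) != 2:
            continue  # skip relation types or shapes this sketch does not handle
        position, other = arguments
        experiences[position["text"]][field_name] = other["text"]
    return [{"position": title, **details} for title, details in experiences.items()]


records = build_experiences(SAMPLE_RESPONSE)
print(records)
```

Keying on the position text is a simplification: two different jobs with the same title on one résumé would be merged, so a fuller version would need to distinguish mentions by their location in the text.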

We recognize that Aspire does not necessarily provide a holistic approach to getting a job – there are a lot of other factors involved, such as networking and soft skills – but we hope to create a tool that assists students in making data-informed decisions about their college career. Thousands of students go through Penn State every year, and our goal is to harness the data in the pathways they’ve taken to help other students achieve their goals.

Watson Knowledge Studio and Natural Language Understanding have been instrumental in the development of Aspire. Our team, consisting of first- and second-year undergraduate students, had little experience with artificial intelligence and machine learning prior to the Nittany AI Challenge, but Watson Knowledge Studio made the process simple. The ease of annotating documents, training the model, and deploying it to Watson Natural Language Understanding has allowed us to bring our idea to life.
