by David Nugent Updated August 1, 2019 - Published August 2, 2019
The global market for artificial intelligence products is projected to grow roughly tenfold by 2025 to almost $120 billion, according to market research firm Tractica. Many companies are attempting to capture that market, including IBM with its Watson™ suite of developer tools. I spoke to my colleague Upkar Lidder about how to adapt developer-relations strategy to current and future generations of developer-facing AI products.
Upkar Lidder is a full-stack developer and data wrangler with a decade of development experience in a variety of roles. He speaks at various conferences and participates in local tech groups and meetups. Upkar went to graduate school in Canada and currently resides in the United States.
There’s a lot of learning, trial, and experimentation with AI and machine learning. The goals for AI projects may be vague: “reduce the number of customer complaints,” for example.
By comparison, classical software development user requirements may look like “give me a dialog box with a button on it”: specific and well-defined. Of course, a lot of user research and design goes into the software spec to get to that point, and as a developer, you work to that spec. By contrast, as a data scientist, you may only be pointed to an unstructured data set, and then the real fun starts: you start exploring it! I love the data-wrangling aspect of AI development. You can get into a Jupyter Notebook and start exploring specific outliers, shapes of data, and types of data, and see how the data looks through different visual representations.
Then you make decisions. What do I do with the missing values? How is that going to affect my projected outcome? Even in these first two stages, there are a lot of unknowns. In software development, many programmers walk a well-worn path that their colleagues and predecessors have paved over decades. In data science, you have an exploratory period where you try to find a path to take. Once you’re done cleaning and transforming, you choose an appropriate modeling technique and proceed with your analysis. A lot of that exploration is brute force. XKCD has my favorite cartoon on data science.
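The kind of first-pass exploration described here (data shape, value types, missing values, outliers) can be sketched in a few notebook cells. This is a minimal illustration that uses only the Python standard library so it runs anywhere; in a real notebook you would more likely reach for a dataframe library. The records, the column names `age` and `plan`, and the choice to fill gaps with the median are all assumptions made up for the example.

```python
from collections import Counter
from statistics import median, quantiles

# Hypothetical records invented for this sketch; None marks a missing value.
records = [
    {"age": 34, "plan": "basic"},
    {"age": 29, "plan": "pro"},
    {"age": None, "plan": "basic"},
    {"age": 41, "plan": None},
    {"age": 38, "plan": "pro"},
    {"age": 31, "plan": "basic"},
    {"age": 120, "plan": "basic"},  # suspiciously large value
    {"age": 36, "plan": "pro"},
]


def summarize(rows):
    """Report row count plus missing-value counts and value types per column."""
    columns = sorted({key for row in rows for key in row})
    summary = {"rows": len(rows), "columns": {}}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        summary["columns"][col] = {
            "missing": len(values) - len(present),
            "types": dict(Counter(type(v).__name__ for v in present)),
        }
    return summary


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


def fill_missing(values, fill):
    """One possible answer to 'what do I do with the missing values?'."""
    return [fill if v is None else v for v in values]


summary = summarize(records)
ages = [r["age"] for r in records if r["age"] is not None]
outliers = iqr_outliers(ages)
filled_ages = fill_missing([r["age"] for r in records], median(ages))

print(summary)
print("outliers:", outliers)
print("filled ages:", filled_ages)
```

Filling with the median is only one option. Dropping the rows or flagging them for review are equally valid choices, and each one changes what the later model sees.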
Like I said, some of data science is just brute force. Even with helper libraries and frameworks, you have to sketch out an educated starting point yourself and let the library do much of the rest on its own. Afterward, you’ll analyze how the results compare with your other benchmark algorithms and repeat the procedure.
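As a rough illustration of that loop (pick an educated starting range, let the machine try everything, then compare against a benchmark), here is a small brute-force search. The data set, the single-threshold "model," and the candidate range are all hypothetical and chosen only to keep the sketch short. Real libraries automate the same idea at a much larger scale.

```python
# Hypothetical labeled data: (feature value, label).
data = [
    (1.0, 0), (2.0, 0), (2.5, 0), (3.5, 1),
    (4.0, 1), (5.0, 1), (3.0, 0), (2.8, 1),
]


def accuracy(threshold, points):
    """Fraction of points where the rule 'x > threshold' matches the label."""
    hits = sum((x > threshold) == bool(y) for x, y in points)
    return hits / len(points)


def baseline_accuracy(points):
    """Benchmark to beat: always predict the most common label."""
    positives = sum(y for _, y in points)
    return max(positives, len(points) - positives) / len(points)


def brute_force_threshold(points, candidates):
    """Try every candidate threshold and keep the best-scoring one."""
    return max(candidates, key=lambda t: accuracy(t, points))


# The "educated starting point": thresholds from 0.0 to 5.5 in steps of 0.5.
candidates = [i / 2 for i in range(12)]
best = brute_force_threshold(data, candidates)

print("best threshold:", best)
print("accuracy:", accuracy(best, data))
print("baseline:", baseline_accuracy(data))
```

If the best candidate does not beat the baseline, that is the signal to go back, adjust the starting range or the features, and repeat.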
It’s a great question: how well do you want to be able to explain your thought process and decisions to business users? Some models, like decision trees, are easy to explain, whereas models built with neural networks or ensembles can get more complicated and harder to explain. Compare this to traditional software development: except for some tricky bugs, problems of explanation like that just don’t happen.
Now with the more advanced systems like AutoAI, you give the data to the system, and it will take care of more of the heavy lifting on your behalf. For example, I’m working with some data scientists on a project analyzing NPS scores for some internal departments. We’re building a system where, as a support call is going on, the system can identify red flags in the call that show it “going downhill” and alert a manager while the call is still in progress. We have access to data points such as call length, customer tier, and sentiment analysis, so we can use this data to automatically flag issues before they explode. Interestingly, we tried running AutoAI on the data, and the data scientists didn’t like it! The main issue is that it can be a bit of a “black box,” and the scientists wanted to be able to explain how they reached their conclusions.
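To make the explainability point concrete, here is a hedged sketch of a flagging rule in which every alert carries its reasons. It is not the system described above. The class name `CallSnapshot`, the tier labels, the sentiment scale, and both thresholds are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class CallSnapshot:
    """State of a support call at one moment (all fields are assumptions)."""
    minutes_elapsed: float
    customer_tier: str   # hypothetical labels: "standard" or "premium"
    sentiment: float     # hypothetical scale: -1.0 (negative) to 1.0 (positive)


# Hypothetical thresholds; a real project would tune these on its own data.
MAX_MINUTES = 20.0
MIN_SENTIMENT = -0.4


def red_flags(call):
    """Return human-readable reasons this call may be going downhill."""
    reasons = []
    if call.minutes_elapsed > MAX_MINUTES:
        reasons.append(
            f"call has run {call.minutes_elapsed:.0f} min "
            f"(limit {MAX_MINUTES:.0f})"
        )
    if call.sentiment < MIN_SENTIMENT:
        reasons.append(
            f"sentiment {call.sentiment:+.1f} is below {MIN_SENTIMENT:+.1f}"
        )
    if call.customer_tier == "premium" and call.sentiment < 0:
        reasons.append("premium customer with negative sentiment")
    return reasons


def should_alert_manager(call, min_reasons=2):
    """Alert only when several independent red flags agree."""
    return len(red_flags(call)) >= min_reasons


calm = CallSnapshot(minutes_elapsed=5, customer_tier="standard", sentiment=0.3)
tense = CallSnapshot(minutes_elapsed=25, customer_tier="premium", sentiment=-0.6)

for call in (calm, tense):
    print(should_alert_manager(call), red_flags(call))
```

A rule list like this is far less powerful than an automatically tuned model, but a manager can read exactly why an alert fired, which is the trade-off the data scientists were weighing.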
According to the annual data science survey, one of the biggest gaps in data science is skill sets. So, on the one hand, we need black-box systems like this, where you don’t have to have a Ph.D. in math to understand why the system works; it will do the feature engineering and hyperparameter optimization for you. On the other hand, the data scientists do not fully trust it.
I joined through the support group at IBM, so I’d get calls from clients around the world with issues and try to help them out. I was Level 2-3, so the problems would be escalated to me. So the customers were already angry by the time they talked to me! In a lot of ways, I feel that first role was similar to what I do now. I talk with developers and try to figure out how to help them, even though I approach that from an education perspective more than support. Then I was a Java developer, building products with Eclipse. From there I went to a client-facing technical role working on client projects, which was very different from product development. From there I became a functional lead, which is essentially a project management role. I had a team of developers that I’d work with to scope solutions and ensure they were delivered on time. After two years of that, I moved into DevRel.
Before working in developer relations, I enjoyed mentoring coding school and bootcamp students on the side, so when this developer-relations job came up I thought, “Wow, it would be great to do that as a job and get paid for it!”
With AI/ML, you have to do less talking and more doing. For other software development topics like serverless, you can have a longer lecture and then get into a demo. With AI/ML, there’s an emphasis on experimentation. You have to get your hands dirty or it won’t work. I love Jupyter Notebook because you can do something, see the causation, see the result, and only then think about why.
I feel like there’s more abstract theory, math, and intuition behind data science. You can always memorize a formula, but to be able to get an intuition about something, that is ideal. And that comes from experimentation. Through visualization and plotting, you can understand the math behind the different data science concepts. Contrast that with something more DevOps-oriented — it’s a different approach. So in data science and AI developer relations, you have to make sure the attendees are doing something and engaged. Otherwise you lose them very fast — because there’s math involved!
One of the things that’s worked for me is to put a lot of time into my workshops, explaining every step in great detail. In my slides, I’ll use arrows, annotated rectangles, and the like to ensure that the students are able to follow along easily and naturally. When I teach Jupyter Notebooks, I craft half-baked solutions, where I build out a solution that works to a certain point and then the next two cells are questions, such as “find the frequency of the data we just queried.” You can do a demo, where you do and they watch; then you can do a follow-along, where you both do at the same time; and finally, you can run an exercise, where they do the work first. The last two are most useful for data science concepts.
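A half-baked notebook of this kind might look like the sketch below: one cell is provided and working, and the next is the exercise. The data and names are hypothetical, and the exercise cell is shown already solved; students would receive it blank.

```python
from collections import Counter

# Cell 1 (provided to students): a hypothetical query result.
queried = ["python", "java", "python", "go", "python", "java"]


# Cell 2 (exercise): "find the frequency of the data we just queried."
# Students get only the prompt; one possible solution is shown here.
def frequency(values):
    """Count occurrences of each value, most common first."""
    return Counter(values).most_common()


freq = frequency(queried)
print(freq)
```

Because the first cell already runs, attendees start from a working state and spend their time on the one step the exercise is meant to teach.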
The top five things that work for me in workshops:
Fortunately, the DevRel world is filled with people I look up to! Some of the names that come to mind are: