An AI, a software engineer, and a security researcher walk into a bar

This article is a blog version of my talk “An AI, a software engineer, and a security researcher walk into a bar…” presented publicly at the 2024 Pride Summit.

An AI, a software engineer, and a security researcher walk into a bar… It starts as a joke and ends with concrete recommendations for building effective teams that can shape the future of AI products.

Collaboration is the tool

You can hardly have missed the ongoing industrial revolution caused by generative AI (genAI) these past 18 months. GenAI is now everywhere, even on this blog's banner! I have written about what AI is and what it can and cannot do. In my view, AI is a large toolbox of algorithms. These algorithms can perform an ever-growing list of tasks, more and more efficiently. It's all about having the right tool for the right task.

As an AI researcher, I have worked on a wide range of projects: satellite coordination, sports nutrition simulation for high school students, bird song recognition using Bayesian networks, data mining for student success. I could go on… In each of those cases, collaboration with the subject matter expert (SME) was the cornerstone of the project's success.

With the advent of genAI, AI practitioners and SMEs have to come together.

Diverging views

However, AI experts and SMEs often have diverging views. In my current work applying AI research to cybersecurity, I have observed that these views diverge most sharply between AI experts and security researchers. Let's look at a few of them:

Expertise - Both AI and cybersecurity require highly specialized expertise, but they are also heavily siloed from each other. There are very few overlapping concepts, methods, or tools between the two worlds.

Data - Data scientists often view data as inert and innocuous, whereas security folks see it as potentially dangerous, requiring special infrastructure to isolate and manipulate safely.

Software Engineering Practices - The ubiquity of Python and notebooks has encouraged weak software engineering practices among AI practitioners. This stands in opposition to security engineers, who tend to be paranoid to a fault about their code and practices.

Risk Acceptance - Machine learning algorithms are conceived to be Probably Approximately Correct, that is, they likely (probably) produce results close to (approximately) the correct answer (see the short formal statement after this list). This is at odds with the cybersecurity culture, which considers uncertainty highly undesirable.

Infrastructure Maturity - The infrastructure for AI development and the MLOps tools for maintenance and observability are only just emerging, and there is still a lot of ground to cover. On the other hand, security has a mature tooling suite.

Governance Maturity - The governance of AI systems is in its infancy. Meanwhile, cybersecurity has a decades-long culture of information sharing, reporting, auditing, and frameworks and standards such as those from NIST or the CVE system.
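
For readers who want a slightly more formal flavor of "Probably Approximately Correct", here is the standard textbook notion in a nutshell, where ε is the usual accuracy parameter and δ the confidence parameter:

```latex
% PAC learning, informally: with high probability (at least 1 - \delta),
% the learned hypothesis h is approximately correct (its error is at most \epsilon).
\Pr\left[\, \mathrm{error}(h) \le \epsilon \,\right] \;\ge\; 1 - \delta
```

In other words, the output is guaranteed only probabilistically, and only up to a tolerance, which is precisely what clashes with a security culture built on minimizing uncertainty.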

With those diverging views and objectives, can security and AI folks ever see eye to eye? In my experience, they can, but it takes work, so let’s dive in!

Step 1: Build familiarity, not fluency

A series of robust studies by IBM Research has shown that one of the main reasons AI projects fail is the gap in mental models between data scientists and their stakeholders. A mental model of a project is everything you believe and know about that project. So how do we align mental models, especially between experts with views as divergent as those of AI and security experts?

Become familiar

Becoming familiar with the concepts, methods, and tools, in both directions, is the crucial step in building trust across disciplines. It reduces the knowledge gap, helps build a common mental model of the project, and thus reduces the risk of bad surprises.

  • This can be done by creating or providing introductory materials. My Introduction to AI blog is actually an adaptation of a short training I made for my colleagues.
  • Reading groups are also useful, and give an opportunity for people to share papers, tutorials, or videos about AI concepts/tools that they want to learn more about.
  • Hold regular demo sessions where each expert can dive into a current project in an educational way.
  • Direct mentoring is also very effective, even among senior roles. I have mentored multiple engineers to use simple classifiers to automate some tedious tasks, and they took this knowledge with them into their future roles.

Don’t become fluent

However, it is not about fluency or becoming a two-hat expert. It is a cliché to say that the technological landscape becomes more complex every day, but it is nonetheless true. It is an unrealistic expectation, and quite frankly bad leadership, to try to hire a unicorn or to expect people to become one. AI and cybersecurity are vast, complex fields. Let people be experts in the field in which they have already invested their time and effort, let them collaborate, and the whole will be greater than the sum of its parts.

Map your world

Familiarity is also built by mapping and aligning terminology. On a recent project on SQL injection detection using ML, I irked my security colleagues because I kept saying that this was a “threat detection model”. It turns out that a Threat Model is a very specific thing in Threat Intelligence. Name collision IRL! “Performance” is also a word that ignited much debate… There needs to be a conscious effort to define and align the business vernacular.

Explain the “why”

Finally, educate everyone on the "why" of the AI development lifecycle. The AI development lifecycle is different from the traditional software lifecycle. First, the feasibility of a project is often unknown, and the feasibility assessment may require complex discovery work. Second, the overall result and success of the project are also unknown. In fact, there is no guarantee of success, even partial success.

It doesn’t work but it’s not broken - A data scientist

Third, the benchmarking of the model will take a significant amount of time and resources, especially if it requires labeling. And finally, the lack of maturity of the AI infrastructure means that the coupling of data and software is still a non-standard problem. Engineering teams will spend a significant amount of time investigating new solutions.

All of those (and some more) need to be explained and agreed upon with stakeholders, SMEs, management and leadership alike.

Step 2: Pool People Together

The familiarity must not only be technical; it must be personal as well. When people are in different teams, or worse, in different organizations altogether, their priorities and roadmaps are going to be different or out of sync. Maybe even competitive with each other.

One thing that I have learned while working at Auth0 by Okta is that you need to pool people together, literally and figuratively. The company invests immensely in creating strong, connected teams by sending them to a large off-site every year, in a nice place with a pool. In the talk version of this blog, this joke makes more sense 😅. These off-sites are devoted to connecting people and giving them the space and time to plan, strategize, network, find solutions, and imagine the next thing. In my experience, they have been incredibly valuable, even for resolving interpersonal communication issues. Many projects have been started, fixed, or improved in that pool. 🏖️

Step 3: Build Together

Of course, if your budget does not include sending people to a resort for a week, there are cheaper options to build an esprit de corps. The traditional team-building activities - in person or online - abound: cooking classes, painting classes, food tours, escape rooms, wine tastings, solving an ancient Egyptian murder mystery… Your budget is the limit. 😄

Companies should organize hackathons regularly and encourage their experts to pool together and build together, to further this familiarity and common mental model. In my experience, these hackathons have often led to new products or improvements to existing ones, as well as patents at the frontier of AI and cybersecurity. Even simpler than that, within a pool of expertise, run mini-hackathons: over a couple of days, encourage people to work together on discovery work on a topic of their choice, then have them present it and discuss it within the group. Going back to the AI lifecycle, there are many areas that can use a pre-assessment for feasibility, cost, architecture, etc. In my experience, these mini-hackathons generate value and innovation, and further foster the shared mental model within a cross-disciplinary group.

The AI development cycle with SME-in-the-loop

So far, we have seen how and why to pool people together and align their priorities and objectives. This will build trust and respect. But then what? How do you actually build something together? In my experience, you improve your project's chances of success by implementing an AI development cycle that includes an SME in the loop.

[Figure: The AI development cycle with SME-in-the-loop]

Let's imagine that we want to build an e-mail spam filter. Our AI expert builds a model based on common knowledge about spam (e.g., presence of typos, unknown senders, etc.). The output of the algorithm is then shown to the SME. In this case, I am not talking about performance metrics and accuracy. I'm talking about actual output, like an e-mail and its label. The algorithm should provide as much context as possible about the decision to label the e-mail as spam/not spam. The SME (now familiar with AI concepts and methods) can start building a mental model of what the algorithm is doing. In turn, this should lead to more trust in the AI model. The SME then needs to provide detailed feedback to the AI expert: what was correct or incorrect, and why. Finally, the AI expert goes back to the drawing board and improves the model.
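
To make this loop concrete, here is a minimal sketch of what one iteration could look like, assuming a scikit-learn bag-of-words classifier; the e-mails, labels, and feedback prompt are hypothetical placeholders. The point is that the SME reviews actual outputs with context, not just an aggregate accuracy number.

```python
# Minimal sketch of one SME-in-the-loop iteration for a spam filter.
# Assumes scikit-learn; the e-mails and labels are hypothetical placeholders.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

train_emails = [
    "Congratulations, you won a freee prize, click here",
    "Meeting notes from Tuesday's architecture review",
    "Verify your account now to avoid suspension",
    "Lunch at noon? The new place near the office",
]
train_labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

vectorizer = CountVectorizer()
model = LogisticRegression().fit(vectorizer.fit_transform(train_emails), train_labels)

# New, unlabeled e-mails that the SME will review one by one.
new_emails = ["You won!!! Verify here", "Agenda for the security sync"]
spam_probabilities = model.predict_proba(vectorizer.transform(new_emails))[:, 1]

# Show each decision with its context (the e-mail itself and the model's
# confidence), then record the SME's verdict to feed back to the AI expert.
sme_feedback = []
for email, p_spam in zip(new_emails, spam_probabilities):
    label = "SPAM" if p_spam >= 0.5 else "NOT SPAM"
    print(f"{label} ({p_spam:.2f}) | {email}")
    verdict = input("Correct? [y/n] and why: ")
    sme_feedback.append((email, label, verdict))
```

The next iteration starts from that feedback: the AI expert uses the SME's corrections to refine the features, the labels, or the model itself.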

Model refinement is definitely something that most data scientists should be familiar and comfortable with, so instead, I will dig into the harder stuff: the context of the decision and how to collect feedback from SMEs.

Context of the decision

One weak software engineering practice I find in data scientists is the tendency to abstract the AI algorithm away into a black box. You throw data into this magical black box, give it a good shake, and get a magical answer. That is simply not how it works. An AI is an algorithm which, given an input, follows a mathematical or statistical process and outputs something desired. Data scientists must be trained to use eXplainable AI (XAI) methods. XAI methods are mathematical, statistical, and qualitative methods that help the algorithm formulate an explanation of why the input led to this output. Not every model is explainable (cough LLMs cough) at the moment, so the explainability of a model should absolutely be part of the tripartite decision between AI experts, software engineers, and security researchers (or whoever your SMEs might be) when creating an AI product.
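
As an illustration, and continuing the hypothetical spam-filter sketch above, here is one very simple white-box way to provide that context for a linear model: read the per-token contribution to a single prediction directly from the coefficients. Dedicated XAI methods (SHAP, LIME, counterfactual explanations, and so on) generalize this idea to models where the answer cannot be read off so easily.

```python
# Toy per-prediction explanation for the linear spam filter sketched above:
# which tokens pushed this specific e-mail toward "spam" or "not spam"?
import numpy as np

def explain(email, model, vectorizer, top_k=3):
    counts = vectorizer.transform([email]).toarray()[0]
    names = vectorizer.get_feature_names_out()
    # Contribution of each token = its coefficient * its count in this e-mail.
    contributions = [(names[i], model.coef_[0][i] * counts[i])
                     for i in np.nonzero(counts)[0]]
    contributions.sort(key=lambda item: abs(item[1]), reverse=True)
    return contributions[:top_k]

# Positive contributions push toward "spam", negative ones toward "not spam".
print(explain("You won!!! Verify here", model, vectorizer))
```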

XAI is a maturing field of research, and there is no need to reinvent the wheel. Leverage the extensive Human-Computer Interaction (HCI) literature that exists on the topic.

Here are a few of my favorite papers on the topic:

Collect Feedback

In a recent project on SQL injection detection, I applied this AI development lifecycle and had my security researchers inspect URL requests predicted to be SQL injections. This was a tedious labor of love. At each iteration of the loop, they inspected nearly 500 URL requests. How did they do it without murdering me? 🔪 I used HCI best practices for collecting feedback from SMEs. Once again, there is no need to reinvent the wheel when extensive research has been done on how best to build feedback systems. In the case of our SQL injection detection system, the recommendation was not to use a spreadsheet, but rather to build a small app that allowed the use of shortcuts to label and annotate, and made it possible to investigate a suspicious string's broader context directly in the app rather than having to query it separately.
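
To give a flavor of what such a feedback tool can look like, here is a minimal, hypothetical command-line sketch: single-keystroke verdicts, a free-text note, and the suspicious string displayed together with some surrounding context so the SME never has to leave the tool. In a real version, the context lookup would query your logging or telemetry backend, and the collected labels would feed straight back into the model's training data.

```python
# Hypothetical minimal labeling loop for SME feedback on suspected SQL injections.
# Single-key verdicts plus a free-text note; context is shown next to the string.
import csv

def fetch_context(request):
    """Stand-in for a lookup into a logging/telemetry backend."""
    return {"source_ip": "203.0.113.7", "user_agent": "curl/8.0", "path": "/search"}

suspicious_requests = [
    "q=' OR 1=1 --",
    "q=shoes&page=2",
]

with open("sme_labels.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["request", "context", "verdict", "note"])
    for request in suspicious_requests:
        context = fetch_context(request)
        print(f"\nRequest: {request}\nContext: {context}")
        verdict = input("[i]njection / [b]enign / [u]nsure: ").strip().lower()
        note = input("Why? (optional): ")
        writer.writerow([request, context, verdict, note])
```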

Once again, here are a few of my favorite papers on the topic:

Step 4: Plan Together

To build together, you need to plan together. This is my final point, but not the least. Roadmaps, architecture diagrams, requests for comments: no matter what software development documentation framework you use, you should systematically have each area of expertise represented and resourced. Too many times, I have seen an ML project fail to go to production because the POC worked in a sandbox but could not run within the existing production software. To avoid wasting time, resources, money, and talent, plan together. Going back to my previous point about aligning priorities, this is infinitely easier to do when all of your experts are already pooled together in the same structure.

The Great AI Industrial Revolution holds many promises, but it will not happen without the Great Collaboration, so let's talk effectively with (not just 'to') each other!