Your New Junior Colleague
The list of services that now include some kind of AI is mind-boggling. In March, the same month GPT-4 was announced, I also heard about:
- Microsoft 365 Copilot
- Salesforce Einstein GPT
- Adobe Firefly
- Open Access to Google Bard
- ChatGPT Plugins
I mean, I've heard of dozens more; the above list is just a sampling of productivity services that have integrated AI. While I don't hold the position that AI is going to take everyone's jobs, it's hard to avoid the pesky reality that work is about to change — a lot.
Adobe, Microsoft, and Salesforce are feeling out how their products fit into a world where AI is just part of everyday life. As I watch this play out, I've found myself obsessed with what it should look like to have AI in our daily work.
Don't Give This Assistant Your Credit Card
Right now when we talk about GPT-4 or some other AI tech, we use the metaphor of the "assistant" - someone who performs tasks on your behalf. We treat AI like a secretary when we use an AI transcription service to take meeting notes or dictate memos. We treat it like an unpaid intern when we ask ChatGPT to read a long text and summarize the key points. This is a natural extrapolation of how we think of tech and automation in general — as machines that do stuff for us.
I think this metaphor sucks. First, it implicitly assumes that we transfer some amount of agency to AI when we use it; that we're putting AI "in control", even momentarily. With the new plugin feature I mentioned above, you can ask ChatGPT to do something, and it will call the appropriate services to actually do it. This has led some software engineers to wonder if OpenAI just wrote the last application. Don't give this assistant your credit card.¹
Your New Junior Colleague
The second reason I dislike thinking of AI as an assistant is that it overlooks a core difference between AI automation and plain-old software automation: humans can't build confident mental models for AI.
For most of the systems we interact with, we have some concept of what the system is doing. That concept may be generalized or outright wrong - it doesn't matter - you use the mental model as a toy version of the system in your head, planning your next move as you use the real tool. For instance, when you turn the steering wheel in a car, you expect the wheels to turn. This mental model is consistent with the operation of the car, even though the power steering does most of the actual work of turning the wheels. To keep the mental model consistent and give the driver a sense of the road conditions, most car manufacturers add artificial feedback that mimics the forces acting on the wheels.
As we go forward with AI, it's important that we deeply understand which parts of our tools need to be concrete and which can be fuzzy; which require a clear mental model and which do not. While the latter can be provided by AI, the former should never be. Note this is more about perception than actual reliability — I'm squeamish about self-driving cars, even though I know that self-driving systems will likely cause fewer accidents than human drivers. Users expect certain parts of their systems to be grounded and rules-based, and will lose trust fast when those rules are broken, even slightly.
As such, I'm a bigger fan of metaphors that treat AI like something that works inconsistently. At first I thought of droids from Star Wars - they're fickle, require maintenance, and, in the case of C-3PO, often uselessly verbose. But I landed on the idea of the "colleague" — someone who works with you, not for you. This colleague will make mistakes, just like a human, so you'll want to check their credentials before hiring them. And you'll need to check their work. And you'll think twice before increasing their responsibilities.
Leave the Basics to the Robots
As a toolmaker, what interests me most is how this metaphor can guide decisions about where to integrate AI. Specifically, I think it gives a good answer to the question "should I have AI help the user with this task?" When a clear mental model of the tool is core to carrying out a specific task, AI is inconsistent and indirect and will gum up the works. When a problem is ambiguous, AI might be able to help.