Guest: David Sykes, Former Head of Data Science, Octopus Energy
Transcript:
SAIF: On this episode, I speak with David Sykes. David grew and led the data science team at Octopus Energy over an eight-year period — from their days as a small Soho-based startup with 100 people to their current position as the UK's largest supplier of household energy.
Many of you need a stronger perspective on AI and its impact on sustainability teams. David and I talk about AI tools in general and which business functions are most likely to benefit from them. We discuss specific areas where sustainability teams will see rapid value, and some levers that might be harder to pull. We also throw in actionable advice for sustainability professionals.
I started the conversation by asking David for an overview of where AI brought value to Octopus Energy.
DAVID: When I think about AI, I think about two flavours. It's become a bit murky since ChatGPT arrived on the scene and generative AI went mainstream.
You've got traditional AI and then generative AI. Gen AI is what 99% of people think about — typing something into ChatGPT and getting an answer, creating an image on Gemini, generating a video, text-to-speech, that kind of thing. Traditional AI is where you typically have one very specific use case that you've trained an algorithm to solve.
Most of my career at Octopus — most of those eight years — was in the traditional AI paradigm. It was pre-LLMs being accessible to the wider market. We were using traditional machine learning and neural net algorithms to solve specific business problems: determining whether someone was likely to pay, classifying customer service messages for better routing, and a lot of time-series forecasting. An energy supplier's main job is to balance supply and demand and not go bust, so there was a huge amount of work there.
Towards the back end of my time, ChatGPT dropped. We were one of the very early companies in the UK to throw it straight into customer service. Greg [Jackson, Octopus's CEO] was very clear that he thought this was the future and was willing to try things — a lot of other companies were terrified of what might happen if you put a robot in the mix.
So we started expanding our AI work into customer service. It was never answering a customer directly while I was there, but the question was: how do you augment a customer service agent so they can answer far more questions, far more accurately, while maintaining all that beautiful human touch that Octopus was famous for?
SAIF: Thanks, David. Super insightful, and a really nice way to ground us. Moving to a big-picture question — orienting this towards sustainability functions — when it comes to AI use cases across business functions, what are you seeing? You touched on a few in the Octopus context: pricing, cost management, customer service. Are there certain functions within a traditional business — think of a consumer packaged goods company doing procurement, sales, IT, and so on — that you think have the most potential for change?
DAVID: I think what's very interesting right now is that Gen AI is absolutely everywhere. This week Sam Altman announced that OpenAI is at 800 million weekly active users — a terrifyingly large number. WhatsApp now has Meta AI built in with three billion users, including in many of the lowest-income regions in the world. Anecdotally, I know it's everywhere because my parents are using it, and when your civil servant friends start talking to you about a new technology, you know it's permeated every corner of every dusty institution.
In business, usage is broadly twofold. There's what I'd call sanctioned use — where a forward-thinking CEO or CTO has bought a tool, maybe an enterprise OpenAI licence or a specific SaaS product, and given everyone access. And then there's the unsanctioned underground economy of Gen AI — basically everybody using their personal accounts to help them do their jobs without their company knowing.
Right now, formal sanctioned use sits at around 40% of companies according to an MIT study, while unsanctioned use is up around 90% — meaning roughly 90% of companies have employees using AI in some form. Think: HMRC staff uploading R&D claims into ChatGPT.
SAIF: Exactly.
DAVID: Or someone gets so frustrated with the Copilot tool they've been given that they just switch to ChatGPT, because that's what they use personally. That's really interesting, because much of the reporting on success focuses on these sanctioned pilots and programmes — and a lot of it is quite dispiriting. One study suggested 95% of pilots went nowhere and aren't creating value, whereas it's quite clear this stuff is creating value. It's creating value the same way introducing Excel to your business creates value: you can't easily quantify it, but you know it's there because everyone is using it.
The third trend is that usage patterns are still extremely basic. Most people ask ChatGPT the way they'd ask Google — they just type something in as a search query. Whereas if you look at sectors that have gone deeper, coding is right at the front. The tooling is so much more sophisticated: multiple models running in the background, it can edit your files, it has all your work context, it can push to GitHub. So in terms of business impact, most functions are on a curve from "everyone's using it as a search interface" to "we have genuinely useful vertical-specific tools like Cursor or Windsurf."
In terms of which departments benefit most, there are three dimensions. First: how regulated or controlled is your function? If you're in legal working on major contracts, you're probably not letting AI do the whole job. Second: how digital or physical is your work? If your job produces code — a digital artefact — AI is incredibly useful. If your job involves visiting coffee farmers in Africa to source sustainable supply, AI can't help you much with that. Third: how relationship-based is your role? If your job is about talking to people and building long-term relationships, AI is not that helpful right now. I speak to a lot of diplomats, and I tell them their jobs will be among the last to go — it's pure human-to-human relationship work.
SAIF: I love that framework. The risk appetite filter, the digital vs. physical filter, the relationship filter. At the same time, I wonder whether there's also a workflow dimension.
The study I found really compelling was the Harvard and BCG study from a couple of years ago on the "cyborg vs. centaur" approaches. To recap for listeners: the study looked at consulting teams enabled with Gen AI versus a control group and found that, in general, teams using Gen AI performed better. But the AI-enabled teams used it in different ways.
The cyborg approach is where the human and AI go back and forth — you riff, you make adjustments, you iterate. In our business of carbon accounting, a good example is data ingestion: the AI flags anomalies and asks you to confirm or reject each one, and you go back and forth together. The centaur approach is where you split the work — this piece goes to AI, this piece stays with the human. Maybe AI matches an emissions factor to business data to run a calculation, while the human decides what new initiative to build to actually reduce emissions.
Does that split make sense to you? And would you say, looking at your three filters, that in roles like farm visits there's simply less room to apply either approach — not that one approach is better than the other?
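To make the cyborg loop concrete, here is a minimal Python sketch of the confirm-or-reject pattern described above. The readings, the toy statistical check standing in for the AI, and the threshold are all invented for illustration; a real carbon-accounting tool would use a far richer anomaly model.

```python
# Minimal human-in-the-loop sketch of the "cyborg" pattern: a toy anomaly check
# stands in for the AI, and the human confirms or rejects each flagged row
# before it enters the dataset. All values are invented for illustration.
from statistics import mean, stdev

monthly_kwh = {"Jan": 1200, "Feb": 1180, "Mar": 1210, "Apr": 4900, "May": 1190}

def flag_anomalies(readings, z_threshold=1.5):
    values = list(readings.values())
    mu, sigma = mean(values), stdev(values)
    return [m for m, v in readings.items() if abs(v - mu) / sigma > z_threshold]

accepted = dict(monthly_kwh)
for month in flag_anomalies(monthly_kwh):
    answer = input(f"{month} = {monthly_kwh[month]} kWh looks unusual. Keep it? [y/n] ")
    if answer.strip().lower() != "y":
        accepted.pop(month)  # rejected readings go back for re-ingestion

print("Rows accepted for the carbon calculation:", accepted)
```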
DAVID: Yeah, 100%. I haven't heard the centaur vs. cyborg framing before, but I love it. It is absolutely a workflow thing. One study found that AI not fitting your workflow was one of the main blockers to adoption — if you're constantly copying and pasting between tools or context-switching all the time, that's no good.
There's an excellent YouTube video by Andrej Karpathy — former head of AI at Tesla — where he talks about "Software 3.0" and the need to build tools that augment humans. He talks about a generation-verification loop: AI generates something, the human verifies it, repeat.
The best current example of the cyborg approach in practice is the modern coding IDE. These tools provide the traditional interface where you can look at and edit your code directly, but they also package all the context for the model automatically — you don't have to copy-paste anything, it knows what you've highlighted and can read your entire codebase. And they have what Karpathy calls an autonomy slider. You can say, "Go refactor this entire codebase in a different language," and it'll do it — but you'd have no idea what came out the other end, and you'd never professionally ship that. Or you can say, "Let's take this step by step — you do this part, I'll do the next." That's the best live example of the cyborg approach today.
And really, all we're waiting for in sustainability, and in other verticals, is someone to build that equivalent tool. It will just take a bit of time. But you're doing it in the sustainability space, and lots of others are doing it in other verticals. A specific, vertically designed tool that enables that generation-verification feedback loop — that's what's required.
SAIF: Bringing it back to business functions — there are certain functions where all three of your filters check out in the right direction. Take procurement in a large food or apparel business: there's flexibility given the volume and variety of suppliers, it's not binary win-or-lose, it's increasingly digital, and while it's relationship-based, a lot of commodity purchasing ultimately comes down to cost, cash, quality, and timing. So procurement seems like a great candidate for huge AI value.
At the other end of the spectrum are certain parts of HR — not HR in general, but the parts focused on human empathy, peer-to-peer learning, and workplace connection. And yet, what you and I are both saying is that probably every team could be using AI better than most people are today.
DAVID: Exactly. There are basics in any role in any organisation: information management, access and retrieval, writing things down, documenting processes. Those professional skills are present in every function, and people should be trying to use AI as much as possible to improve them. If you're not, you'll fall behind very quickly.
Obviously, if you're in the maintenance department and your job is making sure the air conditioning works, AI isn't going to help you change the physical outcome. But regardless of that, there's always going to be 10–20% of your role that's about managing the people around you, managing your relationships, processing the information flowing to you — and AI can do a lot there.
I also listened to one of your other episodes where you talked about the concept of a "change department." We're seeing the same dynamic with AI right now. Companies have an AI department whose job is to deliver AI to the whole business — and that feels like it's going to go the same way as the change department: ending up with a huge team that everyone else tells to leave them alone. The key is for each department to think carefully about how they use AI and own it themselves, rather than having it forced on them.
SAIF: Just for listeners who haven't heard that episode — this was a client of mine, around seven to eight thousand people, also in the utility space. The IT team was about 1,200 people and the change team was about a thousand. What we learned was that the change team largely existed to translate IT initiatives into the rest of the business, and vice versa. You definitely want to avoid that dynamic.
I actually think we have a real opportunity to avoid it with Gen AI, because consumers are often further along at a personal level than their organisations. They're using these tools — maybe just to find washing machines, but also to write emails, manage their lives — in ways that don't require supervision or support. There's a learning curve they're already on.
I want to move to sustainability teams specifically. When you and I last spoke, we talked about what sustainability functions are actually responsible for. What do you think are the big use cases for sustainability teams?
DAVID: One of the very hard things about sustainability — and I'm not a sustainability expert by any means, I've worked adjacent to it — is that you're trying to account for something that's very hard to see and very hard to count. It's often right at the edge of your product's usage. End-of-life waste accounting is super hard because the product has gone out into the world and you can't put nice controlled counting in place. That has led to sustainability data being spread across unstructured places and being very difficult to collect.
The brilliance of AI is that it makes that work so much easier. Traditionally, you'd have to go and read reports to figure out what companies were doing on sustainability, because it was buried in non-standard graphs in PDFs — very hard to automate. A lot of that can be automated now.
Visual data collection is going to be really powerful. With multimodal models, you no longer need to train a specific model to identify something in your factory. If you're trying to identify a specific type of waste on a production line, you can now do that with a generic AI model. What Gen AI has done is take capabilities that AI could technically perform — data extraction, processing, search, insight generation — and make them accessible without custom model training.
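As an illustration of the "no custom model training" point, here is a hedged sketch using the OpenAI Python SDK to ask a multimodal model about a production-line photo. The file name, model choice, and prompt are assumptions for the example, not anything from the episode.

```python
# Hedged sketch: send a production-line photo to a generic multimodal model and
# ask it what waste it sees, instead of training a bespoke image classifier.
import base64
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

with open("production_line.jpg", "rb") as f:  # illustrative file name
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "List any packaging waste visible in this image and estimate the count of each type."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```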
Gen AI is also great at writing code to do analysis for you. And finally — and I think this is really interesting — it changes how you expose data. Traditionally, sustainability teams have exposed findings through reports and spreadsheets. That's been the interface, both internally and externally. Now you can build web interfaces, chat interfaces, live agents that anyone can ask questions of and that have access to all the underlying data. It creates a much more interactive and usable way of surfacing sustainability insights — not "here's my spreadsheet, go look at it."
SAIF: My next question has an element of bias, but I want to build on that direction of travel. You've highlighted all the things Gen AI could do for sustainability teams. There's also the question of how sustainability teams access Gen AI.
As I understand it, there are roughly three hypotheses. Hypothesis one: direct access through ChatGPT, Claude, or Gemini — you're essentially building more sophisticated use within the LLM provider's own platform, the way I might have a "home improvement" project in Claude and keep iterating on my washing machine shortlist there.
Hypothesis two: point solutions that do specific things well and integrate AI into how they do them better. That's what we do at AltruistIQ — leveraging Gen AI to expose data in more interactive formats, enabling much more iteration, moving from static Excel exports to live AI-enabled tables. AI capability delivered through a purpose-built software platform.
Hypothesis three: every large enterprise eventually just builds its own single solution — everything feeds into a massive centralised data lake with AI-enabled analytics on top, and there are no third-party point solutions left.
The reason I'm raising this is: what should sustainability teams actually be doing right now? Wait for the third hypothesis to materialise? Run ahead with the first? Go looking for something in the second?
[Mid-roll: AltruistIQ]
DAVID: I like that framework. If you knew the answer to this with certainty, you'd be extremely well-placed to make billions in the VC market right now — everyone is asking exactly this question.
You've got the LLM itself, then the interface the LLM provider wraps around it, then LLM wrappers built by third parties, and then big custom in-house tools. There's going to be enormous competition for enterprise dollars between LLM providers and SaaS companies.
What's very telling is what OpenAI has been doing lately. They launched GPT-5, but then — interestingly — their next releases weren't really model improvements. They launched apps, so you can now address Figma, Spotify, or Booking.com directly within ChatGPT. And they launched a Workflows SDK, which lets anyone point-and-click their way to an automated workflow with ChatGPT inside it. What you can see is that LLMs are actively reaching for that next layer and trying to displace the wrapper products. It's a pitched battle.
What I'd say is this: several studies have found that the teams and enterprises that have succeeded most with LLMs are the ones who have built software themselves. Not built a model — but gone and built a web tool, a framework, or their own chat app, given it all the data it needs, given it context, and customised it for their teams. The people who have simply bought have been less successful.
So I would say to sustainability teams: start building. The barrier to building is so low now. Even if the LLM providers end up absorbing that space down the track, that's fine — it's happened before in tech. But you'll get ahead in the process, you'll learn a lot, and you'll build the muscle. Don't wait, because if you wait, the whole sector will pass you by.
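To give a sense of how low that barrier now is, here is a minimal sketch of the kind of thing a sustainability team could build in an afternoon: a script that puts your own emissions data in front of a model and lets colleagues ask questions of it. The file name, data shape, and model are assumptions for the sketch, and a production version would need access controls and grounding checks.

```python
# Minimal "build something" sketch: load your own emissions data, hand it to a
# model as context, and answer questions about it interactively.
import csv
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

with open("site_emissions.csv", newline="") as f:  # illustrative file name
    rows = list(csv.DictReader(f))

context = "\n".join(str(row) for row in rows)

question = input("Ask a question about the emissions data: ")
response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[
        {"role": "system", "content": f"Answer using only this data:\n{context}"},
        {"role": "user", "content": question},
    ],
)
print(response.choices[0].message.content)
```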
SAIF: Fantastic. David, we've touched a lot on use cases that are relatively conventional — cleaning data, running better analysis, creating more interactivity in how you query findings. These are real, but they're broadly true for finance, procurement, sales, and other functions too.
If I look at what Gen AI might unlock specifically for sustainability teams outside their core workflows — downstream impacts, end-of-life, upstream agricultural sourcing — and view it through a double-materiality lens: how do I manage the impact my business has on the world, and how do I manage the impact the world has on my business as a result of climate change? Do you see use cases for Gen AI there?
DAVID: Absolutely. But they look very different, and they're probably not as easily accessible as the ones we've been discussing — because they interact with the physical world. Any project that interacts with the physical world is harder to build and harder to pilot.
The key enabler here is the multimodality of the new models. The models can now see and hear, and you can prompt them in natural language to determine what to look for. Traditionally, if you wanted to identify a Coca-Cola bottle in a pile of rubbish, you'd have to train a specific classifier. Now you can just ask Gemini, ChatGPT, or Claude, "How many Coke bottles are in this image?" and it'll probably give you the right answer. That generic ability to sense and understand the real world should allow us to deploy a lot more sensing-type products at scale.
The second dimension is physically changing the world — changing outcomes. That's more frontier territory right now, but there are teams working on using Gen AI to explore proteins and material structures, and to generate new proteins and new materials that could completely revolutionise things. For me, that's probably where I'm most bullish about AI's impact on society. It'll be one of those things where AI hasn't decided on its own to invent a protein to munch through plastic — someone has directed it to do that. But the enabling role it plays will be more revolutionary, I think, than becoming 50% more efficient in our digital workplaces.
SAIF: The societal impact angle is honestly enough content for an entirely separate episode, which we should do. We're coming to the end, but I have two rapid-fire questions from our audience.
First: how do you upskill quickly as a sustainability professional looking to get familiar with AI?
DAVID: Simple answer: go beyond the chat window. Stop using AI only through the chat interface and start figuring out what else it can do. A few things to look at quickly: workflows; tool use — AIs can actually use external tools, but you have to go through the API to enable that; and MCP, the Model Context Protocol for connecting AI systems to other tools, which is how you'd connect to something like Spotify. And then try a vibe-coding platform, something like Lovable — build something and see how AI works by actually building with it. As with any learning, push yourself out of your comfort zone.
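For readers curious what "tool use through the API" looks like, here is a hedged sketch using the OpenAI Python SDK's function-calling interface. The grid-carbon-intensity tool is a hypothetical example; a real system would execute the call and send the result back to the model in a follow-up message.

```python
# Hedged sketch of tool use via the API: describe a function to the model,
# let it decide when to call it, and have your own code execute the call.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

tools = [{
    "type": "function",
    "function": {
        "name": "get_grid_carbon_intensity",  # hypothetical tool for illustration
        "description": "Return the current carbon intensity of the UK grid in gCO2/kWh.",
        "parameters": {"type": "object", "properties": {}, "required": []},
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[{"role": "user", "content": "Is now a low-carbon time to run the dishwasher?"}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    print("Model asked to call:", call.function.name, call.function.arguments)
    # Your code would now run the real lookup and return the result to the model.
else:
    print(message.content)
```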
SAIF: And the final one: how do you pressure-test AI to ensure you can trust it?
DAVID: This is actually a solved problem in software. When you write software, you write unit tests — tests that determine whether your code produces the right output. If someone else changes your code and a test breaks, you know something is wrong.
You just have to do the same thing with AI. Build what are called evaluations — essentially: if I put this question into my AI system, does it give me the right answer? If you have a dataset of 100, a thousand, or ten thousand such question-answer pairs, and you run them every time you change your AI system, you'll have strong confidence that it's still doing what you expect. You can't just keep tweaking your prompt and swapping out underlying models and assume it's going to be fine. You have to build evaluation into the process.
I expect evaluations will become one of the most important things in the workplace over the next decade — even though almost no one sees that coming yet.
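Here is a minimal sketch of what such an evaluation set might look like in Python. The questions, expected answers, and the placeholder ask_ai function are illustrative assumptions; real evaluation suites are larger and grade answers far more carefully than a string check.

```python
# Minimal evaluation harness: fixed question/expected-answer pairs, re-run every
# time the prompt or the underlying model changes.
EVAL_SET = [
    {"question": "What scope are purchased-electricity emissions reported under?",
     "expected": "scope 2"},
    {"question": "What unit is the carbon intensity of electricity usually quoted in?",
     "expected": "gco2/kwh"},
]

def ask_ai(question: str) -> str:
    # Placeholder: swap in a real call to your own prompt + model of choice.
    return "Purchased electricity falls under Scope 2 of the GHG Protocol."

def run_evals() -> float:
    passed = 0
    for case in EVAL_SET:
        answer = ask_ai(case["question"]).lower()
        if case["expected"] in answer:  # crude string check; real evals grade more carefully
            passed += 1
    score = passed / len(EVAL_SET)
    print(f"{passed}/{len(EVAL_SET)} cases passed ({score:.0%})")
    return score

if __name__ == "__main__":
    run_evals()
```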
SAIF: When I was at McKinsey, my early managers called this triangulation — I'd build a model, they'd ask two or three tricky questions to expose everything I'd done wrong, and I'd go back and fix it.
David, this has been genuinely insightful. I felt like we could have run for twice the time, which is all the more reason to bring you back for another episode. On behalf of the show and all our listeners — thank you so much for joining us.
DAVID: Thank you so much for having me.
Thanks for listening to this edition of the State of Sustainability Podcast. Hit follow or subscribe to get notified as soon as our next episode drops.

