Posted April 28, 2025

In this episode of AI + a16z, a16z Infra partners Guido Appenzeller, Matt Bornstein, and Yoko Li discuss and debate one of the tech industry’s buzziest words right now: AI agents. The trio digs into the topic from a number of angles, including:

  • Whether a uniform definition of agent actually exists
  • How to distinguish between agents, LLMs, and functions
  • How to think about pricing agents
  • Whether agents can actually replace humans, and
  • The effects of data silos on agents that can access the web.

They don’t claim to have all the answers, but they raise many questions, and offer insights, that should interest anybody building, buying, or even marketing AI agents.

Transcript

Guido Appenzeller: So, I think there are some things which are probably kind of easy to say, which is (A), there’s a good amount of disagreement [about] what an agent is. We’ve heard a lot of different definitions of it, both on the technical side and, I’d say, on the marketing and sales side in some cases, because there are some sales models associated with it. So, let’s start with the technical side. I think there’s sort of a continuum here, you know. The simplest thing that I’ve heard being called an agent is basically just a clever prompt on top of some kind of knowledge base, or some kind of context, that has this sort of a chat-type interface. So, from a user’s perspective, this looks like [what] a human agent would look like, right? So for example, [if] I asked, “Hey, I have a technical problem with my product X, Y, Z,” it looks at the knowledge base and comes back with a canned response.

Yoko Li: But there doesn’t have to be a knowledge base, right?

Guido: It doesn’t even have to be a knowledge base. I see, got it. OK. So, maybe it’s just a trained model; all the knowledge is in the model weights, so it’s even simpler. So, an agent could just be an LLM with a chat interface or something like that, by some definition, right? I think on the other end of the spectrum, there are some people who basically say that for something to be a real agent, it has to be something fairly close to AGI, right? It needs to persist over long periods of time. It needs to be able to learn, and it needs to have a knowledge base. It needs to work independently on problems. If you take, then, the most extensive definition, is it fair to say that doesn’t work yet?

Yoko: I think so. It doesn’t work yet.

Guido: Will it ever work?

Yoko: That’s a philosophical question.

Guido: Alright. Fair. Very fair, very fair. So, if we take that continuum in between, is there at least a way to chop that up into a couple of categories of something, maybe degrees of agentic behavior?

Yoko: And there are different types of agents. There are some artsy agents that help artists come up with new Bézier curves. There are coding agents, which we like to talk about as the agent of the day.

Guido: Which we use. Yeah.

Yoko: Yeah. Which we use. There’s [an] agent that’s just a wrapper on top of LLMs.

Guido: That’s right. Yeah.

Matt Bornstein: I may be the contrarian in this group.

Guido: Alright.

Matt: Look, I kind of think agent is just a word for AI applications, right? Anything that uses AI can kind of be an agent now. Before we started this talk, I actually went online just to refresh myself on some of the more interesting AI agent perspectives out there. I found a really cool talk from Karpathy that he gave a couple of years ago about agents, which I can describe a little bit. But the really funny part was the YouTube recommended videos to watch next; it’s like, “‘AI agents are going to revolutionize your lifestyle’ and ‘the rise of super intelligent AI.’” You know, it’s just kind of, like, marketing. And so, I actually do think that’s what’s going on in a lot of ways.

The cleanest definition I’ve seen of an agent is just something that does complex planning and something that interacts with outside systems. The problem with that definition is all LLMs now do both of those things, right? They have built-in planning in many cases and they at least consume information, you know — at least from the internet, maybe from some servers that expose information through MCP or some other protocol. So, the line really is very blurry. And, you know, what was so interesting about the Karpathy talk is he related it to autonomous vehicles and said AI agents are a real problem, but it’s like a 10-year problem, a decade problem that we need to work on. And I think most of what we’re seeing in the market now is not the decade version of this problem. It’s the weekend-demo version of this problem. And this is why there’s so much confusion: You have this kind of poorly defined, nebulous thing that LLMs are kind of consuming themselves over time.

And so, I don’t think anything we have are actually agents, and agent itself may be a poorly defined and kind of overloaded term. But if someone’s willing to do the hard work and define exactly what it’s like to kind of be a human, but in digital form, and spend 10 years to make it actually work, you know, that’s what I’m excited to see.

OK, so, defining agents is a difficult job. Maybe it’s easier to talk about how people use the tools they call agents, and what the differing degrees of agentic behavior are.

Yoko: I wonder if part of the conversation is redefining agent, because we all know that agent as a term is just not a great term. It means so many things to so many people. It’s interesting to dissect: What do we mean? What do different people mean when they say agents? What are different ways we could utilize this process we call agents?

Guido: So, it seems to me if we’re trying to define agents, or maybe even degrees of agentic behavior, which might be a little easier, there’s something like a user interface aspect to it, right? Where something is a pure copilot, where basically a user goes back and forth with an LLM to work on a particular task, that’s often not called an agent. Is that fair? There’s a little bit of the copilots-versus-agents UI models.

Yoko: Yeah. I guess, like, what are the elements, we would think, that go into agentic behavior? Like, Matt mentioned, planning could be one. There could be decisions made by the agent. There has to be an LLM somewhere. But curious about your take.

Guido: So, I think another definition we heard from Anthropic recently was this idea that an agent is an LLM running in a loop with tool use, right? There are two important parts to that. One is this notion that it’s not just a single prompt, and not even just a single static sequence of prompts, right, but something where the LLM takes the output of a prompt, feeds it back into itself, and based on that makes decisions on what the next prompt [is]. And likely also when to abort or, you know, when a task is complete. For the real agents, or the more agentic behaviors, I think that’s a reasonably good definition. I think the other thing…

Matt: But just by that definition, isn’t every chatbot effectively an agent in this world, right? Like, if I just go to chatgpt.com and use their latest reasoning model with web search, isn’t it using tools and feeding its outputs into a new prompt in order to do kind of chain of thought?

Guido: Chain of thought is a little bit in between. If it’s just a single prompt that comes back with a result, then it wouldn’t have this notion of planning, of working toward a longer-term goal, and deciding itself when it is complete, right? If you have chain-of-thought reasoning where I’m giving it a more complex task, that’s starting to look agentic. I agree.

Matt: I just think it’s really tough to define a system based on what someone says to it, right? Because these are by design unstructured inputs. These systems will accept literally anything. And so, sure, if you tell it, you know, “What’s today’s weather?” I would agree that’s not agentic, right? That’s just fetching from an API. If you ask it, “Define a new philosophy of weather,” right, it’ll happily go do it. So, it’s an agent if you ask it one thing, but not an agent if you ask it another thing. I think that’s a lot of the confusion in the market around this. And if we spoke in the terms that you’re talking about, Guido, of like, “Hey, this is an LLM in a loop with a tool,” that’s actually a much more productive way to talk about it, I think.
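To make that framing concrete, here’s a minimal sketch of “an LLM running in a loop with tool use” in Python. Everything in it is a hypothetical stand-in: call_llm() is not any vendor’s real API, and the single “search” tool is a toy.

```python
# A minimal "LLM running in a loop with tool use" sketch.
# call_llm() is a hypothetical stand-in for a real model API.

def call_llm(context: str) -> dict:
    # Toy model: "plans" one search, then decides the task is done.
    if "search ->" not in context:
        return {"action": "search", "input": context}
    return {"action": "finish", "answer": context.splitlines()[-1]}

tools = {
    "search": lambda query: f"results for {query!r}",  # stand-in tool
}

def agent(task: str, max_steps: int = 10) -> str:
    context = task
    for _ in range(max_steps):            # the loop
        step = call_llm(context)          # model output decides the next step...
        if step["action"] == "finish":    # ...including when the task is done
            return step["answer"]
        result = tools[step["action"]](step["input"])  # tool use
        context += f"\n{step['action']} -> {result}"   # feed the output back in
    return "step budget exhausted"

print(agent("What is the weather in Tokyo?"))
```

The important part is not the stub model but the shape: each output feeds the next prompt, and the model, not a fixed script, decides when to stop.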

Guido: That said, it seems like we’re seeing some degree of specialization of user interfaces in sort of two directions, right? There’s, let’s say, a Cursor or something like that, which really emphasizes the tight loop between the user … the tight feedback loop between the user and the LLM and the thing I’m working on, right? So, I want immediate gratification when I do something, you know, and response time matters. Then there’s more the backend, you know, source-code-management-system-type plugins, where it’s more about throwing something over the wall, maybe after answering a couple of questions. And then you try to maximize the amount of time the agent can work independently.

So, it seems like, I think you’re right that there’s no clean system definition split between the two, but there seems to be a little bit of a user interface specialization. Is that a fair statement?

Yoko: I almost feel like for all the use cases we’ve described, there’s one element that all agents have, which is reasoning and decision-making. A single call to [an] LLM to say, “Translate this text to JSON,” that’s probably not an agent. But then if you ask the LLM to, say, “Hey, decide where, you know, this response goes and route it for me,” it feels more like an agent than before. So, it almost feels like planning. I’m actually not sure: does the agent need to plan or does it need to decide? Maybe both. I actually feel like it’s a multi-step LLM chain with a decision tree.

Guido: A dynamic decision tree.

Yoko: A dynamic decision tree. Yeah.

Guido: Yeah. I think that’s fair.
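Yoko’s routing example gives the simplest version of that dynamic decision tree: an LLM sits in the middle of a chain and picks the branch. A minimal sketch, where classify() stands in for a hypothetical model call:

```python
# A dynamic decision tree with an LLM in the middle: the model decides
# which branch handles the input. classify() is a hypothetical model call.

def classify(message: str) -> str:
    # Stand-in for a prompt like "Decide where this response goes."
    return "billing" if "invoice" in message.lower() else "technical"

routes = {
    "billing":   lambda m: f"billing team gets: {m}",
    "technical": lambda m: f"support engineer gets: {m}",
}

def route(message: str) -> str:
    branch = classify(message)            # the decision is made by the model
    handler = routes.get(branch, lambda m: f"human triage: {m}")
    return handler(message)

print(route("Please correct my invoice"))  # -> billing team gets: ...
```

A single translate-to-JSON call has none of this: no branch of the program depends on what the model says, which is why it feels like a function rather than an agent.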

Matt: I think we’ve all just been nerd-sniped. It’s, like, humanities people love classifying, and they draw kind of fine distinctions between different types of things, entities, whatever. We’re computer scientists; not that there’s anything wrong with humanities, but we’re just not that. So, I think we’re not well equipped for when a bit isn’t just zero or one, but maybe something in between. And we just talk about it a lot. We try to, like, coerce it to one value or the other.

Of course, agents are more than pure technology. They’re also becoming products, which means they need to be marketed. And how someone positions their product has a major effect on how they price it. What’s more, the ultimate value of any given agent, which is still to be determined for the vast majority of them, is the degree to which it can actually replace, or simply augment, human workers.

Guido: There is an interesting point, which is that I think there is a marketing angle to agents, right? I’ve heard this narrative from a couple of startups that are basically saying, like, “Hey, you know, we can price the software that we’re building much, much higher because this is an agent.” So, we can go to a company and say, “You’re replacing a human worker with this agent. The human worker makes, I don’t know, $50,000 a year, and therefore this agent you can get for only $30,000 a year.” This sounds really compelling at first glance. And actually, I mean, there’s some value to it in the very early days, because it’s very easy to understand comparative pricing for somebody who has to make a buying decision, right?

Now, on the flip side, we all know that the cost of a product over time converges towards the marginal cost of production, right? And so, where I used to use a translator, maybe to translate a page of text, today I use ChatGPT. I do not pay ChatGPT like I paid my translator. I pay a tiny fraction of a cent via the API, which is close to the actual cost. So, I sort of wonder how much of the agent debate is driven by marketing and pricing.

Matt: I just actually think this is a really interesting topic. What fields can you think of that are actually suffering complete replacement from AI or AI agents? And this is a setup, I’ll warn you. I have another extreme point of view that I’ll say afterward, but can you think of fields where this is actually happening?

Yoko: Not completely, but definitely partially. Because there are a lot of, for example, voice agents that replace receptionists, people who would, you know, get back to customers. So, there’s definitely a lot of workload that has been offloaded from the folks who traditionally did the job. But I don’t think they’re, you know, 100% replaced. They can do something else. But we are seeing headcount growth slowing in some areas. So, it’s not that existing jobs are being replaced. It’s more like, they’re hiring net-new humans slower.

Guido: I think that’s exactly right. I mean, I think in a few cases humans will get replaced by AI. In most cases, you know, two humans will get replaced by one human who’s more productive with AI.

Matt: Yeah. Or maybe they keep the two employees and, you know.

Guido: Maybe go to three employees because now they’re more productive.

Matt: Right. It’s just a really interesting question. And the reason I think it’s really relevant to agents is that part of the ethos, and part of the confusion, around agents is this idea that we actually will develop human replacements with this thing we call an agent — which, by the way, is a name for a person; before we had AI, we had people called agents, and we still have all kinds of people called agents — and it just doesn’t seem like that’s happening, right? Not in the replacement sense you mentioned, Yoko. We’ve always had, you know, customer support automation. We’ve had 1-800 numbers where you, like, press one for sales; that’s existed for a long time. This is a much better form of that, obviously.

Translation is a great example too, Guido. These systems can perform translation extremely well, but you’re probably not going to just stick something into ChatGPT and then publish the output on your website. There is actually work that needs to take place. And I think the reason for this is there’s just fundamental creative work in most things that humans do. I think from our kind of perch in Silicon Valley, we can forget that sometimes — that people all over the country, doing all sorts of jobs, actually have hard jobs. And not just hard in the sense of “someone’s got to do it” jobs, but hard in the sense that it does take thinking and human decision making, which … I just don’t know that AI has what we would think of as decision making or intent. It’s a system where, still, somebody has to push the button, right? It may be running somewhere, it may do a great job of whatever. But someone still has to give it a prompt and hit go. And to me that’s a lot of the confusion around agents.

We’re all thinking at some point a human person with intent and creativity and thinking is going to be replaced. I’m just not sure that even is theoretically possible. It’s almost just like a catch-22 to say an AI system is thinking for itself because somebody has to have sort of created it. You know, this is old sci-fi philosophy I’m getting into now, but, like, I actually do think it’s a big reason for the confusion that, you know, we sort of experience now.

Yoko: It’s interesting because there are two types of agents we’re already talking about. There’s one type where the agent is replacing humans, working with humans, doing things humans can do. There’s the other type of agent that’s more low-level system processes: they work with each other, they hand off tasks to each other. To some extent, agents are like technical details in the system in that way. But we mean both when we talk about agents.

Guido: In that case, is there actually a difference between an agent and a function?

Yoko: I think so. I think the agent will be multiple functions with LLMs in the middle.

Guido: If I have a low-level agent and I’m giving this low-level agent a task, and I get back a task result, it looks a little bit like a classic API call.

Yoko: But with the LLM in the middle to make decisions on what to do for that API call.

Guido: Understood. But that’s sort of how this function works internally to some degree, right?

Yoko: Yes, yes.

Guido: So from the outside, would I care?

Yoko: You wouldn’t care. It’s like, most of the time when we talk about AI SDR agents, what we mean by that is an agent that can go to the CRM, pull something out, filter the list, draft an email, and send the email. So that feels very process-level instead of human-level.

Guido: Yeah, totally.

Yoko: Yeah. So, that’s what I meant.

Guido: If you don’t know how this thing works internally, the classic function and agent become indistinguishable.

Yoko: Totally, I absolutely agree. But as a programmer, when you write a function, you would define the agent as just that interface.

Guido: I see.
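In code, the point the two of them converge on looks like this: two implementations behind the same interface, where whether an LLM sits inside is an implementation detail. The llm() helper is a hypothetical stand-in, not a real API:

```python
# Two implementations behind the same str -> str contract.
# llm() is a hypothetical stand-in for a real model call.

def llm(prompt: str) -> str:
    return prompt.split(":", 1)[-1].strip()[:80]   # toy stand-in model

def summarize_classic(text: str) -> str:
    return text[:80]                               # deterministic code path

def summarize_agentic(text: str) -> str:
    return llm(f"Summarize in one line: {text}")   # the model decides how

# Same signature, same contract. The caller can't tell which is which.
for summarize in (summarize_classic, summarize_agentic):
    print(summarize("A long support ticket about a broken invoice ..."))
```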

We’ll get back to pricing shortly. But first, let’s dive a little deeper into this discussion of how interacting with an agent is different than, or similar to, traditional software-based functions.

Matt: So, here’s one interesting thing to think about on that topic. I totally agree with you, Guido, and I think you sort of agree, too. It really is a function if you just look at it that way. But shareable, reproducible functions have never really been a thing. This has been one of these long-time goals that people in the market have tried to claim: “Oh, I can just write a function and then anybody on earth can use it.” Right? Like, you know, we have packages, where you can download a whole package with various functionality, but not literally just one function that you can share. If you kind of squint a little bit, that kind of exists now with AI, because you have these models: one is trained by somebody, somebody else may download it, fine-tune it, train a LoRA, package it up in some new and interesting way. And then it’s immediately available for someone else to use on hosting services or HuggingFace or something like that.

So, while it does seem to be just an implementation detail whether you’re using an LLM or not, there is this interesting thing where the model itself takes up so much of the functionality in the function, and it’s just a different kind of animal compared to normal code. It’s actually kind of shared by default, in a way, because nobody’s going in and training their own model every time they’re writing code. You know, it’s obviously heavy, it’s harder to move around. There are all these characteristics that differ from normal functions; some of them are actually very desirable, some are kind of bad characteristics you don’t want, but many of them are interesting. And I think we’ll actually see new infrastructure, new dev tools built around this in the long run.

Guido: I think that would make sense. I mean, if we go back in time, the last time we invented a major new component for building systems, it was probably networking, right? How we thought about calling a function changed a lot from before networking to after, right?

Matt: Totally, totally.

Guido: It’s like, the complexities of APIs and the infrastructure around that is completely different today.

Yoko: This is such a good point, because now that I think about it, I feel like humans are just functions too. Like, if you do a thought experiment and replace the LLM in the program with a human, the kind of answers the human will give back to the program is not that different from what the LLM will give.

Matt: So, if we actually all get hooked up to servers one day and can be called as a function from Lambda, then I will agree that agents have been created. That’s what an agent is.

Guido: Isn’t Mechanical Turk exactly that, or maybe even your email inbox?

Matt: Yeah. Sounds like an agent to me.

Yoko: There was an Amazon Go supermarket a while back in SoMa. I think they were advertising that it was computer vision models behind the scenes identifying what you took from the supermarket. But then people found that they had hired a lot of people behind the scenes to actually label the data in real time. So, the humans in that case are the functions that today may be…

Guido: Secret agents.

Yoko: Right. Replaced by LLMs with…

Matt: Well, but this was exactly my point though, right? There actually is important creative work. Even for a grocery store checkout clerk, you could naively think, “Oh, this is an easy job.” Actually, it’s not an easy job at all, right? And so, you can take this work and kind of shift it, and you can squeeze it down with automation and stuff, but it never really goes away.

Yoko: Oh, yeah, absolutely.

Alright. So, given all of this, how should companies think about pricing their agents? Per seat, per token, per task? Hint: it might be too early to truly tell.

Guido: Usually, if you introduce a brand new product category, you often initially price against the status quo, right? Against whatever you replace, or augment in some cases. But let’s assume we have a direct replacement, right? That’s, I think, where this idea comes from: “Oh, this replaces a human” — which it doesn’t — but if it did, then you could charge, you know, X amount for it, right? And usually, over time, competition kicks in and you’re effectively priced by how much your competitors are charging. You start sort of an erosion. Then it depends on many things. Like, how much of a moat do you have? Do you have customer lock-in, right? And so on. Long term, you converge toward the marginal cost of production, which, I mean, look, if I look at most agents today, is probably very low, right? Any agent you can purely model in software with a couple of LLM calls, you can run at a very low cost — and the cost is decreasing over time.
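A back-of-the-envelope version of that marginal-cost point, with purely illustrative numbers (the price and token counts below are assumptions, not any vendor’s actual rates):

```python
# Illustrative only: assumed price and token counts, not real rates.
price_per_1k_tokens = 0.002   # assumed $/1K tokens
tokens_per_call = 2_000       # assumed tokens per LLM call
calls_per_task = 5            # "a couple of LLM calls" per agent task

cost_per_task = calls_per_task * tokens_per_call / 1_000 * price_per_1k_tokens
print(f"~${cost_per_task:.2f} per task")  # ~$0.02: pennies, not a salary
```

Even if those assumptions are off by an order of magnitude, the gap between pennies per task and a $30,000-a-year price tag is the erosion Guido is describing.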

Matt: And I would sort of argue that’s kind of already what’s happening — that, in practice, most AI applications, and, in particular, if we want to call them AI agent applications, you know, they have their sales pitch around, “You should pay us X because we’re saving …” You know, it’s like a classic ROI calculation.

Guido: Establish value. Yeah.

Matt: Yeah, exactly. Value-based pricing. But in practice, I think most buyers are actually pretty sophisticated about what’s going on under the hood. And to your point, they know it’s pretty simple stuff happening. And so, it’s like, “Hey, what does it cost you to run all these GPUs and we’ll pay you some premium over that.” And I think that’s how a lot of vendors are pricing in practice these days.

Guido: I mean, long term you’d expect pretty healthy margins, just like in SaaS, right? Which traditionally has very good margins.

Yoko: It’s so funny, because we always advise companies to not price based on the margin, but price based on the value you add, whatever that could be. It could be compared to other vendors in the market; it could be compared to just, you know, what it would cost to build in-house. And traditionally for infra, a rule of thumb, though not always the case, is that if the service is used by a human, it’s per-seat pricing, and if it’s a service that’s used by other machines, it’s usage-based pricing. And I actually don’t know where to put agents here.

Guido: Well, it could be used by either, right?

Yoko: It could be used by either.

Guido: An agent could be using it or a human could be using it.

Matt: Look, I think your analysis is exactly right. And the reality is most AI companies don’t know what value they’re generating yet. This is so new and so nascent that it’s like, “Hey, we’re just going to charge something that we’re not going to lose money on.” And you know, in the case of OpenAI, they now have many millions of users. They probably don’t have a very strong sense of what they’re all using it for. And once they do — and you see this more, they’re trying to verticalize a bit more and have kind of specific products for specific use cases, code obviously being the big one — then you’ll be able to see the pricing kind of catch up … is kind of my hypothesis.

Yoko: This reminds me of the OpenAI point you brought up. I was thinking about AI companions, because that’s the closest to per-seat human pricing. Like, you can’t charge someone for every sentence they say to their companion. Although some of the foundational models…

Matt: There are services that will charge you per response. I haven’t used them, but they do exist.

Yoko: I see. Wow. OK. So, usually it’s kind of weird to charge someone, like, by tokens for how much they talk to the companion, rather than, like, a flat monthly fee.

Guido: It doesn’t feel like a true friend if you get charged by it, right?

Yoko: Right. Exactly. It’s very transactional.

Matt: Look, this is all theory, right? People love sitting around and talking, “Oh, we’re going to charge per person, per task, per, you know, world economy that we rescue.” It’s like, it’s all made up, right? I think Guido’s thing was exactly right. Let’s look at the actual technology underlying what we’re calling agents right now, where are they being deployed and why? And honestly, the pricing, the marketing, the sales tactics, all of this kind of follows from what they’re actually selling.

If I’m selling something that looks like an agent, but I haven’t truly figured out the value I’m providing to my users, how do I justify the jump to a higher price point when I do figure out that value?

Matt: You just need to be selling a solution rather than a product. This is really well-worn expertise in enterprise go-to-market. With code, you can somewhat see the decoupling of price from the underlying technology now, because it really works. There’s very clear ROI for the people who use it. And so, as a VP of engineering or a CTO, you can look at this and say, “OK. I’m actually saving a lot of money and my guys are getting a lot more productive. I can value … I can do a normal ROI.”

Guido: And they’re happier.

Matt: Yeah. So you’re kind of buying a solution, right? You’re buying from a vendor something that solves a problem for you, which again, Microsoft, Oracle, Salesforce people have been doing forever. Once we start to see more of that, it’s going to be these things that become real products and kind of decouple pricing, and look kind of like real businesses, I think.

Yoko: I think it’s dictated by the high-level applications. So, I’ll give you an example. I’m a Pokemon Go player. For those who have played Pokemon Go: once you collect enough Pokemon but are out of storage in your pocket, you need to pay extra to buy a new virtual bag that you can put more Pokemon in. And as an infrastructure investor, I invest in storage businesses, and when I look at how much I need to pay for, like, 30 extra Pokemon, it was thousands of times more expensive than what the underlying storage costs. So, it actually reminded me…

Guido: I’m surprised it’s only thousands.

Yoko: It’s only thousands.

Guido: I would have guessed 10 to the 15th or so.

Yoko: There’s a whole price curve on Pokemon storage, it turns out, but…

Matt: Because this is basically one JSON blob in your Pokemon account.

Yoko: Right. It’s one JSON blob. I know.

Matt: And they charge you like, five dollars.

Yoko: Yeah. And then normal Pokemon players wouldn’t think about this. Like, how much does storage cost, right? A normal Pokemon player will be like, “Oh, for this capability, I would happily pay thousands of times more than if I were to have an S3 bucket somewhere.” So, part of it is monopoly. It’s an application-layer monopoly: you wouldn’t be able to store the Pokemon anywhere else. And two, it’s the use case. It’s for a different audience that wouldn’t be asking these questions. They would be thinking about the net new value: What’s the net new cost I’d be willing to foot the bill for if I were to get this value? “Is it a fun game? It’s a fun game. Take a hundred more dollars.”

Matt: Yeah, I think that’s exactly right. And implicit in what you’re saying is this idea that the product or the solution has to actually work for them, right? For a less technical person, you know, the person who’s not going to try to provision their own storage bucket to self-host Pokemon.

Guido: [crosstalk] for Pokemon. Yeah.

Yoko: And it’s quite defensibly differentiated too because, you know, Pokemon Go is not open source. There’s no other replacement of Pokemon Go. There’s only one Pokemon Go. So, there’s only one place where you would be willing to pay so much money for Pokemon storage.

Guido: Plus very strong brand, plus you have a little bit of network effect because you can play together.

Yoko: Yeah. And then we’ll see the AI agent version of this. I can’t wait to see the AI companion version of this … paying for storage for an AI companion’s wardrobe.

As the AI market continues to shake out and evolve, where will agent capabilities ultimately live? For example, can they live inside LLMs or must they call external tools? And who’s ultimately in the best position to influence this?

Guido: Super interesting question, right? What’s the system’s perspective on how an agent is built? And I personally think that, architecturally, there really is no difference between your typical SaaS software today and an agent in terms of how you build it, and let me explain why, right? So, in an agent, we said you have sort of an overall loop with an LLM and prompts that feed into itself, plus external tool use. The LLM itself, you probably want to run on separate infrastructure just because it’s highly specialized. You need these vast GPU farms — you can’t easily run today’s LLMs on a single GPU — so that’s very specialized infrastructure that runs externally. So, the LLM call is external.

The state management… Well, today in SaaS applications, we do all the state management externally in databases or something like that. So, you probably also want to externalize that, right? And then what remains is fairly lightweight logic where basically I’m taking context that I retrieve somehow from databases, I assemble that into a prompt, I run the prompt, and then I occasionally invoke tools. Maybe I do that with MCP or something like that with an external server. But the core loop is actually pretty lightweight and I can run a gazillion agents on a single server. Not a gazillion, but many agents on a single server. I don’t need a lot of compute performance for that. Does that sound about right?

Matt: Yeah, yeah. I totally agree. The interesting architectural question for me has always been how you handle the kind of non-determinism that comes with it. Many of these successful AI applications that we all use and love really just spit model outputs back out to the user, right? Like, a chatbot or image generator. It’s like, “Hey, I called the LLM. Here’s what I got. Good luck.” When you try to actually incorporate the output from an LLM into the control flow of your program, that is actually a very hard, very unsolved problem. So, to your point, there are relatively minor architectural differences today, but this may actually drive more significant changes in the future.
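Both points fit in one sketch: the agent core below is the thin loop Guido describes, with the model, state, and tools all external, and the model’s output is forced through a strict parse-and-validate step before it touches control flow, one common way to contain the non-determinism Matt is pointing at. Every name here (llm_api, db_load, db_save, run_tool) is a hypothetical stand-in:

```python
import json

# Hypothetical stand-ins for the heavy, external pieces:
def llm_api(prompt: str) -> str:              # external model servers (GPU farm)
    return '{"action": "reply", "text": "done"}'

def db_load(session: str) -> str:             # external state store
    return ""

def db_save(session: str, state: str) -> None:
    pass

def run_tool(name: str, arg: str) -> str:     # external tool server (e.g., via MCP)
    return f"{name}({arg})"

ALLOWED_ACTIONS = {"reply", "tool"}

def agent_step(session: str, user_input: str, retries: int = 3) -> str:
    context = db_load(session)                       # 1. retrieve context
    prompt = f"{context}\n{user_input}"              # 2. assemble the prompt
    for _ in range(retries):
        raw = llm_api(prompt)                        # 3. external LLM call
        try:
            step = json.loads(raw)                   # 4. strict parse...
            if step["action"] not in ALLOWED_ACTIONS:
                continue                             # ...and validation
        except (json.JSONDecodeError, KeyError):
            continue                                 # malformed output: retry
        if step["action"] == "tool":
            out = run_tool(step.get("name", ""), step.get("arg", ""))
        else:
            out = step.get("text", "")               # only validated output used
        db_save(session, f"{prompt}\n{out}")         # 5. externalize state
        return out
    return "could not get a well-formed model response"

print(agent_step("session-42", "Close my ticket"))
```

The core loop holds almost no state and does almost no compute, which is why many of these agents can share a single server.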

Yoko: I actually think the winners will be the specialists, not the foundational models. It’s the people who will build on top of the foundational models or fine-tune the foundational models. So, like, a very artistic example of this is that I’ve been spending the last two weeks just prompting GPT-4o, their image model. It’s very good at cartooning, so it’s very good at manga. It can spell, so it can carry a storyline. But then I realized that there are only two or three styles it’s really good at. It’s good at Ghibli, it’s good at manga, and then there are variations of a style in that realm. So now, where art comes in is that the market likes out-of-distribution art. No one wants to see the same things over and over again, because that’s how they value art: something that’s different.

Matt: Ideally, maybe.

Yoko: Ideally, maybe.

Guido: Did somebody recently define art as out-of-distribution samples?

Matt: Yeah.

Yoko: Art can be in distribution as pop art, right?

Guido: Maybe.

Yoko: It could also be out of distribution. Like when impressionism came up many years ago: eventually everyone was painting impressionism, and at the time, the painters who came before were like, “What’s wrong with your eyes? Why are you drawing blurry images?” So, styles come and go, but because of that, I think it’s a pushing-the-distribution question. The foundational model will never cover 100% of everything. So, it’s really up to the humans and the specialists of the next wave to come up with the new data, new workflows, new aesthetics to push that distribution.

Of course, at the end of the day, agents are only as useful as the tools and data to which they have access. So what happens if major web platforms decide they want to keep agents from accessing their data?

Guido: It seems like one of the hardest things about agents today is data moats. In some cases, just because they’re technically difficult — I’m trying to access data, an agent is trying to access data, and it’s just very hard to integrate with that system. In some cases, it’s very deliberate, right? On my iPhone, the photos are not accessible via any API, because it’s a walled garden.

Matt: So, sort of data silos, you’re talking about.

Guido: Data silos, right? So, is that something that’s holding back agents, or making them more difficult? Or, to make it even stronger: consumer companies traditionally were often opposed to offering automated access to their services, because they want the user engagement, they want the time to advertise to the user. Will that limit how much we can deploy agents?

Yoko: And would that change once we have browser-native agents that can browse the web and browse a [crosstalk 00:30:23.020].

Guido: Great question. Yes. Yeah, yeah.

Matt: I think Yoko is totally right. You know, it’s like, there’s strong incentives for people who own data about, you know, physical entities, people, businesses, etc., to keep it to themselves, right? Especially because they may be scared what AI is going to do to them, by the way, so they’re kind of clinging tight to what they have. And these problems are rarely solved by defining a new protocol and just saying, “Hey, if we make it easy for people to give away their core assets, they’ll just do it.” Obviously, that’s very unlikely to work. But someone eventually will solve this by saying, “Hey, if your data is publicly visible, we’re going to get it,” you know. It’s like, “By the way, it’s not actually your data, it’s data about me, so why should you be holding onto it?”

Yoko: Yeah. Actually, I feel like new advancements in models may just change the data moat. Case in point: today, web browsing using an agent doesn’t work super well. It’s very slow, it’s very clunky. You have to try multiple times for it to do any task. But imagine if we have a foundational model capable of giving an agent the ability to go to any website, log in as a human … we’ll table that one; I don’t know how agent identity works yet … or SSH into a server and execute certain commands, or spin up a virtual machine for mobile, or access a device farm to play Pokemon Go. Maybe then the data traditionally only available to humans under that account becomes available to agents.

Guido: There’s also the opposite that could happen, right? That basically all the consumer sites start deploying more and more complex anti-agent CAPTCHAs, trying to keep out the agents, because they only want the humans, who have attention, to come to those sites. I mean, I recently did use one of these deep research tools, one of the major LLMs, and one of the steps — if you look through all the steps it went through — was, you know, trying to see how it could get around a CAPTCHA mechanism for a site. There was an actual reasoning step, right? Where basically it knew what information it wanted and it was blocked from accessing it. You know, how dystopian is the future going to be?

Matt: Did it solve it?

Guido: It solved it, actually.

Matt: I mean, it’s so interesting. So here’s a really early machine learning example of this. I don’t know if you guys remember when Gmail first implemented ads. It was a big controversy because they basically said, “OK. We are not going to read your emails, but our algorithms are going to read your emails. And we’re going to suggest ads that you should watch, you know, or click on based on that.” We all sort of, I think, just forgot and got used to it. I still think we don’t love the idea, but we kind of lived with it. But some of the data providers reacted by removing data from email. So, Amazon famously now, when you order something, they send you a confirmation email that says, “Hey, you just ordered something. Click here to find out what you ordered, when it’s going to arrive, or any information you might want to know.” And so, that actually did happen in practice in that example, that the major data holders kind of found ways to withhold it. It’ll be interesting to see whether that’s possible now or not.

Yoko: But that same data is scraped on the client side by the ad networks I install.

Matt: Oh, for sure. Yeah, yeah, yeah. There’s always some other way.

Yoko: Yeah, some.

Matt: It’s not maybe exactly the same, but a pretty good proxy. Yeah, yeah.

Guido: It may be that it’s much harder to tell the difference between an LLM and a human than between a classic, you know, sort of API-call mechanism and a human. That may change the dynamics.

Finally, Guido, Matt, and Yoko answer an obvious question on the longest timeline into which we might have clear visibility: What needs to happen to make agents a truly game-changing innovation within the next, say, two years?

Guido: I think the positive vision is that in two years we’ve figured out how an agent working on my behalf can use most of the tools that I have access to. I think it’s also clear what the missing pieces are, right? We have not figured out security, authentication, and access control for agents working on my behalf yet. We have not figured out how data retention works. We have not figured out the relationship with consumer websites that potentially want to block those agents. But if you had all that, it could make many tasks much, much easier, right? Today, if I have data sitting in, say, my Google Drive, how easily I can reason about that data versus other data that’s in more fragmented sources makes an incredible difference. So, I think that’s the bull case, right? Where you have agents that can take all the data that you can access, access it on your behalf, and perform tasks on your behalf, and save you a ton of time, right? It could make you, depending on what you do, like, you know, multiple times as productive as you are today.

Yoko: My answer to that is actually different modalities in the foundational models. Today it’s still very much text-based, and that has worked really well for coding and text-based tasks. But for more visual-first tasks, there’s just no one-to-one mapping. Even for web browsing, it’s a very clunky experience of “take a screenshot every couple of seconds and send it back to the foundational model.” So, I will actually bet on multimodality. If we train the model with different traces of clicking buttons on a website, navigating the web, using different devices, drawing, producing vector art, I think there will be net new things that the model could unlock at the agent level.

Matt: You can probably guess my answer. If we don’t use the word agent two years from now, or five years from now, I think that’s a huge win. There’s actually a fun paper put out by some folks at [Princeton] called “AI as Normal Technology,” and they make the argument that there’s a false dichotomy out there: AI is either going to bring about utopia or dystopia, meaning everything’s going to be amazing because we have AI, or everything’s going to be terrible. This is kind of the national discourse. But if you just think of it as normal technology, like water or electricity or the internet, I think that’s the world we’re headed towards. An agent is a kind of way to help us get there. And so, that’s my goal. I mean, this stuff is just incredibly powerful. We understand how to use it, we understand the use cases, and we’re kind of, you know, putting it to work for us.

More About This Podcast

Artificial intelligence is changing everything from art to enterprise IT, and a16z is watching all of it with a close eye. This podcast features discussions with leading AI engineers, founders, and experts, as well as our general partners, about where the technology and industry are heading.
