Back to Nicholas
Source

AI Agents Are Here: What They Can Already Do—and What’s Next (Stanislas Polu & Harrison Chase)

Nicholas
@nicholas

What’s next for AI agents, and how will they change the way we work? In this conversation, Stanislas Polu (CEO of Dust, formerly research at OpenAI) and Harrison Chase (CEO of LangChain, one of the most influential open-source AI frameworks) unpack the current state and future of AI agents.

Appears in

Uploaded
Uploaded May 27, 2026
File type
POD
Queried
1

Full transcript

Showing the full transcript for this episode.

Speaker A: The big question we see in the market today, which is interesting, is agents versus AI workflows. We do foresee a world where those agents will be actual coworkers, and I don't think you can really encode a coworker with a workflow. We're really bullish on trying to help people create agents, not workflows. Speaker B: What is the right way to think about what an AI agent truly is and what it isn't? Speaker C: You can often do the same things with workflows and agents. It's just the ease of how you describe it.

In an agent, it would all be in natural language, right? Like you could have a recipe that you just put in natural language, say, hey, do A, then do B, then do C. And it's not as deterministic. Optimistic, so it's not as safe, but it's way easier. Speaker B: I'd love to zoom out for a moment and talk also just about what it means and what it's like to be building in AI at the moment and some of the specific dynamics that founders have to face. Speaker A: The fog of AI is the fact that the foundations are moving very quickly.

And so you have to have a vision of where you're going, but you cannot paint it because if you paint beyond 6 months, whatever you were painting will probably be not true. Speaker D: Hey, I'm Mario, and this is The Generalist Podcast. As the saying goes, the future is already here, it's just not evenly distributed. Each week, we sit down with the founders, investors, and visionaries living in these pockets of the future to help you see what's coming next. Today, I'm speaking with Stan Pulu and Harrison Chase about the future of AI agents.

Stan is one of the founders of Dust, a Sequoia-backed platform that makes it easy for enterprises to build and deploy custom agents across their workforce. Before that, Stan was at OpenAI, researching the mathematical reasoning capabilities of large language models. It was during his time at OpenAI that he first met Harrison Chase, who went on to found LangChain, a popular developer framework that has become essential infrastructure for building agentic applications. LangChain has also become one of Silicon Valley's hottest companies, raising from Benchmark, Sequoia, and IVP. Both Stan and Harrison spend their days thinking deeply about AI in general and agents in particular.

In our conversation, we explore what comes after ChatGPT, why there probably won't be one agent to rule them all, and the unique challenges of building a business in the fog of AI. I walked away from our conversation with practical insights about where agents are headed and how they might change the way we work, think, and build over the coming decades. This is a new podcast, so if you enjoyed today's episode, I hope you'll consider subscribing and joining us for the incredible conversations we have coming up. Now, here's my conversation with Stan and Harrison.

Speaker B: Thank you both so much for, for being with me today. I'm excited to, to chat about AI in general and, and sort of agents in particular with two people who are really spending their lives dedicated to this, you know, field and, and, and this part of the world. Uh, maybe to start, we could just begin with a little bit about both of your, your companies and, and what they do. So Stan, Why don't you tell us about Dust? Speaker A: Yeah, well, Dust is the place where you build agents for, for Workair.

So it's a product that lets you create, manage, and operate agents with access to your company context and tools. Speaker B: Amazing. And we'll get into the details of what that means and why it's become so powerful more, I'm sure, over the course of this next hour or so. Harrison, yeah, I'd love to have folks know about LangChain. Speaker C: Absolutely. At LangChain, we build developer tools to try to make it as easy as possible to build agentic applications. That's the one-liner, but I'm sure we'll get deeper into it over the course of the conversation.

Speaker B: Amazing. Well, I think something that you both have in common is that you've been in AI, at least by the industry standards, for a long time, you know, predating sort of the ChatGPT moment in November of 2022. I started in VC in 2016. That was my first venture job, and I remember there was a big sort sort of excitement around those first phase of chatbots, but, you know, honestly, lots of the excitement sort of withered for the years that followed. And I think there was a question mark about when exactly AI in the way that we see it today was really gonna come to the fore.

And so I'm curious for you both, you know, what was it that you were seeing in the industry before, you know, it became so obvious to everyone else when ChatGPT came out that excited you to sort of like build your career around this? And maybe let's start with you, Stan, 'cause I know you spent some time at OpenAI, so you probably had, you know, a front row seat to a lot of these questions. Speaker A: So actually, the fun fact is that we chatted together with Harrison. I think it was maybe September 2022 or maybe October 2022.

It was funny because it was pre-ChatGPT and we naturally met together and spent some time chatting together because there were so few people like interesting themselves in the use of LLMs for development or use of LLMs for doing anything else in product. And so that's for the fun fact to give everybody the kind of a context of how small the community was during those few months that were the end of December [redacted address] to ChatGPT release, which was in October or yeah, I guess October. So it's a fun fact. As far as I'm concerned, I had the chance to work at OpenAI.

So I had been working on studying the mathematical reasoning capabilities of LLMs, but obviously was also exposed with GPT-4, which had ended training early summer 2022. I mean, it was pretty obvious to me that there was a massive disconnect between the capability of the technology and the actual impact it was having on the world. The revenue of OpenAI, probably not public, and I don't even remember the right numbers, but was like a drop compared to what it is today. It was just a few tens of millions of dollars. And so compared to the power of GPT-4, there had been the luck of playing with internally.

I think it was obvious to me that there would be, there was something missing at the product layer to really unlock the use of LLMs everywhere. And so that's why it kind of motivated me to start building in the product space rather than doing research. Speaker B: And just to go back to that moment in the summer of 2022, Harrison, do you remember like what you guys were sort of talking about at that point? Like what was the tenor of those conversations or sort of the contours of what you were thinking might be possible or might be interesting?

Speaker C: So I think, and Stan, you should correct me on this, but I'm pretty sure I just cold emailed you or cold DM'd you on Twitter or something. So I can share a bit of my background as well, because I think it leads up to this. But my background, so I studied stats and computer science, then worked at a fintech company doing more like time series stuff, but a little bit with some of the early BERT models on entity linking. And then I went to an MLOps startup where I was doing kind of like tooling for ML.

and, and then I remember in, in like August, September, I was going to a bunch of meetups in SF. Um, a lot of them were on generative UI, but they were more on like Stable Diffusion and image things like that. But there were a few people doing cool things with language models, and I just remember being like, holy crap, these language models are fantastic. They're just so different than like the traditional kind of like ML models that I'd, that I'd worked with before. And so then I think I was just paying attention online, and Stan I think was tweeting a bunch about stuff that he was working on around early versions of Dust.

And I think I just like either DM'd or emailed, um, and, uh, he was very gracious to kind of like respond to that and hop on a call. And I think, uh, I think we had one or two calls and just kind of jammed on that and then kept on working. Yeah. That was the, I think that was the start. Speaker A: Yeah, exactly. Speaker A: Yeah, exactly. Speaker B: That's awesome. Um, yeah. So you talked about this sort of discrepancy between the power of the technology and the impact it was having in the world.

And it sounds like that was sort of the, maybe the spark that led you to Dust, but like more specifically, what was it that you were saying, here's the gap or here's the real thing that needs to be solved and like, here's how we go about doing it? Speaker A: To be very candid at the beginning, it was not necessarily perfectly identified. It was really, this thing is great. Nobody uses it. And you're like, what is happening? Where is the heat being dissipated? It must be dissipated somewhere. And it's instantly aware.

And I think it was at the product layer and ChatGPT kind of was a confirmation of that hypothesis. ChatGPT in a sense is mostly a product. There's a bit of a model work compared to what was available off the shelf at the time, but not that much. It's just a very nice UI and the fact that it was free. Super interesting to think back about, and I don't want to sidetrack you too much, but Character AI was incredibly well positioned at the time. Character AI had the product, had the research team, had the model.

Why is it ChatGPT that takes it all? Why character AI didn't explode at the time? That's a very interesting and fascinating question that we'll be able to study when we start doing history of that period, I guess. Speaker B: Oh, interesting. I appreciate the desire not to get too sidetracked, but I do think that's interesting. Like, what's your sort of working hypothesis of why character AI maybe wasn't the one to capitalize in that moment? Speaker B: Oh, interesting. I appreciate the desire not to get too sidetracked, but I do think that's interesting.

Like, what's your sort of working hypothesis of why character AI maybe wasn't the one to capitalize in that moment? Speaker A: So Character AI, I think, was speaking to a specific population. It was kind of weird in many ways. You would start by creating a fake character, or you would create Elon Musk clones or clones of Albert Einstein or whatnot. And I think it created a complexity level, even for the pure bit2stereotypes, that was maybe a bit too high. It had a very strong community, which is the interesting bit. It was going very well.

It's just that it didn't just catch up the same way that ChatGPT did. Did you need the OpenAI brand gravitas for ChatGPT to emerge? Maybe, because OpenAI was not completely unknown. I mean, most people wouldn't know OpenAI, but in the tech sphere and the journalistic sphere, I guess people did know it. It's only a hypothesis. I'm not sure, but Character was still a slightly kind of a bit complex, a bit gimmicky, a bit geeky products, and maybe they missed on that opportunity because of that. I know. What do you think, Harrison?

Speaker C: I think there's some level of kind of like simplicity that just ChatGPT kind of like brought. It was just one chat box. You didn't have to select who you were kind of talking to. Even now, like, I mean, this is again a bit of a segue, but I'm actually curious how you guys handle this at Dust, but in a lot of consumer products, you choose kind of like what model you want or things like that. And I think there is a simplicity in just like you know, just one chatbox type in and, you know, now OpenAI has this, but at the time when they launched, I think it was probably just one model that they gave you kind of like access to.

So you don't have to think about that. And then I had some friends at Character as well, and I remember when ChatGPT came out, they were very worried by that. Um, and, and obviously ChatGPT exploded more than Character, but it did kind of, it was a little bit of like rising tide lifts all boats. Like there was just more people interested in kind of like chatbots and seeing what was going on. So I feel like to some extent it's a consumer product. And so who can really kind of like, you know, know exactly why the cultural zeitgeist catches on to one thing.

But once it does, it's just, you know, explosion. Speaker B: And so for, for you, Harrison, what were sort of the precipitating events to going from the MLOps company you were at to really deciding, hey, there's something to be built here. And the thing that I want to work on is, is LangChain. Speaker C: Part of it was my background's just in kind of like the building developer tools. Part of it is my background's in ML, and I got really excited when I saw these models and was really like, oh, like these are, these are kind of like amazing.

So I was still at my, my previous company at the time, and I, I knew I was gonna leave. I didn't know what I was gonna do, but I basically wanted to explore this area to kind of like figure out what, what I would do next. And my plan was to basically leave, take 1 to 3 months to just like figure out what to do next and then start working on it. But I was, I was sticking around. The CEO asked me to stick around for a few months to help with the transition.

And so in that time period, was going to meetups. That's when I reached out to Stan and wanted to just build some things kind of like to get my hands kind of like dirty with the tech. And I remember chatting with Anton, who's one of the co-founders of Chroma, another kind of like developer tool in the space. He actually remembers the story better than I do, but apparently I had kind of like 4 ideas, which I was talking to him about. And one of them was LangChain. And then one of them actually would end up being LangSmith, which is our commercial product that we build.

The difference is LangChain, you can build kind of like nights and weekends. It's just an open source kind of like Python package. And so I built that. Langsmith is more of an actual product. And so, you know, that came later, but basically just wanted to build stuff to kind of like get my hands dirty, released it, tweeted about it, kept on adding to stuff. A month later, ChatGPT comes out, keep on doing the same. And by the time I end up leaving my company in early January, it's pretty clear that there's LLMs are gonna transform how applications are built, and we think there's a lot of tooling that needs to be built around them to make it easy to kind of like plug them into applications.

And so kind of like those two things, like LLMs are great, but then also we saw this with ML stuff and MLOps, like there is tooling that needs to be built, and a lot of that manifested in Langsmith and other tools that we're building. Speaker B: It sounds like you had a lot of clarity around sort of like the early vision, both with LangChain and Langsmith, Are there parts of that vision that have surprised you with how true they sort of have turned out to be, or parts that maybe you were, you know, making a bet on that you're like, oh, actually, you know, the way the industry has developed is actually quite a bit different than I initially foresaw?

Speaker C: I think like high level, like, you know, maybe we had some clarity that we were right on, like high level, like we thought LLMs would be great and transform how applications are built, and high level we thought there'd be tooling that would be needed to build around them. But like, that's very high level. I think a lot of the lower level stuff, like we were figuring out as we went along. Maybe one thing to that effect, like Langsmith, which is our tooling for kind of like debugging and testing these types of applications, we see it being used by a whole set of kind of like builders.

So not just engineers and not just AI engineers, but product folks and subject matter experts and everyone coming together. And when we were building it, especially because of my background in MLOps, where in MLOps, the people using these tools were ML engineers, a small segment of of the population, you know, it, we didn't, I don't think we fully appreciated or anticipated how many different types of folks would be involved in the creation of these types of applications and how therefore our tool would need to respond to these different folks. So I think like, you know, high level, I'd say we were, you know, I think we had some good insights, but like low level, all the details, I mean, that was like no way we could have seen all of that.

That was just kind of like, you know, peeling back layers of the onion as we went on. Speaker B: Stan, it sounded like you sort of maybe started with an acknowledgement that this was, you know, a very exciting space, but the exact sort of dimensions of Dust have been more emergent. Like, what does that journey look like for you? Like, what were the early bets you were making that, yeah, have been proven true and maybe have been a little different? Speaker A: Yeah, I think the early days of Dust is mostly me, as Arsen did, mostly me messing around with a product.

And I think we really started building what Dust is today much later, actually. It's more like early 2023 when my co-founder Gabriel joins me. And there at this moment, we decide to really focus on applying LLMs to internal productivity. And so not building— Dust was at the beginning something addressing the same kind of space as LangChain because the developers was the only persona you could talk to basically in late 2022. But as ChatGPT comes out, it kind of educates the entire market. And now it opens up the opportunity to talk about LLMs and agents, or not agents yet, but LLMs and how it can impact work.

Much more broadly. And so as Gabriel, my co-founder, joins, we really decide to focus on enterprise internal productivity and applying that technology to how people work. This is a very long journey that we are only starting. I think the great thing is that there is so much to be done and the form factor of what it's going to mean to work with agents is still pretty much unknown. And so it's been as well a very long iteration on many stuff. We started with the idea of, with two main hypotheses at that time.

So it's kind of the rebirth of Dust as we incorporated and started working together. And I think we start with two main product hypotheses. It was first, we need to have the company context because if we don't have the context, we won't be very useful for doing any actual work. There is a lot of work that can be done with that context, translating, creating ideas, doing research on the internet, et cetera. But for doing actual work at scale within the company, we want to have access to the company information. And the second thing was pretty simple is that, I mean, the company context is kind of big.

So if you give it to an agent or to an LLM, it's going to be lost in it because the retrieval is not perfect. The models are not perfect. And so the second hypothesis was really the ability to create custom agents so that you could pinpoint what context you really need for a specific task, which mechanically gives you much better results. And those two hypotheses were somewhat early in the market, meaning that when we were pitching Dust in 2023, everybody was kind of looking at us with weird eyes. And we've seen the market open up to those ideas, which has been super exciting.

I think the latest step function in there is probably the Toby Lutke memo, which really made us to some extent go from the early minority to the early majority. I don't know, I don't remember the exact terms, but we went one step in the adoption curve. So we kind of nicely positioned our place where the market is going now, but I think the one thing is that it's a constant reinvention of what we are trying to achieve because it's for sure not the end and we can dive into that. We have many, many opportunities to see that through different prisms.

Speaker B: I'm sure we will dive into that. And just for listeners, you know, I'm sure most folks saw it, but the Tobi memo was, you know, Tobi from Shopify basically saying, you know, if you're not using AI for your job at Shopify, you're in big trouble. And so you need to really be adopting it in a very aggressive way. Speaker A: Yeah, yeah. I think it was slightly more gentle. Maybe I'm wrong, but I think it was, Don't come at me for a headcount if you haven't tried using AI first.

Speaker B: Yes. When I say in big trouble, I mean less, I'm going to come get you, but more like you're going to be left behind. And anyway, it was, you know, I think it was funny. We saw Toby write that memo and then I feel like there was this mass memesis of like every CEO felt like they wanted to get their letter out into the public to say like, hey, we also really care about this and we're hardcore. It was very funny. But Toby, Toby was early, I think. And yeah, I imagine that was a very validating moment.

You know, we're sort of talking about this, the framing of LangChain and Dusk, but just to bring as many people along in this conversation as possible, like very crisply, what is the right way to think about what an AI agent like truly is and what it isn't? Just so that folks, you know, have maybe a crisp wrapper for that thinking, Harrison. Speaker C: An agent is an application where an LLM decides the control flow of the application. And so, I think that that's a bit vague. If you want to get more technical with it, I think for all intents and purposes, the way that a lot of developers think about an agent is running kind of like a while or for loop, calling an LLM to decide what to do, and then taking those actions if they're actions, and until it basically decides that it's finished.

And I would say that, you know, like there, the LLM is very clearly deciding the control flow of the application, like every single step. Loop is decided. I think you also have applications that are, you know, not as kind of like maybe agentic in some sense where the LLM decides maybe a few steps, but there's some steps that are hardcoded. Like maybe after A, you always do B, or even when you talk about multi-agent applications, maybe you run one of these loops and then after that it immediately goes to this other application, which runs a check or another agent even, and then goes back.

I feel like there's a spectrum, and Andrew Ng has a good way of talking about it that I like, which is rather than talking about whether something is an agent, let's talk about how agentic it is. And so that allows for this kind of spectrum of agenticness. And I would say, yeah, the more an LLM's deciding what to do, the more agentic it is. And then probably at some point, there's some threshold where it crosses from a non-agent to an agent. And to be honest, I don't 100% know exactly where that is.

But I like the idea of agenticness. But again, for all intents and purposes, if you're talking to a developer, I would largely say they think about it as running an LLM in a for-while loop and having it call tools until it decides that that's kind of done. Speaker C: An agent is an application where an LLM decides the control flow of the application. And so, I think that that's a bit vague. If you want to get more technical with it, I think for all intents and purposes, the way that a lot of developers think about an agent is running kind of like a while or for loop, calling an LLM to decide what to do, and then taking those actions if they're actions, and until it basically decides that it's finished.

And I would say that, you know, like there, the LLM is very clearly deciding the control flow of the application, like every single step. Loop is decided. I think you also have applications that are, you know, not as kind of like maybe agentic in some sense where the LLM decides maybe a few steps, but there's some steps that are hardcoded. Like maybe after A, you always do B, or even when you talk about multi-agent applications, maybe you run one of these loops and then after that it immediately goes to this other application, which runs a check or another agent even, and then goes back.

I feel like there's a spectrum, and Andrew Ng has a good way of talking about it that I like, which is rather than talking about whether something is an agent, let's talk about how agentic it is. And so that allows for this kind of spectrum of agenticness. And I would say, yeah, the more an LLM's deciding what to do, the more agentic it is. And then probably at some point, there's some threshold where it crosses from a non-agent to an agent. And to be honest, I don't 100% know exactly where that is.

But I like the idea of agenticness. But again, for all intents and purposes, if you're talking to a developer, I would largely say they think about it as running an LLM in a for-while loop and having it call tools until it decides that that's kind of done. Speaker B: Stan, is there anything you'd want to add to that? Speaker A: No, yeah, I think the definition is perfectly on point. I do like the agenticness framing. That makes a ton of sense. The big question we see in the market today, which is interesting, is agents versus AI workflows.

It's a different way of looking at the same question. There's a lot of traction on companies that are doing AI workflows. We all know, have heard about N8N and those kind of companies. And I think there's a lot of value in AI workflows. And I used to use a definition of agents that I think apply to AI workflows, which is any program where one conditional is driven by an agent, but to some extent, that agent-icness framing is very interesting. Workflows versus agents is a massively interesting question, which is not answering your question and that we can revisit.

But there's a lot of value in having workflows because it gives you much more control on what's happening. But to give you a sense, I think that long term, they don't really make a ton of sense at the end of the day because We do foresee a world where those agents will be actual coworkers, and I don't think you can really encode a coworker with a workflow in the same way you cannot encode Claude code with a workflow. We're really bullish on trying to help people create agents, not workflows. Very interestingly, which is the weird part, the agents is richer, but it's more risky in a sense, but it's also easier to build, which has a ton of value in the kind of a work setting.

That means that anybody can build an agent, not anybody can build a workflow. Workflow is like typical Make, N8N, Zapier and stuff. It's pretty easy. Most people will be able to interact with those products, but not everyone. And it requires a little bit of a kind of a learning and learning curve. Compared to that, building an agent is actually just describing in plain English what you want to do and giving, clicking the capabilities of the agents should have. And it's it's actually accessible by a much broader audience, which makes it also very exciting.

And so that's weird that the long-term most powerful thing is actually at the same time the easiest to build to some extent. I'm obviously exaggerating because to build a great agent, you need evals and stuff like that. But to build a proto version of an agent, it's actually extremely easy. Speaker B: Is it a fair synthesis? And you guys can push back on this, but if we're talking about sort of the workflow versus agent idea, a workflow might be more like write writing out a recipe step by step that you're sort of, you know, saying, here's what I want you to do.

And an agent is more like creating a chef and saying like, please go cook me something. Is that, you know, roughly a way that someone might think about it? Speaker A: Yeah. I mean, it's a difference between McDonald's and a Michelin star restaurant, right? You know what you're going to get at McDonald's. It's very well streamlined. And when you're going to Michelin star restaurants, you don't know what you're going to get because you'll be at the— the chef will be improvising with the situation and the the food that has been available.

It's, and it's again, a funny way to frame that. Speaker B: Harrison, I saw your eyes sort of light up with a question there. Speaker C: I mean, yeah, maybe pushing back on this a little bit. Like I feel like to your point, Stan, I feel like you can often do the same things with workflows and agents. It's just the ease of how you describe it. Like in an agent, it would all be in natural language, right? Like you could have a recipe that you just put in natural language and say, hey, do A, then do B, then do C.

And you know, it's not as deterministic. So it's not kind of like as safe, but it's way easier. And so I'm thinking out loud of this analogy for the first time. So I don't have super strong opinions on it, but I do think, I don't know if it's so much different things as like different ways of accomplishing the same thing. And so maybe it's, I don't know, the difference of a person in McDonald's who's doing it all by hand versus using some of the pre-portioned things. I don't know exactly where I'm going with this, but I do think I just, I really agree with what Stan said where like there is a beautiful kind of like simplicity in how easy it is to build agents.

It's usually natural language and then you choose some tools. And I remember one of my favorite releases we ever did in LangChain, it must have been like the 13th release or something like that, where we took this idea of like the React agent, which is this great paper by Shen Yu, and like was a little bit focused and the examples they had in the paper were very focused on some like Hotpot QA, some Wikipedia question answering, like a narrow task. But I remember we took this and we made an abstraction around it where you just did exactly what Stan said.

You gave it some tools, you gave it a system prompt, and it would just do things. And I was like, holy crap, this is amazing. And I think that is like, there is this beautiful simplicity in agents. And so, yeah. Speaker A: And very interesting data point here is that the React paper, which I invite everybody to read, and everybody We will read that paper today, we'll look at it and say, what the fuck is that? It's so completely obvious. Is it even a paper, seriously? And it's funny how at the time it was kind of a mind-opening and people, we were so early in that technology that it was kind of a really interestingly mind-opening paper despite retrospectively it feeling extremely trivial and obvious with everything that's been built since then.

And so that's really funny, very funny exercise to all listeners to open the ReACT paper and skim through it quickly. That will show you, and it was a great paper, kind of a really engaging paper in late 2022. And that's how early we were. That's how no clue we had about what we can do with LLMs. Speaker C: And maybe to that point, like when we launched it and even right now in LinkChain, I think we still call it like the ReACT agent, but we're going to stop doing that because it's just confusing to people.

They're just like, what do you like? What is ReACT? Like, this is just so obvious. It's just run. It's like, you know, just taken for granted now. Speaker B: That's amazing. You know, to maybe get a little bit more tactical with it, like what are the different use cases you guys see for agents at the moment that are sort of most productive? And, you know, Dust is obviously creating a lot of those sort of for companies, you know, deploying them. But yeah, curious, like even within that, where you're seeing the most leverage for businesses.

Speaker A: The list is long. We're taking the, we have a pretty horizontal I mean, we have a completely horizontal product because we believe that it can be applied in so many places and that there is value of having one platform for everybody to share creating and sharing agents and having agents interacting with others in a business setup. I think that I can only give you examples. It goes from extremely simple Slack thread to issue creation. And that's not completely trivial because at Dust, we have something like 3 or 4 different types of issues, which goes in different types of projects, given the different shape or type of discussion that needs to happen.

And having an agent that helps you go from a thread, a thread Slack where there's a discussion where maybe there's a bug that is emerged, there's a place obvious where to put it when that's a bug. But when it's a decision that needs to be taken, it needs to be moved into a different place and taking a few actions around that. And so streamlining that process of having something happening organically on Slack, as an example, to following a workflow that makes it represented in the canonical way that a company represents that artifact that is being discussed on Slack is a very general use case that I covered for issues on GitHub, but that it can be done for many other stuff.

Whenever there's a sales transcript, there's an army of agents that gets kicked in for providing feedback, for autofilling in Salesforce, for extracting product interest and going to create comments on Notion pages related to the product that has been discussed during that transcript. And so there is so much stuff that would require human work that is now being doable by agents. That's really interesting. The most interesting places is for the things that no human would ever do because it's just too much work. So for any sales transcript, being able to put a comment in the right product document on Notion, about the fact that it's been discussed, the link to the transcripts and the kind of a one-sentence summary of what has been said is something that never happened before because nobody was there to make it happen.

Nobody had time to review all the transcripts. And so those new use cases are almost the most exciting as well to me. Speaker A: The list is long. We're taking the, we have a pretty horizontal I mean, we have a completely horizontal product because we believe that it can be applied in so many places and that there is value of having one platform for everybody to share creating and sharing agents and having agents interacting with others in a business setup. I think that I can only give you examples. It goes from extremely simple Slack thread to issue creation.

And that's not completely trivial because at Dust, we have something like 3 or 4 different types of issues, which goes in different types of projects, given the different shape or type of discussion that needs to happen. And having an agent that helps you go from a thread, a thread Slack where there's a discussion where maybe there's a bug that is emerged, there's a place obvious where to put it when that's a bug. But when it's a decision that needs to be taken, it needs to be moved into a different place and taking a few actions around that.

And so streamlining that process of having something happening organically on Slack, as an example, to following a workflow that makes it represented in the canonical way that a company represents that artifact that is being discussed on Slack is a very general use case that I covered for issues on GitHub, but that it can be done for many other stuff. Whenever there's a sales transcript, there's an army of agents that gets kicked in for providing feedback, for autofilling in Salesforce, for extracting product interest and going to create comments on Notion pages related to the product that has been discussed during that transcript.

And so there is so much stuff that would require human work that is now being doable by agents. That's really interesting. The most interesting places is for the things that no human would ever do because it's just too much work. So for any sales transcript, being able to put a comment in the right product document on Notion, about the fact that it's been discussed, the link to the transcripts and the kind of a one-sentence summary of what has been said is something that never happened before because nobody was there to make it happen.

Nobody had time to review all the transcripts. And so those new use cases are almost the most exciting as well to me. Speaker B: Harrison, I'm curious, you know, maybe how you guys use agents at LangChain internally and maybe some of the use cases you've seen that you found particularly powerful. Speaker C: Yeah. So internally we use agents in a few ways. So we, so, and they kind of map with like the, the big use cases that we see out there as well. So like we see customer support being a big use case.

Uh, we've built slash are building an internal kind of like customer support agent to help us with a lot of those inquiries and responses. Um, coding is a massive use case. We both use and build some kind of like internal, uh, coding or coding adjacent agents for kind of like responding to issues, things like that, managing discussions. I personally use an agent to kind of like monitor my email and draft responses and flag things. And so that's probably the one that I use the most. I mean, we use off-the-shelf and internal kind of like versions of deep research agents.

Like that's been a huge kind of like style of things that we've seen pop up. Oh, we use some for marketing as well. Like marketing's a fantastic use case. And so we use some for translating some of the blogs or things we do into tweets or LinkedIn posts. And so I think, I think those are a lot of the big use cases that, that we use internally. And I think they generally match up with what we see people building in, in the industry. Speaker B: You've written a, I think it was a blog post where you've talked about, you know, how we interact with agents and how that might change from sort of typing, you know, prompting these agents with voice or text and shifting towards more of what you call like ambient agents that, you know, require less of that.

Why do you see things going in that direction? And maybe you can share a little bit more about, you know, what an ambient agent really looks like. Speaker B: You've written a, I think it was a blog post where you've talked about, you know, how we interact with agents and how that might change from sort of typing, you know, prompting these agents with voice or text and shifting towards more of what you call like ambient agents that, you know, require less of that. Why do you see things going in that direction?

And maybe you can share a little bit more about, you know, what an ambient agent really looks like. Speaker C: Yeah, absolutely. And I'm super curious to hear what Sam has to say on this as well, because he— I think you mentioned something about, you know, thinking about how we interact with agents as part of the core mission of Dusk. And that's one of the things that I'm a little bit jealous that we don't get to do a lot of at LangChain because we are more developer-facing. So we do think about it and it does absolutely inform like, you know, what tools we build, but not nearly as much as folks building products in the space do.

But I mean, so far the dominant UX for agents has kind of like been chat. And I think, and I think, and I think if you think about it from first principles, that actually does make a lot of sense. Like it puts the human in control. It's very human in the loop. You can, you can, not only does the human initiate it, but the human can see what's going on because you can stream back results. If the agent wants to do an action, you can have the human kind of like immediately approve it there.

So you can kind of like have some sort of like approval for, for dangerous actions. And, and it's relatively fast for the most part. But I think like, you know, people have been saying that, that chat won't be the only UX or, or, you know, the, the forever UX. And, and while it has actually lasted a lot longer than maybe people saying that would've initially thought, I, I do, I, I do think it's interesting to think of what, what besides chat is out there and, and also like what some of the downsides of chat are.

Um, and I think some of the downsides are also some of the things that, that that make it good in some cases, namely like you have to kick off all conversations. So if you wanna run it over kind of like 1,000 or 10,000 things, like that's, that's a little bit tedious to kind of do. And, and also because you generally expect to be in the moment, they can't really take that long. Um, otherwise you get a little bit bored and maybe switch. And I actually wanna come back to that point cuz I think with like Deep Research and some of the coding agents, we're starting to see some that like that's starting to happen, but it's still, anyways, I'll, I'll come back to that.

And then the other thing is like, like, yeah, I mean, just based on inside an enterprise or inside a company, you have all these events happening and rather than like copy paste an email and then take that email and go and, you know, put it in chat, wouldn't it be nice if that would just kick something off automatically? And so I think the email assistant that I use is actually a great example of this. It just monitors my email inbox. It just gets triggered by these events and then it goes and does something.

And if it wants to take an action that I deem kind of like dangerous enough where I want to approve it, which right now is scheduling a calendar invite or responding to the email, then it presents it to me in some way. And I think they're like, what is that UX? What does that look like? Is that just a draft in my email inbox? Is that— I, I, we have a concept of like an agent inbox, which is a dedicated view for this. Back to the original question, like ambient agents, we define as agents that, that listen to a stream of events and then act on one or multiple at the same time.

And I think crucially, these are not necessarily autonomous agents. They're still kind of like have some human in the loop at some component because I think that's still necessary for enterprise adoption. I have some other thoughts on kind of like the deep research and coding style agents, but I'd actually be curious to hear Stan's thoughts first because you work on actually delivering these to a lot of end users. Speaker A: Totally. I think basically the way we see it indeed is that the conversation interface has been the interface. We hypothesize, and maybe that's never been the case because indeed it survived for a long time, that there's going to be a fork probably in the typical UX UX/UI that means working with agents.

As deep research agents take longer and longer, as agents are being triggered with human out of the loop, you probably want something that looks more like a command center than a list of conversations. The conversation paradigm will probably make sense in the B2C setup for a much longer time because in the B2C setup, the truth is that your agent is your executive assistant. And so you have one stream of conversation with them or a couple different streams. But when you think about the enterprise, there's going to be agents that take a lot of time, agents, conversation with agents that involve multiple people.

That's something that the market hasn't even started exploring much, right? I mean, we know we don't explore it much yet because I think we consider them as toys. And, but the more powerful the use case will be, the more meaningful it will be for people to actually interact, interact, multiple people interacting with the same, with an agent or with multiple agents. Completely aligned with your vision, Harrison, of the ambient agents. I think the first step, as you described it, is really to have more of an inbox paradigm when you interact with those agents, having agents that are being triggered by external events.

And I think the crazy idea is agents that are not necessarily mechanically triggered in the sense that when this do that, that, or when these actually execute, but having agents that are just skimming through what's happening inside of the company and maybe ping you with offers to give you, to provide you some value, which is obviously what the end state should be. When you think about all multi-agent systems, I mean, today we obviously don't see that in the enterprise, but you could imagine giving a very high-level project to a set of agents and just let them walk and organize for delivering the project on their own and give you a report multiple days after.

I think all of that is completely uncharted territory. And so, but yet it's obviously the end state. And so that's why it's so important and so exciting to be working in that space. Speaker C: You mentioned a command center, Stan, like, is that in the product now? Or is that a future kind of like direction? Because I love that idea. But I would love to see like, yeah, I would love to see what that— I don't know what that looks like. I would I'd love to see what that looks like.

Speaker A: I don't know what that looks like either. I think the first step will obviously be like what you've been building internally. And I think you've shared on some of your blog posts, but then what you just mentioned, a form of inbox is already a first step in that direction, obviously. The weird thing about the agents is that they also, I mean, the APIs are so biased toward a conversation that often you're like, do something for me, and you have that agentic loop that can last for a very long time.

But at the end of the day, you still have an agent message, which is kind of weird because that means that you somewhat have a cap on the interactions through the agentic loop. You can make them very long, but you're also going to be exhausting your context at some point, but that's not necessarily completely an issue. But at the end, the whole system is kind of constrained to give you an answer. Yet, When you think about agents working, you just want to say, go do the work for a day and ask me questions if you have any, but don't give me an answer in 30 minutes.

I just want something delivered in one day and ask me questions if you have anything. But yet there's no good system for interacting with the current shape of those agents for doing that. You could think about multi-agent stuff, et cetera, but the ecosystem is not there yet, I guess. And so we still, even at the API and post-training level, a little bit bound to be staying close to the conversational interface, but I'm sure that we'll see kind of stuff emerge around that. Speaker C: When we were building at the start, like, messages weren't a thing.

It was just text in, text out. And then like OpenAI, uh, I think it was, it might've been 3.5 or maybe 4 that they released and it was only the Chat Message API. And I remember talking with someone from OpenAI and it's like, so are you gonna release like the non-chat message thing? And they're like, ah, We don't know. And they ended up not. And now everything is just like that. And that kind of happens too. What's also really annoying about this is there isn't really like an official schema for what Messages is like.

OpenAI has their kind of like input-output schema, but that's different from Anthropic's, which is different from Google's. And like, you'd think if this was like— this is like, you know, the base thing, which has how we interact with these. I, you know, I wish it was a little bit more standardized what that schema is, although it's constantly evolving as well. So tough to, always tough to do that. And then the third point is maybe like, so all the chat agents to date have been kind of just like synchronous agents and that like you just chat in the moment.

Now with deep research, you have things that start sync, you start with a chat, but then you go to this deep research and then maybe you actually come back to synchronous at the end. And I think for some of these ambient agents, you could almost view them as async running in the background, but then at some point, like you said, Sam, they ping you with something and then they become synchronous. And so So I think chat is a pretty good form of synchronous communication. And then what async means is maybe that's hidden a little bit through some context or prompt engineering, and it's just like by the time it surfaces to the user, it's just all a message because that's the dominant form for synchronous communication, at least.

Speaker D: This episode is brought to you by Brex. Fred Adler, the influential venture capitalist of the 1970s, was known known for displaying decorative pillows in his office that featured a signature business philosophy: "Corporate happiness is positive cash flow." In today's post-SERP environment, Adler's wisdom feels particularly relevant as founders need to make every dollar work harder. That's exactly what Brex delivers. Their modern finance platform was built specifically for startups like yours and designed to help extend your runway when capital efficiency efficiency matters most. With Brex, you get global corporate cards with up to 20x higher credit limits and no personal guarantee required.

Their banking solution has no minimums and no transaction fees while letting you earn high yield from day one with same-day liquidity. Best of all, Brex knows you were born to build, not juggle spreadsheets and finance tools. Their AI-powered platform brings cards, payments, banking, expense management, and travel all in one place. It's simple, scalable, and designed to get you back to what you do best: building. More than 30,000 companies, including 1 in 3 US venture-backed startups, trust Brex to help make every dollar count toward their mission. Join them at com/mario. Speaker D: This episode is brought to you by Brex.

Fred Adler, the influential venture capitalist of the 1970s, was known known for displaying decorative pillows in his office that featured a signature business philosophy: "Corporate happiness is positive cash flow." In today's post-SERP environment, Adler's wisdom feels particularly relevant as founders need to make every dollar work harder. That's exactly what Brex delivers. Their modern finance platform was built specifically for startups like yours and designed to help extend your runway when capital efficiency efficiency matters most. With Brex, you get global corporate cards with up to 20x higher credit limits and no personal guarantee required.

Their banking solution has no minimums and no transaction fees while letting you earn high yield from day one with same-day liquidity. Best of all, Brex knows you were born to build, not juggle spreadsheets and finance tools. Their AI-powered platform brings cards, payments, banking, expense management, and travel all in one place. It's simple, scalable, and designed to get you back to what you do best: building. More than 30,000 companies, including 1 in 3 US venture-backed startups, trust Brex to help make every dollar count toward their mission. Join them at com/mario. Speaker B: Also, it sounds like there's just this layer of proactivity that, you know, you're suggesting might be different, that you're sort of saying, here are the the goals that we have as a business.

And actually, as long as you sort of have the sufficient context and, you know, enough power, you can start to say, hey, by the way, you should consider doing something as little as writing this tweet to boost the numbers that you want to do, or to something as big as you should consider, you know, this new product that might, you know, be really important over the next few years. What are the sort of major limitations to a more ambient model? Speaker B: Also, it sounds like there's just this layer of proactivity that, you know, you're suggesting might be different, that you're sort of saying, here are the the goals that we have as a business.

And actually, as long as you sort of have the sufficient context and, you know, enough power, you can start to say, hey, by the way, you should consider doing something as little as writing this tweet to boost the numbers that you want to do, or to something as big as you should consider, you know, this new product that might, you know, be really important over the next few years. What are the sort of major limitations to a more ambient model? Speaker C: Today? Speaker B: Like, how far are we from a true command center world where, you know, maybe you're really ushering out a swarm of agents per person and having them sort of monitor and think for you and do this deep asynchronous work on a regular basis?

Speaker A: I think reliability is obviously a limitation. It's still mind-blowing to me how dumb those agents can be in pretty obvious situations, and yet F-star, get a Naimo gold medal at the same time. It tells a long story about the importance of data, the importance of pre-training, the importance of post-training, and how there's been focus on code, on math a lot. And yet on different places, there is some, obviously some gains as well, but so many cases where they're like, damn, You're being so silly there. You can solve very complex math problems and yet you don't understand from the context that it is two women speaking together, whatnot.

Anyway, so I think it is the main blocker. And so as an example, I wanted to share that. I think a different way to think about working with agents is that could be a transition towards the very long ambient agents is the concept that I really like, which is the concept of a walk plan. And if you ask me, I don't understand why Linear and all the kind of Azana and stuff isn't doing that aggressively. But you can imagine a work plan. You have a very high-level task and you start splitting it in smaller tasks and you start splitting the stack in smaller tasks.

And once you start doing that work, you can do it assisted by an agent. Okay, you can do it yourself. And then you can start delegating or discharging those tasks to agents. And those agents start working on the task, come back to you and like, no, not quite yet. And eventually you start clicking the tasks that being done, maybe it's you, maybe it's another human, maybe it's an agent, maybe it's a bunch of agents. And so there you kind of have a nice mesh between the conversation and the kind of ambient agents.

It's not at all ambient to begin with because you kind of develop the work plan and discharge to agents as you go. But the better the agents are, the more they'll be taking off that work plan task all the way to eventually maybe someday defining the work plan, speaking the task, discharging to other agents, et cetera. And so I think even in the current world where we have very deep limitation in the reliability of agents on some tasks, which make the presence of a human to monitor what's going on kind of very important.

I think there's many product surfaces we can imagine that will start to, you know, mesh between the sync interactions all the way to more async through the ability to probably introspect what has been happening. Speaker B: Does that track with you? What do you think, Harrison, is the major blockers? Speaker C: Yeah, I would agree with that. I mean, I think like reliability of individual agents. I don't think there's a lot of work to be done at kind of like the UX layer. And then I'd also say, like, and this kind of gets to the reliability aspect, but just like learning/memory is also interesting as well.

Like, there needs to be some, like, that's what we as humans do. And so that's maybe a little bit further out, but I do think that's a component. I also think, like, I think code often leads this space just because the models are really good at it. And so I think if you look at, like, Claude Code, like, that's a great example where the model got good enough. Okay, so reliability's a little bit better. They did a good job of writing up a CLI and giving it access to some tools.

So some great context engineering there. And then you start to see like some interesting UX things happen. So I think, I think there's an open source project called like Taskmaster or something that like, you know, keeps an eye on like 5 or 6 Claude code things that, that happen kind of like in the background. I think they released a view to kind of like see kind of like usage of Claude code and, and, and Chip Hewitt released something as a way to debug like the errors that Claude code kind of like made.

So like now that we get these like more, like I think that's the first example or that's a yeah, one of the first examples of these really like long-running kind of like more autonomous agents. And now you're starting to see a bunch of like interesting kind of like command center-y type vibe things coming out for how to interact with them. But it's still, it's still really early on. But I, I, I like to look to, uh, to code for an example of like where, where the space is generally headed, just because I think it's ahead of, of the other verticals.

Speaker B: Given how Dust operates, I imagine you have a, specific opinion on this, but when you think about how agents play out over the next few years, do you think that, like, it's unequivocal that there's going to be really many, many specialized agents, or over time, do we just sort of start to converge into, you know, a superagent that has enough context on work and life or whatever it is? Yeah. Are we, are we heading towards a true multitude or, you know, an oligarchy or one true, true ruler? Speaker A: That's the big question.

We don't have a clear answer on that, and we're trying to stay very humble with respect to that question. We started with many custom agents, and it was a clear good decision at the time, given the state of the models. As the models are getting better, there is an undeniable force towards higher-level agents. Until the agent doesn't have a really functional memory so that it can interact with humans, learn from them, being coached and understanding how the company operates, I think you're still going to have the need for custom agents because if the agent doesn't have a good episodic memory, in a sense, it's going to be very hard for them to learn that this data is routed and this data is fresh.

I mean, in every company you have data that is not up to date and data that is good and data that is bad, and you have ways of doing stuff, etc. And so today, having custom agents lets you point to the right data, explain the right process so that you don't have to do it each time. I think the state of memory of agents doesn't scale us to a point where you could have just one agent and it's going to learn it all. And also kind of feels weird that you would have to teach your agents.

And there's also weird stuff in a, when a company gets somewhat big, you even have contradictory ways of doing stuff within the company. Team A will do stuff this way and Team B will do stuff this way. And so now it begs the question, where's the memory? Because teams will be competing for the same memory slots in the sense of doing the thing the right way. So there's still many questions, even if you assume a really great, perfect agent, there's still a ton of questions. To answer your question, I don't know if it's the end state.

I think the level of abstraction of the agents in general will increase. And so the number of agents being necessary to walk and to do walk will probably decrease. There will be probably a convergence toward one, but it's very unclear when that's going to happen. And I think we're trying to keep our finger on that trend. Speaker B: So you do think eventually there'll be a convergence towards one workplace agent? Speaker A: No, I'm sorry. I'm saying that there's going to be an increase in the abstraction level of agents and so a decrease in the number.

I don't know if it's going to converge towards one. I guess we'll converging, but maybe there's going to be a ceiling. Speaker B: 10 versus 100 or whatever it might be. Speaker A: Exactly. Speaker C: Yeah. Speaker B: Um, does, do you take the same position, Harrison, or do you see things a little differently? Speaker C: No, I think I largely agree. Maybe like a few kind of like, you know, thoughts as well. Like one, like generally, like what does it even mean to have like multiple, like what, what are like, what does it mean to have multiple agents?

Like how are they different? And generally it's, it's, it's the prompts. It's maybe the model, but mostly like the prompt and the tools it kind of like has access to. And so sure, in the, in the limit, you could maybe have like, you know, one agent with every single instruction for how to do everything at the company in the system prompt and all the tools there under the sun. That's definitely not what we see right now. Maybe it will go towards that direction or towards a smaller number of agents. I think what we see more now and is maybe an alternate view is like there will be one agent that that a user at a company interacts with, but under the hood, there will be many sub-agents that it can call out to or route to or use to, and those have like the specific instruction that, you know, like when we talk about people for building agents, it's like write down a standard operating procedure and figure out what tools it needs, and then that's your agent.

So maybe there'll be like, you know, one kind of like central supervisor agent that can interact with all these other agents either by, and now we get, start to get into multi-agent stuff, and that's very, very kind of like early on, I would say. But there are some initial ideas of how to do that. I mean, even if you look at some of the coding agents, going back to kind of of like looking at code, Google Jules is kind of interesting. It has kind of like this chat-based synchronous agent that kicks off other async kind of like background agents.

And I'm assuming there's some difference in the system prompts and tools that has access to. Something like that I think is very, very reasonable. And most people even right now are kind of building towards, 'cause people don't want to have to choose like, oh, I have hundreds of agents. Like, no, they just want a chatbot. It's this, it's a simple kind of like, you know, approach to that. But under the, under the hood, at least right now, and even for the foreseeable future, I think they'll still be like relatively special.

Speaker B: It's like having, you know, your agent that is your VP of marketing who's also managing the agent for your social media marketing, your performance marketing, your brand marketing, and you don't have to worry about, hey, I'm trying to figure out, you know, how to do my performance marketing. Here's the exact one I have to go to sort of thing. Speaker C: I think that's exactly right. I think like, um, one thing that we do, and I genuinely don't know if this is good or bad, but I feel like we often like anthropomorphize how we interact with these agents.

And on one hand it might be good because like, yeah, that's how we we are used to communicating and that maps to our mental model. And that's a good, like all these like context engineering, which is like, you know, the, the topic of the month is just communication, right? But on the other hand, like these things are different than us. So like, why should the way that we communicate be the way to keep— so like, I genuinely don't know if it's good or bad, but like the analogy you just made, like I think that's what we often do and other builders often do to try to figure out what the best way to organize and communicate across these agents are, again, for better or worse.

Speaker B: Here's a question I have, you know, that maybe goes beyond the paradigm of the agent, which is how can we make sure the agents and AI in general is doing properly useful work when we see like so much of this sycophantic posture from a lot of the responses? Like, you know, something I worry about when we see, you know, a company full of agents is like, will you really have a VP marketing, VP eng, you know, whoever who's really able to think critically about this when there is just this, you know, reflexiveness that is so pleasing?

Like, do you see a solution to that question anytime soon? Speaker A: I don't have a solution to offer like that, but I feel, I mean, my, one of a small side research project I'd love to work on, I don't have the time, but if I had time, I would play with that is probably to try to have agents debating against each other towards a goal. Speaker B: Like adversarial? Speaker A: Not necessarily adversarial, maybe more like a research community. They share results and then you have something like a Hacker News system where the things that are the most cited go up and it's a clear objective of the agents to get ranked high and they try to push towards getting some answers with that by trying to follow some form of notion of truth, which is obviously a whole, I mean, you have there's no guarantee that they would do it or whatnot.

But I think there's, in the multi-agent setup, there's probably a new dimension that it creates that can probably alleviate that problem because you'll have an opportunity to prompt agents to be actually a little bit adversarial to other agents, providing a very, as you mentioned, very reflexive response that tries to please the user. And so I think there's a lot of stuff to be explored there. Obviously, we are light years away from productizing kind of stuff. We're light years away from practicing this kind of stuff because the state of the market is light years away from even those kinds of questions, which is interesting.

But I think there's a lot of stuff to explore in that direction. Speaker B: Harrison, anything that comes to mind for you there? Speaker C: Yeah, prompting these agents to have different points of view is practically speaking what I think is feasible now. And then I also imagine some of kind of like these issues will get handled through better from better models that just come out from the foundation model labs. Speaker B: Harrison, anything that comes to mind for you there? Speaker C: Yeah, prompting these agents to have different points of view is practically speaking what I think is feasible now.

And then I also imagine some of kind of like these issues will get handled through better from better models that just come out from the foundation model labs. Speaker B: Are you seeing folks do some of that prompting to do some of those sort of like prompting different views and sort of, you know, strapping that together to have like the Hacker News style ranking or whatever it might be? Speaker A: I mean, I know there's probably a ton of teams all over the world working on those kind of ideas. OpenAI obviously has a multi-agent team.

It seems like the IMO result comes from the multi-agent teams. They say they have a special model. If you ask me, I would that it's probably a multi-agent setup where they do exactly those kind of shits. And so I think many people are exploring for sure. And that makes sense, but it's still definitely in the realm of research at this stage, I would say. Speaker C: I think we see very simple and naive versions of that, or a simple version of this is just reflection or critique on an initial thing.

And so I think a pattern that we sometimes see is like, yeah, have one agent or one LLM generate something and then give some feedback on that through whatever. I mean, this kind of gets into like some of the reward systems that actually go into RL. And so like for code, it's kind of easy. You can run the code. So you actually don't even need another agent to provide this kind of like other point of view, right? You could almost view it as like, hey, this is the system's point of view.

Your code doesn't compile. That's just like a fact, right? But I think you can imagine doing stuff like this for, you know, essay writing or something like that, where you have kind of like one agent that, you know, reviews it or, or, or gives some feedback. I think right now it's a little bit more researchy unless you have kind of like these verifiable kind of like rewards almost that you can feed back into the agent as it's running, like evals in the loop is kind of like what, what, what we call them.

And so you can add these checks from, yeah, like running code's kind of like the most obvious example. We're working on an internal coding agent and as part of that, we're experimenting with having kind of like a separate agent that kind of like decides whether it's, you know, done with a loop. And that's a little bit different. It's not like as adversarial, but it's kind of just like delegations of concerns almost, or separations of concerns. Um, and so I think you could, I think you can view some of the, some of the, some of the stuff that people do in this vein, but it's very kind of like brute force in some way, or like simplistic.

Speaker B: I'd love to, you know, zoom out for a moment and talk also just about what it means and, you know, what it's like to be building in AI at the moment and some of the specific dynamics that founders have to face. And I think one of them is really just how fast the fast following is happening. You know, you see any good idea, there will be 3, 4, 5, however many folks, you know, chasing that, that very, very, very quickly and able to raise, you considerable amounts of money.

As you've gone about building your businesses, like, how do you think about protecting against that, building in defensibility where you think you have a real chance to build a moat? Yeah, Stan, how have you thought through that at Dust? Speaker A: I mean, so building in AI is a clusterfuck, that's for sure. So basically for the past many decades, the technological substrate has been extremely stable. For the SaaS, let's say for the SaaS decades that were behind us, it was just JavaScript and Postgres. I'm exaggerating a bit again, but extremely stable technological substrates.

So when you were building something, you knew that the foundations were not moving, so you could describe where you were going. You could build an image of your vision. So you had the vision and you had what you wanted to build to realize that vision. And today, one very specific thing that you're faced as a founder building in AI is that you have what I call the fog of AI. And the fog of AI is the fact that the foundations are moving very quickly. And so you have to have a vision of where you're going, but you cannot paint it because if you paint beyond 6 months, whatever you're painting will probably be shattered by the foundation shifting towards a different, I mean, the space-time of the ecosystem shaping itself in different ways.

And whatever you were painting will probably not be true. So that means that you have that fog of AI at 6 months, which is very interestingly very problematic for like building a high-efficiency organization, I find, because alignment is one of the things that makes organizations extremely efficient. And here you don't have that kind of continuity between the current products and the vision. You cannot paint that continuously because you have that fog barrier at 6 months, which means that you have to jump from from the roadmap for the next 6 months to the vision.

And that makes alignment of the team a challenge that is interesting in the way people think about where the product will be, how they prioritize stuff. You want all of that to be as autonomous as possible. And that makes it a really strong difficulty compared to what I've seen. I've been lucky to be at Stripe and at Stripe we had a very clear alignment because it was a simple developer-centric product, an API. And so you had that very strong alignment internally that allowed for the organization to grow and be efficient.

Without a lot of processes. Speaker B: This is the lesser discussed AI alignment problem. Speaker A: Yeah, I find that one of the most challenging part of building in AI. Speaker B: I bet. Yeah. And so how have you thought about building the defensibility piece for Dusk? Like where? Speaker A: Oh yeah. So sorry. And as I mean, we've managed to build a product that was slightly in advance of phase compared to the market. And now we see the market moving to that. And as the market moves to that, every big players in the market is waking up to it.

So you have Salesforce with Agentforce, you have Agentforce Space, of Google. I mean, everybody's working up to it. We've had 2 years of building a product as best as we could that is really creating us a defensibility today, meaning that our product is probably in a better state than most of those big players are shipping today. But they're also big players with many developers, so they eventually ship the thing. I mean, we can trust that. And so I think it's always a question of trying to build a few sometime in advance, which is completely contradictory with what I just said before, but that's part of the curse of building in AI is that you must be building 2 years in advance, even if you don't really see it yet.

And so that's a real challenge. Building an enterprise is more a command center. We don't know what it looks like. Building something like Workplans and stuff like that, that we discussed on the podcast, I think we don't know exactly what it looks like, but you have to be investing a lot of resources there because you have to build conviction that it's where it's going to be in the future and you want to be building it now. And as you do those phases, the bigger you get and the more gravitas you get and the more resources you get and you have maybe a chance of surviving to the OpenAI and the Microsoft and the Google of the world.

That's for us. What about you, Harrison? Speaker C: Yeah, I mean, I think like, I agree with everything you said around just it being a chaotic time to build. I think like, you know, execution is a moat and execution speed and like like that's, yeah, to the point of, you know, building fast and thinking in the future. Like, yeah, I think like we, you know, I think honestly the team that we have at LangChain is fantastic and that is a big moat we have. And I think we do execute like really fast and really efficiently.

I think like, you know, a little bit maybe more kind of like in the details on that or other than that, the fact that there's so much going on in AI can actually be a blessing in some sense as well because competitors will get distracted by other things as well. So they might see something that you do and be like, oh, that's cool. But then they see something else that someone else does, and they're like, oh, that's cool as well. And so they'll, so I think like really trying to like understand the problem and having conviction in that and like building towards that in like a, I actually think like just like understanding in general is actually very hard.

And so from like a product point of view, just like understanding what you're building towards and having consistent kind of like product strategy in that or product experience in that, and then, you know, the features that you bad, someone else may be able to copy them. But if they don't have that kind of like holistic understanding, they're not going to do as good of a job, and it's going to show up kind of like at the margins. And then the other thing that I'd say is like, I think a lot of the early things we did were around that kind of like understanding the user experience and building towards that.

And now we're starting to build— we're starting to try to figure out like, okay, what are the kind of like, um, deeper technical bets that we can make that like these other things like all kind of like boil down to in some way? And, and, and again, like, and there's like 2 or 3 in particular that, like, we're kind of like thinking about. And that's not that many, right? But we will way many more things, but they all kind of like come back to this. And so if you're looking from the outside in as a competitor, you might say like, oh my God, they do like 100 things, but there's really like 2 or 3 kind of like tech deep technical things that we're betting on.

And that's just, you know, having conviction and kind of like, I think it's tough because you do need to be moving fast, but you also need to kind of have some sort of consistent kind of like conviction or consistent kind kind of like technical bets that you're making to build up that moat that can't just be replicated in like a week. So it's a, yeah, it's a crazy time, but that's how we think about it. Speaker B: You know, at the time that we're talking, not long ago we had sort of the windsurf drama opera of, you know, OpenAI buying them, then that falling through, then, you know, management getting picked up by Google and then, you know, Cognition sort of taking the rest of the company and sort of saving the employees from, from being left without anything from what was looking like a really massive acquisition.

Do you think that's like a version of M&A that we're going to see more often? Was there enough of an organ rejection from the startup community towards that practice that like, hopefully we don't see that as much? Are you seeing you know, talent sort of respond to these kinds of behaviors? I'm, yeah, just curious for your take as founders on the ground and how, yeah, those sorts of things are changing the field perhaps. Speaker A: Weird, it's probably going to be controversial, but who cares? I think I much rather prefer the whole Windsurf setup than the Scale AI setup.

Why? The Winstaff setup is actually an acquirer and has been an acquirer forever. And acquirers have always been selective on the people they take in. The weird thing about that acquirer is that the amount just doesn't make sense. It's just way too big. And so the amount makes it, like, makes it really not great because you already paid $2.5 billion for some folks. Why don't you get all the folks, right? Generally, the acquirer is when the company is dying in a sense, and it's an event that is great because it lets you join a bigger team, but it's not with massive amount of money.

So the kind of stardom system of AI,, makes those acquirers completely weird. But I'd much rather have that than Scale AI. It feels to me slightly more weirder because it's both the CEO acquirer, but at the same time, not fully completed acquisition, just a majority stake buy, which I don't know what were the dynamics for in terms of returning to the employees and feels kind of a bit, almost even a bit more complex, and I, as a sense, a little bit more perverse. But obviously, I, I, all that being said, I do, uh, think that it was really, uh, not acceptable for the employees that were seeing a bunch of folks leave for $2.5— I don't remember numbers— but for billions of dollars and be left with something like, what, what are we doing, guys?

And having somebody tap you in the back and say, you've got a running company that's great, go get it. That made the whole setup weird. But at the same time, if you look back and forget about the amounts, it was still mostly an acqui-hire. It's just, it was weirded out by the massive amounts involved. Speaker C: This isn't the first time it's happened. I mean, like Character, Inflection, Adapt all had versions of this as well. I also think like you're seeing in the markets, like there's just this insane price for talent that's going on with, you know, Meta and, you know, the rumored salaries and stuff that they're paying people to try to get from OpenAI.

And so I think it's like a really really just like crazy time in the talent market. And I think that's manifested in a few ways, including these offers, but also including these kind of like acquihire acquisitions, whatever, or, you know, faux acquihires, whatever you want to call them. I feel like the Windsurf News is new enough where I actually haven't had that many kind of like detailed conversations with folks about it on the ground. I think it happened what, last week or something like that, or week and a half ago.

Yeah, I mean, I don't really know how it will affect things going forward. I do think that it's, uh, uh, you know, I imagine that's not what, um, the founders had in mind when they started the company or even, you know, 6 months ago or something like that. And so I don't think it's a great situation at all for kind of like the employees that were left there. I also don't, I don't know, like I, I, I, I spent a lot of time thinking about why I wanted to start a company before I started a company.

I came to the conclusion that I wanted to build something great with people I enjoy working with. And I feel like, you know, I feel like not only do we have that here at LangChain, the chance to build something great, but also like I really enjoy everyone that I work with here. And so I think like that's personally kind of like what motivates me. And so I think I have a tough time kind of like seeing kind of like myself or LangChain going that route. But, you know, I also think if you ask the Windsurf founders a year ago, they probably would've said the same thing.

So I don't wanna, I'm not, I'm not here to judge anyone. Speaker B: So yeah, I, I think your point about the, just the intensity for talent right now is like, you know, really such an important one as startups. Like how, how do you think about where your competitive advantage is in, in such a hot talent marketplace? Like where have you found you're able to, to compete most effectively? Um, and, and get the folks that are like perfect for your particular mission? Speaker A: And for us, we're building from Paris, so we have an easy, easy, easy day on this one.

I think there's still some competition, but it's nowhere close to SF. And I think we are capable of creating— I mean, we're spending a lot of time creating a brand that is really attracting in Paris and has been a really great thing for us to build the team. We're as well super excited to be working with them. I think that's been our mostly differentiated approach here is around the locality of the engineering team, for sure. Speaker C: Yeah, we're mostly based in San Francisco, so it's been a lot tougher. I'd say like, you know, like one, we hire more kind of like just software engineers as opposed to research engineers.

And so like the folks who are getting a lot of the crazy salaries, we're probably not competing for them. That being said, like, you know, OpenAI and Anthropic and everyone is also hiring a bunch of software engineers as well. I think, you know, we're, we're a lot smaller than them. We're more of a startup. A lot of people wanna work at a startup for a variety of reasons. And so that's been the main reason that someone would join us as opposed to one of the model labs. Speaker B: Amazing. Well, I wanna just ask one more AI question before we do a bit of, uh, our, our final wrap-ups, which is more of a general one, but what are your rough sort of frameworks for when we can expect sort of true AGI, if you think we haven't hit it yet already?

Already. And, you know, ASI, you know, are you on the AI 2027 timeline? Are you more bullish, less bullish, more scared, less scared? Speaker A: Yeah, I don't really have a timeline. What we've seen so far, if you look back, is that investments in that ecosystem has been a leading indicator to the progress that has been made, which makes sense and is pretty obvious. I think looking at advancement, it's nowhere close to being stopped, so we can expect more progress. The pace of it, et cetera, is very hard to anticipate in any way.

I mean, at the end of 2024, we were like, ah, things start plateauing and all of a sudden you've got a new paradigm that kind of emerged and push back again and push back the accelerator in terms of progress. So it's all very hard to anticipate. What is true is that it seems like the investment is still going crazy, which means that there is no limits to the kind of resources that will be invested in in making those models better. It is hard to believe at the same time that there is kind of a hidden limit that supposedly would be here around before we reach even better capabilities.

So I think it's mostly a question of pace. To be honest, that's not something I'm spending too much time on because I think we are, and I probably assume that it's the same for Harrison, but we are pretty agnostic to the, or pretty edged in both scenarios. I think the scenarios leading to something that is really Superman and the machine takes over all work, it will require some amount of time to transition to that. And there's going to be a lot of product to manage that transition. Eventually everything's off anyway.

So in a sense, we're pretty neutral. And if the technology kind of plateaus, I think the deployment of that technology in the society will still take years and years and years. And so I think there's still a lot of value to be constructed, to be created around building the product that helps that deployment. So I think we're excited to be building in both worlds. One of the funky, funny ways that motivated me to go back to building a product was DAT, which is the last train before AGI. So it's kind of the last opportunity to be building a company.

And so that was kind of, it's not the true motivation, but one of the one of the fun reasons why I moved back from OpenAI and started building a startup. So we'll see. I think it's, to me, I don't have a timeline and I'm just amazed by how fast it's been progressing. And so I don't expect it to stop. And so I'm really wondering where we're going to be in 2 years. I don't know if it's going to be AI 2027, but it's surely going to be crazy compared to where we are today.

Speaker B: Harrison, I saw you nodding along through most of that. Does that sort of Is that how you see things more or less? Speaker C: Yeah, I think that's pretty spot on. Like, I don't spend a ton of time thinking about that either. I think there's, I think even, you know, even if the models get really, really good in order to make them impactful, you'll still want to integrate them in some way and that has to happen somehow. And I think that's the type of work that we focus a bunch on.

I don't have any particular insights and so I don't claim to, and I don't spend a ton of time thinking about it. Speaker B: Well, let's move to two sort of wrap-up questions, if that sounds good to you. Harrison, let's stay with you. If you had unlimited resources and no operational constraints, what is an experiment you would love to run? Speaker C: I don't know the exact experiment, but maybe two areas. One, one, just one general one, and then one more kind of like focused on this. I think memory is really, really interesting.

I think memory for AI and agents is, is, and personalization and learning or whatever you want to call it. And so I don't know exactly exactly what the experiment I would run is, but that's an area that I would absolutely kind of like explore. And then one that's like not related to AI at all, but like, how do I get the best sleep? Like, I just wanna sleep really well. Like, I, I need like 8 hours or, or I'm terrible the next day. And so like, what conditions set me best up for success there?

Like, that's one that I personally would love to have an answer to. Speaker B: So, well, we're seeing some amazing things happening around like sequencing short sleepers and trying to figure out, you know, how to turn that into some kind of therapeutic where, you know, on 4 hours a night, maybe you can have the same level of productivity and energy or more. So who knows, maybe we're just a few years away. Speaker B: So, well, we're seeing some amazing things happening around like sequencing short sleepers and trying to figure out, you know, how to turn that into some kind of therapeutic where, you know, on 4 hours a night, maybe you can have the same level of productivity and energy or more.

So who knows, maybe we're just a few years away. Speaker C: That would be great. Speaker B: Stan, what about you? What would your experiment be? Speaker A: At the end of the day, a lot of, and that's what we're doing, is just we'll be about doing it faster to some extent. I think at the end of the day, those models are still pretty smart. They do a lot of really impressive stuff. It's mostly that we don't have the pipes to connect them to the to the right actions and the right data.

And so there's a lot of pipes missing. And so if you could send everybody, I mean, a very large team to do partnership with every platform up there and create all the pipes,, and see how much, and be then able to really figure out how much of work can be taken by those models. Because I think we don't see it quite clearly, not only because there's the model capabilities, there is a limitation, there's So the availability of the actions, availability of access to the data, which is never perfect. And so getting perfect there is purely an operational play.

It's about building stuff or getting partnerships and stuff. And so that would be one of those. The other thing that is kind of a sidetrack is I'm super interested in answering the question of whether there is a maximum team size that exists for a given product, especially in the age of AI. I think many people do believe that there is an optimal team size for a given product, let's say, and that often the scale that some companies go into, to thousands of employees and stuff like that, is mostly people getting busy by just organizing themselves in a sense.

I think with agents being ambient around them and being used in many ways, ways this has become more and more true. And so if I could really quickly find out, you know, scale the team with really great talents at many different levels and quickly operate it to see how it feels, that would be something that would be very interesting to figure out. So if there is such maximum of efficiency at some point. Speaker B: I love that. Yeah. Where you hit the limit. Speaker C: Okay. Speaker B: This is a question I love to end with usually.

If you had the power to assign a book to everyone on earth to read and understand, stand, what book would you like to assign? And why don't we stick with you, Stan, and we can end with Harrison? Speaker A: Yeah, one book that I loved is, and that is, I'm not sure I understood it completely. It's called, it's Greg Egan, Permutation City, because I think it just sends you into thinking about the nature of consciousness in a way that is extremely interesting. And so that's one that comes to mind. Speaker B: Is it a novel, fiction?

Speaker A: It's a novel. It's a novel. Speaker B: Okay, yeah, it sounded like it might be. Wow, that sounds really interesting. I'm going to add that to my list. And what about you, Harrison? Speaker C: One of my favorite books is Range by David Epstein. And it actually, you know, it's great for the title of the podcast, The Generalist, because it's all about how generalists succeed in kind of like a specialist world. And I thought that was really interesting. And, you know, when they succeed and when it's also good to be a specialist and, you know, just how a lot of the great things in humanity have come from just generalists with range connecting dots and putting things together.

And so I really enjoyed that. Speaker B: What a perfect coda for us here. Thank you both so much for taking the time. I learned so much and yeah, really enjoyed chatting with you both. Speaker B: What a perfect coda for us here. Thank you both so much for taking the time. I learned so much and yeah, really enjoyed chatting with you both. Speaker A: That's it. Speaker D: Thank you for listening to this episode of The Generalist Podcast. Speaker A: Podcast. Speaker D: Please subscribe on Apple Podcasts, Spotify, or your preferred podcast app.

Ratings and reviews help others discover these discussions, so if you enjoyed the conversation, I'd be grateful if you could take a moment to leave one. For all past episodes and more, visit us at com. See you next time as we continue to explore the future.

Want to learn more?

Ask about this episode