Data Engineering and Analytics with Natural Language
Space and Time recently announced the release of Houston, an OpenAI-powered chatbot that users can interact with in the Space and Time Studio. Houston leverages GPT-4 and ChainML to allow users to generate SQL, Python scripts, dashboards, and more from natural language prompts.
This blog post is adapted from a recent podcast episode where Scott Dykstra, CTO and Co-Founder of Space and Time, interviews ChainML Founder and CEO Ron Bodkin to discuss the capabilities of Houston, the future of generative AI, and the partnership between the two projects.
Scott Dykstra: I'm Scott Dykstra, Co-founder and CTO of Space and Time, and I'm here with Ron Bodkin, CEO and Founder of ChainML. Space and Time has been doing a ton of work with ChainML on the text-to-SQL AI integration into Space and Time. Namely, ChainML actually delivered a conversational AI engine that Space and Time uses in our chatbot. Without the ChainML team and the help from Ron and his engineering team, we would be very challenged on the chatbot side of things. So what Ron's delivered is an enterprise-grade conversational AI engine that we've integrated into our app. And it's been a really fun experience working with ChainML, another partner of ours in the Chainlink BUILD program as well. Ron, great to see you.
Ron Bodkin: Great to talk with you here today, Scott, and we're really excited about working with you on Houston. You know, we think Space and Time is really being an innovator in applying conversational AI to making applications more useful. Houston makes it easy to get customer support and ask questions, and not just about documentation: it even generates code to make it easy to integrate with the database and supporting technology, and makes it easy to do analysis with the Space and Time database so that users can visualize data and even do forecasting. You know, there's a lot of power in that. And we're seeing more and more projects realize that adding those kinds of capabilities with generative AI is really important. And you know, one of the things we're excited about is that our engine makes it easy to break down these more sophisticated use cases into small agents that collaborate together and really deliver results. And I'm certainly excited to talk about how we've been working together and how we enable that.
Scott: What got ChainML focused on conversational AI? What was the beginning of this journey for ChainML?
Ron: As a team, those of us at ChainML have been working in the AI space for many years, more than a decade for all of the leadership team. And, you know, we've been excited by the power of large language models for several years. I was excited by it and leveraging it in my time in the Google Cloud CTO office working with big customers. When I went to the Vector Institute, one of the leading AI research institutes in Canada, working with some of the large industry customers, we were helping them look at how to use these models for customer support and for interacting with their customers. And the technology just keeps getting better and better. So we started talking to Web3 customers and saying, like, "What's exciting for you to do around AI?" We saw so much interest in improving user experience in getting access to data. And we built a little hackathon demo to show what we were excited by, and Scott, you saw it, and maybe I'll let you sort of describe what happened next.
Scott: Yeah, I think we were really, really impressed by the capabilities that ChainML brought to the table in the chatbot—capabilities like forecasting against blockchain data, capabilities around intent routing, like if I asked for a certain action to be performed in the chatbot, ChainML does a good job of figuring out what that intent is, and routing my request to the appropriate backend to fulfill that request, whether that's a text-to-Python app or a text-to-SQL app, or a forecasting model, or, you know, probably some more sophisticated capabilities coming soon. Asking questions about documentation from Space and Time. Where do you see this going? Where do you see ChainML continuing to build in the intersection of AI and Web3?
Ron: We're excited! We are getting an open source release of the engine ready. So that's coming soon. That'll make it accessible to more people to be able to build some of these powerful experiences. Obviously, we're excited as more Space and Time customers start using Houston and say, like, how do we extend this and add some of our own skills and capabilities in the agents to make it work well for our use cases. Those are all important things that we're wanting to do. You know, we think that increasingly, the industry is moving to how do you not just build a demo? Hey, you can build a demo that calls out to GPT-4 and does some interesting things. But how do you build something robust that you can deploy? How do you make it so it's possible to really have confidence in the answers it's returning and do things like fact checking and getting references for them? How can you use different models and increasingly powerful open source models to prepare information and verify it? And in concert with the most powerful commercial models, like GPT-4 and Claude, produce really high quality results. So we think that's important.
And the next step is going to be starting to do multimodal, right? So not just language, but also processing images and videos and more complex modalities. And I think, you know, more companies and projects are going to do like Space and Time and build rich conversational experiences into their apps like Houston. So I think a lot more organizations are going to be doing that, because it's so valuable, you know, and that's going to draw on the need to have access to database data, as well as unstructured data. I think you still see a lot more people who are building demos of “how can I query a document?” But you know, being able to integrate documents and structured data in AI is going to be incredibly important too.
Scott: I completely agree. And I think a lot of projects, a lot of protocols in Web3, will try to build their own chatbot, and realize that it's easy to get a chatbot up and running with OpenAI's chat API and get the basics working. But then they start to realize, hey, there's sort of a layer of intent that needs to be handled, a routing layer is the best way I can describe it, sort of almost like a middleware layer that sits between your frontend chatbot and the backend large language models that you're speaking with. You need something in the middle that acts as an agent-based framework for communicating with the large language models on the backend. Where do you see ChainML going over the next year? You talked about open sourcing your conversational AI engine relatively soon. What else is your team working on?
Ron: We're working with a number of customers who are excited to use it. We think there's going to be more and more capabilities, not only things like intent detection, but how to do refinement and improvement. How can you break down more complicated problems with good planning for how to coordinate? When do you get feedback from a person? Right? So I think, today, people are often rooted in a very, very short cycle chatbot mentality of, like, I asked an AI assistant to do something quick, and it does it or it doesn't. And we think it's moving towards a world where you're going to have more complicated projects that people do with ongoing assistance from AI. And we think that's a really important direction. And part of all of that is also adding much richer quality management.
So how do you measure how well the AI is doing on an ongoing basis across a wide variety of tasks, so you can keep improving and know where you stand? Right. So that includes both using AI to get feedback on the AI, but also knowing when to get people to give input, whether it be users of your app who are giving explicit feedback or even just using third-party teams of people to give critical feedback so you can keep evaluating and improving. So these are all areas we are excited to work on. And then we're also excited to keep pushing forward on the underlying infrastructure that the engine runs on. We have a protocol for executing AI that will allow you to run in a decentralized way so that you can run on the right GPUs in the right place in a censorship-resistant way. So we're also moving towards that. We think that's also going to be an important direction at the intersection of AI and Web3.
Scott: And do you see ChainML potentially training your own LLM at some point, as the open-source side of LLMs gets better and better?
Ron: Yeah, it's a good question. I mean, I'm really bullish on the rapid progress on open source language models and the funding that's going into creating high-quality open source models. So, you know, I think there's going to be more and more opportunities to use a variety of great open source models to create valuable experiences, and many companies are already testing and comparing them. But I think there's often low-hanging fruit around how do you fine tune them and make small adjustments on your data to make it work well for a specific use case? We find that often, that's not the first thing you want to do, that you can move so much faster by integrating external data to make the right information available to the model and prepare a response and even try some different prompting strategies and assess and refine. So there's a lot you can do to improve the quality of an AI agent without having to build your own custom model.
But it's certainly a direction that we think over time, the community would like to see, you know, resources for training open source. So as we scale, it's possible we'll do more in that space. But right now, I think it's great to see such a variety of nonprofits, academics, as well as commercial companies, creating open source large language models. And I think in the Web3 community, DAOs will increasingly come forward and want to support the creation of great models. So I think that's an area of rapid evolution and progress.
Scott: And our partnership is kind of focused right now solely on data management – what are different capabilities we can bring to the table in a chatbot that allow a dapp developer, a business analyst, a data engineer, to manage their data better. So I can see an open source LLM that's maybe fine tuned in those areas being really beneficial to Space and Time and ChainML in the future.
Ron: Yeah, and I think you want to certainly test it out and say, like, at some point, do you get a better result out of fine tuning an LLM specifically for database queries and analytics use cases versus, you know, how much do you benefit by simply keeping up with the rapid advances in the latest models and providing the right information to them to produce good results? Also, a big thing that we're focusing on that I think is important is that it's not just a one and done, like, ask a model to produce an output, but to be able to then test it and use the feedback of what that was to refine it. And even to have, in parallel, some critics that say, “what are the potential errors or what's incorrect, potentially, about this?” and fix that. Or, you know, maybe ultimately go back to the user and ask them, like, “hey, I want to clarify, you know, how do you measure this key objective so that the query is not just correct data, but it's the way you wanted to measure it?” Right?
So I think it's a big shift, because frankly, people in the industry are awfully focused on training models. But the powerful thing about these generative AI models is that one model can do very well on a wide variety of tasks. And if you give it a bit of context, it can do a really good job. So I think there's a lot less need to train custom models going forward.
Scott: Yeah, I definitely agree. We've been talking a lot about Houston. Let's do a quick walkthrough of Houston and describe what Houston is and go from there. Houston is Space and Time's chatbot that was built in partnership with ChainML and, of course, is powered by ChainML. Houston is a chatbot that we've built together in order to allow users to better perform text-to-SQL generation: like, ask a natural-language question at the prompt, get back SQL, run that SQL against blockchain data. We've also done a lot of work with Python. For example, I was playing around with a prompt today: write me a Python script to grab Twitter sentiment data from the Twitter API around Polygon, basically saying, hey, I want to write a simple Python script that pings Twitter's APIs, grabs back any data we can get around Polygon – you know, what's the conversation around Polygon on Twitter? And then, of course, integrate that into our app, and we see that OpenAI generates the Python script for us. We can debug it live in our kind of Python previewer, our code editor, code previewer, and go from there.
And, you know, this goes far beyond just text-to-SQL. I'm starting to see a number of different data-focused applications, both in Web3 and in Web2, start doing more and more work with text-to-SQL, which we're going to describe in a second. But I think it becomes really powerful when you can also generate simple Python scripts that load data into your database, load data into Space and Time, grab data from external sources and generate Chainlink oracle jobs, for example. And so where it gets really interesting is on the frontend: I'm saying, hey, give me a Python script, and here are the parameters of my Python script. On the backend, ChainML is interpreting that request and routing that request to a Python-focused module, which is then asking GPT-4 to give back a Python script. What ChainML does is it provides a middleware layer that does the routing appropriately in the conversation, because separately, if I say, you know, show me all Polygon wallets that have, let's say, at least 10 transactions on-chain, this is a much different type of request, and is a much different set of prompts that we're going to be sending over to GPT-4. ChainML kind of offers the middleware, a middle-tier layer that can route requests to the appropriate backend logic, whether it's for SQL, whether it's for Python, whether it's for understanding documentation, submitting bugs, there's so many different actions that you could take in a chatbot focused around data. And if you only do one thing, it's easy, right? If we're only doing text-to-SQL, it's easy, but ChainML helps when there's a lot of different actions.
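The routing layer Scott describes can be pictured with a toy sketch. This keyword-based router is purely illustrative (module names like `text_to_sql` are hypothetical, not ChainML or Space and Time APIs); a real engine would use an LLM to classify intent rather than keyword rules:

```python
# Toy intent router: map a natural-language prompt to a backend module.
# Hypothetical module names; a production system would classify intent
# with a language model instead of keyword matching.

def route_intent(prompt: str) -> str:
    """Return the name of the backend module that should handle the prompt."""
    p = prompt.lower()
    if any(k in p for k in ("python script", "write me a script")):
        return "text_to_python"
    if any(k in p for k in ("show me", "count", "select")):
        return "text_to_sql"
    if any(k in p for k in ("docs", "documentation", "how do i")):
        return "docs_qa"
    return "general_chat"

print(route_intent("Show me all Polygon wallets with at least 10 transactions"))
# → text_to_sql
```

Once a prompt is classified, each module can carry its own prompt templates and context-gathering logic, which is what keeps the single-intent case easy and the many-intent case manageable.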
Ron: Even within generating the code or doing the database query, it turns out that there's already some complexity in the control flow, where you have to not only put in the right contextual information, right? Like, you know, what code is relevant to get from the Space and Time codebase for generation purposes? What are the latest APIs or the latest tables that are being used? But then also iterate on results to say, hey, maybe let's test it and, like, hey, there might be an error here, so let's revise the script or the SQL code to respond, and see what actually produces the right results. So even within one of those chains, right, getting the whole flow of preparing the information and producing the result, iterating until you get a working result and then returning it, is a lot more complicated than just, like, let's take our best shot and see, right?
And this is a big win, because most of the time when users are using ChatGPT, well, one, it doesn't know the latest APIs. Its information was cut off in 2021, so it's way out of date. And it doesn't even have any information on some APIs. And then two, if it makes a mistake, the onus is on you to go take it, paste it back in, tell it here's the error, what do you do to fix it. You end up in this spoon-feeding process of going back and forth with the chatbot. So it's a big win for users to have all that work done automatically and predictably.
Scott: Exactly. So one area where the chatbot can be exceptionally useful is when you just need to kind of start wrangling some data, discover certain datasets or begin writing queries really quickly. These might not be, for example, production-grade queries that you're going to necessarily put into a dashboard. But sometimes they are, and just getting started with finding the data you need, wrangling the data sets, beginning to write SQL against the blockchain data with the right schema, the right columns, and figuring out something as simple as, like, hey, show me Polygon transactions by day to get started, or, you know, show me all Polygon wallets with at least five transactions on-chain, and beginning to, you know, get back accurate SQL in the chatbot. That SQL executes in the chatbot, and, of course, not only do you get a query result, but you can also visualize that query result and edit your data visualization as well. In one case, I was asking, hey, show me Polygon transactions by day within, you know, a comfortable range. And we're getting back, you know, timestamp, transaction hash, from and to addresses, information about that set of wallets as well. In the next query, I was kind of asking about wallets that have a certain number of transactions on-chain. And so just being able to ask natural-language questions in the chatbot and get back accurate SQL which executes against Space and Time, it really accelerates time-to-value for finding what you need.
So, you know, a lot of our business analysts are looking at this data set that represents Polygon, for example. So we've got a relational SQL copy of exactly what's on-chain on Polygon. And there's a lot of tables. You have your blocks table, which can be joined with, for example, your native wallets table, but then beginning to query this can be challenging. So the nice thing is you can jump into our query editor, or into the chatbot, and begin writing SQL, select all from Polygon blocks. And we can begin to see the blocks table we've referenced, as well as other tables we can join in. But a lot of our users say, hey, we need to take that one step further, and we need to actually start querying it. So I might say, you know, looking at this data set here, I'd say, you know, show me all blocks tables where, you know, gas used is greater than a certain amount of gas or where the block reward was above a certain number. Now, this is a really powerful tool for discovery, for beginning to build your dashboards or chart your datasets. In just a second, we get back SQL from OpenAI, and that SQL query executes against the Space and Time database. And we can even see some generated visualizations there. And, you know, some guesses of what the right data visualization might look like there in the chatbot. Where ChainML comes in is ChainML knows when I say show me XYZ, this is a SQL request, not a Python script I'm asking for, not a question about Space and Time's documentation. I'm not trying to have a general chat conversation, I have a very targeted, very specific request: show me Polygon blocks. And so ChainML knows this is a SQL request, routes the request appropriately to a backend SQL module, which then talks to OpenAI, and provides a bunch of extra prompt context to GPT-4.
Ron: Yeah, I mean, so not only is ChainML's engine determining what's the right chain to run, what the right set of capabilities are, but within that the control flow to say, like, hey, first, how do we provide the relevant tables for this query that you're asking, the type of query you're asking, and give information on a subset of it to generate SQL, and then use the result of running the query to say if there's a mistake, let it go back and correct it to get the right results? Just like in the code generation example, prompt it with the latest code from the API so that it knows how to generate the information, unlike if you were looking way back to what's built into a model like GPT-4 where the last knowledge it has about APIs is from late 2021. So very out of date, right? So getting in the latest information, generating based on that, and then looping and correcting, it's important for scripts, it's important for SQL, and, you know, we think this is a really powerful way of creating leverage for the analyst, right? You don't need to be an expert at prompting, you don't need to spoon feed a chatbot like ChatGPT and go back and say, let me run it, let me give it some more information, let me fix the error. You don't need to do chat turns to get a useful query. It will automate all of that, and you end up with an analytic tool that's powerful. Right?
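The generate-run-correct loop Ron describes might look like the following minimal sketch. The GPT-4 call is stubbed out as `generate_sql` (a stand-in for a model request that would receive the schema, the prompt, and any prior error message as context); the stub deliberately gets the table name wrong on its first try so the loop has something to correct:

```python
import sqlite3

# Stub for the LLM call. First attempt references a non-existent table;
# once an error message is fed back, it returns a corrected query.
def generate_sql(prompt, error=None):
    if error is None:
        return "SELECT COUNT(*) FROM wallet"   # wrong table name on purpose
    return "SELECT COUNT(*) FROM wallets"      # "corrected" after seeing the error

def run_with_retries(conn, prompt, max_tries=3):
    """Generate SQL, run it, and feed any error back for another attempt."""
    error = None
    for _ in range(max_tries):
        sql = generate_sql(prompt, error)
        try:
            return conn.execute(sql).fetchall()      # success: return the rows
        except sqlite3.OperationalError as exc:
            error = str(exc)                          # loop: error becomes new context
    raise RuntimeError(f"gave up after {max_tries} tries: {error}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wallets (address TEXT, tx_count INTEGER)")
conn.execute("INSERT INTO wallets VALUES ('0xabc', 12)")
print(run_with_retries(conn, "How many wallets are there?"))  # → [(1,)]
```

The point of the sketch is the shape of the control flow, not the stub itself: the user never sees the failed first attempt, which is exactly the spoon-feeding step that gets automated away.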
And, you know, I don't want to give too much of a sneak preview of where we're going. But we think that this is a primitive we can build up and build more powerful analytic tools that users can have. So not only generating the query and visualizing it, but how do you keep raising the bar so that analysts can have a better view of all the data that's in a rich database, like the Space and Time database, like all that wonderful blockchain indexed data, and really do powerful analysis on top of it.
Scott: What other actions do you anticipate being important to your customers? Space and Time is very focused on data engineering: SQL, Python, questions about our documentation, focused on the database. What other actions or intentions do you see ChainML routing for your customers?
Ron: We think there's a range of people in DeFi and finance who often like to do deep analysis of different strategies and forecast potential outcomes – you know, automate their trading strategies and their risk management protocols, right? So those are all examples of capabilities that increasingly can be built in so agents can do more of that analysis and give you more leverage and more automation while still letting you ultimately have control and decide what you want to do. And we see a lot of protocols that want to integrate support with not just documentation from what's been written, but also documentation that's auto-generated out of the latest code. And also even code examples. We see a lot of projects, you know, that are excited by that kind of capability. They're rethinking the whole area of research and analysis, combining the structured and unstructured data, so they not only have rich database data from the Space and Time database, but also the documents that might be previous reports from a company or information that's been published. We're often seeing use cases where analysts want to combine, you know, that kind of information that's been published or put out on social media with database data, and that rich combination we think is really powerful.
Scott: I think so too. And I think there's a lot more use cases around documentation, error reporting, bug reporting, customer support, that almost every protocol, every dapp, every project could leverage, especially any project that has their own documentation that people are already asking a lot of questions about.
Ron: Yeah. And you know, the documentation is one important source. Pulling the signal from the noise in Discord or other community discussions – how can you find the authoritative answers and not the noise of new people showing up and asking the same question over and over again, but who's answering things insightfully? How do you actually get information from the latest code? It turns out it's a well-known secret that most projects' documentation lags the code. So actually, using the code in preference to the written documentation is important. Like, it's true that you probably want to answer based on what the code says and not what was documented a while ago, right? So answering technical questions and helping people be productive is a super important area that we think a lot of projects are looking for.
Scott: So, a quick example of that. Let's say I asked a very unpolished question about your docs: tell me how I auth with PKI. ChainML would interpret that as a request for documentation, route that request appropriately to retrieve support documentation, and then OpenAI understands, hey, PKI is referencing public key infrastructure. Given all of our documentation, it'll return a very reasonable response about how to sign a challenge with your private key and how to get started with Space and Time security auth APIs.
Ron: To point out one other thing, too, is to do that, you know, before calling GPT-4, we have indexed the documentation and could take that query and say, we're going to find the parts of documents that are most relevant to it so that we can prompt it and give that as input to the large language model so it answers with the right context in mind.
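The retrieval step Ron mentions can be sketched minimally. Here documentation chunks are scored by simple word overlap with the question, and the best match is prepended to the model prompt; a production system would use vector embeddings rather than word overlap, and the sample chunks below are invented purely for illustration:

```python
# Illustrative documentation chunks (invented for this sketch).
DOC_CHUNKS = [
    "To authenticate, sign the server's challenge with your private key.",
    "Tables can be joined across Polygon blocks and native wallets.",
    "Dashboards are built from saved SQL queries in the Studio.",
]

def retrieve(question, chunks, k=1):
    """Rank chunks by shared-word count with the question; return the top k."""
    q_terms = set(question.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(q_terms & set(c.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(question):
    """Prepend the most relevant chunk(s) so the model answers in context."""
    context = "\n".join(retrieve(question, DOC_CHUNKS))
    return f"Context:\n{context}\n\nQuestion: {question}"

print(build_prompt("how do I sign the challenge with my private key"))
```

Only the selected chunk reaches the model, which is what keeps the prompt within the context window while still grounding the answer in the actual docs.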
Scott: Yep. And this will only get more powerful, you know, as people begin asking more complicated questions about not just our documentation, but, of course, the massive amount of indexed blockchain data that we're ingesting into Space and Time and making available. It'll eventually sort of become a Google search engine for blockchain data. Right now, there's this mapping of a question to a SQL statement where I say, hey, show me a count of Polygon wallets with at least 10 NFTs. That's a mapping of request-to-SQL. But as we get better with vector search, as we get better with how we organize this blockchain data, and even, you know, begin to feed the raw blockchain data itself, potentially, into open source LLMs, over time, the blockchain data itself can be searched against generatively, or with nearest-neighbor algorithms in vector search databases, and ChainML and Space and Time together can essentially build the Google search for blockchain data.
Well, what we've already done so far with just asking natural language questions that are converted into SQL is already a very, very powerful piece of infrastructure for finding what you need and answering your questions quickly. But that'll get even more powerful as we abstract a layer away from SQL even further. Do you get what I'm saying, Ron?
Ron: Absolutely. And I think it's a great time to say to people who are using Space and Time, we'd love to hear how we could make Houston even more powerful for you. How can it improve your analysis and make it even easier to access all that great blockchain data and visualization? How do we just make it the best tool for doing analysis on blockchain data?
Scott: Tell me a little bit about how you expect not just Space and Time, but really all of your customers and partners, to build on ChainML. Is it really just an API that you host on the backend? What does the integration with ChainML look like?
Ron: Yeah, I mean, so we provide APIs. Our engine makes it really easy to put together different chains that use a variety of models to solve tasks like forecasting, selecting data, testing hypotheses, finding information and documentation, providing focused conversational support like in customer service, right? So making it easy to integrate those. And they're all designed to be really modular, so you can extend them, you can change prompting strategies, you can put your own documentation and your own information in place and say what tables are relevant for your project, etc., and make it really easy to control what the intentions are, what the chains are, how it works, with a budget, even, so you can say how much resources you're going to use.
So we make it really easy to get up and running and do these kinds of use cases, and keep building more sophistication. Keep building, like, how do you do a level of filtering? How do you ensure you're producing results that are consistent with the way your brand wants to reach out and interact with your customers, right? So there's a range of functionality. It's easy to start with the APIs, but it's also easy to extend.
Scott: From our perspective, where we're going next with this, the next steps for Space and Time are really to build out more and more data engineering-focused capabilities. Earlier in this discussion, we showed text-to-Python, where I said, hey, give me a Python script that can ping the Twitter API, grab some Twitter sentiment data around a keyword like Polygon. I think that'll become a really, really fundamental thing for accelerating time-to-value with using databases, especially a Web3-native database that already has indexed blockchain data loaded into it: Space and Time. We, as a service, already index an exact copy of what's on-chain and make that available to query. And our customers are using ChainML's conversational AI engine to query that data. But then customers want to join that with their own data or with off-chain data or data from reference tables from off-chain or financial data from TradFi to join with things that are on-chain related to DeFi, like interest rates. They might want to go grab Twitter data to join with on-chain activity and see how that maps. They might want to map gaming data to, you know, what in-game events led to an on-chain NFT purchase or mint. And that intersection of what's happening on-chain with kind of what's happening off-chain will be really important.
Ron: Yeah, I totally agree. And I think the flexibility to bring in a wide variety of data to expose it so it can be used appropriately by the AI is a big area of continued investment and improvement. So we're excited for that. And I think, you know, you're pioneering making it the best developer experience for the protocol. And I think any Web3 protocol, or, indeed, any developer-focused project, needs to be thinking about how to build these AI capabilities in to make it a rich, productive experience for their customers.
Scott: Yeah, and I think from a data engineering perspective, a few areas we can tackle next beyond just simple text-to-Python is actually building entire pipelines end-to-end, really ETL pipelines, namely, but also pipelines that include Chainlink oracle jobs that not just load data into Space and Time, but actually load Space and Time data on-chain, you know, quickly developing full end-to-end data pipelines, where you might be grabbing off-chain data, loading it into Space and Time, processing it, querying it, and then putting it on-chain, sending it back to your smart contract via Chainlink or via some other oracle network. That end-to-end pipeline will be really important. And we're going to be doing a lot of work to accelerate that with OpenAI. Hey, chatbot, namely, hey, ChainML, give me a pipeline that grabs bond market data, loads it into Space and Time, joins it with on-chain interest rates from Aave, and then hands that entire data set back to my smart contract. So these end-to-end data pipelines will be important mainly for data engineers and dapp developers.
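A pipeline like the one described can be sketched as a chain of stages, every one of them stubbed here (no real market data, Space and Time, or oracle APIs are called; all function and field names are illustrative, not actual APIs):

```python
# Stub stages for an end-to-end pipeline: fetch off-chain data, join it
# with on-chain data, and hand the result back toward a smart contract.

def fetch_bond_market_data():
    """Stand-in for pulling off-chain bond market data from a TradFi source."""
    return [{"date": "2024-01-02", "bond_yield": 4.1}]

def fetch_onchain_rates():
    """Stand-in for querying indexed on-chain rates (e.g. Aave) from the database."""
    return [{"date": "2024-01-02", "aave_rate": 3.2}]

def join_on_date(offchain, onchain):
    """Merge the two datasets row-by-row on their shared date key."""
    rates = {row["date"]: row for row in onchain}
    return [{**row, **rates[row["date"]]}
            for row in offchain if row["date"] in rates]

def submit_to_contract(rows):
    """Stand-in for an oracle push (e.g. a Chainlink job) back on-chain."""
    return f"submitted {len(rows)} rows"

result = submit_to_contract(join_on_date(fetch_bond_market_data(),
                                         fetch_onchain_rates()))
print(result)  # → submitted 1 rows
```

The value of generating such a pipeline from a prompt is that each stage is a small, swappable unit, which is the same decomposition the agent framework already uses for routing.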
And then on the other end, for business analysts, it'll be really important to actually generate entire dashboards, not just a SQL result, not just a query, but basically a whole analytic suite around a certain topic. So you know, hey, chatbot, hey, Houston, and more importantly, hey, ChainML, give me back a dashboard around liquidations on Aave or give me a dashboard around Chainlink OCR payouts to node operators or give me a dashboard around NFT wallet activity today from this specific NFT project. I'm on Twitter, I see that a certain NFT mint is really popping. I could ask questions in the chatbot, build myself a little dashboard so I can understand on-chain activity live right now around my NFT mint. And that sort of very fast time to value to get from here's the address of my mint, what can you quickly show me about my mint, active wallets, which wallets are trading the most, floor prices, you know, generating an entire dashboard quickly, will be the next frontier for us.
Ron: Yeah, super exciting. And we also think that raising the bar on more powerful agents, whether it's automating full pipelines for data engineers or generating dashboards to drive the next level of automation and sophistication for analysts, is a super exciting direction. And it's very much aligned with our roadmap and where we want to keep building out our engine. So certainly, it's been great collaborating so far, super excited to have Houston live and in the hands of customers, and excited to launch the engine for more people to try it out.
Scott: I completely agree. It's kind of the perfect intersection of both of our projects' focuses, with ChainML so focused on, you know, being an intent engine, a conversational engine for routing requests to large language models, and Space and Time having really a comprehensive copy of what's on-chain. We collect all the data, we organize the data, we make it available for processing. ChainML generates the actual code that would process that data. So, it's been a blast working with your team. Very impressed by what your team is doing. And it's really accelerating the quality of our user experience quite a bit.
Ron: Thanks, guys. It's been great talking to you today.