
Dialogue with 0G Labs: The road to DA's endgame and a new era of on-chain AI

BlockBeats · 2024/06/13 10:43
By: BlockBeats
Translator's note: In March this year, 0G Labs closed a $35 million pre-seed round led by Hack VC. 0G Labs aims to build the first modular AI chain, helping developers launch AI dApps on a high-performance, programmable data availability layer. Through innovative system design, 0G Labs aims to achieve on-chain data throughput on the order of gigabytes per second, enough to support high-performance workloads such as AI model training.


In the fourth episode of the DealFlow podcast, BSCN Editor-in-Chief Jonny Huang, MH Ventures General Partner Kamran Iqbal, and Animoca Brands Head of Investment and Strategic Partnerships Mehdi Farooq interviewed Michael Heinrich, co-founder and CEO of 0G Labs. Michael shared his personal trajectory: software engineer at Microsoft and SAP Labs, founder of Garten, a Web2 company valued at over $1 billion, and now full-time at 0G, building a modular AI technology stack on the blockchain. The discussion covered the current state and vision of DA, the advantages of modularity, team management, and the two-way dependence between Web3 and AI. Looking ahead, he stressed that AI will become mainstream and bring sweeping social change, and that Web3 needs to keep up with this trend.



The following is the text of the interview:


Web2 unicorn leader starts a new business


Jonny: Today we are going to dive into an important topic - data availability (DA), especially data availability in the field of crypto AI. Michael, your company has a big say in this field. Before we go into the details, please briefly introduce your professional background and how you got into this niche field.


Michael: I started as a software engineer and technical product manager at Microsoft and SAP Labs, working on cutting-edge technology on the Visual Studio team. Later I moved to the business side and spent a few years at Bain. I then moved to Connecticut to join Bridgewater Associates, where I worked on portfolio construction, reviewing about $60 billion in transactions every day and getting to know many risk indicators; for example, we looked at CDS rates to assess counterparty risk. That experience gave me a deep understanding of traditional finance.


After that, I returned to Stanford for graduate school and founded my first Web2 company, Garten. At its peak, the company had 650 employees, $100 million in annual revenue, and roughly $130 million in total financing, making it a unicorn valued at over $1 billion and a standout Y Combinator-incubated project.


At the end of 2022, my Stanford classmate Thomas contacted me. He mentioned that he had invested in Conflux five years earlier and believed Ming Wu and Fan Long were the best engineers he had ever funded, and he suggested the four of us get together to see whether any sparks would fly. After six months of working together, I came to the same conclusion. I thought to myself, "Wow, Ming and Fan are the best engineers and computer scientists I have ever worked with. We must start a business together." I stepped back to become chairman of Garten and devoted myself full-time to 0G.


Four co-founders of 0G Labs, from left to right: Fan Long, Thomas Yao, Michael Heinrich, Ming Wu


The current state, challenges and ultimate goals of DA


Jonny: This is one of the best founder introductions I have ever heard, and I guess your VC financing process must have been smooth. Before I delve into the topic of data availability, I would like to discuss the current state of DA. Although there are some well-known players, how do you assess the landscape of DA at present?


Michael: DA today comes from a variety of sources, depending on the blockchain. For example, before the Danksharding upgrade, Ethereum's DA throughput was about 0.08 MB per second. Then Celestia, EigenDA, and Avail entered the market, with throughputs typically between 1.2 and 10 MB per second. The problem is that this is nowhere near enough for AI applications or any on-chain gaming application. We need to talk about gigabytes per second of DA, not megabytes. For example, if you want to train an AI model on-chain, you actually need 50 to 100 GB per second of data transfer. That is an orders-of-magnitude difference. We saw this opportunity and asked how to create that breakthrough, so that large-scale Web2 applications can be built on-chain at the same performance and cost. That is the huge gap we see in the field.

In addition, there are issues that have not been fully thought through. We believe data availability is a combination of data publishing and data storage. Our core insight is to split data into these two lanes to avoid the broadcast bottleneck in the system, thereby achieving a breakthrough improvement in performance.
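To make the gap concrete, here is a rough back-of-envelope calculation in Python using the throughput figures quoted above; the 100 GB payload is an illustrative assumption, not a number from the interview.

```python
# Back-of-envelope: how long does it take to move data at the DA rates
# Michael cites? The rates come from the interview; the payload is hypothetical.

RATES_MB_PER_S = {
    "Ethereum (pre-Danksharding)": 0.08,
    "Celestia/EigenDA/Avail (low end)": 1.2,
    "Celestia/EigenDA/Avail (high end)": 10.0,
    "0G target (low end)": 50 * 1024,  # 50 GB/s expressed in MB/s
}

PAYLOAD_GB = 100  # hypothetical: streaming 100 GB of training data once

for name, rate in RATES_MB_PER_S.items():
    seconds = PAYLOAD_GB * 1024 / rate
    print(f"{name:36s} {seconds:>12,.0f} s  (~{seconds / 3600:,.1f} h)")
```

At 0.08 MB per second the hypothetical payload takes roughly two weeks to move; at 50 GB per second it takes about two seconds, which is the scale of the gap being described.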



An additional storage network lets you do a lot of things: model storage, storage of training data for specific use cases, even programmability. You can do full state management and decide where data is stored, how long it is stored, and how much security you need. So use cases that are genuinely needed across many fields now become possible.


The current state of DA is that we have made significant progress, from 0.08 MB per second to around 1.4 MB per second, and transaction costs have indeed come down, by as much as 99% in some cases. But that is not enough for the real needs of the future: high-performance AI applications, on-chain games, and high-frequency DeFi all require much higher throughput.


Mehdi: I have two fundamental questions. The first is about storage. You mentioned L2 transaction history, and even the history of AI models. How long do we need to store that data? The second question: there are already decentralized storage networks like Arweave and Filecoin; do you think they can help increase throughput? I don't mean data publishing, just storage.


Michael: How long data is stored depends on its purpose. If you are considering disaster recovery, the data should be stored permanently so the state can always be reconstructed. For optimistic rollups with fraud-proof windows, at least 7 days of storage is needed so the state can be reconstructed when challenged. For other types of rollups, the storage period may be shorter. The specifics vary, but that is the general picture.
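As a minimal sketch of that retention logic (the function name, purpose labels, and the one-day default are hypothetical; only the 7-day fraud-proof window comes from the interview):

```python
from datetime import timedelta
from typing import Optional

def retention_for(purpose: str) -> Optional[timedelta]:
    """Return how long DA data should be kept; None means keep permanently."""
    if purpose == "disaster_recovery":
        return None                   # permanent, so state can always be rebuilt
    if purpose == "optimistic_rollup":
        return timedelta(days=7)      # must cover the fraud-proof window
    return timedelta(days=1)          # hypothetical shorter default for other rollups
```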


As for other storage platforms, we chose to build our storage system in-house because Arweave and Filecoin are designed more for log-type storage, that is, long-term cold storage. They are not designed for very fast writes and reads, which are critical for AI applications and for structured-data applications that need key-value or transactional storage. Fast access of that kind enables real-time processing and even decentralized Google Docs-style applications.
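A simplified, hypothetical contrast between the two access patterns being described, log-style cold storage versus key-value storage with fast overwrites:

```python
from typing import Optional

class LogStore:
    """Append-only, Arweave/Filecoin-style: writes accumulate, reads replay."""
    def __init__(self):
        self.entries: list[bytes] = []

    def append(self, blob: bytes) -> int:
        self.entries.append(blob)
        return len(self.entries) - 1  # position in the log

class KVStore:
    """Fast point reads and in-place overwrites, the access pattern cited
    for AI and structured-data apps (e.g. a decentralized Google Docs)."""
    def __init__(self):
        self.data: dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self.data[key] = value        # overwriting is cheap, unlike in a log

    def get(self, key: str) -> Optional[bytes]:
        return self.data.get(key)
```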


Jonny: You made a very clear statement about why DA is needed and why existing decentralized storage solutions are not suitable for this specific scenario. Can you discuss the ultimate goal of data availability?


Michael: The ultimate goal is easy to define: we want performance and cost comparable to Web2, making it possible to build anything on-chain, especially AI applications. It is very straightforward. Just as AWS has compute and storage, with S3 as a key component, data availability, despite its different characteristics, is a key component here. Our goal is to build a modular AI technology stack in which the data availability part includes not only data publishing but also a storage component, with the two tied together by the consensus network. We let the consensus network handle data availability sampling, and once consensus is reached, we can prove it on an underlying Layer 1 such as Ethereum. Ultimately, we want an on-chain system that can run any high-performance application, even on-chain training of AI models.
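A hedged sketch of that flow, with every name hypothetical: the payload is split into a publishing lane (a small commitment) and a storage lane (the full data), the consensus network samples for availability, and the result is proven on an underlying Layer 1:

```python
import hashlib

STORAGE_NETWORK: dict[str, bytes] = {}  # stand-in for the storage lane

def consensus_sample(commitment: str) -> bool:
    # placeholder: a real DA network would sample erasure-coded chunks
    # from storage nodes rather than check a local dict
    return commitment in STORAGE_NETWORK

def prove_on_l1(commitment: str) -> None:
    # placeholder for posting an availability proof to Ethereum or another L1
    print(f"availability of {commitment[:8]}... proven on L1")

def submit(data: bytes) -> str:
    commitment = hashlib.sha256(data).hexdigest()  # publishing lane: small commitment only
    STORAGE_NETWORK[commitment] = data             # storage lane: full payload
    if consensus_sample(commitment):               # consensus network runs DA sampling
        prove_on_l1(commitment)
    return commitment

submit(b"model checkpoint bytes")
```

Keeping the full payload out of the publishing lane is what avoids the broadcast bottleneck mentioned earlier; only the small commitment needs to reach every node.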


Kamran: Can you elaborate a little bit more on your target market? Besides AI and those building AI applications on blockchain, what projects do you expect to use 0G?


Michael: You already mentioned an application area. We are trying to build the largest decentralized AI community and expect a large number of projects to build on top of us. Whether it's Pond building large graph models, or Fraction AI or PublicAI doing decentralized data annotation or cleaning, or even execution layer projects like Allora, Talus Network or Ritual, we are trying to build the largest community for AI builders. This is the baseline for us.


But really, any high-performance application can be built on top of us. Take on-chain gaming as an example: 5,000 users need 16 MB per second of data availability to keep the full game state on-chain uncompressed. No DA layer can do that right now. Maybe Solana could, but that is outside the Ethereum ecosystem and support is limited. So applications like that are also very interesting to us, especially combined with on-chain AI agents such as NPCs; there is a lot of potential for crossover applications there.
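Unpacking that figure with a quick calculation (the per-player split is our arithmetic, not a number stated in the interview):

```python
# 16 MB/s of uncompressed DA shared across 5,000 concurrent players
players = 5_000
total_mb_per_s = 16
per_player_kb_per_s = total_mb_per_s * 1024 / players
print(f"~{per_player_kb_per_s:.1f} KB/s of state per player")  # ~3.3 KB/s
```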


High-frequency DeFi is another example, as are future fully homomorphic encryption (FHE) applications and data marketplaces. All of these demand very large data throughput and a DA layer that can genuinely sustain high performance. So any high-performance dApp or Layer 2 can be built on top of us.


Advantages of modularity: flexible choice


Mehdi: You are working to improve scalability and throughput, and the storage component addresses the problem of state bloat. Why not just launch a complete Layer 1? If you are capable of technological breakthroughs, why take a modular approach instead of creating a Layer 1 with its own virtual machine? What is the logic behind a modular stack?


Michael: Fundamentally, we are a Layer 1 at the base, but we firmly believe that modularity is how applications will be built in the future. Being modular, we also do not rule out providing an execution environment optimized specifically for AI applications later on. We have not fully settled the roadmap there, but it is possible.


The core of modularity is choice. You can choose the settlement layer, the execution environment, and the DA layer, and depending on the use case, developers can pick the best combination. Just as in Web2, TCP/IP succeeded because it was inherently modular and developers were free to use different parts of it. We want to give developers more choice and let them assemble the environment best suited to their application type.
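A toy illustration of that choice, using a hypothetical config type rather than any real 0G API; the example layer names are drawn from elsewhere in the conversation:

```python
from dataclasses import dataclass

@dataclass
class ChainConfig:
    settlement: str   # e.g. "Ethereum"
    execution: str    # e.g. "EVM", "MoveVM", "WASM"
    da_layer: str     # e.g. "0G", "Celestia", "EigenDA"

# Two apps on the same stack making different trade-offs
game_chain = ChainConfig(settlement="Ethereum", execution="WASM", da_layer="0G")
defi_chain = ChainConfig(settlement="Ethereum", execution="EVM", da_layer="0G")
```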


Mehdi: If you were to choose a virtual machine now, which virtual machine on the market would be the best for the application you are considering or working on?


Michael: I have a very pragmatic view on this. If you want to attract more Web2 developers into Web3, it should be some type of WASM virtual machine, so applications can be built in the most common programming languages such as JavaScript or Python, even though those languages are not necessarily the best fit for on-chain development.


Move VM is designed very well in terms of objects and throughput. If you are looking for high performance, this is an option worth paying attention to. If you think about the battle-tested virtual machine, it would be the EVM because there are a lot of Solidity developers. So the choice depends on the specific use case.


Prioritization and community building


Jonny: I want to hear about the biggest obstacles you faced, or has it all been smooth sailing? With a business as large as yours, I can't imagine it has always been smooth.


Michael: Yeah, I don't think it's smooth sailing for any startup; there will always be challenges. From my perspective, the biggest challenge was making sure we could keep up the pace, because we had to do several things very well and make some trade-offs to get to market quickly.


For example, we wanted to launch with a custom consensus mechanism, but that would have pushed the launch out by four to five months. So we decided to use an off-the-shelf consensus mechanism in the first phase, to build a strong proof of concept and achieve part of the end goal, such as 50 GB per second per consensus layer. In the second phase we will introduce a horizontally scalable consensus layer to achieve effectively unlimited DA throughput. Just like flipping a switch to spin up another AWS server, we can add consensus layers and increase overall DA throughput.
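A small sketch of the scaling claim, assuming the 50 GB per second per-layer figure from the interview and purely linear growth as layers are added:

```python
PER_LAYER_GB_PER_S = 50  # phase-one per-layer target cited in the interview

def total_da_throughput(layers: int) -> int:
    # assumption: each added consensus layer contributes independently
    return layers * PER_LAYER_GB_PER_S

for n in (1, 2, 4, 8):
    print(f"{n} consensus layer(s): {total_da_throughput(n)} GB/s")
```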


Another challenge is making sure we can attract top talent. We have a strong team, including Informatics Olympiad gold medalists and top computer science PhDs, so the marketing team and new developers we hire need to match that bar.


Jonny: It sounds like the biggest hurdle you face right now is prioritization, right? Accepting that you can't do everything in a short period, and that some trade-offs have to be made. What do you think about competition? I'm guessing neither Celestia nor EigenDA is a serious threat for your specific use cases.


Michael: In Web3, competition is largely determined by community. We have a strong community around high performance and AI builders, while Celestia and EigenDA probably have more general-purpose communities. EigenDA may be more focused on bringing economic security and building AVSs on EigenLayer, while Celestia cares more about which Layer 2s want to cut their transaction costs, and there aren't many high-throughput applications there. For example, building high-frequency DeFi on Celestia is very challenging, because you need multiple megabytes per second of throughput, which would completely clog the Celestia network.


From that perspective, we really don't feel threatened. We are building a very strong community, and even if someone else comes along, we already have the network effect of developers and market share, and hopefully more money will follow. So, the best defense is our network effect.


The two-way dependence between Web3 and AI


Jonny: You chose AI as your main focus, but why does Web3 need to host AI within its ecosystem? Conversely, why does AI need Web3? This is a two-way question, and the answer to both questions is not necessarily yes.


Michael: Of course Web3 without AI is possible. But I think that in the next 5 to 10 years, every company will become an AI company, because AI will bring changes as big as the Internet did. Do we really want Web3 to miss that opportunity? I don't think so. According to McKinsey, AI will unlock trillions of dollars in economic value, and 70% of jobs could be automated by AI. So why not take advantage of it? Web3 is possible without AI, but with AI the future is even better. We believe that in the next 5 to 10 years, most participants on blockchains will be AI agents performing tasks and transactions for you. That will be a very exciting world, with a huge range of automated, AI-driven services tailored to users.


In turn, I think AI absolutely needs Web3. Our mission is to make AI a public good, and that is fundamentally a question of incentives. How do you ensure AI models don't cheat, and how do you ensure they make decisions in the best interest of humanity? Alignment can be broken down into incentive, verification, and security components, each of which is well suited to a blockchain environment. Blockchain can help with financialization and incentives through tokens, creating an environment in which AI is financially disincentivized from cheating, and all transaction history lives on-chain. To make a bold statement: fundamentally, everything from training data to the data cleaning, ingestion, and collection components should be on-chain, so there is complete traceability of who provided the data and what decisions the AI model then made.


Looking ahead 5 to 10 years, if AI systems are managing logistics, administration, and manufacturing, I would want to know the model version and its decisions, and to have oversight of models that exceed human intelligence to ensure they remain aligned with human interests. As for putting AI in a black box where it could cheat and act against humanity's interests: I'm not sure we can trust a handful of companies to consistently guarantee the security and integrity of such a system, especially given the superpowers AI models may have in the next 5 to 10 years.


Kamran: We all know that the crypto space is full of narratives, and you are so focused on the AI field. Do you think this will be a hindrance for you in the long run? As you said, your technology stack will be far superior to what we have seen now. Do you think the narrative and nomenclature around AI itself will hinder your development in the future?


Michael: We don't think so. We firmly believe that in the future every company will be an AI company; there will be almost no company that doesn't use AI in some form in its application or platform. From that perspective, every time a new GPT version launches, say with trillions of parameters, it unlocks capabilities that weren't available before and reaches a new level of performance. I think the excitement will continue, because this is a completely new paradigm: for the first time, we can tell computers what to do in human language. In some cases you gain capabilities beyond the average person and can automate processes that were previously impossible; some companies, for example, have almost completely automated their sales development and customer support. With the release of GPT-5, GPT-6, and so on, AI models will become even smarter. We need to make sure Web3 keeps up with this trend and builds its own open-source versions.


AI agents will run parts of society in the future, and it is critical to ensure that they are governed by the blockchain in an appropriate way. In 10 to 20 years, AI will definitely be mainstream and bring about huge social changes. Just look at Tesla's fully autonomous driving mode to see that the future is becoming a reality day by day. Robots will also enter our lives and provide us with a lot of support. We are basically living in a science fiction movie.




Welcome to join the official BlockBeats communities:

Telegram subscription group: https://t.me/theblockbeats

Telegram chat group: https://t.me/BlockBeats_App

Official Twitter account: https://twitter.com/BlockBeatsAsia


