
Gata releases ChatGPT-RealUser-2.2M, a large-scale dataset of real-user ChatGPT conversations from around the world

Source: ChainCatcher
According to ChainCatcher, decentralized AI infrastructure company Gata has announced the launch of ChatGPT-RealUser-2.2M, a large-scale dataset of ChatGPT conversations from real users around the world. The dataset was collected through Gata's GPT-to-Earn program, in which users participate voluntarily. Over 2024 and 2025 it gathered more than 2.24 million real conversations and nearly 3.56 million Q&A exchanges from more than 15,000 real users, covering interactions with GPT-3.5, GPT-4 and o1. The dataset is reportedly about twice the size of the comparable dataset previously released by the Allen Institute for AI. It covers real-world scenarios and multi-turn dialogues, and because of the on-chain incentive mechanism it includes a large number of cryptocurrency-related interactions. A preview version with 600 conversation samples is available on Hugging Face, and the complete dataset is available for research and commercial applications. In May 2025, Gata reportedly announced that it had completed a seed round totaling US$4 million, with investors including YZi Labs and IDG Blockchain.