Researchers Scrape 2 Billion Discord Messages and Publish Them Online
Data Availability The dataset presented in this study has been made publicly available and can be accessed via DOI: 10.5281/zenodo.146585059. The data is provided in a compressed format, which can be decompressed for analysis. Detailed instructions for accessing and utilizing the dataset are provided in this article and the platform
Ok, and in regular English?
Well, a DOI (Digital Object Identifier) is a persistent identifier for papers, datasets and other references for sciency stuff. But that DOI just points to the actual paper: https://www.arxiv.org/pdf/2502.00627
Link to where the archive is: https://zenodo.org/records/15170676 but it's been restricted from downloading.
Note: Download access has been temporarily suspended at the request of the ICWSM program chairs.
EDIT: lol, I love the Internet Archive. It's 120 GiB if anyone wants to try downloading it and see if it works.
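On the DOI point above: a DOI is normally turned into a clickable link by putting it after the https://doi.org/ resolver. Here is a minimal Python sketch of that, assuming you only want to build the URL and not fetch it. The function name `doi_to_url` is made up for illustration, and the code makes no claim about where the DOI quoted from the paper actually ends up.

```python
from urllib.parse import quote

# Public DOI resolver; a DOI appended to this prefix redirects to its target.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Build a resolver URL from a bare DOI or a 'DOI: ...' style string.

    Hypothetical helper for illustration only; it does no network access
    and does not check that the DOI is actually registered.
    """
    cleaned = doi.strip()
    # Drop an optional "doi:" / "DOI:" label in front of the identifier.
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[4:].strip()
    # Every DOI starts with the "10." directory indicator.
    if not cleaned.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode anything unusual, but keep the prefix/suffix slash.
    return DOI_RESOLVER + quote(cleaned, safe="/")


if __name__ == "__main__":
    # The DOI string as printed in the paper's Data Availability note.
    print(doi_to_url("DOI: 10.5281/zenodo.146585059"))
```

Pasting the resulting URL into a browser is the quickest way to check what a given DOI points to.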
The only error I can see is "the data is" should be "the data are". Stylistically I would also change utilizing to using, which conveys exactly the same meaning and is more accessible.
Otherwise I believe this is regular English. It's ok to have difficulty comprehending language, but it's more productive to ask questions about the parts you don't understand.
Can I help with any specific questions?
"The data is" is correct English in this context. I was talking more about DOI: 10.5281/zenodo.146585059.
But others have said what it is.
Anything that people write is correct really, like language is created by anyone making sounds or patterns to share meaning.
Data is a plural though. Seppo dominance, using it as a singular, is changing that even outside that miserable empire, but it's one of my hangups haha.