OpenAI be like
Both companies deserve ruin because all genAI deserves to be wiped off the face of the planet.
Oh no, statistical modeling about published works allows weird new shit. We must ban this entire class of software because we all care so deeply about copyright.
Apparently Hunyuan just released some big-ass video model, and it's air-quotes "open source" with a bunch of finger-wag restrictions. One of them is 'you may not train your thing on our thing.'
Yeah I'm sure the companies that shrug off copyright concerns for Disney movies give a shit about Tencent's pre-laundered intellectual property.
Okay, help me out here. I've heard people talking about open source AI models, and it always seems like open source needs big-ass air quotes. Are there any open source models that are actually open source in the way people generally think of the term?
Here's a list of open source models: open-llms
Models are only open source if the weights are freely available along with the code used to generate them.
I would argue that to be truly open source, the training data needs to be available as well.
I really appreciate that! I was asking more for the information of it, I doubt I could do anything with the link. Lol. I don't understand thing 1 about this stuff. I don't even know wtf a weight is in this context lol
And as I understand it these Chinese "open source" models are only the weights? No way to "compile" your own version.
The closest one to true FOSS that I'm aware of is Apertus. Not sure whether it's feasible to build anything meaningful from scratch without your own GPU farm though.
Do such models exist? Yes. Are they the big-boy models anyone's really using? Ehhh not really.
There are in-use models that are "here's a thing do whatever good luck," which is at least as open-source as any MIT project. (Permissive licenses being "here is the code, have a nice life.") Very few models are properly reproducible, because even when their training data includes DVDs you probably own, it also includes a ton of random internet pages that maybe don't exist anymore. The push for ever-larger models, trained on as much stuff as possible, makes the use of "open source" a regrettable or even deceptive choice. But quite a few are unrestricted for whatever weird shit you want to get up to.
I mean, you could give the randomizer seed along with the code for training. I guess that would kinda count?
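For what it's worth, here's a toy sketch of what the seed does and doesn't buy you. This is purely my own illustration, not anybody's real training code: `train`, the two-number "model" (`w` and `b` are the weights here), and the made-up data are all assumptions, and real training runs have other sources of nondeterminism (GPU kernels, parallelism) that this ignores.

```python
import random

def train(data, seed, epochs=20, lr=0.01):
    """Fit y ~ w*x + b with plain SGD; all randomness comes from `seed`."""
    rng = random.Random(seed)
    # Random starting weights, determined entirely by the seed.
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)  # the sample order also depends on the seed
        for x, y in samples:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b

# Hypothetical "training set": points on the line y = 2x + 1.
data = [(x, 2 * x + 1) for x in range(10)]

run_a = train(data, seed=42)
run_b = train(data, seed=42)
run_c = train(data[:-1], seed=42)  # same seed, one training example missing

print("same seed, same data, same weights:", run_a == run_b)
print("same seed, missing data, same weights:", run_a == run_c)
```

So the seed pins down the randomness, but you only get the same weights back if you also have the exact same training data, which is the part that's usually missing.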
Rules for thee.