Microsoft CEO of AI: Online content is 'freeware' for models • The Register

15 comments
  • 🤖 I'm a bot that provides automatic summaries for articles:

    Mustafa Suleyman, the CEO of Microsoft AI, said this week that machine-learning companies can scrape most content published online and use it to train neural networks because it's essentially "freeware."

    Shortly afterwards the Center for Investigative Reporting sued OpenAI and its largest investor Microsoft "for using the nonprofit news organization’s content without permission or offering compensation."

    Also, in 2022, several unidentified developers sued OpenAI and GitHub based on claims that the organizations used publicly posted programming code to train generative models in violation of software licensing terms.

    Most people posting content online as individuals will have compromised their rights in some way by accepting the Terms of Service agreements offered by major social media platforms.

    The fact that OpenAI and others making AI models are striking content deals with major publishers shows that a strong brand, deep pockets, and a legal team can bring large technology operations to the negotiating table.

    People will stop making work available online, they predict, if it just gets used to power AI models that reduce the marginal cost of content creation to zero and deprive creators of the possibility of any reward.

    Saved 74% of original text.

  • Using the term "freeware" is silly, but consider this:

    Is the act of reading/watching something equivalent to making a copy? Freedom of thought is an agreement much older than the 1990s; it has nothing to do with copyright, and everything to do with secrecy. If something is made public, then it isn't secret, so obviously anyone can read/watch it, be it with a wetware neural network or an AI neural network. Making an exact copy is either plagiarism or copyright infringement... but abstracting a style, then applying it to some other data, is "inspiration".

    Imagine a website with a licensing disclaimer like "you are allowed to read the content, but not to comprehend or express any thoughts based on it". Nonsense, right?
