Photo Credit: Pexels/Tracy Le Blanc
Tumblr and WordPress let users post public and private content
Automattic, the company that owns Tumblr and WordPress, is reportedly preparing a deal with OpenAI and Midjourney to sell user-generated content that would be used to help train AI. While the details of the deal and the data-sharing practices remain unclear at the moment, the report has raised questions about data privacy and the ethics of companies sharing their users' data with third parties.
Internal communications by Automattic employees, viewed by 404 Media, both confirmed the deal with the AI companies and revealed details of these practices. In its report, the publication said that Automattic's deal with OpenAI and Midjourney could be announced soon. Further, it appears that data compilation for the AI firms has already begun. An internal post by product manager Cyle Gage suggested that all of Tumblr's public post content between 2014 and 2023 had been compiled.
The report also highlights a specific message suggesting that private and deleted content was automatically compiled alongside the public data. It was not clear whether that set of data had already been shared with the AI firms. Since such an incident could put the private information of the entire user base in jeopardy, it also raises questions about the company's ethics policy and data safety infrastructure.
Automattic on Tuesday issued a statement saying, “AI is rapidly transforming nearly every aspect of our world, including the way we create and consume content. At Automattic, we've always believed in a free and open web and individual choice. Like other tech companies, we're closely following these advancements, including how to work with AI companies in a way that respects our users' preferences.”
The post detailed several measures the company is taking for its users, including blocking AI platform crawlers, a setting to discourage search engines from indexing a site on WordPress and Tumblr, and an assurance of an opt-out setting for users who do not wish to share data with third parties. “Currently, no law exists that requires crawlers to follow these preferences,” the post stated.
The mechanism to opt out of data sharing is also somewhat unclear. While the company stated in the post that the AI firms will respect the opt-out settings and even remove past content from users who have newly opted out, the report claims the reality is more complicated.
The report found an internal document from February 23 in which an employee asked whether the company had any assurance that the data partners would respect the opt-out decisions made by users. Andrew Spittle, Automattic's Head of AI, reportedly replied, “We will ask that content be deleted and removed from any future training runs. I believe partners will honor this based on our conversations with them to this point. I don't think they gain much overall by retaining it.”
According to the report, the response was vague and did not confirm whether Automattic had an agreement with its partners to that effect. Further, the entire line of reasoning appears to rest on the assumption that AI firms will not gain much by retaining the data. It should be noted that the practice of third-party data sharing is not new, and most social media platforms hold rights to the user-generated public content on their platforms. However, making such deals without disclosing them to users could potentially expose private information to companies that are using the same data to train AI systems.