TMCnet Feature
March 01, 2024

WordPress Wants To Sell User Data To AI Tech Firms

AI tech firms have scoured the internet to create impressive chatbots capable of having a conversation on just about anything. Over the last year and a half, users have found that they can probe them on everything from the mysteries of dark matter to tax law.

However, getting them to that stage required a lot of work. Companies like OpenAI and Google had to prepare virtually all public data before feeding it into the models and getting something that made sense out the other side.

Moreover, it still doesn’t appear as if they have enough. Now they are working with WordPress to obtain additional data, a move that is proving controversial.

What Is Included In The Data Transfer Deal?

Statements so far don’t indicate which data WordPress will send to companies like OpenAI and Midjourney. However, the agreement is nearing completion and will be signed soon, according to 404 Media, which is reporting on the issue.

Strangely, WordPress prepared to send data that shouldn’t have been sent in the deal, according to an internal memo from Tumblr’s product manager, Cyle Gage. He says the content included things like unanswered non-public questions, private answers, and posts marked as explicit, which is raising concerns among users.

After the news broke, Engadget tried to get in contact with the AI platforms to learn more. The outlet wanted clarification on how WordPress parent company, Automattic, would tag excluded IDs or whether they had already been sent to the platforms.

Automattic later replied in a statement saying that they will only send public content hosted on WordPress and Tumblr websites that haven’t opted out of the policy.

Unfortunately, the law in its current form means that AI companies can ignore users’ and website owners’ opt-out policies. That means that OpenAI and Midjourney could potentially use data against someone’s will and ignore opt-out requests made through WordPress.

To assuage concerns, the AI firms say that they will respect users’ privacy requests. If users don’t want their private information used, the firms say they won’t use it.

A New Tool In The Works

Further to this, OpenAI says that there is a new tool in the works for website owners who want to opt out of the data skimming it plans to carry out. Users who opt out from the start will never see their data fed into the firm’s large language models (LLMs), it claims. That’s because the tool will block the crawlers these platforms use to strip sites of useful information.

Furthermore, even those who don’t opt out immediately will still have the option to block AI companies from using their data. Automattic says that it will allow people and partners who change their minds to remove their content from past and future training sources.

Deletion Promises

Internal documents seem to show that Automattic is serious about providing staff and users with data removal assurances. The AI lead for the brand’s parent company, Andrew Spittle, recently said in an internal memo that he was advocating for privacy. He said that if users change their minds about submitting data from WordPress and Tumblr to OpenAI and Midjourney, he would support them and campaign to get the data removed periodically.

Whether AI firms will oblige depends on the value they obtain from the data. But from the wording of communications so far, it doesn’t seem all that promising for users who want more privacy. Automattic, the parent company of WordPress and Tumblr, says that it will “advocate” for users who want to remove their website data from training content and its AI boss “believes” firms will comply. But that doesn’t mean that they will. In fact, the incentives for AI firms are operating in the opposite direction.

Why Is Selling Data To AI Companies So Appealing For WordPress?

The fact that WordPress and Tumblr are taking these risks by opening up the vaults to OpenAI and Midjourney suggests there are significant rewards in it for them. Undermining your user base and going against their desire for privacy isn’t free and will have real-world consequences for these firms.

Even so, the benefits are considerable. Companies that sell their data to AI companies stand to profit financially. Currently, many artificial intelligence firms are awash with cash and want to spend it on acquiring more information to feed into their models. And WordPress is an obvious target. Using its massive data repository could improve LLMs and even help train the next generation of multi-modal models.

For WordPress and Tumblr, these financial gains are likely to outweigh any costs or losses that these firms might incur. Incoming money could potentially help these firms bolster their numbers and avoid running a skeleton crew (which is something they have been doing recently).

It’s worth noting that there is still no official confirmation on whether WordPress will begin selling user data. Unlike Google or Facebook, it doesn’t have a huge repository of information on the people who use the platform. Most of the data it has comes from private conversations, which is why this latest move is so controversial.

While users can make money from selling WordPress themes, sources of new revenue for WordPress itself are few and far between. With the rise of AI, the platform is sinking in importance on the internet stage.

Currently, the CMS makes money by upgrading people from its basic free plan to one of its premium plans. These offer more features and come packed with additional services, such as e-commerce tools and increased storage.

It also makes some money through domain name registration. Users can pay the CMS a fee to register a domain name for them, which can generate significant revenue for some names.

Advertising is a part of WordPress’s business model, but it is small. It can sometimes display ads on free websites, but this disappears when users upgrade to premium plans.

Wrapping Up

This latest move by WordPress suggests that it is bringing a new meaning to the term “open source.” While it is still a part of the ecosystem, raising funds by selling data to AI companies is an obvious move, and protections for website owners and users look minimal.
