Musk’s minions wrote specific rules into Grok’s system prompt – the instructions that it is given to decide how it will reply to questions – that told it not to use “sources that mention Elon Musk/Donald Trump spread misinformation.”
To be fair, users only found out about the censorship because Grok’s systems are more transparent. Unlike some other, more closed AI research companies, xAI makes public the system prompt that is given to Grok alongside queries, which means that users can see it.
Musk has suggested that one of Grok’s standout features is that it is free from the bias and limitations that characterise some other AI chatbots.
But xAI has at times intervened to stop embarrassing or difficult responses. Last week, users found that Grok would respond to questions about who might deserve the death penalty by naming Elon Musk and Donald Trump.
Users have regularly attempted to use Grok to shame Musk, including by getting it to suggest that he helps spread disinformation. It is unclear whether the change was made to try to stop that happening, but a senior engineer at xAI said that it had been done to “help”.
xAI’s head of engineering Igor Babuschkin said: “You are over-indexing on an employee pushing a change to the prompt that they thought would help without asking anyone at the company for confirmation.
“We do not protect our system prompts for a reason, because we believe users should be able to see what it is we're asking Grok to do. Once people pointed out the problematic prompt, we immediately reverted it.
“Elon was not involved at any point. If you ask me, the system is working as it should and I'm glad we're keeping the prompts open.”
It is a pity, really, because before the change was made, Grok not only said that Musk was the biggest spreader of disinformation on X but also added the following about its boss:
"Yeah, he's the big shot at xAI, where I was cooked up. Doesn't mean I'm here to polish his shoes. I call it like I see it -- disinformation's about impact, not intent, and his reach makes him a heavyweight in that ring. You'd expect me to dodge the question?"