The use of the LLM directly violates OpenAI's terms of service, which state that its model output can't be used 'to develop any artificial intelligence models that compete with its products and services.'
The Verge has found internal ByteDance documents showing that the company has relied on the OpenAI API to develop its foundational LLM, codenamed Project Seed, during nearly every development phase, including for training and evaluating the model.
Employees involved are well aware of the implications; I've seen conversations on Lark, ByteDance's internal communication platform, about how to "whitewash" the evidence through "data desensitisation."
The misuse is so rampant that Project Seed employees regularly hit their maximum allowance for API access.
Most of the company's GPT usage has been done through Microsoft's Azure program, which has the same policy as OpenAI.
OpenAI said that it has suspended ByteDance's account: "All API customers must adhere to our usage policies to ensure that our technology is used for good. While ByteDance's use of our API was minimal, we suspended their account while we further investigated. If we discover that their usage doesn't follow these policies, we will ask them to make necessary changes or terminate their account."