“The agent is able to make its own choices about where to send it in the decision tree,” says Moldovan. In some cases, the final agent in the chain might send it back up the tree for additional review. “It allows humans to go through a much more manageable set of signals and interpretations,” he adds.
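The company hasn't published its code, but the pattern Moldovan describes, with each agent deciding where a signal goes next, including back up the tree for another look, can be sketched roughly like this (all names and routing rules here are hypothetical, a minimal illustration rather than the actual system):

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class Signal:
    content: str
    trail: list[str] = field(default_factory=list)  # agents that have handled it


class Agent:
    """One node in the decision tree. `decide` stands in for a model call and
    returns a child agent's name, "escalate" (send back up), or anything else
    (hand off to human review)."""

    def __init__(self, name: str, decide: Callable[[Signal], str]):
        self.name = name
        self.decide = decide
        self.parent: Optional["Agent"] = None
        self.children: dict[str, "Agent"] = {}

    def add_child(self, child: "Agent") -> "Agent":
        child.parent = self
        self.children[child.name] = child
        return child

    def route(self, signal: Signal):
        signal.trail.append(self.name)
        verdict = self.decide(signal)
        if verdict == "escalate" and self.parent:
            return self.parent.route(signal)             # back up the tree for more review
        if verdict in self.children:
            return self.children[verdict].route(signal)  # down a chosen branch
        return ("human_review", signal)                  # terminal: queue for a person


# Toy usage: a triage root that splits spam-like content off to a specialist.
root = Agent("triage", lambda s: "spam" if "free money" in s.content else "review")
root.add_child(Agent("spam", lambda s: "review"))
print(root.route(Signal("free money now"))[1].trail)  # ['triage', 'spam']
```

However the real system is built, the endpoint is the same: every path through the tree ends in a queue for a human reviewer, which is what makes the set of signals "manageable."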
To keep the system from going off the rails, several controls are in place. First, OpenAI itself provides a set of safeguards, including a moderation API. Second, the system is strictly limited in what information comes in and what it can do with it. Finally, all decisions go to humans for review.
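The first of those layers, OpenAI's moderation endpoint, is publicly documented; a minimal pre-screen using it might look like the sketch below (the hand-off functions it gates are hypothetical placeholders, not part of the company's actual pipeline):

```python
from openai import OpenAI  # pip install openai; expects OPENAI_API_KEY in the environment

client = OpenAI()

def prescreen(text: str) -> bool:
    """Run text through OpenAI's moderation endpoint before any agent sees it.
    Returns True if the content is flagged and should go straight to humans."""
    result = client.moderations.create(
        model="omni-moderation-latest",
        input=text,
    )
    return result.results[0].flagged

# Hypothetical gate in front of the agent tree:
# if prescreen(post):
#     queue_for_human_review(post)   # flagged content skips the agents entirely
# else:
#     root.route(Signal(post))       # otherwise, continue through the tree
```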
“We’re risk managers, not boundary pushers,” Moldovan says. “We use this system to properly identify a set of content that needs human review, and all final moderation decisions are human. We believe content moderation, especially on a platform like ours, requires a level of nuance we’re not yet ready to cede to robots.”