So last week, I got inexplicably banned from Twitter for posting a video clip from Stray.
Twitter’s algorithm had classified the clip of the cat falling through the world as intimate imagery posted without consent, a rule usually used to ban revenge porn. Being my usual stubborn self, I decided to fight this and kept sending appeals asking for an explanation of how this clip managed to break those rules, even though appealing extends the ban. If you argue that the content you posted was not in fact breaking any rules, Twitter automatically keeps you banned until you delete it. You cannot both appeal something and delete the offending tweet at the same time, which makes sense to someone out there.
In this self-imposed exile, I came to find out that I was not alone in being in Twitter jail for cat crimes. Several others responded to the previous article saying that they had been banned for the same thing, most citing that specific glitch or a similar one elsewhere in the game. All of the clips were posted directly from the PlayStation 5’s share tools. At some point, even Annapurna Interactive got involved, trying to get information from Twitter about what exactly had gone wrong.
After speaking with Twitter employees on background, as well as fruitlessly trying to receive official comment from Twitter itself, it has become clear that this was not an isolated incident. For whatever reason, clips of Stray were tripping a false positive for revenge porn in Twitter’s algorithm and resulting in automatic bans.
Twitter obviously does not wish to explain to me how this happened, despite repeated inquiries. For one, it’s embarrassing to admit publicly, and it likely reveals more about the quality of their AI-based moderation tools than they wish to have known. While Twitter employees have told me that the problem has been fixed, they refuse to go any deeper than that. As a company, Twitter has refused to comment on it.
It also appears to be more than just Stray. Upon further investigation, other examples of game footage triggering automatic bans included DNF Duel, Rollerdrome, and PowerWash Simulator. Twitter had no comment about any of this, either.
While it’s comforting to know this has supposedly been fixed, it’s also impossible to know to what extent. It also raises some very pertinent questions about Twitter’s AI-based moderation tools, which largely seem to fail at picking out harassment or slurs but can misclassify videos at the drop of a hat. Given that multiple games seem to be triggering the bans, it is not even clear if this can be permanently fixed. What commonality do Stray and DNF Duel share that would stop, say, Wo Long: Fallen Dynasty from also causing people to get banned for no good reason?
Unless and until Twitter decides to talk about it openly, which seems unlikely to ever happen, we’ll probably never know.