Twitter tests new Safety Mode

September 16, 2021

Twitter is testing a new safety feature aimed at reducing unwanted interactions. As explained in a Twitter Safety blog on 1 September 2021, Safety Mode temporarily blocks accounts (Author Accounts) found by Twitter’s artificial intelligence (AI) to be sending harmful or uninvited Tweets to a user (User). An Author Account may be automatically blocked from interacting with a User who has activated the feature if it uses potentially harmful language (such as insults) or sends repetitive, uninvited replies or mentions, unless the User already follows or frequently interacts with that account. Auto-blocked Author Accounts will be unable to follow the User’s account, see their Tweets or send them direct messages for seven days. Users can enable and disable Safety Mode and undo auto-blocks at any time through their account settings.
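Purely by way of illustration, the auto-block rule described in the blog could be sketched as below. Twitter has not published its implementation, so the names, signals and helper functions here are entirely hypothetical; the AI classification itself is assumed to happen elsewhere and is passed in as simple flags.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Hypothetical constant: auto-blocks last seven days, per the blog description.
AUTO_BLOCK_DURATION = timedelta(days=7)

@dataclass
class Account:
    handle: str
    follows: set = field(default_factory=set)            # handles this account follows
    frequent_contacts: set = field(default_factory=set)  # handles it often interacts with

def should_auto_block(user: Account, author: Account,
                      uses_harmful_language: bool,
                      sends_repetitive_uninvited_replies: bool) -> bool:
    # Accounts the User already follows or frequently interacts with are exempt.
    if author.handle in user.follows or author.handle in user.frequent_contacts:
        return False
    # Otherwise, either signal (as judged by Twitter's AI) triggers an auto-block.
    return uses_harmful_language or sends_repetitive_uninvited_replies

def auto_block_expiry(blocked_at: datetime) -> datetime:
    # While blocked, the author cannot follow the User, see their Tweets or DM them.
    return blocked_at + AUTO_BLOCK_DURATION
```

The exemption check comes first because, on the blog’s description, an existing relationship overrides the harmful-language and repetition signals entirely.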

Safety Mode is currently available only to a small feedback group of English-language Users. However, Twitter already offers other settings and features that enable all Users to manage the content displayed to or by them. They can:

  • Follow, unfollow or mute accounts, which controls whether the User sees other Users’ Tweets.
  • Block accounts, which prevents those accounts from interacting with the User at all.
  • Report accounts or Tweets that the User considers uninteresting, suspicious or spam, that contain sensitive photos or videos, that are abusive or harmful, or that express intentions of self-harm or suicide. Twitter is also currently testing a new option to report misleading information.
  • Filter notifications, mark Tweets as “show less often” (which decreases the likelihood of seeing the same types of Tweets in future) or receive warnings about sensitive content.
  • Control what others see about the User by setting: who can find the User’s account; whether the User’s Tweets are public or seen only by the User’s followers; who can tag a User in photos; whether a User’s Tweets include the User’s location; or whether the User’s own sensitive content is flagged automatically.
  • Manage the conversations a User starts by setting who can reply to the User’s Tweets. (A simple sketch of how these last two groups of settings might be modelled follows this list.)
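For illustration only, the visibility and conversation settings above could be modelled as a small configuration object. The field names, options and defaults below are our own and do not reflect Twitter’s internal data model.

```python
from dataclasses import dataclass
from enum import Enum

class ReplyAudience(Enum):
    # Hypothetical labels for who may reply to Tweets the User starts.
    EVERYONE = "everyone"
    FOLLOWED_ACCOUNTS = "followed_accounts"
    MENTIONED_ONLY = "mentioned_only"

@dataclass
class VisibilitySettings:
    # Each field mirrors one of the settings listed above; names and
    # defaults are assumptions for the purposes of this sketch.
    discoverable: bool = True               # whether others can find the User's account
    protected_tweets: bool = False          # Tweets visible to followers only
    allow_photo_tagging: bool = True        # whether others can tag the User in photos
    attach_location: bool = False           # include location in the User's Tweets
    flag_own_sensitive_media: bool = True   # auto-flag the User's own sensitive content
    reply_audience: ReplyAudience = ReplyAudience.EVERYONE
```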

As far as we know, Twitter is the first social media platform to introduce a feature like Safety Mode (not to be confused with “safe mode”, which some applications use to disable functionality when the application is not working properly so that the problem can be identified). However, it is certainly not the first or only platform using AI to manage content and interactions between Users; Twitter itself already does this through existing features like “show less often”. Most platform operators that take proactive steps to remove harmful content use AI to identify that content by flagging trigger words or graphics.
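As a rough illustration of the trigger-word approach, the sketch below flags text against a fixed keyword list. Real moderation systems rely on trained language and image classifiers rather than a static list, and the placeholder terms and function name here are purely hypothetical.

```python
# Illustrative only: production systems use trained models, not word lists.
TRIGGER_WORDS = {"example_insult", "example_slur"}  # placeholder trigger terms

def flag_for_review(tweet_text: str) -> bool:
    # Normalise each word (strip punctuation, lowercase) before checking it.
    words = {word.strip(".,!?\"'").lower() for word in tweet_text.split()}
    return not words.isdisjoint(TRIGGER_WORDS)

# Example: flag_for_review("What an example_insult!") returns True.
```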

Earlier this year we discussed regulatory initiatives on content moderation in our blog, Is social media regulation on the way? Platforms’ responses to increased obligations to moderate their content will undoubtedly involve AI. This means that they will also need to consider AI regulations (as well as data protection and other relevant regulations) when designing their solutions. AI regulation may of course affect social media platforms in a broader context, since AI-driven targeted advertising and other content is the core business model of many of these platforms.

In the EU, a proposed Artificial Intelligence Regulation is currently making its way through the legislative process. You can read more about this in our blog here.