Instagram Tests a New Feature That Will Automatically Flag Offensive Captions


Tech giants such as Facebook, Google, and Twitter have been engaged in a never-ending war against potentially 'harmful' content. Today, Instagram unveiled plans to further tighten its hold on what users can post.

Instagram's AI will now go through the contents of your caption and match it against its database of 'problematic' content. If there's a match, you'll be shown a prompt telling you that the caption is potentially harmful and asking you to consider changing it.
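To make the matching step concrete, here is a minimal sketch of a caption check. It is only an illustration: Instagram has not published how its system works, and the term list, function names, and prompt wording below are all hypothetical.

```python
import re

# Hypothetical stand-in for Instagram's database of 'problematic' content.
FLAGGED_TERMS = {"stupid", "ugly"}

# Hypothetical prompt wording; not Instagram's actual text.
WARNING = "This caption may be harmful. Consider changing it."


def check_caption(caption: str) -> bool:
    """Return True if any word in the caption matches a flagged term."""
    words = re.findall(r"[a-z']+", caption.lower())
    return any(word in FLAGGED_TERMS for word in words)


def prompt_for(caption: str) -> str:
    """Return the warning prompt for a matching caption, else an empty string."""
    return WARNING if check_caption(caption) else ""
```

A plain word match like this has no sense of context, so a playful caption and a hostile one that share a word are treated the same way.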

You can either change the caption, learn more about Instagram's content policies, or post it without making any changes. Instagram says that the objective isn't to ban hate speech altogether but to make users reconsider their actions.

Furthermore, Instagram states that these prompts will serve as a warning to users when their account is at risk of breaking any rules. The company implemented a similar policy last year, which flagged potentially harmful comments and asked users to reconsider. It adds that the results of said policy have been 'promising' and that such cues go a long way in reducing toxicity on the platform.

Will AI-based content recognition help make Instagram safer?

Short answer: probably not. At the end of the day, Instagram's AI will rely upon comments that have been flagged by users. Everything ranging from seemingly innocuous jabs to racial and ethnic slurs gets flagged. Can Instagram's AI differentiate between harmless banter and actual harassment? Looking at the screenshot above, it doesn't seem so. Merely using the word 'stupid' triggers Instagram's AI, which is rather ludicrous, even for Instagram.

An AI is only as good as the data it is fed. We've all seen what can happen if you let the internet have free rein over an AI-powered bot. For example, take a look at how Microsoft's Tay bot ended up.

Instagram could potentially experience something similar. If users were to engage in a coordinated effort to flag seemingly harmless words, the AI would be thrown off and start sending warnings to unsuspecting users. It might be a lot harder to pull off this time, as companies may have learned a thing or two from Tay's story.
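The coordinated-flagging scenario can be sketched in a few lines. This is a toy model under stated assumptions, not a description of Instagram's system: it assumes a word is treated as offensive once enough user reports mention it, and the threshold, function name, and sample reports are all hypothetical.

```python
from collections import Counter

# Hypothetical rule: a word counts as offensive once it has this many reports.
FLAG_THRESHOLD = 3


def learn_flagged_terms(reports, threshold=FLAG_THRESHOLD):
    """Return the set of reported words that reach the report threshold."""
    counts = Counter(word.lower() for word in reports)
    return {word for word, n in counts.items() if n >= threshold}


# Ordinary reports: a genuine slur is flagged often, 'hello' only once.
organic = ["slur_a", "slur_a", "slur_a", "hello"]

# A coordinated group reports the harmless word 'hello' over and over.
coordinated = ["hello"] * 5

before = learn_flagged_terms(organic)
after = learn_flagged_terms(organic + coordinated)
```

In this toy model, `before` contains only the genuine slur, while `after` also contains 'hello', so anyone using that word would start receiving warnings. A real system would presumably weigh who is reporting and how, which is part of why such an attack may be harder now.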

Instagram says that the feature is rolling out in some regions and will be available globally next year.