Twitter has rolled out new rules to tackle the use of deepfakes to deceive people, and in March it will also start adding labels to certain tweets containing manipulated media.
The company will remove manipulated media, including deepfakes, but only if the content is likely to cause harm, such as threats to the physical safety of a group or person, or if the tweet creates a risk of mass violence or widespread civil unrest.
The harm threshold also applies to stalking, content that aims to silence someone, voter suppression, and intimidation.
Twitter is most likely to remove media when the content meets all three of its criteria: it is significantly and deceptively altered, it is shared in a deceptive manner, and it is likely to cause harm.
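To make the interplay of those criteria concrete, here is a minimal sketch in Python. The boolean flags and the function are hypothetical illustrations of the published policy, not any real Twitter API.

```python
# Illustrative only: a minimal sketch of Twitter's stated decision logic,
# using hypothetical boolean flags rather than any real Twitter interface.

def moderation_action(significantly_altered: bool,
                      shared_deceptively: bool,
                      likely_to_cause_harm: bool) -> str:
    """Map the three published criteria to an outcome."""
    if significantly_altered and shared_deceptively and likely_to_cause_harm:
        return "remove"     # all three criteria met
    if significantly_altered:
        return "label"      # manipulated, but below the harm threshold
    return "no action"

print(moderation_action(True, True, True))   # -> remove
print(moderation_action(True, True, False))  # -> label
```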
Twitter today added a new clause to its terms of service covering “synthetic and manipulated media”.
“You may not deceptively share synthetic or manipulated media that are likely to cause harm. In addition, we may label Tweets containing synthetic and manipulated media to help people understand their authenticity and to provide additional context,” the terms state, linking to a more detailed explanation of the removal policy.
Facebook last month also outlined new rules for manipulated media and said it will take down a video if it has been synthesized “in ways that aren’t apparent to an average person and would likely mislead someone into thinking that a subject of a video said words that they did not actually say”. It already applies labels to some manipulated media.
The new rules from the social networks suggest the companies are trying to get a handle on misinformation ahead of the 2020 US presidential election in a way that still allows for free speech.
The companies have faced scrutiny from lawmakers in the US and Europe for not doing enough to counter fake news, particularly election meddling by foreign governments.
Besides labeling tweets, Twitter says it will show a warning to people before they retweet a labeled tweet, reduce its visibility and prevent it from being recommended, and provide additional explanations and clarifications.
“This will be a challenge and we will make errors along the way – we appreciate the patience. However, we’re committed to doing this right. Updating our rules in public and with democratic participation will continue to be core to our approach,” said Yoel Roth, Twitter’s head of site integrity.
Jigsaw, from Google-parent Alphabet, has released a new fact-checking tool called Assembler that uses artificial intelligence to help reporters and fact-checkers detect manipulated images more quickly.
Assembler uses image-manipulation detection models developed by researchers from the University of Maryland, the University Federico II of Naples, and the University of California, Berkeley. The tool uses these models to show the probability that an image has been manipulated. Google Research also contributed to the project.
According to Jigsaw’s Jared Cohen, Assembler is an “early stage experimental platform”.
It’s called Assembler because it assembles multiple image-manipulation detectors from academics into a single tool to create a more comprehensive assessment.
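Jigsaw has not published Assembler’s internals, but the ensemble idea can be sketched: run each detector on the same image and surface both per-model scores and a combined score. Everything below (the detector names, scoring, and the max-based combination) is an assumption for illustration, not Assembler’s actual interface.

```python
# A hedged sketch of the "ensemble of detectors" idea behind Assembler.
# The detectors and combination rule here are hypothetical stand-ins.

from typing import Callable, Dict

Detector = Callable[[bytes], float]  # image bytes -> manipulation probability

def assess(image: bytes, detectors: Dict[str, Detector]) -> Dict[str, float]:
    """Run every detector and report per-model and overall scores."""
    scores = {name: fn(image) for name, fn in detectors.items()}
    # Surface the most suspicious score as the headline number, so a
    # manipulation caught by only one specialist model is not averaged away.
    scores["overall"] = max(scores.values())
    return scores

# Usage with stand-in detectors:
fake_detectors = {
    "copy_move": lambda img: 0.12,
    "splice": lambda img: 0.91,
}
print(assess(b"...image bytes...", fake_detectors))
# -> {'copy_move': 0.12, 'splice': 0.91, 'overall': 0.91}
```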
Jigsaw also built a StyleGAN detector (StyleGAN is a generative adversarial network often used to create deepfakes). It uses machine learning to spot the differences between images of real people and GAN-generated ones.
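As a hedged illustration of that kind of classifier, the sketch below trains a logistic regression to separate “real” from “generated” feature vectors. The randomly generated features stand in for whatever image representation Jigsaw actually uses, which has not been disclosed.

```python
# A minimal sketch of a real-vs-generated face classifier, in the spirit of
# Jigsaw's StyleGAN detector. Features and training data are assumptions;
# Jigsaw has not released its model.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in features: in practice these would come from a CNN or from
# frequency-domain statistics that differ between photos and GAN output.
real_feats = rng.normal(0.0, 1.0, size=(500, 64))
fake_feats = rng.normal(0.5, 1.0, size=(500, 64))

X = np.vstack([real_feats, fake_feats])
y = np.array([0] * 500 + [1] * 500)  # 0 = real photo, 1 = GAN-generated

clf = LogisticRegression(max_iter=1000).fit(X, y)
print("P(generated):", clf.predict_proba(fake_feats[:1])[0, 1])
```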
Finally, Jigsaw has released the first edition of its online magazine, The Current. The first issue covers disinformation campaigns and countermeasures.
Alongside the magazine, Jigsaw published a data-visualization tool that maps recent disinformation campaigns around the world, the tactics used, and the platforms that were exploited. The tool relies on data collected by the Atlantic Council’s Digital Forensic Research Lab (DFRLab).