Algorithms to Detect Hoaxes

09.28.2017, by Martin Koppe

Following Hurricane Harvey, in August 2017, this picture of a shark swimming in the city of Houston (US) was heavily shared online. It was, however, a fake image that now reappears after every hurricane.

"Fake news," from deceitful information to fake images, surfaces every time an event receives extensive media coverage. Social networks are looking for ways to control this phenomenon, while researchers are developing algorithms to spot it more effectively.

Verifying a piece of information without context is something of a gamble. According to a 2006 study conducted by Texas Christian University, we are not much better than chance at detecting deception.1 A decade on, social networks have exposed us to such a massive flow of data that sorting through it has become even more difficult.

Numerous research teams have thus developed resources to identify hoaxes and other items of fake news, misleading information that can range from a simple joke to large-scale political manipulation. At the Irisa,2 Vincent Claveau, a CNRS researcher, and Ewa Kijak, a lecturer at the Université de Rennes 1, are working with doctoral student Cédric Maigrot on automating the hunt for falsified images and phoney stories.

"Unless they are looking at a basic montage, humans cannot detect alteration or reuse of a photograph," reckons Claveau. "Only information technology can do so." According to him, automation has two objectives: to process a mass of news that humans cannot manage, and to offer a machine's perspective, which is less biased than a human being's. In fact, the researcher admits that "one is less likely to question a piece of information if it confirms their opinion."

© Courtesy of Thomas Peschak/www.thomaspeschak.com

Irisa researchers were able to find the two photos used for the photomontage. One showed a flooded road, the other a shark swimming near a kayak off South Africa, which the magazine "Africa Geographic" published in September 2005.

The origin of information

Verification can be performed in three different ways. First, network analysis makes it possible to follow a message's trail. Does it come from a respected press agency, or from a website that mass-produces phoney content? The research team also monitors sites that act as amplifiers, relaying content without necessarily producing any.

Some social networks, however, do not allow the full origin of a piece of information to be traced, primarily in order to keep their own algorithms secret. Researchers are nonetheless undeterred. A team from the ISC-PIF3 and CAMS,4 for example, has implemented the Politoscope project, which maps the diffusion of tweets. The system reveals the formation and evolution of political communities based on how Twitter accounts respond to the content in circulation, and even shows which group is the quickest to react to and share each new message. The platform achieved this by scrutinizing more than 80 million tweets.
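
As a rough illustration of the kind of measurement described above, and not the Politoscope's actual method, the sketch below takes a handful of made-up share records and reports which community was first to share each message. The record layout, the community labels, and the function name `first_reactors` are all assumptions introduced for the example.

```python
from collections import defaultdict

# Hypothetical share records: (message_id, community, seconds after publication).
# The data, labels and layout are invented for illustration only.
SHARES = [
    ("m1", "group_a", 120),
    ("m1", "group_b", 45),
    ("m1", "group_a", 30),
    ("m2", "group_b", 300),
    ("m2", "group_c", 310),
]


def first_reactors(shares):
    """Return, for each message, the community with the earliest share."""
    earliest = defaultdict(dict)  # message_id -> {community: earliest delay}
    for message_id, community, delay in shares:
        known = earliest[message_id].get(community)
        if known is None or delay < known:
            earliest[message_id][community] = delay
    # For each message, pick the community with the smallest delay.
    return {
        message_id: min(delays, key=delays.get)
        for message_id, delays in earliest.items()
    }


if __name__ == "__main__":
    print(first_reactors(SHARES))
```

A real system would first have to infer the communities from account behavior; here they are simply given as labels.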

Second, readers' comments provide clues about the validity of a publication.

Finally, the content itself is naturally also analyzed, especially if it combines text and images. Was the photo altered or distorted? Is the message related to the image? A document's register can also betray its origin: the presence of smileys, an overabundance of exclamation and question marks, an absence of quotes, or an excess of phrases in the first and second person.

An algorithm can identify and isolate these elements, along with the names and dates that structure the information, some of which are directly present in keywords or hashtags.
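
The surface cues listed above can be counted quite simply. The following is a minimal sketch, assuming English text and very small hand-made cue lists; the pronoun set, the emoticon pattern, and the function name `style_features` are illustrative choices, not the researchers' actual feature set.

```python
import re

# Illustrative cue lists; a real system would rely on far richer resources.
FIRST_SECOND_PERSON = {"i", "me", "my", "we", "us", "our", "you", "your"}
EMOTICON = re.compile(r"[:;]-?[)(DP]")
QUOTE_CHARS = "\"\u201c\u201d\u00ab\u00bb"


def style_features(text):
    """Count a few stylistic cues that may hint at an unreliable source."""
    words = re.findall(r"[a-z']+", text.lower())
    n_words = max(len(words), 1)  # avoid dividing by zero on empty text
    pronouns = sum(word in FIRST_SECOND_PERSON for word in words)
    return {
        "emoticons": len(EMOTICON.findall(text)),
        "exclamation_marks": text.count("!"),
        "question_marks": text.count("?"),
        "has_quotes": any(char in text for char in QUOTE_CHARS),
        "first_second_person_ratio": pronouns / n_words,
    }


if __name__ == "__main__":
    print(style_features("You will not believe this!!! We saw it :)"))
```

Such counts are only signals. On their own they say nothing about whether a message is true, which is why they would be combined with the network and image analyses described in this article.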

A search engine to detect falsification

Irisa researchers especially focus on images. Aside from actual photomontages, it is possible to fool readers by using an authentic image with a modified caption. This was the case for old photographs of victims of a bombing, "recycled" to accuse an actor in an entirely different conflict.

While the general public can trace a photograph on Google Images, researchers have created their own search engine for images. "Google very poorly processes some of the simplest and quickest alterations, such as reversing left and right, changing hue, or cropping..." explains Claveau. "Irisa's search engine is much more robust, and less susceptible to these ruses."

It can scrutinize the elements of a photo and detect whether they come from different images that were subsequently combined. "The same photo is used for every storm," the researcher cites as an example. "It shows a strip of flooded motorway with a swimming shark. We were able to separately find the real photos of the flood and the shark."
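
The article does not describe how Irisa's search engine works internally. As a hedged illustration of why a fingerprint can survive simple edits, the sketch below uses a swapped-in, well-known technique, the average hash, on tiny grayscale grids, and also tests the mirrored query. The pixel values and function names are invented for the example, and cropping is not handled.

```python
def average_hash(pixels):
    """Bit tuple: 1 where a pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(int(value > mean) for value in flat)


def mirror(pixels):
    """Flip a grayscale grid left to right."""
    return [list(reversed(row)) for row in pixels]


def hamming(a, b):
    """Number of positions at which two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def distance(query, reference):
    """Smallest Hamming distance from the reference to the query or its mirror."""
    ref_hash = average_hash(reference)
    return min(
        hamming(average_hash(query), ref_hash),
        hamming(average_hash(mirror(query)), ref_hash),
    )


# Made-up 4x4 grayscale image: dark on the left, bright on the right.
ORIGINAL = [
    [10, 20, 200, 210],
    [15, 25, 190, 220],
    [12, 30, 180, 205],
    [18, 22, 195, 215],
]
# Mirrored and uniformly brightened copy, a crude stand-in for a quick edit.
ALTERED = [[value + 20 for value in reversed(row)] for row in ORIGINAL]

if __name__ == "__main__":
    print(hamming(average_hash(ALTERED), average_hash(ORIGINAL)))  # naive comparison
    print(distance(ALTERED, ORIGINAL))  # mirror-aware comparison
```

Because the hash compares each pixel to the image's own mean, a uniform change in brightness leaves it unchanged, and trying the mirrored query covers a left-right reversal.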

Technical details can also help identify certain photomontages. For instance, traces of double compression in the file indicate that part of the photo comes from another image that had itself already been compressed. The search engine also analyzes the text that accompanies the image. Extracting the most important keywords, such as the names of places and people, makes it possible to compare captions and detect signs of misuse or tampering.
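
To make the caption comparison concrete, here is a minimal sketch that assumes capitalized words are an acceptable stand-in for place and person names, which a real system would obtain through proper named-entity recognition. The two captions are paraphrased from the shark example in this article; the function names and the use of Jaccard similarity are assumptions made for illustration.

```python
import re


def proper_nouns(text):
    """Crude name extraction: capitalized words that do not start a sentence."""
    names = set()
    for sentence in re.split(r"[.!?]\s*", text):
        tokens = re.findall(r"[A-Za-z]+", sentence)
        names.update(token for token in tokens[1:] if token[0].isupper())
    return names


def overlap(caption_a, caption_b):
    """Jaccard similarity between the names found in two captions."""
    a, b = proper_nouns(caption_a), proper_nouns(caption_b)
    if not a and not b:
        return 1.0  # nothing to contradict
    return len(a & b) / len(a | b)


SOURCE_CAPTION = "A shark swims near a kayak off South Africa."
VIRAL_CAPTION = "A shark swims on a flooded motorway in Houston after Hurricane Harvey."

if __name__ == "__main__":
    print(proper_nouns(SOURCE_CAPTION))
    print(overlap(SOURCE_CAPTION, VIRAL_CAPTION))
```

A low overlap between the caption of the earliest known publication and the caption now circulating does not prove tampering, but it is the kind of mismatch that could be flagged for a closer look.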

Video by the German artist Mario Klingemann (year of production: 2017)

(Artificial Intelligence algorithms can now sync videos to an audio source. The German artist Mario Klingemann was able to automatically create a sequence where the French singer Françoise Hardy gives a rendition of a controversial speech by Kellyanne Conway, advisor to US President Donald Trump. While this clip obviously looks fake, the rapid progress in the field promises to further blur the line between fact and fiction.)

Alerting rather than ruling

How should these advances be used in the fight against false information? Ideally, they should be built into the very architecture of social networks in order to spot hoaxes as early as possible, although this option depends on the goodwill of companies. In some countries, for example, Facebook has launched an option that lets users forward information they deem suspicious to established media for verification. Yet there is a risk that the public will not report material that supports its opinions. In fact, a Yale study has just shown that this system had no positive effect against the spread of fake news.5 Irisa researchers are therefore leaning toward extensions that are integrated into web browsers.

"Plug-ins would skim through webpages, tweets, and Facebook posts and indicate anything that seems dubious," Claveau suggests. "The idea is to prioritize decision-making by the reader. The machine does not determine the truth, but instead provides leads."

The question of legitimacy indeed surfaces regularly, and some efforts have been criticized. When Les Décodeurs, the fact-checking team of the French daily Le Monde, launched Décodex, which color-codes news websites based on their reliability, the tool received mixed reviews.

"An algorithm will, perhaps wrongfully, be seen as more impartial than a media outlet judging other media outlets," adds Claveau. "Still, the Décodex team led a useful effort in helping us assess the quality of information."

Regardless, hoaxes follow certain cycles, and tend to increase after each event that receives heavy media coverage. The crudest ones are easy to spot, but others raise more philosophical questions regarding the evaluation of truth. The purpose of algorithms is therefore to warn rather than judge, the latter being left to the reader.

Footnotes

1. Pers Soc Psychol Rev. 2006, vol. 10 (3) : 214-34.
2. Institut de recherche en informatique et systèmes aléatoires (CNRS / Université Rennes 1 / ENS Rennes / Insa Rennes / Université Bretagne Sud / Inria / CentraleSupélec / IMT Atlantique).
3. Institut des systèmes complexes de Paris Île-de-France (CNRS).
4. Centre d’analyse et de mathématiques sociales (CNRS / EHESS).
5. G. Pennycook and D. G. Rand, "Assessing the Effect of 'Disputed' Warnings and Source Salience on Perceptions of Fake News Accuracy," SSRN, online on 15/09/2017.

Author

Martin Koppe

A graduate of the School of Journalism in Lille, Martin Koppe has worked for a number of publications including Dossiers d’archéologie, Science et Vie Junior and La Recherche, as well as the website Maxisciences.com. He also holds degrees in art history, archaeometry, and epistemology.

Keywords

Fake News, Hoax, Fake Images, Photomontage, Internet, Social Networks, Alternative facts, Post-Truth
