Viruses and Malware: Research Strikes Back
Armored doors, a security entrance, surveillance cameras, and biometric iris recognition: the Loria[1] High Security Laboratory (LHS) in Nancy (northeastern France) is a fortress where six million computer viruses are locked away. (A virus is a program capable of infecting other software by altering it so that it can in turn reproduce. Although viruses were not malicious at first, today most carry malware and can disrupt, to varying degrees, the operation of the infected computer. They spread via computer networks or peripheral devices such as USB drives.) This collection of the worst "nasties" from the webosphere, caught online by LHS researchers, includes malware (deliberately malicious software developed to harm and infect a computer system without the consent of its user) that hacks into our data, destroys our software programs or hard drives, and even forces our computers to unleash torrents of spam (unsolicited email, generally sent in bulk for advertising purposes) to paralyze the servers of a competitor or enemy site. What is the point of collecting these super-villains? To analyze them in depth, and develop tools for detecting their "mutants," i.e., slightly modified variants that stem from the same root but have not yet caused enough damage to be listed and included in antivirus programs on the market. "In general, these software programs only detect viruses they are already familiar with," points out Jean-Yves Marion, director of the LHS. "This is why their developers pay close attention to research results, in order to improve their programs." All the more so as there are many more attacks today than ten or twenty years ago. The 1980s, when geeks cracked systems for the beauty of it, are long gone...
Ever more sophisticated viruses
"There is no more room for amateurism," continues the researcher. "Most attacks are motivated by profit or espionage, and are carried out by criminal groups or governmental organizations that spend months developing elaborate viruses." In November 2014 for instance, the antivirus software developer Symantec revealed the existence of Regin. For at least six years, through computers and GSM smartphones, this program had been stealing passwords and taking screenshots in the IT network of the EU headquarters, but also in the research centers, airline companies, and communication networks of a number of European countries, as well as of Russia, Saudi Arabia, Mexico, etc. Regin is so extraordinary and complex (it apparently required four people working full-time for a year), that experts believe it could only have originated from a governmental organization.2
In this context of cyber espionage and cybercrime, the LHS virus catcher "collects" everything circulating on the Web in order to enrich its collection. "It is a virtual telescope developed by the Madynes team,"[3] explains Marion. "This device, which is connected to the Internet via conventional ADSL lines, makes it possible to simulate the presence of hundreds of vulnerable computers, so as to prompt the greatest number of attacks, which are then captured by the system. It is the 'honey pot' technique for attracting and catching bears," he adds.
How can fake, unwary computers be displayed online? It essentially comes down to sending signaling messages. When the Internet protocol, at a network node, periodically asks connected machines "hey you, who are you?" the virtual telescope generates false responses, such as: "I'm a Mac navigating without an antivirus program" or "I'm a PC running on such and such operating system," the carefully chosen system of course being a basic version with many flaws.
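The decoy principle described above can be illustrated with a minimal Python sketch. It is not the Madynes telescope: it assumes a single TCP listener on the local machine instead of a network-level device, and the banners, the `captured` list, and the `start_decoy` and `probe` helpers are hypothetical names chosen for illustration. The listener announces itself as an outdated system and records whatever a visitor sends in reply.

```python
import socket
import socketserver
import threading

# Hypothetical decoy identities, each advertising an old, flawed system.
DECOY_BANNERS = [
    b"220 decoy-pc FTP service (unpatched legacy build) ready\r\n",
    b"220 decoy-mac mail service (no antivirus installed) ready\r\n",
]

captured = []  # one (visitor address, first bytes received) pair per connection


class DecoyHandler(socketserver.BaseRequestHandler):
    """Answer every visitor with a false identity, then record what it sends."""

    def handle(self):
        self.request.sendall(self.server.banner)
        self.request.settimeout(1.0)
        try:
            payload = self.request.recv(4096)
        except socket.timeout:
            payload = b""
        captured.append((self.client_address[0], payload))


def start_decoy(banner, host="127.0.0.1", port=0):
    """Start one decoy listener in a background thread (port 0 = any free port)."""
    server = socketserver.ThreadingTCPServer((host, port), DecoyHandler)
    server.banner = banner
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def probe(host, port, payload):
    """Play the attacker: read the decoy's banner, then send a payload."""
    with socket.create_connection((host, port), timeout=2.0) as sock:
        banner = sock.recv(1024)
        sock.sendall(payload)
    return banner
```

Calling `start_decoy(DECOY_BANNERS[0])` and then `probe("127.0.0.1", server.server_address[1], b"...")` shows the exchange end to end. Simulating hundreds of machines, as the article describes, would mean many such identities answering on many addresses at once.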
Once captured, the malware is put through the wringer of the Gorilla software program, the laboratory's secret weapon. "As with any antivirus program on the market, the objective is to find a signature that is unique to the malware and that will help identify it," explains Marion. "For us, the signature of malware is its complete structure." In short, a sketch of the culprit's overall "silhouette." The result is that it is now possible to find the suspect even if the malware developers make slight modifications to produce a mutant.
Flushing out malware, a can of worms
"To extract the structure of a malicious program, it is necessary to look at the list of instructions that make it up, in assembly code (i.e., the language of the machine)," explains Fabrice Sabatier, from the Carbone team at the LHS. According to their nature (performing a calculation, repeating a certain action, asking the user to enter data, etc.), these instructions are then given a geometric form as symbol. These forms are then relayed by nodes representing the "conditional jumps," in other words the famous "if-then-else" propositions that give the order to execute such and such an action according to a condition. These representations are fairly common in information technology. "Yet a program is made up of millions of nodes!" the researcher continues. The beauty of the method resides in rules of simplification, in deleting this node but not another, or such set of instructions that are deemed not very characteristic, in order to obtain a sketch or "graph" of only a few thousand to a few hundred thousand nodes (see video below).
"Thanks to our graphs, which can be viewed in 3D and in color, we can thus compare the global structure, or only a few 'pieces' of a program, against samples from our collection. If there are significant points in common, the presumption that we are dealing with malware with be all the stronger," concludes Marion. "Our method is too complex to operate on the PCs used by the general public, but we have helped law enforcement authorities identify the roots of different virus attacks such as RansomwareFermerMalware that hijacks personal data by encrypting it or taht locks access to a user's computer. It then demands that the owner wires money (ransom) in exchange for the key to decrypt the data or restore user access., which were coming from the same source," points out Sabatier. The technique should soon be available to businesses, as Loria has created a start-up for this purpose.4 It is of particular interest for industry because the method, which is based on in-depth analysis of malware, also makes it possible to determine their objective: by studying the different functionalities of the software, it can trace back to what they wanted to steal or destroy.
Identifying different functionalities, lost among thousands of lines of code, is a central issue in fundamental research in computer science, for there is no algorithm able to confirm with certainty that two programs do the same thing (this is called undecidability, a notion that goes back to a mathematical proof by Alan Turing). "For that matter, analyzing viruses raises questions that are just as fundamental, such as what, in the end, is a malicious program?" continues Marion. How is it possible to distinguish it from an ordinary program that operates in a way that is not always justifiable? What differentiates it from your favorite online game application when the latter tries to determine your geolocation? Or when it tries to send your information to a third party, as a famous bird game was accused of doing last year?[5] Researchers must, eventually, define in what contexts certain functionalities are suspect, and in what other contexts they are not...
Should we worry in the future about new generations of viruses with revolutionary natures and structures? "Innovation in computer virology mostly lies in how to protect malware by 'disguising' it," Marion points out. There are the mutants mentioned earlier, but also the encryption of a virus's code, or zipping and re-zipping it up to a hundred times within another program, in order to hide it and thereby avoid antivirus analysis. With regard to concealment, there is worse still: certain malware programs change themselves during their execution on a PC, with hidden functionalities triggered in waves, which can erase themselves along the way, or never activate themselves at all!
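A short Python sketch shows why repeated "zipping" defeats a scanner that only looks for known byte patterns. It assumes plain `zlib` compression as a stand-in for the packers used by real malware. The `SIGNATURE`, the `SAMPLE`, and the `pack`, `naive_scan` and `unpack_all` helpers are all hypothetical.

```python
import zlib

# Hypothetical byte pattern that a signature-based scanner already knows.
SIGNATURE = b"EVIL-PAYLOAD-MARKER"

# Hypothetical sample: a header, the known pattern, and some filler.
SAMPLE = b"MZ" + SIGNATURE + b" " + b"body " * 20


def pack(data, layers):
    """Wrap `data` in `layers` successive rounds of compression."""
    for _ in range(layers):
        data = zlib.compress(data)
    return data


def naive_scan(blob):
    """Signature check on the raw bytes only, with no unpacking."""
    return SIGNATURE in blob


def unpack_all(blob, max_layers=200):
    """Peel compression layers until the content no longer decompresses.

    Returns the innermost bytes and the number of layers removed.
    """
    layers = 0
    while layers < max_layers:
        try:
            blob = zlib.decompress(blob)
        except zlib.error:
            break
        layers += 1
    return blob, layers


if __name__ == "__main__":
    packed = pack(SAMPLE, 100)
    print("raw sample flagged:   ", naive_scan(SAMPLE))
    print("packed sample flagged:", naive_scan(packed))
    inner, depth = unpack_all(packed)
    print("flagged after unpacking", depth, "layers:", naive_scan(inner))
```

Real packers are harder to undo than this, since the unpacking routine is usually part of the malware itself, which is why running the sample in a confined environment and letting it unpack itself is so useful.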
"For the last five or six years there has been a genuine effort to engineer protection against computer viruses. We conducted a test with one of the samples: we protected it with a software program available on the market (used in the context of file protection for copyright), and it slipped through the nets of a number of anti-virus programs that nevertheless knew about the original," explains Marion. Whereas at the LSH, one can execute a virus: in executing it, one unzips it as many times as necessary and observes its waves of self-modification, which makes it possible to access its hidden sections, in the deepest parts of its code. "We can conduct experiments in full safety on our network, without worrying about contagion to other machines, as it is a confined cluster that is disconnected from the external world. This enables us to simulate hundreds of virtual machines," emphasizes the IT specialist.
Objective: to guarantee the digital security of citizens
Of course Google and other members of GAFA (an acronym formed from the initials of the Internet giants Google, Apple, Facebook and Amazon) are also keeping an eye on things. For that matter, the US search engine has an even more impressive malware collection than that of the LHS, with 300 to 400 million samples, even if they are also teeming with duplicates, whether mutants or not, which swell the figure. "In addition to these private actors, public research plays a crucial role, notably in the field of virology, which is little developed in France and Europe," insists the director of the LHS, France's leading platform for academic research dedicated to computer security.
"In parallel to the actions put in place by companies motivated by commercial interests, public research should provide solutions to guarantee the digital security of citizens, seek out the weaknesses of commercially-available resources, and inform the general public, who is then free to use them or not," stresses the researcher. This salutary process can also prompt developers or web providers to quickly offer corrective patches. An edifying example shook the Web last June: a demonstration on Logjam, conducted by researchers from Loria, highlighted a major flaw in the https protocol, which secures Internet connections. Not to mention the fact that the fully-connected world that we are preparing will give hackers more and more targets. "Before being developed on a large scale, the Wi-Fi connected pacemaker, autonomous car, pulse-reading bracelet or electronic voting devices will require a number of guarantees," advises Marion. Whom will we choose to entrust with this responsibility?
- 1. Laboratoire lorrain de recherche en informatique et ses applications (CNRS / Inria / Univ. de Lorraine).
- 2. The English-language investigative magazine The Intercept points the finger at the US and the UK: https://theintercept.com/2014/11/24/secret-regin-malware-belgacom-nsa-gchq/ and https://theintercept.com/2014/12/13/belgacom-hack-gchq-inside-story/.
- 3. Managing dynamic networks and services/Supervision des réseaux & services dynamiques, http://www.loria.fr/la-recherche/equipes/madynes
- 4. This start-up, named Simorfo, was created in collaboration with the Tracip company.
- 5. According to the New York Times dated January 27, 2014, the "Angry Birds" game was used by the US National Security Agency (NSA) and the UK Government Communications Headquarters (GCHQ) to collect data regarding its users.
Science journalist, author of children's literature, and collections director for over 15 years, Charline Zeitoun is currently Sections editor at CNRS Le journal/News. Her subjects of choice revolve around societal issues, especially when they intersect with other scientific disciplines. She was an editor at Science & Vie Junior and Ciel & Espace, then...