One software, billions of possibilities

10.24.2022, by Martin Koppe
Reading time: 4 minutes
The combination of all the options offered by a software program can reach mind-boggling numbers, with several thousand digits. Research has developed tools to manage this variability as well as possible.

Whether it involves absent-mindedly enabling the notifications for a popular application or carefully configuring a Linux session, everyone has access to different means of personalising their programs. This freedom nevertheless has significant technological consequences.

“In addition to performing what they were designed to do, computer programs are supposed to meet the functional needs of users,” explains Mathieu Acher, an associate professor at Université de Rennes 1, and a member of the IRISA research institute of computer science and random systems.¹ “Some rely on the default configuration, which is sufficient in many cases, but potentially forgo opportunities to improve the service. Software must adhere to constraints relating to security, execution speed, energy consumption, and numerous other parameters, and then find a compromise between these parameters depending on each user's requirements. These different configurations for the same program form what are known as software variants.”

Imagining exponential combinations

These variants can be extremely numerous. This is already true of fairly simple applications, and the count reaches astounding proportions for programs as complex as operating systems. Software variability then becomes a seemingly insurmountable problem.

“For a program with only thirty options, each limited to two choices, there are six billion variants,” emphasizes Acher. “If you take something like the kernel of the Linux operating system and its 15,000 configuration options, each with three possible values, you end up with a total numbering six thousand digits.”
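
The arithmetic behind these orders of magnitude can be sketched in a few lines. This is only an illustration, assuming every option is independent and unconstrained, which gives an upper bound (real constraints between options reduce it). The function names are hypothetical. The digit count is derived from a logarithm rather than by printing the number, since it is far too long to be read usefully.

```python
import math

def count_variants(num_options: int, values_per_option: int) -> int:
    """Upper bound on variants, assuming independent, unconstrained options."""
    return values_per_option ** num_options

def digits_in_power(base: int, exponent: int) -> int:
    """Number of decimal digits of base**exponent, computed via log10."""
    return math.floor(exponent * math.log10(base)) + 1

# Thirty options with two choices each
print(count_variants(30, 2))

# 15,000 options with three possible values each (Linux-kernel-like scale)
print(digits_in_power(3, 15000))
```

Under these assumptions the sketch gives about 1.07 billion variants for thirty binary options and a total of 7,157 digits for the Linux example, the same orders of magnitude as the rounded figures quoted above.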

This explosion of imaginable combinations affects users as well as those who develop and maintain software, whether the program is a game on a mobile phone or the system for a Mars rover. It is therefore no longer conceivable to check by hand that everything is functioning properly, and that there is no conflict between certain choices. Even using automatic techniques – and assuming each test takes only a second – the example with thirty options would require nearly two hundred years for full validation. Yet in practice each individual operation actually lasts a few minutes.
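
The duration of an exhaustive test campaign follows directly from the number of variants. The sketch below takes the six billion variants quoted above as its input; the function name and the three-minute figure (one reading of "a few minutes") are assumptions made for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # average year, leap days included

def exhaustive_test_years(num_variants: int, seconds_per_test: float = 1.0) -> float:
    """Years needed to test every variant one after another."""
    return num_variants * seconds_per_test / SECONDS_PER_YEAR

# Six billion variants at one second per test
one_second = exhaustive_test_years(6_000_000_000)

# Same campaign at three minutes per test (hypothetical value)
three_minutes = exhaustive_test_years(6_000_000_000, seconds_per_test=180)

print(f"{one_second:.0f} years at one second per test")
print(f"{three_minutes:.0f} years at three minutes per test")
```

At one second per test the campaign lasts roughly 190 years. At a few minutes per test it runs to tens of thousands of years, which is why exhaustive validation is ruled out.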

AI for predicting the properties of variants

The development of scientific methods suited to navigating these inordinately large sets has become indispensable. Acher is tackling this problem through feature models, tree-like structures that describe the different options for a range of products – in this case software programs – and what each one entails in terms of new possibilities and incompatibilities.

“Feature models are formalisms that present a system’s options, their available values, and any constraints between them. This final point is important, as some combinations will not function within the same software. This kind of modelling is the simplest and most effective way of capturing a program's range of possibilities.”
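
A toy example may help to picture what a feature model captures. The sketch below is not Acher's tooling: the five options and three constraints are invented for illustration, and real feature models are usually analysed with dedicated solvers rather than by brute-force enumeration, which only works at this tiny scale.

```python
from itertools import product

# Hypothetical boolean options of a small, imaginary system
FEATURES = ["networking", "wifi", "bluetooth", "debug", "minimal_size"]

# Hypothetical constraints: each rule returns True when a configuration is allowed
CONSTRAINTS = [
    lambda c: not c["wifi"] or c["networking"],        # wifi requires networking
    lambda c: not c["bluetooth"] or c["networking"],   # bluetooth requires networking
    lambda c: not (c["debug"] and c["minimal_size"]),  # debug excludes minimal_size
]

def valid_configurations(features, constraints):
    """Yield every on/off assignment that satisfies all constraints."""
    for values in product([False, True], repeat=len(features)):
        config = dict(zip(features, values))
        if all(rule(config) for rule in constraints):
            yield config

valid = list(valid_configurations(FEATURES, CONSTRAINTS))
print(f"{len(valid)} valid variants out of {2 ** len(FEATURES)} combinations")
```

In this toy model only 15 of the 32 raw combinations survive the constraints, which illustrates why the constraints matter as much as the options themselves.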

For instance, for the Linux kernel and its fifteen thousand options, Acher developed an artificial intelligence (AI) tool based on machine learning that can predict the properties of variants without actually having to test them. The AI tool generalised the information that it extracted from a limited number of cases, enabling it to make quick predictions with an error rate of only 5%. Acher and his co-authors also ensured that this model will evolve over time, and will include the constant improvements made to Linux by its users.
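
The general idea of learning from a sample of measured variants and predicting the rest can be sketched as follows. This is not the researchers' tool or method: it swaps in plain least-squares regression, and the eight options, their effects on size, and the measurement function are entirely synthetic assumptions.

```python
import random
from itertools import product

rng = random.Random(0)

# Synthetic ground truth: hypothetical size contribution (in KB) of each option
EFFECTS = [1200, 800, 450, 300, 150, 90, 60, 30]

def measure_size_kb(config):
    """Stand-in for an expensive build-and-measure step (synthetic, with noise)."""
    return 5000 + sum(w * on for w, on in zip(EFFECTS, config)) + rng.uniform(-20, 20)

def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations and Gauss-Jordan elimination."""
    X = [[1.0] + [float(v) for v in row] for row in rows]  # leading 1.0 = intercept
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)]
         + [sum(x[i] * t for x, t in zip(X, targets))]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

# Measure every variant once (feasible only because this toy space has 256 variants)
all_configs = list(product([0, 1], repeat=len(EFFECTS)))
measured = {c: measure_size_kb(c) for c in all_configs}

# Train on a quarter of the variants, predict the remainder
rng.shuffle(all_configs)
train, test = all_configs[:64], all_configs[64:]
weights = fit_linear(train, [measured[c] for c in train])

def predict(config):
    """Predicted size in KB for an untested configuration."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], config))

mape = sum(abs(predict(c) - measured[c]) / measured[c] for c in test) / len(test)
print(f"Mean relative error on unseen variants: {mape:.2%}")
```

The synthetic data here is almost perfectly additive, so a linear model fits it easily. Real kernel options interact with one another, which is part of what makes the actual problem much harder.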

“Linux is one of the most complex systems ever built by humanity, I am thrilled to be a part of it, and to know that my work has such a strong societal impact,” Acher enthuses. “I was fairly pessimistic at the beginning, fearing that I would have to amass a humongous amount of data to train the AI. In the end it was well-designed, and its learning cost remained reasonable.”

Deep software variability, “an even more astounding space”

This solution can predict in particular the size of the kernel needed for Linux to function with certain priority options, for instance to run it on small devices and processes. “We would like to also include more complex properties in our predictions, such as energy consumption and security level. This requires further research,” Acher admits. “Still, it remains one of our goals.”

Other challenges await the researcher, who was recently appointed as a junior member of the Institut Universitaire de France (IUF). In connection with this appointment, Acher launched a project on deep software variability. “We have mentioned software options, but other parameters can have a strong impact, such as the hardware configuration of the computer running the program, the nature of the data it is processing, or the operating system. We end up with an even more outrageous space. We can thus predict the execution time of a program fairly accurately, but only with certain fixed parameters. We hope to extend this to all possible variations on any machine on the planet. That’s something of a challenge!”

This project connects with another of Acher's interests, namely scientific reproducibility. The problems relating to deep software variability can result in the same algorithm giving a different response depending on the machine performing the calculation. “I want to improve the reliability of research results in the face of a variability that we still do not fully master.” This is all the more pressing as the issue even affects scientific publications.

Footnotes
  • 1. CNRS / Université de Rennes 1.

Author

Martin Koppe

A graduate of the School of Journalism in Lille, Martin Koppe has worked for a number of publications including Dossiers d’archéologie, Science et Vie Junior and La Recherche, as well as the website Maxisciences.com. He also holds degrees in art history, archaeometry, and epistemology.