
Algorithmic bias, discrimination and fairness

David Bounie, Professor of Economics, Head of Economics and Social Sciences at Télécom ParisTech

Patrick Waelbroeck, Professor of Industrial Economics and Econometrics at Télécom ParisTech and co-founder of the Chair Values and Policies of Personal Information

[divider style=”dotted” top=”20″ bottom=”20″]

The original version of this article was published on the website of the Chair Values and Policies of Personal Information. This Chair brings together researchers from Télécom ParisTech, Télécom SudParis and Institut Mines Télécom Business School, and is supported by the Mines-Télécom Foundation.

[divider style=”dotted” top=”20″ bottom=”20″]

 

[dropcap]A[/dropcap]lgorithms rule our lives. They increasingly intervene in our daily activities – career paths, adverts, recommendations, scoring, online searches, flight prices – as data science and statistical learning keep improving.

Although they were initially considered neutral, algorithms are now blamed for biasing results and discriminating against people, intentionally or not, on the basis of their gender, ethnicity or sexual orientation. In the United States, studies have shown that African Americans are penalised more heavily in court decisions (Angwin et al., 2016). They are also discriminated against more often on online flat rental platforms (Edelman, Luca and Svirsky, 2017). Finally, targeted and automated online ads promoting job opportunities in the Science, Technology, Engineering and Mathematics (STEM) fields seem to be shown more frequently to men than to women (Lambrecht and Tucker, 2017).

Algorithmic bias raises significant issues in terms of ethics and fairness. Why are algorithms biased? Is bias unpreventable? If so, how can it be limited?

Three sources of bias can be identified: cognitive, statistical and economic. First, an algorithm's results depend on the way programmers, i.e. humans, coded it, and studies in behavioural economics have shown that human decision-making is subject to cognitive biases.

  • For instance, a bandwagon bias may lead a programmer to follow popular models without checking whether these are accurate.
  • Anticipation and confirmation biases may lead a programmer to favour their own beliefs, even though available data challenges such beliefs.
  • Illusory correlation may lead someone to perceive a relationship between two independent variables.
  • A framing bias occurs when a person draws different conclusions from the same dataset depending on how the information is presented.

Second, bias can be statistical. The phrase ‘Garbage in, garbage out’ refers to the fact that even the most sophisticated machine will produce incorrect and potentially biased results if the input data is inaccurate. After all, it is quite easy to believe in a score produced by a complex proprietary algorithm seemingly based on multiple sources. Yet if the dataset on which the algorithm is trained to categorise or predict is partial or inaccurate, as is often the case with fake news, trolls or fake identities, the results are likely to be biased. What happens if the data is incorrect? Or if the algorithm is trained on data from US citizens, who may behave very differently from European citizens? Or if certain essential variables are omitted? For instance, how might machines encode relational skills and emotional intelligence (which are hard for machines to capture, as they do not feel emotions), leadership skills or teamwork in an algorithm? Omitted variables may lead an algorithm to produce biased results for the simple reason that the omitted variables may be correlated with the variables used in the model. Finally, what happens when the training data comes from truncated samples or is not representative of the population for which you wish to make predictions (sample-selection bias)? In his Nobel Memorial Prize-winning research, James Heckman showed that selection bias is closely related to omitted-variable bias. Credit scoring is a striking example. In order to determine which risk category a borrower belongs to, algorithms rely on data about people who were eligible for a loan at a particular institution – they ignore the files of people who were denied credit, did not need a loan or obtained one from another institution.
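To make this selection effect concrete, here is a minimal sketch in Python, using entirely synthetic data and a hypothetical income-based approval rule, of how a credit scorer trained only on past approved borrowers understates risk and weakens the very relationship it is supposed to learn. The numbers are illustrative, not drawn from any real lender.

```python
# A minimal sketch of sample-selection bias in credit scoring, using
# synthetic data. All numbers (income distribution, default model,
# approval rule) are hypothetical and only illustrate the mechanism.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Latent "true" model: default probability decreases with income.
income = rng.normal(0.0, 1.0, n)                  # standardised income
p_default = 1.0 / (1.0 + np.exp(2.0 + 1.5 * income))
default = rng.random(n) < p_default

# Historical approval rule: only higher-income applicants got a loan,
# so only their repayment behaviour was ever observed.
approved = income > 0.0

def default_rate(mask):
    return default[mask].mean()

print("default rate, full population     :", round(default_rate(np.ones(n, bool)), 3))
print("default rate, approved (training) :", round(default_rate(approved), 3))

# A scorer trained on the approved sample only sees the low-risk part of
# the income distribution; the income/default relationship it estimates
# is weaker (attenuated) than in the full population.
corr_full = np.corrcoef(income, default)[0, 1]
corr_trunc = np.corrcoef(income[approved], default[approved])[0, 1]
print("income-default correlation, full      :", round(corr_full, 3))
print("income-default correlation, truncated :", round(corr_trunc, 3))
```

Because low-income applicants were never observed, the truncated sample both looks safer than the full population and shows a flatter income/default relationship, which is exactly the selection problem Heckman described.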

Third, algorithms may bias results for economic reasons. Think of automated online advisors specialised in selling financial services. They can favour the products of the company giving the advice, at the expense of the consumer if these financial products are more expensive than the market average. This is known as price discrimination. Moreover, in the context of multi-sided platforms, algorithms may favour third parties that have signed agreements with the platform. In e-commerce, the European Commission recently fined Google €2.4 billion for promoting its own products at the top of Google Shopping search results, to the detriment of competitors. Other disputes have arisen over apps simply being delisted from App Store search results or downgraded in marketplaces’ search rankings.

Algorithms thus come with bias, which seems unpreventable. The question now is: how can bias be identified and discrimination be limited? Algorithms and artificial intelligence will indeed only be socially accepted if all actors are capable of meeting the ethical challenges raised by the use of data and following best practice.

Researchers first need to design fairer algorithms. Yet what is fairness, and which fairness rules should be applied? There is no easy answer to these questions, which have divided social scientists and philosophers for centuries. Fairness is a normative concept, and many of its definitions are mutually incompatible. Compare, for instance, individual fairness and group fairness. One simple criterion of individual fairness is equal opportunity, the principle that individuals with identical capacities should be treated similarly. However, this criterion is incompatible with group fairness, according to which individuals of the same group, such as women, should be treated similarly. In other words, equal opportunity for all individuals cannot exist if a fairness criterion is applied to gender: these two notions of fairness are incompatible.
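A toy numeric example, with hypothetical qualification rates for two groups, makes the incompatibility concrete: a rule that treats equally qualified individuals identically produces unequal selection rates across groups, and forcing the rates to be equal requires treating equally qualified individuals differently.

```python
# A toy illustration (hypothetical numbers) of the tension between
# individual fairness (equally qualified people treated alike) and
# group fairness (equal selection rates across groups).

groups = {
    # group: (number of applicants, share who meet the qualification)
    "A": (1000, 0.60),
    "B": (1000, 0.30),
}

print("Rule 1: select every qualified applicant (individual fairness)")
for g, (n, qualified_share) in groups.items():
    selected = n * qualified_share
    print(f"  group {g}: selection rate = {selected / n:.0%}")
# Selection rates differ (60% vs 30%): group fairness is violated.

print("Rule 2: impose a common 45% selection rate per group (group fairness)")
target_rate = 0.45
for g, (n, qualified_share) in groups.items():
    selected = n * target_rate
    qualified = n * qualified_share
    print(f"  group {g}: {selected:.0f} selected for {qualified:.0f} qualified")
# In group A some qualified applicants must be rejected, while in group B
# some unqualified applicants must be accepted: equally qualified
# individuals are no longer treated identically.
```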

A second challenge faces companies, policy makers and regulators, whose duty it is to promote ethical practices – transparency and responsibility – through efficient regulation of the collection and use of personal data. Many issues arise. Should algorithms be transparent and therefore audited? Who should be responsible for the harm caused by discrimination? Is the General Data Protection Regulation fit to address algorithmic bias? How could ethical constraints be included? Admittedly, they could increase costs for society at the microeconomic level, yet they could help lower the costs of unfairness and inequality stemming from an automated society that would not comply with the fundamental principles of unbiasedness and absence of systematic discrimination.

Read on I’MTech: Ethics, an overlooked aspect of algorithms?

The water footprint of a product has long been difficult to evaluate. Where does the water come from? What technology is used to process and transport it? These are among the questions researchers have to answer in order to measure environmental impact more accurately.

The many layers of our environmental impact

An activity can affect the environment in many ways: its carbon footprint, water consumption, pollution, changes to biodiversity, and so on. Our impacts are so complex that an entire field of research has developed to evaluate and compare them. At IMT Mines Alès, a team of researchers is working on tools to improve the way we measure our impacts and thus provide as accurate a picture as possible of our environmental footprint. Miguel Lopez-Ferber, the team’s lead researcher, presents some of the most important research questions in improving methods of environmental evaluation. He also explains the difficulty of establishing indicators and having them approved for effective decision-making.

 

Can we precisely evaluate all of the impacts of a product on the environment?

Miguel Lopez-Ferber: We do know how to measure some things. A carbon footprint, or the pollution generated by a product or a service. The use of phytosanitary products is another impact we know how to measure. However, some things are more difficult to measure. The impacts linked to the water consumption required in the production of a product have been extremely difficult to evaluate. For a given use, one liter of water taken from a region may generate very different impacts from a liter of water taken from another region. The type of water, the climate, and even the source of the electricity used to extract, transport and process it will be different. We now know how to do this better, but not yet perfectly. We also have trouble measuring the impact on biodiversity due to humans’ development of a territory.

Is it a problem that we cannot fully measure our impact?

MLF: If we don’t take all impacts into account, we risk not noticing the really important ones. Take a bottle of fruit juice, for example. If we only look at the carbon footprint, we will choose a juice made from locally-grown fruit, or one from a neighboring country. Transport does play a major part in a carbon footprint. However, local production may use a water source which is under greater stress than one in a country further away. Perhaps it also has a higher impact on biodiversity. We can have a distorted view of reality.

What makes evaluating the water footprint of a product difficult?

MLF: The first difficulty is differentiating between the different types of water. You have to know where the water comes from. The impact won’t be the same for water taken from a reserve under the Sahara as for water from the Rhône. The scarcity of the water must be evaluated for each production site. Another sensitive point is understanding the associated effects. In a given region, the mix of water used may correspond to 60% surface water, 30% river water and 10% underground water, but these figures alone do not give us the environmental impacts. Each source then has to be analyzed to determine whether taking the water has consequences, such as drying out a reserve. We also need to be able to differentiate the various uses of water in a given region, as well as the associated socio-economic conditions, which strongly influence the choice of technology used to transport and process the water.
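As a rough illustration of this point, the sketch below (in Python, with entirely hypothetical impact factors) shows how the same regional water mix can yield very different impact scores once each source is weighted by a region-specific scarcity factor, which is why the mix alone is not enough.

```python
# A minimal sketch (hypothetical figures) of why a regional water mix
# alone is not enough: the same mix leads to very different impact
# scores once each source is weighted by a region-specific factor, as
# in regionalised life cycle inventories.

water_mix = {"surface": 0.60, "river": 0.30, "groundwater": 0.10}  # shares of withdrawals

# Hypothetical impact factors per m3 withdrawn (higher = scarcer or more
# sensitive source); real factors come from regionalised databases.
impact_factors = {
    "temperate_region": {"surface": 0.2, "river": 0.3, "groundwater": 0.8},
    "arid_region":      {"surface": 1.5, "river": 2.0, "groundwater": 4.0},
}

withdrawal_m3 = 1.0  # per functional unit, e.g. one bottle of juice

for region, factors in impact_factors.items():
    score = withdrawal_m3 * sum(share * factors[src] for src, share in water_mix.items())
    print(f"{region}: weighted water impact = {score:.2f} (arbitrary units)")
```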

What can we determine about the impact of water use?

MLF: Susana Leão’s thesis, co-supervised by my colleague Guillaume Junqua, has provided a regional view of inventories. It presents the origin of the water in each region according to the various household, agricultural or industrial uses, along with the associated technologies. Before, we only had average origin data by continent: I had the average water consumption for one kilogram of steel produced in Europe, without knowing if the water came from a river or from a desalination process, for example. Things became more complicated when we looked at the regional details. We now know how to differentiate the composition of one country’s water mix from another’s, and even to differentiate between the major hydrographic basins. Depending on the data available, we can also focus on a smaller area.

In concrete terms, how does this work contribute to studying impacts?

MLF: As a result, we can differentiate between production sites in different locations. Each type of water on each site will have different impacts, and we are able to take this into account. In addition, in analyzing a product like our bottle of fruit juice, we can categorize the impacts into those which are introduced on the consumption site, in transport or waste, for example, and those which are due to production and packaging. In terms of life cycle analysis, this helps us to understand the consequences of an activity on its own territory as well as other territories, near or far.

Speaking of territories, your work also looks at habitat fragmentation. What does this mean?

MLF: When you develop a business, you need a space to build a factory. You develop roads and transform the territory. These changes disturb ecosystems. For instance, we found that modifications made to a particular surface area may have very different impacts. For example, if you simply decrease the surface area of a habitat without splitting it, the species are not separated. On the contrary, if you fragment the area, species have trouble traveling between the different habitats and become isolated. We are therefore working on methods for evaluating the distribution of species and their ability to interconnect across different fragments of habitat.

With the increasing number of impact indicators, how do we take all of these footprints into account?

MLF: It’s very complicated. When a life cycle analysis of a product such as a computer is made, this includes a report containing around twenty impact categories: climate change, pollution, heavy metal leaching, radioactivity, water consumption, eutrophication of aquatic environments, etc. However, decision-makers would rather see fewer parameters, so they need to be aggregated into categories. There are essentially three categories: impact on human health, impact on ecosystems, and overuse of resources. Then, the decisions are made.

How is it possible to decide between such important categories?

MLF: Impact reports always raise the question of what decision makers want to prioritize. Do they want a product or service that minimizes energy consumption? Waste production? Use of water resources? Aggregation methods are already based on value scales and strong hypotheses, so the final decision is too. There is no way of setting a universal scale, as the underlying values are not universal. The weighting of the different impacts will depend on the convictions of a particular decision maker and on the geographical location. The work involves more than traditional engineering; there is a sociological aspect too. This is where arbitration enters the realm of politics.

Using personalised services without privacy loss: what solutions does technology have to offer?

Online services are becoming more and more personalised. This transformation, designed to satisfy the end user, can be seen as an economic opportunity, but also as a risk, since personalised services usually require personal data to be effective. These two perceptions do not seem compatible. Maryline Laurent and Nesrine Kaâniche, researchers at Télécom SudParis and members of the Chair Values and Policies of Personal Information, tackle this difficult issue in this article. They give an overview of how technology can solve this equation by allowing both personalisation and privacy.

[divider style=”normal” top=”20″ bottom=”20″]

This article was initially published on the Chair Values and Policies of Personal Information website.

[divider style=”normal” top=”20″ bottom=”20″]

 

[dropcap]P[/dropcap]ersonalised services have become a major stake in the IT sector, as they require actors to improve both the quality of the data they collect and their ability to use it. Many services are running the innovation race, notably those related to companies’ information systems, government systems, e-commerce, access to knowledge, health, energy management, leisure and entertainment. The point is to offer end users the best possible quality of experience, which in practice means assessing the relevance of the information provided and continuously adapting services to consumers’ uses and preferences.

Personalised services offer many perks, including targeted recommendations based on interests, events, news, special offers for local services or goods, movies, books, and so on. Search engines return results that are usually personalised based on a user’s profile, and actually start personalising as soon as a keyword is entered, by identifying its semantics. For instance, the noun ‘mouse’ may refer to a small rodent if you’re a vet, a stay mouse if you’re a sailor, or a device that moves the cursor on a computer screen if you’re an Internet user. Mobile phone applications in particular use personalisation; health and wellness apps (e.g. the new FitBit and Vivosport trackers) can come in very handy as they offer tips to improve one’s lifestyle, help users receive medical care remotely, or warn them of any possible health issue they detect as being related to a known illness.

How is personalisation technologically carried out?

When surfing the Internet and using mobile phone services or apps, users are required to authenticate. Authentication links their digital identity to the personal data that is saved and collected from their exchanges. Some software packages also include trackers, such as cookies, which are exchanged between a browser and a service provider, or even a third party, and allow individuals to be tracked. Once an activity is linked to a given individual, a provider can easily enrich their profile with personal data, e.g. preferences and interests, and run efficient algorithms, often based on artificial intelligence (AI), to provide them with a piece of information, a service or targeted content. Sometimes, although more rarely, personalisation may rely solely on a situation experienced by a user – the simple fact that they are geolocated in a certain place can trigger an ad or targeted content to be sent to them.
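As a minimal illustration of this profile-based targeting, the Python sketch below scores a small hypothetical content catalogue against a user profile of weighted interests. Real systems use far richer profiles and machine-learned models, but the principle of matching content to an identified profile is the same.

```python
# A minimal sketch of profile-based targeting: once activity is linked
# to an identified user, a provider can score content against the
# interests stored in that user's profile. The profile, weights and
# catalogue below are hypothetical.

user_profile = {"sailing": 0.9, "electronics": 0.4, "cooking": 0.1}

catalogue = [
    {"title": "Rigging basics: the stay mouse", "tags": {"sailing": 1.0}},
    {"title": "Choosing a wireless mouse",      "tags": {"electronics": 1.0}},
    {"title": "Weeknight pasta recipes",        "tags": {"cooking": 1.0}},
]

def score(item, profile):
    # Simple dot product between the item's tags and the user's interests.
    return sum(weight * profile.get(tag, 0.0) for tag, weight in item["tags"].items())

ranked = sorted(catalogue, key=lambda item: score(item, user_profile), reverse=True)
for item in ranked:
    print(f"{score(item, user_profile):.2f}  {item['title']}")
```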

What risks may arise from enhanced personalisation?

Enhanced personalisation creates risks, for users in particular. Based on geolocation data alone, a third party may determine that a user goes to a specialised medical centre to treat cancer, or that they often spend time at a legal advice centre, a place of worship or a political party’s local headquarters. If such personal data is sold on a marketplace[1] and thus made accessible to insurers, credit institutions, employers and lessors, its use may breach the user’s privacy and freedom of movement. And this is just one kind of data. If it were cross-referenced with a user’s pictures, Internet clicks, credit card purchases and heart rate, what further behavioural conclusions could be drawn? How could those be used?

One example that comes to mind is price discrimination,[2] i.e. charging different prices for the same product or service to different customers according to their location or social group. Democracies can also suffer from personalisation, as the Cambridge Analytica scandal has shown. In April 2018, Facebook confessed that U.S. citizens’ votes had been influenced through targeted political messaging in the 2016 election.

Responsible vs. resigned consumers

As pointed out in a survey carried out by the Chair Values and Policies of Personal Information (CVPIP) with the French audience measurement company Médiamétrie,[3] some users and consumers have adopted data protection strategies, in particular by using software that prevents tracking or enables anonymous online browsing. Yet this requires a certain effort on their part. Depending on their purpose, they choose either a personalised service or a generic one in order to gain some control over their informational profile.

What if technology could solve the complex equation opposing personalised services and privacy?

Based on this observation, the Chair’s research team carried out a scientific study on Privacy Enhancing Technologies (PETs). In this study, we list the technologies that are best able to meet needs in terms of personalised services, give technical details about them and analyse them comparatively. As a result, we suggest classifying these solutions into 8 families, which are themselves grouped into the following 3 categories:

  • User-oriented solutions. Users manage the protection of their identity themselves by downloading software that allows them to control outgoing personal data. Protection solutions include attribute disclosure minimisation and noise addition (see the sketch after this list), privacy-preserving certification,[4] and secure multiparty computation (i.e. calculations distributed among several independent collaborators).
  • Server-oriented solutions. Any server we use is by nature strongly involved in personal data processing. Several protection approaches focus on servers, as these can anonymise databases in order to share or sell data, run heavy calculations on encrypted data upon customer request, implement solutions for automatic data self-destruction after a certain amount of time, or implement Private Information Retrieval solutions, i.e. non-intrusive content search tools that confidentially return relevant content to customers.
  • Channel-oriented solutions. What matters here is the quality of the communication channel that connects users with servers, be it intermediated and/or encrypted, and the quality of the exchanged data, which may be damaged on purpose. There are two approaches to such solutions: securing communications and using trusted third parties as intermediaries in a communication.
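As an illustration of the noise-addition approach in the first family, here is a minimal Python sketch of randomised response, a classic local-perturbation technique: each user flips their sensitive yes/no answer with some probability before sending it, and the server can still estimate the population share. The parameters and population are hypothetical.

```python
# A minimal sketch of user-side noise addition through randomised
# response. Each user perturbs a sensitive yes/no attribute locally
# before sending it; the server estimates the population share without
# learning any individual's true answer. Parameters are illustrative.
import random

random.seed(1)

def randomised_response(true_value: bool, p_truth: float = 0.75) -> bool:
    """Report the true value with probability p_truth, otherwise a coin flip."""
    if random.random() < p_truth:
        return true_value
    return random.random() < 0.5

def debias(reported_share: float, p_truth: float = 0.75) -> float:
    """Invert the perturbation to estimate the true population share."""
    return (reported_share - (1 - p_truth) * 0.5) / p_truth

# Simulated population: 30% of users truly have the sensitive attribute.
truth = [random.random() < 0.30 for _ in range(50_000)]
reports = [randomised_response(t) for t in truth]

reported_share = sum(reports) / len(reports)
print("true share     :", round(sum(truth) / len(truth), 3))
print("reported share :", round(reported_share, 3))
print("debiased share :", round(debias(reported_share), 3))
```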

Some PETs are strictly in line with the ‘data protection by design’ concept as they implement data disclosure minimisation or data anonymisation, as required by Article 25-1 of the General Data Protection Regulation (GDPR).[5] Data and privacy protection methods should be implemented at the earliest possible stages of conceiving and developing IT solutions.

Our state-of-the-art review shows that using PETs raises many issues. Through a cross-cutting analysis linking the different fields of expertise of the CVPIP’s specialists, we were able to identify several of these challenges:

  • Using AI to better include privacy in personalised services;
  • Improving the performance of existing solutions by adapting them to the limited capacities of mobile phone personalised services;
  • Looking for the best economic trade-off between privacy, use of personal data and user experience;
  • Determining how much it would cost industrial players to include PETs in their solutions, in terms of development, business model and adjustment of their Privacy Impact Assessment (PIA);
  • Examining PETs as a way of bypassing or enforcing legislation.

Glioblastoma is a type of brain tumor. It remains difficult to treat. Image: Christaras A / Wikimedia.

Glioblastoma: electric treatment?

At Mines Saint-Étienne, the ATPulseGliome project is looking into a new form of cancer treatment. This therapeutic approach is aimed at fighting glioblastoma, an especially aggressive form of brain cancer, using electrical stimulation. It could eventually increase the life expectancy of glioblastoma patients in comparison with chemotherapy and radiotherapy treatment. 

 

Glioblastoma is a rare form of brain cancer. Of the 400,000 new cases of cancer recorded each year in France, it accounts for 2,400, or 0.6%. Unfortunately, it is also one of the most severe forms: with treatment, the life expectancy of glioblastoma patients is between 12 and 15 months. To improve the survival chances of those affected, Mines Saint-Étienne is leading the ATPulseGliome project, in collaboration with the University of Limoges and with funding from the EDF Foundation. The team of researchers, led by Rodney O’Connor, is testing an approach using electric fields to find new types of treatment.

Glial cells are located around the neurons in our brain. These are the cells affected by glioblastoma. “One particular type of glial cell is affected: the astrocytes,” says Hermanus Ruigrok, a cellular biologist at Mines Saint-Étienne and researcher on the ATPulseGliome project. The normal role of astrocytes is to provide nutrients to neurons, repair brain lesions, and ensure separation between the nervous system and the blood circulation system. Like all cells, astrocytes regenerate by division. Glioblastoma develops when astrocytes behave abnormally, dividing uncontrollably.

Targeting glioblastoma without affecting healthy cells

ATPulseGliome is looking into a form of treatment based on electrical stimulation of cancer cells. “Healthy astrocytes are not sensitive to electricity, but glioblastoma cells are,” explains Hermanus Ruigrok. This difference is the foundation of the treatment strategy, which targets only the cancer cells, not the astrocytes and other healthy cells. Glioblastoma cells have a larger number of electricity-sensitive proteins in their membrane.

These proteins act as doors, letting ions in and out and thus enabling communication between the cell and the outside environment. Under electrical stimulation, these doors malfunction: an unusually high number of ions then enter the cancer cell, causing harmful effects. “This strategy will allow us to destroy only the cancer cells, and not the healthy astrocytes, which are not sensitive to the electrical stimulus,” Hermanus Ruigrok highlights.

Glioblastoma cells, marked with fluorescence.


 

It is still much too early to trial this technique on patients. The ATPulseGliome team is working on glioblastoma cell lines. These cells originally come from a patient with this form of cancer, but they are cultivated, reproduced and isolated for in vitro experiments. By eliminating the complex molecular interactions present in a real patient, this first step helps clarify the scientific objectives and test the feasibility of in vivo tests. During this phase, the researchers will look at the different types of electrodes to be used for applying an electric field, determine the characteristics of the electrical signal required to stimulate the cells, and measure the initial responses of glioblastoma cells to the electrical impulses.

To complete these steps, the team at Mines Saint-Étienne is working with Institut des neurosciences de la Timone in Marseille. “We want to take on as many specialists as possible, as the project requires a range of skills: biologists, electronics engineers, neurologists, surgeons, etc.”, Hermanus Ruigrok explains. Although it is a lengthy procedure, this multidisciplinary approach could increase the life expectancy of patients. “We can’t say in advance how much more effective this electrical field approach will be compared with the chemotherapy and radiotherapy currently used”, the researcher explains. “Although it may be difficult to fully cure this cancer, being able to monitor and limit its development in order to significantly increase the life expectancy of those affected by glioblastoma would be a form of satisfaction.”

 

 

AI4EU: a project bringing together a European AI community

On January 10th, the AI4EU project (Artificial Intelligence for the European Union), an initiative of the European Commission, was launched in Barcelona. This 3-year project led by Thalès, with a budget of €20 million, aims to bring Europe to the forefront of the world stage in the field of artificial intelligence. While the main goal of AI4EU is to gather and coordinate the European AI community as a single entity, the project also aims to promote EU values: ethics, transparency and algorithmic explainability. TeraLab, the AI platform at IMT, is an AI4EU partner. Interview with its director, Anne-Sophie Taillandier.

 

What is the main goal of the AI4EU H2020 project?

Anne-Sophie Taillandier: To create a platform bringing together the Artificial Intelligence (AI) community and embodying European values: sovereignty, trust, responsibility, transparency, explainability… AI4EU seeks to make AI resources, such as data repositories, algorithms and computing power, available to all users in every sector of society and the economy. This includes everyone from citizens interested in the subject to SMEs seeking to integrate AI components, start-ups, large groups and researchers—all with the goal of boosting innovation, reinforcing European excellence and strengthening Europe’s leading position in key areas of artificial intelligence research and applications.

What is the role of this platform?

AST: It primarily plays a federating role. AI4EU, with 79 members in 21 EU countries, will provide a unique entry point for connecting with existing initiatives and accessing various competences and expertise pooled together in a common base. It will also play a watchdog role and will provide the European Commission with the key elements it needs to orient its AI strategy.

TeraLab, the IMT Big Data platform, is also a partner. How will it contribute to this project?

AST: Along with Orange, TeraLab coordinates the “Platform Design & Implementation” work package. We provide users with experimentation and integration tools that are easy to use without prior theoretical knowledge, which accelerates the start-up phase for projects developed using the platform. For common questions that arise when launching a new project, such as the necessary computing power, data security, etc., TeraLab offers well-established infrastructure that can quickly provide solutions.

Which use cases will you work on?

AST: The pilot use cases focus on public services, the Internet of Things (IoT), cybersecurity, health, robotics, agriculture, the media and industry. These use cases will be supplemented by open calls launched over the course of the project. The open calls will target companies and businesses that want to integrate platform components into their activities. They could benefit from the sub-grants provided for in the AI4EU framework: the European Commission funds the overall project, which in turn funds companies proposing convincing projects, through a total dedicated budget of €3 million.

Ethical concerns represent a significant component of European reflection on AI. How will they be addressed?

AST: They certainly represent a central issue. The project governance will rely on a scientific committee, an industrial committee and an ethics committee that will ensure transparency, reproducibility and explainability by means of tools including charters, indicators and labels. Far from being an obstacle to business development, the emphasis on ethics creates added value and a distinguishing feature for this platform and community. The guarantee that data will be protected and used in an unbiased manner is a competitive advantage for the European vision. Beyond data protection, other ethical aspects such as gender parity in AI will also be taken into account.

What will the structure and coordination look like for this AI community initiated by AI4EU?

AST: The project members will meet at 14 events in 14 different countries to gather as many stakeholders as possible throughout Europe. Coordinating the community is an essential aspect of this project. Weekly meetings are also planned. Every Thursday morning, as part of a “world café”, participants will share information, feedback, and engage in discussions between suppliers and users. A digital collaborative platform will also be established to facilitate interactions between stakeholders. In other words, we are sure to keep in touch!

 

AI4EU consortium members

SPARTA is a European project bringing together leading researchers in cybersecurity to respond to new challenges facing our increasingly connected society.

SPARTA: defining cybersecurity in Europe

The EU H2020 program is continuing its efforts to establish scientific communities in Europe through the SPARTA project dedicated to cybersecurity. This 3-year project will bring together researchers to take up new cybersecurity challenges: defending against new attacks, protecting highly connected computing environments and securing artificial intelligence. Hervé Debar, a researcher in cybersecurity at Télécom SudParis participating in SPARTA, explains the content of this European initiative led by the CEA, with the participation of Télécom ParisTech, IMT Atlantique and Mines Saint-Étienne.

 

What is the goal of SPARTA?

Hervé Debar: The overall goal of SPARTA is to establish a European cybersecurity community. The future European regulation on cybersecurity proposes to found a European center for cybersecurity competencies in charge of coordinating a community of national centers. In the future, this European center will have several responsibilities, including leading the European Commission’s R&D program in the field of cybersecurity. This will involve defining program objectives, issuing calls for proposals, selecting projects and managing their completion.

What scientific challenges must the SPARTA project take up?

HD: The project encompasses four major research programs. The first, T-SHARK, addresses the issue of detecting and fighting against cyberattacks. The second, CAPE, is aimed at validating security and safety features for objects and services in dynamic environments. The third, HAII-T, offers security solutions for hardware environments. Finally, the fourth, SAFAIR, is aimed at ensuring secure and understandable artificial intelligence.

Four IMT schools are involved in SPARTA: Télécom SudParis, IMT Atlantique, Télécom ParisTech and Mines Saint-Étienne. What are their roles in this project?

HD: The schools will contribute to different aspects of this project. The research will be carried out within the CAPE and HAII-T programs to work on issues related to hardware certification and security, or the security of industrial systems. The schools will also help coordinate the network and develop training programs.

Where did the idea for this project originate?

HD: It all started with the call for proposals by the H2020 program for establishing and operating a pilot cybersecurity competencies network. As soon as the call was launched, the French scientific community came together to prepare and coordinate a response. The major constraints were related to the need to bring together at least 20 partners from at least 9 countries to work on 4 use cases. The project has been established with four national communities: France, Spain, Italy and Germany. It includes a total of 44 partners from 13 countries to work on 4 R&D programs.

Which use cases will you work on?

HD: The project defines several use cases—this was one of the eligibility requirements for the proposal. The first use case is that of connected vehicles, verifying their cybersecurity and operational safety features, which could be integrated into a test vehicle like EuroNCAP. The second use case will look at complex and dynamic software systems to ensure user confidence in complex computer systems and study the impact of rapid development cycles on security and reliability. The intended applications are in the areas of finance and e-government. Other use cases will be developed over the course of the project.

What will the structure and coordination look like for this SPARTA community?

HD: Coordinating the community will require a network of organizations beyond the SPARTA partners themselves. The organizations that have been contacted are interested in the operations and results of the SPARTA project for several reasons. Two types of organizations have been contacted: professional organizations and public institutions. In terms of institutions, French regions, including Ile-de-France and Brittany, are contributing to defining the strategy and co-funding the research. In terms of professional organizations, the ACN (Alliance pour la Confiance Numérique) and competitiveness clusters like Systematic help provide information on the needs of the industrial sector and enrich the project’s activities.

 

[divider style=”solid” top=”20″ bottom=”20″]

SPARTA: a diverse community with bold ambitions

The SPARTA consortium, led by the CEA, brings together a balanced group of 44 stakeholders from 14 Member States. In France, this includes ANSSI, IMT, INRIA, Thales and YesWeHack. The consortium seeks to re-imagine the way cybersecurity research, innovation and training are performed in the European Union, drawing on a range of fields of study and expertise, from scientific foundations to applications, in both the academic and industrial sectors. By pooling and coordinating these experiences, competencies, capacities and challenges, SPARTA will contribute to ensuring the strategic autonomy of the EU.

[divider style=”solid” top=”20″ bottom=”20″]


From personal data to artificial intelligence: who benefits from our clicking?

Clicking, liking, sharing: all of our digital activities produce data. This information, which is collected and monetized by big digital information platforms, is on its way to becoming the virtual black gold of the 21st century. Have we all become digital workers? Digital labor specialist and Télécom ParisTech researcher Antonio Casilli has recently published a work entitled En attendant les robots, enquête sur le travail du clic (Waiting for Robots, an Inquiry into Click Work). He sat down with us to shed some light on this exploitation 2.0.

 

Who we are, what we like, what we do, when and with whom: our virtual personal assistants and other digital contacts know everything about us. The digital space has become the new sphere of our private lives. This virtual social capital is the raw material of the tech giants. The profitability of digital platforms like Facebook, Airbnb, Apple and Uber relies on the massive analysis of users’ data for advertising purposes. In his book En attendant les robots, enquête sur le travail du clic (Waiting for Robots, an Inquiry into Click Work), Antonio Casilli explores the emergence of surveillance capitalism, an opaque and invisible form of capitalism marking the advent of a new digital proletariat: digital labor – or working with our digits. From the click worker who performs microtasks, aware of and paid for his activity, to the user who produces data implicitly, the sociologist analyzes the hidden face of this work carried out outside the world of work, and the all-too-tangible reality of this intangible economy.

Read on I’MTech: What is digital labor?

Antonio Casilli focuses particularly on online platforms’ ability to put to work users who are convinced that they are consumers rather than producers. “Free access to certain digital services is merely an illusion. Each click fuels a vast advertising market and produces data which is mined to develop artificial intelligence. Every “like”, post, photo, comment and connection fulfils one condition: producing value. This digital labor is either very poorly paid or entirely unpaid, since no one receives compensation that measures up to the value produced. But it is work nevertheless: a source of value that is traced, measured, assessed and contractually regulated by the platforms’ terms and conditions of use,” explains the sociologist.

The hidden, human face of machine learning

For Antonio Casilli, digital labor is a new form of work which remains invisible, though it is produced from our digital traces. Far from marking the disappearance of human labor as robots take over the work humans once did, this click work blurs the boundaries between work produced implicitly and formally recognizable employment. And for good reason: microworkers paid by the task and user-producers like ourselves are indispensable to these platforms. The data they produce serves as the basis for machine learning models: behind the automation of a given task, such as visual or text recognition, humans are actually fueling applications by indicating clouds on images of the sky, for example, or by typing out words.

“As conventional wisdom would have it, these machines learn by themselves. But to calibrate their algorithms or improve their services, platforms need a huge number of people to train and test them,” says Antonio Casilli. One of the best-known examples is Mechanical Turk, a service offered by the American giant Amazon. Ironically, its name refers to a hoax dating back to the 18th century: an automaton chess player, called the “Mechanical Turk”, was able to win games against human opponents. But the Turk was actually operated by a real human hiding inside.

Likewise, certain so-called “smart” services rely heavily on unskilled workers: a sort of “artificial” artificial intelligence. In this work designed to benefit machines, digital workers are poorly paid to carry out micro-tasks. “Digital labor marks the appearance of a new way of working which can be called ‘taskified’, since human activity is reduced to a simple click, and ‘datafied’, because it is a matter of producing data so that digital platforms can obtain value from it,” explains Antonio Casilli. And this is how data can do harm. Alienation and exploitation: internet task workers in northern countries and, more commonly, their counterparts in India, the Philippines and other developing countries with low average earnings are sometimes paid less than one cent per click.

Legally regulating digital labor?

For now, these new forms of work fall outside salary standards. Nevertheless, in recent years there has been an increasing number of class action suits against tech platforms to claim certain rights. Following the example of Uber drivers and Deliveroo delivery workers, individuals have taken legal action in an attempt to have their commercial contracts reclassified as employment contracts. Antonio Casilli sees three possible ways to help combat job insecurity for digital workers and bring about social, economic and political recognition of digital labor.

“From Uber to platform moderators, traditional labor law—meaning reclassifying workers as salaried employees—could lead to the recognition of their status. But dependent employment may not be a ‘one-size-fits-all’ solution. There are also a growing number of cooperative platforms being developed, where the users become owners of the means of production and algorithms.” Still, for Antonio Casilli, there are limitations to these advances. He sees a third possible solution. “When it comes to our data, we are not small-scale owners or small-scale entrepreneurs. We are small-scale data workers. And this personal data, which is neither private nor public, belongs to everyone and no one. Our privacy must be a collective bargaining tool. Institutions must still be invented and developed to make it into a real common asset. The internet is a new battleground,” says the researcher.

Toward taxation of the digital economy

Would this make our personal data less personal? “We all produce data. But this data is, in effect, a collective resource, which is appropriated and privatized by platforms. Instead of paying individuals for their data on a piecemeal basis, these platforms should return, or give back, the value extracted from this data to national or international authorities, through fair taxation,” explains Antonio Casilli. In May 2018, the General Data Protection Regulation (GDPR) came into effect in the European Union. Among other things, this text protects data as an attribute of personality rather than as property. Therefore, in theory, everyone can now freely consent, at any moment, to the use of their personal data and withdraw this consent just as easily.

While regulation in its current form involves a set of protective measures, setting up a tax system like the one put forward by Antonio Casilli would make it possible to establish an unconditional basic income. The very act of clicking or sharing information could entitle individuals to these royalties and allow each user to be paid for content posted online. This income would not be linked to the tasks carried out, but would recognize the value created through these contributions. By 2020, over 20 billion devices will be connected to the Internet of Things. According to some estimates, the data market could reach nearly €430 billion per year by then, equivalent to a third of France’s GDP. Data is clearly a commodity unlike any other.

[divider style=”dotted” top=”20″ bottom=”20″]

En attendant les robots, enquête sur le travail du clic (Waiting for Robots, an Inquiry into Click Work)
Antonio A. Casilli
Éditions du Seuil, 2019
400 pages
24 € (paperback) – 16,99 € (e-book)

 

Original article in French written by Anne-Sophie Boutaud, for I’MTech.

 


Improving organization in hospitals through digital simulation

How can we improve emergency room wait times, better manage scheduled hospitalizations and cope with unexpected surges of patients? Vincent Augusto, a researcher in healthcare systems engineering at Mines Saint-Étienne, is working to find solutions to these problems. He is developing programs based on digital simulation, aimed at optimizing patient flows and waiting times at the hospital, especially in emergency care facilities.

 

Chronic emergency department saturation and unacceptable wait times for receiving care are regularly listed among areas in need of improvement. Several avenues have been studied: taking preventive action upstream to reduce influxes of patients, improving organization within emergency departments, and managing hospitalizations in advance. Vincent Augusto and his team from the MedTechDesign living lab at the engineering and healthcare center at Mines Saint-Étienne have developed models that contribute to the last two areas by providing valuable information. “We worked on successive projects with hospitals to develop programs using digital simulation. The principle is that any system can potentially be monitored and reproduced based on the data it generates; being able to process this data in real time would help to optimize resources. Unfortunately, major inequalities exist in terms of computerization from one hospital to another.”

Vincent Augusto specializes in modeling, analyzing and managing patient flows in hospitals. “At the hospital in Firminy, we modeled unforeseen arrivals in the emergency department to get a better idea of the number of beds required and to improve planning for scheduled patients.” The departments schedule hospitalizations for patients needing diagnostic scans or treatment. However, since it is difficult to predict the number of available places in advance, scheduled hospitalizations must sometimes be canceled at the last minute, forcing patients to wait longer to receive care. On the other hand, the shortage of beds leads to overcrowded emergency services. Improving the management of the internal and external flow of patients in hospitals is therefore of utmost importance.

A modular digital twin

At the university hospital (CHU) in Saint-Étienne, the team developed a digital twin for the emergency department. This twin helped assess the different measures that could be implemented to improve emergency operations. Vincent Augusto explains how this was developed: “First, there is an on-site observation phase. We collect data using existing software. Next, there is a development phase in which we seek to understand and model the flow of patients in the department and create an initial model on paper that is confirmed by the department staff. We can then create a digital assessment model that reproduces the way the emergency department operates, which then undergoes a validation phase.”

The researchers use the department’s activities from the previous year to accomplish this. They enter the data into the system and check if the indicators predicted by the model match those recorded at the time. This approach involves three different components: the first analyzes the patient care circuit, the second analyzes human resources based on type of activity and the third focuses on the organization and interdependence of the resources. “Once this model has been validated, we can use the modular system to test different scenarios: we can alter the human resources, simulate the arrival of an inflow of patients, reduce the wait time for taking further tests—such as scans—or the time required to transfer a patient to a hospital ward,” the researcher explains.
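For readers curious what such a model looks like in code, here is a highly simplified discrete-event sketch written with the simpy Python library. The arrival rate, number of doctors and consultation times are hypothetical placeholders rather than data from the Saint-Étienne emergency department, and a real digital twin would add triage categories, downstream beds and validated parameters.

```python
# A highly simplified sketch of a discrete-event emergency department
# model, written with the simpy library (pip install simpy). Arrival
# rate, staffing and consultation times are hypothetical placeholders.
import random
import simpy

random.seed(42)
ARRIVAL_MEAN = 12.0      # minutes between arrivals (exponential)
CONSULT_MEAN = 30.0      # minutes of consultation (exponential)
N_DOCTORS = 3
SIM_MINUTES = 24 * 60    # simulate one day

wait_times = []

def patient(env, doctors):
    arrival = env.now
    with doctors.request() as req:          # queue for an available doctor
        yield req
        wait_times.append(env.now - arrival)
        yield env.timeout(random.expovariate(1.0 / CONSULT_MEAN))

def arrivals(env, doctors):
    while True:
        yield env.timeout(random.expovariate(1.0 / ARRIVAL_MEAN))
        env.process(patient(env, doctors))

env = simpy.Environment()
doctors = simpy.Resource(env, capacity=N_DOCTORS)
env.process(arrivals(env, doctors))
env.run(until=SIM_MINUTES)

print(f"patients seen      : {len(wait_times)}")
print(f"average wait (min) : {sum(wait_times) / len(wait_times):.1f}")
```

Changing N_DOCTORS or the arrival rate and re-running the model is the kind of scenario testing described above, carried out without disrupting the real department.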

The first measure tested was to divide emergencies into three groups: serious emergencies (road accidents, respiratory problems, etc.), functional emergencies (sprains, wounds requiring stitches, etc.) and fast functional emergencies (requiring care that can be quickly provided). Upon entering, the patients are directed to one of these three groups led by different teams. According to Vincent Augusto and the system users, “this makes it possible to clearly assess the savings in terms of time and costs that are related to organizational changes or an increase in human resources, before any real changes are made. This is a big plus for the departments, since organizational changes can be very time-consuming and costly and sometimes have only a small impact.”

Assessing the real impact these organizational measures would have on emergency department operations made it possible to move on to another potential area for improvement: the creation of a psychiatric section within the emergency department, with beds reserved for these patients. To help draw up the plans for the future emergency department, the team from Mines Saint-Étienne is developing a virtual reality interface to view patient flows directly and realistically, more easily than with the indicators and charts generated by the digital simulation system. The goal is to optimize the patient circuit within the department and the medical care patients receive.

Improving hospitals’ resilience in unexpected events

This method also offers management support for crisis situations involving a massive influx of patients to the emergency department in the event of disasters, attacks or epidemics. “The system was developed to manage, in addition to the usual flow of patients, an exceptional yet predictable arrival of patients,” the researcher explains. It is therefore useful for surge plans (plans de tension): exceptional situations that push the system beyond its capacity. In such cases, the department faces a critical emergency-response situation that can lead to a French “white plan” being declared, in which non-priority activities are cancelled.

To accomplish this, the program is updated in real time via a direct connection to the hospital’s computer systems. It can therefore determine the exact state of the department at any time. By entering a varying number of patients with specific pathologies in a given situation (flu-related respiratory difficulties, gunshot wounds, etc.), the simulation can determine the most effective measures to take. This is what the engineers call an operational tool. “In the short and medium term, the departments now have a tool that can help them optimize their response to the problems they face and improve the care patients receive,” concludes Vincent Augusto.

Original article in French written by Sarah Balfagon, for I’MTech.


Industry of the future: The German-French Academy launches seven new projects

Following a call for proposals launched by the German-French Academy for researchers at IMT and TUM (Technische Universität München), seven projects were selected in October 2018. The projects focus on key topics for the German-French Academy for the Industry of the Future. A French-German platform for AI will soon be launched.

 

The selected projects focus on six topics: AI for the industry of the future, advanced manufacturing, advanced materials, supply chain and logistics, energy efficiency, and industrial design and processes. They will receive initial seed funding from the German-French Academy for the Industry of the Future, founded by IMT and TUM.

For Christian Roux, Executive Vice President for Research and Innovation at IMT, “The German-French Academy is expanding its scope of exploration to provide solutions to strategic topics related to the industry of the future in order to support and accelerate the digital transformation of French and German industry.”

Alloy Design for Additive Manufacturing (ADAM)

This project focuses on additive manufacturing, in particular Laser Beam Melting (LBM). It aims to optimize the choice of alloy composition used in this additive manufacturing process so as to limit defects and optimize the mechanical properties of the final product. This optimization will be based on processing large amounts of data collected by the research team at Mines Saint-Étienne, and on experimental resources equipped with very high-speed cameras at TUM.

Additive Manufacturing for the Industry of the Future

This project aims to analyze the impact of the introduction of additive manufacturing in industry, focusing on three main areas. The first involves industrial organization (supply chain, use of open-source materials, integration into new innovation ecosystems), the second concerns companies (new duties, new skills, new business models) and the third focuses on organizational changes in the design process (new design possibilities, mass customization, user-centered design, etc.). The changes resulting directly from the introduction of additive manufacturing itself will also be studied.

Smart Artificial Intelligence based Modeling and Automation of Networked Systems (AI4Performance)

This project is intended to develop a smart approach for testing and evaluating networked systems (while collecting data at the same time). The process will be based on using innovative learning methods (Graph Neural Network) on data provided by partners Cisco and Airbus. This will involve analyzing the impact of changes (increase in the number of users, integration of new sub-systems, virtual machines etc.), detecting bottlenecks and analyzing the root cause as well as detecting malfunctions.

Data-driven Collaboration in Industrial Supply Chains (DISC)

Against the backdrop of digital transformation of industry, this project focuses on supply chain optimization through collaboration, especially in terms of incentives for information sharing. The approach will rely on methods derived from game theory to improve the decision-making process, which is increasingly decentralized as a result of digital transformation.

Modeling and decision-making platform for Reconfigurable, Digitalized Servitized Production systems (RDS-Production)

This research project aims to develop methods for designing reconfigurable production systems based on modeling interoperable components and software (using digital twins), AI techniques and operational research for decision-making support in reconfigurations, service life cycle approaches for production system equipment, and multi-criteria decision-making methods.

Smart Sensor Technology with Decentralized Processing Architecture

This project seeks to develop a new approach for taking sensors into account in systems such as automobiles or eHealth. The process will use smart sensors to distribute and process data starting from the sensor level, in different layers, so that through this multi-layer and adaptable system, storage and processing needs will be distributed in an optimized way to ensure security, reliability, robustness and scalability.

A French-German platform for AI

Joint Platform and Tools for Artificial Intelligence based on Data-Intensive Research and Data Governance.

TeraLab, IMT’s artificial intelligence platform, and TUM will work together to create a shared platform for AI. This will allow researchers at both institutes and their industrial partners to work in close collaboration through shared, secure access to data. The project also includes the possibility for researchers to showcase their research and results, with the development of tools that can test the algorithms and data sets used. This secure, neutral and trustworthy service will facilitate the reproducibility of results within a shared framework of good practices.

[divider style=”normal” top=”20″ bottom=”20″]

For five years IMT has been developing TeraLab, an AI platform aimed at accelerating research and innovation by giving researchers, innovative companies and industrial players the opportunity to work together in a secure, sovereign and neutral environment using real data.

A true success story of the Investissements d’avenir (Future Investments) program, TeraLab is now involved in 60 industrial projects and is a key player in three H2020 projects on industry (MIDIH, BOOST 4.0, AI4EU). It has been awarded the “Silver i-Space” label at the European level for three years.

[divider style=”normal” top=”20″ bottom=”20″]

Browse all articles on the German-French Academy for the Industry of the Future

 

 


Wi6labs: customized sensor networks

Wi6labs, a start-up incubated at IMT Atlantique, installs connected sensor networks for municipalities and industries. What makes this start-up so unique? It offers custom-developed private networks that are easy to install. When it comes to controlling energy networks and water supply or monitoring air quality, the solution proposed by Wi6labs is attractive due to its simplicity and the savings it offers. The start-up is part of the IMT delegation to CES 2019 in Las Vegas.

 

It all started three years ago. In July 2016, the mayor of Saint-Sulpice-la-Forêt, a municipality located 10km northeast of Rennes, France, became aware of a leak in the city’s water system. For one year, the municipality’s water bill had been constantly increasing. All in all, the water leaked was equivalent to 26 Olympic-sized swimming pools. The fact that this leak was discovered came as a relief to the mayor. But how could he prevent undetected occurrences like this from happening again? To avoid wasting more water, Saint-Sulpice-la-Forêt contacted a local start-up: Wi6labs.

“We proposed installing sensors in the water system,” recalls the start-up’s founder, Ulrich Rousseau. “In just one night, these objects can detect and locate a leak.” Satisfied with the results, the mayor renewed the partnership to monitor the temperature and energy consumption in public buildings. The sensor network revealed, for example, that the town’s school was being heated at night and during school vacations. By adapting its practices based on data from the connected sensors, the municipality saved €7,400 of its annual energy expenditure of €50,000 over the next year. “The investment of €20,000 for installing our solution paid for itself in three years,” Ulrich Rousseau explains.

For Wi6labs, the Saint-Sulpice-la-Forêt experience was a pilot experiment used to test the start-up’s relevance. The operation’s success allowed them to propose this solution to other local municipalities and companies. Each time, there was a common theme: a water leak. “It’s our starting point with customers. They all deal with this problem and are convinced that our approach will help them manage it,” he explains. Once the system is installed for the water meter and the initial data is retrieved, the changes in practices aimed at reducing the water bill provide convincing proof for continuing with the operation.

The start-up can then go on to offer its customers solutions for monitoring air quality and adjusting gas consumption. In its partnership with Keolis, a public transport operator, Wi6labs developed a sensor network to inform the company of the number of passengers using its buses in real time. “We study specific cases, for both municipalities and companies, and we respond with a customized solution that meets a wide range of needs,” Ulrich Rousseau explains.

Wi6labs conquers dead zones

All the start-up’s solutions are built on its product Wiotys, a platform used to control an LPWAN network. These low-power, long-range networks enable communication between connected objects. Wiotys makes it possible to install sensor networks that are independent and isolated. In other words, the sensors used by Saint-Sulpice-la-Forêt communicate only amongst themselves and are controlled locally. This approach is therefore different from that of telecommunication operators like Orange and Bouygues, which deploy national networks connecting the sensors.

This difference has vast implications. First, there are the advantages. Wiotys networks are not limited by the dead zones in the major operators’ networks. Saint-Sulpice-la-Forêt, for example, does not benefit from any LPWAN networks from national operators. It is therefore impossible to connect their sensors to a national network. Secondly, this allows them to create custom solutions. For example, if a company wants to charge its customers based on data from a sensor, it must send information through the network’s downlink channel, in other words, in the opposite direction from the uplink channel, which sends information from the sensor to the platform. “Operators are not comfortable doing this because it is expensive to reserve part of the network for downlink data transmission to the sensor. For us, it is simply a question of taking this need into account when dimensioning the network,” Ulrich Rousseau explains.

However, the start-up cannot offer some of the features operators can. This is the case with roaming—a sensor’s capacity to switch from one connection terminal to another as it moves. “For our customers, this is not generally a problem, since water meters and air sensors are stationary,” the founder of Wi6labs explains. The start-up has strategically chosen to eliminate certain complex features to make installation easier. “What we sell our customers is a quick solution that is easy to deploy. It’s a little like installing a router at home: you plug it in, and it works.”

Today, Ulrich Rousseau assures us that the start-up no longer faces any technological barriers. Its use cases have involved working 20 meters underground and responding to complex requests from customers. The true limit is that of social acceptability, especially for municipalities. “All of a sudden, we must explain to the civil servant who used to enter meter readings into an Excel spreadsheet that our sensors will be taking over this task,” Ulrich Rousseau explains. “We have to change his tasks and train him to learn how to control the sensors.”

These are no small changes for civil servants who for years have performed tasks unrelated to digital technology. For a municipality, this also requires adjustments to integrate training time and new tasks for civil servants. Social resistance can therefore be significant, and the legitimacy of these reactions should not be minimized. According to Ulrich Rousseau, Wi6labs is also responsible for explaining the significant and valuable results of these changes. “We must be educators. For us, this involves showing local citizens and civil servants the savings in euros for the municipality in practical terms, rather than talking about kilowatt-hours.” In essence: changing citizens’ perception of energy to increase their awareness of the energy and environmental transition.