TeraLab, a big data platform with a European vision

TeraLab, an IMT platform aimed at accelerating big data projects by uniting researchers and companies, has held the “Silver i-Space” quality label since December 1st, 2016. This label, awarded by the Big Data Value Association, is a guarantee of the quality of the services the platform provides, both at the technical and legal levels. The label testifies to TeraLab’s relevance in the European big data innovation ecosystem and the platform’s ability to offer cutting-edge solutions. Anne-Sophie Taillandier, the platform’s director, tells us about the reasons for this success and TeraLab’s future projects.

 

What does the “Silver i-Space” label, awarded by the Big Data Value Association (BDVA) on December 1st, mean for you? 

Anne-Sophie Taillandier: This is an important award, because it is a Europe-wide reference. The BDVA is an authoritative body, because it ensures the smooth organization of the public-private partnership on big data established by the European Commission. This label therefore has an impact on our ability to work at the continental level. DG Connect, the branch of the Commission in charge of rolling out a digital single market in Europe, pays particular attention to this: for example, it prefers that H2020 research projects use pre-existing platforms. Therefore, this label provides better visibility within the international innovation ecosystem.

 

In addition to TeraLab, three other platforms have been awarded this label. Does this put you in competition with the other platforms at the European level?

AST: The i-Spaces are complementary, not competitive. With TeraLab, we insist on creating a neutral zone: it is a breath of fresh air in the life cycle of a project, so that people — researchers and companies — can test things peacefully. The complementarity that exists between the platforms enables us to combine our energies. For example, one of the other recipients of the Silver i-Space label in December was SDIL, which is based in Germany and is more focused on industrial solutions. The people who contact the SDIL have already made an industrial choice. The stakeholders who contact TeraLab have not yet made a choice, and want to explore the potential value of their data.

 

How do you explain this recognition by the BDVA?

AST: TeraLab has always sought to be cutting-edge. We offer a platform that is equipped with the latest tools. Since the fourth quarter of 2016, for example, the platform has been equipped with GPUs: processors initially designed for computing graphics, but that are now also used for deep learning applications. The computing optimization they provide makes it possible to reduce the time required for training the algorithms by a factor of fifteen. We can therefore make much more powerful machines available to researchers working in the area of deep learning. Until now, the infrastructure did not allow for this. Generally speaking, if we feel that a project needs specific equipment, we look at whether we can introduce it, and whether it makes sense to do so. This constant updating is also one of the tasks set as part of the Investments for the Future program (PIA) by the Directorate-General for Enterprise.

 

Is it primarily the platform’s technical excellence that has been recognized?

AST: That’s not all. TeraLab is not only an infrastructure, but also an ecosystem and a set of services. We assist our partners so that they can make their data available. We also have quality requirements regarding legal aspects. For example, we are working towards obtaining authorization from the Shared Healthcare Information Systems Agency (ASIP) to allow us to store personal health data. From a technical perspective, we have all we need to store this data and work with it. But we need to meet the legal and administrative requirements in order to do this, such as meeting the ISO 27001 standard. We must therefore provide guarantees regarding the way we receive data from companies, establish contracts, etc.

 

Have the upstream aspects, prior to data processing, also been the focus of innovations?

AST: Yes, because we must constantly ensure that consent has been given for the use of the data. Our close relationship with IMT researchers is an asset in accomplishing this. Therefore, we have projects on data watermarking, cryptographic segmentation, and the blockchain. Data analysis and extracting value from this data can only be carried out once the entire upstream process is completed. The choice of tools for addressing privacy issues is therefore essential, and we must constantly ensure that we remain cutting-edge in terms of security aspects as well.

 

How are these quality criteria then reflected in the business aspect?

AST: Today, TeraLab is involved in projects that are related to many different themes. We talked about the blockchain, but I could also mention the industry of the future, energy, tourism, health care, insurance, open source issues, interoperability… And, more importantly, since 2016, TeraLab has achieved financial equilibrium: the revenue from projects offsets the operating costs. This is far from trivial, since it means that the platform is sustainable. TeraLab will therefore continue to exist after 2018, the completion date for the PIA that initiated the platform.

 

What are TeraLab’s main objectives for the coming year?

AST: First of all, to strengthen this equilibrium, and continue to ensure TeraLab’s sustainability. To accomplish this, we are currently working on big data projects at the European level to respond to calls for proposals from the European Commission via the H2020 program. Then, once we have obtained the authorization from the ASIP for storing personal health data, we plan to launch health research projects, especially in the area of “P4” medicine: personalized, preventive, predictive, and participatory.

 


Cybersecurity: new challenges to face up to

Computer hacking and sensitive data theft are increasing. Companies are now highly connected and must adapt their security policy to strengthen the protection of their information assets. Hervé Debar, a researcher at Télécom SudParis and expert in cybersecurity, studies the different types of cyber attacks in order to optimize their detection.

 

The era when IT students developed viruses and then enjoyed their “success” based on the number of computers infected across the world is long gone… Today, the lure of money is the primary motivation for hackers and internet fraudsters, and their targets include companies in the French CAC 40 and Operators of Vital Importance (OIV), i.e. those operating in sectors of vital national importance (transport, energy, telecoms, etc.). SMEs and SMIs are also affected by the rise in cyber attacks. They work as subcontractors for client institutions and are therefore targeted by hackers who want to obtain confidential files (R&D programs, lists of suppliers, etc.) or directly infect the main client’s Information System (IS).

 

Highly connected companies

Two factors explain this evolution. The first is the increase in digital data managed and exchanged by companies. Information is at the core of their business activities. Secondly, organizations have become highly connected with fixed and mobile devices, peripheral devices (printers, cameras etc.) connected to networks, sensors in production lines, automated systems and the Internet of Things (IoT). “The result is that the economic value of connection is greater than the economic value of protection. We must therefore do the best we can to isolate what has to be isolated but still allow companies to benefit from a certain level of connectivity” Hervé Debar stresses.

In terms of computer security, companies have to manage various problems, including “the acceptability of security by users. If the recommended solutions are too complicated, they won’t use them and will find a way to get around them. Security must support use, not hinder it”, the expert from Télécom SudParis explains.

 

Complex regulations

To face up to this challenge, companies must be able to manage two major limitations. The first concerns the technical aspect. Services such as BYOD (Bring Your Own Device) or corporate Appstores (portals allowing employees to download selected applications to improve productivity) are being rolled out at a quicker rate than security measures.

The second limitation concerns the human dimension. Regulations on digital technology are very complicated, especially due to the Military Programming Law and the obligations imposed by the CNIL (French National Commission on Data Protection and Civil Liberties), and will become even more so in May 2018 with the introduction of the new European regulation on data protection. All companies will have to report personal data breaches, in contrast to the 1978 law currently in force, which only concerns suppliers of communications services.

These legal constraints require companies to bring in experts who are able to administer and roll out IT security measures on a daily basis and inform staff members through training.

 

[box type=”shadow” align=”” class=”” width=””]

Attacks by computer zombies

DDoS (Distributed Denial of Service) attacks are a company’s worst nightmare. They use a network of thousands of computers or connected devices (often compromised) to interrupt the targeted service or services. There was a major attack of this kind in France between 18 and 21 September 2016, when the servers of OVH, a French web host, were inundated with millions of requests, peaking at nearly one terabit per second. A botnet of around 150,000 IP cameras infected by cyber criminals was behind the attack. Hervé Debar has been studying this type of attack for a number of years along with 6cure – a Normandy-based start-up specializing in traffic cleansing – and in the framework of the NECOMA research project (Nippon-European Cyberdefense-Oriented Multilayer threat Analysis), one of the six FP7 projects financed by the European Commission under the Europe and Japan coordinated call. His team’s work consists in studying the possibilities offered by the mechanisms and functions of the network itself in detecting large-scale DDoS attacks which could saturate a single local defense system. The idea is to identify the attacking flow according to its provenance or technical characteristics in order to differentiate it from the legitimate flow, with the aim of restraining bad traffic to leave more room for “good” traffic.[/box]
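
To make the filtering idea concrete, here is a minimal sketch of per-source rate limiting (a token bucket), one simple way to restrain a flooding source while leaving normal clients untouched. The thresholds, addresses and logic are illustrative assumptions, not the mechanisms actually deployed by 6cure or studied within NECOMA.

```python
import time
from collections import defaultdict

# Illustrative per-source token bucket: each source IP may send at most
# RATE requests per second, with short bursts of up to BURST requests allowed.
RATE, BURST = 100.0, 200.0

buckets = defaultdict(lambda: {"tokens": BURST, "last": time.monotonic()})

def allow(src_ip: str) -> bool:
    """Return True if this request should be forwarded, False if dropped."""
    b = buckets[src_ip]
    now = time.monotonic()
    b["tokens"] = min(BURST, b["tokens"] + (now - b["last"]) * RATE)
    b["last"] = now
    if b["tokens"] >= 1.0:
        b["tokens"] -= 1.0
        return True
    return False  # source exceeds its budget: likely part of the attack flow

# A legitimate client passes; a flooding source is quickly throttled
print(allow("203.0.113.7"))
print(sum(allow("198.51.100.9") for _ in range(1000)))  # roughly BURST requests pass
```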

 

Detecting cyber attacks

It is crucial to detect these attacks as early as possible in order to combat them. Identifying cyber attacks is one of IMT’s principal research topics. “We mainly work on the detection of and protection against distributed denial of service attacks (see insert) and those which more closely target environments on the network side”, explains Hervé Debar. The process has seen several developments since the first work carried out at the start of the 1980s by the American military. It was initially limited to rolling out a few probes. The few alerts raised were dealt with “manually”. Then the number of sensors increased and alerts became more frequent. To manage them efficiently, companies implemented SIEMs (Security Information and Event Management).

“Today, we need to automate part of the reaction so that operators can concentrate on the attacks that are more difficult to deal with. We are heading toward more autonomous and more reactive systems which can protect themselves against attacks. But it remains a complicated matter. Artificial intelligence (my PhD subject) is one possible way to improve our response to a certain number of threats”, explains Hervé Debar. The other option is training staff members to react better to attacks. We speak of “cyber ranges”, which allow realistic simulation of cyber attacks and the evaluation of cyberdefense tactics. These centers are designed to help OIV operators in particular make the right decisions in response to an attack’s impact.

The professionalization of cybercriminals and the increase in vulnerability, which concerns both IT networks and industry, require greater awareness among all actors, both public and private.

 


A very involved practitioner

Hervé Debar is not a theorist, but a keenly involved expert with 25 years of experience in the world of cybersecurity and R&D. His work includes more than 100 scientific articles on the subject, the coordination of three European projects (WOMBAT, NECOMA and PANOPTESEC) and participation in a number of French and European programs on the subject. His long career in the private sector partially accounts for this considerable activity. An engineer by training, he obtained his PhD before joining Dassault AT (Automation and Telecoms) to work on European research projects. He then left for Switzerland to join IBM where he developed the IDMEF standard and a product called Tivoli Risk Manager, one of the first security information and event management products on the market. Upon returning to France, he joined Orange Labs in Caen and became an emeritus expert and leader of research activities on security. He then brought his considerable experience in the private sector to Télécom SudParis, where he has been in charge of the “Networks and Security” department since 2009.

 

FEMTO Engineering: a new component of the TSN Carnot institute

In July 2016 the Ministry for Education and Research renewed the Télécom & Société numérique Carnot institute accreditation and at the same time, the institute presented a new component: the FEMTO Engineering Center for Technological Development. We interviewed the Director of the FEMTO-ST laboratory, Laurent Larger, who told us a little more about the identity of this new component of the TSN Carnot institute and its domains of expertise which could be of interest for companies for their partnership research projects.

 

What is the difference between FEMTO-ST and FEMTO Engineering?

Laurent Larger: FEMTO Engineering was created in 2013 by FEMTO-ST and is associated with the FC’Innov foundation (Franche-Comté Innovation). FEMTO Engineering and the FEMTO-ST laboratory are strategically close: the researchers in the two structures share the same building and their research overlaps. The experts at FEMTO Engineering can benefit from advice from those in the laboratory. FEMTO Engineering is devoted to the promotion and technology transfer of research developed at FEMTO-ST. The advantage of having a structure like FEMTO Engineering is that it constitutes a more attractive partner for industry and SMEs.

 

What are FEMTO Engineering’s fields of work?

LL: In principle, all the research topics in which FEMTO-ST has expertise. FEMTO-ST has seven main departments: automation and micro-robotics; IT; energy; applied mechanics; micro-nano-science and systems; optics; time and frequency analysis (metrology of time, high-end radiofrequency and microwave sources). In practice, we have identified a certain number of topics which meet the requirements of a Carnot institute in terms of promotion and technology transfer or development.


Laurent Larger, Director of FEMTO-ST institute

Let’s take energy for example. FEMTO-ST has internationally-acknowledged expertise in the field of fuel cell systems. FEMTO Engineering rolls out this expertise in the context of renewable energy using hydrogen. We are also currently working with a vehicle manufacturer on Stirling engines using the technique of heat recovery. FEMTO Engineering is also involved in the biomedical field. Work is currently being carried out on creating biochips allowing for the identification of living species. This allows us, for example, to propose new solutions for proteomic analysis on gold and silica chips using SPR (surface plasmon resonance).

 

What are FEMTO Engineering’s strengths?

LL: FEMTO-ST’s strengths relate to its technological expertise in high-tech components and systems. As an example, let’s take two technologies related to FEMTO-ST’s proprietary expertise.

Femtosecond laser machining is an activity based on expertise in the formation of a femtosecond laser beam. It allows for nano-machining with accurate control in terms of both time and space. In other words, it allows us to make nano holes in materials (glass, silicon, diamond etc.) with unprecedented speeds of execution and extremely high aspect ratios. This method can be used for ultra-fast cutting or for making nano holes to structure thin films. Last but not least, it can be used in surface texturing, which allows us to change the adhesive properties of a surface, to make it hydrophilic or hydrophobic, for example.

Another example of FEMTO Engineering’s strengths is its expertise in the field of time and frequency metrology. We have designed a cryogenic sapphire oscillator at the international state of the art, which has set new records for stability. These oscillators are used to obtain reference frequencies for calibrating observation systems and have already been used by the ESA (European Space Agency). They also have applications in the field of radiofrequencies and microwaves.

More broadly speaking, our very close links with FEMTO-ST enable FEMTO Engineering to rely upon a vast range of experimental knowledge and high-tech facilities structured around platforms and notably including MIMENTO, our micro-nano-manufacturing plant (a member of the French CNRS RENATECH network).

 


Manufacturing a resin mask. This mask will be used to create patterns on substrate materials, for example for making metallic electrodes or for etching. Photo credits: CC Ludovic Godard – UFC

 

What was the reason for wanting to join the TSN Carnot institute, what are your expectations and how will you benefit from it?

LL: We were already familiar with the Carnot institute environment because we belonged to it between 2006 and 2011. We wanted to return in order to benefit from new opportunities in terms of promotion and transfer. In this sense, the deciding factor with the TSN Carnot institute was its solidity. Out of all the Carnot institutes, it is very well positioned in terms of the level of activity. This standing appealed to us and we plan on contributing to it!

In terms of expertise, this is the most complementary Carnot institute in relation to our technology offering. Thanks to this complementarity, we can offer new opportunities in terms of promotion and industrial relations and partnerships. Another important thing to mention is our location. We are on the border with Switzerland, with which we have a lot of international interactions, and in 2015 we thus generated more than 40% of our turnover abroad. This trend was strengthened in 2016, with that turnover doubling.

 

Will FEMTO Engineering evolve after joining the TSN Carnot institute?

LL: For the moment, the workforce at FEMTO Engineering stands at 14 people, compared with 800 in the laboratory. This ratio provides significant room for growth, particularly in relation to the pool of expertise within FEMTO-ST. We benefit from this interaction with the laboratory, which offers us plenty of other topics that could be rolled out at FEMTO Engineering. The people with this expertise are mainly former PhD students and engineers from the laboratory. The concept is to offer members of industry the best possible interface to facilitate their innovation projects. Joining the TSN Carnot institute is an opportunity for us to grow and develop new activities with the business world.

 

[box type=”shadow” align=”” class=”” width=””]

The TSN Carnot institute, a guarantee of excellence in partnership-based research since 2006

 

Having first received the Carnot label in 2006, the Télécom & Société numérique Carnot institute is the first national “Information and Communication Science and Technology” Carnot institute. Home to over 2,000 researchers, it is focused on the technical, economic and social implications of the digital transition. In 2016, the Carnot label was renewed for the second consecutive time, demonstrating the quality of the innovations produced through the collaborations between researchers and companies.

The institute encompasses Télécom ParisTech, IMT Atlantique, Télécom SudParis, Télécom École de Management, Eurecom, Télécom Physique Strasbourg and Télécom Saint-Étienne, École Polytechnique (Lix and CMAP laboratories), Strate École de Design and Femto Engineering.[/box]


On computer science: Turbo in the algo

Serge Abiteboul, École Normale Supérieure Paris-Saclay, and Christine Froidevaux, Université Paris Sud – Université Paris-Saclay

A new “Interview on Computer Science”. Serge Abiteboul and Christine Froidevaux interview Claude Berrou, computer engineer and electronics engineer, and a member of the French Academy of Sciences. Claude Berrou is a professor at IMT Atlantique. He is best known for his work on turbo codes, which have been used extensively in mobile telephony. His current research focus is on informational neuroscience. This article is published in collaboration with the blog Binaire.

 


Claude Berrou. Binaire. Author provided

Binaire: You started out as an electronics engineer, how did you get into computer science?

Claude Berrou: I am a science rambler. After my initial training at a graduate school that today is called Phelma, I studied a little bit of everything: electronics, signal processing, circuit architecture. Then I got into computer science… by chance, through correction codes and information theory.

Here’s a question we love to ask here at the Binaire blog, what is your definition of computer science?

CB: I have an aphorism: computer science is to the sciences what natural language is to intelligence. Before computer science, there were equations, formulas and theorems. Computer science allowed sequences of operations, processes, and procedures to be developed to process complex problems. This makes it almost synonymous with language, and it is very similar to natural language, which also requires structure. Just like when we have a common language, computer science offers languages that everyone can understand.

You worked with correction codes. Can you tell us what they are used for?

CB: When we transmit information, we want to retrieve the full message that was sent, even if we have a lot of users and limited bandwidth. If the message is binary, due to noise and interference disturbing the line, some of the transmitted 0s will be received as 1s, and some of the 1s will become 0s. The greater the noise compared to the signal, the more frequently these errors occur. The signal-to-noise ratio can be decreased by poor weather conditions, for example, or by disturbances caused by other communication taking place at the same time. With all these errors, the quality becomes very poor. To prevent this, we encode the transmitted information by adding redundancy. The challenge is to be able to retrieve the message relatively well without adding too much redundancy, without making the message too big. We have a similar problem in mass storage. Bits can switch, sometimes due to wear to the disk. We also introduce redundancy into these systems to be able to retrieve the information.
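
As a toy illustration of the redundancy principle (a deliberately simple one, unrelated to the turbo code construction itself), the sketch below repeats every bit three times; a majority vote at the receiver then corrects any single flipped copy, at the cost of tripling the message size.

```python
import random

def encode(bits):
    """Repetition code: transmit each information bit three times."""
    return [b for b in bits for _ in range(3)]

def decode(received):
    """Majority vote over each group of three received copies."""
    return [int(sum(received[i:i + 3]) >= 2) for i in range(0, len(received), 3)]

message = [1, 0, 1, 1, 0]
sent = encode(message)
# Noisy channel: each transmitted bit is flipped with probability 0.1
noisy = [b ^ (random.random() < 0.1) for b in sent]
print(decode(noisy) == message)  # usually True: the redundancy absorbs isolated errors
```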

Talk to us about your wonderful invention, turbo codes.

CB: Turbo codes were born thanks to the Titanic, when images of the wreck needed to be transmitted for viewing (work by Alain Glavieux). I played around with ways of reducing the effect of the noise in the transmission and of dealing with the errors, and I thought of introducing the principle of negative feedback in the decoding process, a classic concept in electronics.

For me, the interdisciplinary aspect is fundamental; innovation is often found at the interface of different disciplines. You take an idea that has been proven to work in one area of science, and you try to adapt it to an entirely different context. The original idea behind the turbo codes was to import an electronics technique into computer science.

When we want to create a high-gain amplifier, we put two or three of them in series. But this creates unstable behaviour. To stabilize the arrangement, we implement a negative feedback principle: send a fraction of the amplifier’s output back to its input with the “–” sign; this reduces unwanted variations.
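
For reference, this negative-feedback principle is usually written as follows: feeding a fraction β of the output back to the input with a minus sign turns an open-loop gain A into a closed-loop gain

```latex
A_{\text{closed}} = \frac{A}{1 + A\beta}
```

so that when Aβ is large the overall gain tends to 1/β and becomes insensitive to variations of A, which is what stabilizes the arrangement.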

I started with a known algorithm: the Viterbi algorithm. It makes it possible to correct (if there is not too much noise) the errors that occur during transmission through a noisy channel, and can therefore be considered a signal-to-noise ratio amplifier. The Viterbi decoder exploits the algebraic law used to design the redundancy of the encoded message by means of a trellis (the deterministic equivalent of a Markov chain), thereby delivering the most probable original message. I therefore put two Viterbi algorithms in series. I then tried to integrate the negative feedback concept into the decoding process. It was a difficult task, and I was not a coding expert.
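
For readers curious about the algorithm itself, here is a generic sketch of the dynamic program at the heart of Viterbi decoding, run on a toy two-state Markov model rather than on a real convolutional-code trellis; the states, transition and emission probabilities are invented for illustration.

```python
import numpy as np

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most probable state sequence for a series of observations (log domain)."""
    V = np.full((len(obs), len(states)), -np.inf)        # best log-probability per state
    back = np.zeros((len(obs), len(states)), dtype=int)  # backpointers through the trellis
    V[0] = np.log(start_p) + np.log(emit_p[:, obs[0]])
    for t in range(1, len(obs)):
        for s in range(len(states)):
            scores = V[t - 1] + np.log(trans_p[:, s])
            back[t, s] = int(np.argmax(scores))
            V[t, s] = scores[back[t, s]] + np.log(emit_p[s, obs[t]])
    # Backtrack from the best final state to recover the most probable path
    path = [int(np.argmax(V[-1]))]
    for t in range(len(obs) - 1, 0, -1):
        path.insert(0, back[t, path[0]])
    return [states[i] for i in path]

# Toy example: two hidden states ("0" and "1"), three possible noisy observations
states = ["0", "1"]
start_p = np.array([0.5, 0.5])
trans_p = np.array([[0.7, 0.3], [0.3, 0.7]])
emit_p = np.array([[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]])
print(viterbi([0, 0, 2, 2], states, start_p, trans_p, emit_p))  # ['0', '0', '1', '1']
```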

One problem was that the Viterbi algorithm makes binary choices: the bit was either switched, or it wasn’t. Along with a colleague, Patrick Adde, we adapted it so that it would produce probabilistic decisions, which significantly improves the subsequent performance of the decoder.

How does it work?

CB: Like I mentioned, to protect a message, we add redundancy. The turbo code performs the coding in two dimensions. A good analogy is the grid of a crossword puzzle, with vertical and horizontal dimensions. If the definitions were perfect, only one dimension would be enough. We could rebuild the grid, for example, with only horizontal definitions. But since we do not always know what the definitions refer to, and since there can be ambiguities (due to noise, deletions, etc.), we also provide vertical definitions.

The decoding process is a little like what someone does when working on a crossword puzzle. The decoder works in a line (it uses the horizontal definitions), and then moves onto the vertical dimension. Like the crossword fan, the decoder requires several passages to reconstruct the message.
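
The crossword analogy can be made concrete with a tiny product code: arrange the message bits in a grid, add a parity bit for each row and each column, and a single flipped bit is then located at the intersection of the row and the column whose parities fail. This is a deliberately simplified illustration of two-dimensional redundancy, not the actual turbo code construction.

```python
def parities(grid):
    """Row and column parity bits of a binary grid."""
    rows = [sum(r) % 2 for r in grid]
    cols = [sum(c) % 2 for c in zip(*grid)]
    return rows, cols

# A 3x3 message block, encoded together with its row and column parities
message = [[1, 0, 1],
           [0, 1, 1],
           [1, 1, 0]]
row_p, col_p = parities(message)

# The channel flips one bit (row 1, column 2)
received = [row[:] for row in message]
received[1][2] ^= 1

# The decoder recomputes the parities; the failing row and column pinpoint the error
r_p, c_p = parities(received)
bad_row = next(i for i, (a, b) in enumerate(zip(r_p, row_p)) if a != b)
bad_col = next(j for j, (a, b) in enumerate(zip(c_p, col_p)) if a != b)
received[bad_row][bad_col] ^= 1
print(received == message)  # True: the intersection reveals the flipped bit
```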

With all of these aspects, the turbo codes are effective.

We believe you. Billions of objects use this technology!

CB: Yes. All media data on 3G and 4G are protected by turbo codes.


Claude Shannon. Binaire/Wikipédia. Author provided

This brings us to another Claude: Claude Shannon and information theory?

CB: Yes, with this algorithm we clearly enter the realm of information theory. In fact, I recently helped organize the symposium at IHP celebrating the centenary of Claude Shannon’s birth, which was fascinating.

Shannon demonstrated that all ideal transmission (or storage) should be accomplished using two fundamental operations. First, to reduce the message size, it is compressed to remove the maximum amount of unnecessary redundancy. Next, to protect against errors, intelligent redundancy is added.

Shannon demonstrated the limits of correction codes in 1948! Turbo codes reach Shannon’s theoretical limit, to within a few tenths of a decibel!
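
Shannon's 1948 result gives, for a channel of bandwidth B disturbed by Gaussian noise, the maximum rate at which information can be transmitted with an arbitrarily low error probability:

```latex
C = B \log_2\left(1 + \frac{S}{N}\right) \quad \text{bits per second}
```

where S/N is the signal-to-noise ratio. No code can do better than this capacity; turbo codes come within a few tenths of a decibel of the signal-to-noise ratio that the bound requires.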

And now. You have moved on to neuroscience…

CB: My current research is related to informational neuroscience. You recently interviewed Olivier Faugeras, who talked to you about computational neuroscience, a fairly different approach.

Cortex. Nicolas Rougier. Author provided

My starting point is still information, but this time in the brain. The human cerebral cortex can be compared to a graph, with billions of nodes and thousands of billions of edges. There are specific modules, and between the modules are lines of communication. I am convinced that the mental information, carried by the cortex, is binary.

Conventional theories hypothesize that information is stored by the synaptic weights, the weights on the edges of the graph. I propose a different hypothesis. In my opinion, there is too much noise in the brain; it is too fragile, inconsistent, and unstable; pieces of information cannot be carried by weights, but rather by assemblies of nodes. These nodes form a clique, in the geometric sense of the word, meaning they are all connected two by two. This becomes digital information.

Is this where we will see coding and redundancy? To prevent information from getting lost in the brain, do redundancies also exist?

CB: Yes. For the traditional, analog school of thought, information is carried by the synapses. In this case, redundancy could only be achieved using repetitions: several edges would carry the same information.

According to our approach, information is encoded in the connections of a grouping of nodes. Redundancy is naturally present in this type of coding. Take a clique made up of 10 nodes on a graph. You have 45 connections in the clique. This is a large number of connections compared to the number of nodes. I base this on the Hebbian theory (1949): when neuron A sends spikes and neuron B activates systematically, the connection between A and B will be reinforced if it exists, and if it doesn’t exist it will form. Because the clique is redundant, it will resonate, and a modified connection will be reinforced: using Hebbian theory we obtain a reconstruction in the event of deterioration. We have established an entire theory based on this.

You lost us. A clique carries a piece of information. And the fact that the clique features so much redundancy ensures the information will be lasting?

CB: Yes. And furthermore, the clique can be the building block for an associative memory. I will be able to find the complete information based on certain content values. And this is due to the cliques’ highly redundant structure.

What does your work involve?

CB: I have set up a multidisciplinary team made up of neuropsychologists, neurolinguists, computer scientists, etc. We are trying to design a demonstrator, a machine based on the model of the brain as we see it, on an informational level. In a traditional computer, the memory is on one side and the processor on the other. In our machine, and in the brain, everything is interlinked.

Based on the theory we are developing (not yet fully published), mental information relies on little pieces of knowledge that are stored in the cliques. The cliques are chosen randomly. But once the choice has been made, they become permanent. This varies from one person to another; the same cliques do not carry the same information in different individuals. I would like to develop artificial intelligence using this machine model.

How do you see artificial intelligence?

CB: There are, in fact, two types of artificial intelligence. First, there is the kind concerned with the senses, with vision and speech recognition, for example. We are starting to be able to do this using deep learning. And then, there is the type that allows us to imagine and create, and know how to answer new questions. For now, we are not able to do this. In my opinion, the only way to make progress in this strong AI is to base it on the human cerebral cortex.

I am passionate about this subject. I would like to see it advance and continue my research for a long time to come.

 

Serge Abiteboul, Research Director at INRIA, member of the French Academy of Sciences, Affiliate Professor, Ecole Normale Supérieure Paris-Saclay and Christine Froidevaux, Computer Science Professor, Université Paris Sud – Université Paris-Saclay

The original version of this article was published in French on The Conversation France.


In Nantes, the smart city becomes a reality with mySMARTlife

Alongside European smart city champions like Barcelona, Copenhagen and Stockholm, France boasts a few gems of its own. One such city is Nantes, a participant in the European H2020 research project mySMARTlife since December 1st, 2016. Thanks to this project, the capital of the Pays de la Loire region plans to put its scientific heritage to good use, represented by IMT Atlantique, as it continues its transformation into a smart city.

 

When searching for proof that major French cities are truly transitioning to become smart cities, look no further than Nantes. For several years now, the city has been engaged in a transformational process, turning the concept of the city of the future into a reality. This is a role and ambition the city has taken to the European level, with Johanna Rolland, the Mayor of Nantes and President of Nantes Métropole, serving on the Executive Committee of Eurocities. This network of cities — also chaired by the Mayor of Nantes from 2014 to 2016 — advocates with European authorities for the interests of major metropolitan areas, and includes some big names among smart cities: Barcelona, Stockholm and Milan, etc. In short, at a time when few European cities can claim to be undertaking tangible measures towards becoming smart cities, Nantes can boast of being a pioneer in this area.

On December 1st, 2016, this characteristic was further enhanced with the official launch of the H2020 mySMARTlife research project. As proof of the position that Nantes holds in the European ecosystem of cities of the future, the city is now working alongside Hamburg and Helsinki as pilot cities for the project. At the local level in Nantes, mySMARTlife is aimed at modernizing several of the city’s major areas of governance, particularly in terms of energy and transport. More specifically, one of the objectives is to “have a platform for Nantes Métropole, and its associated stakeholders,[1] to enable new services to be developed and monitored and to provide decision-support,” explains Bruno Lacarrière, researcher at IMT Atlantique. The institution is participating in this H2020 project and offers dual expertise: in energy efficiency related to heating networks and in decision-support. This expertise is provided by the Department of Energy Systems and Environment (member of the UMR CNRS 6144 GEPEA) for Nantes, and by the Department of Logics in Uses, Social Science and Management (member of the UMR CNRS 6285 LAB-STICC) for Brest.

 

Optimizing the city’s energy efficiency

The researchers from the IMT Atlantique Department of Energy Systems and Environment will specifically provide their knowledge in energy efficiency and system analysis, applied to heating networks. “Our skills in the field allow us to model these systems with an integrated approach that goes beyond thermal-hydraulic studies, for example,” explains Bruno Lacarrière. “We do not only model pipes, but an entire set of connected technological objects,” he continues. The researchers take into account the variety of systems that can provide heat sources for the network (boilers, cogeneration units, geothermal energy, recovering surplus industrial heat…), and the diversity of the consumers connected to the network. All of the heating network components are therefore integrated into the researchers’ models. This approach, which is complex because it is based on a comprehensive view of the network, makes it possible to better assess the areas for improvement in optimizing energy efficiency, and to better predict the consequences, for example, of renovating a building.

The researchers will greatly benefit from their proximity to the industrial partners in this project. To develop their models, they need field data such as heat output measurements from various points in the network. “This data is difficult to obtain, because in this case the system is connected to several hundred buildings,” Bruno Lacarrière points out. Furthermore, this information is not public. “Being able to work with stakeholders on the ground, such as Erena (Engie subsidiary and the network operator in Nantes), is therefore a real advantage for us, provided, of course, that the necessary confidentiality clauses are established,” the researcher adds.

 

No smart cities without decision support 

At the same time, the role of the Department of Logics in Uses, Social Science and Management is to develop decision-support tools, an important aspect in many of the smart city’s activities. This is true for mobility and transport, as Bruno Lacarrière points out: “In the context of the boom in electric vehicles, one application of decision-support is providing users with the nearest locations of available charging stations in real time.” Decision-support can also be used by public authorities to determine the best location for charging stations based on the configuration of the infrastructures and electrical distribution. “This is where having a platform becomes truly valuable: the information is centralized and made available to several stakeholders,” the researcher explains.
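
As a toy illustration of that charging-station use case (with invented station data and coordinates, not the mySMARTlife platform's actual interface), such a decision-support query can be as simple as filtering on availability and sorting by distance:

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

# Hypothetical real-time feed of charging stations: position and free charging points
stations = [
    {"id": "A", "lat": 47.212, "lon": -1.556, "free": 0},
    {"id": "B", "lat": 47.206, "lon": -1.565, "free": 2},
    {"id": "C", "lat": 47.219, "lon": -1.548, "free": 1},
]

def nearest_available(user_lat, user_lon):
    """Return the closest station that currently has a free charging point."""
    candidates = [s for s in stations if s["free"] > 0]
    return min(candidates, key=lambda s: haversine_km(user_lat, user_lon, s["lat"], s["lon"]))

print(nearest_available(47.213, -1.553)["id"])
```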

While the two types of expertise provided by IMT Atlantique are different in terms of research, they are very much complementary. Decision-support can, for example, use information obtained via the heating network models to propose new buildings to be connected to the network, or to study the deployment of new production sources. On the other hand, the results from decision-support based on several criteria (often related to various stakeholders) help to define new modeling scenarios for the networks. The researchers in energy efficiency and those in decision-support therefore complement each other through the platform, and provide support to the different stakeholders in the decisions they must make.

 

Ensuring the transformations are here to stay

While the mySMARTlife project will last five years, all the project’s actions — including rolling out the platform — must be completed within the first three years. The last two years will be dedicated to assessing the various actions, studying the impacts and making revisions if necessary. “For example, the first three years could be spent implementing an optimized energy management system, and the two follow-up years would provide feedback on the actual optimization. It is necessary to have sufficient hindsight, spanning several heating seasons,” explains Bruno Lacarrière.

The platform’s specific features must still be determined, and this will be the partners’ first mission. Because although it will initially be a demo platform, it is intended to remain after the project has ended. Therefore, planning must be done ahead of time to determine what form it will take, specifically so that industrial partners, as well as public authorities and final users, can make the most of it. Through this H2020 project, the European Commission is therefore planning to develop concrete actions that are made to last.

 

From a focus on Nantes to an international perspective

The work will initially focus on the Île de Nantes, located at the heart of the city on the Loire river. However, because certain heating and transportation networks are not confined to this area alone, the project will extend to other areas of the city from the outset. For example, the energy used by the Île de Nantes area is partially produced outside the district’s boundaries; the geographic area used for the models must therefore be expanded. Several actions involving other zones in the metropolitan area are already planned.

Furthermore, the mySMARTlife project should not be seen solely as an effort to modernize a few areas of Nantes and the other two pilot cities. Brussels’ desire to ensure the sustainability of the actions over time is also related to its stated intention to ensure the scaling-up of the results from mySMARTlife. The key challenge is to produce knowledge and results that can be transferred to other urban areas, in France and abroad. This explains the advantage of entrusting the H2020 project management to Helsinki and Hamburg, in addition to Nantes.

By working together with the partners from these other two cities, the researchers will be able to validate their models by applying them to other major metropolitan areas. They will also attempt to test the validity of their work in smaller cities, since the project also includes the cities of Bydgoszcz (Poland), Rijeka (Croatia), Varna (Bulgaria) and Palencia (Spain). “The project is therefore aimed at demonstrating the implemented technology’s capacity to replicate at scale the actions used to develop a smart city,” the researcher points out. A key challenge in transforming cities is to make the transition to a smart city available not only to major metropolitan areas that are technologically advanced in this area, but also to smaller cities.

 

[1] At the local level in Nantes, Nantes Métropole will be supported by nine different partners: IMT Atlantique, Nantes Métropole Habitat, la Semitan, Armines, Atlanpole, Cerema, Engie and Enedis.



CES: once the show is over, what do start-ups get out of it?

Several weeks after the CES, what remains of this key event for digital innovation? In addition to offering participants a stage for presenting their products, the event provides a place for intense networking and exchanges with future users. For start-ups, the CES accelerates their path towards leaving the incubator and provides a major boost in developing their brand.

 

Let’s take a quick trip back in time to January 9, 2017. The Consumer Electronics Show, better known as the CES, has just opened its doors in Las Vegas, triggering an avalanche of technology amid a flurry of media attention. Over the course of this 4-day event, the start-up and digital technology ecosystems buzz with activity. On January 12, the CES then came to a close. One week later, the return to normal can seem quite abrupt following a show that monopolized the attention of the media and technology stakeholders during its short existence. So, was it just a fleeting annual event? Are start-ups merely heading home (those who do not live in the nearby “valley”) after a short-lived fling?

Of course not! Despite the event’s ephemeral nature, start-ups come away with both medium- and long-term benefits. For Sevenhugs, 2017 was its third consecutive year participating in the event. The start-up from an incubator at Télécom ParisTech has presented two products at CES since 2015. It began with hugOne, a product for monitoring and optimizing sleep, followed by the Smart Remote, a multipurpose remote. Announcing new products at the event means, first of all, increasing press coverage, and therefore visibility, in a very competitive ecosystem. But it also means the start-ups benefit from meeting after meeting with business partners.

“During CES, we had meetings with distributors, retailers and potential partners every 30 minutes,” explains Louise Plaquevent, Marketing Director at Sevenhugs. “With so many of these different professionals in the same place, it is possible to establish a lot of contacts that will be helpful throughout the year as we look for partners in Europe and the United States,” she adds. Therefore, CES also represents a springboard for entering the American and global market, which would be less accessible without this gathering.

 

Presenting a product to get critical feedback

Louise Plaquevent also points out that participating at CES exposes the products to the public, resulting in “comments and opinions from potential customers, which helps us improve the products themselves.” The Smart Remote was therefore presented to the public twice in Las Vegas: first in 2016, then again in 2017 as an updated version.

Michel Fiocchi, Director of Entrepreneurship at Mines Saint-Étienne, also shares this view. His role is to provide technological support to the school’s start-ups, founded by students and researchers. “For two of our start-ups — Swap and Air Space Drone — their participation at CES allowed them to refocus their products on clearly identified markets. Through conversations with potential users, they were able to make changes to include other uses and refine their technology,” he explains.

The event in Las Vegas provides a boost for the young entrepreneurs’ projects. Their development is accelerated through the contacts they establish and the opportunity to expose their products to users. For Michel Fiocchi, there is no doubt that participating at CES helps start-ups on their way to leaving the incubator: “There is a very clear difference in the dynamics of start-ups that have participated and those that haven’t,” he stresses.

Finally, participating at this major digital show offers benefits that are difficult to calculate, but may be just as valuable. Louise Plaquevent reminds us, in conclusion, that despite the event’s short duration, it is an intense experience for all the companies that make the trip. She points out that “CES allows us to get to know each other, and unites the teams.” This aspect is particularly important for these smaller companies with fewer employees.

 


What is machine learning?

Machine learning is an area of artificial intelligence, at the interface between mathematics and computer science. It is aimed at teaching machines to complete certain tasks, often predictive, based on large amounts of data. It powers text, image and voice recognition technologies, and is also used to develop search engines and recommender systems for online retail sites. More broadly speaking, machine learning refers to a corpus of theories and statistical learning methods, which encompasses deep learning. Stephan Clémençon, a researcher at Télécom ParisTech and Big Data specialist, explains the realities hidden behind these terms.

 

What is machine learning or automatic learning?

Stephan Clémençon: Machine learning involves teaching machines to make effective decisions within a predefined framework, using algorithms fueled by examples (learning data). The learning program enables the machine to develop a decision-making system that generalizes what it has “learned” from these examples. The theoretical basis for this approach states that if my algorithm selects, from a catalogue of decision-making rules that is “not overly complex”, rules that worked well on sample data, those rules will continue to work well on future data. This refers to the capacity to generalize rules that have been learned statistically.

 

Is machine learning supported by Big Data?

SC: Absolutely. The statistical principle of machine learning relies on the representativeness of the examples used for learning. The more examples, and hence learning data, are available, the better the chances of achieving optimal rules. With the arrival of Big Data, we have reached the statistician’s “frequentist heaven”. However, such massive data also poses problems for calculations and execution times. To access such massive information, it must be distributed over a network of machines. We now need to understand how to reach a compromise between the quantity of examples presented to the machine and the calculation time. Certain infrastructures are quickly overwhelmed by the sheer scale of the data (text, signals, images and videos) made available by modern technology.

 

What exactly does a machine learning problem look like?

SC: Actually, there are several types of problems. Some are called “supervised” problems, because the variable that must be predicted is observed through a statistical sample. One major example of supervised learning, from the early stages of machine learning, was enabling a machine to recognize handwriting. To accomplish this, the machine must be provided with many “pixelated” images, while being told that this one is an “e”, that one an “a”, etc. The computer was trained to recognize the letter that was written on a tablet. Observing the handwritten form of a character several times improves the machine’s capacity to recognize it in the future.
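
A minimal modern equivalent of that handwriting experiment, using scikit-learn's small built-in digits dataset (a convenient stand-in, not the historical setup described above):

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# 8x8 pixelated images of handwritten digits, each labelled 0-9
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Supervised learning: fit a classifier on labelled examples...
clf = LogisticRegression(max_iter=5000)
clf.fit(X_train, y_train)

# ...then check how well the learned rule generalizes to images it has never seen
print("accuracy on new examples:", clf.score(X_test, y_test))
```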

Other problems are unsupervised, which means that no labels are available for the observations. This is the case, for example, in monitoring systems used in predictive maintenance. The machine must learn what is abnormal in order to be able to issue an alert. In a way, the rarity of an event replaces the label. This problem is much more difficult because the result cannot be immediately verified; a later assessment is required, and false alarms can be very costly.

Other problems require a dilemma to be resolved between exploring the possibilities and making use of past data. This is referred to as reinforcement learning. This is the case for personalized recommendations. In retargeting, for example, banner ads are programmed to propose links related to your areas of interest, so that you will click on them. However, if you are never proposed any links related to classical literature, on the pretext that you do not yet have any search history in this subject, it will be impossible to effectively determine whether this type of content would interest you. In other words, the algorithm also needs to explore the possibilities rather than exploiting past data alone.
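
The exploration/exploitation dilemma behind such recommendations can be illustrated with a toy epsilon-greedy bandit: most of the time it recommends the content category with the best observed click rate, but occasionally it explores a category it knows little about. The categories and click probabilities below are invented for the example.

```python
import random

# Hypothetical true click-through rates per content category (unknown to the algorithm)
true_ctr = {"news": 0.05, "sport": 0.08, "classics": 0.12}
clicks = {k: 0 for k in true_ctr}
shown = {k: 0 for k in true_ctr}
EPSILON = 0.1  # fraction of recommendations spent exploring at random

def recommend():
    """Epsilon-greedy choice: explore at random, otherwise exploit the best estimate so far."""
    if random.random() < EPSILON:
        return random.choice(list(true_ctr))
    return max(true_ctr, key=lambda k: clicks[k] / shown[k] if shown[k] else float("inf"))

for _ in range(10000):
    arm = recommend()
    shown[arm] += 1
    clicks[arm] += random.random() < true_ctr[arm]  # simulated user click

# Exploration lets the algorithm discover that "classics" performs best
print({k: round(clicks[k] / shown[k], 3) for k in true_ctr})
```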

 

To resolve these problems, machine learning relies on different types of models, such as artificial neural networks; what does this involve?

SC: Neural networks are a technique based on a general principle that is relatively old, dating back to the late 1950s. This technique is inspired by the operating model of biological neurons. It starts with a piece of information – the equivalent of a stimulation in biology – that reaches the neuron. Whether the stimulation is above or below the activation threshold determines whether the transmitted information triggers a decision or action. The problem is that a single layer of neurons may produce a representation that is too simple to be used to interpret the original input information.

By superimposing layers of neurons, potentially with a varying number of neurons in each layer, new explanatory variables are created, as combinations of the outputs of the previous layer. The calculations continue layer by layer until a complex function has been obtained, representing the final model. While these networks can be very predictive for certain problems, it is very difficult to interpret the rules learned by a neural network; it is a black box.
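
A minimal numpy sketch of this layer-by-layer composition, with random weights chosen purely to show the structure (two stacked layers, each a weighted combination followed by a nonlinearity):

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, weights, bias):
    """One layer: a weighted combination of its inputs followed by a nonlinearity."""
    return np.tanh(x @ weights + bias)

# An input with 4 explanatory variables, passed through two superimposed layers
x = rng.normal(size=(1, 4))
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)   # first layer: 4 inputs -> 8 hidden units
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)   # second layer: 8 hidden units -> 1 output

hidden = layer(x, W1, b1)          # new variables built from combinations of the inputs
output = layer(hidden, W2, b2)     # the final model combines the hidden variables
print(hidden.shape, output.shape)  # (1, 8) (1, 1)
```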

 

We hear a lot about deep learning lately, but what is it exactly?

SC: Deep learning is a deep network of neurons, meaning it is composed of many superimposed layers. Today, this method can be implemented using modern technology that enables massive calculations to be performed, which in turn allows very complex networks to adapt appropriately to the data. This technique, in which many engineers in the fields of science and technology are very experienced, is currently enjoying undeniable success in the area of computer vision. Deep learning is well suited to the fields of biometrics and voice recognition, for example, but it shows mixed performance in handling problems in which the available input information does not fully determine the output variable, as is the case in biology and finance.

 

If deep learning is the present form of machine learning, what is its future?

SC: In my opinion, research in machine learning will focus specifically on situations in which the decision-making system interacts with the environment that produces the data, as is the case in reinforcement learning. This means that we will learn along a path, rather than from a collection of time-invariant examples thought to definitively represent the entire variability of a given phenomenon. Indeed, more and more studies are being carried out on dynamic phenomena with complex interactions, such as the dissemination of information on social networks. These aspects are often ignored by current machine learning techniques, and today are left to be handled by modeling approaches based on human expertise.

 

 


Accurate Quantification of Uncertainty. AXA Chair at Eurecom

AXA Chairs reward only a few scientists every year. With his chair on New Computational Approaches to Risk Modeling, Maurizio Filippone, a researcher at Eurecom, joins a community of prestigious researchers such as Jean Tirole, the French professor who won the Nobel Prize in economics.

 

Maurizio, you’ve just been awarded an AXA chair. Could you explain what it is about and what made your project selected?

AXA chairs are funded by the AXA Research Fund, which supports fundamental research to advance our understanding of risk. Started in 2008, the AXA Chair scheme funds about 50 new projects annually, of which four to eight are chairs. They are individual fellowships, and the one I received is going to support my research activities for the next seven years. My project is entitled “New Computational Approaches to Risk Modeling”. The AXA Chair selection process is not based on the project only. For this type of grant, several criteria are important: timeliness, vision, credibility of both the proposal and the candidate (track record, collaborations, etc.), institution and fit within the institution’s strategy. For example, the fact that the research area of this topic is in line with Eurecom’s long-term strategy in data science played a major role in the selection process of my project. This grant definitely represents a major achievement in my career.

 

What is your project about exactly?

My project deals with one simple question: how do you go from data to decisions? Today, we can access so much data generated by so many sensors, but we are facing difficulties in using these data in a sensible way. Machine learning is the main technique that helps make sense of data, and I will use and develop novel techniques in this domain throughout this project. Quantification of risk and decision-making require accurate quantification of uncertainty, which is a major challenge in many areas of science involving complex phenomena like finance, environmental and medical sciences. In order to accurately quantify the level of uncertainty, we employ flexible and accurate tools offered by probabilistic nonparametric statistical models. But today’s diversity and abundance of data make it difficult to use these models. The goal of my project is to propose new ways to better manage the interface between computational and statistical models – which in turn will help obtain accurate confidence estimates for predictions based on observed data.
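
Gaussian processes are one standard example of such probabilistic nonparametric models (an illustrative choice here; the interview does not name a specific family). The sketch below shows the kind of calibrated uncertainty they provide: a predictive mean together with a standard deviation that grows away from the observed data.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# A handful of noisy observations of an unknown function
rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(15, 1))
y = np.sin(X).ravel() + 0.1 * rng.normal(size=15)

# Nonparametric Bayesian regression: no fixed functional form is imposed on the data
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X, y)

# Predictions come with an explicit uncertainty, not just a point estimate
X_new = np.linspace(0, 12, 5).reshape(-1, 1)
mean, std = gp.predict(X_new, return_std=True)
for x_val, m, s in zip(X_new.ravel(), mean, std):
    print(f"x={x_val:4.1f}  prediction={m:+.2f}  uncertainty=±{s:.2f}")
```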

 

How will you be able to do that? With what kind of advanced computing techniques?

The idea behind the project is that it is possible to carry out exact quantification of uncertainty relying exclusively on approximate, and therefore cheaper, computations. Using nonparametric models is difficult and generally computationally intractable due to the complexity of the systems and amount of data. Although computers are more and more powerful, exact computations remain serial, too long, too expensive and sometimes almost impossible to carry out. The way approximate computations will be designed in this project will be able to reduce computing time by orders of magnitude! The exploitation of parallel and distributed computing on large scale computing facilities – which is a huge expertise at Eurecom – will be key to achieve this. We will thus be able to develop new computer models that will make accurate quantification of uncertainty possible.

 

What are the practical applications?

Part of the focus of the project will be on life and environmental applications that require quantification of risk. We will then use mostly life sciences data (e.g., neuroimaging and genomics) and environmental data for our models. I am confident that this project will help tackle the explosion of large-scale and diverse data in life and environmental sciences. This is already a huge challenge today, and it will be even more difficult to deal with in the future. In the mid-term, we will develop practical and scalable algorithms that learn from data and accurately quantify the uncertainty of their predictions. In the long term, we will be able to improve on current approaches for risk estimation: they will be timely and more accurate. These approaches can have major implications in the development of medical treatment strategies or environmental policies, for example. Is some seismic activity going to trigger a tsunami for which it is worth warning the population or not? Is a person showing signs of a disease like Parkinson’s actually going to develop the disease or not? I hope the results of our project will make it easier to answer these questions.

 

Do you have any partnerships in this project?

Of course! I will initiate some new collaborations and continue collaborating with several prestigious institutions worldwide to make this project a success: Columbia University in New York, Oxford, Cambridge, UCL and Glasgow in the UK, the Donders Institute of Neuroscience in the Netherlands, the University of New South Wales in Australia, as well as INRIA in France. The funding from the AXA Research Fund will help create a research team at Eurecom, comprising myself, two PhD students and one postdoc. I would like the team to bring together a blend of expertise, since novelty requires an interdisciplinary approach: computing, statistics, mathematics and physics, plus some expertise in the life and environmental sciences.

 

What are the main challenges you will be facing in this project?

Attracting talent is one of the main challenges! I've been lucky so far, but it is generally difficult. This project is extremely ambitious; it is a high-risk, high-gain project, so there are some difficult technical challenges to face, all of them related to the cutting-edge tools, techniques and strategies we will be using and developing. We will find ourselves in the usual situation when working on something new and visionary: getting stuck in blind alleys or having to abandon promising ideas that turn out not to work, to give a couple of examples. But that is why it has been funded for seven years! Despite these difficulties, I am confident this project will be a success and that we will make a huge impact.

 

French National Library

The French National Library is combining sociology and big data to learn about its Gallica users

As a repository of French culture, the Bibliothèque Nationale de France (BnF, the French National Library) has always sought to know and understand its users. This is no easy task, especially when it comes to studying the individuals who use Gallica, its digital library. To learn more about them, without limiting itself to interviewing sample individuals, the BnF has joined forces with Télécom ParisTech, taking advantage of its multidisciplinary expertise. To meet this challenge, the scientists are working with IMT's TeraLab platform to collect and process big data.


 

Often seen as a driving force for technological innovation, could big data also represent an epistemological revolution? The use of big data in experimental sciences is nothing new; it has already proven its worth. But the humanities have not been left behind. In April 2016, the Bibliothèque Nationale de France (BnF) leveraged its longstanding partnership with Télécom ParisTech (see box below) to carry out research on the users of Gallica — its free, online library of digital documents. The methodology used is based in part on the analysis of large quantities of data collected when users visit the website.

Every time a user visits the website, the BnF server records a log of all actions carried out by the individual on Gallica: pages opened on the website, time spent on the site, links clicked on the page, documents downloaded, etc. These logs, which are anonymized in compliance with the regulations established by the CNIL (the French Data Protection Authority), therefore provide a complete map of the user's journey, from arriving at Gallica to leaving the website.
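As an illustration only (the real BnF log format is not public, so the field names below are assumptions), reconstructing these journeys typically amounts to grouping log lines by anonymized visitor and cutting a new session after a long period of inactivity:

```python
# Hypothetical sketch: field names (timestamp, visitor_id, url, action) are
# assumed, not the BnF's actual schema. Consecutive actions by the same
# anonymized visitor, separated by less than 30 minutes, form one "session".
from datetime import datetime, timedelta
import csv

SESSION_GAP = timedelta(minutes=30)

def sessions_from_log(path):
    """Yield lists of log rows, one list per reconstructed session."""
    with open(path, newline="") as f:
        rows = sorted(csv.DictReader(f),
                      key=lambda r: (r["visitor_id"], r["timestamp"]))
    current, last_visitor, last_time = [], None, None
    for row in rows:
        t = datetime.fromisoformat(row["timestamp"])
        if current and (row["visitor_id"] != last_visitor
                        or t - last_time > SESSION_GAP):
            yield current
            current = []
        current.append(row)
        last_visitor, last_time = row["visitor_id"], t
    if current:
        yield current

# Example usage on a hypothetical export:
# for session in sessions_from_log("gallica_logs.csv"):
#     print(len(session), "actions, starting at", session[0]["url"])
```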

With 14 million visits per year, this information represents a large volume of data to process, especially since it must be correlated with the records of the 4 million documents available for consultation on the site (type of document, creation date, author, etc.), which also provide valuable information for understanding users and their interest in documents. Carrying out sociological fieldwork alone, by interviewing larger or smaller samples of users, is not enough to capture the great diversity and complexity of today's online user journeys.

Researchers at Télécom ParisTech therefore took a multidisciplinary approach. Sociologist Valérie Beaudouin teamed up with François Roueff to establish a dialogue between the sociological analysis of uses through field research, on the one hand, and data mining and modeling on the other. “Adding this big data component allows us to use the information contained in the logs and records to identify the typical behavioral profiles of Gallica users,” explains Valérie Beaudouin. The data is collected and processed on IMT's TeraLab platform, which provides researchers with a turnkey working environment that can be tailored to their needs and offers more advanced features than commercially available data processing tools.


What are the different profiles of Gallica users?

François Roueff and his team were tasked with using the available information to develop unsupervised learning algorithms that identify categories of behavior within this large volume of data. After six months of work, the first results appeared. The initial finding was that only 10 to 15% of Gallica users' browsing activity involves consulting several digital documents. The remaining 85 to 90% corresponds to occasional visits made to consult a specific document.

“We observed some very interesting things about the 10 to 15% of Gallica users involved,” says François Roueff. “If we analyze the Gallica sessions in terms of the variety of types of documents consulted (monographs, press, photographs, etc.), eight out of ten categories only use a single type,” he says. This reflects a tropism on the part of users toward a certain form of media. When it comes to consulting documents, there is generally little variation in the ways in which Gallica users obtain information. Some search for information about a given topic solely by consulting photographs, while others consult only press articles.
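To make this kind of unsupervised analysis tangible, here is a deliberately simplified sketch: each session is described by the share of document types it touched, and those proportion vectors are clustered. The researchers' actual models are more sophisticated; the k-means below, and the toy sessions it runs on, are only illustrative stand-ins.

```python
# Illustrative sketch only: cluster sessions by their mix of document types.
import numpy as np
from sklearn.cluster import KMeans

DOC_TYPES = ["monograph", "press", "photograph", "map", "manuscript"]

def session_vector(doc_types_consulted):
    """Proportion of each document type within one session."""
    v = np.array([doc_types_consulted.count(t) for t in DOC_TYPES], dtype=float)
    return v / v.sum() if v.sum() else v

# Toy sessions (in the real study these come from joining logs with records)
sessions = [
    ["press", "press", "press"],
    ["photograph", "photograph"],
    ["monograph", "press", "monograph"],
    ["photograph", "photograph", "photograph", "map"],
]
X = np.vstack([session_vector(s) for s in sessions])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # cluster index per session
print(kmeans.cluster_centers_)  # dominant document-type mix per cluster
```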

According to Valérie Beaudouin, the central focus of this research lies in understanding such behavior. “Using these results, we develop hypotheses, which must then be confirmed by comparing them with other survey methodologies,” she explains. Data analysis is therefore supplemented by an online questionnaire to be filled out by Gallica users, field surveys among users, and even by equipping certain users with video cameras to monitor their activity in front of their screens.

Photo from a poster for the Bibliothèque Nationale de France (BnF) with the slogan “Have you ever been inside an encyclopedia?”, October 2016. For the institution, making culture accessible to the public is a crucial mission, and that means digital resources must be made available in a way that reflects users' needs.

 

“Field studies have allowed us to understand, for example, that certain groups of Gallica users prefer downloading documents so they can read them offline, while others would rather consult them online to benefit from the high-quality zoom feature,” she says. The Télécom ParisTech team also noticed that, in order to find a document on the digital library's website, some users prefer to use Google and include the word “Gallica” in their search, rather than using the website's internal search engine.

Confirming the hypotheses also means working closely with the teams at the BnF, who provide knowledge about the institution and the technical tools available to users. Philippe Chevallier, project manager in the cultural institution's Strategy and Research delegation, attests to the value of this dialogue with the researchers: “Through our discussions with Valérie Beaudouin, we learned how to take advantage of the information collected by community managers about individuals who are active on social media, as well as user feedback received by email.”

Analyzing user communities: a crucial challenge for institutions

The project has provided the BnF with insight into how its existing resources can be used to analyze its users. This is another source of satisfaction for Philippe Chevallier, who is committed to the project's success. “This project is proof that knowledge about user communities can be a research challenge,” he says enthusiastically. “It is too important an issue for an institution like ours, so we need to dedicate time to studying it and leverage real scientific expertise,” he adds.

And when it comes to Gallica, the mission is even more crucial: Gallica users cannot be seen, whereas the predominant profile of visitors to the BnF's physical locations can be observed directly. “A wide range of tools are now available for companies and institutions to easily collect information about online uses or opinions: e-reputation tools, web analytics tools, etc. Some of these tools are useful, but they offer limited possibilities for controlling their methods and, consequently, their results. Our responsibility is to provide the library with meaningful, valuable information about its users, and to do so we need to collaborate with the research community,” says Philippe Chevallier.

In order to obtain the precise information it is seeking, the project will continue until 2017. The findings will offer insights into how the cultural institution can improve its services. “We have a public service mission to make knowledge available to as many people as possible,” says Philippe Chevallier. In light of the researchers' observations, the key question will be how to optimize Gallica, and for whom. Who should take priority? The minority of users who spend the most time on the website, or the overwhelming majority who only use it sporadically? Users from the academic community (researchers, professors, students) or the “general public”?

The BnF will have to take a stance on these questions. In the meantime, the multidisciplinary team at Télécom ParisTech will continue its work to describe Gallica users. In particular, it will seek to fine-tune the categorization of sessions by enriching them with a semantic analysis of the records of the 4 million digital documents. This will make it possible to determine which topics the sessions in the large volume of data collected relate to. The task poses modeling problems that require particular attention, since the content of the records is intrinsically heterogeneous: it varies greatly depending on the type of document and the conditions of digitization.
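As a hedged illustration of what such a semantic analysis might look like (the bibliographic records below are invented, and the team's actual method is not described in the article), a TF-IDF representation combined with a topic model can attach a small set of themes to record metadata:

```python
# Illustrative sketch: extract rough "topics" from invented record metadata
# using TF-IDF plus non-negative matrix factorization (NMF).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF

records = [
    "Journal des débats politiques et littéraires, 1914, presse quotidienne",
    "Photographie de la tour Eiffel, exposition universelle, Paris",
    "Carte de la Normandie, géographie, XVIIIe siècle",
    "Le Figaro, presse quotidienne, politique, 1916",
]

tfidf = TfidfVectorizer()
X = tfidf.fit_transform(records)

nmf = NMF(n_components=2, init="nndsvda", random_state=0).fit(X)
terms = tfidf.get_feature_names_out()
for k, topic in enumerate(nmf.components_):
    top = [terms[i] for i in topic.argsort()[-4:][::-1]]
    print(f"topic {k}:", ", ".join(top))  # most characteristic terms per topic
```

Once each record carries topic weights, a session's topics can be summarized from the records it consulted, which is precisely the kind of enrichment the team plans for the session categorization.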

 


Online users: a focus for the BnF for 15 years

The first study carried out by the BnF to describe its online user community dates back to 2002, five years after the launch of its digital library, in the form of a research project that already combined approaches (online questionnaires, log analysis, etc.). In the years that followed, digital users became an increasingly important focus for the institution. In 2011, a survey of 3,800 Gallica users was carried out by a consulting firm. Realizing that studying users would require more in-depth research, the BnF turned to Télécom ParisTech in 2013 with the objective of assessing the different possible approaches for a sociological analysis of digital uses. At the same time, the BnF launched its first big data study, measuring Gallica's position on the French web for research on World War I. In 2016, the sociology of online uses and the big data experiment were brought together in the current project, which aims to understand the uses and users of Gallica.

 


The autonomous car: safety hinging on a 25cm margin

Does an autonomous or semi-autonomous car really know where it is located on a map? How accurately can it position itself on the road? For the scientists who are part of the European H2020 “HIGHTS” project, intelligent transportation systems must know their position down to one quarter of a meter. Jérôme Härri, a researcher in communication systems at Eurecom, a partner school in this project, explains how current positioning technology must be adapted to achieve this level of precision. He also explains why this involves a different approach from the one used by manufacturers such as Tesla or Google.

 

You are seeking solutions for tracking vehicles’ location within a margin of 25 centimeters. Why this margin?

Jérôme Härri: It is the car's average margin for drifting to the right or left without leaving its traffic lane. This distance appears both in the scientific literature and in requests from industrial partners seeking to develop intelligent transportation. You could say it is the value at which autonomous driving becomes possible while ensuring the required safety for vehicles and individuals: greater precision is even better; with less precision, things get complicated.

 

Are we currently far from this spatial resolution? With what level of precision do the GPS units in most of our vehicles locate us on the road?

JH: A basic GPS can locate us with an accuracy of 2 to 10 meters, and the new Galileo system promises an accuracy of 4 meters. But this is only possible with sufficient access to satellites, in an open or rural area. In an urban context, tall buildings make satellites less accessible, and reaching an accuracy better than 5 meters is rare. The margin of error is then reduced by projection, so the user only rarely notices such a large positioning error. But this does not work for an autonomous car. Improvements to GPS do exist, such as differential GPS, which can position us with an accuracy of one meter or even less. Real-time kinematic (RTK) technology, used for cartography in the mountains, is even more accurate. Yet it is expensive and also has its limits in the city. RTK is becoming increasingly popular in the context of digital cities, but we are not there yet.

 

And yet Google and Tesla are already building their autonomous or semi-autonomous cars. How are these cars being positioned?

JH: Current autonomous cars use a very precise positioning system on maps, down to the traffic lane, which combines GPS and 4G. However, this system is slow. It is therefore used for navigation, so that the car knows what it must do to reach its destination, but not for detecting danger. For that, the cars use radar, lidar (in other words, lasers) or cameras. But this system has its limits: the sensors can only see around 50 meters ahead. On the highway, however, cars travel at 30 or even 40 meters per second. This gives the autonomous car about one second to stop, slow down or adapt in the event of a problem, which is not enough. And the system is not infallible. For example, the Tesla accident that occurred last May was caused by the camera responsible for detecting danger confusing the light color of a truck with that of the sky.

 

What approaches are you taking in the HIGHTS project for improving the geolocation and reliability?

JH: We want to know, within a 25-centimeter margin, where a vehicle is located on the road, not just in relation to another car. To do this, we use cooperation between vehicles to triangulate and reduce the effect of a weak GPS signal. We consider that every nearby vehicle can be an anchor for the triangulation. For example, an autonomous car may have a weak GPS signal while three surrounding cars have a better one. We can improve the car's absolute positioning by triangulating its position in relation to these three nearby vehicles. To do this, we need communication technologies for exchanging GPS positions (Bluetooth, ZigBee, Wi-Fi, etc.) and technologies such as cameras and radar to refine the positioning in relation to surrounding vehicles.
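A simplified sketch of the cooperative idea, with invented numbers: if three neighbouring vehicles broadcast their (better) GPS positions and the car can measure its distance to each of them, a least-squares trilateration recovers its own absolute position. This is only one possible formulation of the triangulation described above, not the project's actual algorithm.

```python
# Illustrative trilateration: estimate own position from >=3 anchor positions
# (neighbouring vehicles with good GPS) and measured ranges to them.
import numpy as np

def trilaterate(anchors, distances):
    """Least-squares 2D position from anchor coordinates and measured ranges."""
    anchors = np.asarray(anchors, float)
    d = np.asarray(distances, float)
    x0, d0 = anchors[0], d[0]
    # Linearise |p - a_i|^2 - |p - a_0|^2 = d_i^2 - d_0^2  ->  A p = b
    A = 2 * (anchors[1:] - x0)
    b = (d0**2 - d[1:]**2
         + np.sum(anchors[1:]**2, axis=1) - np.sum(x0**2))
    return np.linalg.lstsq(A, b, rcond=None)[0]

# Three neighbour vehicles (metres, local east/north frame) and ranges to them
neighbours = [(0.0, 0.0), (20.0, 2.0), (8.0, -15.0)]
ranges = [11.7, 16.1, 25.1]
print(trilaterate(neighbours, ranges))   # estimated own position, roughly (6, 10)
```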

 

And what if the car is isolated, without any other cars nearby?

JH: When there are not enough cars nearby, we also pursue an implicit approach. Using roadside sensors at strategic locations, it is possible to locate the car precisely on the map. For example, if I know the distance between my vehicle and a billboard or a traffic light, and the angles between these landmarks and the road, I can combine this with the GPS positions of the billboard and traffic light, which do not move, making them very strong positioning anchors. We therefore combine the relative approach with the absolute positions of objects along the road. Yet this situation does not occur very frequently. In most cases, what enables us to improve accuracy is cooperation with other vehicles.

 

So, does the HIGHTS project emphasize the combination of different existing technologies rather than seeking to find new ones?

JH: Yes, with the aim of validating their effectiveness. Yet at the same time we are working on developing LTE telecommunication networks for the transmission of information from vehicle to vehicle — which we refer to as LTE-V2X. In so doing we are seeking to increase the reliability of the communications. Wi-Fi is not necessarily the most robust form of technology. On a computer, when the Wi-Fi isn’t working, we can still watch a movie. But for cars, the alternative V2X technology ensures the communications if the Wi-Fi connection fails, whether it is by accident or due to a cyber-attack. Furthermore, these networks provide the possibility of using pedestrians’ smartphones to help avoid collisions. With the LTE networks, HIGHTS is testing the reliability of the device-to-device LTE approach for inter-vehicle communication. Our work is situated upstream of the standardization work. The experience of this project enables us to work beyond the current standards and develop them along with organizations such as ETSI-3GPP, ETSI-ITS and IETF.

 

Does your cooperative approach stand a chance of succeeding against the individualistic approach used by Tesla and Google, who seek to remain sovereign regarding their vehicles and solutions? 

JH: The two approaches are not incompatible. It is a cultural issue. Americans (Google, Tesla) think of the “autonomous car” in the strictest sense, with no outside help. Europeans, on the other hand, think of the “autonomous car” in the broader sense, without the driver's assistance, and are therefore more likely to use a cooperative approach in order to reduce costs and improve the interoperability of future autonomous cars. We have been working on the collaborative aspect for several years now, including research on integrating cars into the Internet of Things carried out with the CEA and BMW, both partners in the HIGHTS project. We therefore have some very practical and promising lines of research on our side. Moreover, the U.S. Department of Transportation has issued a directive requiring vehicles to carry a cooperative unit from 2019. Google and Tesla can continue to ignore this technology, but since these units will be present in vehicles and freely available to them, there is a good chance they will end up using them.

 


HIGHTS: moving towards a demonstration platform

Launched in 2015, the three-year HIGHTS project responds to the H2020 research program's call on smart, green and integrated transportation. It brings together 14 academic and industrial partners[1] from five countries, including companies that work closely with major automakers such as BMW. Its final objective is to establish a demonstration platform for vehicle positioning solutions, from the physical infrastructure to the software.

[1] Germany: Jacobs University Bremen, Deutsches Zentrum für Luft- und Raumfahrt (DLR), Robert Bosch, Zigpos, Objective Software, Ibeo Automotive Systems, Innotec21.
France: Eurecom, CEA, BeSpoon.
Sweden: Chalmers University of Technology.
Luxembourg: FBConsulting.
The Netherlands: PSConsultancy, TASS International Mobility Center.
