Lampada is a four-year project launched in November 2009 and funded by the French National Research Agency (ANR). It is a fundamental research project on machine learning and structured data, focused on scaling learning algorithms to handle large sets of complex data. The main challenges are 1) high-dimensional learning problems, 2) large data sets and 3) data dynamics. The complex data we consider evolve over time and are composed of parts linked by relations. Representations of these data embed both structure and content information and are typically large sequences, trees and graphs. The main application domains are Web 2.0, social networks and biological data. The project proposes to study formal representations of such data together with incremental or sequential machine learning methods and similarity learning methods. The representation research topic includes condensed data representation, sampling, prototype selection and representation of data streams. Machine learning methods include edit distance learning, reinforcement learning and incremental methods, density estimation of structured data and learning on streams.
The Bingo2 project "Knowledge Discovery For and By Inductive Queries in post-genomic applications" ANR-07-MDCO-014 is a three-year project launched in January 2008. Bingo2 is funded by the French National Research Agency ANR and is a follow-up of the Bingo project (2004-2007) funded by ACI Masse de Données. Within the inductive database framework, Bingo2 aims at designing and developing new methods and tools to support knowledge discovery from databases in order to avoid the "pattern flooding which follows data flooding" that is unfortunately so typical of exploratory KDD processes. The discovery and use of (partial) domain knowledge in the post-genomic area is the main thread of our work. Bingo2 tackles the following open problems: (1) how (partial) domain knowledge can be discovered and used for knowledge discovery in post-genomics, (2) how to design generic methods to gather and mine heterogeneous data sources and (3) how to provide KDD scenarios in molecular biology and WWW usage mining. Bingo2 brings together four partners combining computer science and biological skills.
The SATTIC project "Strings And Trees for Thumbnail Images Classification" ANR Blanc 07-1__184534 is a three-year project launched in January 2008. SATTIC is funded by the French National Research Agency ANR and is partially supported by the IST Programme of the European Community, under the PASCAL 2 Network of Excellence, IST-2006-216886. In order to manage some of the huge data sets that are now available, and more particularly to classify, recognize or search through these sets, one needs a representation system which is rich enough to describe the data while allowing an efficient and mathematically well-understood exploitation. Such a representation is both well defined and easily computed when data are numerical values, or more generally vectors of numerical values. However, many objects are poorly modelled by such vectors, which cannot express notions such as sequentiality or relationships between attributes. In particular, this project aims at representing and exploiting thumbnail images such as those returned by search engines like Google. While much work has been done on high-definition images, none addresses the question of filtering these small images, whose definition is too low to allow a segmentation into regions and/or the exploitation of wide-support local measures. An appealing alternative lies in modelling images by extracting and symbolically structuring salient points: salient points, which correspond to the high-contrast points of the image, may be easily detected in thumbnail images; we propose to structure them by means of strings, trees, or more generally graphs, in order to integrate information on saliency degree or spatial relationships. We propose in this project to study the capabilities of such salient point structuring to model and exploit thumbnail images.
This goal implies the definition of a new paradigm for analysing and statistically characterizing symbolic structured data, at odds with classical approaches used for numerical data.
The research is being carried out in close collaboration with a French mutual health benefit organization called "Mutualité Française de la Loire". This is a non-profit organization which provides health and social care services in France. Catherine Combes.
The research was carried out in close collaboration with the French co-operative health organization called the “Centre Mutualiste d’Addictologie”, an aftercare center for addictology. This work was done in close collaboration with Dr. Christian Digonnet, psychiatrist and manager of the addiction center in Saint-Galmier, France (2009-2011).
The research investigates clustering techniques, which belong to "unsupervised" machine learning. The work deals with the identification of patient profiles. The objective is to propose a data mining approach that automatically identifies patterns (based on the co-occurrence of observed variables). Psychiatric disorders related to addictions are studied in a population-based sample to determine whether conditions co-occur, that is, to recognize homogeneous groups of patients based on their features in such a way that patients belonging to the same group are similar and those belonging to different groups are dissimilar. The aim is to automatically find the hidden structure corresponding to the feature patterns of people suffering from addictions. We want to automatically find the number of categories or groups of patients and the “best” representative patient profile of each group. We also present the specific distance used to describe how far apart objects are.
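As an illustration only, the idea of finding both the number of groups and a representative profile per group can be sketched as follows. This is a minimal sketch under stated assumptions, not the method used in the study: it assumes binary patient features, uses the Hamming distance as a stand-in for the project's specific distance, clusters with a plain k-medoids loop, and picks the number of groups by the mean silhouette score. All function names and the toy `profiles` data are hypothetical.

```python
def hamming(a, b):
    """Fraction of features on which two binary profiles differ (assumed distance)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def init_medoids(points, k, dist):
    """Farthest-first start: point 0, then the point farthest from the chosen medoids."""
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(
            range(len(points)),
            key=lambda i: min(dist(points[i], points[m]) for m in medoids)))
    return medoids

def assign(points, medoids, dist):
    """Label each point with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist(p, points[medoids[c]]))
            for p in points]

def k_medoids(points, k, dist, max_iter=50):
    """Alternate assignment and medoid update until the medoids stop changing."""
    medoids = init_medoids(points, k, dist)
    for _ in range(max_iter):
        labels = assign(points, medoids, dist)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster became empty
                new_medoids.append(medoids[c])
                continue
            # The medoid is the member with the smallest total distance to the others.
            new_medoids.append(min(
                members,
                key=lambda i: sum(dist(points[i], points[j]) for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids, dist)

def silhouette(points, labels, dist):
    """Mean silhouette score; points in singleton clusters contribute 0."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            continue
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        b = min(sum(dist(points[i], points[j]) for j in members) / len(members)
                for other, members in clusters.items() if other != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def best_clustering(points, dist, k_max):
    """Try k = 2..k_max and keep the clustering with the highest silhouette."""
    best = None
    for k in range(2, k_max + 1):
        medoids, labels = k_medoids(points, k, dist)
        score = silhouette(points, labels, dist)
        if best is None or score > best[0]:
            best = (score, k, medoids, labels)
    return best

# Toy data (hypothetical): six patients described by six binary features.
profiles = [
    (1, 1, 1, 0, 0, 0), (1, 1, 0, 0, 0, 0), (1, 1, 1, 1, 0, 0),
    (0, 0, 0, 1, 1, 1), (0, 0, 0, 0, 1, 1), (0, 0, 1, 1, 1, 1),
]
score, k, medoids, labels = best_clustering(profiles, hamming, k_max=4)
representatives = [profiles[m] for m in medoids]
```

Here `representatives` holds one actual patient profile per group. Medoids are used rather than averaged centroids so that each representative is an observed patient, which matches the stated goal of a "best" representative profile.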
This work was carried out in close collaboration with nursing homes. It concerns the degree of self-handicap of dependent elderly people living in nursing homes (2006-2010).
We investigate the contribution of unsupervised learning and regular grammatical inference to identify, respectively, the profiles of elderly people and their development over time, in order to evaluate care needs (human, financial and physical resources). The aim is to forecast the progress of residents' autonomy/disability over time (using techniques such as grammatical inference to identify the transition graph between profiles) in order to plan the activities and the necessary human, material and financial resources.
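To make the notion of a transition graph between profiles concrete, the following minimal sketch estimates one by counting observed transitions in residents' profile histories. It is a deliberately simpler stand-in (a first-order transition count), not a grammatical inference algorithm, and the profile names and the `histories` data are hypothetical.

```python
from collections import Counter, defaultdict

def transition_graph(sequences):
    """Estimate profile-to-profile transition probabilities from label sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count each consecutive pair (current profile, next profile).
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    graph = {}
    for cur, nxts in counts.items():
        total = sum(nxts.values())
        # Normalise the counts into outgoing edge weights that sum to 1.
        graph[cur] = {nxt: n / total for nxt, n in nxts.items()}
    return graph

# Hypothetical histories: one list of successive profiles per resident.
histories = [
    ["autonomous", "autonomous", "partly dependent", "dependent"],
    ["autonomous", "partly dependent", "partly dependent", "dependent"],
    ["partly dependent", "dependent", "dependent"],
]
graph = transition_graph(histories)
```

Each key of `graph` is a profile and each value maps the possible next profiles to their estimated probabilities. Such edge weights could then feed a forecast of how many residents are expected in each profile at the next assessment.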
"Web Intelligence" is funded by the French administrative region Région Rhône-Alpes in the context of the ISLE Research Cluster. It brings together a consortium of researchers from several laboratories of the Région Rhône-Alpes who work on topics exploiting Artificial Intelligence and advanced information technology on the Web and the Internet. The objective of the project is to establish in the Région Rhône-Alpes an integrated, multidisciplinary and leading European research community in the area of Web Intelligence by: analyzing and understanding current and possible future roles of the Web and Internet from the users’ perspective; contributing to the development of technologies for the current and future evolutions of the Web and Internet; proposing up-to-date teaching material on Web Intelligence topics for students at the undergraduate and graduate levels, as well as for ICT companies and other key actors; establishing a dense and fruitful network with Rhône-Alpes industry for the technology transfer of the results of the Web Intelligence project.
In the framework of the CREST research project "Development of a Physiological and Environmental Information Processing Platform and its Application to the Metabolic Syndrome Measures", we are developing a novel information processing platform based on wearable sensors which record real-time physiological and environmental data in a person's daily life and objectively identify that person's lifestyle habits. One of our goals is to develop a service for sharing bio-environmental information relevant to lifestyle-related diseases between medical specialists and information analysis researchers. This project is carried out in cooperation with the University of Tokyo and is sponsored by the Japan Science and Technology Agency (JST) and the Ministère de l’Education Nationale et de la Recherche de France.