FLASH: a Fast joint model for Longitudinal And Survival data in High dimension
V.T. Nguyen, A. Fermanian, A. Guilloux, A. Barbieri, S. Zohar, A.S. Jannot, S. Bussy
Biometrics (2024)
I'm founder & CEO of Califrais,
the startup that decarbonizes the food supply chain with AI.
Califrais has a foot in both the
supply chain and food industries: two sectors with a huge environmental impact yet very little
technological progress. It's particularly exciting to bring AI innovations to these
areas!
Thanks to our LabCom LOPF (Large-scale Optimization of Product Flows), a unique collaborative structure with multiple academic research labs and with the support of our historical sponsors
CNRS and
Sorbonne Université, our mission is to invent AI-supported technological solutions to optimize large-scale food flows. We've deployed our technology in the largest fresh produce market in the world: Rungis. In this first use case, we proved that our solutions reduce food waste by a factor of 2 and CO2 emissions by a factor of 7.
I received a PhD in Machine Learning from Sorbonne University prepared at
LPSM
in 2019.
During my PhD, I was interested in problems related to prognosis studies in high dimension,
with a particular emphasis on the survival analysis framework and the underlying applications to
personalized medicine.
Doctor Norbert Marx Award 2019
(French Statistical Society)
PhD Thesis Award Daniel Schwartz 2020
(French Biometric Society)
Statistical learning theory deals with the problem of finding a predictive function based on data.
The information era has witnessed an explosion in the collection of data in a variety of fields such as medicine, biology, marketing and finance. With it have come new theoretical and algorithmic challenges that I find fascinating. Statistical learning is precisely the field that provides a theoretical framework for the design and analysis of predictive algorithms.
Problems in which the ambient dimension is of the same order as, or substantially larger than, the sample size.
High-dimensional statistics has become the focus of increasing attention in the modern era of big data. Today, massive data sets (with potentially thousands of variables) play an important role in almost every branch of modern human activity. I am interested in the development of new statistical methods to separate the signal from the noise.
Set of methods for analyzing data where the outcome variable is the time until the occurrence of an event of interest.
Survival analysis is the analysis of time-to-event data. Such data describe the length of time from a time origin to an endpoint of interest. I am particularly interested in designing new methods for medical applications such as prospective cohort studies with longitudinal data in high-dimensional settings.
Problems in which the data is a series of points ordered in time and the goal is usually to make a forecast for the future.
Roughly speaking, time series forecasting is the use of a model to predict future values based on previously observed values. Time series are widely used for non-stationary data, such as economic, weather, stock price, or retail sales data. I am particularly interested in designing new methods to model longitudinal data in high-dimensional settings.
Problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment.
Reinforcement learning is an area of Machine Learning concerned with taking suitable actions to maximize a cumulative reward in a given situation. It can be employed in various applications to find the best possible behavior or path to take in a specific situation. I am interested in designing new methods in this context as well as new applications.
Models composed of multiple layers to learn data representations with multiple levels of abstraction.
Deep learning algorithms seek to exploit the unknown structure in the input distribution in order to discover good representations, often at multiple levels, with higher-level learned features defined in terms of lower-level features. These methods have dramatically improved the state of the art in various applications, and the research perspectives are still huge.
M. Hihat, S. Gaïffas, G. Garrigos, S. Bussy,
NeurIPS (2023)
R. Veil, S. Bussy, V. Looten, J.B. Arlet, J. Pouchot, A.S. Jannot, B. Ranque
Journal of Clinical Medicine (2019)
Doctor Norbert Marx 2019 Award
S. Bussy, A. Guilloux, S. Gaïffas, A.S. Jannot
Statistical Methods in Medical Research (2018)
Supervised by A. Guilloux, A.S. Jannot, S. Gaïffas. Paris - France, October 2015-October 2018
Supervised by A.S. Jannot, S. Gaïffas, A. Guilloux. Palaiseau - France, April-September 2015
Paris - France, January 2015 - March 2015
Supervised by Emilie Kaufmann. Paris - France, October 2014 - January 2015
San Francisco - United States, February - August 2014
Supervised by Jérémie Jakubowicz. Évry - France, October 2013 - January 2014
Supervised by Wojciech Pieczynski. Évry - France, April - June 2013
We introduced a prognostic method called lights to deal with the problem of joint modeling of longitudinal data and censored durations in a high-dimensional context.
At the intersection between theory and applications, my work was focused on the design and analysis of statistical methods for high-dimensional problems, with a particular emphasis on survival analysis settings.
Harmonic analysis, wavelet analysis and signal processing, optimization, information theory and pattern recognition, statistical learning and high-dimensional statistics, kernel methods, reinforcement learning, graphical models, computer vision.
Course (grade): maths (A), data analysis (A), probability & statistics (A), data mining (A+), numerical analysis (A), optimization (A), information theory (A+), stochastic processes (A+), queuing theory (A+), database management (A).