
-----------------------
Chance and observation
A marriage of convenience to describe the world
-----------------------

 

 


"Those who ignore statistics have to reinvent it." This quip, freely adapted from Bradley Efron, one of the major statisticians of the late twentieth century, illustrates the importance that statistics has acquired in modern science. Increasingly, research scientists have to study complex processes that involve random phenomena, that is, processes that produce data that vary due to chance, in order to explain and act on the world. We try, for instance, to model and predict the weather, identify genes, interpret images, and so on.

Statistics in the broader sense covers the whole spectrum of techniques used to study random phenomena. These techniques call upon probability theory to design laws (probability distributions) capable of modeling this type of phenomenon. Statistics in the strict sense is the experimental side of probability theory: it analyzes and interprets the data supplied by random phenomena. Lastly, statistical inference analyzes experimental or observational data in order to assess how well they fit probabilistic models.

Using these techniques, scientists build theoretical models that are then confronted with data so that they can be optimally adjusted. Depending on the field, this phase of the modeling work is called learning, parameter estimation or identification, model approximation, or data assimilation. Statistics makes it possible to take into account not only the noise present in the data, but also the heterogeneity of the models studied and the atypical values that may occur. In the following articles, we will see that the complexity of the situations to be modeled has led to the development of sophisticated statistical methods that go beyond the traditional framework of homogeneous, independent data sharing the same law (that is, independent and identically distributed data). For example, hidden structure models (especially Markov chains and fields) play a major role in signal processing and image analysis.
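To make the idea of a hidden structure model concrete, here is a minimal sketch of the forward recursion, which computes the probability of an observed sequence under a hidden Markov chain. The two-state chain, its parameter values, and the names `forward_likelihood`, `init`, `trans` and `emit` are illustrative assumptions, not taken from the articles that follow.

```python
from itertools import product


def forward_likelihood(obs, init, trans, emit):
    """Probability of an observation sequence under a hidden Markov chain.

    init[s]     : probability that the hidden chain starts in state s
    trans[r][s] : probability of moving from hidden state r to s
    emit[s][o]  : probability of observing symbol o in hidden state s
    """
    n_states = len(init)
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            emit[s][o] * sum(alpha[r] * trans[r][s] for r in range(n_states))
            for s in range(n_states)
        ]
    return sum(alpha)


# Hypothetical parameters: two hidden states, two observable symbols (0 and 1).
init = [0.6, 0.4]
trans = [[0.9, 0.1],
         [0.2, 0.8]]
emit = [[0.8, 0.2],
        [0.3, 0.7]]

obs = [0, 0, 1, 1, 1]
likelihood = forward_likelihood(obs, init, trans, emit)

# Sanity check: the likelihoods of all sequences of a given length sum to 1.
total = sum(
    forward_likelihood(list(seq), init, trans, emit)
    for seq in product([0, 1], repeat=3)
)
print(f"P(obs) = {likelihood:.6f}, sum over all length-3 sequences = {total:.6f}")
```

Successive observations here are neither independent nor identically distributed: each one depends on a hidden state that itself evolves over time, which is the kind of departure from the traditional framework described above.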

Another essential aspect of the use of statistics is that it makes it possible to assess the realism or efficiency of a model by supplying an evaluation of its stability and precision. Statistical inference plays a central role in this regard by allowing the construction of model evaluation measures that simultaneously take into account sampling fluctuations in data acquisition and the model's capability to represent reality. Based on measures of how well data fit probabilistic models, inference provides a sound theoretical framework for evaluating the quality of the results produced and thus for selecting a significant model or a high-performance analysis method. Statistical tests; resampling techniques such as cross-validation or the bootstrap, which are widely used in statistical learning; and the Bayesian approach, which leads to model selection criteria that balance a model's goodness of fit against a measure of its complexity: these are powerful statistical inference tools for accomplishing such a difficult task. In the field of data mining or statistical learning, for example, statistics plays an important role in the validation of methods that were designed in a non-random context, such as function approximation (neural networks, support vector machines (SVM)), or in a purely geometric framework.
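As an illustration of how resampling quantifies sampling fluctuations, the following is a minimal sketch of the nonparametric bootstrap applied to the standard error of a sample mean. The simulated sample, the number of resamples, and the helper name `bootstrap_se` are assumptions chosen for the example; only the Python standard library is used.

```python
import random
import statistics


def bootstrap_se(data, statistic, n_resamples=2000, seed=0):
    """Estimate the standard error of `statistic` by resampling with replacement."""
    rng = random.Random(seed)
    n = len(data)
    # Each replicate recomputes the statistic on a resample of the same size.
    replicates = [
        statistic(rng.choices(data, k=n))
        for _ in range(n_resamples)
    ]
    return statistics.stdev(replicates)


# Simulated data (an assumption for the example): 50 noisy measurements.
data_rng = random.Random(42)
sample = [data_rng.gauss(10.0, 2.0) for _ in range(50)]

se_boot = bootstrap_se(sample, statistics.mean)
# For the mean, a textbook formula exists and serves as a cross-check.
se_formula = statistics.stdev(sample) / len(sample) ** 0.5

print(f"bootstrap SE = {se_boot:.4f}, formula SE = {se_formula:.4f}")
```

The same function can be called with a statistic that has no simple closed-form standard error, for example `bootstrap_se(sample, statistics.median)`, which is where resampling becomes most useful.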

Contact:

Gilles Celeux,
SELECT team, INRIA Futurs
Tel.: +33 1 69 15 57 77

© INRIA - updated 04/23/2004 - webmaster@inria.fr