Modelling - Biological systems

6/07/2011

Parameter estimation: how to use incomplete biological data?

E. coli cell © Institut Pasteur

Modelling biological systems, in particular at the level of the dynamics of cellular processes, is the objective of the IBIS team. Biological data are becoming ever more plentiful, but their use in estimating the mathematical parameters of a model remains an especially difficult problem. Sara Berthoumieux, a PhD student in the IBIS team, tells us about her work on the subject, carried out as part of her thesis.

What are your principal avenues of research?

Sara Berthoumieux: We work on modelling biological systems, in particular gene networks and metabolic networks. We use dynamic models that describe how these systems evolve over time. For my thesis, we were particularly interested in metabolic networks. Metabolism comprises all the chemical reactions that produce, from nutrients in the environment, energy as well as all the proteins necessary for cell development and growth. We studied the metabolism of the bacterium Escherichia coli. This bacterium is widely studied in biology because it is easy to culture. It is therefore well understood, which facilitates modelling. The purpose of this research is to gain a better understanding of metabolic processes, with prospects in particular for biotechnology applications involving E. coli.

Example of a metabolic network - © LGCB, university of Clermont Ferrand

What difficulties have been encountered?

Sara Berthoumieux: The main difficulty lies in obtaining values for the mathematical parameters of the model. These are coefficients that allow the reactions to be quantified, and they are essential for constructing the model. But these parameters are not directly measurable, as most of the time they do not correspond to a biological quantity. We must therefore estimate them from biological data on the output of the model: here, in particular, the concentrations of the metabolites, which are the components of the chemical reactions, and the fluxes of these reactions. It should be noted that it is very difficult to measure these values precisely, as metabolic reactions are very fast and the metabolites are unstable compounds. Measuring them requires relatively recent, sophisticated techniques and very powerful measurement apparatus. These new techniques yield a large amount of information, but it contains a great deal of noise due to the significant experimental uncertainties. In addition, many values are missing, which is highly problematic when estimating the parameters of the model. Our work therefore consisted of proposing a parameter estimation method adapted to the model we are studying, to facilitate the use of large biological data sets even if these data are incomplete and imprecise.

Can you tell us a little more about this method of estimating parameters?

Sara Berthoumieux: We searched the literature for biological data on metabolites and selected the largest existing data set, published in an article in the journal Science in 2007. We treat the missing data as random variables whose distribution is specified from the observed data. To estimate the parameter values from these data, we have adapted a standard method from the literature. In addition, we calculate a margin of error, called the confidence interval, for each parameter. However, we noted that even with the largest existing data set and a valid method, the confidence intervals obtained are not always narrow enough to give precise estimates of the parameter values. At present, it is still very difficult to obtain experimental data that are sufficiently precise and abundant for calibration of quantitative models of large metabolic networks!
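The general idea of treating missing values as random variables can be illustrated with a small sketch. The interview does not name the estimation method or the form of the model, so everything below is an assumption made for illustration: a linear model y ≈ X @ theta, a simple multiple-imputation scheme standing in for the team's adapted method, and a hypothetical function name `estimate_with_missing`. Missing entries of X are drawn from a normal distribution fitted to the observed entries of the same column, the parameters are fitted by least squares on each completed data set, and the draws are summarised by their mean and a percentile interval.

```python
import numpy as np


def estimate_with_missing(X, y, n_draws=200, seed=0):
    """Illustrative sketch: fit theta in y ~ X @ theta when X contains NaN gaps.

    Hypothetical helper, not the method used in the thesis.
    Returns (estimate, lower, upper), one value per parameter.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    missing = np.isnan(X)

    # Distribution of each column, specified from the observed entries only.
    col_mean = np.nanmean(X, axis=0)
    col_std = np.nanstd(X, axis=0)

    draws = np.empty((n_draws, X.shape[1]))
    for k in range(n_draws):
        # Draw one candidate value per cell, then keep it only where data are missing.
        candidates = rng.normal(col_mean, col_std, size=X.shape)
        X_filled = X.copy()
        X_filled[missing] = candidates[missing]
        # Ordinary least squares on the completed data set.
        theta, *_ = np.linalg.lstsq(X_filled, y, rcond=None)
        draws[k] = theta

    estimate = draws.mean(axis=0)
    # Spread of the estimates across imputations (imputation uncertainty only).
    lower, upper = np.percentile(draws, [2.5, 97.5], axis=0)
    return estimate, lower, upper
```

The interval returned here reflects only the variability introduced by filling in the gaps, not the measurement noise, so it is narrower than a full confidence interval would be. Drawing each missing value independently of y is also a deliberate simplification. Even so, the sketch shows the effect described above: the more values are missing, the wider the interval around each parameter becomes.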

This work was carried out jointly with the Biometry and Evolutionary Biology Laboratory (LBBE). The model studied was designed especially for the article by Matteo Brilli, currently a post-doctoral researcher in the Inria Bamboo project team.

Sara Berthoumieux has won the Ian Lawson Van Toch Memorial Award at ISMB/ECCB 2011.


The ISMB/ECCB 2011 conference, to be held from 15 to 19 July 2011 in Vienna, Austria, is one of the most important gatherings in bioinformatics. For this 2011 edition, the 19th international conference on Intelligent Systems for Molecular Biology (ISMB) is being organised jointly with the 10th European Conference on Computational Biology (ECCB). Its main objective is the development and application of advanced computational methods to biological problems. The conference brings together scientists in computer science, molecular biology, mathematics, statistics and other related fields to offer a programme reflecting the diversity of disciplines within bioinformatics and computational biology.

Keywords: Parameter estimation, Sara Berthoumieux, IBIS research team, Inria Grenoble - Rhône-Alpes, Modelling
