Estimation of Model Parameters: Houston, there was a problem

Dear DREAM6 Parameter Estimation Challenge participants
It was brought to our attention by Clemens Kreutz, from University of Freiburg, that there was a "bug" in the definition of the models of the "Estimation of Model Parameters" challenges that affects the scores in non-trivial ways. 
The Problem
 
 The models were defined in SBML, where the ordering of the equations can be entered arbitrarily. The correct order is usually left to be handled by the simulators. However, some simulators would, in some cases, get it wrong too. The result was that when the SBML was translated into MATLAB, the arbitrary order of equations permitted by SBML created the "WRONG" (i.e., unintended) differential equations for matlab, as in the following example taken from model 1 definition: 

as3 = ((1+((p1/v3_Kd)^v3_h))-1)/((1+((p1/v3_Kd)^v3_h)));

cod3 = pro3_strength * (( as3) *(rs4));

rs4 = 1.0/((1+((p2/v4_Kd)^v4_h)));

In this case, the repression terms (rs4) was updated after the rate term (cod3) which depends on rs4. This means that a temporal shift is introduced during the ODE integration, as the protein concentration of the previous integration step is used to calculate the repression term. The scoring and data given for purchase were generated using the unintended (i.e., wrong order) order of differential equations.

Participants using the SBML model definitions with "intelligent" integrators, or those that explored the equations by eye and spotted the error, integrated the equations correctly. However, they were unintentionally penalized in the process of fitting the model parameters because, even though their model was correct, the data purchased and the predictions were generated with the incorrect order of assignments. his error had different effects in different models. Plots for data offered for purchase given by DREAM and the corrected integration scheme can be seen in www.fdmold.uni-freiburg.de/~ckreutz/ComparisonWithoutNoise.zip. Some analysis show that in model 1 and model 2 the error introduced is not considerable, but this is not the case for model 3.The exact effect of this error in the fitting depends on the order the teams purchased the data, and is hard to disentangle.

Some Analysis

We studied the correlation of between the final scores and the average square difference between data purchased and data correctly simulated with the right parameters for the different teams. We saw no correlation, which in principle indicates that the variance due to different fitting ability is comparable to the effect of the errors in integration. We also requested participants to tell us what integration method was used. In principle, people using Matlab (as given in the challenge), should have had an advantage over people using the corrected integration scheme. However, this did not result in an enrichment of MATLAB users amongst the best performing teams.

Are the Scores right or wrong?

The strict ordering of the scores is incorrect. However, there are a number of conclusions that can be extracted from the challenge as is:

1) The team Orangeballs was the best performer. They independently discovered that the provided Matlab code updated repression coefficients one time step later than activation coefficients, and decided to used this "wrong" dynamics in their fitting given that it seemed to yield consistent results. Therefore their best performance can be considered as justified. They were able to fit the the parameters correctly using the dynamics used by the organizers.

2) The second best performing team, team crux, is the team that discovered the error. Despite the fact that they "corrected" the order of integration, they obtained a second place. But if they had used the same dynamics as the one used to produced the data for purchase, they could have obtained the best performance. Therefore, we cannot say that they are in second place. We will call Orangeballs and crux best performers, without distinction of 1st place or 2nd place.

3) The difference in dynamics using different order of assignments in the odes in model 1 and 2 are smaller than those in model 3. So, in a way, a re-ranking of performance according to models 1 and 2 may seem reasonable. However, the resulting order could have changed if the teams had purchased their data in a different order.

4) Another alternative is to separate the teams as those whose model is the same as the one used to generate the dynamics given for purchase (such as team orangeballs), and those who were trying to fit that data with a slightly wrong model (as is the case in real biological applications). We could analyze each set separately, as they represent almost 2 different challenges. We could use the comparison between both lists as an example of what happens when our starting model assumptions are wrong (in this case there was a delay in system). We can get reasonably good fits and parameters even when your assumptions are not fully correct. Clearly in this reframing of the challenge there would still be two best performers: Orangeballs and crux, which also justifies our assignment of these two teams as best performers.

A few lessons

There were many lessons to learn from this challenge despite the mistakes.

1) We should have given the exact code for integration we used, including the error models.

2) SBML specification should include the order of the differential equations, as this problem can make for serious divergence of results intended to be the same.

3) We should have specified the time steps, and integrator used for this challenge.

We hope we will do this challenge again next year, and next time we hope to fix all these wrinkles. Raise your hand if you would participate again?

Thanks and Apologies

 We want to thank all the effort that was put by participants in this challenge. We are truly and deeply sorry that the clear success of this challenge was somehow deterred by an error in coding.

The DREAM organizers

Comments

Oakley Sunglasses

Lapses in concentration really hurt Grantsville as Oakley Sunglasses was able to sneak in goals during those moments. Cheap Oakley Sunglasses wasn’t fazed by Grantsville’s Sam Williamson opening the game with the goal in the fourth minute.

Louis Vuitton UK, the manufacturer of expensive handbags carried by thousands of people who usually cannot afford to do so, Louis Vuitton Sale is asking a Florida court to put the hurt on hundreds of Louis Vuitton Outlet websites that sell knock-off Vitton products!

Ralph Lauren Outle

Electronic Coach Outlet stores, launched in December as part of the promotion of the company's 70th anniversary celebration, Coach Factory Outlet is to try the online luxury shopping, selling limited-edition bags and accessories.

Designers are going for the Olympic gold! Find out what Ralph Lauren Outlet, Giorgio Armani and Stella McCartney have planned for the summer games using Cheap Polo Ralph Lauren!
More and more people love the Cheap Polo Shirts, the Polo Outlet Online provide the clothing for the Olympic Games!

download perturbations and scoring script

You can download at the end of the page the perturbations without noise and the scoring script

problematic parameters

It is amazing that crux used the 'correct' model and still got such good results on the 'wrong' data! We had so much trouble with protein 8 and 9 in model 3: we actually purchased all gel shift experiments for the parameters in the synthesis rate of gene 8, but using the purchased values didn't give the required dynamics.

I will not disagree with the final conclusions, but really all that we can really do now is look forward to next year's challenge.

I agree with point 4 that in most realistic situations the form of the equations are unknown. But in that case I would prefer to use other types of equations than those given in the SBML.

In fact, this could be a sub-challenge in DREAM7 Parameter estimation: obtaining a dynamic model given only the network structure and time course data, then score models based on predictive power.

Another suggestion is to start simulations from a non-zero initial state, corresponding to some physiological steady state in which then pertubations are performed.

In Model-3 I had the same problem.

I appreciate the effort of crux and orange balls, that they detected this problem and were able to get good parameters value.

Even I had the same problem of fitting simulations to mRNA/protien 8 and 9 in model 3.

Yes, we spent quite some time

  • Yes, we spent quite some time puzzling a handful of numbers (p5, p6, p7 at times 1, 1.5, 2) during our initial analysis of Model 3.  The cascade of jumps in concentration seemed to be happening too slowly.  Eventually we realized the integration might not be being carried out accurately.  I think the main issue was less a matter of the explicit time lag introduced by order of update than of the time step for numerical integration being too big, however.
  • The biggest lesson to be learned is point 1) from Gustavo's post: "We should have given the exact code for integration we used, including the error models."  I have a fair amount of experience competing in challenges of a similar reverse-engineering flavor on the TopCoder "Marathon Match" platform and their procedure is to post all simulation and scoring code at the beginning of each match.  Often, the participants then discover bugs (sometimes minor disagreements with the problem specification and occasionally more serious issues) which can be resolved immediately.  In general that experience has shown that it's very hard to run a contest with completely bug-free code, so having a few dozen more people look at it really helps.
  • Thanks again to the organizers for their efforts!  The overall setup of this challenge was quite unique and we really enjoyed participating.-- Po-Ru

Actually, re-reading this

Actually, re-reading this discussion, I think I should highlight more clearly that the main source of inaccuracy was the time step of integration: based on our fitting I believe it was something on the order of 0.1 time units.  The Matlab vs. SBML order of equations issue seems to be the main topic of discussion so far, but in fact the order of equations is inconsequential if a suitable time step is used.  I would hazard a guess that this is why there was no correlation between performance of teams and usage of Matlab vs. SBML.

Initially we were using dt=0.001 and didn't think twice about the order of equations.  (I had noticed that the rs variable updates were defined a few lines too late and actually caused my version of the code to break, so I moved them to the proper place.)  At some point this became a bit slow so we considered dt=0.01.  We found that there was marginal loss of accuracy and were a little surprised that the model fit was very slightly better rather than worse.  Eventually we decided to try dt=0.05 and 0.1, which we knew would cause significant inaccuracy, but turned out to produce much better fits.  It was at that point that we reconsidered the order of equations as well.

Po-Ru