Network Topology and Parameter Inference Challenge
Scoring the Network Topology and Parameter Inference Challenge
Please go to the bottom of this Challenge description page to download the solutions. The gold standards for the model 2 parameters will soon be available.
Important Note
The data of the challenges cannot be used for publication without the explicit permission of the data producers and the DREAM organization. Feel free to contact us about this.
Calculation of p-value for parameter predictions in model 1
Let’s denote with v_{i}^{pred} and v_{i}^{real} the predicted and actual parameters that determine model 1, where i runs from 1 to N_{p}=45. We estimate the discrepancy between predicted and real parameters as:
This measure can be interpreted as an average (logarithmic) distance between predicted and actual parameter values. The logarithm is needed because the parameters span a large dynamic range. A null model was created from the distance between estimated and known parameters, based on the predictions of all the participants; we call this a relative null model. If there are M participants, we generate the predictions in the null model as follows. We choose at random one of the M predictions for v_{1}^{pred}, then at random one of the M predictions for v_{2}^{pred}, …, and finally one of the M predictions for v_{N_{p}}^{pred}. The resulting value of D_{j}^{param} represents one possible random choice of predictions amongst all the participants. By repeating this process a large number of times, we generated a distribution of D_{j}^{param} between known and estimated parameters, from which a p-value can be estimated for the actual predictions. We denote this p-value by p^{param}. This p-value was used to score model 1 and is tabulated for each team in the table below.
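The resampling procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the organizers' code: the distance is assumed to be the mean squared log10 ratio of predicted to real values (the formula itself is not reproduced in this text), the p-value is assumed to be the fraction of surrogate distances at least as small as a team's own distance, and the function names, sample count, and seed are hypothetical.

```python
import numpy as np

def param_distance(v_pred, v_real):
    # ASSUMPTION: mean squared log10 ratio; the source formula is not shown here.
    ratio = np.asarray(v_pred, dtype=float) / np.asarray(v_real, dtype=float)
    return float(np.mean(np.log10(ratio) ** 2))

def relative_null_pvalue(predictions, v_real, team, n_samples=10000, seed=0):
    """predictions: array of shape (M, N_p), one row per participant."""
    rng = np.random.default_rng(seed)
    predictions = np.asarray(predictions, dtype=float)
    v_real = np.asarray(v_real, dtype=float)
    m, n_p = predictions.shape
    # For every surrogate and every parameter, pick one of the M teams at random.
    picks = rng.integers(0, m, size=(n_samples, n_p))
    surrogates = predictions[picks, np.arange(n_p)]  # shape (n_samples, N_p)
    null = np.mean(np.log10(surrogates / v_real) ** 2, axis=1)
    observed = param_distance(predictions[team], v_real)
    # ASSUMPTION: one-sided empirical p-value (smaller distance is better).
    return observed, float(np.mean(null <= observed))
```

Calling `relative_null_pvalue(predictions, v_real, team=0)` returns the distance of the first team together with its estimated p-value relative to the pooled predictions.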
Calculation of p-value for time-course predictions in model 1
Let’s denote as p_{k}^{pred}(t_{i}) and p_{k}^{sim}(t_{i}) the predicted and simulated levels of protein k at times t_{i}. Because the initial conditions were given, the real challenging predictions take place after some time has elapsed from t=0. We took that time to be 5. Therefore the squared distance between predicted and measured protein abundances for model 1 was taken to be:
Note that the squared difference terms are normalized with the model of noise that was implemented in the data provided. A null model was created from this distance, based on the predictions of all the participants. If there are M participants, we chose at random one of the M predictions for p_{k}^{pred}(t_{11}), then at random one of the M predictions for p_{k}^{pred}(t_{12}), …, and finally one of the M predictions for p_{k}^{pred}(t_{40}). The resulting value of D_{j}^{prot} represents one possible random choice of predictions amongst all the participants. By repeating this process a large number of times, we generated a distribution of squared distances, from which a p-value can be estimated for the actual predictions. That p-value, denoted p^{prot}, is tabulated for each team in the table for model 1 below.
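A minimal sketch of the time-course null model follows, again under labeled assumptions. The distance is taken to be the mean of squared differences divided by a noise variance `sigma2`, which stands in for the noise model mentioned above (its actual form is not given in this text). The surrogate is drawn independently for each protein and each scored time point, which is one reading of the description. All names are hypothetical.

```python
import numpy as np

def prot_distance(p_pred, p_sim, sigma2):
    # ASSUMPTION: mean noise-normalized squared difference; sigma2 is a
    # placeholder for the variance given by the challenge's noise model.
    return float(np.mean((p_pred - p_sim) ** 2 / sigma2))

def timecourse_null(predictions, p_sim, sigma2, n_samples=5000, seed=0):
    """predictions: shape (M, K, T) for M teams, K proteins, T scored time points."""
    rng = np.random.default_rng(seed)
    m, k, t = predictions.shape
    # ASSUMPTION: an independent random team for each (protein, time point) pair.
    picks = rng.integers(0, m, size=(n_samples, k, t))
    surrogates = predictions[picks, np.arange(k)[:, None], np.arange(t)]
    # One null distance per surrogate time course.
    return np.mean((surrogates - p_sim) ** 2 / sigma2, axis=(1, 2))
```

A team's p-value can then be estimated as `np.mean(null <= prot_distance(predictions[team], p_sim, sigma2))`, where `null` is the array returned by `timecourse_null`.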
Calculation of p-values for network topology predictions in model 2
The challenge requests predictions for 3 missing links. A regulator gene can regulate either one gene, or two genes when these two genes are in the same operon. Participants had to indicate whether the interactions are activating (+) or repressing (-).
For each of the 3 predicted links i=1,2,3, we define a score:
S_{i}^{link} = L_{i} + N_{i}
where L_{i} = 6 if one connection has all its elements correctly predicted (that is, the source gene, the sign of the connection, and the destination gene are all correct) and L_{i} = 12 if the link regulates an operon composed of two genes and both connections are correct. Alternatively, L_{i} = 0 if some element of the connection is incorrect. If L_{i} > 0 then N_{i} = 0.
When a link is NOT correctly predicted (L_{i} = 0), N_{i} adds partial credit to the score depending on how good the prediction is. The score is increased by 1 for each correctly predicted regulated gene, by 2 if both the regulated gene and the nature of its regulation (i.e. +/-) are correct, and by 1 if the regulator gene is correct.
Hence ONLY for the links where L_{i} = 0, N_{i} rewards correctly predicted elements of the link as shown in the following (non-exhaustive) table, where i stands for an incorrect and c for a correct prediction. Note that correct (+/-) predictions without the correct gene give no points.
Regulator gene | (+/-) | Regulated gene | (+/-) | Regulated gene | Value of N_{i} |
i | i | i | i | i | 0 |
c | i | i | i | i | 1 |
i | c | i | i | i | 0 |
i | i | c | i | i | 1 |
i | i | i | c | i | 0 |
i | i | i | i | c | 1 |
i | i | i | c | c | 2 |
i | c | c | i | i | 2 |
c | i | c | i | i | 2 |
c | i | i | i | c | 2 |
c | i | c | i | c | 3 |
c | i | i | c | c | 3 |
c | c | c | i | i | 3 |
c | c | c | i | c | 4 |
c | i | c | c | c | 4 |
The final score for the three predictions is
S^{netw}= S_{1}^{link}+S_{2}^{link}+S_{3}^{link}
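The link scoring rules can be written as a small function. The sketch below is an illustration of the rules as stated, with a hypothetical input encoding: `regulator_ok` says whether the regulator gene is correct, and `targets` holds one `(gene_ok, sign_ok)` pair per regulated gene of the true link (one pair for a single gene, two for an operon).

```python
def link_score(regulator_ok, targets):
    """Return S_i = L_i + N_i for one predicted link.

    targets: list of (gene_ok, sign_ok) pairs, one per regulated gene.
    """
    # Fully correct link: L_i = 6 for one gene, 12 for a two-gene operon; N_i = 0.
    if regulator_ok and all(gene_ok and sign_ok for gene_ok, sign_ok in targets):
        return 6 * len(targets)
    # Otherwise L_i = 0 and N_i collects partial credit.
    n = 1 if regulator_ok else 0
    for gene_ok, sign_ok in targets:
        if gene_ok:
            # A correct sign only counts when its regulated gene is also correct.
            n += 2 if sign_ok else 1
    return n

def network_score(links):
    """Sum the scores of the predicted links; links is a list of (regulator_ok, targets)."""
    return sum(link_score(regulator_ok, targets) for regulator_ok, targets in links)
```

For instance, a prediction with the correct regulator, one fully correct target, and one incorrect target corresponds to the table row "c, c, c, i, i" and is scored via `link_score(True, [(True, True), (False, False)])`.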
A null model is calculated by generating a distribution of scores from a large number of surrogate gene networks obtained by randomly adding 3 links that follow the connection rules indicated in the challenge description. For each participant, a p-value associated with the score under the null hypothesis is calculated and shown in the table for model 2.
Overall score
For model 1, each team obtained a p-value for the time-course predictions and a p-value for the parameter predictions. The overall score is -log_{10} of the product of these two p-values.
For model 2, the overall score was defined as the -log_{10} of the p-value for the network topology prediction.
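The two overall scores amount to a one-line computation each. The following sketch uses hypothetical function names and takes the p-values as plain floats.

```python
import math

def overall_score_model1(p_param, p_prot):
    # -log10 of the product of the parameter and time-course p-values.
    return -math.log10(p_param * p_prot)

def overall_score_model2(p_topology):
    # -log10 of the network topology p-value.
    return -math.log10(p_topology)
```

Applied to the tabulated p-values, for example `overall_score_model1(3.25e-3, 1.21e-25)`, this agrees with the listed overall scores to within the rounding of the published p-values.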
Model 1. Relative p-values and overall scores
Team | Parameter distance D^{param} | p-value for parameter predictions | Protein distance D^{prot} | p-value for protein time-course predictions | Overall score |
orangeballs | 0.0229 | 3.25E-03 | 0.002438361 | 1.21E-25 | 27.40 |
Team #341 | 0.8404 | 1.00E+00 | 0.016023721 | 3.39E-18 | 17.47 |
Team #92 | 0.1592 | 6.00E-01 | 0.035404398 | 4.45E-15 | 14.57 |
Team #452 | 0.0899 | 1.88E-01 | 0.047495432 | 6.28E-14 | 13.93 |
Team #328 | 0.1683 | 6.45E-01 | 0.09791128 | 4.01E-11 | 10.59 |
Team #194 | 0.0453 | 1.37E-02 | 0.198785197 | 1.93E-08 | 9.58 |
Team #199 | 0.1702 | 6.45E-01 | 0.362463945 | 2.90E-06 | 5.73 |
Team #397 | 0.8128 | 1.00E+00 | 0.356429217 | 2.53E-06 | 5.60 |
Team #109 | 0.3766 | 9.99E-01 | 0.817972877 | 1.34E-03 | 2.87 |
Team #77 | 0.0699 | 9.83E-02 | 19.32326868 | 1.00E+00 | 1.01 |
Team #361 | 0.1883 | 7.29E-01 | 3.222767988 | 6.90E-01 | 0.30 |
Team #514 | 5.0278 | 1.00E+00 | 14.77443631 | 1.00E+00 | 0 |
Model 2. Relative p-values and overall scores
Team | Topology prediction score | p-value for topology | Overall score |
crux | 12 | 1.49E-02 | 1.83 |
Team #452 | 9 | 5.60E-02 | 1.25 |
Team #194 | 8 | 1.07E-01 | 0.97 |
Team #328 | 8 | 1.07E-01 | 0.97 |
Team #199 | 8 | 1.07E-01 | 0.97 |
Team #341 | 7 | 2.10E-01 | 0.68 |
Team #109 | 6 | 3.83E-01 | 0.42 |
Team #77 | 5 | 6.01E-01 | 0.22 |
Team #161 | 4 | 8.01E-01 | 0.10 |
Team #397 | 4 | 8.01E-01 | 0.10 |
Team #361 | 3 | 9.86E-01 | 0.01 |
Team #514 | 2 | 1.00E+00 | 0 |