Challenge 1: Overlapping data

I have a question about experiments that overlap in what they measure. I know from an answer in a previous forum thread that I cannot request the exact same experiment twice. However, if I requested both a low-resolution microarray and a high-resolution microarray under the same perturbation, would the smaller dataset be a subset of the larger one, or would it contain entirely new measurements? Similarly, if I requested proteins 2 and 3 and then proteins 3 and 5 under the same perturbation, would the data for protein 3 be different? On a slightly different track, can I request an experiment identical to the start-up data that was already provided?

Comments

Hi

For microarrays, low resolution data are a sub-sample of high resolution data.

When you measure protein 3 in two different experiments, its values will differ by the noise, which is drawn independently for each experiment.

You can request any experiments you like, but you will pay for them, so I do not see why you would want to repeat the experiments that produced the start-up data.
 
thanks for the interest in the challenge
 
Pablo

Highly different values for same protein measurement

I just purchased two fluorescence experiments (same condition) that included measurements for one shared protein.  The measurements seem very different between the two datasets, however.  For example, at one particular time point one dataset has value 4.5 while the other has 11.  This seems improbable given the stated noise model (10% standard error).  Is it possible there is a mistake?

Hi, as you don't indicate precisely what you purchased, it is difficult for me to check whether some other difference might explain the discordance in protein levels (you say that you used the same condition, but did you really not modify the source of the measurements at all? No deletion, no overexpression, nothing?).

One possibility is the following:

The noise formula is vnoisy = max[0, v + 0.1*g1 + C*g2*v], where g1 and g2 are Gaussian random variables.

In your case the value is high (between 4.5 and 11), so the multiplicative term dominates the noise and vnoisy ~ v + C*g2*v. Let's say the REAL value is 8 and the Gaussian variables came out extreme for this specific case (3 and -3). Then, although the standard error is 10% (i.e. 0.1), you will get 8 + 3*0.8 = 10.4 and 8 - 3*0.8 = 5.6 as values, which is close to what you get.
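To make the formula concrete, here is a minimal Python sketch of the noise model as written above. The function name `noisy_measurement` and the optional `g1`/`g2` arguments are illustrative choices for this example, not part of the challenge software; passing fixed values for `g1` and `g2` reproduces the 3-sigma scenario described in the previous paragraph.

```python
import random


def noisy_measurement(v, C, g1=None, g2=None, rng=random):
    """Apply vnoisy = max[0, v + 0.1*g1 + C*g2*v] to a true value v.

    g1 and g2 are standard Gaussian draws; they can be passed explicitly
    (an assumption of this sketch) to reproduce a specific scenario.
    """
    if g1 is None:
        g1 = rng.gauss(0.0, 1.0)
    if g2 is None:
        g2 = rng.gauss(0.0, 1.0)
    return max(0.0, v + 0.1 * g1 + C * g2 * v)


# True value 8, C = 0.1, multiplicative draw forced to +3 and -3 sigma
# (additive draw set to 0 to isolate the dominant term).
high = noisy_measurement(8.0, 0.1, g1=0.0, g2=3.0)
low = noisy_measurement(8.0, 0.1, g1=0.0, g2=-3.0)
print(high, low)  # roughly 10.4 and 5.6
```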

Hope this helps

pablo

files to check

Thanks for the quick response.  The two files I purchased were p1_p3_mod_2_wildtype.tab and p1_p7_mod_2_wildtype.tab so if you could take a look that would be great.  I had thought about the explanation you suggested but that seems extremely unlikely -- you'd need close to 4 standard deviations on each side which should happen together something like 1 in 100 million times.
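A rough way to put a number on this, sketched in Python: treat the two readings as independent draws around the same true value and ask how likely a gap at least this large is. The function name `replicate_gap_probability` is made up for this example, the true value is assumed to be the midpoint of the two readings (it is not actually known), and the max[0, ...] clipping is ignored.

```python
import math


def replicate_gap_probability(x1, x2, C, additive_sd=0.1):
    """Two-sided probability of a gap >= |x1 - x2| between two replicates.

    Assumes (for illustration only) that the true value is the midpoint
    of the two readings and that clipping at zero can be ignored.
    """
    v = 0.5 * (x1 + x2)
    # Standard deviation of one reading: additive and multiplicative parts.
    sd_single = math.sqrt(additive_sd ** 2 + (C * v) ** 2)
    # The difference of two independent readings has sqrt(2) times that.
    sd_diff = math.sqrt(2.0) * sd_single
    z = abs(x1 - x2) / sd_diff
    # Two-sided Gaussian tail probability P(|Z| >= z).
    return math.erfc(z / math.sqrt(2.0))


p_c01 = replicate_gap_probability(4.5, 11.0, C=0.1)
p_c02 = replicate_gap_probability(4.5, 11.0, C=0.2)
print(p_c01, p_c02)
```

Under these assumptions the gap is many orders of magnitude less probable with C = 0.1 than with C = 0.2.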

As far as I can see from the graphs, the extreme difference you indicate happens only for one timepoint, the rest seem to be in range... Let me check further, but I think there is no problem with this data.

Pablo

Looking a little closer myself, the values seem to be highly consistent with a 20% standard error.  Maybe 20% was being used for protein data instead of 10%... could that be the case?

Hi, indeed, as indicated, 0.2 is used for the protein data noise. If you take the average difference for p1 in your data set, you do get something close to 0.2. It seems that this point is indeed a very improbable outlier...
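One possible reading of this check, sketched in Python: for two series of the same protein under the same condition, the relative difference at each time point is scaled by sqrt(2) and its root mean square is taken as an estimate of C. The function name `estimate_relative_noise` and the simulated series are assumptions made for this example; no actual challenge data is used, and the small additive 0.1 term is neglected by the estimator.

```python
import math
import random


def estimate_relative_noise(series_a, series_b):
    """Estimate C from two replicate series of the same quantity.

    Uses the RMS of (a - b) / (sqrt(2) * mean), which approximates C when
    the multiplicative term dominates the noise.
    """
    ratios = []
    for a, b in zip(series_a, series_b):
        mean = 0.5 * (a + b)
        if mean > 0:  # skip points where both readings were clipped to 0
            ratios.append((a - b) / (math.sqrt(2.0) * mean))
    return math.sqrt(sum(r * r for r in ratios) / len(ratios))


# Simulated replicates (hypothetical true values) using the stated noise
# formula vnoisy = max[0, v + 0.1*g1 + C*g2*v] with C = 0.2.
rng = random.Random(0)
true_values = [rng.uniform(2.0, 10.0) for _ in range(2000)]


def measure(v, C=0.2):
    return max(0.0, v + 0.1 * rng.gauss(0.0, 1.0) + C * rng.gauss(0.0, 1.0) * v)


rep_a = [measure(v) for v in true_values]
rep_b = [measure(v) for v in true_values]
c_hat = estimate_relative_noise(rep_a, rep_b)
print(c_hat)  # should land near 0.2 for this simulation
```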

Hope this solves the matter

pablo

Thanks!  Yes, that explains everything.  It's a bit unfortunate because it makes the protein data 4x less valuable than advertised.  I guess this can't really be changed though, given that people have already downloaded data.

One last question -- just to check, is C = 0.2 being used for both the microarray and protein data then?  I just want to make sure they didn't get swapped since the problem statement had said C = 0.1 for one of them and C = 0.2 for the other.

Hi

Indeed, C = 0.1 for RNA and C = 0.2 for protein data.

thanks

pablo

Hmm... now I'm confused, because the start-up mRNA data for Model 1 doesn't seem possible with C = 0.1.  The value of pp1_mrna is 4.058 at t=6 and 1.234 at t=14, but based on the ODEs, the value of pp1_mrna can only increase.  This discrepancy would be believable with C = 0.2, but with C = 0.1 the probability of these observations is vanishingly small.  Maybe it's C = 0.2 for both mRNA and protein after all?

Hi

I re-checked, and indeed C = 0.2 for both RNA and protein.

Sorry about the confusion.

pablo

Thanks!  That makes sense and agrees with what I've been seeing.  One last question (I think): is there a noise model for the gel shift data (direct determination of binding affinity and Hill coefficient)?

Hi and sorry for the delay.

There is no noise model for gel shift data; the Hill coefficient and Kd are reported as defined values, without noise.

thanks

pablo

Thanks!  No problem about the delay; actually I'd say you've been very responsive which really makes a difference in the challenge experience.

What about the goal function?

The goal function says that the weight of each prediction will be based on the measurement noise function for protein, 10% + 0.1. Will the goal function be altered now that the measurement noise function for protein is different, or will it stay the same?

No change in scoring function

 

Hi

Yes, sorry for not changing the scoring write-up; of course the weight function will now be 0.2.

The scoring function will not be changed from what was stated originally because the original scoring function may have been used in the strategy to purchase data. 

thanks

pablo and Gustavo