Hi to all! Look at this screenshot:

Here is 1 offer with a split test of 2 landers over a 2-day interval. Can we decide to stop the second lander? "Yes!", most of you will say. But I'm not so sure...

And here is the 5-day interval.
And in the end, it was 1 lander copied into different folders on 1 domain.
It seems incredible, but it's true. Keep this in mind when you optimize after 2, 3, 5, 10, or 30 conversions.
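
For anyone who wants to see the same effect without spending traffic, here is a minimal simulation sketch of two identical landers. Everything in it is an assumption for illustration, not data from the campaign above: the 0.5% conversion rate, the visitor counts, and the function names are all made up. It counts how often one copy ends up with at least twice the conversions of the other purely by chance.

```python
import random

def simulate_aa_split(visitors_per_lander, conversion_rate, rng):
    """Return conversion counts for two copies of the same lander (same true rate)."""
    a = sum(rng.random() < conversion_rate for _ in range(visitors_per_lander))
    b = sum(rng.random() < conversion_rate for _ in range(visitors_per_lander))
    return a, b

def share_of_lopsided_runs(runs, visitors_per_lander, conversion_rate,
                           ratio=2.0, seed=1):
    """Fraction of simulated runs where one copy gets >= `ratio` times
    the conversions of the other, even though both copies are identical."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    lopsided = 0
    for _ in range(runs):
        a, b = simulate_aa_split(visitors_per_lander, conversion_rate, rng)
        high, low = max(a, b), min(a, b)
        if high > 0 and high >= ratio * low:
            lopsided += 1
    return lopsided / runs

if __name__ == "__main__":
    # Hypothetical numbers: about 2 vs. about 20 expected conversions per copy.
    for visitors in (400, 4000):
        share = share_of_lopsided_runs(300, visitors, 0.005)
        print(f"{visitors} visitors per lander: {share:.0%} of runs look lopsided")
```

With only a handful of expected conversions per copy, a 2x gap between identical landers should come up often; with more conversions it should become much rarer.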
Interesting 
By the way, was there any slight difference between these two landers? Also, how much traffic did these landers receive from the top-converting placements?
About traffic: I used only 2 placements and 2 banners. All traffic was split fifty-fifty.
What about the frequency cap? It could very well be that your ads are being served differently. In that regard, I love
Very very interesting - assuming all external test conditions ARE indeed the same - which they certainly seem to be. Thanks so much for doing this test and presenting your findings!
Although I've never actually run the same lander as 2 split-test candidates (that's ingenious BTW!), I DID occasionally "over-run" split-tests. For example, I'd have a minute to check my stats and notice a really big difference in performance between test candidates, but wouldn't have the time to cut the inferior performer and log the cut into my campaign journal. Then, when I actually had time to perform the cut later on, I would find that the gap in performance was drastically reduced, as if by magic.
And at other times, a lander "wins" over another lander by a mile, but when I run the same landers on another traffic source, I see the opposite trend, i.e. the "winning" lander ends up being the loser. (Although, you could argue that the nature of the traffic on the other traffic source must have been different - which is entirely possible, in spite of the fact that pop traffic tends to be "broad" or general traffic for the most part.)
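
One way to picture the "gap shrinks later" effect is a rough simulation of checking stats several times during a test of two identical candidates. This is only a sketch under made-up assumptions (a 2% conversion rate, 10 checks, a pooled two-proportion z-test, and hypothetical function names). It compares how often the gap looks significant at ANY check with how often it still looks significant at the final check.

```python
import math
import random

def p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no conversions yet (or all converted): nothing to compare
    z = (conv_a / n_a - conv_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))

def false_alarm_rates(runs, peeks, visitors_per_peek, rate, alpha=0.05, seed=7):
    """Simulate identical candidates and return
    (share of runs significant at any peek, share significant at the last peek)."""
    rng = random.Random(seed)
    alarm_at_any_peek = 0
    alarm_at_end = 0
    for _ in range(runs):
        conv_a = conv_b = n = 0
        seen_alarm = False
        significant = False
        for _peek in range(peeks):
            conv_a += sum(rng.random() < rate for _ in range(visitors_per_peek))
            conv_b += sum(rng.random() < rate for _ in range(visitors_per_peek))
            n += visitors_per_peek
            significant = p_value(conv_a, n, conv_b, n) < alpha
            seen_alarm = seen_alarm or significant
        alarm_at_any_peek += seen_alarm
        alarm_at_end += significant
    return alarm_at_any_peek / runs, alarm_at_end / runs

if __name__ == "__main__":
    any_peek, at_end = false_alarm_rates(300, 10, 500, 0.02)
    print(f"significant at some check: {any_peek:.0%}, at the final check: {at_end:.0%}")
```

Because the final check is itself one of the peeks, the "any check" share can never be lower than the "final check" share; the more often the stats are checked, the more chances noise has to look like a winner.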
So yes - even when the stats calculator clearly tells us that test candidates have reached statistical significance with a high percentage probability (90-100%), it doesn't mean the results are 100% irrefutable and consistent. Run the same split-test again and we may get completely different results.
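
To make "statistical significance" concrete, here is a minimal sketch of one common way to compare two conversion rates, a pooled two-proportion z-test. I don't know which method any particular stats calculator uses, so treat this as an assumption; the function name and the sample numbers are made up for illustration.

```python
import math

def two_proportion_z_test(conv_a, visits_a, conv_b, visits_b):
    """Two-sided z-test for the difference between two conversion rates.

    Returns (z, p_value). Uses the normal approximation, which gets
    unreliable when conversion counts are very small.
    """
    p_a = conv_a / visits_a
    p_b = conv_b / visits_b
    pooled = (conv_a + conv_b) / (visits_a + visits_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all, so no evidence of a difference
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

if __name__ == "__main__":
    # Hypothetical counts: 3 vs. 1 conversions, then 30 vs. 10, on 1000 visits each.
    for conv_a, conv_b in ((3, 1), (30, 10)):
        z, p = two_proportion_z_test(conv_a, 1000, conv_b, 1000)
        print(f"{conv_a} vs {conv_b} conversions: z = {z:.2f}, p = {p:.4f}")
```

The same 3-to-1 ratio gives a very different p-value at 3 vs. 1 conversions than at 30 vs. 10, and even a small p-value is a probability statement, not a guarantee that a rerun will agree.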
The question though isn't whether our stats methods are perfect or foolproof. A more practical question would be whether stats will help us make the right choice more often than if we were NOT to use stats - and based on my personal knowledge and experience, the answer is a resounding and definite YES!
Nonetheless - findings like these are interesting to observe. And looking at this from another angle: if even running tests to statistical significance doesn't 100% guarantee that we'd pick the "true" winner, how much worse would it be if we DIDN'T run to statistical significance - and can we afford not to?
Again liocamdiosong - thanks so much for running this test and taking the time to post results!
Amy
I'm sorry, but this test was pretty much useless. The only thing it proved is that there are day-to-day fluctuations in conversions, which is a known fact.
To give it some value, you need to run it against a DIFFERENT lander, to see whether one of them shows bigger fluctuations than the other and whether one wins even despite those fluctuations.
These daily fluctuations are the very reason why I always emphasize the need to run more than 1 instance of everything (banner, lp, offer lander) whenever possible. This way, you partially protect yourself against losses: if one element tanks on one day, the rest can make up for it, as it's rare to see all variables go south at once.