GENEALOGY-DNA-L Archives
From: James Heald <>
Subject: Re: [DNA] Central Limit Theorem in Action
Date: Sat, 08 Mar 2008 22:34:42 +0000
References: <013801c87d66$d40ce450$6400a8c0@Ken1><REME20080303195544@alum.mit.edu> <47CDA27D.4080506@ucl.ac.uk><00fc01c87e31$8a7d64b0$6400a8c0@Ken1> <47CE9EA6.9060204@ucl.ac.uk><004901c87f17$1527e2d0$6400a8c0@Ken1>
In-Reply-To: <004901c87f17$1527e2d0$6400a8c0@Ken1>
I think BATWING might be quite interesting for comparison all the same.
It's a fully Bayesian program, so, like you, it underpins its
calculations by generating an ensemble of representative samples of
possible branching and mutation histories.
SNP structure can be explicitly specified, so histories can then be
constrained to match a known SNP framework.
You asked elsewhere about priors. BATWING uses a coalescent prior,
which means that a priori the probability that two lineages have not yet
coalesced decreases geometrically going back in time. It also allows
various total population growth scenarios to be tweaked.
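
To make "decreases geometrically" concrete, here is a minimal sketch of
the textbook constant-size case, not BATWING's own code. It assumes a
haploid population of fixed effective size n_eff (a hypothetical value
below), in which two lineages coalesce in any given generation with
probability 1/n_eff.

```python
def prob_not_coalesced(t, n_eff):
    """Probability that two lineages are still distinct t generations back.

    Assumes a constant haploid effective population size n_eff, so each
    generation the pair fails to coalesce with probability 1 - 1/n_eff.
    """
    return (1.0 - 1.0 / n_eff) ** t


if __name__ == "__main__":
    n_eff = 1000  # hypothetical effective population size
    for t in (0, 500, 1000, 2000):
        print(t, round(prob_not_coalesced(t, n_eff), 4))
```

The ratio between successive generations is constant, which is what
makes the decline geometric. A growth scenario would replace the fixed
n_eff with a size that changes over time.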
But I think the STR mutations will often fix a date much more narrowly
than that, so the final probability is often fairly independent of the
detail chosen for the prior probability, provided the prior changes only
slowly over the range of final interest.
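
A toy illustration of that point, under loudly stated assumptions: the
likelihood is a simple Poisson model for a mutation count k (not
BATWING's likelihood, and ignoring back mutation), and the marker
count, mutation rate and population sizes are all hypothetical. The
sketch compares the posterior mean TMRCA under a flat prior, a slowly
changing geometric prior and a rapidly changing one.

```python
import math


def posterior_mean_tmrca(prior, k, rate, t_max=2000):
    """Posterior mean of t (generations) on the integer grid 1..t_max.

    prior: function t -> unnormalised prior weight
    k:     observed mutation count (toy data)
    rate:  expected mutations per generation, over all markers and both lineages
    """
    total = 0.0
    weighted = 0.0
    for t in range(1, t_max + 1):
        lam = rate * t
        # Poisson log-likelihood of seeing k mutations when lam are expected
        log_like = k * math.log(lam) - lam - math.lgamma(k + 1)
        w = prior(t) * math.exp(log_like)
        total += w
        weighted += t * w
    return weighted / total


def flat_prior(t):
    return 1.0


def geometric_prior(n_eff):
    """Constant-size coalescent-style prior: weight (1 - 1/n_eff)**t."""
    return lambda t: (1.0 - 1.0 / n_eff) ** t


if __name__ == "__main__":
    # Hypothetical data: 10 mutations, 37 markers, 0.002 per marker per generation
    k, rate = 10, 2 * 37 * 0.002
    for label, prior in (("flat", flat_prior),
                         ("geometric, n_eff=5000", geometric_prior(5000)),
                         ("geometric, n_eff=50", geometric_prior(50))):
        print(label, round(posterior_mean_tmrca(prior, k, rate), 1))
```

With these made-up numbers the flat prior and the slowly decaying prior
give nearly the same answer, while the prior that changes quickly over
the range picked out by the data pulls the estimate noticeably younger.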
I do think it would be quite an interesting thing to see how the
calculations compare.
-- James.
Ken Nordtvedt wrote:
> ----- Original Message -----
> From: "James Heald" <>
>
>>Have you compared your group TMRCA estimates with BATWING? BATWING
>>goes about things in a strictly Bayesian way (though I still need to
>>think some more about the Coalescent prior it assumes). Do you know
>>whether your estimates match up?
>
>
> Look at my warpedfounderstree at http://knordtvedt.home.bresnan.net. The
> methodology there has features like coalescence analysis, but there are some
> differences, including the impositions of all known SNP tree constraints on
> the STR-driven tree construction. This last point is important for keeping
> the early part of the resulting tree more on the straight and narrow. In
> haplogroup I the founder haplotypes for the 15 or so well identified clades
> are obtained. We are saying as our starting point that essentially every
> haplogroup I haplotype seen to date will coalesce rather quickly (several
> thousand years) back to one of 15 or so clade founders. But we don't know
> the times back to those founders yet. Now the chore is to do a coalescence
> analysis of 15 or so very well separated haplotypes of these founders back
> to their MRCA --- someone who we might call the haplogroup I MRCA. This
> construction is done by searching for the tree which best fits the
> 15 x 14 / 2 = 105 pairwise GDs between the founding haplotypes, and is
> parsimonious in the sense that it consumes the least number of mutations.
> The resulting tree must be consistent with all known SNP status for the
> various clade founders, hence rejecting some alternative trees of comparable
> parsimony.
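
The input to that search, the table of pairwise GDs, can be sketched as
follows. This assumes GD means the sum of absolute differences in repeat
counts across markers, which the message does not spell out, and the
founder haplotypes are made-up placeholders. The tree search and the SNP
constraints are not attempted here.

```python
from itertools import combinations


def genetic_distance(h1, h2):
    """Sum of absolute repeat-count differences over markers (assumed GD)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def pairwise_gds(haplotypes):
    """Map each unordered pair of founder names to the GD between them."""
    return {
        (a, b): genetic_distance(haplotypes[a], haplotypes[b])
        for a, b in combinations(sorted(haplotypes), 2)
    }


if __name__ == "__main__":
    # 15 hypothetical founder haplotypes, 6 markers each (placeholder values)
    founders = {
        f"F{i:02d}": tuple(10 + (i * j) % 5 for j in range(1, 7))
        for i in range(1, 16)
    }
    gds = pairwise_gds(founders)
    print(len(gds))  # number of founder pairs
```

For n founders there are n(n-1)/2 unordered pairs, so the number of
distances a candidate tree has to fit grows quadratically with the
number of clades.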
>
> In the end we obtain the following estimates: compared to some arbitrary
> time, for example the time for the MRCA for all of haplogroup I, we obtain
> times for each of the founding haplotypes we began with, we obtain
> inferences of the haplotypes at each branch time in the tree, and we have
> times for occurrence of each branch point in the tree.
>
> We don't know where this tree sits in time relative to the present. But if
> everything were consistent, if we could do age estimates for each of those
> founders for the clade populations we see today, they would compare in size
> with each other in a manner consistent with where the times of these
> founders sit relative to each other in the warpedfounderstree. Then one
> could use any of those founder times and add the age of that founder's
> descendant population to position that whole tree relative to the present.
> This tree has predictive power concerning future discoveries of SNPs which
> separate or unify various clades. The methodology is testable in that the
> resulting tree can make predictions that could be contradicted by new future
> evidence.
>
> Why is the tree called the "warped" founderstree? The GDs between the
> greatly separated founder haplotypes are so great that "back mutation"
> corrections to the relationship between GD and elapsed generations had to be
> made.
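
Why large GDs need such a correction can be seen in a toy model. The
sketch below is not the correction used for the warpedfounderstree,
which the message does not specify. It assumes a symmetric single-step
model in which each mutation at a marker moves the repeat count up or
down by one with equal probability, and computes the expected observed
difference at that marker after n mutations.

```python
from math import comb


def expected_abs_difference(n):
    """Expected |net change| at one marker after n symmetric +/-1 mutations.

    With h up-steps out of n, the net change is 2h - n, and h follows a
    Binomial(n, 1/2) distribution.
    """
    return sum(comb(n, h) * abs(2 * h - n) for h in range(n + 1)) / 2 ** n


if __name__ == "__main__":
    for n in (1, 2, 4, 16, 64):
        print(n, round(expected_abs_difference(n), 3))
```

Under this assumption the observed difference grows roughly like the
square root of the number of mutations rather than linearly, so reading
a large GD as a direct count of mutations would understate the elapsed
generations.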
>
> This methodology has no particular merit for trying to construct a tree by
> coalescence for really bushy trees which result from very rapidly growing
> populations expanding from founders such as has occurred in the
> post-agricultural eras around the world.
>
> Ken