GENEALOGY-DNA-L Archives
From: "Ken Nordtvedt" <>
Subject: [DNA] Central Limit Theorem in Action
Date: Mon, 3 Mar 2008 12:43:17 -0700
It is important to use many STR markers when determining the ASD/variance of a population. Here is an analytic example of the central limit theorem in action: it converts the asymmetric distribution of ASD for a single marker into a more and more symmetric, Gaussian-like distribution as markers are added. (My analytic single-marker distribution is not meant to be exactly the ASD distribution for a single marker, only qualitatively like it: the most likely value of the variable, given by the peak of the distribution, is substantially below the average of the variable over the distribution.)
I use the probability distribution P(v) = v exp(-kv) (times a normalization constant), with v >= 0 and "k" a constant, as an example of an asymmetric distribution for which the most likely value of the variable "v" is substantially less than its average value. Here the peak is at v = 1/k and the average is 2/k, so the most likely value is 50 percent of the average.
For two such markers each with that distribution, the probability distribution for the sum s = v(1)+v(2) is the convolution integral
P(2,s) = Integral from v = 0 to v = s of ( P(v) P(s-v) dv ) = s^3 exp(-ks) (times a normalization constant)
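The two-marker convolution can be checked numerically. The sketch below is only an illustration: the function names, the explicit normalization factors (k^2 for one marker, k^4/3! for two), and the choice of Simpson's rule are assumptions added here, not part of the original post.

```python
import math

def single_marker_pdf(v, k):
    """Normalized P(v) = k^2 v exp(-kv) for v >= 0, and zero otherwise."""
    return k * k * v * math.exp(-k * v) if v >= 0 else 0.0

def convolve_at(s, k, steps=200):
    """Simpson's-rule estimate of the integral of P(v) P(s-v) over 0 <= v <= s.

    `steps` must be even. The integrand is quadratic in v, so Simpson's
    rule is exact here up to floating-point rounding.
    """
    h = s / steps
    total = 0.0
    for i in range(steps + 1):
        v = i * h
        if i in (0, steps):
            weight = 1
        else:
            weight = 4 if i % 2 else 2
        total += weight * single_marker_pdf(v, k) * single_marker_pdf(s - v, k)
    return total * h / 3

def two_marker_pdf(s, k):
    """Closed form for two markers: k^4 s^3 exp(-ks) / 3!"""
    return k ** 4 * s ** 3 * math.exp(-k * s) / 6

# Compare the numerical convolution with the closed form at a few points.
for s in (0.5, 1.0, 3.0, 10.0):
    print(s, convolve_at(s, k=0.5), two_marker_pdf(s, k=0.5))
```

The two columns should agree at every s, which confirms the s^3 exp(-ks) shape of the two-marker distribution.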
For n markers, each with the asymmetric distribution we started with, the distribution for the sum s = v(1)+v(2)+...+v(n)
is then found to be more and more like a symmetric Gaussian over most of its peak region as n grows.
P(n,s) = s^(2n-1) exp(-ks) (times normalization constants)
The most likely s and the average s can then be determined analytically when n markers are used for the ASD:
s(ML) = (2n-1) / k and s(AVG) = 2n / k
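These two expressions can be checked with a short script. This is a rough sketch under stated assumptions: the helper names are hypothetical, the peak is located by a brute-force grid search on the logarithm of s^(2n-1) exp(-ks), and the average is estimated by summing n draws from Python's `random.gammavariate(2, 1/k)`, which samples the normalized form of v exp(-kv).

```python
import math
import random

def log_density(s, n, k):
    """Log of s^(2n-1) exp(-ks), with the normalization constant dropped."""
    return (2 * n - 1) * math.log(s) - k * s

def grid_mode(n, k, points=20000):
    """Find the peak by brute force on a grid from just above 0 to twice the mean."""
    upper = 4 * n / k
    grid = [upper * (i + 1) / points for i in range(points)]
    return max(grid, key=lambda s: log_density(s, n, k))

def simulated_mean(n, k, trials=20000, seed=1):
    """Monte Carlo average of the sum of n single-marker draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # gammavariate(alpha, beta) takes shape alpha and scale beta = 1/k
        total += sum(rng.gammavariate(2, 1 / k) for _ in range(n))
    return total / trials

n, k = 8, 1.0
print("grid peak     :", grid_mode(n, k), "formula:", (2 * n - 1) / k)
print("simulated mean:", simulated_mean(n, k), "formula:", 2 * n / k)
```

The grid peak should land on (2n-1)/k to within the grid spacing, and the simulated mean should sit close to 2n/k, with the usual Monte Carlo scatter.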
For n = 1 marker, s(ML) = (1/2) s(AVG)
For n = 8 markers, s(ML) = (15/16) s(AVG)
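To see how quickly the peak approaches the average, the ratio s(ML)/s(AVG) = (2n-1)/(2n) can be tabulated for several n. The sketch below also prints the skewness 2/sqrt(2n); that figure is not in the post itself but is the standard skewness of a gamma distribution with shape 2n, which is the form P(n,s) takes, and it is added here only as a second measure of asymmetry. The function names are illustrative.

```python
from fractions import Fraction
import math

def mode_to_mean_ratio(n):
    """s(ML) / s(AVG) = (2n-1)/(2n); the constant k cancels."""
    return Fraction(2 * n - 1, 2 * n)

def skewness(n):
    """Skewness of a gamma distribution with shape 2n (assumed measure of asymmetry)."""
    return 2 / math.sqrt(2 * n)

for n in (1, 2, 4, 8, 16, 32, 64):
    ratio = mode_to_mean_ratio(n)
    print(f"n={n:3d}  s(ML)/s(AVG) = {str(ratio):>8}  skewness = {skewness(n):.3f}")
```

The ratio climbs toward 1 and the skewness falls toward 0 as n grows, which is the central limit theorem at work on the sum.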
Use as many markers as you can in determining a population's ASD.
Ken