GENEALOGY-DNA-L Archives



From:
Subject: SMGF [was Re: [DNA] FTDNA Genetic Distance Calculations...]
Date: Tue, 2 Aug 2005 11:28:15 EDT


In a message dated 08/01/05 5:20:31 PM Pacific Daylight Time,
writes:

> Can you elaborate on the changes they are making in their scoring
> methodology? Are they going to be consistent with that used by Sorenson Genomics and
> SMGF? If not, how will they differ?

It looks like the database search at SMGF uses the infinite allele model.
That is, they screen for matches based on the number of markers that match,
without factoring in whether each mismatch differs by one step or by many. This
results in a more liberal listing of records in some cases. If there is a small
number of mismatches across a large number of markers (implying a relatively
recent MRCA), the step-wise model and the infinite allele model probably end up
giving very similar results for the time to the MRCA.
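
To make the difference between the two counting rules concrete, here is a
minimal Python sketch. The function names and the six-marker repeat counts are
made up for illustration; this is not the actual scoring code used by SMGF,
FTDNA, or anyone else.

```python
def infinite_allele_distance(a, b):
    """Count the markers that differ, ignoring the size of each difference."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same markers")
    return sum(1 for x, y in zip(a, b) if x != y)


def stepwise_distance(a, b):
    """Sum the absolute repeat-count differences across all markers."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same markers")
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical repeat counts for the same six markers in two people.
person_1 = [13, 24, 14, 11, 11, 14]
person_2 = [13, 25, 14, 11, 11, 12]

# Two markers differ, one by a single step and one by two steps.
print(infinite_allele_distance(person_1, person_2))  # 2
print(stepwise_distance(person_1, person_2))         # 3
```

When every mismatch is a single step, the two functions return the same
number, which is why the models tend to agree for close matches.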

I've grown accustomed to thinking of "genetic distance" as a description of
how many mutations have occurred, and personally, I'd reserve that phrase for
calculations using the step-wise model. I think that's closer to reality than
the infinite allele model, even though the step-wise model can also
underestimate the true number of mutations (for example, when a marker mutates
and later mutates back).

Grouping individuals within a surname project is still another question. It
could be done mechanically, by either the infinite allele or step-wise model,
with an arbitrary cut-off. Again, I think the step-wise model is closer to
reality, but there are hard cases, such as the example Georgia gave for the George
in the English project. But project coordinators don't have to limit
themselves to pairwise comparisons of matches or genetic distance -- they can use
additional DNA results from other participants, and even more importantly,
whatever genealogical data they have compiled.
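
As a rough illustration of the mechanical approach, the sketch below groups
participants by step-wise distance with an arbitrary cut-off. The function
names, the participant labels, the marker values, and the choice of
single-linkage grouping (a chain of close pairs joins a group) are all my
assumptions for the example, not a description of how any project or
laboratory actually does it.

```python
def stepwise_distance(a, b):
    """Sum the absolute repeat-count differences across all markers."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same markers")
    return sum(abs(x - y) for x, y in zip(a, b))


def group_by_cutoff(haplotypes, cutoff, distance=stepwise_distance):
    """Single-linkage grouping: link every pair whose distance is <= cutoff,
    then report the connected sets of participants."""
    ids = list(haplotypes)
    parent = {i: i for i in ids}

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for n, i in enumerate(ids):
        for j in ids[n + 1:]:
            if distance(haplotypes[i], haplotypes[j]) <= cutoff:
                parent[find(i)] = find(j)

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical four-marker results for four participants.
project = {
    "A": [13, 24, 14, 11],
    "B": [13, 25, 14, 11],
    "C": [13, 26, 14, 11],
    "D": [15, 22, 16, 12],
}

print(group_by_cutoff(project, cutoff=1))  # [['A', 'B', 'C'], ['D']]
```

Note that A and C are two steps apart, so a strict pairwise comparison at a
cut-off of 1 would separate them; it is the intermediate result from B that
ties them together. Any such cut-off remains arbitrary, and the output is only
a starting point to weigh against the genealogical data.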

Ann Turner

