Author Topic: New website: auto-clustering your matches  (Read 7070 times)

Offline UK4753

  • RootsChat Member
  • ***
  • Posts: 137
Re: New website: auto-clustering your matches
« Reply #27 on: Saturday 08 December 18 19:54 GMT (UK) »

It’s free to register; you upload your data and it runs a cross-tabulation of all your matches (within defined parameters either of closeness of relationship, or a cM range). The clever bit is that it reorganises the massive chart into clusters where several matches all match each other.  This doesn’t PROVE they’re all on the family line, but it does help to narrow things down and focus your research.

In a way, these clusters seem similar to the Ancestry DNA Circles. To quote Ancestry "A DNA Circle is a group of individuals who all have the same ancestor in their family trees and where each member shares DNA with at least one other individual in the circle. These circles are created directly from your DNA and your family tree in a five-step process."

It may be worth a look.

 :)
Wiltshire: JONES, BANKS
Yorkshire: FEVERS, SCALES
Kent:  RUMLEY, NIGH
London:  HUGHES, NIGHTINGALE

Offline Gardenshed

  • RootsChat Extra
  • **
  • Posts: 55
  • Census information Crown Copyright, from www.nationalarchives.gov.uk
Re: New website: auto-clustering your matches
« Reply #28 on: Saturday 08 December 18 20:24 GMT (UK) »

I don't think it is accessing the raw data.

I think what it is doing is going through and finding ALL the shared matches (i.e. two people who share more than 20cM) for you and putting this in a table.

It still involves people handing over their logins so that a third-party website can access data (including names, of course) of people who have not consented to this.

Offline davidft

  • RootsChat Marquessate
  • *******
  • Posts: 4,209
Re: New website: auto-clustering your matches
« Reply #29 on: Saturday 08 December 18 20:31 GMT (UK) »


What is of more interest to me is how is this new company doing this. A major gripe people have with Ancestry is that it does not have a chromosome browser to interpret tree matches. Yet this new site must gain access to the raw data of results if they can tabulate matches the way they do. That being the case why can't we have access to the data the new website is obviously getting? Anyone fancy questioning Ancestry about this and whether a chromosome browser is on the horizon?

Quote from: Gardenshed
I don't think it is accessing the raw data.

I think what it is doing is going through and finding ALL the shared matches (i.e. two people who share more than 20cM) for you and putting this in a table.


I don't think that is the full picture though. The fuller picture is that they are matching specific matches on specific chromosomes to specific people i.e. going into greater detail that can only be achieved by accessing the underlying data to some extent (sorry if I have not explained that very well).
James Stott c1775-1850. James was born in Yorkshire but where? He was a stonemason and married Elizabeth Archer (nee Nicholson) in 1794 at Ripon. They lived thereafter in Masham. If anyone has any suggestions or leads as to his birthplace I would be interested to know. I have searched for it for years without success. Thank you.

Offline hurworth

  • RootsChat Aristocrat
  • ******
  • Posts: 1,336
Re: New website: auto-clustering your matches
« Reply #30 on: Saturday 08 December 18 20:58 GMT (UK) »

Quote from: davidft
I don't think that is the full picture though. The fuller picture is that they are matching specific matches on specific chromosomes to specific people i.e. going into greater detail that can only be achieved by accessing the underlying data to some extent (sorry if I have not explained that very well).

We don't know that though.

I think it just goes through all of your mutual matches at whichever site and takes note of who matches whom, which is the same information that any of us can access at Ancestry.  It doesn't need to access that at a chromosome level.



Offline davidft

  • RootsChat Marquessate
  • *******
  • Posts: 4,209
Re: New website: auto-clustering your matches
« Reply #31 on: Saturday 08 December 18 21:04 GMT (UK) »

Quote from: hurworth
We don't know that though.

I think it just goes through all of your mutual matches at whichever site and takes note of who matches whom, which is the same information that any of us can access at Ancestry. It doesn't need to access that at a chromosome level.


Well, if it is not accessing data at the chromosome level and making matches there, it is much less useful than matching at, say, GEDmatch, FTDNA or MyHeritage, all of which have chromosome browsers. If that is the case, why use Genetic Affairs at all?

Offline hurworth

  • RootsChat Aristocrat
  • ******
  • Posts: 1,336
Re: New website: auto-clustering your matches
« Reply #32 on: Saturday 08 December 18 22:35 GMT (UK) »

Quote from: davidft
Well if it is not accessing data at the chromosome level and making matches there it is much less useful than matching at say Gedmatch, ftdna or MyHeritage all of which have chromosome browsers and if that is the case then why use Genetic Affairs at all ?

I agree that it's a lot less useful than matching at a site with a chromosome browser. 

But most matches at Ancestry never upload anywhere else so this is a way of speeding up the process of finding which of your matches share 20cM or more with your other matches.  This info is already available if you want to go through hundreds of pages match by match.

Offline davidft

  • RootsChat Marquessate
  • *******
  • Posts: 4,209
Re: New website: auto-clustering your matches
« Reply #33 on: Saturday 08 December 18 22:49 GMT (UK) »

Quote from: hurworth
I agree that it's a lot less useful than matching at a site with a chromosome browser.

But most matches at Ancestry never upload anywhere else so this is a way of speeding up the process of finding which of your matches share 20cM or more with your other matches.  This info is already available if you want to go through hundreds of pages match by match.


I really am not trying to be difficult here, but if Genetic Affairs is just telling you that you share 20+ cM with Bob, Sue and Mary, without giving any indication of which chromosomes it's on, then it could be false matching: whilst you do share 20+ cM with each of those people, it could be a different 20+ cM in each case, and you would not know that unless you had the chromosome breakdown. (In other words, you could have matches with three different people through three different ancestors, not one ancestor shared with all three.) That is why I think there "must" be more to this matching and Genetic Affairs must somehow be getting access to underlying data.
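
To make the concern above concrete, here is a minimal sketch of what a chromosome breakdown would let you check. Everything in it is assumed for illustration: the segment positions are made up, the `segments` table and both function names are hypothetical, and, as noted elsewhere in this thread, Ancestry does not actually provide this kind of segment data.

```python
from itertools import combinations

# Hypothetical segments each match shares with you, as
# (chromosome, start_mb, end_mb). The numbers are invented.
segments = {
    "bob": [(3, 10.0, 35.0)],
    "sue": [(7, 50.0, 78.0)],
    "mary": [(3, 20.0, 44.0)],
}

def segments_overlap(a, b):
    """True if two segments are on the same chromosome and their ranges overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def overlapping_pairs(segs):
    """Return the pairs of matches whose segments with you overlap."""
    pairs = []
    for p, q in combinations(sorted(segs), 2):
        if any(segments_overlap(a, b) for a in segs[p] for b in segs[q]):
            pairs.append((p, q))
    return pairs

print(overlapping_pairs(segments))
```

With this made-up data only bob and mary overlap (both on chromosome 3); sue matches you on a different chromosome entirely, which is the situation a plain "shares 20+ cM" list cannot reveal.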


And just for the record, I do accept I could be wrong on all this, but it is not possible to tell from the Genetic Affairs website, as far as I can see.

Offline Genetic Affairs

  • RootsChat Extra
  • **
  • Posts: 2
Re: New website: auto-clustering your matches
« Reply #34 on: Sunday 09 December 18 08:41 GMT (UK) »
Hi there,

I found this thread and thought it might be a good idea to come over and try to explain some more. First, I am not accessing any hidden data source on Ancestry. I wish they would provide the chromosome data, but I still haven't found it :-). In addition, I am not sure if you know the Leeds method (https://www.danaleeds.com/), but my method is more or less Leeds on steroids. The added value is in the automated clustering of the matches, and the visualization also helps in the analysis. I could try to come up with reasons to use it, but luckily a similar question was raised on FB yesterday and the replies are excellent, so if anyone wants to read about how people employ the AutoCluster approach: https://www.facebook.com/groups/geneticgenealogytipsandtechniques/permalink/549573322173039/
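
As a rough illustration of what clustering from shared-match lists alone can look like, here is a minimal sketch. It is not the actual AutoCluster code: it swaps in plain connected-component grouping as a stand-in technique, the names and the `shared_matches` table are invented, and `build_graph` and `clusters` are hypothetical helpers. The only input is who appears on whose shared-match list, with no chromosome data.

```python
from collections import defaultdict

# Hypothetical input: for each of your matches, the other matches that
# appear on their shared-match list. All names are made up.
shared_matches = {
    "bob": {"sue", "mary"},
    "sue": {"bob", "mary"},
    "mary": {"bob", "sue"},
    "tom": {"ann"},
    "ann": {"tom"},
    "zoe": set(),
}

def build_graph(shared):
    """Turn the who-matches-whom table into a symmetric graph."""
    graph = defaultdict(set)
    for person, others in shared.items():
        graph[person]  # make sure people with no shared matches still appear
        for other in others:
            graph[person].add(other)
            graph[other].add(person)
    return graph

def clusters(shared):
    """Group matches into connected components of the shared-match graph."""
    graph = build_graph(shared)
    seen = set()
    result = []
    for start in sorted(graph):
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(sorted(component))
    return result

print(clusters(shared_matches))
```

On the sample table this puts bob, sue and mary in one group, tom and ann in another, and leaves zoe on her own. A real tool would presumably use something more refined, since connected components will merge two family lines as soon as a single person links them.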

Now for the privacy aspects. I understand that it might be difficult for quite a few people to hand over the login credentials for these sites. Personally, I would probably feel the same. I wish I could use the old API system of 23andMe. Using that approach, I could just ask a user for permission to see only their relatives. For Ancestry, it is possible to create a dummy account which is then given limited rights. I think it is impossible to download the raw results without performing some manual invocations on the websites concerned. The DNAGedcom tool, which has been around for quite some years, uses a similar approach for downloading matches, but of course they don't store the passwords on their server; everything is local. I address several aspects of security on my site, so feel free to look around.

Feel free to ask for more info. I don't have a lot of time, but I do think it's important to explain as much as possible.



Offline stevemiller

  • RootsChat Senior
  • ****
  • Posts: 287
  • James Aaron Grigg 1875-1916
Re: New website: auto-clustering your matches
« Reply #35 on: Sunday 09 December 18 15:36 GMT (UK) »
I try to keep groups of shared matches in a spreadsheet. Obviously, the clustering, visualisation, and time-saving offered by Genetic Affairs would be a bonus.

The drawback with Ancestry is that Shared Matches only shows people above 20 cM (“4th cousins” or closer). If you look at someone with, say, 10 cM, it only shows those with 20 cM or more. You cannot see shared matches between “Distant cousins”.
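
The cut-off described above can be sketched as a simple filter. This is only one reading of the rule as discussed in this thread, not Ancestry's documented behaviour: it assumes a person shows up on a shared-match list only if they share at least 20 cM with you and at least 20 cM with the match you are looking at. The names, cM figures, and the `visible_shared_matches` function are all hypothetical.

```python
THRESHOLD_CM = 20  # cut-off quoted in this thread

# Hypothetical amounts of DNA (in cM) each match shares with you.
cm_with_you = {"alice": 250, "ben": 45, "carol": 10, "dave": 15}

# Hypothetical amounts (in cM) the matches share with each other.
cm_between = {
    frozenset({"alice", "ben"}): 60,
    frozenset({"alice", "carol"}): 30,
    frozenset({"carol", "dave"}): 40,
}

def visible_shared_matches(person):
    """List who would appear as a shared match on `person`'s page (assumed rule)."""
    visible = []
    for other, cm in cm_with_you.items():
        if other == person:
            continue
        pair_cm = cm_between.get(frozenset({person, other}), 0)
        # Assumption: both amounts must reach the threshold.
        if cm >= THRESHOLD_CM and pair_cm >= THRESHOLD_CM:
            visible.append(other)
    return sorted(visible)

print(visible_shared_matches("carol"))
print(visible_shared_matches("alice"))
```

Under that assumed rule, carol's page shows only alice, and the carol and dave link stays hidden even though the two share 40 cM with each other, because dave is a distant cousin (15 cM) to you.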

Does anyone know if this new tool picks up shared matches among distant cousins, say those in the 10-19 cM range?
West Berks- Appleton Bailey Barlow Bartholomew Carter/Cook Childs Corderoy Coxhead Froud Fryzer Griffin Harrison Head Noke Richmond Salter Sawyer Shrimpton Sidwell Stratton Stroud Wernham Wheatland
South Bucks- Miller Mitchell Horton
Cornwall- Aunger Baker Grigg Luxton
Hants- Hine/Hind
South Oxon- Applebee Barlow Clark Edginton Elliott Fryzer Simmonds Toby
Suffolk- Chilvers Darby Philpot Russell Stone
Surrey- Edwards Knight Lanaway
Sussex- English Exeter Jeffery Knight Mugridge
Wilts- Bishop