Topic: Extracting Page data from Ancestry URLs
Dolgellau
RootsChat Senior
****
Posts: 276



Extracting Page data from Ancestry URLs
« on: Friday 04 February 05 00:36 UTC (UK) »

If you download a list from the Ancestry index into a spreadsheet and then hover over the “View Image” icon, a URL such as:

http://content.ancestry.co.uk/iexec/?htx=view&r=5538&dbid=7618&iid=MERRG10_5690_5694-0443&desc=Rowland+Jones

appears.

Part of this URL, the bit before &desc (the iid value, here MERRG10_5690_5694-0443), is a page number. If all the page numbers are extracted, it becomes easier to use Ancestry’s partial transcription as the basis for a full transcription. But extracting five or six thousand of these page numbers by hand is a long, drawn-out process.
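
For anyone wanting to try this outside a spreadsheet, here is a minimal Python sketch of the extraction. It assumes the page number is always carried in the iid query parameter, as in the example URL above; the function name extract_iid is made up for illustration and is not part of any existing tool.

```python
from urllib.parse import urlsplit, parse_qs

def extract_iid(url):
    """Return the value of the iid query parameter, or None if it is absent."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("iid")
    return values[0] if values else None

# The example URL quoted in the post above.
url = ("http://content.ancestry.co.uk/iexec/?htx=view&r=5538&dbid=7618"
       "&iid=MERRG10_5690_5694-0443&desc=Rowland+Jones")

print(extract_iid(url))  # MERRG10_5690_5694-0443
```

Run over a column of URLs exported from the spreadsheet, this would give one page identifier per row, ready for sorting.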

Before the National Archive blocked it, a program existed (basically a spreadsheet macro) that extracted page numbers from the similar URLs used on the 1901 census. Does anybody know of a similar program or macro that will extract data from Ancestry URLs?
« Last Edit: Sunday 13 February 05 10:45 UTC (UK) by Copyright Editor »

Mae gwybodaeth Cyfrifiad  wedi ei ddarparu drwy Archif Genedlaethol Lloegr o dan hawlfraint y Goron.
Darparwyd  gwybodaeth o'r Ad Esg o gyhoeddiadau dan hawlfraint y Llyfrgell Genedlaethol
Daw wybodaeth arall o gyhoeddiadau dan hawlfraint Cymdeithasau Hanes Teuluol

Census information was obtained through the National Archive under Crown Copyright.
BT lookups are supplied through © of The National Library of Wales
Other lookups are from copyright publications by FHS's
jeffH
RootsChat Member
***
Posts: 114



Re: Extracting Page data from Ancestry URLs
« Reply #1 on: Friday 04 February 05 05:04 UTC (UK) »

Interesting.

I've recently used the 1901 England and Wales Census decoder and it worked fine, although the folio and piece numbers can't be decoded and the person ID is gone as well. The page ID was still there and usable by the software.

I don't think it would take much effort to rework the same software to decode the Ancestry site, or to come up with something new. I think I'm going to pop over to ancestry.com and look through some HTML code.

jeff

Pembrokeshire - Harries and Blethyn
Somerset - Wilkins, Parsons and Ball
Essex/Suffolk - Edwards and Smith
Morayshire - Younie and Mavor
jeffH
RootsChat Member
***
Posts: 114



Re: Extracting Page data from Ancestry URLs
« Reply #2 on: Friday 04 February 05 05:46 UTC (UK) »

OK, so using Word and Excel, I played around a little and ended up with an Excel file containing a bunch of people sharing the same surname, all sorted and grouped by census page number (the &iid field). Cool.

But I had to log in to get the search list. Perhaps I'm missing something. How would doing this at Ancestry be helpful if we have to subscribe to the service in the first place?
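
The same sort-and-group step can be sketched in Python rather than Word and Excel. This is only an illustration: it assumes each row is a (name, URL) pair, the function name group_by_page is hypothetical, and apart from Rowland Jones the sample names and the second iid are invented for the example.

```python
from collections import defaultdict
from urllib.parse import urlsplit, parse_qs

def group_by_page(rows):
    """Group names by the iid value found in each row's URL.

    rows is an iterable of (name, url) pairs. Rows whose URL has no
    iid parameter are collected under the key "unknown".
    """
    pages = defaultdict(list)
    for name, url in rows:
        iid = parse_qs(urlsplit(url).query).get("iid", ["unknown"])[0]
        pages[iid].append(name)
    # Sort by page identifier so neighbouring pages end up next to each other.
    return dict(sorted(pages.items()))

# Illustrative rows only: "Person B", "Person C" and the -0101 iid are made up.
BASE = ("http://content.ancestry.co.uk/iexec/?htx=view&r=5538&dbid=7618"
        "&iid={}&desc={}")
rows = [
    ("Rowland Jones", BASE.format("MERRG10_5690_5694-0443", "Rowland+Jones")),
    ("Person B", BASE.format("MERRG10_5690_5694-0443", "Person+B")),
    ("Person C", BASE.format("MERRG10_5690_5694-0101", "Person+C")),
]

for page, names in group_by_page(rows).items():
    print(page, names)
```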

jeff
Dolgellau
RootsChat Senior
****
Posts: 276



Re: Extracting Page data from Ancestry URLs
« Reply #3 on: Friday 04 February 05 15:27 UTC (UK) »

It probably isn't of much benefit to researchers looking for individual families, although if you were to extract all the people with the same surname and sort them by page number, you could reasonably assume that those appearing on the same page are related (as long as there is only one Head with that surname on the page) without having to pay to see the original page.
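
That rule of thumb could be sketched in Python as below. It assumes the extracted rows already share one surname and carry a page identifier and a relationship value; the dictionary keys, the function name likely_families and the sample people are all hypothetical. Pages with no Head listed are kept, since the Head may be on the previous page; only pages with two or more Heads are dropped.

```python
from collections import defaultdict

def likely_families(records):
    """Return {page: [names]} for pages with at most one Head.

    records is a list of dicts with the keys 'page', 'name' and
    'relationship', all for people sharing one surname.
    """
    by_page = defaultdict(list)
    for rec in records:
        by_page[rec["page"]].append(rec)

    families = {}
    for page, people in by_page.items():
        heads = [p for p in people
                 if p["relationship"].strip().lower() == "head"]
        # Two or more Heads on a page means the assumption is unsafe.
        if len(heads) <= 1:
            families[page] = [p["name"] for p in people]
    return families

# Invented sample data, for illustration only.
records = [
    {"page": "0443", "name": "Person A", "relationship": "Head"},
    {"page": "0443", "name": "Person B", "relationship": "Wife"},
    {"page": "0444", "name": "Person C", "relationship": "Head"},
    {"page": "0444", "name": "Person D", "relationship": "Head"},
]
print(likely_families(records))  # {'0443': ['Person A', 'Person B']}
```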

The reason I wanted to know if it was possible to sort individuals by page number was to aid a transcription project.

A typical census entry contains eight pieces of information: Address, Name, Relationship, Marital status, Age, Gender, Occupation and Where born. Ancestry's index contains four of these. If it were possible to sort Ancestry's index by page number, one would only need to add the missing four pieces of information and do a minor bit of sorting to get a full transcription for half the work.

Unfortunately, extracting all the page numbers manually takes more effort than doing a full original transcription.
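
As a rough sketch of that workflow in Python: sort the index rows by page, then write them out with blank columns for whatever still needs transcribing. The function name transcription_template is hypothetical, and the column names and sample rows are placeholders shortened for the example; which fields the index really supplies is an assumption here.

```python
import csv
import io

def transcription_template(index_rows, full_fields):
    """Return CSV text with index_rows sorted by page and blank missing columns.

    index_rows is a list of dicts that each include a 'page' key;
    full_fields is the list of columns wanted in the full transcription.
    """
    ordered = sorted(index_rows, key=lambda r: r["page"])
    out = io.StringIO()
    # restval="" leaves an empty cell for every field the index lacks.
    writer = csv.DictWriter(out, fieldnames=full_fields, restval="",
                            lineterminator="\n")
    writer.writeheader()
    for row in ordered:
        writer.writerow({k: v for k, v in row.items() if k in full_fields})
    return out.getvalue()

# Placeholder data: only 'page' and 'name' are filled in by the index here.
index_rows = [
    {"page": "0443", "name": "Rowland Jones"},
    {"page": "0101", "name": "Person C"},
]
fields = ["page", "address", "name", "occupation"]
print(transcription_template(index_rows, fields))
```

The empty cells can then be filled in while reading the original page images, one page at a time.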
ae359
RootsChat Pioneer
*
Posts: 1



Re: Extracting Page data from Ancestry URLs
« Reply #4 on: Wednesday 23 March 05 13:28 UTC (UK) »

If you access ancestry.com via http://www.nationalarchives.gov.uk/census/ then you can look at the indexes without payment, just as you can with the 1901 census on the GRO site. You still have to pay to view the actual records or family transcripts.

I have used the &iid string manually to group family members, after a search on surname and piece only (searching the page source is a fairly efficient way of doing it), but it would be great to have a tool like Census Decoder 1901 to use on the 1891 and 1871 censuses on Ancestry.com.
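
Searching the page source for the &iid strings could be automated along these lines. This is a sketch only: it assumes the results page has been saved as text and that each image link carries an iid parameter as in the URL quoted at the top of the thread. The function name find_iids and the HTML snippet are made up for illustration, and the real page layout may differ.

```python
import re

# Match iid=... after ? or &, stopping at the next &, quote, space or bracket.
IID_PATTERN = re.compile(r"[?&]iid=([^&\s\"'<>]+)")

def find_iids(page_source):
    """Return the distinct iid values in page_source, in order of appearance."""
    # HTML often writes & as &amp; inside links, so normalise that first.
    text = page_source.replace("&amp;", "&")
    seen = []
    for iid in IID_PATTERN.findall(text):
        if iid not in seen:
            seen.append(iid)
    return seen

# Made-up snippet in the shape of a saved results page.
html = ('<a href="http://content.ancestry.co.uk/iexec/?htx=view&amp;r=5538'
        '&amp;dbid=7618&amp;iid=MERRG10_5690_5694-0443'
        '&amp;desc=Rowland+Jones">View Image</a>')
print(find_iids(html))  # ['MERRG10_5690_5694-0443']
```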

Cheers

Keith

Lancashire Heywood, Buckley, Wild, Ashton
Staffordshire Ferneyhough, Selvey
Gloucestershire Theyer