Author Topic: Papers Past - Part V  (Read 45900 times)

Offline minniehaha

  • RootsChat Marquessate
  • *******
  • Posts: 7,303
  • "To live in hearts we leave behind, Is not to die"
Re: PAPERS PAST - Part V
« Reply #63 on: Thursday 06 August 20 02:46 BST (UK) »
Thank you!! Much appreciated.  ;D


Minniehaha.
HAMMOND, Cainham/Caynham, Shropshire, U.K. Otago-NZ.
GALBRAITH, Ireland, Dunedin, Otago-NZ., Kensington-London, U.K.
GRANT, Sct., Dunedin, Otago-NZ., Vancouver, Canada.
GLASS, Aberdeenshire, Otago-NZ.
CAIRNEY/CARNEY/KEARNEY/Ireland, Airdrie, Scotland, Otago-NZ.
O'BRIEN Mary Ann, Limerick, Otago-NZ.
NICOL(L) James, Scotland, Otago-NZ.
SCOTT Thomas, Shetland, Otago-NZ.
MCHARDY/MCHARDIE Euphemia, Scotland, Otago-NZ.

Offline kiwihalfpint

  • RootsChat Marquessate
  • *******
  • Posts: 7,905
  • Women and Cats will do as they please
Re: PAPERS PAST - Part V
« Reply #64 on: Thursday 06 August 20 03:51 BST (UK) »
Thanks very much! :D


Cheers
KHP
Census information Crown Copyright, from www.nationalarchives.gov.uk

Offline PapersPast

  • RootsChat Senior
  • ****
  • Posts: 255
Re: PAPERS PAST - Part V
« Reply #65 on: Wednesday 23 September 20 01:52 BST (UK) »
Hi folks -

I thought I'd pass along a wee search tip for Births/Deaths notices, helpfully prompted by an email I got from RootsChat user Thamesite:

She was having difficulty searching for a surname that very clearly was printed on a newspaper page, and despite the letters of the name being recognised correctly, the name wasn't showing up in her search results. Here's an example, where I've done a search for "Brien" - https://paperspast.natlib.govt.nz/newspapers/THS19260318.2.9.1?query=brien

You'll notice that "Brien" isn't highlighted, despite the letters appearing correctly in the "text" view of the article.

Our search index treats any string of characters with a space before and after it as a word, and I figured out that the ".—" characters immediately after the name, with no space, have a big impact: they effectively turn the name "Brien" into "Brien.—On".

No-one is going to search for a name "Brien.—On"!
Unfortunately this looks like a very common typographic convention in the family and personal notices of older newspapers, which means any name followed by ".—", as names often were in those notices, might well be missing from your searches.

There is a solution - to find any "missing" results, search for the surname followed by an asterisk (boolean wildcard). For example:

brien*

This search trick works on the "all of these words" search setting.
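For anyone curious why the asterisk helps, here is a rough Python sketch of the behaviour described above. It is only an illustration under my own assumptions: the notice text is made up, and the functions `tokenize` and `matches` are hypothetical names, not how the Papers Past index is actually built.

```python
import string

def tokenize(text):
    """Split on whitespace only, then lowercase and trim outer ASCII punctuation.

    Assumption: this mimics an index that treats anything between two spaces
    as one word. The real index may normalise text differently.
    """
    tokens = []
    for raw in text.split():
        token = raw.strip(string.punctuation).lower()
        if token:
            tokens.append(token)
    return tokens

def matches(term, tokens):
    """Exact match, or prefix match when the term ends with '*'."""
    term = term.lower()
    if term.endswith("*"):
        prefix = term[:-1]
        return any(tok.startswith(prefix) for tok in tokens)
    return term in tokens

# Made-up notice text in the style of an old death notice.
notice = "Brien.—On March 10, at Thames, Mary, aged 70."
tokens = tokenize(notice)

print(tokens[0])                  # the name is fused with ".—On"
print(matches("brien", tokens))   # plain search misses it
print(matches("brien*", tokens))  # wildcard search finds it
```

The point is simply that the surname never exists as a word on its own in the index, so only a search that accepts "brien" plus trailing characters can reach it.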

Replace "brien" with any surname you're looking for, and give it a crack! Good luck all, and thanks again to Thamesite.

 

Offline Fresh Fields

  • RootsChat Aristocrat
  • ******
  • Posts: 1,845
  • If only they could talk !
Re: PAPERS PAST - Part V
« Reply #66 on: Wednesday 23 September 20 02:10 BST (UK) »
Thanks for the tip.

I had noted from time to time that the highlighting was rather random, and this point helps to explain it.

For a period, the NZ HERALD in particular was a very heavily inked press, with many bleeds and blotches which the human eye tends to ignore but, evidently, the optical reader does not.

Alan.
Early Settlers & Heritage. Family History.


Offline PapersPast

  • RootsChat Senior
  • ****
  • Posts: 255
Re: PAPERS PAST - Part V
« Reply #67 on: Wednesday 23 September 20 04:31 BST (UK) »
You're welcome Alan. Spot-on correct about some editions of the NZ Herald - the ones we digitised from around the WW1 period have a lot of ink-bleed and print-quality issues.

If any of you ever see what looks like random highlighting, it's worth taking a moment to click on the text tab just above the article image to see how the words in question have been identified. It'll give you a few leads on how you might spot other unidentified occurrences elsewhere in the collection by searching with a few basic boolean tricks. Fuzzy search (name~1), wildcard substitutions (Auckl*) and character substitutions (Dun?din) are the obvious approaches to consider, but it really is very case-by-case!
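To show what those three tricks do to a single word, here is a small Python sketch. It is an illustration only: `edit_distance` and `term_matches` are hypothetical helpers, the misread words are invented examples of OCR errors, and the site's own search engine may interpret these operators differently.

```python
from fnmatch import fnmatchcase

def edit_distance(a, b):
    """Levenshtein distance: number of single-character edits between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute
        prev = curr
    return prev[-1]

def term_matches(term, word):
    """Match one query term against one indexed word.

    'name~1' allows up to 1 edit; '*' stands for any run of characters;
    '?' stands for exactly one character.
    """
    term, word = term.lower(), word.lower()
    if "~" in term:
        base, _, limit = term.partition("~")
        return edit_distance(base, word) <= int(limit or 1)
    return fnmatchcase(word, term)

print(term_matches("Auckl*", "Auckland"))   # wildcard: any ending
print(term_matches("Dun?din", "Dunodin"))   # one misread character
print(term_matches("Myers~1", "Meyers"))    # fuzzy: one edit away
print(term_matches("Myers~1", "Miller"))    # too different to match
```

Which trick suits depends on what the text tab shows: a wildcard helps when extra characters are stuck on the end, a "?" when one letter is misread, and fuzzy search when you can't predict where the error falls.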

As always, don't hesitate to get in touch if you want a hand  :)

Offline Fresh Fields

  • RootsChat Aristocrat
  • ******
  • Posts: 1,845
  • If only they could talk !
Re: PAPERS PAST - Part V
« Reply #68 on: Wednesday 23 September 20 12:23 BST (UK) »
Hello again.

I started my research hobby long before the internet, when we were obliged to spend hours in libraries, archives and museums going through their holdings, so it’s a privilege to now be able to sit in front of a computer at home and have access to all the material on offer online.

However, it is important to remember that the systems in place to optically scan, and then digitally reproduce, the written word do have their limitations.

Not finding an item through a search engine service does not mean that it is not there.

Parts of articles can be carried forward to other columns within the paper or book, so you may miss an all-important associated item in your search. Likewise, imperfections in the inking and printing of the type can cause key search words to be missed by the scanner.

If looking for further and/or more detailed press reports of an incident, also search for secondary associated details that appear in the material you have already collected.

By the time an incident gets to a court hearing, the place name of the incident is often given more precisely, so it may appear under a different name or spelling, BUT the people named as attending in the first press reports will also be those called upon to give evidence to the court. So also search using the names of the attending police, medical providers, teacher, instructor or priest etc.

Tonight I decided to check once again, that I had not overlooked an item in my MYERS research of late.

I set my search criteria as  “Myers*”  and attached below are a few examples of highlighted text findings from 3,345 hits, over a seven-year search period, with adverts excluded. They are an example of what I call random hits.

Happy hunting,

Alan.
Early Settlers & Heritage. Family History.

Offline Fresh Fields

  • RootsChat Aristocrat
  • ******
  • Posts: 1,845
  • If only they could talk !
Re: PAPERS PAST - Part V
« Reply #69 on: Wednesday 23 September 20 13:19 BST (UK) »
Plus a completely random selection within one search hit was received, but I'm not sure whether it will replicate.

An OPEN ENDED "all of these words" search for   “Charles 3rd Waikato Militia”   gave 513 hits.

In the link below there were many highlighted words, and spaces, that to my eye were quite random.

https://paperspast.natlib.govt.nz/newspapers/TC18601110.2.4?items_per_page=100&query=Charles+3rd+Waikato+Militia&snippet=true&sort_by=byDA

Alan.

Early Settlers & Heritage. Family History.

Offline PapersPast

  • RootsChat Senior
  • ****
  • Posts: 255
Re: PAPERS PAST - Part V
« Reply #70 on: Wednesday 23 September 20 22:41 BST (UK) »
Very interesting, Alan. I always appreciate the visibility that extra eyeballs bring; I would never have spotted that myself.

Looking at the page-level view of that article is also quite informative, and suggests some reasons why the highlighting isn't great on this one - https://paperspast.natlib.govt.nz/newspapers/colonist/1860/11/10/1 - it looks to me like the page was printed with a bit of radial skew. Nowadays we usually run an automated image de-skew process that mitigates this, but it's possible that this image predates that, or somehow missed that step. The follow-on impact of that is that the automated zoning we map out for article boundaries (which is obviously all based around rectilinear shapes with 90 degree corners) actually doesn't map very well on this skewed article. In turn, the highlighting (which is based on those automated zones) doesn't map.

You'll notice that the highlighting in the "Journal of events at Taranaki" article is often one-and-a-half lines above the word it should relate to, and a little to the side. I'll pass this one back to the digitisation team and see if this is something we can maybe reprocess to improve.

Backtracking slightly to typographic conventions throwing off word/name recognition, and the convention of having ".—" immediately following a name in a family notice: A key thing to keep in mind is the context of that convention - in the places this is used (birth or death notices), you'll find other typical keywords like "Born" or "missed", so an approach with a wildcard could be augmented with those other keywords to focus the results on those types of things.

e.g.

brien* born

...will surface additional relevant results that wouldn't have shown up in a straight "Brien born" search.
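As a rough picture of how adding a keyword narrows things down, here is a short Python sketch of an "all of these words" match where one term carries a wildcard. Both sample texts are invented, `matches_all` is a hypothetical helper, and words are split on spaces only, which is an assumption about the index rather than a description of it.

```python
def matches_all(query, text):
    """True only if every query term matches some space-separated word.

    A term ending in '*' matches any word that starts with it.
    """
    words = text.lower().split()
    for term in query.lower().split():
        if term.endswith("*"):
            hit = any(w.startswith(term[:-1]) for w in words)
        else:
            hit = term in words
        if not hit:
            return False
    return True

# Invented examples: one birth notice, one unrelated news item.
birth_notice = "Brien.—On March 10, at Thames, a son was born"
racing_item = "Mr Brien's horse won at Thames"

print(matches_all("brien born", birth_notice))   # name is fused, so no match
print(matches_all("brien* born", birth_notice))  # wildcard plus keyword matches
print(matches_all("brien* born", racing_item))   # keyword filters this one out
```

So the wildcard recovers the fused name, and the extra keyword keeps unrelated mentions of the surname out of the results.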


Offline PapersPast

  • RootsChat Senior
  • ****
  • Posts: 255
Re: PAPERS PAST - Part V
« Reply #71 on: Wednesday 23 September 20 22:44 BST (UK) »
Also, what an excellent set of research tips, Alan.