I'm Tony Knight, the developer of QromaScan, which was mentioned here.
So QromaScan has had the ability to create metadata using voice recognition since our first release about 2 years ago. You scan your photos with our special lightbox and your iPhone, then use voice recognition to describe the date, location, and people in the image, and those details are converted to industry-standard photo metadata.
QromaScan v3, which we released about a month ago, introduces something called natural language tagging. This means you describe a photo in your own way, as you might if you wrote it on the back of the photo, and our technology uses machine learning and linguistic parsing to find things like the date the photo was taken, where it was taken, and who is in it. The full description is also embedded into the image, and writing the metadata to the file serves two purposes.
First, when the image is opened, the user can see things like a map of where the image was taken, a description telling the backstory of what was going on, and other historically important information that you might otherwise have written on the back of the photo. The second purpose is probably the most important of all. Applying this industry-standard metadata means the image is now indexed by your operating system and is searchable across all of your other images. I have more than 44,000 images, but I can find almost any one of them in seconds because they are all tagged with metadata for date, location, and people. Whether I am on my phone, tablet, or computer, or even on any machine with a web browser, I can type something like "Izzy Paris" and quickly find just the photos of my daughter in Paris 4 years ago.
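To make that second point concrete, here is a minimal sketch (not QromaScan's actual pipeline) of writing those same industry-standard fields into a JPEG. It shells out to the widely used exiftool utility, and the file name, coordinates, dates, and names below are just hypothetical example data:

    # Requires the exiftool command-line utility to be installed.
    import subprocess

    def tag_photo(path, taken, lat, lat_ref, lon, lon_ref, people, description):
        """Embed date, location, people, and a description as standard photo metadata."""
        args = ["exiftool",
                f"-EXIF:DateTimeOriginal={taken}",        # when the photo was taken
                f"-GPSLatitude={lat}",  f"-GPSLatitudeRef={lat_ref}",
                f"-GPSLongitude={lon}", f"-GPSLongitudeRef={lon_ref}",
                f"-IPTC:Caption-Abstract={description}",  # the free-form backstory
                f"-XMP-dc:Description={description}"]
        # PersonInImage is a list-type XMP tag; += appends one entry per person.
        for person in people:
            args.append(f"-XMP-iptcExt:PersonInImage+={person}")
        args += ["-overwrite_original", path]
        subprocess.run(args, check=True)

    # Hypothetical example values:
    tag_photo("scan_0042.jpg",
              taken="2013:06:15 14:30:00",                # EXIF date format
              lat=48.8566, lat_ref="N", lon=2.3522, lon_ref="E",  # Paris
              people=["Izzy"],
              description="Izzy outside the Louvre on our first day in Paris.")

Once fields like these are in the file itself, indexers such as macOS Spotlight and the search in Photos can pick them up automatically, which is what makes a query like "Izzy Paris" work across devices rather than in one app's private database.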
I think photo tagging is in many ways as important as the photo itself. It provides context that will still matter many generations from now, to those you pass your photos down to.