October 6th, 2009

"Indonesia - Java. Little girl." Frank George Carpenter collection.

In January 2008 the Frank Bette Center for the Arts held a fundraiser, auctioning off donated art. One piece was “Unattributed sepia print, young girl.” It seemed like a fun challenge, so I went to the Library of Congress website. I actually got lucky searching their Prints & Photographs Online Catalog (PPOC). A few minutes later I was able to make my report: “The photo is dated between 1890 and 1923. It’s part of the Frank George Carpenter collection, but I don’t know if he was the actual photographer or just the collector. You’ll find it online at http://hdl.loc.gov/loc.pnp/cph.3b34405.”

But really, that did involve quite a bit of luck. The Library of Congress has about a million photos digitized on their site. That number means two things: It’s only a tiny fraction of all the photos on the web–much less all photographs in existence–and it’s still a whole lot of photos to trawl through.

In March of this year I read an interesting article about proof of the “Borneo Monster”–basically a big ass snake–being shown to be a hoax. Duh. That’s not the interesting part. What caught my eye was how the hoax was uncovered:

"Borneo Monster"

“Scientific American talked to a Kansas librarian who was one of many to reveal the hoax. Nathan Chadwick explained that by using the reverse search engine TinEye, which crawls the Web for pixels that match an uploaded image, he was able to locate the original photo. According to many Web sites, the original image shows the Congo River in Africa, not the Baleh River in Borneo” (http://news.aol.com/article/borneo-monster-photos-proven-fake/382584).

Using TinEye would have been so convenient when I was trying to identify the photo for the Frank Bette! OK, not really. I just tried it and couldn’t find it among the more than one billion images TinEye searched. I e-mailed a librarian at the Library of Congress to ask about this, and received an interesting and thoughtful response:

We have not specifically blocked TinEye.com. There may be a couple of explanations for the lack of representation from Prints & Photographs Division collections.

I’m not sure how TinEye does its crawling, but our images and metadata are not sitting on static pages. They are only retrieved in response to a search. Most crawlers would not pick them up because of this dynamic condition. Supporting my supposition is the fact that if you do submit to TinEye one of our static pages that contains images from the collections, it is able to find it (e.g., try the “About” page for the Farm Security Administration/Office of War Information Color Photographs: <http://lcweb2.loc.gov/pp/fsacabt.html>).

As far as I know, on the few occasions when the Library has experimented with enabling services to crawl through special mapping technologies, services send the requests in so fast that it adversely affects our servers.

So searching TinEye for Library of Congress images was a bust. Same with Wikipedia, although I haven’t e-mailed them to find out why. TinEye did successfully find images I knew were lurking on the Internet Movie Database and Amazon.com. Oh, and no searching for porn: This violates their terms of use. Not that I was going to.

TinEye works like fingerprint-identification software, which maps key information in a photo and looks for the same image, even if its shape, size, or color has been modified. For example, I uploaded the image on the left, and the image on the right was among the matches it found.


Pretty impressive! However, you can’t upload a photo of your high school sweetheart and cyberstalk them. Sorry. TinEye says they don’t do face recognition. But that’s like telling a kid that an oven is hot: I had to try uploading a picture of myself. The result?


At least someone sees the resemblance. That TinEye’s pretty smart.
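
For the curious, here is a rough sketch of how the fingerprint idea described earlier might work. To be clear, this is not TinEye's actual method, which I don't know. It is a simple, well-known stand-in called an "average hash": shrink the picture to a tiny grid, mark each cell as brighter or darker than average, and compare the resulting bit patterns. The function names, the 8x8 grid, and the made-up gradient "photos" are all my own assumptions for illustration, and the sketch expects images of at least 8x8 pixels.

```python
# Illustrative "average hash" sketch. NOT TinEye's real algorithm.
# An image here is just a list of rows of grayscale values (0-255).

def shrink(pixels, size=8):
    """Downsample to size x size by averaging rectangular blocks of pixels."""
    height, width = len(pixels), len(pixels[0])
    small = []
    for r in range(size):
        top, bottom = r * height // size, (r + 1) * height // size
        row = []
        for c in range(size):
            left, right = c * width // size, (c + 1) * width // size
            block = [pixels[y][x]
                     for y in range(top, bottom)
                     for x in range(left, right)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small

def average_hash(pixels, size=8):
    """Build a fingerprint: one bit per cell, 1 if brighter than the mean."""
    cells = [v for row in shrink(pixels, size) for v in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for v in cells:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits

def hamming_distance(a, b):
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")

def make_gradient(width, height):
    """Hypothetical test image: brightness rises from left to right."""
    return [[255 * x // (width - 1) for x in range(width)]
            for _ in range(height)]

original = make_gradient(64, 64)
resized = make_gradient(32, 48)                          # new size and shape
darker = [[v // 2 for v in row] for row in original]     # dimmed copy
inverted = [[255 - v for v in row] for row in original]  # a different picture

for name, image in [("resized", resized), ("darker", darker), ("inverted", inverted)]:
    distance = hamming_distance(average_hash(original), average_hash(image))
    print(name, distance)
```

A small distance suggests the same picture and a large one suggests a different picture. In this toy case the resized and dimmed copies should match the original exactly, while the inverted one should land as far away as possible. A real service would need something far more robust to cropping and editing than this.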

Michael Singman-Aste
Postdiluvian Photo
