Scratchpad
(books, google, metadata)
6 Sep. 2007
Err...okay, well, it would appear that literally as I was typing up the last post, Google updated book search to use subject headings! I searched for something, and then 5 minutes later searched for the exact same term again and poof! There was the option to use subjects. Now if you try a search for "fiction" it'll ask if you want American or German fiction. Yowza. Didn't exist at 4.16pm, does exist at 4.30pm.
(books, google, metadata, search, spider)
5 Sep. 2007
Poking around Google Books a little more, I discovered the following path for snagging metadata to compile the full details of an item:
On the results page, click on "About this book." Yeah. That's it. Duh. Of course, you still can't actually search on metadata, but at least it's there. You could automate a search and retrieve everything for certain keywords, then use the metadata to do a secondary "weeding."
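That secondary "weeding" could look something like the sketch below. It assumes the automated keyword search has already run and that each hit's "About this book" details have been saved as a plain dictionary. The field names (`title`, `subjects`, `country`) and the sample records are made up for illustration; they are not a format Google actually returns.

```python
# Hypothetical records: keyword hits plus their saved metadata.
# Field names and values are assumptions, not a real Google Books format.
hits = [
    {"title": "A Whaling Novel", "subjects": ["American fiction", "Whaling"], "country": "US"},
    {"title": "Essays on the Theory of Fiction", "subjects": ["Literary criticism"], "country": "US"},
    {"title": "A Berlin Novel", "subjects": ["German fiction"], "country": "DE"},
]


def weed(records, subject_word=None, country=None):
    """Keep only records whose metadata matches every criterion given."""
    kept = []
    for record in records:
        subjects = [s.lower() for s in record.get("subjects", [])]
        if subject_word and not any(subject_word.lower() in s for s in subjects):
            continue  # no subject heading mentions the word
        if country and record.get("country") != country:
            continue  # published somewhere else
        kept.append(record)
    return kept


fiction_titles = [r["title"] for r in weed(hits, subject_word="fiction")]
american_fiction = [r["title"] for r in weed(hits, subject_word="fiction", country="US")]
print(fiction_titles)
print(american_fiction)
```

Note that the essay collection has "Fiction" in its title but gets weeded out anyway, because only the subject headings are checked.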
Or, you could do this:
On Google Books, search only "full view" to find complete e-books ->
On the results page, follow "Find this book in a library" for OCLC results ->
On the OCLC site, retrieve the metadata for the object.
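Strung together, that route is a three-step pipeline. The sketch below only shows its shape: the three step functions are passed in as parameters and are entirely hypothetical stand-ins (no real Google or OCLC calls are made or implied), and the stubs at the bottom return canned values so the flow can be run end to end.

```python
def harvest_metadata(query, search_full_view, find_library_link, fetch_oclc_record):
    """Chain the three steps: full-view search -> library link -> OCLC metadata.

    All three callables are hypothetical placeholders supplied by the caller.
    """
    records = []
    for book_id in search_full_view(query):
        link = find_library_link(book_id)
        if link is None:
            continue  # this hit has no "Find this book in a library" link
        records.append(fetch_oclc_record(link))
    return records


# Canned stand-ins so the sketch runs without touching any website.
def fake_search(query):
    return ["book-1", "book-2", "book-3"]


def fake_library_link(book_id):
    links = {"book-1": "oclc-record-a", "book-3": "oclc-record-c"}
    return links.get(book_id)


def fake_oclc_record(link):
    return {"source": link, "subjects": ["placeholder subject"]}


results = harvest_metadata("fiction", fake_search, fake_library_link, fake_oclc_record)
print([r["source"] for r in results])
```

Passing the steps in as functions keeps the scraping details out of the pipeline itself, so each step can be swapped or tested separately.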
Oops, is my face red. Sort of. At the same time, really, if the metadata is there, if each record is already tied to an OCLC record, is it really necessary to prevent users from searching the fields directly? Still, at least there seem to be workarounds.
(books, google, metadata, search)
29 Aug. 2007
All I wanted to do was write a spider to steal all of the public domain books off of Google. But I can't. You know why?
You can't specifically search the metadata on Google Books.
That's right - no metadata searching. For you non-library/non-tech geeks out there who have no idea what I'm talking about, that means you can't do complex searches on the "aboutness" of a book. The text of a book is just text, but the metadata includes keywords on who wrote it, where it was written, when it was written, and what it is about. Since a bazillion words appear in a book, it is often useful to search strictly on human-created, trusted metadata: it takes out the extra cruft and minimizes false positives. Without that ability, you can't go, "Oh, Google Books, show me just the fiction," because the word "fiction" might appear in the title of an academic paper that is about fiction but is itself non-fiction. Google, as a machine, doesn't handle the difference between those two concepts very adeptly. It also doesn't appear to appreciate the difference between a book and a journal, and you can't search for items published in a particular country.
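To make the false-positive problem concrete, here is a toy comparison of the two kinds of search. Everything in it is invented for illustration: the two items, their text, and the `genre` and `kind` fields are assumptions, and nothing here reflects how Google Books works internally.

```python
# Two made-up items: a novel, and an academic paper about fiction.
library = [
    {
        "title": "A Seafaring Novel",
        "text": "Call it a tale of the sea and nothing more.",
        "genre": "fiction",
        "kind": "book",
    },
    {
        "title": "Narrative Voice in Modern Fiction",
        "text": "This paper surveys how fiction handles narrative voice.",
        "genre": "non-fiction",
        "kind": "journal article",
    },
]


def search_text(items, word):
    """Match the word anywhere in the title or body text."""
    word = word.lower()
    return [i["title"] for i in items
            if word in i["title"].lower() or word in i["text"].lower()]


def search_field(items, field, value):
    """Match only on one trusted metadata field (exact match)."""
    return [i["title"] for i in items if i.get(field) == value]


print(search_text(library, "fiction"))
print(search_field(library, "genre", "fiction"))
```

The text search returns the paper and misses the novel, which never uses the word "fiction" at all; the field search does the opposite. The same field search on `kind` would separate books from journals.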
God, I hate Google more and more with each passing day.