New Google Online Newspaper Archive Search

Google Newspaper Archive

Google's long-standing goal of making billions of pages of newsprint from around the world searchable, discoverable, and accessible online now looks within reach. Google has announced an expansion of the historical newspaper articles that can be searched online and is partnering with newspaper publishers to do so. Many publishers are scanning their print archives and making them available through Google’s News Archive Search.

Google started working on this newspaper archive project in 2006, when it began indexing the existing digital archives of the New York Times and the Washington Post. Now it has joined hands with ProQuest, Heritage, and the Quebec Chronicle-Telegraph, the oldest newspaper in North America, which has been publishing continuously for more than 244 years. With the help of these new publisher partners, Google is widening the range of material that can be searched through its News Archive Search.

The news archives will run contextual ads from Google AdSense, and the revenue will be split with the respective newspaper publishers. “Not every search will trigger this new content, but you can start by trying queries like [Nixon space shuttle] or [Titanic located]. Stories we’ve scanned under this initiative will appear alongside already-digitized material from publications like the New York Times as well as from archive aggregators, and are marked ‘Google News Archive,’” said Google’s Product Manager. “Over time, as we scan more articles and our index grows, we’ll also start blending these archives into our main search results so that when you search Google.com, you’ll be searching the full text of these newspapers as well.”

The technology Google uses for newspaper indexing was originally used to scan books; it has since been made more advanced and tuned to search and index newsprint pages. Its optical character recognition distinguishes between headlines and body text, which helps it surface more relevant pages.

GoogleBlog via TechCrunch