Using Google's AJAX Search API with Java ...

http://www.ajaxlines.com/ajax/stuff/article/using_google_is_ajax_search_api_with_java.php

 

I was rather depressed over a year ago when Google deprecated their SOAP Search API in favor of their AJAX Search API. Essentially Google was saying that they didn't want anyone programmatically accessing Google search results unless they were going to be presenting the results unaltered in a rectangular portion of a website. This was particularly troubling to me because, like many academics, I have relied on the API to do automated queries, especially for Warrick.

A few months ago I got a little excited when Google opened their AJAX API to non-JavaScript environments. Google is now allowing queries using a REST-based interface that returns search results using JSON. The purpose of this API is still to show unaltered results to your website's user, but I don't see anything in the Terms of Use that prevents the API from being used in an automated fashion (having a program regularly execute queries), especially for research purposes, as long as you aren't trying to make money (or prevent Google from making money) from the operation.

So, here is what I have learned about using the Google AJAX Search API with Java. I haven't found this information anywhere else on the Web in one spot, so I hope you will find it useful.

Here is a Java program that queries Google three times. The first query is for the title of this blog (Questio Verum). The second query asks Google if the root page has been indexed, and the third query asks how many pages from this website are indexed. (Please forgive the poor formatting... Blogger thinks it knows better than I how I want my text indented. Argh.)
    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.net.URLConnection;
    import java.net.URLEncoder;

    import org.json.JSONArray;   // JSON library from http://www.json.org/java/
    import org.json.JSONObject;

    public class GoogleQuery {

        // Put your website here
        private final String HTTP_REFERER = "http://www.example.com/";

        public GoogleQuery() {
            makeQuery("questio verum");
            makeQuery("info:http://frankmccown.blogspot.com/");
            makeQuery("site:frankmccown.blogspot.com");
        }

        private void makeQuery(String query) {
            System.out.println("\nQuerying for " + query);

            try {
                // Convert spaces to +, etc. to make a valid URL
                query = URLEncoder.encode(query, "UTF-8");

                URL url = new URL("http://ajax.googleapis.com/ajax/services/search/web?start=0&rsz=large&v=1.0&q=" + query);
                URLConnection connection = url.openConnection();
                connection.addRequestProperty("Referer", HTTP_REFERER);

                // Get the JSON response; try-with-resources closes the reader
                String line;
                StringBuilder builder = new StringBuilder();
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(connection.getInputStream(), "UTF-8"))) {
                    while ((line = reader.readLine()) != null) {
                        builder.append(line);
                    }
                }

                String response = builder.toString();
                JSONObject json = new JSONObject(response);

                System.out.println("Total results = " +
                        json.getJSONObject("responseData")
                            .getJSONObject("cursor")
                            .getString("estimatedResultCount"));

                JSONArray ja = json.getJSONObject("responseData")
                                   .getJSONArray("results");

                // Print the title and URL of each result
                System.out.println("\nResults:");
                for (int i = 0; i < ja.length(); i++) {
                    System.out.print((i + 1) + ". ");
                    JSONObject j = ja.getJSONObject(i);
                    System.out.println(j.getString("titleNoFormatting"));
                    System.out.println(j.getString("url"));
                }
            } catch (Exception e) {
                System.err.println("Something went wrong...");
                e.printStackTrace();
            }
        }

        public static void main(String[] args) {
            new GoogleQuery();
        }
    }

Note that this example does not use a key. Although it is suggested you use one, you don't have to.
All that is required is that you send your website or the URL of the webpage that is making the query in the Referer request header (coming from the HTTP_REFERER constant). When you run this program, you will see the following output:

    Querying for questio verum
    Total results = 1320

    Results:
    1. Questio Verum
    http://frankmccown.blogspot.com/
    2. Questio Verum: URL Canonicalization
    http://frankmccown.blogspot.com/2006/04/url-canonicalization.html
    3. WikiAnswers - What does questio verum mean
    http://wiki.answers.com/Q/What_does_questio_verum_mean
    4. Amazon.com: Questio Verum "iracund"'s review of How to Get Happily ...
    http://www.amazon.com/review/R3VRSYWW5EJZFH
    5. Amazon.com: Profile for Questio Verum
    http://www.amazon.com/gp/pdp/profile/A2Q6CLLQPXG55A
    6. How and where to get Emerald? - Linux Forums
    http://www.linuxforums.org/forum/ubuntu-help/119375-how-where-get-emerald.html
    7. Lemme hit that wifi, baby! - Linux Forums
    http://www.linuxforums.org/forum/coffee-lounge/122922-lemme-hit-wifi-baby.html
    8. [SOLVED] lost in tv tuner hell... please help - Ubuntu Forums
    http://ubuntuforums.org/showthread.php%3Fp%3D3802299

    Querying for info:http://frankmccown.blogspot.com/
    Total results = 1

    Results:
    1. Questio Verum
    http://frankmccown.blogspot.com/

    Querying for site:frankmccown.blogspot.com
    Total results = 463

    Results:
    1. Questio Verum
    http://frankmccown.blogspot.com/
    2. Questio Verum: March 2006
    http://frankmccown.blogspot.com/2006_03_01_archive.html
    3. Questio Verum: December 2006
    http://frankmccown.blogspot.com/2006_12_01_archive.html
    4. Questio Verum: June 2006
    http://frankmccown.blogspot.com/2006_06_01_archive.html
    5. Questio Verum: October 2007
    http://frankmccown.blogspot.com/2007_10_01_archive.html
    6. Questio Verum: July 2007
    http://frankmccown.blogspot.com/2007_07_01_archive.html
    7. Questio Verum: April 2006
    http://frankmccown.blogspot.com/2006_04_01_archive.html
    8. Questio Verum: July 2006
    http://frankmccown.blogspot.com/2006_07_01_archive.html

The program is only printing the title of each search result and its URL, but there are many other items you have access to. The partial JSON response looks something like this:

    "GsearchResultClass": "GwebSearch",
    "cacheUrl": "http://www.google.com/search?q=cache:Euh9Z1rDeXUJ:frankmccown.blogspot.com",
    "content": "Questio Verum . The adventures of academia, or how I learned to stop worrying and love teacher evaluations.*. Saturday, June 07, 2008 ... ",
    "title": "Questio Verum ",
    "titleNoFormatting": "Questio Verum",
    "unescapedUrl": "http://frankmccown.blogspot.com/",
    "url": "http://frankmccown.blogspot.com/",
    "visibleUrl": "frankmccown.blogspot.com"

So, for example, you could display the result's cached URL (Google's copy of the web page) or the snippet (page content) by modifying the code in the example's for loop.

You will note that only 8 results are shown for the first and third queries. The AJAX API returns either 8 results (rsz=large) or 4 results (rsz=small in the query string). Currently there are no other sizes. You can see additional results (page through the results) by changing start=0 in the query string to start=8 (page 2), start=16 (page 3), or start=24 (page 4). You cannot see anything past the first 32 results. In fact, setting start to any value larger than 24 will result in an org.json.JSONException being thrown. More info on the query string parameters is available in Google's documentation for the AJAX Search API.

From the limited number of queries I have run, the first 8 results returned from the AJAX API are the same as the first 8 results returned from Google's web interface, but I am not sure this is always so. In other words, I wouldn't use the AJAX API for SEO just yet.

One last thing: the old SOAP API had a limit of 1,000 queries per key per 24 hours. There are no published limits for the AJAX API, so have at it.
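The paging scheme described above (start=0, 8, 16, 24 with rsz=large) can be sketched as a small URL-building helper. This is only an illustration under stated assumptions: the class name SearchPager and its methods buildUrl and pageUrls are hypothetical and not part of the original program, and the choice to reject out-of-range start values before sending a request is mine. It only builds the request URLs; it does not contact Google.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not in the original program): builds the request
// URLs for the four result pages the post says the API exposes.
final class SearchPager {
    private static final String BASE =
            "http://ajax.googleapis.com/ajax/services/search/web";
    private static final int PAGE_SIZE = 8;   // rsz=large returns 8 results
    private static final int MAX_START = 24;  // post: larger values fail

    // Build the URL for one page; start is the zero-based result offset.
    static String buildUrl(String query, int start) {
        if (start < 0 || start > MAX_START) {
            // Assumption: fail early instead of letting the request go out
            throw new IllegalArgumentException(
                    "start must be between 0 and " + MAX_START);
        }
        try {
            return BASE + "?start=" + start + "&rsz=large&v=1.0&q="
                    + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM
            throw new IllegalStateException(e);
        }
    }

    // URLs for pages 1 through 4 (start = 0, 8, 16, 24).
    static List<String> pageUrls(String query) {
        List<String> urls = new ArrayList<>();
        for (int start = 0; start <= MAX_START; start += PAGE_SIZE) {
            urls.add(buildUrl(query, start));
        }
        return urls;
    }

    public static void main(String[] args) {
        for (String url : pageUrls("site:frankmccown.blogspot.com")) {
            System.out.println(url);
        }
    }
}
```

Each URL returned by pageUrls could be passed to the connection code in makeQuery in place of the hard-coded start=0 URL, so that one query collects up to 32 results.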
source: frankmccown.blogspot
