Google Search from Java Program Example

Some time ago I was looking for a way to search Google from a Java program. I was surprised to find that Google once had a web search API, but it was deprecated long ago, and there is now no standard way to do this.

A Google search is basically an HTTP GET request in which the query is passed as a URL parameter, and we have seen earlier that options such as Java's HttpURLConnection or Apache HttpClient can perform it. The harder part is parsing the HTML response and extracting the useful information from it. That’s why I chose jsoup, an open source HTML parser that can also fetch the HTML from a given URL.
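
Because the query travels inside the URL, the search term has to be URL-encoded before it is appended. The following is a minimal sketch of that step using only the standard library; the class name SearchUrlBuilder and its build method are hypothetical names introduced for illustration, not part of the original program.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper (not part of the original program): builds the GET URL for a search.
final class SearchUrlBuilder {

    static final String GOOGLE_SEARCH_URL = "https://www.google.com/search";

    // Encodes the term so that spaces, '&', '=' and non-ASCII characters
    // cannot break or alter the query string.
    static String build(String searchTerm, int num) {
        try {
            String encodedTerm = URLEncoder.encode(searchTerm, "UTF-8");
            return GOOGLE_SEARCH_URL + "?q=" + encodedTerm + "&num=" + num;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints https://www.google.com/search?q=java+%26+jsoup&num=10
        System.out.println(build("java & jsoup", 10));
    }
}
```

Without the encoding step, a term such as "java & jsoup" would be split at the '&' and only part of it would reach the q parameter.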

Below is a simple Java program that fetches the Google search results page and parses it to extract each result's title and URL.

package com.journaldev.jsoup;

import java.io.IOException;
import java.net.URLEncoder;
import java.util.Scanner;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class GoogleSearchJava {

	public static final String GOOGLE_SEARCH_URL = "https://www.google.com/search";

	public static void main(String[] args) throws IOException {
		// Read the search term and the number of results from the console
		Scanner scanner = new Scanner(System.in);
		System.out.println("Please enter the search term.");
		String searchTerm = scanner.nextLine();
		System.out.println("Please enter the number of results. Example: 5 10 20");
		int num = scanner.nextInt();
		scanner.close();

		// URL-encode the term so spaces and special characters do not break the query string
		String searchURL = GOOGLE_SEARCH_URL + "?q=" + URLEncoder.encode(searchTerm, "UTF-8") + "&num=" + num;
		// Without a proper User-Agent, we will get a 403 error
		Document doc = Jsoup.connect(searchURL).userAgent("Mozilla/5.0").get();

		// Uncomment to print the HTML data; save it to a file and open it in a browser to compare
		// System.out.println(doc.html());

		// If the Google search results HTML changes (for example <h3 class="r"> to <h3 class="r1">),
		// the selector below must be changed accordingly
		Elements results = doc.select("h3.r > a");

		for (Element result : results) {
			String linkHref = result.attr("href");
			String linkText = result.text();
			// Result hrefs have the form "/url?q=<target>&...": skip the 7-character
			// "/url?q=" prefix and stop at the first '&'
			int end = linkHref.indexOf('&');
			String url = (linkHref.startsWith("/url?q=") && end > 7) ? linkHref.substring(7, end) : linkHref;
			System.out.println("Text::" + linkText + ", URL::" + url);
		}
	}

}

Below is sample output from the program. I saved the HTML data to a file and opened it in a browser to confirm that the output is what we wanted. Compare the output with the image below.

Please enter the search term.
journaldev
Please enter the number of results. Example: 5 10 20
20
Text::JournalDev, URL::https://www.journaldev.com/
Text::Java Interview Questions, URL::https://www.journaldev.com/java-interview-questions
Text::Java design patterns, URL::https://www.journaldev.com/tag/java-design-patterns
Text::Tutorials, URL::https://www.journaldev.com/tutorials
Text::Java servlet, URL::https://www.journaldev.com/tag/java-servlet
Text::Spring Framework Tutorial ..., URL::https://www.journaldev.com/2888/spring-tutorial-spring-core-tutorial
Text::Java Design Patterns PDF ..., URL::https://www.journaldev.com/6308/java-design-patterns-pdf-ebook-free-download-130-pages
Text::Pankaj Kumar (@JournalDev) | Twitter, URL::https://twitter.com/journaldev
Text::JournalDev | Facebook, URL::https://www.facebook.com/JournalDev
Text::JournalDev - Chrome Web Store - Google, URL::https://chrome.google.com/webstore/detail/journaldev/ckdhakodkbphniaehlpackbmhbgfmekf
Text::Debian -- Details of package libsystemd-journal-dev in wheezy, URL::https://packages.debian.org/wheezy/libsystemd-journal-dev
Text::Debian -- Details of package libsystemd-journal-dev in wheezy ..., URL::https://packages.debian.org/wheezy-backports/libsystemd-journal-dev
Text::Debian -- Details of package libsystemd-journal-dev in sid, URL::https://packages.debian.org/sid/libsystemd-journal-dev
Text::Debian -- Details of package libsystemd-journal-dev in jessie, URL::https://packages.debian.org/jessie/libsystemd-journal-dev
Text::Ubuntu – Details of package libsystemd-journal-dev in trusty, URL::https://packages.ubuntu.com/trusty/libsystemd-journal-dev
Text::libsystemd-journal-dev : Utopic (14.10) : Ubuntu - Launchpad, URL::https://launchpad.net/ubuntu/utopic/%2Bpackage/libsystemd-journal-dev
Text::Debian -- Details of package libghc-libsystemd-journal-dev in jessie, URL::https://packages.debian.org/jessie/libghc-libsystemd-journal-dev
Text::Advertise on JournalDev | BuySellAds, URL::https://buysellads.com/buy/detail/231824
Text::JournalDev | LinkedIn, URL::https://www.linkedin.com/groups/JournalDev-6748558
Text::How to install libsystemd-journal-dev package in Ubuntu Trusty, URL::https://www.howtoinstall.co/en/ubuntu/trusty/main/libsystemd-journal-dev/
Text::[global] auth supported = cephx ms bind ipv6 = true [mon] mon data ..., URL::https://zooi.widodh.nl/ceph/ceph.conf
Text::UbuntuUpdates - Package "libsystemd-journal-dev" (trusty 14.04), URL::https://www.ubuntuupdates.org/libsystemd-journal-dev
Text::[Journal]Dev'err - Cursus Honorum - Enjin, URL::https://cursushonorum.enjin.com/holonet/m/23958869/viewthread/13220130-journaldeverr/post/last
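
The program trims each result's href with fixed substring offsets. Judging from the sample output, those hrefs appear to have the form /url?q=<target>&<other parameters>, with the target still percent-encoded (note the %2B in the Launchpad URL). The sketch below shows one way to make that extraction more defensive, under that assumption about the href format; ResultLinkParser and extractTarget are hypothetical names introduced here, and only the standard library is used.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper (not part of the original program).
final class ResultLinkParser {

    // Assumed redirect prefix, inferred from the sample output
    private static final String PREFIX = "/url?q=";

    // Returns the decoded target URL, or the href unchanged when it does not
    // match the assumed "/url?q=<target>&..." form.
    static String extractTarget(String href) {
        if (href == null || !href.startsWith(PREFIX)) {
            return href;
        }
        int end = href.indexOf('&', PREFIX.length());
        String raw = (end == -1)
                ? href.substring(PREFIX.length())
                : href.substring(PREFIX.length(), end);
        try {
            // Turns percent-escapes such as %2B back into characters.
            // Note: URLDecoder also maps a literal '+' to a space.
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform
            throw new IllegalStateException(e);
        }
    }
}
```

Inside the loop, the printed URL could then come from ResultLinkParser.extractTarget(result.attr("href")), which avoids an exception when an href contains no '&'.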

That’s all for running a Google search from a Java program. Use it cautiously: if Google detects unusual traffic from your computer, it will probably block you.
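
One simple way to keep traffic modest is to enforce a minimum pause between consecutive requests. The sketch below illustrates the idea; RequestThrottle is a hypothetical class introduced here, and the interval you pass in is your own guess, since Google does not publish a threshold in anything quoted above.

```java
// Hypothetical helper (not part of the original program): spaces out requests.
final class RequestThrottle {

    private final long minIntervalMillis;
    private long lastRequestMillis;
    private boolean hasRequested = false;

    RequestThrottle(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    // How long the caller should still wait at time nowMillis (0 if a request may go out now).
    long millisToWait(long nowMillis) {
        if (!hasRequested) {
            return 0;
        }
        long elapsed = nowMillis - lastRequestMillis;
        return Math.max(0, minIntervalMillis - elapsed);
    }

    // Records that a request was sent at time nowMillis.
    void markRequest(long nowMillis) {
        lastRequestMillis = nowMillis;
        hasRequested = true;
    }

    // Blocks until the minimum interval has passed, then records the request.
    void acquire() throws InterruptedException {
        long wait = millisToWait(System.currentTimeMillis());
        if (wait > 0) {
            Thread.sleep(wait);
        }
        markRequest(System.currentTimeMillis());
    }
}
```

Calling acquire() on a shared RequestThrottle before each Jsoup.connect(...).get() would space the searches out. It reduces the request rate but does not guarantee that Google will not block the client.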

Translated from: https://www.journaldev.com/7207/google-search-from-java-program-example
