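Preface
Does this sound familiar? After working late every day, all you want is to go home and lie down, so you often forget to take your umbrella with you. If it rains the next morning, you end up soaked. That is especially true in Shenzhen, where I live and where it rains a lot: drop your guard for a moment and you are drenched. On reflection, this is easy to avoid. Take the umbrella home in the evening, bring it back out the next morning, and enjoy your breakfast in peace. The catch is remembering to take it home in the evening. As you know, people in IT are busy changing the world, so a small thing like an umbrella hardly registers and gets gloriously forgotten. At moments like that I think how nice it would be if someone reminded me every evening to take my umbrella home, which seems like a bit of a luxury. Since no person will do it, let a program do it.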
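Approach
This project is simply a weather reminder that tells our fellow IT workers what tomorrow's weather will be. The approach breaks down into the following steps.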
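- Crawl tomorrow's weather for Shenzhen from the web and parse it.
- Send an email reminder once the weather information has been parsed.
- Package the project and upload it to the server.
- Write a Linux scheduled task that runs the startup script on a schedule.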
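The first two steps above can be sketched as a small pipeline. This is only an illustrative sketch, not the project's code: `ReminderPipeline`, `Forecast`, and the `Function`/`Consumer` stand-ins for the crawler and the mailer are hypothetical names, and both the fetch and the mail step are stubbed so the example runs without network access.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch of the crawl -> parse -> mail flow; none of these names come from the project.
class ReminderPipeline {

    // Assumed shape of the parsed forecast: three plain strings.
    static final class Forecast {
        final String weather;
        final String temperature;
        final String wind;

        Forecast(String weather, String temperature, String wind) {
            this.weather = weather;
            this.temperature = temperature;
            this.wind = wind;
        }
    }

    private final Function<String, Forecast> crawler; // URL -> parsed forecast, or null on failure
    private final Consumer<String> mailer;            // receives the mail body

    ReminderPipeline(Function<String, Forecast> crawler, Consumer<String> mailer) {
        this.crawler = crawler;
        this.mailer = mailer;
    }

    // Runs one crawl-and-notify cycle; returns true if a reminder was handed to the mailer.
    boolean run(String url) {
        Forecast forecast = crawler.apply(url);
        if (forecast == null) {
            return false; // nothing to send if crawling or parsing failed
        }
        mailer.accept(format(forecast));
        return true;
    }

    // Builds the one-line mail body from the forecast fields.
    static String format(Forecast f) {
        return "Tomorrow: " + f.weather + ", " + f.temperature + ", " + f.wind;
    }

    public static void main(String[] args) {
        List<String> outbox = new ArrayList<>();
        // Stubbed crawler and mailer so the sketch runs offline.
        ReminderPipeline pipeline = new ReminderPipeline(
                url -> new Forecast("light rain", "24/19", "light breeze"),
                outbox::add);
        pipeline.run("http://example.invalid/forecast");
        System.out.println(outbox.get(0));
    }
}
```

Keeping the crawler and the mailer behind small interfaces like this means each one can be tested on its own, and a failed crawl simply results in no mail being sent.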
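Overall Architecture
The overall architecture consists of the Linux scheduled task and the processing done inside weather-service. Each day the system triggers the scheduled task, which automatically runs the specified script. The startup script in turn launches weather-service, which crawls the weather information and sends the email reminder.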
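Technologies
The project involves the following technologies.
- Crontab: the scheduled-task command.
- Shell script: used to write the startup script.
- weather-service uses the following:
  - Maven: builds the project.
  - HttpClient: fetches the web page.
  - JSoup: parses the web page.
  - JavaMail: sends the email.
  - log4j, slf4j: log output.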
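Source Code
The core modules of weather-service are the crawler module and the mail module. Automating execution additionally requires a Crontab scheduled task and a Shell script; the scheduled task launches the Shell script on schedule.
Crawler Module
This module fetches weather information from the internet and parses it.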
- WeatherCrawler
package com.hust.grid.weather;

import org.apache.http.HttpEntity;
import org.apache.http.HttpStatus;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.DefaultHttpRequestRetryHandler;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import com.hust.grid.bean.WeatherInfo;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class WeatherCrawler {
    private static final Logger logger = LoggerFactory.getLogger(WeatherCrawler.class);

    /**
     * Fetches the page at the given URL and parses tomorrow's weather from it.
     * Returns null if the request fails or the page cannot be parsed.
     */
    public WeatherInfo crawlWeather(String url) {
        WeatherInfo weatherInfo = null;
        // try-with-resources closes the client even if the request fails
        try (CloseableHttpClient client = HttpClients.custom()
                .setRetryHandler(DefaultHttpRequestRetryHandler.INSTANCE)
                .build()) {
            RequestConfig config = RequestConfig
                    .custom()
                    .setConnectionRequestTimeout(3000)
                    .setConnectTimeout(3000)
                    // read timeout of 30 minutes; generous for a single page
                    .setSocketTimeout(30 * 60 * 1000)
                    .build();
            HttpGet get = new HttpGet(url);
            // browser-like headers for the request
            get.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8");
            get.setHeader("Accept-Encoding", "gzip, deflate");
            get.setHeader("Accept-Language", "zh-CN,zh;q=0.8");
            get.setHeader("Host", "www.weather.com.cn");
            get.setHeader("Proxy-Connection", "keep-alive");
            get.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36");
            get.setConfig(config);
            // closing the response releases the underlying connection
            try (CloseableHttpResponse response = client.execute(get)) {
                int statusCode = response.getStatusLine().getStatusCode();
                if (statusCode == HttpStatus.SC_OK) {
                    HttpEntity entity = response.getEntity();
                    String content = EntityUtils.toString(entity, "UTF-8");
                    logger.debug("content =====> {}", content);
                    if (content != null) {
                        weatherInfo = parseResult(content);
                    }
                }
            }
        } catch (Exception e) {
            // pass the exception itself so the stack trace is logged
            logger.error("failed to crawl weather from " + url, e);
        }
        return weatherInfo;
    }
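    public WeatherInfo parseResult(String content) {
        Document document = Jsoup.parse(content);
        Element element = document.getElementById("7d");
        if (element == null) {
            logger.warn("forecast element #7d not found; the page layout may have changed");
            return null;
        }
        Elements elements = element.getElementsByTag("ul");
        if (elements.isEmpty()) {
            logger.warn("no <ul> found inside #7d");
            return null;
        }
        Elements lis = elements.get(0).getElementsByTag("li");
        // The list holds 7 days of forecasts; we only take tomorrow's entry (index 1).
        if (lis.size() < 2) {
            logger.warn("expected at least 2 forecast entries, found {}", lis.size());
            return null;
        }
        Element tomorrow = lis.get(1);
        logger.debug("tomorrow =====> {}", tomorrow);
        return parseWeatherInfo(tomorrow);
    }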
    private WeatherInfo parseWeatherInfo(Element element) {
        Elements weathers = element.getElementsByTag("p");
        String weather = weathers.get(0).text();
        String temp = weathers.get(1).text();
        String wind = weathers.get(2).text();
        WeatherInfo weatherInfo = new WeatherInfo(weather, temp, wind);
        logger.info("---------------------------------------------------------------------------------");
        logger.info("---------------------------------------------------------------------------------");
        logger.info("weather is " + weather);