Before processing:
After processing:
Code implementation:
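Read the contents of the file. Each URL is paired with a site name, so lines that contain no URL are filtered out. If a line has a URL but no site name, only the URL is extracted and the name is left empty.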
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public static String readStringFromtxt(String txtpath) {
    StringBuilder result = new StringBuilder();
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader br = new BufferedReader(new FileReader(txtpath))) {
        String s;
        while ((s = br.readLine()) != null) {
            // Skip lines that do not contain a URL
            if (s.contains("http")) {
                // Extract the URL
                String url = getUrl(s);
                if (url.isEmpty()) {
                    // "http" appeared in the line, but no well-formed URL was matched
                    continue;
                }
                // Whatever remains after removing the URL is the site name
                String name = s.replace(url, "").trim();
                // Assemble the data into the desired format
                result.append(System.lineSeparator())
                      .append("{\n\t\"name\": \"").append(name)
                      .append("\",\n\t\"type\": \"url\",\n\t\"url\": \"").append(url)
                      .append("\"\n},");
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return result.toString();
}
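The loop above appends a trailing comma after every entry and inserts the site name verbatim, so the result is a fragment to paste into a larger file, not a standalone JSON document. If the goal were a self-contained JSON array, one possible variation is sketched below. The class name `BookmarkJsonSketch` and the helpers `escape`, `toEntry`, and `toJsonArray` are hypothetical names introduced for illustration, and the escaping only covers backslashes and double quotes, which is an assumption about what the input may contain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BookmarkJsonSketch {
    // Same URL pattern as the article's getUrl method
    private static final Pattern URL_PATTERN = Pattern.compile(
            "(https?|ftp|file)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+[-A-Za-z0-9+&@#/%=~_|]",
            Pattern.CASE_INSENSITIVE);

    // Escape backslashes and double quotes so the value stays valid inside a JSON string
    public static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Format one entry in the same layout as the article, without a trailing comma
    public static String toEntry(String name, String url) {
        return "{\n\t\"name\": \"" + escape(name) + "\",\n\t\"type\": \"url\",\n\t\"url\": \""
                + escape(url) + "\"\n}";
    }

    // Build a JSON array from raw lines, skipping lines that contain no URL
    public static String toJsonArray(List<String> lines) {
        List<String> entries = new ArrayList<>();
        for (String line : lines) {
            Matcher m = URL_PATTERN.matcher(line);
            if (!m.find()) {
                continue;
            }
            String url = m.group();
            String name = line.replace(url, "").trim();
            entries.add(toEntry(name, url));
        }
        // Joining with ",\n" puts commas only between entries
        return "[\n" + String.join(",\n", entries) + "\n]";
    }
}
```

Collecting the entries in a list and joining them afterwards avoids having to strip the last comma by hand.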
Extract the URL with a regular expression:
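import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static String getUrl(String input) {
    // Scheme, then a run of URL characters, ending on a character that is not trailing punctuation
    String regex = "(https?|ftp|file)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+[-A-Za-z0-9+&@#/%=~_|]";
    Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(input);
    // Return the first match, or an empty string if the line contains no URL
    if (matcher.find()) {
        return matcher.group();
    }
    return "";
}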
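To see how the pattern behaves on a single line, here is a small self-contained sketch. The class name `UrlExtractDemo`, the helper `getName`, and the sample line are made up for illustration; only the regular expression comes from the method above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UrlExtractDemo {
    // Compiled once and reused, instead of recompiling on every call
    private static final Pattern URL_PATTERN = Pattern.compile(
            "(https?|ftp|file)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+[-A-Za-z0-9+&@#/%=~_|]",
            Pattern.CASE_INSENSITIVE);

    // Return the first URL found in the input, or "" when there is none
    public static String getUrl(String input) {
        Matcher matcher = URL_PATTERN.matcher(input);
        return matcher.find() ? matcher.group() : "";
    }

    // Treat whatever is left after removing the URL as the site name
    public static String getName(String line) {
        String url = getUrl(line);
        return url.isEmpty() ? line.trim() : line.replace(url, "").trim();
    }

    public static void main(String[] args) {
        String line = "Example Site https://example.com/path?q=1";
        System.out.println(getUrl(line));   // prints https://example.com/path?q=1
        System.out.println(getName(line));  // prints Example Site
    }
}
```

Because the last character class excludes punctuation such as `.` and `,`, a URL that ends a sentence is matched without the trailing period.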
Printed output: