[Java Web Crawler] 008 - Parsing Web Page Content: JSON Parsing

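I. Cleaning Up JSON

1. Overview

When a web crawler sends a request to a server, the data that comes back is often a string that contains JSON rather than pure JSON, as shown below: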

jQuery6({
	"id":"07",
	"language":"C++",
	"edition":"second",
	"author":"E.Balagurusamy"
})

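Although the string above contains JSON, tools such as org.json, Gson, and Fastjson cannot parse it directly, because it carries extra characters at its head and tail ("jQuery6(" and ")"), a wrapper typical of JSONP callback responses. To make it parseable, preprocess it (trim the head and tail) so that it becomes a standard JSON string.

2. Code Example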

package com.zb.book.parse;

// Preprocess a response that wraps JSON in a callback
public class ParseJSON {
    public static void main(String[] args) {
        // Raw response: JSON wrapped in a callback
        String json = "jQuery6({\n" +
                "\t\"id\":\"07\",\n" +
                "\t\"language\":\"C++\",\n" +
                "\t\"edition\":\"second\",\n" +
                "\t\"author\":\"E.Balagurusamy\"\n" +
                "})";
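        // Preprocess: keep only the text between the first "(" and the last ")",
        // so a "(" inside a JSON value cannot truncate the result
        String attr = json.substring(json.indexOf('(') + 1, json.lastIndexOf(')'));
        System.out.println(attr);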
    }
}

Output:

{
	"id":"07",
	"language":"C++",
	"edition":"second",
	"author":"E.Balagurusamy"
}
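
The trimming step can also be packaged as a small reusable helper. The following is a minimal sketch under stated assumptions, not a definitive implementation: the class name JsonpUnwrapper and the method unwrap are hypothetical, and the handling of a trailing semicolon assumes that some servers terminate the callback call that way. It returns the text between the first "(" and the last ")", and leaves input that already starts like JSON untouched.

```java
class JsonpUnwrapper {

    // Hypothetical helper (not part of any library): strips a callback wrapper
    // such as jQuery6( ... ) and returns the JSON text inside it.
    static String unwrap(String response) {
        String text = response.trim();
        // Assumption: some servers end the callback call with a semicolon.
        if (text.endsWith(";")) {
            text = text.substring(0, text.length() - 1).trim();
        }
        // Text that already starts like JSON is returned as-is.
        if (text.startsWith("{") || text.startsWith("[")) {
            return text;
        }
        int open = text.indexOf('(');
        int close = text.lastIndexOf(')');
        // A wrapper needs "(" somewhere and ")" as the final character.
        if (open >= 0 && close > open && close == text.length() - 1) {
            return text.substring(open + 1, close).trim();
        }
        return text;
    }

    public static void main(String[] args) {
        String response = "jQuery6({\"id\":\"07\",\"language\":\"C++\"});";
        System.out.println(unwrap(response)); // prints {"id":"07","language":"C++"}
    }
}
```

Using the last ")" rather than the first one matters when a JSON value itself contains parentheses, for example a formula or a function-like string.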

 

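3. Additional Note

The preprocessed JSON string can be pasted into an online JSON validator to confirm that it is well-formed (a common practice).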
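
A crawler can also run a quick structural check of its own before handing the string to a parser. The sketch below is only a rough sanity check and not a JSON validator: it assumes that a usable payload starts with "{" or "[" and that braces and brackets outside string literals are balanced. The class name JsonShapeCheck and the method looksBalanced are hypothetical names chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class JsonShapeCheck {

    // Returns true if the text starts with '{' or '[' and every brace or bracket
    // outside a string literal is closed in the right order.
    // This does NOT check keys, commas, numbers, or any other JSON grammar rule.
    static boolean looksBalanced(String text) {
        String s = text.trim();
        if (s.isEmpty() || (s.charAt(0) != '{' && s.charAt(0) != '[')) {
            return false;
        }
        Deque<Character> expected = new ArrayDeque<>();
        boolean inString = false;
        boolean escaped = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (inString) {
                // Inside a string literal: only track escapes and the closing quote.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            switch (c) {
                case '"': inString = true; break;
                case '{': expected.push('}'); break;
                case '[': expected.push(']'); break;
                case '}':
                case ']':
                    if (expected.isEmpty() || expected.pop() != c) {
                        return false;
                    }
                    break;
                default:
                    break;
            }
        }
        return expected.isEmpty() && !inString;
    }

    public static void main(String[] args) {
        System.out.println(looksBalanced("jQuery6({\"id\":\"07\"})")); // prints false
        System.out.println(looksBalanced("{\"id\":\"07\"}"));          // prints true
    }
}
```

A false result on the raw response and a true result after preprocessing is a cheap signal that the wrapper was removed correctly.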

 

II. Parsing JSON with org.json

1. Overview

org.json is a commonly used JSON parsing tool for Java; its two most frequently used classes are JSONObject and JSONArray.

2. Maven Coordinates

<!-- https://mvnrepository.com/artifact/org.json/json -->
<dependency>
    <groupId>org.json</groupId>
    <artifactId>json</artifactId>
    <version>20200518</version>
</dependency>

 

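3. The JSONObject Class

The JSONObject class parses a JSON object: it can be constructed from a JSON string and provides getter methods for reading values by key.

4. Code Demonstration

Note:

When parsing data, the getString(String key) method is commonly used to get the value that corresponds to a key in the JSON data. It throws a JSONException when the key is missing or the value is not a string; optString(String key) returns an empty string for a missing key instead.

Code: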

package com.zb.book.parse;

import org.json.JSONObject;

// Parse a JSON object with org.json
public class ParseJSON {
    public static void main(String[] args) {
        // Raw response: JSON wrapped in a callback
        String json = "jQuery6({\n" +
                "\t\"id\":\"07\",\n" +
                "\t\"language\":\"C++\",\n" +
                "\t\"edition\":\"second\",\n" +
                "\t\"author\":\"E.Balagurusamy\"\n" +
                "})";
        // Preprocess: keep only the text between the first "(" and the last ")"
        String attr = json.substring(json.indexOf('(') + 1, json.lastIndexOf(')'));
        // Parse
        JSONObject jsonObject = new JSONObject(attr);
        System.out.println(jsonObject.getString("id"));
        System.out.println(jsonObject.getString("language"));
        System.out.println(jsonObject.getString("edition"));
        System.out.println(jsonObject.getString("author"));
    }
}

Output:

07
C++
second
E.Balagurusamy

 

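5. The JSONArray Class

Overview:

The JSONArray class parses JSON arrays. It provides constructors for creating a JSONArray object and methods such as getJSONObject(int index) for retrieving the JSONObject at a given position.

Code demonstration: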

package com.zb.book.parse;

import org.json.JSONArray;
import org.json.JSONObject;

// Parse a JSON array with org.json
public class ParseJSON {
    public static void main(String[] args) {
        // Raw response: a JSON array wrapped in a callback
        String json = "jQuery6([{\n" +
                "\t\"id\":\"07\",\n" +
                "\t\"language\":\"C++\",\n" +
                "\t\"edition\":\"second\",\n" +
                "\t\"author\":\"E.Balagurusamy\"\n" +
                "},{\n" +
                "\t\"id\":\"08\",\n" +
                "\t\"language\":\"C++\",\n" +
                "\t\"edition\":\"second\",\n" +
                "\t\"author\":\"E.Balagurusamy\"\n" +
                "}])";
        // Preprocess: keep only the text between the first "(" and the last ")"
        String attr = json.substring(json.indexOf('(') + 1, json.lastIndexOf(')'));
        // Parse
        JSONArray jsonArray = new JSONArray(attr);
        for (int i = 0; i < jsonArray.length(); i++) {
            JSONObject jsonObject = jsonArray.getJSONObject(i);
            System.out.println(jsonObject.getString("id"));
            System.out.println(jsonObject.getString("language"));
            System.out.println(jsonObject.getString("edition"));
            System.out.println(jsonObject.getString("author"));
            System.out.println();
        }
    }
}

Output:

07
C++
second
E.Balagurusamy

08
C++
second
E.Balagurusamy
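
A crawler does not always know in advance whether a payload is a JSON object or a JSON array, and that decides whether JSONObject or JSONArray is the right class to construct. One simple way to choose, sketched here with the hypothetical names JsonRootKind and detect, is to look at the first non-whitespace character of the preprocessed string. The sketch uses only the standard library and assumes the callback wrapper has already been removed.

```java
class JsonRootKind {

    enum Kind { OBJECT, ARRAY, OTHER }

    // Classifies preprocessed JSON text by its first non-whitespace character:
    // '{' means a JSON object, '[' means a JSON array, anything else is OTHER.
    static Kind detect(String json) {
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (Character.isWhitespace(c)) {
                continue;
            }
            if (c == '{') {
                return Kind.OBJECT;
            }
            if (c == '[') {
                return Kind.ARRAY;
            }
            return Kind.OTHER;
        }
        // Empty or whitespace-only input.
        return Kind.OTHER;
    }

    public static void main(String[] args) {
        System.out.println(detect("{\"id\":\"07\"}"));    // prints OBJECT
        System.out.println(detect(" [{\"id\":\"07\"}]")); // prints ARRAY
    }
}
```

With the result in hand, the calling code would construct a JSONObject for OBJECT and a JSONArray for ARRAY, and treat OTHER as a sign that preprocessing is still needed.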

 

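III. Parsing JSON with Gson

1. Overview

Gson is a Java library from Google for handling JSON data; it is mainly used to convert between Java objects and JSON.

2. Maven Coordinates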

<!-- https://mvnrepository.com/artifact/com.google.code.gson/gson -->
<dependency>
    <groupId>com.google.code.gson</groupId>
    <artifactId>gson</artifactId>
    <version>2.8.6</version>
</dependency>

 

3. Code Examples

Parsing a JSON object:

package com.zb.book.parse.gson;

import com.google.gson.Gson;

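// Parse a JSON object with Gson
public class ParseJSONObject {
    public static void main(String[] args) {
        // Create the Gson instance
        Gson gson = new Gson(); // option 1
        // Gson gson = new GsonBuilder().create(); // option 2 (requires importing com.google.gson.GsonBuilder)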
        String json = "{\n" +
                "\t\"name\":\"Java程序设计\",\n" +
                "\t\"page\":\"800\",\n" +
                "\t\"money\":\"98\"\n" +
                "}";
        // Deserialize into a Book object
        Book book = gson.fromJson(json, Book.class);
        System.out.println(book.toString());
        //book{name='Java程序设计', page='800', money='98'}
    }
}
class Book {
    private String name;
    private String page;
    private String money;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getPage() {
        return page;
    }

    public void setPage(String page) {
        this.page = page;
    }

    public String getMoney() {
        return money;
    }

    public void setMoney(String money) {
        this.money = money;
    }

    @Override
    public String toString() {
        return "book{" +
                "name='" + name + '\'' +
                ", page='" + page + '\'' +
                ", money='" + money + '\'' +
                '}';
    }
}

Parsing an array of JSON objects:

package com.zb.book.parse.gson;

import com.google.gson.Gson;
import com.google.gson.reflect.TypeToken;

import java.lang.reflect.Type;
import java.util.List;

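// Parse an array of JSON objects with Gson
public class ParseJSONObject {
    public static void main(String[] args) {
        // Create the Gson instance
        Gson gson = new Gson(); // option 1
        // Gson gson = new GsonBuilder().create(); // option 2 (requires importing com.google.gson.GsonBuilder)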
        String json = "[{\n" +
                "\t\"name\":\"Java程序设计\",\n" +
                "\t\"page\":\"800\",\n" +
                "\t\"money\":\"98\"\n" +
                "},{\n" +
                "\t\"name\":\"C语言程序设计\",\n" +
                "\t\"page\":\"700\",\n" +
                "\t\"money\":\"88\"\n" +
                "}]";
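        // TypeToken captures the full generic type (List<Book>), which is otherwise erased at runtime
        Type listType = new TypeToken<List<Book>>() {}.getType();
        // Deserialize into a list
        List<Book> list = gson.fromJson(json, listType);
        // Print each element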
        for (Book book : list) {
            System.out.println(book.toString());
        }
        //book{name='Java程序设计', page='800', money='98'}
        //book{name='C语言程序设计', page='700', money='88'}
    }
}
class Book {
    private String name;
    private String page;
    private String money;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getPage() {
        return page;
    }

    public void setPage(String page) {
        this.page = page;
    }

    public String getMoney() {
        return money;
    }

    public void setMoney(String money) {
        this.money = money;
    }

    @Override
    public String toString() {
        return "book{" +
                "name='" + name + '\'' +
                ", page='" + page + '\'' +
                ", money='" + money + '\'' +
                '}';
    }
}

Parsing complex nested JSON data:

JSON example:

{
	"name" : "訾博",
	"tall" : "183CM",
	"books" : [
		{
		"name":"Java程序设计",
		"page":"800",
		"money":"98"
		},{
			"name":"C语言程序设计",
			"page":"700",
			"money":"88"
		}
	]
}

Java code:

package com.zb.book.parse.gson;

import com.google.gson.Gson;

import java.util.List;

// Parse nested JSON with Gson
public class ParseJSONObject {
    public static void main(String[] args) {
        // Create the Gson instance
        Gson gson = new Gson(); // option 1
        // Gson gson = new GsonBuilder().create(); // option 2 (requires importing com.google.gson.GsonBuilder)
        String json = "{\n" +
                "\t\"name\" : \"訾博\",\n" +
                "\t\"tall\" : \"183CM\",\n" +
                "\t\"books\" : [\n" +
                "\t\t{\n" +
                "\t\t\"name\":\"Java程序设计\",\n" +
                "\t\t\"page\":\"800\",\n" +
                "\t\t\"money\":\"98\"\n" +
                "\t\t},{\n" +
                "\t\t\t\"name\":\"C语言程序设计\",\n" +
                "\t\t\t\"page\":\"700\",\n" +
                "\t\t\t\"money\":\"88\"\n" +
                "\t\t}\n" +
                "\t]\n" +
                "}";
        // Deserialize into a People object; Gson also fills the nested books list
        People people = gson.fromJson(json, People.class);
        // Print
        System.out.println(people);
        List<Book> books = people.getBooks();
        for (Book book : books) {
            System.out.println(book);
        }
    }
}
class People{
    private String name;
    private String tall;
    private List<Book> books;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getTall() {
        return tall;
    }

    public void setTall(String tall) {
        this.tall = tall;
    }

    public List<Book> getBooks() {
        return books;
    }

    public void setBooks(List<Book> books) {
        this.books = books;
    }

    @Override
    public String toString() {
        return "People{" +
                "name='" + name + '\'' +
                ", tall='" + tall + '\'' +
                ", books=" + books +
                '}';
    }
}
class Book {
    private String name;
    private String page;
    private String money;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getPage() {
        return page;
    }

    public void setPage(String page) {
        this.page = page;
    }

    public String getMoney() {
        return money;
    }

    public void setMoney(String money) {
        this.money = money;
    }

    @Override
    public String toString() {
        return "book{" +
                "name='" + name + '\'' +
                ", page='" + page + '\'' +
                ", money='" + money + '\'' +
                '}';
    }
}

Output:

People{name='訾博', tall='183CM', books=[book{name='Java程序设计', page='800', money='98'}, book{name='C语言程序设计', page='700', money='88'}]}
book{name='Java程序设计', page='800', money='98'}
book{name='C语言程序设计', page='700', money='88'}

 

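IV. Parsing JSON with Fastjson

1. Overview

Fastjson is a high-performance, full-featured JSON library developed by Alibaba in Java.

Fastjson parses JSON in much the same way as Gson: both convert JSON data into JavaBean objects.

2. Maven Coordinates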

<!-- https://mvnrepository.com/artifact/com.alibaba/fastjson -->
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>fastjson</artifactId>
    <version>1.2.73</version>
</dependency>

 

3. Code Examples

Parsing a JSON object:

package com.zb.book.parse.fastjson;

import com.alibaba.fastjson.JSON;

// Parse a JSON object with Fastjson
public class ParseJSONObject {
    public static void main(String[] args) {
        String json = "{\n" +
                "\t\"name\":\"Java程序设计\",\n" +
                "\t\"page\":\"800\",\n" +
                "\t\"money\":\"98\"\n" +
                "}";
        // Parse the JSON object into a Book
        Book book = JSON.parseObject(json, Book.class);
        System.out.println(book.toString());
        //book{name='Java程序设计', page='800', money='98'}
    }
}
class Book {
    private String name;
    private String page;
    private String money;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getPage() {
        return page;
    }

    public void setPage(String page) {
        this.page = page;
    }

    public String getMoney() {
        return money;
    }

    public void setMoney(String money) {
        this.money = money;
    }

    @Override
    public String toString() {
        return "book{" +
                "name='" + name + '\'' +
                ", page='" + page + '\'' +
                ", money='" + money + '\'' +
                '}';
    }
}

Parsing a JSON array:

package com.zb.book.parse.fastjson;

import com.alibaba.fastjson.JSON;

import java.util.List;

// Parse a JSON array with Fastjson
public class ParseJSONObject {
    public static void main(String[] args) {
        String json = "[\n" +
                "\t{\n" +
                "\t\"name\":\"Java程序设计\",\n" +
                "\t\"page\":\"800\",\n" +
                "\t\"money\":\"98\"\n" +
                "\t},{\n" +
                "\t\t\"name\":\"C语言程序设计\",\n" +
                "\t\t\"page\":\"700\",\n" +
                "\t\t\"money\":\"88\"\n" +
                "\t}\n" +
                "]";
        // Parse the JSON array into a list of Book objects
        // Option 1
        List<Book> books = JSON.parseArray(json, Book.class);
        // Option 2 (requires importing com.alibaba.fastjson.TypeReference)
        //List<Book> books = JSON.parseObject(json, new TypeReference<List<Book>>() {});
        for (Book book : books) {
            System.out.println(book.toString());
        }
        //book{name='Java程序设计', page='800', money='98'}
        //book{name='C语言程序设计', page='700', money='88'}
    }
}
class Book {
    private String name;
    private String page;
    private String money;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getPage() {
        return page;
    }

    public void setPage(String page) {
        this.page = page;
    }

    public String getMoney() {
        return money;
    }

    public void setMoney(String money) {
        this.money = money;
    }

    @Override
    public String toString() {
        return "book{" +
                "name='" + name + '\'' +
                ", page='" + page + '\'' +
                ", money='" + money + '\'' +
                '}';
    }
}
