1. Generating Java classes from avsc files with the avro-maven plugin
Add the following dependency and plugins to the project's pom.xml:
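<dependency>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro</artifactId>
  <version>1.8.1</version>
</dependency>
...
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <configuration>
    <source>1.6</source>
    <target>1.6</target>
  </configuration>
</plugin>
<plugin>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-maven-plugin</artifactId>
  <version>1.8.1</version>
  <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals>
        <goal>schema</goal>
      </goals>
      <configuration>
        <sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
        <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
      </configuration>
    </execution>
  </executions>
</plugin>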
Running mvn install at this point fails with:
[INFO] Final Memory: 16M/217M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.avro:avro-maven-plugin:1.8.1:schema (default) on project study: neither sourceDirectory: D:\fvp-workspace\study\src\main\avro or testSourceDirectory: D:\fvp-workspace\study\src\test\avro are directories -> [Help 1]
[ERROR]
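Note that the plugin does not create these directories for you. Create the avro folder manually under ${project.basedir}/src/main (and under ${project.basedir}/src/test if you keep test schemas there); as the message says, the goal fails when neither directory exists. The avro folder is where the Avro schema files (*.avsc) will live.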
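If you would rather script this step than click through the IDE, the directories can be created with a few lines of plain JDK code. This is only a minimal sketch: the class name CreateAvroDirs and the helper createAvroDirs are made up for illustration, and it assumes the standard Maven layout used above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CreateAvroDirs {

    /**
     * Creates src/main/avro and src/test/avro under baseDir if they are missing
     * and returns the main schema directory. (Hypothetical helper, not part of Avro.)
     */
    static Path createAvroDirs(Path baseDir) throws IOException {
        Path mainAvro = baseDir.resolve(Paths.get("src", "main", "avro"));
        Path testAvro = baseDir.resolve(Paths.get("src", "test", "avro"));
        // createDirectories is a no-op for directories that already exist.
        Files.createDirectories(mainAvro);
        Files.createDirectories(testAvro);
        return mainAvro;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java CreateAvroDirs <project base dir>");
            return;
        }
        Path mainAvro = createAvroDirs(Paths.get(args[0]));
        System.out.println("Schema directory: " + mainAvro.toAbsolutePath());
    }
}
```

Pass the project root (the directory containing pom.xml) as the only argument, then put the *.avsc files into the printed directory.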
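1.1 Defining the schema
Avro schemas are defined in JSON. A schema is built from primitive types (null, boolean, int, long, float, double, bytes and string) and complex types (record, enum, array, map, union and fixed). As an example, the following defines a schema for a user. Create an avro directory under main, then create the file user.avsc inside it: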
{"namespace": "com.sf.study.avro",
"type": "record",
"name": "User",
"fields": [
{"name": "name", "type": "string"},
{"name": "favorite_number", "type": ["int", "null"]},
{"name": "favorite_color", "type": ["string", "null"]}
]
}
[Screenshot: the project in the IDE]
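1.2 Generating class files from the schema
Because the avro plugin is configured, it is enough to run the following command; the Maven plugin generates the class files automatically: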
mvn clean install
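The corresponding class is then generated under the output directory configured earlier (${project.basedir}/src/main/java/), in the package named by the schema's namespace.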
If you do not use the plugin, the class can also be generated with avro-tools:
java -jar /path/to/avro-tools-1.8.1.jar compile schema <schema file> <destination>
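To get a feel for what the generated class looks like from the caller's side, here is a hand-written stand-in. It is not the real generated code (the real class extends Avro's SpecificRecordBase and contains much more); the names GeneratedUserSketch and SketchUser are invented for this sketch. It assumes the plugin's default settings, under which a "string" field is exposed as CharSequence and a union with "null" becomes a nullable boxed type.

```java
public class GeneratedUserSketch {

    /** Simplified stand-in for the generated User class; NOT the actual generated source. */
    static class SketchUser {
        private CharSequence name;          // "string"           -> CharSequence (assumed default)
        private Integer favoriteNumber;     // ["int", "null"]    -> Integer, may be null
        private CharSequence favoriteColor; // ["string", "null"] -> CharSequence, may be null

        SketchUser(CharSequence name, Integer favoriteNumber, CharSequence favoriteColor) {
            this.name = name;
            this.favoriteNumber = favoriteNumber;
            this.favoriteColor = favoriteColor;
        }

        CharSequence getName() { return name; }
        Integer getFavoriteNumber() { return favoriteNumber; }
        CharSequence getFavoriteColor() { return favoriteColor; }
    }

    public static void main(String[] args) {
        SketchUser user = new SketchUser("lisi", 7, null);

        // Getters return CharSequence, so convert to String before comparing.
        boolean isLisi = "lisi".equals(user.getName().toString());
        System.out.println("name is lisi: " + isLisi);

        // Union-with-null fields must be null-checked before use.
        CharSequence color = user.getFavoriteColor();
        System.out.println("favorite color: " + (color == null ? "(not set)" : color));
    }
}
```

The practical consequence is that code using the generated class should call toString() on string fields before comparing them with String values, and should expect null for optional fields.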
1.3 Using the generated class
With the class file in place, we can use the generated class to create some users:
package com.sf.study.avro;

public class CreateUserTest {
    public static void main(String[] args) {
        User user1 = new User();
        user1.setName("zhangsan");
        user1.setFavoriteNumber(256);
        // Leave favorite color null

        // Alternate constructor
        User user2 = new User("lisi", 7, "red");

        // Construct via builder
        User user3 = User.newBuilder()
                .setName("wangwu")
                .setFavoriteColor("blue")
                .setFavoriteNumber(null)
                .build();
    }
}
1.4 Serialization
Serialize the users created above and store them in a file on disk:
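// Imports needed at the top of the class file:
import java.io.File;
import java.io.IOException;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.specific.SpecificDatumWriter;

// Serialize user1, user2 and user3 to disk
DatumWriter<User> userDatumWriter = new SpecificDatumWriter<User>(User.class);
DataFileWriter<User> dataFileWriter = new DataFileWriter<User>(userDatumWriter);
try {
    // The schema is written into the file header, followed by the records
    dataFileWriter.create(user1.getSchema(), new File("users.avro"));
    dataFileWriter.append(user1);
    dataFileWriter.append(user2);
    dataFileWriter.append(user3);
    dataFileWriter.close();
} catch (IOException e) {
    e.printStackTrace();
}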
Here the users are serialized to the file users.avro.
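A quick way to sanity-check the result without any Avro classes is to look at the first bytes of users.avro: Avro object container files begin with the four bytes 'O', 'b', 'j', 1. The sketch below does just that with the plain JDK. The class name AvroFileCheck and the method looksLikeAvroContainer are made up for illustration, and passing this check only means the header looks right, not that the whole file is valid.

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Arrays;

public class AvroFileCheck {

    // Magic bytes at the start of an Avro object container file: 'O', 'b', 'j', 1.
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    /** Returns true if the file starts with the Avro container magic bytes. (Hypothetical helper.) */
    static boolean looksLikeAvroContainer(File file) throws IOException {
        if (file.length() < MAGIC.length) {
            return false;
        }
        byte[] header = new byte[MAGIC.length];
        DataInputStream in = new DataInputStream(new FileInputStream(file));
        try {
            in.readFully(header); // read exactly the first four bytes
        } finally {
            in.close();
        }
        return Arrays.equals(header, MAGIC);
    }

    public static void main(String[] args) throws IOException {
        File file = new File(args.length > 0 ? args[0] : "users.avro");
        if (!file.isFile()) {
            System.out.println(file + " does not exist");
            return;
        }
        System.out.println(file + " starts with the Avro magic bytes: "
                + looksLikeAvroContainer(file));
    }
}
```

Run it from the directory containing users.avro, or pass the path as the first argument.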
1.5 Deserialization
Next, we deserialize the data written in the previous step:
// Imports needed at the top of the class file:
import java.io.File;
import java.io.IOException;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.io.DatumReader;
import org.apache.avro.specific.SpecificDatumReader;

public static void unserialize() {
    try {
        // Deserialize Users from disk
        DatumReader<User> userDatumReader = new SpecificDatumReader<User>(User.class);
        DataFileReader<User> dataFileReader =
                new DataFileReader<User>(new File("users.avro"), userDatumReader);
        User user = null;
        while (dataFileReader.hasNext()) {
            // Reuse user object by passing it to next(). This saves us from
            // allocating and garbage collecting many objects for files with
            // many items.
            user = dataFileReader.next(user);
            System.out.println(user);
        }
        // Release the underlying file handle once all records are read
        dataFileReader.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
The output is:
{"name": "zhangsan", "favorite_number": 256, "favorite_color": null}
{"name": "lisi", "favorite_number": 7, "favorite_color": "red"}
{"name": "wangwu", "favorite_number": null, "favorite_color": "blue"}