Background
While writing the MapReduce DistributedCount code in IDEA with Maven, I ran into a problem:
Running the package command on the project produced the following error:
[ERROR] Failed to execute goal org.apache.avro:avro-maven-plugin:1.7.7:schema (default) on project mapreduce_learn: neither sourceDirectory: H:\Source_Code\mapreduce_learn\src\main\avro or testSourceDirectory: H:\Source_Code\mapreduce_learn\src\test\avro
pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.akun</groupId>
<artifactId>mapreduce_learn</artifactId>
<version>1.0-SNAPSHOT</version>
<dependencies>
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-mapreduce-client-core -->
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
<version>1.8.0</version>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro-mapred</artifactId>
<version>1.8.0</version>
</dependency>
...
<build>
<plugins>
<plugin>
<groupId>org.apache.avro</groupId>
<artifactId>avro-maven-plugin</artifactId>
<version>1.7.7</version>
<executions>
<execution>
<phase>generate-sources</phase>
<goals>
<goal>schema</goal>
</goals>
<configuration>
<sourceDirectory>${project.basedir}/src/main/avro</sourceDirectory>
<outputDirectory>${project.basedir}/src/main/java</outputDirectory>
</configuration>
</execution>
</executions>
</plugin>
...
</plugins>
</build>
</project>
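Solution
The error occurred because the project had no src/main/avro folder and no AVSC (.avsc) schema files, so the avro-maven-plugin found neither of the directories named in the error message. Creating src/main/avro and placing schema files such as the two below in it resolved the problem.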
person.avsc
{"namespace": "com.twq.avro",
"type": "record",
"name": "Person",
"fields": [
{"name": "name", "type": "string"},
{"name": "age", "type": ["int", "null"]},
{"name": "favorite_number", "type": ["int", "null"]},
{"name": "favorite_color", "type": ["string", "null"]}
]
}
ncdcrecord.avsc
{"namespace": "com.twq.avro",
"type": "record",
"name": "NcdcRecord",
"fields": [
{"name": "stationId", "type": "string"},
{"name": "stationName", "type": "string"},
{"name": "stationCity", "type": "string"},
{"name": "stationState", "type": "string"},
{"name": "stationICAO", "type": "string"},
{"name": "stationLatitude", "type": "string"},
{"name": "stationLongitude", "type": "string"},
{"name": "stationElev", "type": "string"},
{"name": "stationBeginTime", "type": "string"},
{"name": "stationEndTime", "type": "string"},
{"name": "year", "type": "string"},
{"name": "month", "type": "string"},
{"name": "day", "type": "string"},
{"name": "meanTemp", "type": "double"},
{"name": "meanTempCount", "type": "int"},
{"name": "meanDewPointTemp", "type": "double"},
{"name": "meanDewPointTempCount", "type": "int"},
{"name": "meanSeaLevelPressure", "type": "double"},
{"name": "meanSeaLevelPressureCount", "type": "int"},
{"name": "meanStationPressure", "type": "double"},
{"name": "meanStationPressureCount", "type": "int"},
{"name": "meanVisibility", "type": "double"},
{"name": "meanVisibilityCount", "type": "int"},
{"name": "meanWindSpeed", "type": "double"},
{"name": "meanWindSpeedCount", "type": "int"},
{"name": "maxSustainedWindSpeed", "type": "double"},
{"name": "maxGustWindSpeed", "type": "double"},
{"name": "maxTemp", "type": "double"},
{"name": "maxTempFlag", "type": "string"},
{"name": "minTemp", "type": "double"},
{"name": "minTempFlag", "type": "string"},
{"name": "totalPrecipitation", "type": "double"},
{"name": "totalPrecipitationFlag", "type": "string"},
{"name": "snowDepth", "type": "double"},
{"name": "hasFog", "type": "boolean"},
{"name": "hasRain", "type": "boolean"},
{"name": "hasSnow", "type": "boolean"},
{"name": "hasHail", "type": "boolean"},
{"name": "hasThunder", "type": "boolean"},
{"name": "hasTornado", "type": "boolean"}
]
}
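The failure condition described in the error message can be checked before running package. The following is a minimal sketch, assuming the plugin fails only when neither src/main/avro nor src/test/avro is a directory, as the message suggests. The class name AvroSourceDirCheck and both method names are hypothetical helpers for illustration; they are not part of Maven or Avro.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical preflight helper; not part of Maven or the avro-maven-plugin.
class AvroSourceDirCheck {

    // Returns true if at least one of the directories named in the error
    // message exists under the given project root.
    static boolean hasAvroSourceDir(Path projectRoot) {
        Path mainAvro = projectRoot.resolve("src/main/avro");
        Path testAvro = projectRoot.resolve("src/test/avro");
        return Files.isDirectory(mainAvro) || Files.isDirectory(testAvro);
    }

    // Creates src/main/avro (and any missing parents) and returns its path.
    static Path ensureMainAvroDir(Path projectRoot) throws IOException {
        return Files.createDirectories(projectRoot.resolve("src/main/avro"));
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the project root is passed as the first argument,
        // otherwise the current working directory is used.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        if (hasAvroSourceDir(root)) {
            System.out.println("An Avro source directory already exists.");
        } else {
            Path created = ensureMainAvroDir(root);
            System.out.println("Created " + created + "; put the .avsc files there.");
        }
    }
}
```

Running it once against the project root creates the missing folder; the schema files such as person.avsc still have to be copied in by hand.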
Why?
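First, we need to know what Avro is for.
Avro
Apache Avro: a general-purpose serializer for big data
Apache Avro, or Avro for short, is a serialization format that is independent of any programming language. Doug Cutting created the project to provide a way to share data files.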
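To make "independent of any programming language" a little more concrete: the format fixes the bytes on the wire, so any language that follows the same rules produces the same output. The sketch below is only an illustration, assuming the binary encoding rules as I understand them from the Avro specification (integers as zig-zag variable-length values, strings as a length followed by UTF-8 bytes). The class AvroEncodingSketch is hypothetical; real code should use the encoders that ship with the Avro library.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of two primitive encodings; not the Avro library API.
class AvroEncodingSketch {

    // Encodes a long as a zig-zag value followed by variable-length bytes.
    static byte[] encodeLong(long n) {
        long v = (n << 1) ^ (n >> 63); // zig-zag: small magnitudes become small unsigned values
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((v & ~0x7FL) != 0) {
            out.write((int) ((v & 0x7F) | 0x80)); // low 7 bits, continuation bit set
            v >>>= 7;
        }
        out.write((int) v); // final byte, continuation bit clear
        return out.toByteArray();
    }

    // Encodes a string as its UTF-8 byte length (as a long) followed by the bytes.
    static byte[] encodeString(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        byte[] length = encodeLong(utf8.length);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(length, 0, length.length);
        out.write(utf8, 0, utf8.length);
        return out.toByteArray();
    }
}
```

Because the schema already tells the reader which field comes next and what type it has, the encoded data carries no field names or type tags, which is part of why the .avsc files are needed at build time.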
Reference 1
Reference 2