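When the processors that ship with Apache NiFi do not cover a requirement, we have to write and publish a custom processor ourselves.
Below is a demo of a custom Processor that converts JSON data to Excel and writes the result to a local file.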
Implementation
1. Write the Java code
1.1 Import the Maven Apache NiFi dependencies (my NiFi version is 1.26.0)
<dependency>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-api</artifactId>
    <version>${nifi.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-utils</artifactId>
    <version>${nifi.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-mock</artifactId>
    <version>${nifi.version}</version>
    <scope>test</scope>
</dependency>
The processor code in the next step also imports classes from Apache POI (org.apache.poi, for XSSFWorkbook) and org.json, so those libraries have to be declared as dependencies in the pom as well.
1.2 Extend AbstractProcessor
package com.example.testjsontoexcel;
import org.apache.nifi.annotation.behavior.WritesAttribute;
import org.apache.nifi.annotation.behavior.WritesAttributes;
import org.apache.nifi.annotation.documentation.CapabilityDescription;
import org.apache.nifi.annotation.documentation.Tags;
import org.apache.nifi.annotation.lifecycle.OnScheduled;
import org.apache.nifi.components.PropertyDescriptor;
import org.apache.nifi.expression.ExpressionLanguageScope;
import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.*;
import org.apache.nifi.processor.exception.ProcessException;
import org.apache.nifi.processor.util.StandardValidators;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.json.JSONArray;
import org.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.io.*;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.*;
@CapabilityDescription("Converts JSON content to an Excel (XLSX) file.")
@Tags({"json", "excel", "xlsx", "convert"})
@WritesAttributes({
        @WritesAttribute(attribute="filename", description="Sets the filename to the configured Excel file name plus .xlsx"),
        @WritesAttribute(attribute="mime.type", description="Sets the mime type to application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")})
public class ConvertJsonToExcel extends AbstractProcessor {
    private static final Logger logger = LoggerFactory.getLogger(ConvertJsonToExcel.class);

    public static final PropertyDescriptor FILE_EXCEL_NAME = new PropertyDescriptor.Builder()
            .name("Excel File Name")
            .description("Name of the Excel output file, without the .xlsx extension. Defaults to output.")
            .required(true)
            .defaultValue("output")
            .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
            // onTrigger evaluates this property against the FlowFile, so the scope must allow FlowFile attributes
            .expressionLanguageSupported(ExpressionLanguageScope.FLOWFILE_ATTRIBUTES)
            .build();

    public static final PropertyDescriptor OUTPUT_PATH = new PropertyDescriptor.Builder()
            .name("Output Directory")
            .description("Directory the Excel file is written to. You can find the generated file under this path.")
            .required(true)
            .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
            .expressionLanguageSupported(ExpressionLanguageScope.VARIABLE_REGISTRY)
            .build();

    static final Relationship REL_SUCCESS = new Relationship.Builder()
            .name("success")
            .description("FlowFiles whose JSON content was converted and saved")
            .build();

    static final Relationship REL_FAILURE = new Relationship.Builder()
            .name("failure")
            .description("FlowFiles that could not be converted or saved")
            .build();

    private List<PropertyDescriptor> descriptors;
    private Set<Relationship> relationships;

    @Override
    protected void init(ProcessorInitializationContext context) {
        descriptors = Collections.unmodifiableList(Arrays.asList(FILE_EXCEL_NAME, OUTPUT_PATH));
        relationships = Collections.unmodifiableSet(new HashSet<>(Arrays.asList(REL_SUCCESS, REL_FAILURE)));
    }

    @Override
    public Set<Relationship> getRelationships() {
        return relationships;
    }

    @Override
    public final List<PropertyDescriptor> getSupportedPropertyDescriptors() {
        return descriptors;
    }

    @OnScheduled
    public void onScheduled(ProcessContext context) {
        // Initialization logic if needed
    }

    @Override
    public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
        FlowFile flowFile = session.get();
        if (flowFile == null) {
            return;
        }
        String filenamePre = context.getProperty(FILE_EXCEL_NAME).evaluateAttributeExpressions(flowFile).getValue();
        String outputPath = context.getProperty(OUTPUT_PATH).evaluateAttributeExpressions().getValue();
        try {
            ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
            // Close the content stream before the FlowFile is modified
            try (InputStream inputStream = session.read(flowFile)) {
                convertJsonToExcel(inputStream, outputStream);
            }
            byte[] excelBytes = outputStream.toByteArray();
            // Build the output file name and path
            String filename = filenamePre + ".xlsx";
            Path outputPathPath = Paths.get(outputPath, filename);
            // Make sure the output directory exists
            Files.createDirectories(outputPathPath.getParent());
            // Write the workbook to the local file
            Files.write(outputPathPath, excelBytes);
            // Store the workbook as the FlowFile content so it matches the mime.type attribute
            flowFile = session.write(flowFile, out -> out.write(excelBytes));
            // Every session call returns a new FlowFile reference, so keep the latest one
            flowFile = session.putAttribute(flowFile, "filename", filename);
            flowFile = session.putAttribute(flowFile, "mime.type", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
            session.transfer(flowFile, REL_SUCCESS);
            logger.info("Successfully converted JSON to Excel and saved at {}", outputPathPath);
        } catch (IOException | RuntimeException e) {
            // Malformed JSON raises an unchecked JSONException, so runtime exceptions are routed to failure too
            logger.error("Failed to convert JSON to Excel or save file", e);
            session.transfer(flowFile, REL_FAILURE);
        }
    }
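
    private void convertJsonToExcel(InputStream jsonInputStream, OutputStream excelOutputStream) throws IOException {
        // try-with-resources closes both the workbook and the reader
        try (Workbook workbook = new XSSFWorkbook();
             BufferedReader reader = new BufferedReader(new InputStreamReader(jsonInputStream, "UTF-8"))) {
            Sheet sheet = workbook.createSheet("Sheet1");
            // Read the whole stream so that multi-line (pretty-printed) JSON is parsed too
            StringBuilder json = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                json.append(line).append('\n');
            }
            // The content is expected to be a JSON array of objects
            JSONArray jsonArray = new JSONArray(json.toString());
            for (int i = 0; i < jsonArray.length(); i++) {
                // One row per JSON object; cells follow the key iteration order of that object
                JSONObject jsonObject = jsonArray.getJSONObject(i);
                Row row = sheet.createRow(i);
                Iterator<String> keys = jsonObject.keys();
                int colNum = 0;
                while (keys.hasNext()) {
                    String key = keys.next();
                    Object value = jsonObject.get(key);
                    Cell cell = row.createCell(colNum++);
                    if (value instanceof Number) {
                        // Numbers become numeric cells; everything else is stored as text
                        cell.setCellValue(((Number) value).doubleValue());
                    } else {
                        cell.setCellValue(value.toString());
                    }
                }
            }
            workbook.write(excelOutputStream);
        }
    }
}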
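Each PropertyDescriptor is a parameter, that is, a property we set when configuring the processor. Here I need two of them: the Excel output file name and the directory in which the file is saved.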
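One thing to watch in convertJsonToExcel: cells are filled in whatever order each JSONObject iterates its keys, and no header row is written. org.json documents JSONObject as an unordered collection, so the columns only line up if every object happens to yield its keys in the same order. Below is a minimal sketch of one way to pin the layout, assuming the records have first been turned into plain Map objects. ColumnLayout and its methods are hypothetical names used for illustration; they are not part of NiFi, POI, or org.json.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: turns a list of JSON-like records into a header plus aligned rows. */
final class ColumnLayout {

    /** Collects every key in first-seen order so each key maps to one fixed column. */
    static List<String> header(List<Map<String, Object>> records) {
        Set<String> keys = new LinkedHashSet<>();
        for (Map<String, Object> record : records) {
            keys.addAll(record.keySet());
        }
        return new ArrayList<>(keys);
    }

    /** Builds one row per record, leaving null where a record lacks a key. */
    static List<List<Object>> rows(List<String> header, List<Map<String, Object>> records) {
        List<List<Object>> rows = new ArrayList<>();
        for (Map<String, Object> record : records) {
            List<Object> row = new ArrayList<>();
            for (String key : header) {
                row.add(record.get(key));
            }
            rows.add(row);
        }
        return rows;
    }
}
```

With a layout like this, the header list could be written as row 0 of the sheet and each aligned row after it, with null entries left as empty cells.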
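1.3 Register the Processor
In the project's resources directory, create a META-INF directory, then a services directory inside it, and finally a file named org.apache.nifi.processor.Processor inside services. The file content is the fully qualified class name of the processor written above:
com.example.testjsontoexcel.ConvertJsonToExcel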
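The file above follows the provider-configuration format of Java's standard java.util.ServiceLoader mechanism: one fully qualified class name per line, with # starting a comment. As a rough illustration of how such a file is interpreted (a sketch of the format rules only, not NiFi's actual discovery code; ServiceFileParser is a hypothetical name):

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that mimics how a provider-configuration file is read. */
final class ServiceFileParser {

    /** Returns the class names listed in the file content, skipping comments and blank lines. */
    static List<String> providerNames(String fileContent) {
        List<String> names = new ArrayList<>();
        for (String line : fileContent.split("\\R")) {
            // Everything after '#' on a line is a comment
            int hash = line.indexOf('#');
            String name = (hash >= 0 ? line.substring(0, hash) : line).trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }
}
```

A stray space or a typo in the class name is a plausible reason for a custom processor not to show up in the processor list, so the file is worth double-checking before packaging.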
2. Package
2.1 Configure the Maven plugin used to build the nar package
Add the plugin to the build section of the pom:
<plugins>
    <plugin>
        <groupId>org.apache.nifi</groupId>
        <artifactId>nifi-nar-maven-plugin</artifactId>
        <version>1.5.1</version>
        <extensions>true</extensions>
    </plugin>
</plugins>
The packaging setting must also be added at the top level of the pom, so that the build produces a nar package that NiFi can recognize:
<packaging>nar</packaging>
2.2 Build with Maven
mvn clean package
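Before copying the archive anywhere, it can be useful to look inside it. A nar is a zip-format archive, so the standard java.util.zip classes can list its entries. The sketch below searches for entries by name suffix; NarInspector is a hypothetical name, and the exact location of files inside the archive is an assumption that may differ with the project layout (for example, files packed inside a bundled jar will not appear in this listing).

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper for inspecting a built nar (zip-format) archive. */
final class NarInspector {

    /** Returns the names of all archive entries whose path ends with the given suffix. */
    static List<String> entriesEndingWith(Path archive, String suffix) throws IOException {
        List<String> matches = new ArrayList<>();
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(suffix)) {
                    matches.add(name);
                }
            }
        }
        return matches;
    }
}
```

For instance, calling entriesEndingWith on the nar under target with the suffix org.apache.nifi.processor.Processor shows whether the service file made it into the top level of the archive.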
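2.3 Copy the built nar package into the lib directory of the NiFi installation
![](https://img-blog.csdnimg.cn/direct/dfa7f5708c7d41ed8deaff6171eea1be.png)
2.4 Restart NiFi
bin/nifi.sh restart # restart command
2.5 When adding a Processor, select our custom processor
3. Refer to the NiFi source code
For details on how Processors are written, refer to the official NiFi source code.
GitHub repository: apache/nifi (Apache NiFi)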