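1 Credit Card Transaction Fraud
Credit card transaction fraud is the act of obtaining or spending credit card funds through improper means, infringing on the property rights of the cardholder or the bank. It includes, but is not limited to, unauthorized charges on stolen card details, counterfeit cards, use of another person's card, and malicious overdrafts.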
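2 Simulated Scenario
We simulate credit card transaction records for several accounts, analyze the records in real time to detect a common fraud pattern, and raise an alert for each fraudulent transaction detected.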
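3 Core Workflow and Code
1) TransactionSource produces the transaction records. Each record has three fields, accountId, timestamp, and amount: the account ID, the timestamp, and the transaction amount.
2) keyBy then partitions the stream by accountId, so that transactions of the same account are checked together and transactions of different accounts are treated as unrelated.
3) FraudDetector performs the fraud detection.
The detection logic for this first version of the alerting program: for a given account, if a transaction of less than $1 is immediately followed by a transaction of more than $500, emit an alert.
In the example sequence, transactions 3 and 4 should be flagged as fraud, because transaction 3 is a small $0.09 transaction and the transaction right after it, transaction 4, is a large $510 one. Transactions 7, 8, and 9, on the other hand, are not fraudulent: the small $0.02 transaction 7 is followed not by a large transaction but by one of moderate size.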
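The rule can be tried out on a single account's amounts without Flink. The following is a minimal plain-Java sketch, not the Flink implementation; the class name SmallThenLargeRule and the sample amounts are hypothetical, and only the values at positions 3, 4, and 7 ($0.09, $510, $0.02) come from the text.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java sketch of the "small then large" rule for one account. */
class SmallThenLargeRule {
    static final double SMALL_AMOUNT = 1.00;
    static final double LARGE_AMOUNT = 500.00;

    /** Returns the 1-based positions of the transactions flagged as fraudulent. */
    static List<Integer> flaggedPositions(double[] amounts) {
        List<Integer> flagged = new ArrayList<>();
        boolean previousWasSmall = false;
        for (int i = 0; i < amounts.length; i++) {
            double amount = amounts[i];
            // A large transaction is suspicious only right after a small one.
            if (previousWasSmall && amount > LARGE_AMOUNT) {
                flagged.add(i + 1);
            }
            // The flag always describes the most recent transaction only.
            previousWasSmall = amount < SMALL_AMOUNT;
        }
        return flagged;
    }

    public static void main(String[] args) {
        // Hypothetical amounts, except positions 3, 4 and 7.
        double[] amounts = {13.00, 25.00, 0.09, 510.00, 102.62, 91.50, 0.02, 30.01, 701.83};
        System.out.println(flaggedPositions(amounts)); // prints [4]
    }
}
```

The last amount is large but is not flagged, because the transaction before it was not a small one.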
package com.example.frauddetection;

import com.example.frauddetection.common.entity.Alert;
import com.example.frauddetection.common.entity.Transaction;
import com.example.frauddetection.common.sink.AlertSink;
import com.example.frauddetection.common.source.TransactionSource;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FraudDetectionJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Source: an unbounded stream of simulated transactions.
        DataStream<Transaction> transactions = env
                .addSource(new TransactionSource())
                .name("transactions");

        // Partition by account so each account is checked independently.
        DataStream<Alert> alerts = transactions
                .keyBy(Transaction::getAccountId)
                .process(new FraudDetector())
                .name("fraud-detector");

        // Sink: print every alert.
        alerts
                .addSink(new AlertSink())
                .name("send-alerts");

        env.execute("Fraud Detection");
    }
}
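The fraud detector needs to remember information across multiple transaction events. A large transaction is considered fraudulent only when it immediately follows a small one. Storing information across events requires state, which is why we use a KeyedProcessFunction: it gives fine-grained control over both state and time, which lets us implement more complex algorithms in the rest of this walkthrough.
The most direct implementation is a boolean flag that records whether a small transaction has just been processed. When a large transaction arrives for that account, you only need to check the flag to see whether the previous transaction was small.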
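However, a flag kept as a plain member field of FraudDetector is not accurate. Flink processes the transactions of many accounts in the same parallel instance of FraudDetector. If accounts A and B are routed to the same instance, a small transaction from account A can set the flag to true, and a subsequent large transaction from account B can then be misclassified as fraud. We could keep a flag per account in a data structure such as a Map, but ordinary member fields are not fault tolerant: when the job fails and restarts, the information they held is lost, so a job that has ever restarted may miss some fraud alerts.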
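The false positive described above can be reproduced in a few lines of plain Java. This is an illustrative sketch only: the class PerAccountFlagDemo and its two methods are hypothetical names, and the Map variant fixes the mix-up between accounts but, as noted, still offers no fault tolerance.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Contrasts one shared flag with one flag per account. */
class PerAccountFlagDemo {
    static final double SMALL_AMOUNT = 1.00;
    static final double LARGE_AMOUNT = 500.00;

    /** A single flag shared by all accounts, like a plain member field. */
    static List<Long> alertsWithSharedFlag(long[] accountIds, double[] amounts) {
        List<Long> alerts = new ArrayList<>();
        boolean lastWasSmall = false;
        for (int i = 0; i < amounts.length; i++) {
            if (lastWasSmall && amounts[i] > LARGE_AMOUNT) {
                alerts.add(accountIds[i]);
            }
            lastWasSmall = amounts[i] < SMALL_AMOUNT;
        }
        return alerts;
    }

    /** One flag per account, kept in a map keyed by account id. */
    static List<Long> alertsWithPerAccountFlag(long[] accountIds, double[] amounts) {
        List<Long> alerts = new ArrayList<>();
        Map<Long, Boolean> lastWasSmall = new HashMap<>();
        for (int i = 0; i < amounts.length; i++) {
            long id = accountIds[i];
            if (lastWasSmall.getOrDefault(id, false) && amounts[i] > LARGE_AMOUNT) {
                alerts.add(id);
            }
            lastWasSmall.put(id, amounts[i] < SMALL_AMOUNT);
        }
        return alerts;
    }

    public static void main(String[] args) {
        // Account 1 makes a small payment, then account 2 makes a large one.
        long[] ids = {1L, 2L};
        double[] amounts = {0.50, 900.00};
        System.out.println(alertsWithSharedFlag(ids, amounts));     // prints [2]
        System.out.println(alertsWithPerAccountFlag(ids, amounts)); // prints []
    }
}
```

With the shared flag, account 2 is reported even though it never made a small transaction; with per-account flags it is not.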
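To address this, Flink provides primitives for fault-tolerant state that are almost as easy to use as regular member variables.
The most basic state type in Flink is ValueState, which adds fault tolerance to the variable it wraps. ValueState is a form of keyed state, meaning it is only available in operators applied in a keyed context, that is, any operator that immediately follows DataStream#keyBy. The keyed state of an operator is automatically scoped to the key of the record currently being processed. In this example, the key is the account ID of the current transaction (as declared by keyBy()), and FraudDetector maintains an independent flag for each account. ValueState is created with a ValueStateDescriptor, which carries metadata about how Flink should manage the variable. State must be registered before it is used, and the right place to do that is the open() method.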
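3.1 Adding Time to the Fraud Detector
Fraudsters do not wait long after the small test transaction before making the large purchase, because a short gap reduces the chance that the test transaction is noticed. Suppose you give the fraud detector a one-minute timeout: in the example above, transactions 3 and 4 are then considered fraud only if they occur within one minute of each other. Flink's KeyedProcessFunction lets you set timers that invoke a callback at some point in the future.
Let's see how to modify the program to meet the new requirement:
- When the flag is set to true, also set a timer that fires one minute from the current time.
- When the timer fires, reset the flag by clearing its state.
- When the flag is reset, delete the timer.
To delete a timer you have to remember the time it was set for, and remembering implies state, so alongside the flag state you also need a state that records the timer's timestamp.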
package com.example.frauddetection;

import com.example.frauddetection.common.entity.Alert;
import com.example.frauddetection.common.entity.Transaction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class FraudDetector extends KeyedProcessFunction<Long, Transaction, Alert> {

    private static final long serialVersionUID = 1L;

    private static final double SMALL_AMOUNT = 1.00;
    private static final double LARGE_AMOUNT = 500.00;
    private static final long ONE_MINUTE = 60 * 1000;

    // Keyed state: Flink keeps one value per account id.
    private transient ValueState<Boolean> flagState;
    private transient ValueState<Long> timerState;

    @Override
    public void open(Configuration parameters) {
        // Register both states before they are used.
        ValueStateDescriptor<Boolean> flagDescriptor = new ValueStateDescriptor<>(
                "flag",
                Types.BOOLEAN);
        flagState = getRuntimeContext().getState(flagDescriptor);

        ValueStateDescriptor<Long> timerDescriptor = new ValueStateDescriptor<>(
                "timer-state",
                Types.LONG);
        timerState = getRuntimeContext().getState(timerDescriptor);
    }

    @Override
    public void processElement(
            Transaction transaction,
            Context context,
            Collector<Alert> collector) throws Exception {

        // Get the current state for the current key
        Boolean lastTransactionWasSmall = flagState.value();

        // Check if the flag is set (null means it is not set)
        if (lastTransactionWasSmall != null) {
            if (transaction.getAmount() > LARGE_AMOUNT) {
                // Output an alert downstream
                Alert alert = new Alert();
                alert.setId(transaction.getAccountId());

                collector.collect(alert);
            }

            // Clean up our state
            cleanUp(context);
        }

        if (transaction.getAmount() < SMALL_AMOUNT) {
            // Set the flag to true
            flagState.update(true);

            // Register a timer one minute ahead and remember its timestamp
            long timer = context.timerService().currentProcessingTime() + ONE_MINUTE;
            context.timerService().registerProcessingTimeTimer(timer);
            timerState.update(timer);
        }
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Alert> out) {
        // Remove the flag after 1 minute
        timerState.clear();
        flagState.clear();
    }

    private void cleanUp(Context ctx) throws Exception {
        // Delete the timer
        Long timer = timerState.value();
        ctx.timerService().deleteProcessingTimeTimer(timer);

        // Clean up all state
        timerState.clear();
        flagState.clear();
    }
}
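The timeout behaviour of the detector above can be checked without a cluster. The sketch below is a plain-Java approximation for one account, not Flink code: the class TimedRuleSketch is a hypothetical name, arrival times are passed in explicitly and assumed to be non-decreasing, and a timer is assumed to have fired once the arrival time reaches its timestamp.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java approximation of the flag-plus-timer logic for one account. */
class TimedRuleSketch {
    static final double SMALL_AMOUNT = 1.00;
    static final double LARGE_AMOUNT = 500.00;
    static final long ONE_MINUTE = 60 * 1000;

    /** timesMillis[i] is the arrival time of amounts[i]; returns flagged 0-based indexes. */
    static List<Integer> flaggedIndexes(long[] timesMillis, double[] amounts) {
        List<Integer> flagged = new ArrayList<>();
        Long flagExpiry = null; // null: flag not set; otherwise the time the timer fires
        for (int i = 0; i < amounts.length; i++) {
            long now = timesMillis[i];
            // Emulates onTimer(): the timer already fired, so the flag is gone.
            if (flagExpiry != null && now >= flagExpiry) {
                flagExpiry = null;
            }
            if (flagExpiry != null) {
                if (amounts[i] > LARGE_AMOUNT) {
                    flagged.add(i);
                }
                // Emulates cleanUp(): drop the flag and its timer.
                flagExpiry = null;
            }
            if (amounts[i] < SMALL_AMOUNT) {
                // Set the flag together with its one-minute timer.
                flagExpiry = now + ONE_MINUTE;
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        double[] amounts = {0.50, 600.00};
        // Large transaction 30 seconds after the small one: flagged.
        System.out.println(flaggedIndexes(new long[] {0L, 30_000L}, amounts)); // prints [1]
        // Large transaction 90 seconds after the small one: the flag has expired.
        System.out.println(flaggedIndexes(new long[] {0L, 90_000L}, amounts)); // prints []
    }
}
```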
3.2 Supporting Classes
TransactionSource
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.example.frauddetection.common.source;

import com.example.frauddetection.common.entity.Transaction;
import org.apache.flink.annotation.Public;
import org.apache.flink.streaming.api.functions.source.FromIteratorFunction;

import java.io.Serializable;
import java.util.Iterator;

/**
 * A stream of transactions.
 *
 * @deprecated This class is based on the {@link
 *     org.apache.flink.streaming.api.functions.source.SourceFunction} API, which is due to be
 *     removed. Use the new {@link org.apache.flink.api.connector.source.Source} API instead.
 */
@Public
public class TransactionSource extends FromIteratorFunction<Transaction> {

    private static final long serialVersionUID = 1L;

    public TransactionSource() {
        super(new RateLimitedIterator<>(TransactionIterator.unbounded()));
    }

    private static class RateLimitedIterator<T> implements Iterator<T>, Serializable {

        private static final long serialVersionUID = 1L;

        private final Iterator<T> inner;

        private RateLimitedIterator(Iterator<T> inner) {
            this.inner = inner;
        }

        @Override
        public boolean hasNext() {
            return inner.hasNext();
        }

        @Override
        public T next() {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
            return inner.next();
        }
    }
}
TransactionIterator
package com.example.frauddetection.common.source;

import com.example.frauddetection.common.entity.Transaction;

import java.io.Serializable;
import java.sql.Timestamp;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** An iterator of transaction events. */
final class TransactionIterator implements Iterator<Transaction>, Serializable {

    private static final long serialVersionUID = 1L;

    private static final Timestamp INITIAL_TIMESTAMP = Timestamp.valueOf("2019-01-01 00:00:00");

    private static final long SIX_MINUTES = 6 * 60 * 1000;

    private final boolean bounded;

    private int index = 0;

    private long timestamp;

    static TransactionIterator bounded() {
        return new TransactionIterator(true);
    }

    static TransactionIterator unbounded() {
        return new TransactionIterator(false);
    }

    private TransactionIterator(boolean bounded) {
        this.bounded = bounded;
        this.timestamp = INITIAL_TIMESTAMP.getTime();
    }

    @Override
    public boolean hasNext() {
        if (index < data.size()) {
            return true;
        } else if (!bounded) {
            index = 0;
            return true;
        } else {
            return false;
        }
    }

    @Override
    public Transaction next() {
        Transaction transaction = data.get(index++);
        transaction.setTimestamp(timestamp);
        timestamp += SIX_MINUTES;
        return transaction;
    }

    private static List<Transaction> data =
            Arrays.asList(
                    new Transaction(1, 0L, 188.23),
                    new Transaction(2, 0L, 374.79),
                    new Transaction(3, 0L, 112.15),
                    new Transaction(4, 0L, 478.75),
                    new Transaction(5, 0L, 208.85),
                    new Transaction(1, 0L, 379.64),
                    new Transaction(2, 0L, 351.44),
                    new Transaction(3, 0L, 320.75),
                    new Transaction(4, 0L, 259.42),
                    new Transaction(5, 0L, 273.44),
                    new Transaction(1, 0L, 267.25),
                    new Transaction(2, 0L, 397.15),
                    new Transaction(3, 0L, 0.219),
                    new Transaction(4, 0L, 231.94),
                    new Transaction(5, 0L, 384.73),
                    new Transaction(1, 0L, 419.62),
                    new Transaction(2, 0L, 412.91),
                    new Transaction(3, 0L, 0.77),
                    new Transaction(4, 0L, 22.10),
                    new Transaction(5, 0L, 377.54),
                    new Transaction(1, 0L, 375.44),
                    new Transaction(2, 0L, 230.18),
                    new Transaction(3, 0L, 0.80),
                    new Transaction(4, 0L, 350.89),
                    new Transaction(5, 0L, 127.55),
                    new Transaction(1, 0L, 483.91),
                    new Transaction(2, 0L, 228.22),
                    new Transaction(3, 0L, 871.15),
                    new Transaction(4, 0L, 64.19),
                    new Transaction(5, 0L, 79.43),
                    new Transaction(1, 0L, 56.12),
                    new Transaction(2, 0L, 256.48),
                    new Transaction(3, 0L, 148.16),
                    new Transaction(4, 0L, 199.95),
                    new Transaction(5, 0L, 252.37),
                    new Transaction(1, 0L, 274.73),
                    new Transaction(2, 0L, 473.54),
                    new Transaction(3, 0L, 119.92),
                    new Transaction(4, 0L, 323.59),
                    new Transaction(5, 0L, 353.16),
                    new Transaction(1, 0L, 211.90),
                    new Transaction(2, 0L, 280.93),
                    new Transaction(3, 0L, 347.89),
                    new Transaction(4, 0L, 459.86),
                    new Transaction(5, 0L, 82.31),
                    new Transaction(1, 0L, 373.26),
                    new Transaction(2, 0L, 479.83),
                    new Transaction(3, 0L, 454.25),
                    new Transaction(4, 0L, 83.64),
                    new Transaction(5, 0L, 292.44));
}
AlertSink
package com.example.frauddetection.common.sink;

import com.example.frauddetection.common.entity.Alert;
import org.apache.flink.annotation.PublicEvolving;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** A sink for outputting alerts. */
@PublicEvolving
@SuppressWarnings("unused")
public class AlertSink implements SinkFunction<Alert> {

    private static final long serialVersionUID = 1L;

    private static final Logger LOG = LoggerFactory.getLogger(AlertSink.class);

    @Override
    public void invoke(Alert value, Context context) {
        // TODO: alerts are printed to the console for now instead of being logged
        // LOG.info(value.toString());
        System.out.println(value.toString());
    }
}
Transaction
package com.example.frauddetection.common.entity;

import java.util.Objects;

/** A simple transaction. */
@SuppressWarnings("unused")
public final class Transaction {

    private long accountId;

    private long timestamp;

    private double amount;

    public Transaction() {}

    public Transaction(long accountId, long timestamp, double amount) {
        this.accountId = accountId;
        this.timestamp = timestamp;
        this.amount = amount;
    }

    public long getAccountId() {
        return accountId;
    }

    public void setAccountId(long accountId) {
        this.accountId = accountId;
    }

    public long getTimestamp() {
        return timestamp;
    }

    public void setTimestamp(long timestamp) {
        this.timestamp = timestamp;
    }

    public double getAmount() {
        return amount;
    }

    public void setAmount(double amount) {
        this.amount = amount;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        } else if (o == null || getClass() != o.getClass()) {
            return false;
        }
        Transaction that = (Transaction) o;
        return accountId == that.accountId
                && timestamp == that.timestamp
                && Double.compare(that.amount, amount) == 0;
    }

    @Override
    public int hashCode() {
        return Objects.hash(accountId, timestamp, amount);
    }

    @Override
    public String toString() {
        return "Transaction{"
                + "accountId="
                + accountId
                + ", timestamp="
                + timestamp
                + ", amount="
                + amount
                + '}';
    }
}
Alert
package com.example.frauddetection.common.entity;

import java.util.Objects;

/** A simple alert event. */
@SuppressWarnings("unused")
public final class Alert {

    private long id;

    public long getId() {
        return id;
    }

    public void setId(long id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        } else if (o == null || getClass() != o.getClass()) {
            return false;
        }
        Alert alert = (Alert) o;
        return id == alert.id;
    }

    @Override
    public int hashCode() {
        return Objects.hash(id);
    }

    @Override
    public String toString() {
        return "Alert{" + "id=" + id + '}';
    }
}
pom dependencies
<dependencies>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-streaming-java</artifactId>
        <version>1.19.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-clients</artifactId>
        <version>1.19.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-java</artifactId>
        <version>1.19.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-runtime</artifactId>
        <version>1.19.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-runtime-web</artifactId>
        <version>1.19.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-core</artifactId>
        <version>1.19.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.10.2</version>
    </dependency>
</dependencies>