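Requirements background
Users must know for certain whether their order succeeded, so the flow cannot be asynchronous; from the user's point of view it has to be synchronous.
This optimization is limited to the DB layer.
Key techniques
In-memory queue merging
Distributed transactions
Updating Redis via canal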
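In-memory queue merging
One queue per product. The queue cannot live in Redis, because the deduction relies on a local transaction.
If deductions are batched through the queue and the stock runs short, doesn't the whole batch fail?
No: the batch falls back to a loop that tries the requests one at a time, giving priority to orders with larger quantities.
The flash-sale service runs multiple instances, and each instance maintains its own in-memory queues. A single mixed order from one user is never split across instances; it only goes to several queues within one instance.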
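The fallback described above can be sketched as follows. This is a minimal illustration, not the original implementation: the class name `MergeDeductSketch`, the method `deductBatch`, and the plain `int` standing in for the stock row are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: names are illustrative and do not come from the notes.
class MergeDeductSketch {
    // Remaining stock of one product; stands in for the row in the stock table.
    static int stock = 6;

    /** Returns, per requested quantity (same order as the input), whether it was deducted. */
    static boolean[] deductBatch(int[] quantities) {
        boolean[] ok = new boolean[quantities.length];
        int sum = 0;
        for (int q : quantities) {
            sum += q;
        }
        if (sum <= stock) {
            // Fast path: a single merged deduction covers the whole batch.
            stock -= sum;
            Arrays.fill(ok, true);
            return ok;
        }
        // Slow path: fall back to one-by-one deduction, largest quantity first.
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < quantities.length; i++) {
            order.add(i);
        }
        order.sort(Comparator.comparingInt((Integer i) -> quantities[i]).reversed());
        for (int i : order) {
            if (quantities[i] <= stock) {
                stock -= quantities[i];
                ok[i] = true;
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        boolean[] result = deductBatch(new int[] {2, 5, 1});
        System.out.println(Arrays.toString(result) + " remaining=" + stock);
    }
}
```

With a stock of 6 and requests for 2, 5 and 1 units, the merged total of 8 does not fit, so the loop serves the 5 first, skips the 2, and then serves the 1.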
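Distributed transactions
The caller thread puts the stock-deduction request into the in-memory queue. The asynchronous merge thread takes a batch of requests from the queue, merges them, deducts the stock in one operation, and records a deduction flow log.
The caller thread and the merge thread must stay consistent with each other, which makes this a multi-threaded transaction, and a multi-threaded transaction is roughly equivalent to a distributed transaction.
The approach used here is similar to Seata's: write a flow log, and have the initiator retry to confirm.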
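Edge cases
The merge thread has taken data off the queue but stalls during the merge, for example because of a JVM GC pause. The caller concludes that the deduction failed and initiates a rollback. The merge thread is still stalled and has not yet written the merged request to the database, so the rollback finds no flow log for this deduction and returns without doing anything. The merge thread then resumes and deducts the stock: the upstream believes the deduction failed, but it succeeded locally. How can this be avoided?
Good question. This is exactly the edge case of distributed transactions. The usual answer is to retry the rollback message several times. RocketMQ retries at stepped intervals, and if retries over a few minutes still find no flow log, the deduction is treated as never having happened.
When the service crashes, the handling is the same as for the merge-thread failure mentioned in the video: the upstream sends a rollback message and the state becomes eventually consistent.
Releases use graceful shutdown, and the queue has a length limit, so it will not cause an OOM.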
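A minimal sketch of the retrying rollback consumer described above. It models only the decision logic with in-memory maps and makes no RocketMQ calls. The class `RollbackRetrySketch`, the `Outcome` values, and `MAX_ATTEMPTS = 3` are assumptions for illustration; in a real setup the retry schedule would come from the MQ.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: in-memory stand-ins for the stock row and the flow-log table.
class RollbackRetrySketch {
    enum Outcome { ROLLED_BACK, ALREADY_ROLLED_BACK, RETRY_LATER, GIVE_UP }

    // Assumed retry budget; the notes only say "several retries over a few minutes".
    static final int MAX_ATTEMPTS = 3;

    private int stock;
    private final Map<Long, Integer> deductLog = new HashMap<>(); // orderId -> deducted quantity
    private final Set<Long> rollbackLog = new HashSet<>();        // orderIds already rolled back

    RollbackRetrySketch(int initialStock) {
        this.stock = initialStock;
    }

    /** What the merge thread does: deduct and write the flow log together. */
    synchronized void deduct(long orderId, int qty) {
        stock -= qty;
        deductLog.put(orderId, qty);
    }

    /** Handles one delivery of the rollback message; attempt starts at 1. */
    synchronized Outcome onRollbackMessage(long orderId, int attempt) {
        if (rollbackLog.contains(orderId)) {
            return Outcome.ALREADY_ROLLED_BACK; // idempotent: never restore twice
        }
        Integer qty = deductLog.get(orderId);
        if (qty == null) {
            // No flow log yet: the merge thread may only be stalled, so ask for redelivery.
            return attempt < MAX_ATTEMPTS ? Outcome.RETRY_LATER : Outcome.GIVE_UP;
        }
        stock += qty;
        rollbackLog.add(orderId);
        return Outcome.ROLLED_BACK;
    }

    synchronized int stock() {
        return stock;
    }

    public static void main(String[] args) {
        RollbackRetrySketch s = new RollbackRetrySketch(10);
        System.out.println(s.onRollbackMessage(100L, 1)); // log not written yet
        s.deduct(100L, 2);                                // stalled merge thread finishes late
        System.out.println(s.onRollbackMessage(100L, 2)); // retry now finds the log
        System.out.println(s.stock());
    }
}
```

The sketch keeps the residual risk the notes acknowledge: if the deduction lands after the last retry has given up, nothing restores the stock.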
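Updating Redis via canal
Time-window updates
Out-of-order binlog synchronization
MySQL binlog -> canal -> MQ -> consumer -> Redis/MySQL
The root cause of out-of-order messages when synchronizing order data is that binlog messages belonging to the same order enter different MessageQueues. Binlog entries that have a fixed sequence for one order are then fetched and processed by consumers on different machines, so they are applied out of order; for example, an older update overwrites a newer one.
How can the binlog out-of-order problem be solved?
Add a version-number column to the order table or stock table, and check the version number before applying each change.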
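The version check can be sketched like this. It is a simplified illustration in which a `ConcurrentHashMap` stands in for Redis or the target table; the class `VersionedApplySketch`, the `Row` type, and the `status` field are hypothetical. Against a real Redis, the compare-and-write would also have to be atomic, for instance inside a Lua script.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: a map stands in for the downstream store fed by binlog events.
class VersionedApplySketch {
    static final class Row {
        final long version;
        final String status;

        Row(long version, String status) {
            this.version = version;
            this.status = status;
        }
    }

    static final ConcurrentMap<Long, Row> store = new ConcurrentHashMap<>();

    /** Applies the change only if it is newer than what is stored; returns true if applied. */
    static boolean apply(long orderId, long version, String status) {
        Row incoming = new Row(version, status);
        // merge() runs atomically per key: keep whichever row has the higher version.
        Row kept = store.merge(orderId, incoming,
                (current, candidate) -> candidate.version > current.version ? candidate : current);
        return kept == incoming;
    }

    public static void main(String[] args) {
        System.out.println(apply(1L, 2, "PAID"));    // newer event arrives first
        System.out.println(apply(1L, 1, "CREATED")); // stale event arrives late and is dropped
        System.out.println(store.get(1L).status);
    }
}
```

A late, lower-versioned update is discarded instead of overwriting the newer state, which is the failure mode described above.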
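1. For atomicity, many people use a Redis + Lua approach. What do you think of it?
1. If the update runs first and the insert afterwards, the row lock taken by the update is still held while the insert runs. Is that because the transaction has not committed after the update, so the row lock still exists during the insert?
The insert and the update are in the same transaction. After the transaction starts, the row lock on the stock row is acquired only when execution reaches the update statement, and it is released when the transaction commits. So if the insert runs first, the stock row is not yet locked during the insert. This is the "reduce lock hold time" point made earlier.
2. "The more the in-memory queue is scaled out, the worse single-machine merging works": can you elaborate? Does it mean that after scaling out to many servers, each server receives fewer requests, so each machine's in-memory queue merges fewer of them and the technique loses much of its benefit?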
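A back-of-the-envelope sketch of the effect raised in the last question. It assumes requests for one product are spread evenly across instances and that each instance merges whatever arrives within a fixed window; the formula, the class `BatchSizeEstimate`, and the sample numbers are illustrative assumptions, not measurements.

```java
// Hypothetical estimate: average merged batch size per instance.
class BatchSizeEstimate {
    /** Requests per merge window on one instance, assuming an even spread across instances. */
    static double avgBatchSize(double totalQps, double windowMillis, int instances) {
        return totalQps * windowMillis / 1000.0 / instances;
    }

    public static void main(String[] args) {
        // Assumed load: 10,000 requests per second for one product, 10 ms merge window.
        for (int instances : new int[] {1, 10, 100}) {
            System.out.printf("%d instance(s) -> %.1f requests per batch%n",
                    instances, avgBatchSize(10_000, 10, instances));
        }
    }
}
```

Under these assumptions the average batch shrinks from 100 requests on one instance to 10 on ten instances and 1 on a hundred, at which point merging saves nothing.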
package com.switchvov.network.example;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

public class AsyncPaymentService {
    private static BlockingQueue<RequestPromise> queue = new LinkedBlockingQueue<>(10);
    private static ExecutorService pool = Executors.newFixedThreadPool(10);
    private static CountDownLatch countDownLatch = new CountDownLatch(10);
    // In this version only the merge thread reads and writes the stock.
    private static Integer stockCount = 6;

    public static void main(String[] args) throws Exception {
        mergeJob();
        TimeUnit.SECONDS.sleep(2);
        List<Future<Result>> futureList = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            final Long userId = Long.valueOf(i);
            Future<Result> future = pool.submit(() -> {
                UserRequest userRequest = new UserRequest(1L, userId, 1);
                Result result = null;
                try {
                    // Line up the 10 workers so their requests arrive at about the same time.
                    countDownLatch.countDown();
                    countDownLatch.await(1, TimeUnit.SECONDS);
                    result = operate(userRequest);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                return result;
            });
            futureList.add(future);
        }
        System.out.println("===========================================================");
        for (Future<Result> future : futureList) {
            System.out.println(future.get());
        }
        pool.shutdown();
    }

    private static Result operate(UserRequest userRequest) throws InterruptedException {
        // TODO threshold check
        // TODO queue creation
        RequestPromise promise = new RequestPromise(userRequest);
        /*
        // TODO Note: whether BlockingQueue.offer succeeded is reported by its boolean return value,
        // not by an exception.
        try {
            blockingDeque.offer(promise, 100, TimeUnit.MILLISECONDS);
        } catch (Exception e) {
            return new Result(false, "System busy");
        }
        */
        /*
        // TODO If this code sits outside the synchronized block, notify may run before wait,
        // and every such thread then has to wait the full 200 ms.
        boolean enqueueSuccess = queue.offer(promise, 100, TimeUnit.MILLISECONDS);
        if (!enqueueSuccess) {
            return new Result(false, "System busy");
        }*/
        // Once enqueued, wait synchronously until the merge thread finishes and wakes this thread.
        synchronized (promise) {
            boolean enqueueSuccess = queue.offer(promise, 100, TimeUnit.MILLISECONDS);
            if (!enqueueSuccess) {
                return new Result(false, "System busy");
            }
            // TODO Pitfall: wait(timeout) does not throw when it times out;
            // it only throws InterruptedException when another thread interrupts it.
            try {
                // The JDK gives no direct way to tell whether wait returned because of notify or a timeout.
                promise.wait(200);
                // promise.wait();
                if (promise.getResult() == null) {
                    // Normally getResult() is non-null whether the order succeeded or failed,
                    // so null here means the wait timed out.
                    return new Result(false, "Timed out waiting");
                }
            } catch (Exception e) {
                // return new Result(false, "Timed out");
                e.printStackTrace();
            }
        }
        return promise.getResult();
    }

    private static void mergeJob() {
        new Thread(() -> {
            List<RequestPromise> tmpList = new ArrayList<>();
            while (true) {
                // If the queue is empty, sleep 10 ms and start the next scan.
                // if (blockingDeque.peek() == null) {
                if (queue.isEmpty()) {
                    try {
                        TimeUnit.MILLISECONDS.sleep(10);
                        continue;
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
                // List<RequestPromise> tmpList = new ArrayList<>();
                /*
                // TODO This form risks an endless loop:
                treat it as a producer-consumer model. If the producer is faster than the consumer,
                the loop never exits and keeps collecting until memory runs out (OOM).
                while (queue.peek() != null){
                    tmpList.add(queue.poll());
                }
                 */
                // Take a snapshot of the current size so the batch is bounded.
                int size = queue.size();
                for (int i = 0; i < size; i++) {
                    tmpList.add(queue.poll());
                }
                int sumNeedCount = 0;
                for (int i = 0; i < size; i++) {
                    sumNeedCount += tmpList.get(i).getUserRequest().getCount();
                }
                if (sumNeedCount <= stockCount) {
                    // Enough stock for the whole batch: deduct once for all requests.
                    stockCount -= sumNeedCount;
                    for (int i = 0; i < size; i++) {
                        RequestPromise promise = tmpList.get(i);
                        promise.setResult(new Result(true, "Order placed"));
                        synchronized (promise) {
                            promise.notify();
                        }
                    }
                    tmpList.clear();
                    continue;
                } else {
                    // Not enough stock for the batch: fall back to deducting one request at a time.
                    for (int i = 0; i < size; i++) {
                        RequestPromise promise = tmpList.get(i);
                        int count = promise.getUserRequest().getCount();
                        if (count <= stockCount) {
                            stockCount -= count;
                            promise.setResult(new Result(true, "Order placed"));
                        } else {
                            promise.setResult(new Result(false, "Insufficient stock"));
                        }
                        // Whether or not stock was sufficient, wake the thread that is waiting synchronously.
                        synchronized (promise) {
                            promise.notify();
                        }
                    }
                }
                // Clear the temporary container before the next scan.
                tmpList.clear();
            }
        }, "mergeThread").start();
    }
}

class RequestPromise {
    private UserRequest userRequest;
    private Result result;

    public RequestPromise(UserRequest userRequest) {
        this.userRequest = userRequest;
    }

    public UserRequest getUserRequest() {
        return userRequest;
    }

    public void setUserRequest(UserRequest userRequest) {
        this.userRequest = userRequest;
    }

    public Result getResult() {
        return result;
    }

    public void setResult(Result result) {
        this.result = result;
    }
}

class Result {
    private Boolean success;
    private String msg;

    public Result(Boolean success, String msg) {
        this.success = success;
        this.msg = msg;
    }

    public Boolean getSuccess() {
        return success;
    }

    public void setSuccess(Boolean success) {
        this.success = success;
    }

    public String getMsg() {
        return msg;
    }

    public void setMsg(String msg) {
        this.msg = msg;
    }

    @Override
    public String toString() {
        return "Result{" +
                "success=" + success +
                ", msg='" + msg + '\'' +
                '}';
    }
}

class UserRequest {
    private Long orderId;
    private Long userId;
    private Integer count;

    public UserRequest(Long orderId, Long userId, Integer count) {
        this.orderId = orderId;
        this.userId = userId;
        this.count = count;
    }

    public Long getOrderId() {
        return orderId;
    }

    public void setOrderId(Long orderId) {
        this.orderId = orderId;
    }

    public Long getUserId() {
        return userId;
    }

    public void setUserId(Long userId) {
        this.userId = userId;
    }

    public Integer getCount() {
        return count;
    }

    public void setCount(Integer count) {
        this.count = count;
    }

    @Override
    public String toString() {
        return "UserRequest{" +
                "orderId=" + orderId +
                ", userId=" + userId +
                ", count=" + count +
                '}';
    }
}
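The listing above notes that wait(timeout) cannot tell a notify from a timeout. One possible alternative, sketched here under assumptions rather than taken from the original, is to hand the result over through a `CompletableFuture`, whose timed `get` throws `TimeoutException`, and to collect the batch with `take` plus `drainTo`. The class `FutureHandoffSketch`, the `Pending` type, and the string results are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch of the same caller/merge-thread handoff without wait/notify.
class FutureHandoffSketch {
    static final int MAX_BATCH = 10;

    static final class Pending {
        final int count;
        final CompletableFuture<String> done = new CompletableFuture<>();

        Pending(int count) {
            this.count = count;
        }
    }

    private final BlockingQueue<Pending> queue = new LinkedBlockingQueue<>(MAX_BATCH);
    private int stock; // touched only by the merge thread once it has started

    FutureHandoffSketch(int initialStock) {
        this.stock = initialStock;
    }

    /** Caller side: enqueue, then block until the merge thread answers or the wait expires. */
    String operate(int count, long waitMillis) throws InterruptedException {
        Pending p = new Pending(count);
        if (!queue.offer(p, 100, TimeUnit.MILLISECONDS)) {
            return "busy";
        }
        try {
            return p.done.get(waitMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "timeout"; // a timeout is now explicit; the caller would send its rollback here
        } catch (ExecutionException e) {
            return "error";
        }
    }

    /** Merge side: block for the first request, then drain whatever else is already queued. */
    Thread startMergeThread() {
        Thread t = new Thread(() -> {
            List<Pending> batch = new ArrayList<>();
            try {
                while (true) {
                    batch.add(queue.take());
                    queue.drainTo(batch, MAX_BATCH - 1);
                    for (Pending p : batch) {
                        if (p.count <= stock) {
                            stock -= p.count;
                            p.done.complete("ok");
                        } else {
                            p.done.complete("out of stock");
                        }
                    }
                    batch.clear();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop when interrupted
            }
        }, "mergeThread");
        t.setDaemon(true); // lets the JVM exit when main finishes
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        FutureHandoffSketch s = new FutureHandoffSketch(6);
        s.startMergeThread();
        System.out.println(s.operate(4, 200));
        System.out.println(s.operate(4, 200));
    }
}
```

For brevity the sketch deducts per request; the merged fast path from the listing above would slot into the batch loop unchanged.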
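Distributed transaction issues
1) Multiple threads in the client, Order-center, place orders for the same product at the same time. The server, stock-center, holds them and puts them into the queue. Each client thread is held for at most 200 ms by default; beyond that, compensation kicks in.
2) The asynchronous thread merges the queued order requests in near real time. If the merge stalls because of a full GC, or the merge thread's DB call fails, the client thread times out and then sends a rollback MQ message (an eventual-consistency approach; MQ performs better here).
3) In the actual business logic, every time stock-center changes stock it also writes a row to a local stock flow table. If the upstream client times out, it sends an MQ message. When stock-center receives that message, it queries the flow table to see whether the stock was actually deducted. If the flow record exists, it rolls the deduction back; if not, it ignores the rollback message.
PS:
1. Why send an MQ message instead of calling an RPC interface?
A full GC in stock-center may be what made the client thread time out. If the client's rollback RPC reaches stock-center while that full GC is still running, the RPC either times out itself or finds no matching stock flow record. Only after the full GC ends does stock-center write the flow record and deduct the stock, and by then the RPC has long since finished.
MQ naturally adds some delay, although it still cannot fully solve the full-GC problem described above.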
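The flow-table lookup in step 3 only works if a redelivered rollback message cannot restore stock twice. A minimal sketch of that idempotency follows; `putIfAbsent` stands in for a unique index on (order id, operation type). The class `IdempotentFlowLogSketch` and the key format are assumptions, and unlike a real local transaction the log write and the stock update here are not atomic together.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: the map plays the role of the stock flow table.
class IdempotentFlowLogSketch {
    static final int DEDUCT = 1;
    static final int ROLLBACK = 2;

    // Key "orderId:operateType" mimics a unique index on those two columns.
    private final ConcurrentMap<String, Integer> flowLog = new ConcurrentHashMap<>();
    private final AtomicInteger stock;

    IdempotentFlowLogSketch(int initialStock) {
        this.stock = new AtomicInteger(initialStock);
    }

    /** Records the deduction once per order; stock sufficiency checks are omitted for brevity. */
    void deduct(long orderId, int qty) {
        if (flowLog.putIfAbsent(orderId + ":" + DEDUCT, qty) == null) {
            stock.addAndGet(-qty);
        }
    }

    /** Returns true only for the first rollback of an order that was actually deducted. */
    boolean rollback(long orderId) {
        Integer qty = flowLog.get(orderId + ":" + DEDUCT);
        if (qty == null) {
            return false; // no deduction record: ignore the rollback message
        }
        if (flowLog.putIfAbsent(orderId + ":" + ROLLBACK, qty) != null) {
            return false; // duplicate message: already rolled back
        }
        stock.addAndGet(qty);
        return true;
    }

    int stock() {
        return stock.get();
    }

    public static void main(String[] args) {
        IdempotentFlowLogSketch s = new IdempotentFlowLogSketch(6);
        s.deduct(100L, 2);
        System.out.println(s.rollback(100L) + " " + s.stock());
        System.out.println(s.rollback(100L) + " " + s.stock());
    }
}
```

The second call in `main` models a redelivered MQ message: it is rejected and the stock stays where the first rollback left it.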
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.*;
import java.util.stream.Collectors;

public class AsyncPaymentService {
    private static final BlockingQueue<RequestPromise> queue = new LinkedBlockingQueue<>(10);
    private static final ExecutorService pool = Executors.newFixedThreadPool(10);
    private static final CountDownLatch countDownLatch = new CountDownLatch(10);
    // volatile only guarantees visibility; "stockCount -= n" is still not atomic across threads.
    private static volatile Integer stockCount = 6;

    public static void main(String[] args) throws Exception {
        mergeJob();
        TimeUnit.SECONDS.sleep(2);
        Map<UserRequest, Future<Result>> requestFutureMap = new HashMap<>();
        System.out.println("Initial stock: " + stockCount);
        for (int i = 0; i < 10; i++) {
            final Long userId = Long.valueOf(i);
            // Do not hard-code orderId.
            final Long orderId = i + 100L;
            UserRequest userRequest = new UserRequest(orderId, userId, 1);
            Future<Result> future = pool.submit(() -> {
                Result result = null;
                try {
                    /*
                    countDownLatch.countDown();
                    countDownLatch.await(1,TimeUnit.SECONDS);*/
                    result = operate(userRequest);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                return result;
            });
            // Keeps the one-to-one mapping between each request and its response.
            requestFutureMap.put(userRequest, future);
        }
        System.out.println();
        System.out.println();
        // TODO How should this line be understood?
        // Presumably it gives every worker time to finish (each waits at most about 200 ms),
        // so the short Future.get timeout below does not throw TimeoutException.
        TimeUnit.SECONDS.sleep(1);
        for (Map.Entry<UserRequest, Future<Result>> entry : requestFutureMap.entrySet()) {
            UserRequest request = entry.getKey();
            Result result = entry.getValue().get(300, TimeUnit.MILLISECONDS);
            System.out.println("Client request: " + request + ", response: " + result);
            if (result == null || !result.getSuccess()) {
                // Failed or timed out: send a rollback request.
                // TODO The rollback request itself may fail to send and needs a further fallback.
                System.out.println(request + ", rollback initiated");
                rollback(request);
            }
        }
        System.out.println();
        System.out.println();
        System.out.println("------- Stock operation log -------");
        System.out.println("Successful deductions: " + operateChangeLogList.stream().filter(e -> e.getOperateType() == 1).count());
        operateChangeLogList.forEach(e -> {
            if (e.getOperateType() == 1) {
                System.out.println("Deduction record in the database: " + e);
            }
        });
        System.out.println();
        System.out.println();
        System.out.println("Rollbacks: " + operateChangeLogList.stream().filter(e -> e.getOperateType() == 2).count());
        operateChangeLogList.forEach(e -> {
            if (e.getOperateType() == 2) {
                System.out.println("Rollback record in the database: " + e);
            }
        });
        System.out.println();
        System.out.println();
        System.out.println("-------- Stock --------");
        System.out.println("Final stock: " + stockCount);
        pool.shutdown();
    }

    private static void rollback(UserRequest userRequest) {
        // Roll back only if the flow log contains a record for this order.
        if (operateChangeLogList.stream().anyMatch(operateChangeLog -> operateChangeLog.getOrderId().equals(userRequest.getOrderId()))) {
            // Idempotency: the rollback record for an order must be inserted into the table only once.
            boolean hasRollback = operateChangeLogList.stream().anyMatch(operateChangeLog ->
                    operateChangeLog.getOrderId().equals(userRequest.getOrderId()) && operateChangeLog.getOperateType() == 2);
            if (hasRollback) {
                return;
            }
            System.out.println(userRequest + " rolled back");
            stockCount += userRequest.getCount();
            saveChangeLog(userRequest, 2);
        }
        // If the flow table has no record for userRequest.getOrderId(), ignore the rollback MQ message sent by the client.
    }

    private static Result operate(UserRequest userRequest) throws InterruptedException {
        // TODO threshold check
        // TODO queue creation
        RequestPromise promise = new RequestPromise(userRequest);
        // Once enqueued, wait synchronously until the merge thread finishes and wakes this thread.
        synchronized (promise) {
            boolean enqueueSuccess = queue.offer(promise, 100, TimeUnit.MILLISECONDS);
            if (!enqueueSuccess) {
                return new Result(false, "System busy");
            }
            // TODO Pitfall: wait(timeout) does not throw when it times out;
            // it only throws InterruptedException when another thread interrupts it.
            try {
                // The JDK gives no direct way to tell whether wait returned because of notify or a timeout.
                promise.wait(200);
                // promise.wait();
                if (promise.getResult() == null) {
                    // Normally getResult() is non-null whether the order succeeded or failed,
                    // so null here means the wait timed out.
                    return new Result(false, "Timed out waiting");
                }
            } catch (Exception e) {
                // return new Result(false, "Timed out");
                e.printStackTrace();
            }
        }
        return promise.getResult();
    }

    private static void mergeJob() {
        new Thread(() -> {
            List<RequestPromise> tmpList = new ArrayList<>();
            while (true) {
                // If the queue is empty, sleep 10 ms and start the next scan.
                // if (blockingDeque.peek() == null) {
                if (queue.isEmpty()) {
                    try {
                        TimeUnit.MILLISECONDS.sleep(10);
                        continue;
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
                // Merge up to 3 requests per database operation.
                int maxBatchSize = 3;
                for (int i = 0; i < maxBatchSize; i++) {
                    // poll() returns null rather than blocking, so a partial batch
                    // (for example the 10th request) is still processed.
                    RequestPromise next = queue.poll();
                    if (next == null) {
                        break;
                    }
                    tmpList.add(next);
                }
                int batchSize = tmpList.size();
                if (batchSize == 0) {
                    continue;
                }
                // Simulate a stall (such as a full GC) for the batch that contains user 5,
                // so its callers time out before the deduction is written.
                if (tmpList.stream().anyMatch(requestPromise -> requestPromise.getUserRequest().getUserId() == 5)) {
                    try {
                        TimeUnit.MILLISECONDS.sleep(300);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
                int sumNeedCount = 0;
                for (int i = 0; i < batchSize; i++) {
                    sumNeedCount += tmpList.get(i).getUserRequest().getCount();
                }
                if (sumNeedCount <= stockCount) {
                    // Begin transaction
                    stockCount -= sumNeedCount;
                    saveChangeLog(tmpList.stream().map(RequestPromise::getUserRequest).collect(Collectors.toList()), 1);
                    // End transaction
                    for (int i = 0; i < batchSize; i++) {
                        RequestPromise promise = tmpList.get(i);
                        promise.setResult(new Result(true, "Order placed"));
                        synchronized (promise) {
                            promise.notify();
                        }
                    }
                    tmpList.clear();
                    continue;
                } else {
                    for (int i = 0; i < batchSize; i++) {
                        RequestPromise promise = tmpList.get(i);
                        int count = promise.getUserRequest().getCount();
                        if (count <= stockCount) {
                            // Begin transaction
                            stockCount -= count;
                            saveChangeLog(promise.getUserRequest(), 1);
                            // End transaction
                            promise.setResult(new Result(true, "Order placed"));
                        } else {
                            promise.setResult(new Result(false, "Insufficient stock"));
                        }
                        // Whether or not stock was sufficient, wake the thread that is waiting synchronously.
                        synchronized (promise) {
                            promise.notify();
                        }
                    }
                }
                // Clear the temporary container before the next scan.
                tmpList.clear();
            }
        }, "mergeThread").start();
    }

    // Simulates the database operation-log (flow) table.
    // (order_id, operate_type) should be a unique key to guarantee idempotency.
    // Written by both the merge thread and the main thread, so a thread-safe list is used.
    private static List<OperateChangeLog> operateChangeLogList = new CopyOnWriteArrayList<>();

    /**
     * Writes stock flow records.
     * @param list requests to record
     * @param operateType 1 = deduction, 2 = rollback
     */
    private static void saveChangeLog(List<UserRequest> list, int operateType) {
        List<OperateChangeLog> tempLogList = list.stream().map(userRequest -> new OperateChangeLog(userRequest.getOrderId(),
                userRequest.getCount(), operateType)).collect(Collectors.toList());
        // Insert the records into the "database".
        operateChangeLogList.addAll(tempLogList);
    }

    private static void saveChangeLog(UserRequest request, int operateType) {
        // Single-request variant: delegate to the batch version.
        saveChangeLog(java.util.Collections.singletonList(request), operateType);
    }
}

/**
 * Operation flow-log record.
 */
class OperateChangeLog {
    private Long orderId;
    private Integer count;
    // 1 = deduction succeeded, 2 = rollback
    private int operateType;

    public OperateChangeLog(Long orderId, Integer count, int operateType) {
        this.orderId = orderId;
        this.count = count;
        this.operateType = operateType;
    }

    public Long getOrderId() {
        return orderId;
    }

    public void setOrderId(Long orderId) {
        this.orderId = orderId;
    }

    public Integer getCount() {
        return count;
    }

    public void setCount(Integer count) {
        this.count = count;
    }

    public int getOperateType() {
        return operateType;
    }

    public void setOperateType(int operateType) {
        this.operateType = operateType;
    }

    @Override
    public String toString() {
        return "OperateChangeLog{" +
                "orderId=" + orderId +
                ", count=" + count +
                ", operateType='" + operateType + '\'' +
                '}';
    }
}
// RequestPromise, Result and UserRequest are identical to the classes in the first listing.
Reference:
【高级请进】核心业务挑战与方案落地!技术与业务如何权衡!P7级项目如何阐述! (video on bilibili)