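Background
A few days ago, while implementing a new feature, I added a field to the return object of a method in the SDK published by our dubbo provider. In theory, a consumer that does not use the new field needs no upgrade and should be unaffected. Yet once the provider went live, consumers that had not upgraded started throwing deserialization exceptions. That left me completely baffled, so I went digging right away.
First, the setup: the dubbo version is 2.7.X, and the SDK published by the provider is version 1.0: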
<groupId>com.test.provider</groupId>
<artifactId>dubbo-api</artifactId>
<version>1.0</version>
The method exposed to consumers:
public interface ProviderApi {
ResponseVo test(Long id);
}
public class ResponseVo implements Serializable {
private Long id;
}
This change adds a new object-typed field to the method's return object:
// New return object
public class ResponseVo implements Serializable {
private Long id;
private UserInfo userInfo;
}
public class UserInfo {
private String name;
}
The SDK version was therefore bumped to 1.1:
<groupId>com.test.provider</groupId>
<artifactId>dubbo-api</artifactId>
<version>1.1</version>
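Consumer A does not use userInfo, so it stayed on version 1.0.
What Went Wrong
Right after the provider release finished, consumer A fired a flood of call-failure alerts, and service only recovered after an emergency rollback of the provider. Checking the logs afterwards turned up the following exception: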
[EXCEPTION] side=consumer, preCode: null, code: RpcException, message: Failed to invoke remote method: test, cause: org.apache.dubbo.remoting.RemotingException: com.alibaba.com.caucho.hessian.io.HessianFieldException: com.ResponseVo: com.UserInfo cannot be assigned from null
at com.alibaba.com.caucho.hessian.io.JavaDeserializer.logDeserializeError(JavaDeserializer.java:174)
at com.alibaba.com.caucho.hessian.io.JavaDeserializer$ObjectFieldDeserializer.deserialize(JavaDeserializer.java:415)
at com.alibaba.com.caucho.hessian.io.JavaDeserializer.readObject(JavaDeserializer.java:277)
at com.alibaba.com.caucho.hessian.io.JavaDeserializer.readObject(JavaDeserializer.java:204)
at com.alibaba.com.caucho.hessian.io.SerializerFactory.readObject(SerializerFactory.java:555)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObjectInstance(Hessian2Input.java:2850)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2773)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2308)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2747)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2308)
at com.alibaba.com.caucho.hessian.io.JavaDeserializer.readObject(JavaDeserializer.java:279)
at com.alibaba.com.caucho.hessian.io.JavaDeserializer.readObject(JavaDeserializer.java:204)
at com.alibaba.com.caucho.hessian.io.SerializerFactory.readObject(SerializerFactory.java:555)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObjectInstance(Hessian2Input.java:2850)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2773)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2308)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2747)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2308)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2110)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2104)
at com.alibaba.com.caucho.hessian.io.JavaDeserializer$ObjectFieldDeserializer.deserialize(JavaDeserializer.java:411)
at com.alibaba.com.caucho.hessian.io.JavaDeserializer.readObject(JavaDeserializer.java:277)
at com.alibaba.com.caucho.hessian.io.JavaDeserializer.readObject(JavaDeserializer.java:204)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObjectInstance(Hessian2Input.java:2848)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2175)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2104)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2148)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2104)
at org.apache.dubbo.common.serialize.hessian2.Hessian2ObjectInput.readObject(Hessian2ObjectInput.java:96)
at org.apache.dubbo.common.serialize.hessian2.Hessian2ObjectInput.readObject(Hessian2ObjectInput.java:101)
Caused by: java.lang.IllegalStateException: Serialized class com.UserInfo must implement java.io.Serializable
at com.alibaba.com.caucho.hessian.io.SerializerFactory.getDefaultDeserializer(SerializerFactory.java:500)
at com.alibaba.com.caucho.hessian.io.SerializerFactory.getDeserializer(SerializerFactory.java:479)
at com.alibaba.com.caucho.hessian.io.SerializerFactory.getObjectDeserializer(SerializerFactory.java:584)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObjectInstance(Hessian2Input.java:2846)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2175)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2104)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2148)
at com.alibaba.com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:2104)
at com.alibaba.com.caucho.hessian.io.JavaDeserializer$ObjectFieldDeserializer.deserialize(JavaDeserializer.java:411)
... 182 more
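The cause is clearly that UserInfo does not implement the Serializable interface. But consumer A never upgraded the SDK, so it should not be deserializing the new field at all.
Root Cause Analysis (Finding the Real Cause)
The stack trace leads to the deserializer class, which shows that when the response contains a field that does not exist in the SDK class, dubbo still tries to read it: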
// 264 com.alibaba.com.caucho.hessian.io.JavaDeserializer#readObject(com.alibaba.com.caucho.hessian.io.AbstractHessianInput, java.lang.Object, java.lang.String[])
public Object readObject(AbstractHessianInput in,
Object obj,
String[] fieldNames) throws IOException {
try {
int ref = in.addRef(obj);
for (int i = 0; i < fieldNames.length; i++) {
String name = fieldNames[i];
FieldDeserializer deser = (FieldDeserializer) _fieldMap.get(name);
if (deser != null)
deser.deserialize(in, obj);
else
// No deserializer for this field name: still read (and discard) the value
in.readObject();
}
Object resolve = resolve(obj);
if (obj != resolve)
in.setRef(ref, resolve);
return resolve;
} catch (IOException e) {
throw e;
} catch (Exception e) {
throw new IOExceptionWrapper(obj.getClass().getName() + ":" + e, e);
}
}
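To make the loop above concrete, here is a minimal sketch of why a sequential reader cannot simply ignore an unknown field. It is a toy format built on DataOutputStream, not the Hessian wire format, and the class name SkipUnknownFieldDemo and the extra "status" field are hypothetical names made up for illustration. As in the readObject loop, the reader gets the field names first and then has to consume every value in order, keeping only the ones it knows.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model only: this is NOT the Hessian wire format.
class SkipUnknownFieldDemo {

    // Writer side: field count, then all field names, then all values in the same order.
    static byte[] write(Map<String, String> fields) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(fields.size());
            for (String name : fields.keySet()) {
                out.writeUTF(name);
            }
            for (String value : fields.values()) {
                out.writeUTF(value);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reader side: keeps only the fields it knows, but still consumes every value.
    static Map<String, String> read(byte[] payload, Set<String> knownFields) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload));
            int count = in.readInt();
            String[] names = new String[count];
            for (int i = 0; i < count; i++) {
                names[i] = in.readUTF();
            }
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                // The value must be read even when the field is unknown;
                // otherwise the stream position is wrong for the next field.
                String value = in.readUTF();
                if (knownFields.contains(names[i])) {
                    result.put(names[i], value);
                }
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Writing the fields id, userInfo and status, then reading with only id and status as known fields, returns id and status; the userInfo value is read and thrown away. In the real incident, that "read and throw away" step is a full object deserialization, which is where the failure happened.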
If so, how does it know which class to read? Inspecting the raw dubbo response body gives the answer:
{
"id":1,
"userInfo": {
"userName": "test",
"class": "com.UserInfo"
}
}
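The class path is in the payload, but why was consumer A able to load that class? Because com.UserInfo is defined in a shared common package that consumer A happens to depend on, so the class was on its classpath.
Summary: when the provider returns a new field, the consumer tries to load the class named in the payload. If the class does not exist, it only logs a class-not-found warning. If the class does exist, the consumer reads the value but does not set it on any field, and reading it requires the class to implement the Java Serializable interface.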
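The check that failed can be modeled in a few lines. This is a simplified sketch based only on the exception message in the log above, not the actual hessian-lite source; SerializableGuardDemo, PlainUserInfo and FixedUserInfo are hypothetical names used for illustration. It also shows the fix on the provider side, which is to make the new class implement Serializable.

```java
import java.io.Serializable;

// Simplified model of the guard; not the actual hessian-lite implementation.
class SerializableGuardDemo {

    // Same shape as the UserInfo in this post: no Serializable.
    static class PlainUserInfo {
        String name;
    }

    // The same class after the fix.
    static class FixedUserInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
    }

    // Refuses to deserialize any class that does not implement java.io.Serializable.
    static void checkDeserializable(Class<?> type) {
        if (!Serializable.class.isAssignableFrom(type)) {
            throw new IllegalStateException(
                    "Serialized class " + type.getName() + " must implement java.io.Serializable");
        }
    }
}
```

Calling checkDeserializable with PlainUserInfo.class throws an IllegalStateException, while FixedUserInfo.class passes.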
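Questions in Retrospect
1) If the field does not exist in the SDK, why does deserialization still read it?
Answer: to skip over the bytes it does not recognize. The stream is read sequentially, so the next field can only be read after the current one has been read in full.
2) If Hessian serialization is used, why must the class still implement Serializable?
Answer: without the Serializable requirement, an attacker could instantiate any class. For example, an attacker who knows the service uses mysql could keep instantiating the mysql connection class, making the service open connections endlessly. See the official link.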
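One way to catch this kind of problem before release is a reflection check over the SDK's return types. The sketch below is only an assumption about how such a check might look, not part of dubbo; SerializableFieldChecker and the two sample classes are hypothetical. It walks the instance fields of a class and reports every non-JDK type that does not implement Serializable. It does not inspect generic type arguments such as the element type of a List.

```java
import java.io.Serializable;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for a unit test or CI step; not a dubbo API.
class SerializableFieldChecker {

    // Sample types mirroring the ones in this post.
    static class SampleUserInfo {
        String name;
    }

    static class SampleResponseVo implements Serializable {
        private static final long serialVersionUID = 1L;
        Long id;
        SampleUserInfo userInfo;
    }

    // Returns the names of non-JDK classes reachable from root that lack Serializable.
    static List<String> findNonSerializable(Class<?> root) {
        List<String> offenders = new ArrayList<>();
        visit(root, new HashSet<>(), offenders);
        return offenders;
    }

    private static void visit(Class<?> type, Set<Class<?>> seen, List<String> offenders) {
        if (type.isPrimitive() || !seen.add(type)) {
            return; // primitives are fine; already-seen types avoid infinite recursion
        }
        if (type.isArray()) {
            visit(type.getComponentType(), seen, offenders);
            return;
        }
        if (type.getName().startsWith("java.")) {
            return; // JDK types (Long, String, List, ...) are not inspected
        }
        if (!Serializable.class.isAssignableFrom(type)) {
            offenders.add(type.getName());
        }
        // Walk the class and its non-JDK superclasses, skipping static and transient fields.
        for (Class<?> c = type; c != null && !c.getName().startsWith("java."); c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                int modifiers = field.getModifiers();
                if (Modifier.isStatic(modifiers) || Modifier.isTransient(modifiers)) {
                    continue;
                }
                visit(field.getType(), seen, offenders);
            }
        }
    }
}
```

Running findNonSerializable on SampleResponseVo.class reports SampleUserInfo as the only offender. A check like this on the provider's API module would have flagged the new UserInfo field before the 1.1 SDK shipped.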