Background
When copying HBase snapshots across clusters, the export job frequently fails with a FileNotFoundException on a path under /hbase/.tmp/data/xxx.
Below, the failure scenario is reproduced, the root cause is analyzed, and some common workarounds are given:
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File /datafs/.tmp/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3 not found.
at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:119)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:419)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:107)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:595)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.connect(WebHdfsFileSystem.java:1855)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:673)
... 23 more
18/08/13 20:14:14 INFO mapreduce.Job: map 100% reduce 0%
18/08/13 20:14:14 INFO mapreduce.Job: Job job_1533546266978_0038 failed with state FAILED due to: Task failed task_1533546266978_0038_m_000000
Root cause
Between the time the snapshot is created and the time the cross-cluster copy runs, some StoreFiles are moved, so they can no longer be resolved at their recorded locations (a bug that shows up when using webhdfs).
Reproducing the scenario
Preparation
- Environment:
Source cluster: HBase 1.2.0-cdh5.10.0
Target cluster: HBase 1.2.0-cdh5.12.1
1. Create table mytable with 2 regions split at '03' and one column family info, then Put 6 rows (a Java-API equivalent is sketched after the puts below)
put 'mytable','01','info:age','1'
put 'mytable','02','info:age','2'
put 'mytable','03','info:age','3'
put 'mytable','04','info:age','1'
put 'mytable','05','info:age','1'
put 'mytable','06','info:age','1'
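For reference, a minimal sketch of the same table setup through the HBase 1.2 Java Admin API (the original reproduction used the shell; this snippet is illustrative only and uses the standard client classes):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

Configuration conf = HBaseConfiguration.create();
try (Connection conn = ConnectionFactory.createConnection(conf);
     Admin admin = conn.getAdmin()) {
  HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("mytable"));
  desc.addFamily(new HColumnDescriptor("info"));
  // Pre-split at "03" so the table starts with 2 regions.
  admin.createTable(desc, new byte[][] { Bytes.toBytes("03") });
}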
2. Create snapshot mysnapshot, which generates the following files:
[root@test108 ~]# hdfs dfs -ls /datafs/.hbase-snapshot/mysnapshot/
Found 2 items
-rw-r--r-- 2 hbase hbase 32 2018-08-13 18:48 /datafs/.hbase-snapshot/mysnapshot/.snapshotinfo
-rw-r--r-- 2 hbase hbase 466 2018-08-13 18:48 /datafs/.hbase-snapshot/mysnapshot/data.manifest
- .snapshotinfo contains the snapshot metadata, i.e. an HBaseProtos.SnapshotDescription object:
name: "mysnapshot"
table: "mytable"
creation_time: 1533774121010
type: FLUSH
version: 2
- data.manifest
contains the table schema, attributes, and column families as per-region manifest entries; the important part is the store_files information:
region_info {
region_id: 1533784567273
table_name {
namespace: "default"
qualifier: "mytable"
}
start_key: "03"
end_key: ""
offline: false
split: false
replica_id: 0
}
family_files {
family_name: "info"
store_files {
name: "3c5e9ec890f04560a396040fa8b592a3"
file_size: 1115
}
}
3. Modify data
Use Put to update the rows of one of the regions:
put 'mytable','04','info:age','4'
put 'mytable','05','info:age','5'
put 'mytable','06','info:age','6'
4. Run flush and major_compact
to simulate the major/minor compactions that can happen while the cross-cluster copy is running:
hbase(main):001:0> flush 'mytable'
0 row(s) in 0.8200 seconds
hbase(main):002:0> major_compact 'mytable'
0 row(s) in 0.1730 seconds
At this point the StoreFile 3c5e9ec890f04560a396040fa8b592a3 has been moved under archive:
[root@test108 ~]# hdfs dfs -ls -R /datafs/archive/data/default/mytable/c48642fecae3913e0d09ba236b014667
drwxr-xr-x - hbase hbase 0 2018-08-15 08:30 /datafs/archive/data/default/mytable/c48642fecae3913e0d09ba236b014667/info
-rw-r--r-- 2 hbase hbase 1115 2018-08-13 18:48 /datafs/archive/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3
Reproducing the error
[root@a2502f06 ~]# hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
> -Dipc.client.fallback-to-simple-auth-allowed=true \
> -Dmapreduce.job.queuename=root.default \
> -snapshot mysnapshot \
> -copy-from webhdfs://archive.cloudera.com/datafs \
> -copy-to webhdfs://nameservice1/hbase/ \
> -chuser hbase -chgroup hbase -chmod 755 -overwrite
The console reports FileNotFoundException and the job fails:
18/08/13 20:59:34 INFO mapreduce.Job: Task Id : attempt_1533546266978_0037_m_000000_0, Status : FAILED
Error: java.io.FileNotFoundException: File /datafs/.tmp/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3 not found.
at sun.reflect.GeneratedConstructorAccessor14.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:450)
Source code analysis
1. Before copying, ExportSnapshot first copies .snapshotinfo and data.manifest to .hbase-snapshot/.tmp/mysnapshot on the target cluster:
[root@a2502f06 ~]# hdfs dfs -ls /hbase/.hbase-snapshot/.tmp/mysnapshot
Found 2 items
-rwxr-xr-x 2 hbase hbase 32 2018-08-13 20:28 /hbase/.hbase-snapshot/.tmp/mysnapshot/.snapshotinfo
-rwxr-xr-x 2 hbase hbase 466 2018-08-13 20:28 /hbase/.hbase-snapshot/.tmp/mysnapshot/data.manifest
2. It then parses data.manifest and builds the logical splits, one per StoreFile. Each map task reads one SnapshotFileInfo, which carries only the HFileLink information, not a concrete path:
String region = regionInfo.getEncodedName();
String hfile = storeFile.getName();
Path path = HFileLink.createPath(table, region, family, hfile);
SnapshotFileInfo fileInfo = SnapshotFileInfo.newBuilder()
.setType(SnapshotFileInfo.Type.HFILE)
.setHfile(path.toString())
.build();
3. Map phase
For each SnapshotFileInfo it reads, the mapper assembles the 4 possible locations of the StoreFile, which are probed in the following order:
/datafs/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3
/datafs/.tmp/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3
/datafs/mobdir/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3
/datafs/archive/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3
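These candidates come from the HFileLink wrapping the StoreFile reference. A paraphrased sketch of how the four locations are assembled (compare HFileLink in the HBase source; variable names here are illustrative, the directory layout matches the paths above):
import org.apache.hadoop.fs.Path;

// The relative part "data/<namespace>/<table>/<region>/<family>/<hfile>" is the
// same for every candidate; only the root it is resolved against differs.
Path relative    = new Path("data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3");
Path rootDir     = new Path("/datafs");
Path originPath  = new Path(rootDir, relative);                      // /datafs/data/...
Path tempPath    = new Path(new Path(rootDir, ".tmp"), relative);    // /datafs/.tmp/data/...
Path mobPath     = new Path(new Path(rootDir, "mobdir"), relative);  // /datafs/mobdir/data/...
Path archivePath = new Path(new Path(rootDir, "archive"), relative); // /datafs/archive/data/...
// FileLink.getLocations() returns them in exactly this order:
// origin (data), .tmp, mobdir, archive.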
When the map task starts reading, ExportSnapshot.ExportMapper#openSourceFile initializes the InputStream; during that, FileLink.tryOpen() is called to determine the real location of the StoreFile (it iterates over the 4 paths; an exception means the file is not at that location, so the next one is tried):
private FSDataInputStream tryOpen() throws IOException {
for (Path path: fileLink.getLocations()) {
if (path.equals(currentPath)) continue;
try {
in = fs.open(path, bufferSize);
if (pos != 0) in.seek(pos);
assert(in.getPos() == pos) : "Link unable to seek to the right position=" + pos;
currentPath = path;
return(in);
} catch (FileNotFoundException e) {
// Try another file location
}
}
throw new FileNotFoundException("Unable to open link: " + fileLink);
}
Debugging shows that fs here is an org.apache.hadoop.hdfs.web.WebHdfsFileSystem instance.
Unfortunately, with WebHdfsFileSystem neither open() nor getPos() throws an exception here for a missing file, so the very first candidate is accepted (even though the file actually lives under archive):
/datafs/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3
This path is then set as currentPath (it is remembered so the next pass can skip it).
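A minimal illustration of that behaviour, using the URI and path from this reproduction (this snippet is not from the original post): with webhdfs the actual OPEN request is deferred, so opening a non-existent path succeeds and the FileNotFoundException only appears on the first read().
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Open a path that no longer exists under data/ (the file has been archived).
FileSystem fs = FileSystem.get(URI.create("webhdfs://archive.cloudera.com"), new Configuration());
FSDataInputStream in = fs.open(new Path(
    "/datafs/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3"));
// No exception so far; the FileNotFoundException only surfaces here:
in.read();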
When InputStream.read(buffer) is invoked on that stream, FileLink's read() runs:
@Override
public int read() throws IOException {
int res;
try {
res = in.read();
} catch (FileNotFoundException e) {
res = tryOpen().read();
} catch (NullPointerException e) { // HDFS 1.x - DFSInputStream.getBlockAt()
res = tryOpen().read();
} catch (AssertionError e) { // assert in HDFS 1.x - DFSInputStream.getBlockAt()
res = tryOpen().read();
}
if (res > 0) pos += 1;
return res;
}
Because the stream was not initialized with the correct path, the first in.read() throws FileNotFoundException (the first one).
The catch block then calls tryOpen().read() to walk the 4 locations again; currentPath is the data path and is skipped, so the next location is tried (the file is not there either):
/datafs/.tmp/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3
Since fs.open() over webhdfs again succeeds, tryOpen() returns a stream for the .tmp path; its read() throws FileNotFoundException (the second one), which is no longer caught, so it propagates up and the task fails. This is why the error usually complains about a file missing under .tmp, even though .tmp has little to do with the real problem.
2018-08-13 20:13:59,738 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hbase (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /datafs/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3
2018-08-13 20:13:59,740 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hbase (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /datafs/.tmp/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3
2018-08-13 20:13:59,741 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hbase (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /datafs/mobdir/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3
--------------------------------------------------------------------------------------------------------------------------------------------
2018-08-13 20:13:59,830 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hbase (auth:SIMPLE) cause:java.io.FileNotFoundException: File /datafs/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3 not found.
2018-08-13 20:13:59,833 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hbase (auth:SIMPLE) cause:java.io.FileNotFoundException: File /datafs/.tmp/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3 not found.
--------------------------------------------------------------------------------------------------------------------------------------------
2018-08-13 20:13:59,833 ERROR [main] org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper: Error copying webhdfs://archive.cloudera.com/datafs/archive/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3 to webhdfs://nameservice1/hbase/archive/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3
java.io.FileNotFoundException: File /datafs/.tmp/data/default/mytable/c48642fecae3913e0d09ba236b014667/info/3c5e9ec890f04560a396040fa8b592a3 not found.
at sun.reflect.GeneratedConstructorAccessor14.newInstance(Unknown Source)
The FileNotFoundExceptions between the two separators are the two exceptions thrown during read().
The "File does not exist" warnings above the separator come from ExportSnapshot's getSourceFileStatus() call; there you can see that after probing data, .tmp and mobdir it does reach the correct archive path (the successful lookup is not logged).
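The reason the status lookup succeeds where open() fails is that fs.getFileStatus() raises FileNotFoundException immediately for a missing path, even over webhdfs, so iterating the same four locations falls through to archive. A paraphrased sketch of that loop (compare FileLink#getFileStatus in the HBase source; fs and fileLink as in the tryOpen() snippet above, not a verbatim copy):
// Probing by status: every miss throws right away, so the iteration
// reaches the archive location instead of stopping at data/.tmp.
for (Path path : fileLink.getLocations()) {
  try {
    return fs.getFileStatus(path);   // throws FileNotFoundException for data, .tmp, mobdir
  } catch (FileNotFoundException e) {
    // try the next location
  }
}
throw new FileNotFoundException("Unable to open link: " + fileLink);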
Solutions
To sum up: because of the behaviour above, the lookup in practice only ever checks the data and .tmp directories and never reaches archive.
So there are two directions: either keep the StoreFiles from ending up under archive in the first place, or make the lookup resolve the archive path correctly.
Avoid the StoreFile ending up in archive
From production experience, while data is being written heavily a region keeps producing StoreFiles; once their number reaches the threshold, a major/minor compaction is triggered,
and the compacted-away StoreFiles are moved into archive. The following measures help avoid a compaction during the copy:
- Run major_compact on the table before creating the snapshot (see the sketch after this list)
- If the table can tolerate being unavailable for a while (from a few minutes to tens of minutes), disable it before the operation
- Or raise hbase.hstore.compactionThreshold appropriately (when writes to the table are infrequent)
- Depending on the workload, leave as large a gap as possible between data ingestion and the copy (so that compactions have already finished on their own)
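A minimal sketch of the first measure through the Java Admin API (the original post used the shell; note that majorCompact() is asynchronous, so in practice confirm the compaction has finished, e.g. via the compaction state or the UI, before taking the snapshot):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;

Configuration conf = HBaseConfiguration.create();
try (Connection conn = ConnectionFactory.createConnection(conf);
     Admin admin = conn.getAdmin()) {
  TableName table = TableName.valueOf("mytable");
  admin.majorCompact(table);           // asynchronous: only requests the compaction
  // ... wait for the compaction to complete ...
  admin.snapshot("mysnapshot", table); // then snapshot the compacted table
}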
Avoid using webhdfs
With an hdfs:// source the exception is thrown as expected at open time, so the lookup proceeds correctly (not verified in detail here).
Fix the bug in the source code
so that the path lookup can correctly fall through to the archive directory.
Borrowing from getSourceFileStatus(), add a fs.getFileStatus() call inside the for loop so that a FileNotFoundException is thrown as expected while iterating:
private FSDataInputStream tryOpen() throws IOException {
for (Path path : fileLink.getLocations()) {
if (path.equals(currentPath)) continue;
try {
fs.getFileStatus(path); // added: forces a FileNotFoundException here if the path does not exist
in = fs.open(path, bufferSize);
if (pos != 0) in.seek(pos);
assert(in.getPos() == pos) : "Link unable to seek to the right position=" + pos;
if (LOG.isTraceEnabled()) {
if (currentPath == null) {
LOG.debug("link open path=" + path);
} else {
LOG.trace("link switch from path=" + currentPath + " to path=" + path);
}
}
currentPath = path;
return(in);
} catch (FileNotFoundException e) {
// Try another file location
}
}
throw new FileNotFoundException("Unable to open link: " + fileLink);
}
Extract ExportSnapshot into its own module and reorganize its HFileLink, FileLink and WALLink dependencies,
then package it as a hadoop jar so that the change does not affect anything else.