Original: Formatting a JSON string in vim
" json
let mapleader = "\<space>"
map <Leader>jf :%!python -m json.tool<CR>
command! JM :execute '%!python -m json.tool' | :execute '%!python -c "import re,sys;sys.stdout.write(re.sub(r\"\\\u[0-9a-f]{4}\", lambda m:m.group().decode(\"u.
2021-06-19 11:25:10 419
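The `jf` mapping in the excerpt above pipes the buffer through `python -m json.tool`. As a rough sketch of what that filter does, the snippet below pretty-prints a JSON string with the standard `json` module. `format_json` is a hypothetical helper name, and `ensure_ascii=False` is an assumed Python 3 stand-in for the post's truncated second step, which appears to rewrite `\uXXXX` escapes into readable characters.

```python
import json


def format_json(text, indent=4):
    """Parse a JSON string and return it pretty-printed (hypothetical helper)."""
    # ensure_ascii=False keeps non-ASCII characters as-is instead of \uXXXX escapes.
    return json.dumps(json.loads(text), indent=indent, ensure_ascii=False)


if __name__ == "__main__":
    print(format_json('{"name": "\\u4e2d\\u6587", "ids": [1, 2]}'))
```

Invalid input raises `json.JSONDecodeError`, which is roughly how `json.tool` behaves when the buffer is not valid JSON.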
Original: hdfs dfsadmin -report
Check whether blocks are evenly distributed across datanodes:
hdfs dfsadmin -report | grep 'DFS Used%'
2021-06-15 16:57:41 987 1
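To judge evenness from the `grep` output, the percentages can be compared numerically. The sketch below assumes each matching line looks like `DFS Used%: 41.20%`. The `SAMPLE` text is made up for illustration and is not real cluster output. The helper names `dfs_used_percentages` and `spread` are hypothetical.

```python
import re


def dfs_used_percentages(report):
    """Return every 'DFS Used%' value found in the report text, as floats."""
    # Assumed line shape: "DFS Used%: 41.20%"
    return [float(v) for v in re.findall(r"DFS Used%:\s*([\d.]+)%", report)]


def spread(values):
    """Difference between the most and least used values (0.0 if empty)."""
    return max(values) - min(values) if values else 0.0


# Hypothetical sample, not taken from a real cluster.
SAMPLE = """\
DFS Used%: 41.20%
DFS Used%: 39.85%
DFS Used%: 44.10%
"""

if __name__ == "__main__":
    values = dfs_used_percentages(SAMPLE)
    print(values, round(spread(values), 2))
```

A small spread suggests the datanodes are filled about equally. What counts as small is a judgment call for the cluster at hand.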
Original: The hdfs dfs test command
test
Usage: hadoop fs -test -[defswrz] URI
Options:
-d: if the path is a directory, return 0.
-e: if the path exists, return 0.
-f: if the path is a file, return 0.
-s: if the path is not empty, return 0.
-w: if the path exists and write permission is g
2021-06-11 14:09:36 972
Original: iceberg org.apache.iceberg.parquet.Parquet parquet file read
org.apache.iceberg.parquet.Parquet#read
public static ReadBuilder read(InputFile file) {
  return new ReadBuilder(file);
}
2021-06-03 16:56:29 399
Original: iceberg flink read path
org.apache.iceberg.flink.data.FlinkParquetReaders.StringReader.read
org.apache.iceberg.parquet.ParquetValueReaders.StructReader.read
2021-05-31 18:10:03 361
Original: iceberg flink write path
org.apache.iceberg.io.PartitionedFanoutWriter#write
public void write(T row) throws IOException {
  // org.apache.flink.table.data.RowData -> org.apache.iceberg.PartitionKey
  PartitionKey partitionKey = partition(row);
  // org.apache.iceberg.i
2021-05-27 20:10:53 262
Original: How iceberg organizes its metadata
metadata:
// org.apache.iceberg.hadoop.HadoopTableOperations#metadataRoot
private Path metadataRoot() {
  return new Path(location, "metadata");
}
metadata.json:
// org.apache.iceberg.hadoop.HadoopTableOperations#metadataFilePath
private Path meta
2021-05-26 18:12:58 703
Original: What is the difference between the hflush and hsync methods of FSDataOutputStream?
How it works:
client:
client --create--> namenode -> block -> packet -> DataStreamer -> datanode --ack--> client --complete--> namenode
org.apache.hadoop.hdfs.DFSOutputStream#hflush
org.apache.hadoop.hdfs.DFSOutputStream#hsync
org.apache.ha..
2021-05-25 20:14:59 341
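Original: Can ls show a file on HDFS while it is still being written?
How it works:
Uploading a local file to HDFS:
20G May 25 14:56 xx.tar
hdfs dfs -put xx.tar /data/
hdfs dfs -ls /data/
/data/xx.tar._COPYING_
Writing a file through the Java API:
org.apache.hadoop.fs.FileSystem#create has in fact already written the metadata to the editlog, so -ls shows a file with size = 0
org.apache.hadoop.fs.FSD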
2021-05-25 17:51:25 186
Original: hadoop distcp error: Could not find any valid local directory for s3ablock-xxxx
fs.s3a.buffer.dir
default: ${hadoop.tmp.dir}/s3a
desc: Comma separated list of directories that will be used to buffer file uploads to.
This parameter is used by org.apache.hadoop.fs.s3a.S3AFileSystem: when writing to s3a, temporary S3 file blocks are first written to the local storage directory (fs.s3a.buffer.dir), and then ...
2021-05-24 19:10:53 1045
Original: Java: printing colored strings to stdout with System.out.println
Unix shell
ANSI escape code
public static final String ANSI_RESET = "\u001B[0m";
public static final String ANSI_BLACK = "\u001B[30m";
public static final String ANSI_RED = "\u001B[31m";
public static final String ANSI_GREEN = "\u001B[32m";
public static.
2021-05-20 15:21:53 846
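The same ANSI escape codes work from any language that writes to a terminal. Below is a minimal Python sketch that reuses the reset, red, and green codes from the excerpt above. `colorize` is a hypothetical helper name, and the output only shows colors on terminals that interpret ANSI sequences.

```python
# Escape sequences copied from the constants listed in the post.
ANSI_RESET = "\u001b[0m"
ANSI_RED = "\u001b[31m"
ANSI_GREEN = "\u001b[32m"


def colorize(text, color):
    """Wrap text in a color code and reset the terminal afterwards."""
    return f"{color}{text}{ANSI_RESET}"


if __name__ == "__main__":
    print(colorize("error", ANSI_RED))
    print(colorize("ok", ANSI_GREEN))
```

Appending the reset code after every colored string keeps the color from leaking into later output.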
Original: Fixing the Java Sonar warning "Add the missing @deprecated Javadoc tag"
/**
 * Hello ~.
 *
 * @deprecated use {@link #new()} instead.
 */
@Deprecated
public void hello() {
  // ...
}
2021-05-19 11:11:53 3179
Original: Viewing the hdfs snapshot counter value
ssh $(hdfs haadmin -getAllServiceState |grep active |awk -F ':' '{print $1}') "NN_DIR=$(hdfs getconf -confKey 'dfs.namenode.name.dir'); FSIMAGE_NAME=$(ls -t $NN_DIR/current | grep fsimage | grep -v md5 | head -n 1); hdfs oiv -i $NN_DIR/current/$FSIMAGE_NA.
2021-05-17 16:34:51 251
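Original: Source code study: the yarn application state machine
RMApp is the data structure in the ResourceManager that maintains the lifecycle of an Application. It is implemented by RMAppImpl, which maintains an Application state machine recording every state an Application can be in (RMAppState) and the events that cause transitions between states (RMAppEvent).
State transition diagram
RMAppState
public enum RMAppState {
  // initial state
  NEW,
  // the RM has received the client's app sub.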
2021-05-15 00:58:55 760
Original: kudu - single-node installation
Official documentation
/etc/kudu/conf/master.gflagfile
--fs_wal_dir=/home/data/kudu/master/wal
--fs_data_dirs=/home/data/kudu/master/data
/etc/kudu/conf/tserver.gflagfile
--fs_wal_dir=/home/data/kudu/tserver/wal
-...
2019-07-18 22:10:01 1170
Original: mysql -binlog
lock tables tname read/write;
unlock tables;
$ cat /etc/my.cnf
[mysqld]
log-bin=mysql-bin
binlog-format=ROW
server_id=1
flush logs;
show master status;
sudo mysqlbinlog --no-defaults -v --base64-outpu...
2019-05-13 19:44:01 417
Original: kafka - cmd
List all topics
kafka-topics --zookeeper zks --list
Get the broker list
bash zkCli.sh -server IP:port(2181)
ls /brokers/ids
get /brokers/ids/{id}
Check the maximum consumption of a kafka topic
kafka-run-class kafka.tools.GetOffsetShe...
2019-04-30 18:00:12 316
Original: linux - centos6 ops (continuously updated)
Installation and network configuration
Download VirtualBox
Network: Bridged Adapter
Configure centos6:
[root@localhost ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
HWADDR=08:00:27:52:FB:8D
TYPE=E...
2019-04-22 22:33:31 406
Repost: tmux - repost
from doc theme
$ cat .tmux.conf
set-window-option -g mode-keys vi ...
2019-04-19 21:48:16 139
Original: idea - plantuml
brew install Graphviz
or
sudo apt-get install graphviz
download1 gyjw download2 plantuml
2019-04-19 00:10:39 480
Original: git - proxy
git config --global http.proxy 'socks5://127.0.0.1:1080'
git config --global https.proxy 'socks5://127.0.0.1:1080'
git config --global https.proxy http://127.0.0.1:1080
git config --global https.p...
2019-04-18 22:38:53 652
Original: git - upstream
git remote -v
git remote add upstream https://github.com/xxx.git
git remote -v
git fetch upstream
git checkout master
git merge upstream/master
Reference 1, Reference 2
2019-04-18 22:31:41 976
Original: ubuntu - formatting and mounting a disk
sudo fdisk -l
sudo mkfs.ext4 /dev/sda
sudo mkdir /home/plli/Data
sudo mount /dev/sda /home/plli/Data
sudo blkid
sudo vim /etc/fstab
> UUID=xxx /home/plli/Data ext4 defaults ...
2019-04-18 21:52:01 937
Original: python - you complete me - ImportError: cannot import name _remove_dead_weakref
brew uninstall --ignore-dependencies --force python@2
git submodule update --recursive --init
sh install.sh
2019-04-16 14:32:04 979
Original: python - date calculations
today = datetime.date.today()
today - datetime.timedelta(days=90)
time.time()/60/60/24
In [2]: time.strftime("%Y-%m-%d", time.localtime())
Out[2]: '2019-04-18'
In [3]: time.strptime("2019-02-02"...
2019-04-13 12:19:17 437
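The one-liners in the excerpt above can be wrapped into small functions. This is a sketch rather than the post's own code: `days_ago` and `parse_date` are hypothetical names, and the `"%Y-%m-%d"` format is assumed from the `strftime` call shown in the excerpt.

```python
import datetime


def days_ago(days, today=None):
    """Return the date `days` days before `today` (defaults to the current date)."""
    today = today or datetime.date.today()
    return today - datetime.timedelta(days=days)


def parse_date(text):
    """Parse a 'YYYY-MM-DD' string into a datetime.date."""
    return datetime.datetime.strptime(text, "%Y-%m-%d").date()


if __name__ == "__main__":
    print(days_ago(90))
    print(parse_date("2019-02-02"))
```

Passing `today` explicitly makes the arithmetic reproducible, which is convenient in tests.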
Original: linux - vimrc
"peerslee {
set foldmethod=indent
filetype on
filetype plugin on
filetype indent on
autocmd BufNewFile *.py exec ":call SetTitle()"
func SetTitle()
call setline(1, "...
2019-03-28 21:17:23 220
Original: linux - keepassx install
keepassx
install
sudo apt-get install build-essential cmake libqt4-dev libgcrypt11-dev zlib1g-dev libxtst-dev
cmake {target}
make -j5
sudo make install
{current}/src/keepassx
2019-03-22 00:05:31 454
Original: Java - UT - verifyStatic
@RunWith(PowerMockRunner.class)
@PrepareForTest({xx.class})
mockStatic(xx.class);
doSomething();
verifyStatic(xx.class, times(1));
xx.xxx(any());
2019-01-17 15:20:28 880
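Original: Java - details
1. Declare parameters that are not output parameters as final where possible, so they cannot be modified elsewhere
2. Do object creation (new) in the constructor where possible, so objects exist from the start; otherwise an NPE may be thrown
3. override equals() -> (1) reference (2) instanceof (3) ... (4) false
4. Methods that do not depend on member variables should be defined as static...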
2018-12-07 19:53:20 176
Original: mysql - forgotten root password
vim /etc/my.cnf
Add this line to the [mysqld] section: skip-grant-tables
service mysqld restart
mysqladmin -u root password
2018-11-25 00:46:18 217
Repost: Intellij IDEA "Cannot resolve symbol XXX": a roundup of fixes
Intellij IDEA Cannot resolve symbol XXX: a roundup of fixes
IntelliJ IDEA project Cannot Resolve symbol **: solution
2018-11-07 16:31:14 4645
Original: MapReduce - topk
package topk;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapr...
2018-10-05 22:06:24 511
Original: HDFS - upload file to hdfs
#!/bin/bash
export JAVA_HOME=/data/jdk1.8.0_111
export HADOOP_HOME=/data/hadoop-2.6.5
export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
export JRE_HOME=${JAVA_HOME}/jre
expo...
2018-09-26 00:05:01 952
Original: babun - zsh theme
.zshrc
ZSH_THEME="agnoster"
DEFAULT_USER="$USER"
.minttyrc
Font=Menlo for Powerline
FontHeight=14
Locale=
Charset=
BoldAsColour=no
FontSmoothing=full
2018-09-16 21:49:34 991
Original: ubuntu 16.04 - Nvidia binary driver install
1. nVidia GM107M [GeForce GTX 950M]
2. NVIDIA binary driver - version 384
3. /etc/modprobe.d/blacklist.conf
4. add
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist rivatv
blacklist ...
2018-08-05 23:48:45 838
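Original: Java - matching passwords with regular expressions (using ?:)
String password = "Zz345678901234567890";
// (?=.*x) is a non-capturing lookahead; it only checks whether the whole password contains x
String pattern = "^(?=.*\\d)(?=.*[a-z])(?=.*[A-Z])(\\S){8,20}$";
System.out.println(Pa...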
2018-06-07 00:43:03 2139
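The lookahead pattern in the excerpt above carries over to other regex engines almost unchanged. Here is a minimal Python sketch of the same check, with `is_valid_password` as a hypothetical helper name. Each `(?=...)` group asserts that a digit, a lowercase letter, or an uppercase letter occurs somewhere, without consuming input, and `\S{8,20}` then enforces 8 to 20 non-whitespace characters.

```python
import re

# Same rule as the Java pattern: at least one digit, one lowercase letter,
# one uppercase letter, and 8 to 20 non-whitespace characters in total.
PASSWORD_RE = re.compile(r"^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])\S{8,20}$")


def is_valid_password(password):
    """Return True if the password satisfies every lookahead and the length rule."""
    return PASSWORD_RE.match(password) is not None


if __name__ == "__main__":
    print(is_valid_password("Zz345678901234567890"))
    print(is_valid_password("zz345678"))
```

The capturing group around `\S` in the Java version is dropped here because nothing reads the captured value.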