Keep Learning

Learning Spark, CarbonData, Alluxio, and more, and a contributor to each. GitHub: https://github.com/xubo245. Feel free to reach me on WeChat: 601450868!


[Original] Genomic Data Processing 106: Running bwa-mem on paired-end data (10 million 100 bp reads, g38L100c10000000Nhs20Paired12)

Script: hadoop@Master:~/xubo/project/alignment/sparkBWA$ cat g38L100c10000000Nhs20Paired12Bwamem.sh echo "start" startTime4=`date +"%s.%N"` time4=`date +"%Y%m%d%H%M%S"` #spark-submit --cla

2018-01-11 00:45:49 1167

[Original] Genomic Data Processing 105: Running SparkBWA in YARN mode on 10 million paired reads (example g38L100c10000000Nhs20Paired12YarnPartition)

1. Data generation: art_illumina -ss HS20 -i GRCH38BWAindex/GRCH38chr1L3556522.fna -p -l 100 -m 200 -s 10 -c 10000000 -o g38L100c10000000Nhs20Paired Location: hadoop@Master:~/xubo/ref/GRCH38L1Index/pe$ pwd /home/hadoop/x

2018-01-11 00:45:34 658

[Original] Genomic Data Processing 104: SparkBWAMaster output file is empty and the intermediate SAM file cannot be found

Script 1: spark-submit --class SparkBWA \ --master spark://219.219.220.149:7077 \ --conf "spark.executor.extraJavaOptions=-Djava.library.path=/home/hadoop/xubo/tools/SparkBWA/build" \ --driver-java-options

2018-01-11 00:45:18 846

[Original] Genomic Data Processing 103: Running SparkBWA in YARN mode on 1 million paired reads

Script: spark-submit --class SparkBWA \ --master yarn-client \ --conf "spark.executor.extraJavaOptions=-Djava.library.path=/home/hadoop/xubo/tools/SparkBWA/build" \ --archives ./bwa.zip \ SparkBWA.jar \ -al

2018-01-11 00:44:58 584

[Original] Genomic Data Processing 102: Running SparkBWA locally on 1 million paired reads

Script: spark-submit --class SparkBWA \ --master local \ --archives bwa.zip \ SparkBWA.jar \ -algorithm mem -reads paired \ -index /home/hadoop/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta \ -partitions

2018-01-11 00:44:43 1137

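[Original] Genomic Data Processing 101: SparkBWA local-run configuration and example

1. Edit Makefile.common: change LIBBWA_LIBS = -lrt to LIBBWA_LIBS = -lrt -lz; otherwise the build reports the error in [5]. 2. After running make, set java.library.path. Steps: run vi /etc/profile and add export LD_LIBRARY_PATH=/home/hadoop/xubo/tools/SparkBWA/build:$LD_LIBRARY_PATH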

2018-01-11 00:44:24 1281

[Original] Genomic Data Processing 100: Processing 1 million paired reads with the bwa mem algorithm (GRCH38chr1L3556522N1000000L100paired12)

Run log: hadoop@Master:~/xubo/ref/GRCH38L1Index/pe$ bwa mem ../GRCH38chr1L3556522.fasta GRCH38chr1L3556522N1000000L100paired1.fastq GRCH38chr1L3556522N1000000L100paired2.fastq >GRCH38chr1L3556522N1000000L1

2018-01-11 00:44:10 1842 1

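[Original] Genomic Data Processing 99: Modifying the SparkBWA download step

Every make run downloads the Spark package (180 MB), so I commented that step out in the Makefile. Then I ran make: hadoop@Mcnode1:~/xubo/tools/SparkBWA$ make if [ ! -d "build" ]; then mkdir build; fi gcc -c -g -Wall -Wno-unused-function -O2 -fPIC -DHAVE_PTHREAD -DUSE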

2018-01-11 00:43:57 558

[Original] Genomic Data Processing 98: Complete log of the Spark on YARN problem when running SparkBWA

Script: hadoop@Mcnode1:~/xubo/tools/SparkBWA/build$ cat paired.sh spark-submit --class SparkBWA \ --master yarn-client \ --conf "spark.executor.extraJavaOptions=-XX:MaxPermSize=1024M" --driver-

2018-01-11 00:43:26 1018

[Original] Genomic Data Processing 97: Spark on YARN problem when running SparkBWA

hadoop@Master:~/xubo/tools/SparkBWA/build$ ./paired.sh Using properties file: /home/hadoop/cloud/spark-1.5.2/conf/spark-defaults.conf Adding default property: spark.executor.extraJavaOptions=-Djava.l

2018-01-11 00:42:28 473

[Original] Genomic Data Processing 96: Problem running sparkBWA (YARN)

hadoop@Master:~/xubo/project/alignment/sparkBWA$ ./paired.sh Using properties file: /home/hadoop/cloud/spark-1.5.2/conf/spark-defaults.conf Adding default property: spark.executor.extraJavaOp

2018-01-05 00:36:40 463

[Original] Genomic Data Processing 95: Problem running sparkBWA

Script: hadoop@Master:~/xubo/project/alignment/sparkBWA$ cat pairedERR.sh spark-submit --class SparkBWA \ --master local[4] \ --driver-memory 1500m \ --executor-memory 1500m \ --executor-cores 1 \ --arc

2018-01-05 00:34:15 528

[Original] Genomic Data Processing 94: Analyzing the k-mer distribution of the SRR003161 data

1. Two groups: (1) k-mer lengths 5 to 21; (2) k-mer lengths 5 to 55 by 10. 2. Code: package org.gcdss.cli import java.text.SimpleDateFormat import java.util._ import org.apache.spark._ import org.bdgenomics.adam.projection

2018-01-05 00:28:47 2290
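
The two k-mer length groups listed in post 94 can be illustrated with a minimal pure-Python sketch. This is not the author's Spark/ADAM code, which is truncated in the excerpt; `count_kmers` and the toy read are hypothetical, and the ranges assume Scala's inclusive `to` semantics.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping substring of length k in sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Group 1: k = 5, 6, ..., 21 (assumes `5 to 21` is inclusive, as in Scala).
group1 = list(range(5, 22))
# Group 2: k = 5, 15, ..., 55 (assumes `5 to 55 by 10`, as in Scala).
group2 = list(range(5, 56, 10))

# Hypothetical toy read; real input would be the SRR003161 reads.
read = "ACGT" * 6 + "AC"

for k in group1:
    counts = count_kmers(read, k)
    # k, number of distinct k-mers, total number of k-mers
    print(k, len(counts), sum(counts.values()))
```

Each printed line gives k, the number of distinct k-mers, and the total number of k-mers found in the toy read. Reads shorter than k simply contribute no k-mers.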

[Original] Genomic Data Processing 93: Installing and using sparkBWA

1. Install: git clone https://github.com/citiususc/SparkBWA.git cd SparkBWA make 2. Usage: it failed with an error: hadoop@Master:~/xubo/project/alignment/sparkBWA$ ./run.sh Error: Must specify a primary resource (JAR or P

2018-01-04 23:47:06 1794

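[Original] Genomic Data Processing 92: Reworking loadDataProcessing to handle the problem from Genomic Data Processing 91

1. Approach: as described in "Genomic Data Processing 91: The disease vcf2omim and dataProcessing data do not match", the current workaround uses a simple map and union so that alternateAllele is read as comma-separated values, and then unions the results. Remaining issue: this approach traverses the RDD four times; changing the return type to Array or some other form could reduce the time cost. Fortunately the RDD is small, only a little over 10,000 rows. 2. Solution code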

2018-01-04 23:45:48 388

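[Original] Genomic Data Processing 91: The disease vcf2omim and dataProcessing data do not match

1. Introduction: the vcf2omim record count is rdd2.count:8623; the dataProcessing record count is rdd2.count:10884 sum:2300 8584. Here 2300 is the number of records whose AlternateAllele contains a comma, for example ref is A and AlternateAllele is G,C. 2. Cause analysis: mainly, when the VCF is read in, a record of the kind described above becomes two records. Data: 1 10493 rs199606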

2018-01-04 23:43:22 476
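
The cause described in post 91, where a record whose AlternateAllele holds comma-separated values becomes several records on load, can be reproduced with a minimal plain-Python sketch. It is not the author's Scala code; the tuple layout, the function names, and the records themselves are made up for illustration.

```python
def has_multiple_alternates(record):
    """True when the alternate-allele field lists more than one allele."""
    return "," in record[4]


def expand(record):
    """Return one record per comma-separated alternate allele."""
    chrom, pos, snp_id, ref, alt = record
    return [(chrom, pos, snp_id, ref, allele) for allele in alt.split(",")]


# Hypothetical records: (chromosome, position, dbSNP id, ref, alternate allele).
records = [
    ("1", 100, "rs0000001", "A", "G,C"),
    ("1", 200, "rs0000002", "T", "C"),
    ("1", 300, "rs0000003", "G", "A"),
]

multi_allelic = sum(1 for r in records if has_multiple_alternates(r))
expanded = [row for r in records for row in expand(r)]

# Original count, records with a comma, count after expansion.
print(len(records), multi_allelic, len(expanded))
```

The expanded count exceeds the original count by one for every extra allele, which is the kind of gap that makes two pipelines disagree when only one of them splits such records.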

[Original] Genomic Data Processing 90: Run log of the disease DataProcessing after modification

hadoop@Master:~/xubo/project/callDisease/DataProcessing$ ./allVcf.sh start: vcfFile:/xubo/callVariant/vcf/All_20160407.vcf dbSnp2omimFile:/xubo/callDisease/input/omimFilter9Text.txt omimFile:/xubo/ca

2018-01-04 23:41:56 342

[Original] Genomic Data Processing 89: vcf2omim error on a large dataset

hadoop@Master:~/xubo/project/callDisease/Vcf2Omim$ ./allVcf.sh start call Vcf2Omim start:Vcf2Omim vcfArrRDD:end [Stage 1:> (0 + 15) / 203] 16/06

2018-01-04 23:39:33 487

[Original] Genomic Data Processing 88: Using vcf2omim to obtain OMIM and dbSnpId information

1. Code: /** * @author xubo * more code:https://github.com/xubo245/SparkLearning * more blog:http://blog.csdn.net/xubo245 */ package org.gcdss.cli.disease import java.text.

2018-01-04 23:36:28 597

CarbonData Learning Materials

A collection of Apache CarbonData learning materials, including videos, documents, and files.

2018-11-22

opencv 3.4.1 jar

opencv-341.jar. To invoke OpenCV, you can add it to your project.

2018-05-16

Advanced Shell Scripting

Advanced Shell Scripting

2016-03-15

2015 China Software Developer White Paper

2016-01-12

neo4j-javadocs-2.3.1-javadoc.jar

neo4j-javadocs-2.3.1-javadoc.jar neo4j 2.3.1 API

2015-11-26

neo4j-enterprise-2.3.1-unix.tar.gz

neo4j-enterprise-2.3.1-unix.tar.gz, downloaded from the official website

2015-11-25

neo4j-enterprise-2.3.0-M03-unix.tar.gz

neo4j-enterprise-2.3.0-M03-unix.tar.gz, downloaded from the official website

2015-11-25

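Champion's Defense Slides (PPT) from the Fund Inflow and Outflow Forecasting Competition

Champion's defense slides (PPT) from the Fund Inflow and Outflow Forecasting Competition, Alibaba Cloud Tianchi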

2015-09-09

redis-3.0.4 installation package

redis-3.0.4.tar.gz, the redis-3.0.4 installation package, downloaded from the official website

2015-09-09

JDK.API.7_English.chm

JDK.API.7_English.chm Java™ Platform, Standard Edition 7 API Specification This document is the API specification for the Java™ Platform, Standard Edition.

2015-08-24

Java 2 SE 6 Documentation.chm

Java 2 SE 6 Documentation.chm Java™ SE 6 Platform at a Glance. This document covers the Java™ Platform, Standard Edition 6 JDK. Its product version number is 6 and developer version number is 1.6.0, as described in Platform Name and Version Numbers. For information on a feature of the JDK, click on a component in the diagram below.

2015-08-24

JavaSE中文API.chm (Java SE API, Chinese edition)

JavaSE中文API.chm Java™ 2 Platform Standard Edition 5.0 API Specification. This document is the API specification for Java 2 Platform Standard Edition 5.0.

2015-08-24

JDK API 1.7, English edition, with index

Java, JDK API 1.7, English edition with index, Java™ Platform, Standard Edition 7 API Specification

2015-08-24

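Complete compilation of written-test and interview questions from Microsoft, Google, Baidu, Tencent, and other major companies (.rar)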

2015-08-20

A collection of 10 classic algorithm books

2015-08-20

80 written-test and interview questions from Baidu, People Search, Alibaba, Tencent, Huawei, Xiaomi, and Sogou (.pdf)

2015-08-20

Color space conversion (MATLAB)

Color space conversion in MATLAB: RGB, HSV, YIQ, NTSC

2014-04-14

isrgb.m, MATLAB

isrgb.m matlab rgb
function y = isrgb(x)
%ISRGB Return true for RGB image.
%   FLAG = ISRGB(A) returns 1 if A is an RGB truecolor image and
%   0 otherwise.
%
%   ISRGB uses these criteria to determine if A is an RGB image:
%
%   - If A is of class double, all values must be in the range
%     [0,1], and A must be M-by-N-by-3.
%
%   - If A is of class uint8 or uint16, A must be M-by-N-by-3.
%
%   Note that a four-dimensional array that contains multiple RGB
%   images returns 0, not 1.
%
%   Class Support
%   -------------
%   A can be of class uint8, uint16, or double. If A is of
%   class logical it is considered not to be RGB.
%
%   See also ISBW, ISGRAY, ISIND.
%   Copyright 1993-2003 The MathWorks, Inc.
%   $Revision: 1.15.4.2 $ $Date: 2003/08/23 05:52:55 $
wid = sprintf('Images:%s:obsoleteFunction',mfilename);
str1= sprintf('%s is obsolete and may be removed in the future.',mfilename);
str2 = 'See product release notes for more information.';
warning(wid,'%s\n%s',str1,str2);
y = size(x,3)==3;
if y
   if isa(x, 'logical')
      y = false;
   elseif isa(x, 'double')
      % At first just test a small chunk to get a possible quick negative
      m = size(x,1);
      n = size(x,2);
      chunk = x(1:min(m,10),1:min(n,10),:);
      y = (min(chunk(:))>=0 && max(chunk(:))<=1);
      if y
         y = (min(x(:))>=0 && max(x(:))<=1);
      end
   end
end

2014-03-27

C header file package (include)

C header file package (include): stdio.h, stdlib.h, etc.

2013-10-18

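Numerical Methods lab: the Gauss_Seidel and Runge_Kutta methods

Numerical Methods lab documentation. PB10210016, Xu Bo.
Requirements: Program 15 on page 208 of the second edition; Program 20 on page 208 of the second edition, changing second order to fourth order and solving the second one.
Environment: operating system: Windows 8 64-bit; compiler software: Code::Blocks, version 10.05, 32-bit.
Submission time: before the exam.
Notes: Gauss_Seidel: the data file is on the left. To make repeated testing easier, the data in the txt file can be copied into the exe to run; see the figure above for the input format. The figure above shows one of the correct outputs. Runge_Kutta: the data file is on the left. To make repeated testing easier, the data in the txt file can be copied into the exe to run; see the figure above for the input format. The figure above shows one of the correct outputs.
Attachments: Program 15: Gauss_Seidel code, runnable exe, input data file, and run screenshots. Program 20: Runge_Kutta code, runnable exe, input data file, and run screenshots.
Reflections: This lab gave me a deeper understanding of the Gauss_Seidel and Runge_Kutta methods along with hands-on experience running them, and writing the programs clarified the input and output data at each step of the methods. Overall I learned a lot. We should write more programs like these, and I hope they can be put on a web page so that entering data produces the result directly.
PB10210016, Xu Bo, 2013.5.28. For the code, contact QQ: 601450868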

2013-10-17
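
As an illustration of the fourth-order method mentioned in the lab report above, here is a minimal sketch of the classical Runge-Kutta step. It is not the author's code, which is not included in the listing; the test equation y' = y, the step size, and the function names are assumptions chosen for illustration.

```python
import math


def rk4_step(f, x, y, h):
    """Advance y' = f(x, y) by one step of size h with classical RK4."""
    k1 = f(x, y)
    k2 = f(x + h / 2, y + h * k1 / 2)
    k3 = f(x + h / 2, y + h * k2 / 2)
    k4 = f(x + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6


def solve(f, x0, y0, h, steps):
    """Apply rk4_step repeatedly and return the final y value."""
    x, y = x0, y0
    for _ in range(steps):
        y = rk4_step(f, x, y, h)
        x += h
    return y


# Assumed test problem: y' = y with y(0) = 1, whose exact solution is e^x.
approx = solve(lambda x, y: y, 0.0, 1.0, 0.1, 10)
print(approx, math.e)
```

With ten steps of size 0.1 the approximation of y(1) should land very close to e, which gives a quick sanity check before trying other equations.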
