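[Original] Knowledge Q&A on a Large Language Model with LangChain + Milvus + ChatGLM2

Milvus usage is not covered in detail here; developers need to learn the Milvus vector database on their own. The author loaded a portion of secondary-market public-sentiment data from ES into Milvus. # Multi-GPU support: use the two lines below in place of the line above, and set num_gpus to the number of GPUs you actually have. When a news item exists in the Milvus news collection, the large model generally answers correctly. Because the model's default download location was changed earlier, the path parameter needs to be changed here. Install LangChain and Milvus. [...] or a later version, for the best inference performance. Here I placed the downloaded model files in the local [...]. Download the embeddings model locally. Download the ChatGLM2 large-model source code.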

2023-11-03 11:17:16 768
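
The retrieve-then-answer flow described in the LangChain + Milvus + ChatGLM2 entry above can be illustrated with a minimal, dependency-free sketch. This is not the author's code: a toy bag-of-words vector stands in for the real embeddings model, a plain Python list stands in for the Milvus collection, and no model is called. The names `embed`, `retrieve`, `build_prompt`, and the sample `NEWS` items are all hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; an assumption standing in for a real embeddings model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, top_k=1):
    """Return the top_k documents most similar to the question.

    A vector database such as Milvus would do this search server-side;
    here it is a linear scan over an in-memory list.
    """
    query = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(query, embed(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, context):
    """Assemble the prompt that would be sent to the language model."""
    joined = "\n".join(context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\nQuestion: {question}"
    )


# Made-up sample items, for illustration only.
NEWS = [
    "Company A announced a share buyback plan",
    "Company B reported lower quarterly revenue",
]

if __name__ == "__main__":
    question = "What did Company B report"
    print(build_prompt(question, retrieve(question, NEWS)))
```

The sketch also shows why the entry notes that answers are generally correct when the news item is present in the collection: the model only sees what retrieval returns.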

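[Original] Engineering the Processing of LLM Knowledge Data by Integrating MinIO with the LangChain Framework

In the current era of large models, many companies have started building their product lines on large models and innovating on top of their existing business. Open-source large models currently popular in China include ChatGLM2 and LLaMA 2. Frameworks such as LangChain and Llama_index, combined with an LLM and a vector store, make it easy to build large-model systems for a range of domains. The richness and accuracy of the knowledge data is key to whether a large-model product succeeds. This article describes how to integrate MinIO with the LangChain framework to process large-model knowledge data in an engineered way, and includes the source code for the LangChain and MinIO integration.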

2023-11-01 18:56:05 316 1

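[Original] Installing and Configuring an Elasticsearch Cluster on CentOS

Elasticsearch installation. Reference article: http://www.ruanyifeng.com/blog/2017/08/elasticsearch.html
Linux environment:
1. First make sure JDK 1.8 is installed.
2. Download the elasticsearch-6.1.2.tar.gz package and extract it: tar -zxvf ./elasticsearch-6.1.2.tar.gz
3. sudo sysctl -w vm.ma...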

2018-10-18 16:32:08 246

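[Original] Configuring Passwordless SSH Login Before Installing Hadoop (CentOS)

1. Suppose the master machine needs passwordless access to slave1 and slave2. (Note: the user being accessed must match the logged-in user. For example, all three machines have a hadoop user, and every step below is performed as that same user.) Run ssh-keygen on the master machine. When it finishes, two files are generated by default in the ~/.ssh directory: id_rsa (the private key) and id_rsa.pub (the public key).
2. Import the public key into the authentication file and change the permissions. Import on the local machine (master): ca...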

2018-10-18 16:29:01 1049
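
The "import the public key and change the permissions" step in the SSH entry above can be sketched as follows. The entry's own command is truncated, so this is not a reconstruction of it. It assumes the standard OpenSSH convention of an `authorized_keys` file inside `~/.ssh`, with mode 700 on the directory and 600 on the file. The helper name `authorize_key` and the placeholder key string are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def authorize_key(ssh_dir, public_key):
    """Append public_key to ssh_dir/authorized_keys if it is not already there.

    Assumes the OpenSSH defaults: directory mode 700, file mode 600.
    Returns the path of the authorized_keys file.
    """
    ssh_dir = Path(ssh_dir)
    ssh_dir.mkdir(parents=True, exist_ok=True)
    os.chmod(ssh_dir, 0o700)

    auth_file = ssh_dir / "authorized_keys"
    existing = auth_file.read_text().splitlines() if auth_file.exists() else []

    key = public_key.strip()
    if key not in existing:  # avoid duplicate entries on repeated runs
        with auth_file.open("a") as handle:
            handle.write(key + "\n")

    os.chmod(auth_file, 0o600)
    return auth_file


if __name__ == "__main__":
    # Demo against a throwaway directory rather than the real ~/.ssh.
    with tempfile.TemporaryDirectory() as tmp:
        placeholder = "ssh-rsa PLACEHOLDER_KEY hadoop@master"  # not a real key
        path = authorize_key(Path(tmp) / ".ssh", placeholder)
        print(path.read_text(), end="")
```

On a real cluster the same function would be pointed at `~/.ssh` on each target machine, using the contents of the master's `id_rsa.pub`.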

[Original] Installing LLAP for Hive

Installing LLAP for Hive. Reference URL: https://blog.csdn.net/zhoudetiankong/article/details/73089225
Downloading and building Slider. Reference URL: https://blog.csdn.net/qingzhenli/article/details/72688539
The Slider version I used is 0.92.0-incubating. Download URL: https:/...

2018-10-18 16:23:58 1945

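[Original] Apache Zeppelin Source Code Analysis

Components of Zeppelin. For an introduction to Zeppelin's basic features, see: https://blog.csdn.net/spacewalkman/article/details/62135285
The basic structure of the Zeppelin program is as follows (only the main components are drawn, to make it easier to understand):
Zeppelin Server: the main Zeppelin program. On startup it loads the WAR package according to its configuration to provide the web service, and it communicates over the WebSocket protocol with the web-side ...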

2018-06-21 11:13:24 869 1

[Original] HBase Installation and Configuration Guide

HBase installation and configuration.
HBase download URL: http://mirror.bit.edu.cn/apache/hbase/stable/hbase-1.2.6-bin.tar.gz
HBase installation tutorials:
http://www.yiibai.com/hbase/hbase_installation.html
http://blog.csdn.net/smile0198/article/details/1766...

2018-03-21 15:53:24 696

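[Original] Installing and Configuring a Kafka + ZooKeeper Cluster (CentOS 7) and Solving Problems Encountered During Development

Kafka + ZooKeeper cluster installation, configuration, and troubleshooting (CentOS). For detailed installation and configuration steps, see: http://www.cnblogs.com/luotianshuai/p/5206662.html
Main ZooKeeper cluster configuration (zoo.cfg):
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/hadoop/...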

2018-03-21 15:49:39 601
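
The zoo.cfg values quoted in the Kafka + ZooKeeper entry above are easier to reason about as wall-clock timeouts. In ZooKeeper, `tickTime` is in milliseconds, while `initLimit` and `syncLimit` are counted in ticks. The sketch below is a small illustration of that arithmetic, not part of the original post. The helper names `parse_zoo_cfg` and `timeouts_ms` are hypothetical, and the truncated `dataDir` line is left out on purpose.

```python
def parse_zoo_cfg(text):
    """Parse key=value lines from a zoo.cfg-style string, skipping comments and blanks."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def timeouts_ms(settings):
    """Convert tick-based limits into milliseconds.

    initLimit: ticks a follower may take to connect to and sync with the leader.
    syncLimit: ticks a follower may lag behind the leader before being dropped.
    """
    tick = int(settings["tickTime"])
    return {
        "init_timeout_ms": int(settings["initLimit"]) * tick,
        "sync_timeout_ms": int(settings["syncLimit"]) * tick,
    }


# The three complete settings shown in the entry above.
CFG = """\
tickTime=2000
initLimit=10
syncLimit=5
"""

if __name__ == "__main__":
    print(timeouts_ms(parse_zoo_cfg(CFG)))
```

With these values a follower gets 10 × 2000 ms = 20 s to complete its initial sync and 5 × 2000 ms = 10 s of allowed lag.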

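[Original] Common Problems Reading Configuration Files When Running on a Spark Cluster

Problems reading configuration files when running on a Spark cluster. During real-time computation I hit the following exception:
Caused by: java.lang.NullPointerException
at com.szkingdom.kdap.api.compute.dao.DBDaoFactory.batchUpdate(DBDaoFactory.java:119)
at com.szkingdom.kdap.dao.statisticre...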

2018-03-21 15:29:35 2955

Advanced Perl Tutorial, Known as the "Big Camel" (Free)

Contents
Chapter 1 Perl Overview
1.1 Starting from Scratch
1.2 Natural Languages and Artificial Languages
  1.2.1 Variable Syntax
  1.2.2 Singular Variables
  1.2.3 Plural Variables
  1.2.4 Complex Data Structures
  1.2.5 Simple Data Structures
  1.2.6 Verbs
1.3 An Averaging Example
  1.3.1 How to Run It
1.4 Filehandles
1.5 Operators
  1.5.1 Binary Arithmetic Operators
  1.5.2 String Operators
  1.5.3 Assignment Operators

2017-10-11

Fast Big Data Analysis with Spark

A very good introductory book on Spark. Highly recommended!

Chapter 1 Introduction to Data Analysis with Spark
1.1 What Is Spark
1.2 A Unified Software Stack
  1.2.1 Spark Core
  1.2.2 Spark SQL
  1.2.3 Spark Streaming
  1.2.4 MLlib
  1.2.5 GraphX
  1.2.6 Cluster Managers
1.3 Spark's Users and Uses
  1.3.1 Data Science Tasks
  1.3.2 Data Processing Applications
1.4 A Brief History of Spark
1.5 Spark Versions and Releases
1.6 Storage Layers for Spark

Chapter 2 Downloading Spark and Getting Started
2.1 Downloading Spark
2.2 Spark's Python and Scala Shells
2.3 Introduction to Core Spark Concepts
2.4 Standalone Applications
  2.4.1 Initializing a SparkContext
  2.4.2 Building Standalone Applications
2.5 Summary

Chapter 3 Programming with RDDs
3.1 RDD Basics
3.2 Creating RDDs
3.3 RDD Operations
  3.3.1 Transformations
  3.3.2 Actions
  3.3.3 Lazy Evaluation
3.4 Passing Functions to Spark
  3.4.1 Python
  3.4.2 Scala
  3.4.3 Java
3.5 Common Transformations and Actions
  3.5.1 Basic RDDs
  3.5.2 Converting Between RDD Types
3.6 Persistence (Caching)
3.7 Summary

Chapter 4 Working with Key/Value Pairs
4.1 Motivation
4.2 Creating Pair RDDs
4.3 Transformations on Pair RDDs
  4.3.1 Aggregations
  4.3.2 Grouping Data
  4.3.3 Joins
  4.3.4 Sorting Data
4.4 Actions on Pair RDDs
4.5 Data Partitioning (Advanced)
  4.5.1 Determining an RDD's Partitioner
  4.5.2 Operations That Benefit from Partitioning
  4.5.3 Operations That Affect Partitioning
  4.5.4 Example: PageRank
  4.5.5 Custom Partitioners
4.6 Summary

Chapter 5 Loading and Saving Data
5.1 Motivation
5.2 File Formats
  5.2.1 Text Files
  5.2.2 JSON
  5.2.3 Comma-Separated and Tab-Separated Values
  5.2.4 SequenceFile
  5.2.5 Object Files
  5.2.6 Hadoop Input and Output Formats
  5.2.7 File Compression
5.3 Filesystems
  5.3.1 Local/"Regular" Filesystem
  5.3.2 Amazon S3
  5.3.3 HDFS
5.4 Structured Data with Spark SQL
  5.4.1 Apache Hive
  5.4.2 JSON
5.5 Databases
  5.5.1 Java Database Connectivity
  5.5.2 Cassandra
  5.5.3 HBase
  5.5.4 Elasticsearch
5.6 Summary

Chapter 6 Advanced Spark Programming
6.1 Introduction
6.2 Accumulators
  6.2.1 Accumulators and Fault Tolerance
  6.2.2 Custom Accumulators
6.3 Broadcast Variables
6.4 Working on a Per-Partition Basis
6.5 Piping to External Programs
6.6 Numeric RDD Operations
6.7 Summary

Chapter 7 Running Spark on a Cluster
7.1 Introduction
7.2 Spark Runtime Architecture
  7.2.1 Driver Node
  7.2.2 Executor Nodes
  7.2.3 Cluster Manager
  7.2.4 Launching a Program
  7.2.5 Summary
7.3 Deploying Applications with spark-submit
7.4 Packaging Code and Dependencies
  7.4.1 A Java Spark Application Built with Maven
  7.4.2 A Scala Spark Application Built with sbt
  7.4.3 Dependency Conflicts
7.5 Scheduling Within and Between Spark Applications
7.6 Cluster Managers
  7.6.1 Standalone Cluster Manager
  7.6.2 Hadoop YARN
  7.6.3 Apache Mesos
  7.6.4 Amazon EC2
7.7 Choosing a Suitable Cluster Manager
7.8 Summary

Chapter 8 Tuning and Debugging Spark
8.1 Configuring Spark with SparkConf
8.2 Components of Spark Execution: Jobs, Tasks, and Stages
8.3 Finding Information
  8.3.1 Spark Web UI
  8.3.2 Driver and Executor Logs
8.4 Key Performance Considerations
  8.4.1 Level of Parallelism
  8.4.2 Serialization Format
  8.4.3 Memory Management
  8.4.4 Hardware Provisioning
8.5 Summary

Chapter 9 Spark SQL
9.1 Linking with Spark SQL
9.2 Using Spark SQL in Applications
  9.2.1 Initializing Spark SQL
  9.2.2 Basic Query Example
  9.2.3 SchemaRDD
  9.2.4 Caching
9.3 Loading and Saving Data
  9.3.1 Apache Hive
  9.3.2 Parquet
  9.3.3 JSON
  9.3.4 From RDDs
9.4 JDBC/ODBC Server
  9.4.1 Working with Beeline
  9.4.2 Long-Lived Tables and Queries
9.5 User-Defined Functions
  9.5.1 Spark SQL UDFs
  9.5.2 Hive UDFs
9.6 Spark SQL Performance
9.7 Summary

Chapter 10 Spark Streaming
10.1 A Simple Example
10.2 Architecture and Abstraction
10.3 Transformations
  10.3.1 Stateless Transformations
  10.3.2 Stateful Transformations
10.4 Output Operations
10.5 Input Sources
  10.5.1 Core Sources
  10.5.2 Additional Sources
  10.5.3 Multiple Sources and Cluster Sizing
10.6 24/7 Operation
  10.6.1 Checkpointing
  10.6.2 Driver Fault Tolerance
  10.6.3 Worker Fault Tolerance
  10.6.4 Receiver Fault Tolerance
  10.6.5 Processing Guarantees
10.7 Streaming UI
10.8 Performance Considerations
  10.8.1 Batch and Window Sizes
  10.8.2 Level of Parallelism
  10.8.3 Garbage Collection and Memory Usage
10.9 Summary

Chapter 11 Machine Learning with MLlib
11.1 Overview
11.2 System Requirements
11.3 Machine Learning Basics

2017-10-11

portmap Tool

The portmap tool, a very handy port-mapping utility.

2015-08-06

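DOS TSR (Terminate and Stay Resident)

The DOS memory-resident (TSR) technique, a classic of the DOS era, from which multitasking in the Windows operating system was born.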

2012-03-05
