Mongodb概述
MongoDB是一个面向文档的基于分布式的NoSQL数据库
存储格式是类似于Json格式(BSON)
mongodb数据模型
database,collection,document,field ,index
1.不同文档可以有不同的字段
2.不同文档的字段类型可以不一样
模式验证:可以指定Json模式,插入或更新数据的时候,会按该模式进行验证
上限集合:可以指定MongoDB的集合大小,比如指定1G,当数据达到1G,集合大小不会再增加
1、MongoDB安装
1.1、配置yum源
cd /etc/yum.repos.d/
vim mongodb.repo
[MongoDB]
name=MongoDB Repository
baseurl=http://mirrors.aliyun.com/mongodb/yum/redhat/7Server/mongodb-org/4.0/x86_64/
gpgcheck=0
enabled=1
1.2、通过tar解压文件,或yum进行安装
tar -zxf mongodb-linux-x86_64-rhel70-3.2.22.tgz -C /opt/software/
配置环境变量并激活
[root@single11 software]# mkdir -p /var/lib/mongo
[root@single11 software]# mkdir -p /var/log/mongodb
cd /opt/software/mongodb
vi mongodb.conf
=====================================================
port=27017
dbpath=/var/lib/mongo
logpath=/var/log/mongodb/mongodb.log
logappend=true
fork=false
maxConns=100
noauth=true
journal=true
storageEngine=wiredTiger
bind_ip=0.0.0.0
mongod --config /opt/software/mongodb/mongodb.conf
#进入服务
mongo
1.3、启动mongod 服务
systemctl start mongod.service
systemctl status mongod.service
2、mongodb 操作(重点)
1.集合和数据库操作
> db
test
> use events
switched to db events
> db
events
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
> db
events
> db.test1.insert({"name":"mongodb test"})
WriteResult({ "nInserted" : 1 })
> show dbs
admin 0.000GB
config 0.000GB
events 0.000GB
local 0.000GB
> show collections
test1
> show tables
test1
> db.dropDatabase()
{ "dropped" : "events", "ok" : 1 }
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
> db
events
> db.createCollection("mycol1")
{ "ok" : 1 }
> show tables
mycol1
> show dbs
admin 0.000GB
config 0.000GB
events 0.000GB
local 0.000GB
>
2.1、修改mongodb 配置文件
vim /etc/mongod.conf
重启mongodb service
2.2、集合的创建与删除
> db.createCollection("mycol2",{capped:true,size:1024000})
{ "ok" : 1 }
> show tables
mycol1
mycol2
> db.mycol2.drop()
true
> show tables;
mycol1
2.3、文档操作
2.3.1、插入文档
db.customers.insertOne({“name”:“lisi”})
2.3.2、批量插入文档
db.users.insertMany([{“name”:“zhangsan”},{“name”:“lisi”},{“name”:“wangwu”}])
2.3.3、查询
条件查询
db.inventory.find({“status”:“A”}).pretty()
投影查询,获取指定的字段
db.inventory.find({“status”:“D”},{“status”:1,“size”:1})
in & nin
db.inventory.find({“status”:{$in:[“A”,“D”]}})
比较运算符
for(var i =1; i<30; i++) db.customers.insert ({id:i,name: "xiaoming",age: 100+i})
db.customers.find({age:{$lt:105}})
db.customers.find({age:{$gte:125}})
db.customers.find({age:{gt:110,lt:115}})
逻辑运算
db.customers.find({$or:[{id:20},{age:110}]})
db.customers.find({$and:[{id:20},{age:120}]})
2.3.4、更新
db.inventory.updateOne({"status":"D"},{$set:{"item":"journal"}})
db.inventory.updateMany({status:"A"},{$set:{"item":"updateA"}})
upsert操作
db.inventory.updateOne({"status":"C"},{$set:{"item":"journal"}},{upsert:true})
修改增加
db.inventory.updateOne({"status":"A"},{$set:{"item":"journal"},$inc:{qty:100}})
2.3.5、删除
db.inventory.deleteOne({status:"A"})
db.inventory.deleteMany({status:"D"})
db.inventory.deleteMany({})
2.3.6、多字段分组聚合
db.students.insertMany([
{name:"xiaoming",record:"A",gender:"male",age:14},
{name:"xiaoming",record:"A",gender:"male",age:14},
{name:"xiaoming",record:"A",gender:"male",age:14},
{name:"xiaoming",record:"b",gender:"male",age:21},
{name:"xiaoming",record:"b",gender:"male",age:21},
{name:"xiaoming",record:"b",gender:"male",age:22},
{name:"xiaoming",record:"b",gender:"female",age:24},
{name:"xiaoming",record:"b",gender:"female",age:24},
{name:"xiaoming",record:"b",gender:"female",age:12}
])
db.students.aggregate([
{$group:{_id:{record:"$record",gender:"$gender"},avg_age:{$avg:"$age"}}}
])
2.4、mongodb与hive整合
2.4.1、拷贝jar包到hive的lib目录
下载jar包,放到hive节点的第三方包/opt/hive/lib/目录下(这个目录通过hive hive.aux.jars.path属性配置)
http://mvnrepository.com/artifact/org.mongodb.mongo-hadoop/mongo-hadoop-core/2.0.2
http://mvnrepository.com/artifact/org.mongodb.mongo-hadoop/mongo-hadoop-hive/2.0.2
http://mvnrepository.com/artifact/org.mongodb/mongo-java-driver/3.2.1
mongo-hadoop-core-2.0.2.jar
mongo-hadoop-hive-2.0.2.jar
mongo-java-driver-3.2.1.jar
修改文件的访问权限:chmod 777 /opt/hive/lib/mongo-*
注意:mongo-java-driver jar版本不能低于mongodb组件的版本
重启Hive 完成
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler' WITH SERDEPROPERTIES('mongo.columns.mapping'='{"id":"_id"}') TBLPROPERTIES('mongo.uri'='mongodb://localhost:27017/test.test');
2.4.2、在hive中创建表,关联mongodb
create table individuals(
id int,
name string,
age int,
work STRUCT<title:string,hours:int>)
stored by "com.mongodb.hadoop.hive.MongoStorageHandler"
with serdeproperties
('mongo.columns.mapping'='{"id":"_id","work.title":"job.postition"}')
tblproperties
('mongo.uri'='mongodb://127.0.0.1:27017/test.persons');
insert into individuals select
1,"zhangsan",20,named_struct('title','beijing','hours',10);
2.5、mongodb加载csv
yum -y install mongodb-org-tools-4.0.1
mongoimport --headerline --type=csv --file=./train.csv -d events -c train
mongoimport --headerline --type=csv --file=./train.csv -d events -c train
2.6、重启mongodb
修改配置文件
vim /etc/mongod.conf
关闭mongod进程
ps -ef | grep mongo
kill 81775
ps -ef | grep mongo
重启服务
systemctl stop mongod.service
systemctl status mongod.service
systemctl start mongod.service
systemctl status mongod.service
2.7、配置mongod用户认证
进入mongo
use admin
db.createUser(
{
user:"kgc",
pwd:"123456",
roles:[{role:"userAdminAnyDatabase",db:"admin"},"readWriteAnyDatabase"]
}
)
开启用户认证,重启
[root@hadoop101 ~]# systemctl stop mongod.service
[root@hadoop101 ~]# vim /etc/mongod.conf
security:
authorization: enabled
[root@hadoop101 ~]# systemctl start mongod.service