链接数据库
mysql -hlocalhost -uroot -p123456
mysql -h10.0.0.2 -P3306 -uroot -p123456
显示所有数据库
show databases;
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| my_spider |
| myspider_project |
| myspider_result |
| myspider_task |
| mysql |
| performance_schema |
| test |
+--------------------+
8 rows in set (0.00 sec)
使用数据库
use database;
mysql> use myspider_task;
Database changed
显示当前数据库里所有的表
show tables;
mysql> show tables;
+-------------------------+
| Tables_in_myspider_task |
+-------------------------+
| debug_douban_movie |
| projectdb |
+-------------------------+
2 rows in set (0.00 sec)
显示表的schema
desc table;
mysql> desc projectdb;
+------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+-------+
| name | varchar(64) | NO | PRI | NULL | |
| group | varchar(64) | YES | | NULL | |
| status | varchar(16) | YES | | NULL | |
| script | text | YES | | NULL | |
| comments | varchar(1024) | YES | | NULL | |
| rate | float(11,4) | YES | | NULL | |
| burst | float(11,4) | YES | | NULL | |
| updatetime | double(16,4) | YES | | NULL | |
+------------+---------------+------+-----+---------+-------+
8 rows in set (0.12 sec)
将数据库myspider_task 中表 porjectdb 导出到文件, 再将此数据导入到 数据库myspider_project中
导出
mysqldump -hlocalhost -uroot -p123456 --opt myspider_task projectdb > /tmp/proj.sql
导入
mysql -hlocalhost -uroot -p123456 myspider_project < /tmp/proj.sql
删除原有表
use myspider_task;
...
drop table projectdb;
Query OK, 0 rows affected (0.10 sec)
将文本文件中的数据导入到数据库
文件文件中数据形如:
$head -2 test_dianping.dat
on_start|test_dianping|data:,on_start|2|{"priority": 9, "age": 0, "force_update": true}||{"callback": "on_start"}|{"process": {"exception": null, "ok": true, "logs": "", "follows": 4431, "result": null, "time": 0.3786752223968506}, "save": {}, "fetch": {"ok": true, "encoding": "unicode", "status_code": 200, "time": 0, "content": null, "headers": {}, "redirect_url": null, "error": null}}|1442372656.75403|1442372656.75406
994698e3141bc4a9dece268c4d1a7fce|test_dianping|http://www.dianping.com/shop/4730646|2|{"priority": 2, "age": 3600}|{"headers": {"Accept-Language": "zh-CN,zh;q=0.8", "Accept-Encoding": "gzip, deflate, sdch", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36", "Connection": "keep-alive", "Cache-Control": "max-age=0"}}|{"callback": "detail_page"}|{"process": {"exception": null, "ok": true, "logs": "", "follows": 0, "result": "{'rating':", "time": 0.2674269676208496}, "save": {}, "fetch": {"ok": true, "encoding": "UTF-8", "status_code": 200, "time": 0.29839086532592773, "content": null, "headers": {}, "redirect_url": null, "error": null}}|1442315073.11369|1442315073.11371
用mysqlimport 导入数据, mysqlimport 用法
mysqlimport [options] database_name, table_name.*
mysqlimport -hlocalhost -uroot -p123456 --fields-terminated-by="|" myspider_task /home/admin/test_dianping.dat
- myspider_task 数据库名
- test_dianping.dat 即是将数据插入 test_dianping表, 文件写绝对路径,不然找不到数据文件(ruobaole)
- fields-terminated-by指定每个字段以‘ |’作分隔
但mysqlimport似乎bug 不少,导入数据出现错乱
mysql> select taskid, url, status, updatetime from test_dianping limit 2;
+------------------------------------------------------------------+--------------------------------------+--------+-----------------+
| taskid | url | status | updatetime |
+------------------------------------------------------------------+--------------------------------------+--------+-----------------+
| 00000cd3873bd4f87e38980d88dd8d97 | http://www.dianping.com/shop/7896843 | 2 | 1443269117.7505 |
| 00000cd3873bd4f87e38980d88dd8d97|test_dianping|http://www.dianpi | NULL | NULL | NULL |
+------------------------------------------------------------------+--------------------------------------+--------+-----------------+
2 rows in set (0.04 sec)