1.安装mysql
2.建一个tpcd的数据库。
2.下载TPC-H压缩包
3.使用TPC-H生成数据集(参考上一篇博文)
4.在TPC-H的dbgen文件夹中找到dss.ddl,dss.ri。dss.ddl是建表语句。dss.ri是增加主外键语句。
dss.ri有错误,增加外键时,外表的字段名丢了,没有写。需要自己修改,不然执行会报错。
5.执行上面两个文件,把表结构和主外键关系建好。
use tpcd;
source C:\Users\chen\Desktop\mysql\dss.ddl;(先执行dss.ddl ,路径根据你存放的位置来修改)
source C:\Users\chen\Desktop\mysql\dss.ri;(后执行dss.ri)
6.表建好后,就可以导入数据了。导入用TPC-H生成的数据集。
因为生成的数据都是分开的,最好写一个批处理。命名load.sh (按顺序执行 不然可能会因为主外键约束导致导入数据不成功)
LOAD DATA LOCAL INFILE 'C:/Users/chen/Desktop/mysql/10GB/region.tbl' INTO TABLE REGION FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n';
LOAD DATA LOCAL INFILE 'C:/Users/chen/Desktop/mysql/10GB/nation.tbl' INTO TABLE NATION FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n';
LOAD DATA LOCAL INFILE 'C:/Users/chen/Desktop/mysql/10GB/part.tbl' INTO TABLE PART FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n';
LOAD DATA LOCAL INFILE 'C:/Users/chen/Desktop/mysql/10GB/supplier.tbl' INTO TABLE SUPPLIER FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n';
LOAD DATA LOCAL INFILE 'C:/Users/chen/Desktop/mysql/10GB/partsupp.tbl' INTO TABLE PARTSUPP FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n';
LOAD DATA LOCAL INFILE 'C:/Users/chen/Desktop/mysql/10GB/customer.tbl' INTO TABLE CUSTOMER FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n';
LOAD DATA LOCAL INFILE 'C:/Users/chen/Desktop/mysql/10GB/orders.tbl' INTO TABLE ORDERS FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n';
LOAD DATA LOCAL INFILE 'C:/Users/chen/Desktop/mysql/10GB/lineitem.tbl' INTO TABLE LINEITEM FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n';
接下来,在命令行中执行load.sh,数据集大的话,导入时间有点长~~忧愁!
ps:上述采用的导入数据的方法太慢啦!因为建了主键,还有主外键约束,每次插入数据都要去检查,几千万条数据真的太多了,电脑会炸掉。下面介绍另一种导入数据的方法:
先执行dss.ddl;
把数据库的引擎改成MyISAM。具体方法就是,用navicat连上数据库,选择表,右键,再选择设计表,在选项里找到引擎,改一下就ok啦。
然后就是导数据。用上面介绍的方法。
最后再执行dss.ri。
– For table REGION
ALTER TABLE TPCD.REGION
ADD PRIMARY KEY (R_REGIONKEY);
– For table NATION
ALTER TABLE TPCD.NATION
ADD PRIMARY KEY (N_NATIONKEY);
ALTER TABLE TPCD.NATION
ADD FOREIGN KEY NATION_FK1 (N_REGIONKEY) references TPCD.REGION(R_REGIONKEY);
COMMIT WORK;
– For table PART
ALTER TABLE TPCD.PART
ADD PRIMARY KEY (P_PARTKEY);
COMMIT WORK;
– For table SUPPLIER
ALTER TABLE TPCD.SUPPLIER
ADD PRIMARY KEY (S_SUPPKEY);
ALTER TABLE TPCD.SUPPLIER
ADD FOREIGN KEY SUPPLIER_FK1 (S_NATIONKEY) references TPCD.NATION(N_NATIONKEY);
COMMIT WORK;
– For table PARTSUPP
ALTER TABLE TPCD.PARTSUPP
ADD PRIMARY KEY (PS_PARTKEY,PS_SUPPKEY);
COMMIT WORK;
– For table CUSTOMER
ALTER TABLE TPCD.CUSTOMER
ADD PRIMARY KEY (C_CUSTKEY);
ALTER TABLE TPCD.CUSTOMER
ADD FOREIGN KEY CUSTOMER_FK1 (C_NATIONKEY) references TPCD.NATION(N_NATIONKEY);
COMMIT WORK;
– For table LINEITEM
ALTER TABLE TPCD.LINEITEM
ADD PRIMARY KEY (L_ORDERKEY,L_LINENUMBER);
COMMIT WORK;
– For table ORDERS
ALTER TABLE TPCD.ORDERS
ADD PRIMARY KEY (O_ORDERKEY);
COMMIT WORK;
– For table PARTSUPP
ALTER TABLE TPCD.PARTSUPP
ADD FOREIGN KEY PARTSUPP_FK1 (PS_SUPPKEY) references TPCD.SUPPLIER(S_SUPPKEY);
COMMIT WORK;
ALTER TABLE TPCD.PARTSUPP
ADD FOREIGN KEY PARTSUPP_FK2 (PS_PARTKEY) references TPCD.PART(P_PARTKEY);
COMMIT WORK;
– For table ORDERS
ALTER TABLE TPCD.ORDERS
ADD FOREIGN KEY ORDERS_FK1 (O_CUSTKEY) references TPCD.CUSTOMER(C_CUSTKEY);
COMMIT WORK;
– For table LINEITEM
ALTER TABLE TPCD.LINEITEM
ADD FOREIGN KEY LINEITEM_FK1 (L_ORDERKEY) references TPCD.ORDERS(O_ORDERKEY);
COMMIT WORK;
ALTER TABLE TPCD.LINEITEM
ADD FOREIGN KEY LINEITEM_FK2 (L_PARTKEY,L_SUPPKEY) references
TPCD.PARTSUPP(PS_PARTKEY,PS_SUPPKEY);
COMMIT WORK;