请阅读
Data Normalization,
General Indexing概念和
Foreign Key约束,以保持数据清理与引用完整性.这将让你走.
将数据存储在数组中可能在纸上看起来很自然,但对db引擎来说,性能大多数没有索引使用.此外,您将在第二天找到获取和维护您的数据将是一场噩梦.
以下应该让你开始一个良好的开始,因为你修补. Joins.
create table student
( studentId int auto_increment primary key,
fullName varchar(100) not null
-- etc
);
create table dept
( deptId int auto_increment primary key,
deptName varchar(100) not null -- Economics
-- etc
);
create table course
( courseId int auto_increment primary key,
deptId int not null,
courseName varchar(100) not null,
-- etc
CONSTRAINT fk_crs_dept FOREIGN KEY (deptId) REFERENCES dept(deptId)
);
create table SCJunction
( -- Student/Course Junction table (a.k.a Student is taking the course)
-- also holds the attendance and grade
id int auto_increment primary key,
studentId int not null,
courseId int not null,
term int not null, -- term (I am using 100 in below examples for this term)
attendance int not null, -- whatever you want, 100=always there, 0=he must have been partying,
grade int not null, -- just an idea
-- See (Note Composite Index) at bottom concerning next two lines.
unique key(studentId,courseId,term), -- no duplicates allowed for the combo (note student can re-take it next term)
key (courseId,studentId),
CONSTRAINT fk_sc_student FOREIGN KEY (studentId) REFERENCES student(studentId),
CONSTRAINT fk_sc_courses FOREIGN KEY (courseId) REFERENCES course(courseId)
);
创建测试数据
insert student(fullName) values ('Henry Carthage'),('Kim Billings'),('Shy Guy'); -- id's 1,2,3
insert student(fullName) values ('Shy Guy');
insert dept(deptName) values ('History'),('Math'),('English'); -- id's 1,2,3
insert course(deptId,courseName) values (1,'Early Roman Empire'),(1,'Italian Nation States'); -- id's 1 and 2 (History dept)
insert course(deptId,courseName) values (2,'Calculus 1'),(2,'Linear Algebra A'); -- id's 3 and 4 (Math dept)
insert course(deptId,courseName) values (3,'World of Chaucer'); -- id 5 (English dept)
-- show why FK constraints are important based on data at the moment
insert course(deptId,courseName) values (66,'Fly Fishing 101'); -- will generate error 1452. That dept 66 does not exist
-- That error is a good error to have. Better than faulty data
-- Have Kim (studentId=2) enrolled in a few courses
insert SCJunction(studentId,courseId,term,attendance,grade) values (2,1,100,-1,-1); -- Early Roman Empire, term 100 (made up), unknown attendance/grade
insert SCJunction(studentId,courseId,term,attendance,grade) values (2,4,100,-1,-1); -- Linear Algebra A
insert SCJunction(studentId,courseId,term,attendance,grade) values (2,5,100,-1,-1); -- World of Chaucer
-- Have Shy Guy (studentId=3) enrolled in one course only. He is shy
insert SCJunction(studentId,courseId,term,attendance,grade) values (3,5,100,-1,-1); -- Early Roman Empire, term 100 (made up), unknow attendance/grade
-- note if you run that line again, the Error 1062 Duplicate entry happens. Can't take same course more than once per term
一些简单的问题.
什么课程在哪个部门?
显示所有,使用表别名(缩写)使打字较少,可读性(有时)更好
select c.courseId,c.courseName,d.deptId,d.deptName
from course c
join dept d
on c.deptId=d.deptId
order by d.deptName,c.courseName -- note the order
+----------+-----------------------+--------+----------+
| courseId | courseName | deptId | deptName |
+----------+-----------------------+--------+----------+
| 5 | World of Chaucer | 3 | English |
| 1 | Early Roman Empire | 1 | History |
| 2 | Italian Nation States | 1 | History |
| 3 | Calculus 1 | 2 | Math |
| 4 | Linear Algebra A | 2 | Math |
+----------+-----------------------+--------+----------+
这个学期谁来参加乔the大世界?
(知道courseId = 5)
以下受益于我们在SCJunction中的综合指数之一.复合是多个列的索引.
select s.StudentId,s.FullName
from SCJunction j
join student s
on j.studentId=s.studentId
where j.courseId=5 and j.term=100
+-----------+--------------+
| StudentId | FullName |
+-----------+--------------+
| 2 | Kim Billings |
| 3 | Shy Guy |
+-----------+--------------+
金比宁在这个术语中注册了什么?
select s.StudentId,s.FullName,c.courseId,c.courseName
from SCJunction j
join student s
on j.studentId=s.studentId
join course c
on j.courseId=c.courseId
where s.studentId=2 and j.term=100
order by c.courseId DESC -- descending, just for the fun of it
+-----------+--------------+----------+--------------------+
| StudentId | FullName | courseId | courseName |
+-----------+--------------+----------+--------------------+
| 2 | Kim Billings | 5 | World of Chaucer |
| 2 | Kim Billings | 4 | Linear Algebra A |
| 2 | Kim Billings | 1 | Early Roman Empire |
+-----------+--------------+----------+--------------------+
金不知所措,所以掉下数学课
delete from SCJunction
where studentId=2 and courseId=4 and term=100
运行上面的select语句显示金正在采取的:
+-----------+--------------+----------+--------------------+
| StudentId | FullName | courseId | courseName |
+-----------+--------------+----------+--------------------+
| 2 | Kim Billings | 5 | World of Chaucer |
| 2 | Kim Billings | 1 | Early Roman Empire |
+-----------+--------------+----------+--------------------+
啊,容易的话爸爸不会很开心.
注意事项如SCJunction.term.很多都可以写出来,大部分时候我会跳过它,除了说它也应该在FK的某个地方.你可能希望你的术语看起来更像是SPRING2015而不是int.
而且到了id.这是我做的方式.这是个人喜好.它需要知道id#,查找它们.其他人可以选择一个类似HIST101而不是17的课程.那些可读性非常高(但是索引较慢(几乎没有)),那么最适合你.
注意综合指数
综合指数(INDEX表示KEY,反之亦然)是组合多个列用于快速数据检索的索引.在SCJunction表中的两个复合物的订单被翻转,以便根据数据后面的查询的Universe,数据库引擎可以根据您将要执行的最左边的列选择要使用哪个索引进行最快的检索.
对于唯一键#1,旁边的注释说明强制不重复(意思是垃圾数据)是不言自明的.例如,学生1课程1第1学期在该表中不能存在两次.
要理解的一个关键概念是索引中列名最左边排序的概念.
对于仅在studentId之后的查询,则使用首先列出studentId(最左边)的键.在仅在courseId之后的查询中,使用最多具有courseId的键.在studentId和courseId之后的查询中,db引擎可以决定要使用哪个复合密钥.
当我说“后续”,我的意思是在on子句或where子句条件.
如果没有这两个复合键(它们中的第1列和第2列翻转),那么在所寻求的列不是最左边索引的查询中,您不会受益于密钥使用,并且对数据进行缓慢的表格扫描回来.
所以,这两个指标结合了以下两个概念
>基于最左侧或两者的快速数据检索(studentId和courseId列)
>根据studentId,courseId和术语值执行该表中的不重复数据
外卖
重要的一点是,接线表可以快速索引检索,并且将数据的统一管理与逗号分隔的数据(数组心态)压缩成列,以及使用这样一个构造的所有痛苦.