6.830 lab1

Lab link
Reference blog
exercise 1
exercise 2
exercise 3
exercise 4
exercise 5
exercise 6
Summary

Exercise 1

Lab 1 implements the tuple class Tuple and its schema class TupleDesc.

1. TupleDesc can be understood as the table header.
		Every time we create a table in the database,
		we first have to specify
			1> the name of each column
			2> the type (domain) of each column
		TupleDesc represents a column with the inner class TDItem:
			inside TDItem, Type fieldType is the column type
					  and String fieldName is the column name
/**
     * The tuple schema
     */
    CopyOnWriteArrayList<TDItem> tupleSchema;

    /**
     * A help class to facilitate organizing the information of each field
     * */
    public static class TDItem implements Serializable {

        private static final long serialVersionUID = 1L;

        /**
         * The type of the field
         * */
        public final Type fieldType;
        
        /**
         * The name of the field
         * */
        public final String fieldName;

        public TDItem(Type t, String n) {
            this.fieldName = n;
            this.fieldType = t;
        }

        public String toString() {
            return fieldName + "(" + fieldType + ")";
        }
    }

In other words: a tuple schema is represented by a TupleDesc, and the TupleDesc maintains a list that stores the type and name of each column in the schema.

2. Tuple is a single row of the table.
	1. Obviously, every row needs to know its TupleDesc.
	2. For each row we likewise use a list to store the values: CopyOnWriteArrayList<Field> fields
		Each Field holds one column's value.
		Note that each value's type must match the one described in the TupleDesc.

	   There is a Field interface, implemented by IntField and StringField;
	   both have a getType method
	   that returns INT_TYPE or STRING_TYPE from the Type enum.

	   The Type enum defines the column types; a Field represents one column and holds its value.
	3. We also record, for each tuple (i.e. each row of data), which page it is on and in which slot.
		This is encapsulated in RecordId, which has two attributes:
									the page id: PageId pid;
									and the slot number: int tupleno;
		 When we implement the PageId interface, e.g. HeapPageId, we again have two attributes:
		 							the table id: int tableId;
		 							and this page's number within the table: int pgNo;

         Together these record which table, which page, and which slot a row of data lives in.
    /**
     * The schema of this tuple
     */
    private TupleDesc tupleSchema;

    /**
     * The values of all fields of this tuple
     */
    private final CopyOnWriteArrayList<Field> fields;

    /**
     * The location of this tuple
     */
    private RecordId recordId;
As for reading the data back out,
Tuple and TupleDesc each have an iterator method
	that returns an iterator over the list each maintains,
	used to iterate over TupleDesc's TDItems, i.e. the column types and names,
				and over Tuple's Fields, i.e. the column values.
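The Tuple/TupleDesc relationship above can be sketched as a tiny self-contained toy. Note that `SchemaSketch` and `describe` are made-up illustrative names, not the lab's real API; only `TDItem` and the `fieldType(fieldName)` format mirror simpledb:

```java
import java.util.List;
import java.util.StringJoiner;

public class SchemaSketch {
    // Minimal stand-in for simpledb's Type enum
    enum Type { INT_TYPE, STRING_TYPE }

    // One column of the schema, analogous to TupleDesc.TDItem
    record TDItem(Type fieldType, String fieldName) { }

    // Mirrors the format TupleDesc.toString() is asked to produce:
    // "fieldType[0](fieldName[0]), ..., fieldType[M](fieldName[M])"
    static String describe(List<TDItem> schema) {
        StringJoiner joiner = new StringJoiner(", ");
        for (TDItem item : schema) {
            joiner.add(item.fieldType() + "(" + item.fieldName() + ")");
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // A two-column schema: id INT, name STRING
        List<TDItem> schema = List.of(
                new TDItem(Type.INT_TYPE, "id"),
                new TDItem(Type.STRING_TYPE, "name"));
        System.out.println(describe(schema)); // INT_TYPE(id), STRING_TYPE(name)
    }
}
```

Using a StringJoiner also sidesteps the trailing-comma handling that a hand-rolled StringBuilder loop needs.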

At this point,
your code should pass the unit tests TupleTest and TupleDescTest.
modifyRecordId() should fail because you haven't implemented it yet.

Full code

package simpledb.storage;

import simpledb.common.Type;

import java.io.Serializable;
import java.util.*;
import java.util.concurrent.CopyOnWriteArrayList;

/**
 * TupleDesc describes the schema of a tuple.
 */
public class TupleDesc implements Serializable {

    /**
     * The tuple schema
     */
    CopyOnWriteArrayList<TDItem> tupleSchema;

    /**
     * A help class to facilitate organizing the information of each field
     * */
    public static class TDItem implements Serializable {

        private static final long serialVersionUID = 1L;

        /**
         * The type of the field
         * */
        public final Type fieldType;
        
        /**
         * The name of the field
         * */
        public final String fieldName;

        public TDItem(Type t, String n) {
            this.fieldName = n;
            this.fieldType = t;
        }

        public String toString() {
            return fieldName + "(" + fieldType + ")";
        }
    }

    /**
     * @return
     *        An iterator which iterates over all the field TDItems
     *        that are included in this TupleDesc
     * */
    public Iterator<TDItem> iterator() {
        // some code goes here
        return tupleSchema.iterator();
    }

    private static final long serialVersionUID = 1L;

    /**
     * Create a new TupleDesc with typeAr.length fields with fields of the
     * specified types, with associated named fields.
     * 
     * @param typeAr
     *            array specifying the number of and types of fields in this
     *            TupleDesc. It must contain at least one entry.
     * @param fieldAr
     *            array specifying the names of the fields. Note that names may
     *            be null.
     */
    public TupleDesc(Type[] typeAr, String[] fieldAr) {
        // some code goes here
        if(typeAr==null){
            throw new NullPointerException("typeAr is null");
        }
        int typeLen = typeAr.length;
        int fieldLen;
        if(fieldAr==null){
            fieldLen=0;
        }else {
            fieldLen=fieldAr.length;
        }

        if(typeLen<=0||typeLen<fieldLen){// typeAr is invalid, or fieldAr is longer than typeAr
            throw new IllegalArgumentException();
        }
        tupleSchema=new CopyOnWriteArrayList<>();
        int i=0;
        for(;i<typeLen&&i<fieldLen;i++){
            tupleSchema.add(new TDItem(typeAr[i],fieldAr[i]));
        }
        for(;i<typeLen;i++){
            tupleSchema.add(new TDItem(typeAr[i],null));
        }
    }

    /**
     * Constructor. Create a new tuple desc with typeAr.length fields with
     * fields of the specified types, with anonymous (unnamed) fields.
     * 
     * @param typeAr
     *            array specifying the number of and types of fields in this
     *            TupleDesc. It must contain at least one entry.
     */
    public TupleDesc(Type[] typeAr) {
        // some code goes here
        this(typeAr,null);// this(args) invokes the matching constructor; it cannot be called through the class name
    }

    /**
     * @return the number of fields in this TupleDesc
     */
    public int numFields() {
        // some code goes here
        return tupleSchema.size();
    }

    /**
     * Gets the (possibly null) field name of the ith field of this TupleDesc.
     * 
     * @param i
     *            index of the field name to return. It must be a valid index.
     * @return the name of the ith field
     * @throws NoSuchElementException
     *             if i is not a valid field reference.
     */
    public String getFieldName(int i) throws NoSuchElementException {
        // some code goes here
        if(i<0||i>=this.numFields()){
            throw new NoSuchElementException();
        }
        return tupleSchema.get(i).fieldName;
    }

    /**
     * Gets the type of the ith field of this TupleDesc.
     * 
     * @param i
     *            The index of the field to get the type of. It must be a valid
     *            index.
     * @return the type of the ith field
     * @throws NoSuchElementException
     *             if i is not a valid field reference.
     */
    public Type getFieldType(int i) throws NoSuchElementException {
        if(i<0||i>=this.numFields()){
            throw new NoSuchElementException();
        }
        return tupleSchema.get(i).fieldType;
    }

    /**
     * Find the index of the field with a given name.
     * 
     * @param name
     *            name of the field.
     * @return the index of the field that is first to have the given name.
     * @throws NoSuchElementException
     *             if no field with a matching name is found.
     */
    public int fieldNameToIndex(String name) throws NoSuchElementException {
        int i=0;
        for(;i<tupleSchema.size();i++){
            String fieldName=tupleSchema.get(i).fieldName;
            if((fieldName!=null&&fieldName.equals(name))||name==null&&fieldName==null){
                break;
            }
        }
        if(i==tupleSchema.size()){
            throw new NoSuchElementException();
        }
        return i;
    }

    /**
     * @return The size (in bytes) of tuples corresponding to this TupleDesc.
     *         Note that tuples from a given TupleDesc are of a fixed size.
     */
    public int getSize() {
        // some code goes here
        int len=0;
        for(TDItem tdItem:tupleSchema){
            int fieldTypeLen = tdItem.fieldType.getLen();
            len+=fieldTypeLen;
        }
        return len;
    }

    /**
     * Merge two TupleDescs into one, with td1.numFields + td2.numFields fields,
     * with the first td1.numFields coming from td1 and the remaining from td2.
     * 
     * @param td1
     *            The TupleDesc with the first fields of the new TupleDesc
     * @param td2
     *            The TupleDesc with the last fields of the TupleDesc
     * @return the new TupleDesc
     */
    public static TupleDesc merge(TupleDesc td1, TupleDesc td2) {
        if(td1==null){
            return td2;
        }
        if (td2==null){
            return td1;
        }

        int numFields1= td1.numFields();
        int numFields2= td2.numFields();
        Type[] fieldType = new Type[numFields1 + numFields2];
        String[] fieldName = new String[numFields1 + numFields2];
        int i=0,j=0;
        while(i<numFields1) {
            fieldType[i]=td1.tupleSchema.get(i).fieldType;
            fieldName[i]=td1.tupleSchema.get(i).fieldName;
            i++;
        }
        while(j<numFields2){
            fieldType[i]=td2.tupleSchema.get(j).fieldType;
            fieldName[i]=td2.tupleSchema.get(j).fieldName;
            i++;
            j++;
        }
        return  new TupleDesc(fieldType,fieldName);
    }

    /**
     * Compares the specified object with this TupleDesc for equality. Two
     * TupleDescs are considered equal if they have the same number of items
     * and if the i-th type in this TupleDesc is equal to the i-th type in o
     * for every i.
     * 
     * @param o
     *            the Object to be compared for equality with this TupleDesc.
     * @return true if the object is equal to this TupleDesc.
     */

    public boolean equals(Object o) {
        // some code goes here
        if(!(o instanceof TupleDesc)){
            return false;
        }
        TupleDesc other = (TupleDesc) o;
        if(this.numFields()!=other.numFields() ||
                this.getSize()!=other.getSize()){
            return false;
        }
        for(int i=0;i<this.numFields();i++){
            if(!this.getFieldType(i).equals(other.getFieldType(i))){
                return false;
            }
        }

        return true;
    }

    public int hashCode() {
        // If you want to use TupleDesc as keys for HashMap, implement this so
        // that equal objects have equals hashCode() results
       // throw new UnsupportedOperationException("unimplemented");
        StringBuilder hash= new StringBuilder();
        for (TDItem next : tupleSchema) {
            hash.append(next.fieldType)
                    .append(next.fieldName);
        }
        return hash.toString().hashCode();
    }

    /**
     * Returns a String describing this descriptor. It should be of the form
     * "fieldType[0](fieldName[0]), ..., fieldType[M](fieldName[M])", although
     * the exact format does not matter.
     * 
     * @return String describing this descriptor.
     */
    public String toString() {
        // some code goes here
        StringBuilder stringBuilder = new StringBuilder();
        for(int i=0;i<numFields();i++){
            TDItem tdItem = tupleSchema.get(i);
            stringBuilder.append(tdItem.fieldType.toString())
                    .append("(").append(tdItem.fieldName).append("),");
        }
        stringBuilder.deleteCharAt(stringBuilder.length()-1);// drop the trailing comma
        return stringBuilder.toString();
    }
}

package simpledb.storage;

import java.io.Serializable;
import java.util.Iterator;
import java.util.Objects;
import java.util.concurrent.CopyOnWriteArrayList;

/**
 * Tuple maintains information about the contents of a tuple. Tuples have a
 * specified schema specified by a TupleDesc object and contain Field objects
 * with the data for each field.
 */
public class Tuple implements Serializable {

    /**
     * The schema of this tuple
     */
    private TupleDesc tupleSchema;

    /**
     * The values of all fields of this tuple
     */
    private final CopyOnWriteArrayList<Field> fields;

    /**
     * The location of this tuple
     */
    private RecordId recordId;



    private static final long serialVersionUID = 1L;

    /**
     * Create a new tuple with the specified schema (type).
     *
     * @param td
     *            the schema of this tuple. It must be a valid TupleDesc
     *            instance with at least one field.
     */
    public Tuple(TupleDesc td) {
        // some code goes here
        this.tupleSchema =td;
        this.fields=new CopyOnWriteArrayList<>();
    }

    /**
     * @return The TupleDesc representing the schema of this tuple.
     */
    public TupleDesc getTupleDesc() {
        // some code goes here
        return tupleSchema;
    }

    /**
     * @return The RecordId representing the location of this tuple on disk. May
     *         be null.
     */
    public RecordId getRecordId() {
        // some code goes here
        return recordId;
    }

    /**
     * Set the RecordId information for this tuple.
     *
     * @param rid
     *            the new RecordId for this tuple.
     */
    public void setRecordId(RecordId rid) {
        // some code goes here
        this.recordId=rid;
    }

    /**
     * Change the value of the ith field of this tuple.
     *
     * @param i
     *            index of the field to change. It must be a valid index.
     * @param f
     *            new value for the field.
     */
    public void setField(int i, Field f) {
        // some code goes here
        if(i>=0&&i<fields.size()){
            fields.set(i,f);
        }else {
            fields.add(f);
        }
    }

    /**
     * @return the value of the ith field, or null if it has not been set.
     *
     * @param i
     *            field index to return. Must be a valid index.
     */
    public Field getField(int i) {
        // some code goes here
        if(i<0||i>=fields.size()){
            return null;
        }
        return fields.get(i);
    }

    /**
     * Returns the contents of this Tuple as a string. Note that to pass the
     * system tests, the format needs to be as follows:
     *
     * column1\tcolumn2\tcolumn3\t...\tcolumnN
     *
     * where \t is any whitespace (except a newline)
     */
    public String toString() {
        // some code goes here
        //throw new UnsupportedOperationException("Implement this");
        StringBuilder stringBuilder = new StringBuilder();
        for(int i=0;i<fields.size();i++){
            stringBuilder
                    //.append("FieldName: ")
                    //.append(tupleSchema==null?"null":tupleSchema.getFieldName(i))
                    //.append("==>Value: ")
                    .append(fields.get(i).toString())
                    .append("\t");
        }
        stringBuilder.append("\n");
        return stringBuilder.toString();

    }

    /**
     * @return
     *        An iterator which iterates over all the fields of this tuple
     * */
    public Iterator<Field> fields()
    {
        // some code goes here
        return fields.iterator();
    }

    /**
     * reset the TupleDesc of this tuple (only affecting the TupleDesc)
     * */
    public void resetTupleDesc(TupleDesc td)
    {
        // some code goes here
        this.tupleSchema =td;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Tuple tuple = (Tuple) o;
        return Objects.equals(tupleSchema, tuple.tupleSchema) && Objects.equals(fields, tuple.fields) && Objects.equals(recordId, tuple.recordId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(tupleSchema, fields, recordId);
    }
}

Exercise 2

Implement the Catalog.

The Catalog manages all table records.
		     How does it manage them?
		     It maintains a hash table whose key is the table id and whose value is the corresponding table:
		      ConcurrentHashMap<Integer,Table> hashTable;
The Table class here also has to be implemented by us (and note: each table occupies exactly one file).
			Table attributes:
						the backing file:  DbFile dbFile;
						the table name:   String name;
						the name of the table's primary key:  String pkeyField;
Note that the file id is derived from the file's path on disk, and the file id is used as the table id.

Another method worth noting looks up a table's id by its name.

To sum up: we can obtain the Catalog through Database's getCatalog method;
through the Catalog we can add tables with addTable,
and, given a table id, get the table's name, its primary key name, and its backing file (on which we can read pages, write pages, and insert or delete tuples).
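The name-to-id lookup just mentioned can be sketched as follows. `CatalogSketch` and its `tables` map are illustrative stand-ins (the real Catalog stores full Table records and derives ids from file paths); only the linear-scan shape mirrors `Catalog.getTableId`:

```java
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentHashMap;

public class CatalogSketch {
    // tableId -> table name; a stripped-down stand-in for Catalog's hashTable
    static final ConcurrentHashMap<Integer, String> tables = new ConcurrentHashMap<>();

    // Mirrors the shape of Catalog.getTableId: scan entries for the first matching name
    static int getTableId(String name) {
        for (Map.Entry<Integer, String> entry : tables.entrySet()) {
            if (entry.getValue().equals(name)) {
                return entry.getKey();
            }
        }
        throw new NoSuchElementException("not found id for table " + name);
    }

    public static void main(String[] args) {
        // In simpledb the id comes from hashing the file's absolute path;
        // 42 here is an arbitrary illustrative number.
        tables.put(42, "students");
        System.out.println(getTableId("students")); // 42
    }
}
```

Since names map to ids only by scanning, lookups by name are O(n) in the number of tables, while lookups by id stay O(1); that trade-off is fine for a catalog of modest size.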

package simpledb.common;

import simpledb.common.Type;
import simpledb.storage.DbFile;
import simpledb.storage.HeapFile;
import simpledb.storage.TupleDesc;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;

/**
 * The Catalog keeps track of all available tables in the database and their
 * associated schemas.
 * For now, this is a stub catalog that must be populated with tables by a
 * user program before it can be used -- eventually, this should be converted
 * to a catalog that reads a catalog table from disk.
 * @Threadsafe
 */

/**
 * Stores all table records
 */
public class Catalog {

    /**
     * Why make the nested class static? As the user 昭言 put it: a static nested
     * class merely borrows the outer class's shell; there is no strong dependency
     * between the two, and it can be instantiated without an outer instance.
     * So when designing a nested class we can weigh it up: if the nested class is
     * not tightly coupled to the outer class and does not need access to all of its
     * fields and methods, make it static. Since a static nested class and its outer
     * class hold no references to each other, this even saves a little memory.
     *
     * If those benefits do not fit your needs, use a (non-static) inner class,
     * whose defining trait is full access to all fields and methods of the outer class.
     */
    /**
     * A table record
     */
    private static class Table{
        /**
         * The backing file
         */
        DbFile dbFile;
        /**
         * The table name
         */
        String name;
        /**
         * The name of the table's primary key
         */
        String pkeyField;

        public Table(DbFile dbFile,String name,String pkeyField){
            this.dbFile=dbFile;
            this.name=name;
            this.pkeyField=pkeyField;
        }

        @Override
        public String toString(){
            StringBuilder stringBuilder = new StringBuilder();
            stringBuilder.append("DbFile: ").append(dbFile)
                    .append("Name: ").append(name).
                    append("PkeyField: ").append(pkeyField);
            return stringBuilder.toString();
        }
    }

    /**
     * A hash table managing all table files:
     * tableId --> the corresponding Table,
     * i.e. the mapping from table ids to table records.
     *
     * Note: the table id is the same as the file id,
     * which is generated from the file's absolute path; one table maps to one file.
     */
    ConcurrentHashMap<Integer,Table> hashTable;
    /**
     * Constructor.
     * Creates a new, empty catalog.
     */
    public Catalog() {
        // some code goes here
        hashTable=new ConcurrentHashMap<>();
    }

    /**
     * Add a new table to the catalog.
     * This table's contents are stored in the specified DbFile.
     * @param file the contents of the table to add;  file.getId() is the identfier of
     *    this file/tupledesc param for the calls getTupleDesc and getFile
     * @param name the name of the table -- may be an empty string.  May not be null.  If a name
     * conflict exists, use the last table to be added as the table for a given name.
     * @param pkeyField the name of the primary key field
     */
    public void addTable(DbFile file, String name, String pkeyField) {
        // some code goes here
        hashTable.put(file.getId(),new Table(file,name,pkeyField));
    }

    public void addTable(DbFile file, String name) {
        addTable(file, name, "");
    }

    /**
     * Add a new table to the catalog.
     * This table has tuples formatted using the specified TupleDesc and its
     * contents are stored in the specified DbFile.
     * @param file the contents of the table to add;  file.getId() is the identfier of
     *    this file/tupledesc param for the calls getTupleDesc and getFile
     */
    public void addTable(DbFile file) {
        addTable(file, (UUID.randomUUID()).toString());
    }

    /**
     * Return the id of the table with a specified name,
     * @throws NoSuchElementException if the table doesn't exist
     */
    public int getTableId(String name) throws NoSuchElementException {
        // some code goes here
        // linear scan over the entries
        Integer res=null;
        for(Integer key:hashTable.keySet()){
            if(hashTable.get(key).name.equals(name)){
                res=key;
                break;
            }
        }
        if(res != null){
            return res;
        }
        throw new NoSuchElementException("not found id for table " + name);
    }

    /**
     * Returns the tuple descriptor (schema) of the specified table
     * @param tableid The id of the table, as specified by the DbFile.getId()
     *     function passed to addTable
     * @throws NoSuchElementException if the table doesn't exist
     */
    public TupleDesc getTupleDesc(int tableid) throws NoSuchElementException {
        // some code goes here
        Table table = hashTable.getOrDefault(tableid, null);
        if(table!=null){
            return table.dbFile.getTupleDesc();
        }
        throw new NoSuchElementException("not found TupleDesc for table:"+tableid);
    }

    /**
     * Returns the DbFile that can be used to read the contents of the
     * specified table.
     * @param tableid The id of the table, as specified by the DbFile.getId()
     *     function passed to addTable
     */
    public DbFile getDatabaseFile(int tableid) throws NoSuchElementException {
        // some code goes here
        Table table = hashTable.getOrDefault(tableid, null);
        if(table!=null){
            return table.dbFile;
        }
        throw new NoSuchElementException("not found DatabaseFile for table:"+tableid);
    }

    public String getPrimaryKey(int tableid) {
        // some code goes here
        Table table = hashTable.getOrDefault(tableid, null);
        if(table!=null){
            return table.pkeyField;
        }
        throw new NoSuchElementException("not found PrimaryKey for table:"+tableid);
    }

    public Iterator<Integer> tableIdIterator() {
        // some code goes here
        return hashTable.keySet().iterator();
    }

    public String getTableName(int id) {
        // some code goes here
        Table table = hashTable.getOrDefault(id, null);
        if(table!=null){
            return table.name;
        }
        throw new NoSuchElementException("not found name for table:"+id);
    }
    
    /** Delete all tables from the catalog */
    public void clear() {
        // some code goes here
        hashTable.clear();
    }
    
    /**
     * Reads the schema from a file and creates the appropriate tables in the database.
     * @param catalogFile
     */
    public void loadSchema(String catalogFile) {
        String line = "";
        String baseFolder=new File(new File(catalogFile).getAbsolutePath()).getParent();
        try {
            BufferedReader br = new BufferedReader(new FileReader(catalogFile));
            
            while ((line = br.readLine()) != null) {
                //assume line is of the format name (field type, field type, ...)
                String name = line.substring(0, line.indexOf("(")).trim();
                //System.out.println("TABLE NAME: " + name);
                String fields = line.substring(line.indexOf("(") + 1, line.indexOf(")")).trim();
                String[] els = fields.split(",");
                ArrayList<String> names = new ArrayList<>();
                ArrayList<Type> types = new ArrayList<>();
                String primaryKey = "";
                for (String e : els) {
                    String[] els2 = e.trim().split(" ");
                    names.add(els2[0].trim());
                    if (els2[1].trim().equalsIgnoreCase("int"))
                        types.add(Type.INT_TYPE);
                    else if (els2[1].trim().equalsIgnoreCase("string"))
                        types.add(Type.STRING_TYPE);
                    else {
                        System.out.println("Unknown type " + els2[1]);
                        System.exit(0);
                    }
                    if (els2.length == 3) {
                        if (els2[2].trim().equals("pk"))
                            primaryKey = els2[0].trim();
                        else {
                            System.out.println("Unknown annotation " + els2[2]);
                            System.exit(0);
                        }
                    }
                }
                Type[] typeAr = types.toArray(new Type[0]);
                String[] namesAr = names.toArray(new String[0]);
                TupleDesc t = new TupleDesc(typeAr, namesAr);
                HeapFile tabHf = new HeapFile(new File(baseFolder+"/"+name + ".dat"), t);
                addTable(tabHf,name,primaryKey);
                System.out.println("Added table : " + name + " with schema " + t);
            }
        } catch (IOException e) {
            e.printStackTrace();
            System.exit(0);
        } catch (IndexOutOfBoundsException e) {
            System.out.println ("Invalid catalog entry : " + line);
            System.exit(0);
        }
    }
}


Test:
CatalogTest

Exercise 3

BufferPool, the buffer pool

Why have a buffer pool?
	With the Catalog above we can manage all the tables.
	Now, if we want all the tuples of some table, we can look up its backing file by its table id,
	then read one page at a time through DbFile's readPage method and search within each page.

	Here we can build a pageStore to cache a bounded number of pages read from the disk files,
	so that in many cases a page can be returned directly, without any disk I/O,
	very much like an operating system's paging mechanism.

In other words, the buffer pool maintains a collection holding a fixed number of pages,
used to reduce disk I/O by cutting down on file reads.

BufferPool's fields are as follows:
    /** Bytes per page, including header. */
    /**
     * Each page is 4096 bytes
     */
    private static final int DEFAULT_PAGE_SIZE = 4096;
    private static int pageSize = DEFAULT_PAGE_SIZE;
    
    /** Default number of pages passed to the constructor. This is used by
    other classes. BufferPool should use the numPages argument to the
    constructor instead. */
    /**
     * Default maximum number of cached pages
     */
    public static final int DEFAULT_PAGES = 50;

    /**
     * final on a field means it cannot be reassigned. Note, however, that if it is
     * not initialized at declaration, it may be assigned once in the constructor,
     * but never in any other method.
     */
    /**
     * Maximum number of pages this buffer pool may cache
     */
    private final int pageNums;

    /**
     * Maps the hashCode of a page id to the page held in the buffer:
     * the cached pages.
     */
    // Calls on pageStore (its methods) are still allowed; final only forbids re-pointing
    // the variable at another instance, i.e. no further pageStore = new ConcurrentHashMap<PageId,Page>();
    private final ConcurrentHashMap<Integer, Page> pageStore;
What should the BufferPool be able to do?
	Fetch a page by its page id.
	Discard a page by its page id.
	Insert or delete tuples on a page, then mark the page dirty.
	When the buffer is full, evict a page according to some policy, being careful to write it to disk before discarding it.
	Flush a given dirty page to disk.
	Flush all pages to disk.
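The eviction requirement in the last points is what the LRUCache used in the refactored code below provides. The lab's hand-rolled LRUCache (hash map plus doubly linked list) is not shown in this post; as a rough sketch of the same idea, Java's LinkedHashMap in access-order mode gives LRU behavior almost for free. `LruSketch` is an illustrative name, and unlike the real BufferPool this sketch does not flush dirty pages before evicting:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A minimal LRU cache in the spirit of the LRUCache used by BufferPool.
public class LruSketch<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruSketch(int capacity) {
        super(16, 0.75f, true); // true = access order, so get() refreshes an entry
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least recently used entry once we exceed capacity.
        // The real BufferPool must also write a dirty page to disk before evicting it.
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruSketch<Integer, String> cache = new LruSketch<>(2);
        cache.put(1, "page1");
        cache.put(2, "page2");
        cache.get(1);           // touch page1 so page2 becomes the eviction victim
        cache.put(3, "page3");  // exceeds capacity 2 and evicts page2
        System.out.println(cache.keySet()); // [1, 3]
    }
}
```

LinkedHashMap's `removeEldestEntry` hook is invoked after every `put`, which is exactly the "evict on overflow" policy the buffer pool needs.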

Full code:

package simpledb.storage;

import simpledb.common.Database;
import simpledb.common.Permissions;
import simpledb.common.DbException;
import simpledb.transaction.TransactionAbortedException;
import simpledb.transaction.TransactionId;

import java.io.*;

import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/**
 * BufferPool manages the reading and writing of pages into memory from
 * disk. Access methods call into it to retrieve pages, and it fetches
 * pages from the appropriate location.
 * <p>
 * The BufferPool is also responsible for locking;  when a transaction fetches
 * a page, BufferPool checks that the transaction has the appropriate
 * locks to read/write the page.
 * 
 * @Threadsafe, all fields are final
 *
 * Persisted pages are read from disk into memory here, serving as a fast-access buffer
 */
public class BufferPool {
    /** Bytes per page, including header. */
    /**
     * Each page is 4096 bytes
     */
    private static final int DEFAULT_PAGE_SIZE = 4096;
    private static int pageSize = DEFAULT_PAGE_SIZE;
    
    /** Default number of pages passed to the constructor. This is used by
    other classes. BufferPool should use the numPages argument to the
    constructor instead. */
    /**
     * Default maximum number of cached pages
     */
    public static final int DEFAULT_PAGES = 50;

    /**
     * final on a field means it cannot be reassigned. Note, however, that if it is
     * not initialized at declaration, it may be assigned once in the constructor,
     * but never in any other method.
     */
    /**
     * Maximum number of pages this buffer pool may cache
     */
    private final int pageNums;

    /**
     * Maps the hashCode of a page id to the page held in the buffer:
     * the cached pages.
     */
    // Calls on pageStore (its methods) are still allowed; final only forbids re-pointing
    // the variable at another instance.
    //private final ConcurrentHashMap<Integer, Page> pageStore;
    // refactored to an LRU cache
    private final LRUCache<Integer, Page> pageStore;
    /**
     * Creates a BufferPool that caches up to numPages pages.
     *
     * @param numPages maximum number of pages in this buffer pool.
     */
    public BufferPool(int numPages) {
        // some code goes here
        this.pageNums=numPages;
        this.pageStore=new LRUCache<>(numPages);
    }
    
    public static int getPageSize() {
      return pageSize;
    }
    
    // THIS FUNCTION SHOULD ONLY BE USED FOR TESTING!!
    public static void setPageSize(int pageSize) {
    	BufferPool.pageSize = pageSize;
    }
    
    // THIS FUNCTION SHOULD ONLY BE USED FOR TESTING!!
    public static void resetPageSize() {
    	BufferPool.pageSize = DEFAULT_PAGE_SIZE;
    }

    /**
     * Retrieve the specified page with the associated permissions.
     * Will acquire a lock and may block if that lock is held by another
     * transaction.
     * <p>
     * The retrieved page should be looked up in the buffer pool.  If it
     * is present, it should be returned.  If it is not present, it should
     * be added to the buffer pool and returned.  If there is insufficient
     * space in the buffer pool, a page should be evicted and the new page
     * should be added in its place.
     *
     * @param tid the ID of the transaction requesting the page
     * @param pid the ID of the requested page
     * @param perm the requested permissions on the page
     */
    public  Page getPage(TransactionId tid, PageId pid, Permissions perm)
        throws TransactionAbortedException, DbException {
        // some code goes here
//        // check whether the buffer pool already holds the page
//        Page page = pageStore.getOrDefault(pid.hashCode(), null);
//
//        if(page==null){// if not, read it from disk
//            DbFile dbFile = Database.getCatalog().getDatabaseFile(pid.getTableId());
//            page = dbFile.readPage(pid);
//            // over capacity?
//            if(pageStore.size() >= pageNums){
//                // evict (implemented in a later exercise)
//                throw new DbException("buffer pool is full");
//            }
//            // put into the cache
//            pageStore.put(pid.hashCode(), page);
//        }
//        return page;
        //1. try the LRU cache first
        Page page = pageStore.get(pid.hashCode());
        if(page==null){
            // not cached: read it from disk
            page = Database.getCatalog().getDatabaseFile(pid.getTableId()).readPage(pid);
            // then put it into the LRU cache, which maintains its own size and evicts pages as needed
            pageStore.put(pid.hashCode(),page);
        }
        return page;
    }

    /**
     * Releases the lock on a page.
     * Calling this is very risky, and may result in wrong behavior. Think hard
     * about who needs to call this and why, and why they can run the risk of
     * calling it.
     *
     * @param tid the ID of the transaction requesting the unlock
     * @param pid the ID of the page to unlock
     */
    public  void unsafeReleasePage(TransactionId tid, PageId pid) {
        // some code goes here
        // not necessary for lab1|lab2
    }

    /**
     * Release all locks associated with a given transaction.
     *
     * @param tid the ID of the transaction requesting the unlock
     */
    public void transactionComplete(TransactionId tid) {
        // some code goes here
        // not necessary for lab1|lab2
    }

    /** Return true if the specified transaction has a lock on the specified page */
    public boolean holdsLock(TransactionId tid, PageId p) {
        // some code goes here
        // not necessary for lab1|lab2
        return false;
    }

    /**
     * Commit or abort a given transaction; release all locks associated to
     * the transaction.
     *
     * @param tid the ID of the transaction requesting the unlock
     * @param commit a flag indicating whether we should commit or abort
     */
    public void transactionComplete(TransactionId tid, boolean commit) {
        // some code goes here
        // not necessary for lab1|lab2
    }

    /**
     * Add a tuple to the specified table on behalf of transaction tid.  Will
     * acquire a write lock on the page the tuple is added to and any other 
     * pages that are updated (Lock acquisition is not needed for lab2). 
     * May block if the lock(s) cannot be acquired.
     * 
     * Marks any pages that were dirtied by the operation as dirty by calling
     * their markDirty bit, and adds versions of any pages that have 
     * been dirtied to the cache (replacing any existing versions of those pages) so 
     * that future requests see up-to-date pages. 
     *
     * @param tid the transaction adding the tuple
     * @param tableId the table to add the tuple to
     * @param t the tuple to add
     */
    public void insertTuple(TransactionId tid, int tableId, Tuple t)
        throws DbException, IOException, TransactionAbortedException {
        // some code goes here
        // not necessary for lab1
        //1. Delegate to DbFile: if the page is already in the LRU cache, the tuple is inserted
        //   on that page and the dirtied page is returned; otherwise a new page is written,
        //   which must then be added to the cache
        DbFile dbFile = Database.getCatalog().getDatabaseFile(tableId);
        List<Page> pages = dbFile.insertTuple(tid, t);
        //2. Every returned page, dirtied or newly created, is put into the LRU cache;
        //   put() also moves it to the head of the queue
        for(Page page:pages){
            //a brand-new page was written to disk before the tuple was inserted, so strictly
            //it is not "dirty" yet, but compared with its on-disk image it now is;
            //marking an already-dirty page again is harmless
            page.markDirty(true,tid);
            pageStore.put(page.getId().hashCode(),page);
        }

    }

    /**
     * Remove the specified tuple from the buffer pool.
     * Will acquire a write lock on the page the tuple is removed from and any
     * other pages that are updated. May block if the lock(s) cannot be acquired.
     *
     * Marks any pages that were dirtied by the operation as dirty by calling
     * their markDirty bit, and adds versions of any pages that have 
     * been dirtied to the cache (replacing any existing versions of those pages) so 
     * that future requests see up-to-date pages. 
     *
     * @param tid the transaction deleting the tuple.
     * @param t the tuple to delete
     */
    public  void deleteTuple(TransactionId tid, Tuple t)
        throws DbException, IOException, TransactionAbortedException {
        // some code goes here
        // not necessary for lab1

        //1. Delegate to DbFile, which gets the page from the buffer pool (the LRU cache
        //   pageStore reads it from disk and caches it first if needed), deletes the
        //   tuple, and returns the dirtied pages
        DbFile dbFile = Database.getCatalog().getDatabaseFile(t.getRecordId().getPageId().getTableId());
        List<Page> pages = dbFile.deleteTuple(tid, t);
        //2. The cached pages are the same objects that were just modified (references),
        //   so marking them dirty and re-putting them is partly redundant, but it keeps
        //   the cache state explicit and refreshes the LRU order
        for(Page page:pages){
            page.markDirty(true,tid);
            pageStore.put(page.getId().hashCode(),page);
        }


    }

    /**
     * Flush all dirty pages to disk.
     * NB: Be careful using this routine -- it writes dirty data to disk so will
     *     break simpledb if running in NO STEAL mode.
     */
    public synchronized void flushAllPages() throws IOException {
        // some code goes here
        // not necessary for lab1
        for(Page page:pageStore.getAllV()){
            PageId id = page.getId();
            flushPage(id);//flushPage itself skips pages that are not dirty
        }
    }

    /** Remove the specific page id from the buffer pool.
        Needed by the recovery manager to ensure that the
        buffer pool doesn't keep a rolled back page in its
        cache.
        
        Also used by B+ tree files to ensure that deleted pages
        are removed from the cache so they can be reused safely
    */
    public synchronized void discardPage(PageId pid) {
        // some code goes here
        // not necessary for lab1
       pageStore.removeK(pid.hashCode());
    }

    /**
     * Flushes a certain page to disk
     * @param pid an ID indicating the page to flush
     */
    private synchronized  void flushPage(PageId pid) throws IOException {
        // some code goes here
        // not necessary for lab1
        Page page = pageStore.get(pid.hashCode());
        //nothing to flush if the page is not cached or is not dirty
        if(page==null||page.isDirty()==null){
            return;
        }
        //otherwise write the page back to the disk file
        Database.getCatalog().getDatabaseFile(pid.getTableId()).writePage(page);
        //clear the dirty flag and the recorded transaction id
        page.markDirty(false,null);
    }

    /** Write all pages of the specified transaction to disk.
     */
    public synchronized  void flushPages(TransactionId tid) throws IOException {
        // some code goes here
        // not necessary for lab1|lab2

    }

    /**
     * Discards a page from the buffer pool.
     * Flushes the page to disk to ensure dirty pages are updated on disk.
     */
    private synchronized  void evictPage() throws DbException {
        // some code goes here
        // not necessary for lab1

        //1. Take the least recently used page from the tail of the LRU cache,
        //   flush it to disk if it is dirty, then drop it from the cache
        Page tailV = pageStore.getTailV();
        if(tailV==null){
            throw new DbException("no page to evict");
        }
        try {
            flushPage(tailV.getId());
        } catch (IOException e) {
            throw new DbException("failed to flush page during eviction");
        }
        discardPage(tailV.getId());

    }

}

package simpledb.storage;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Least-recently-used (LRU) cache
 *
 * implemented with a hash map + doubly linked list
 */
public class LRUCache<K,V> {

    private class DLinkNode{
        private K key;
        private V value;
        private DLinkNode pre;
        private DLinkNode next;

        public DLinkNode() {
        }

        public DLinkNode(K key, V value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            DLinkNode dLinkNode = (DLinkNode) o;
            return Objects.equals(key, dLinkNode.key) && Objects.equals(value, dLinkNode.value);
        }

        @Override
        public int hashCode() {
            return Objects.hash(key, value);
        }
    }

    //hash map from key to its node in the doubly linked list
    private Map<K,DLinkNode> cache=new ConcurrentHashMap<K, DLinkNode>();
    private int size;
    private int capacity;
    private DLinkNode head;
    private DLinkNode tail;


    public LRUCache(int capacity) {
        this.capacity = capacity;
        size=0;
        head=new DLinkNode(null,null);
        tail=new DLinkNode(null,null);
        head.next=tail;
        tail.pre=head;
    }

    public int getSize() {
        return size;
    }

    public DLinkNode getHead() {
        return head;
    }

    public DLinkNode getTail() {
        return tail;
    }

    public Map<K, DLinkNode> getCache() {
        return cache;
    }



    //returns the value mapped to key, or null if absent
    //must be synchronized: without the lock, concurrent pointer updates can
    //corrupt the linked list (even forming a cycle that never terminates)
    public synchronized V get(K key){
        DLinkNode dLinkNode = cache.get(key);
        if(dLinkNode==null){
            return null;
        }
        //on a hit, move the node to the head of the queue first
        moveToHead(dLinkNode);
        return dLinkNode.value;
    }
    //moves the just-accessed node to the head of the queue
    private void moveToHead(DLinkNode node) {
        //unlink the node from its current position
        node.next.pre = node.pre;
        node.pre.next = node.next;


        //relink it right after the head sentinel
        node.next = head.next;
        head.next.pre = node;
        node.pre = head;
        head.next = node;
    }
    //unlinks a node and removes its cache entry; internal use only
    private void remove(K key,DLinkNode node){
        node.next.pre = node.pre;
        node.pre.next = node.next;

        cache.remove(key);
    }
    //public removal by key; a no-op if the key is absent
    public synchronized void removeK(K key){
        DLinkNode dLinkNode = cache.get(key);
        if(dLinkNode==null){
            return;
        }
        remove(key,dLinkNode);//does not reorder the remaining entries
        size--;
    }

    public synchronized void put(K key,V value){
        //on a hit, just update the value and move the node to the head
        DLinkNode node = cache.get(key);
        if(node!=null){
            node.value = value;
            moveToHead(node);
            return;
        }
        //otherwise insert a new node
        DLinkNode dLinkNode = new DLinkNode(key, value);
        //new entries always go to the head of the queue
        addToHead(dLinkNode);
        size++;
        //remember to register it in the hash map as well
        cache.put(key,dLinkNode);
        //evict once the capacity is exceeded
        if(size>capacity){
            //drop the tail, i.e. the least recently used entry
            removeTail();
        }
    }
    private void addToHead(DLinkNode node) {
        node.pre = head;
        node.next = head.next;
        head.next.pre = node;
        head.next = node;
    }
    //removes the last real node, i.e. the least recently used entry (not the tail sentinel itself!)
    private void removeTail() {
        DLinkNode newTail = tail.pre;

        remove(newTail.key,newTail);

        size--;

    }

    //returns all values, ordered from most to least recently used
    public synchronized List<V> getAllV(){
        List<V> vList=new ArrayList<>();
        //walk the doubly linked list from head to tail
        DLinkNode p=head;
        while (p.next!=tail){
            p=p.next;
            vList.add(p.value);
        }
        return vList;
    }
    //returns the least recently used value (the node just before the tail
    //sentinel) without changing the LRU order; null if the cache is empty
    public synchronized V getTailV(){
        return tail.pre.value;
    }



}

Note: I implemented an LRUCache class here and made BufferPool maintain it; strictly speaking this already belongs to lab2.
There is nothing to test for this section yet; readPage will be exercised by the tests in the next section on HeapPage.
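The LRU policy implemented above (hash map + doubly linked list, most recent at the head, evict from the tail) behaves like Java's built-in LinkedHashMap in access order. A minimal sketch of the same eviction behavior, useful as a sanity check; the class name and capacity of 2 are made up for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruDemo {
    // Returns the keys left in a capacity-2 access-ordered cache after a
    // put/get sequence, least recently used first.
    static String simulate() {
        Map<Integer, String> cache = new LinkedHashMap<Integer, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                return size() > 2; // evict once capacity (2) is exceeded
            }
        };
        cache.put(1, "a");
        cache.put(2, "b");
        cache.get(1);      // touch 1, so 2 becomes the least recently used
        cache.put(3, "c"); // evicts 2
        return cache.keySet().toString();
    }

    public static void main(String[] args) {
        System.out.println(simulate()); // [1, 3]
    }
}
```

The hand-rolled version in this lab does the same thing, but it gives us hooks (getTailV, removeK) that BufferPool's eviction logic needs.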

Exercise 4

HeapPage (the page itself), RecordId (a tuple's location), HeapPageId (the page id)

As mentioned earlier:
	In Tuple we also record where each tuple (each row of data) lives: on which page, and in which row of that page.
	This is encapsulated in RecordId, which holds two fields:
								the page id    PageId pid;
								and the row number    int tupleno;
	 When we implement the PageId interface, e.g. HeapPageId, we likewise hold two fields:
	 							the table id                       int tableId;
	 							the page number within the table   int pgNo;
	 							
     Together these locate a row as: which table, which page, which row.

RecordId

package simpledb.storage;

import java.io.Serializable;

/**
 * A RecordId is a reference to a specific tuple on a specific page of a
 * specific table.
 * In other words: this row is tuple number tupleno on the page identified by pid, and that page belongs to some table's DbFile.
 */
public class RecordId implements Serializable {

    private static final long serialVersionUID = 1L;

    /**
     * the page id
     */
    PageId pid;

    /**
     * the row number within the page
     */
    int tupleno;

    /**
     * Creates a new RecordId referring to the specified PageId and tuple
     * number.
     * 
     * @param pid
     *            the pageid of the page on which the tuple resides
     * @param tupleno
     *            the tuple number within the page.
     */
    public RecordId(PageId pid, int tupleno) {
        // some code goes here
        this.pid=pid;
        this.tupleno=tupleno;
    }

    /**
     * @return the tuple number this RecordId references.
     */
    public int getTupleNumber() {
        // some code goes here
        return tupleno;
    }

    /**
     * @return the page id this RecordId references.
     */
    public PageId getPageId() {
        // some code goes here
        return pid;
    }

    /**
     * Two RecordId objects are considered equal if they represent the same
     * tuple.
     * 
     * @return True if this and o represent the same tuple
     */
    @Override
    public boolean equals(Object o) {
        // some code goes here
        if(!(o instanceof RecordId)){
            return false;
        }
        RecordId other = (RecordId) o;
        if(other.pid.getTableId()!=this.pid.getTableId()||other.pid.getPageNumber()!=pid.getPageNumber()||other.tupleno!=this.tupleno){
            return false;
        }
        return true;
    }

    /**
     * You should implement the hashCode() so that two equal RecordId instances
     * (with respect to equals()) have the same hashCode().
     * 
     * @return An int that is the same for equal RecordId objects.
     */
    @Override
    public int hashCode() {
        // some code goes here
        // mix the fields numerically; string concatenation ("1"+"23" vs
        // "12"+"3") can produce the same hash for different tuples
        int hash = 31 * pid.getTableId() + pid.getPageNumber();
        return 31 * hash + tupleno;
    }

}

HeapPageId

package simpledb.storage;

/** Unique identifier for HeapPage objects. */
public class HeapPageId implements PageId {

    /**
     * id of the table this page belongs to
     */
    private int tableId;
    /**
     * page number (position) of this page
     * within the table
     */
    private int pgNo;
    /**
     * Constructor. Create a page id structure for a specific page of a
     * specific table.
     *
     * @param tableId The table that is being referenced
     * @param pgNo The page number in that table.
     */
    public HeapPageId(int tableId, int pgNo) {
        // some code goes here
        this.tableId=tableId;
        this.pgNo=pgNo;
    }

    /** @return the table associated with this PageId */
    public int getTableId() {
        // some code goes here
        return tableId;
    }

    /**
     * @return the page number in the table getTableId() associated with
     *   this PageId
     */
    public int getPageNumber() {
        // some code goes here
        return pgNo;
    }

    /**
     * @return a hash code for this page, represented by a combination of
     *   the table number and the page number (needed if a PageId is used as a
     *   key in a hash table in the BufferPool, for example.)
     * @see BufferPool
     */
    public int hashCode() {
        // some code goes here
        // mix table id and page number numerically; string concatenation
        // ("1"+"23" vs "12"+"3") can collide for different pages
        return 31 * tableId + pgNo;
    }

    /**
     * Compares one PageId to another.
     *
     * @param o The object to compare against (must be a PageId)
     * @return true if the objects are equal (e.g., page numbers and table
     *   ids are the same)
     */
    public boolean equals(Object o) {
        // some code goes here
        if(o instanceof HeapPageId){
            HeapPageId other=(HeapPageId) o;
            if(other.pgNo==pgNo && other.tableId==tableId){
                return true;
            }
        }
        return false;
    }

    /**
     *  Return a representation of this object as an array of
     *  integers, for writing to disk.  Size of returned array must contain
     *  number of integers that corresponds to number of args to one of the
     *  constructors.
     */
    public int[] serialize() {
        int[] data = new int[2];

        data[0] = getTableId();
        data[1] = getPageNumber();

        return data;
    }

}

Now let's look at HeapPage in more detail.

 A HeapPage stores a fixed number of tuples and is the smallest unit read from or written to the disk file.
 So a HeapPage needs the following fields:
 			the page id (from which we get the table id and page number)
 			the array of tuples
 			the TupleDesc (the schema)
 			a dirty flag
 			the id of the dirtying transaction
And additionally:
			a bitmap is used to mark which rows (slots) of the page are occupied:
			 final byte[] header;//header data, the bitmap
			final int numSlots;//number of slots, i.e. number of rows

		With this layout, tuples per page = (page size in bits) / (tuple size in bits + 1)
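The formula above can be checked with concrete numbers. Assuming, purely for illustration, a 4096-byte page and an 8-byte tuple (the real values come from BufferPool.getPageSize() and TupleDesc.getSize()):

```java
public class PageLayout {
    // tuples per page = floor(page bits / (tuple bits + 1)); the "+1" is the
    // extra bitmap bit each tuple costs in the header
    static int numTuples(int pageBytes, int tupleBytes) {
        return (pageBytes * 8) / (tupleBytes * 8 + 1);
    }

    // header bytes = ceil(numTuples / 8), one bit per slot
    static int headerBytes(int pageBytes, int tupleBytes) {
        return (numTuples(pageBytes, tupleBytes) + 7) / 8;
    }

    public static void main(String[] args) {
        System.out.println(numTuples(4096, 8));   // 504
        System.out.println(headerBytes(4096, 8)); // 63
    }
}
```

So with these numbers: 4096*8 = 32768 bits, each tuple costs 8*8+1 = 65 bits, giving 504 slots and a 63-byte header, matching getNumTuples() and getHeaderSize() below.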
package simpledb.storage;

import simpledb.common.Database;
import simpledb.common.DbException;
import simpledb.common.Debug;
import simpledb.common.Catalog;
import simpledb.transaction.TransactionId;

import java.util.*;
import java.io.*;

/**
* Each instance of HeapPage stores data for one page of HeapFiles and 
* implements the Page interface that is used by BufferPool.
*
* @see HeapFile
* @see BufferPool
*
*/
public class HeapPage implements Page {


  final HeapPageId pid;//page id

  final TupleDesc td;//schema of the table

  final byte[] header;//header data, the slot-usage bitmap
  final int numSlots;//number of slots, i.e. number of rows

  final Tuple[] tuples;//tuple data

  byte[] oldData;
  private final Byte oldDataLock= (byte) 0;

  private TransactionId tid;//id of the transaction that last dirtied this page; on a conflict the dirty page is written out before further modification
  private boolean dirty;//whether this page is dirty

  /**
   * Create a HeapPage from a set of bytes of data read from disk.
   * The format of a HeapPage is a set of header bytes indicating
   * the slots of the page that are in use, some number of tuple slots.
   *  Specifically, the number of tuples is equal to: <p>
   *          floor((BufferPool.getPageSize()*8) / (tuple size * 8 + 1))
   * <p> where tuple size is the size of tuples in this
   * database table, which can be determined via {@link Catalog#getTupleDesc}.
   * The number of 8-bit header words is equal to:
   * <p>
   *      ceiling(no. tuple slots / 8)
   * <p>
   * @see Database#getCatalog
   * @see Catalog#getTupleDesc
   * @see BufferPool#getPageSize()
   */
  public HeapPage(HeapPageId id, byte[] data) throws IOException {
      this.pid = id;
      this.td = Database.getCatalog().getTupleDesc(id.getTableId());
      this.numSlots = getNumTuples();
      DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data));

      // allocate and read the header slots of this page
      header = new byte[getHeaderSize()];
      for (int i=0; i<header.length; i++)
          header[i] = dis.readByte();
      
      tuples = new Tuple[numSlots];
      try{
          // allocate and read the actual records of this page
          for (int i=0; i<tuples.length; i++)
              tuples[i] = readNextTuple(dis,i);
      }catch(NoSuchElementException e){
          e.printStackTrace();
      }
      dis.close();

      setBeforeImage();
  }

  //returns how many tuples fit on one page
  /** Retrieve the number of tuples on this page.
      @return the number of tuples on this page
  */
  private int getNumTuples() {        
      // some code goes here
      //(total bytes * 8) / (bytes per tuple * 8 + 1), rounded down
      return (int) Math.floor( BufferPool.getPageSize()*8* 1.0 / ((td.getSize()*8)+1));

  }

  /**
   * Computes the number of bytes in the header of a page in a HeapFile with each tuple occupying tupleSize bytes
   * @return the number of bytes in the header of a page in a HeapFile with each tuple occupying tupleSize bytes
   */
  //returns the header length in bytes
  private int getHeaderSize() {
      
      // some code goes here
      return (int) Math.ceil(getNumTuples()*1.0/8);
               
  }
  
  /** Return a view of this page before it was modified
      -- used by recovery */
  public HeapPage getBeforeImage(){
      try {
          byte[] oldDataRef = null;
          synchronized(oldDataLock)
          {
              oldDataRef = oldData;
          }
          return new HeapPage(pid,oldDataRef);
      } catch (IOException e) {
          e.printStackTrace();
          //should never happen -- we parsed it OK before!
          System.exit(1);
      }
      return null;
  }
  
  public void setBeforeImage() {
      synchronized(oldDataLock)
      {
      oldData = getPageData().clone();
      }
  }

  /**
   * @return the PageId associated with this page.
   */
  public HeapPageId getId() {
  // some code goes here
      return pid;
  }

  /**
   * Suck up tuples from the source file.
   */
  private Tuple readNextTuple(DataInputStream dis, int slotId) throws NoSuchElementException {
      // if associated bit is not set, read forward to the next tuple, and
      // return null.
      if (!isSlotUsed(slotId)) {
          for (int i=0; i<td.getSize(); i++) {
              try {
                  dis.readByte();
              } catch (IOException e) {
                  throw new NoSuchElementException("error reading empty tuple");
              }
          }
          return null;
      }

      // read fields in the tuple
      Tuple t = new Tuple(td);
      RecordId rid = new RecordId(pid, slotId);
      t.setRecordId(rid);
      try {
          for (int j=0; j<td.numFields(); j++) {
              Field f = td.getFieldType(j).parse(dis);
              t.setField(j, f);
          }
      } catch (java.text.ParseException e) {
          e.printStackTrace();
          throw new NoSuchElementException("parsing error!");
      }

      return t;
  }

  /**
   * Generates a byte array representing the contents of this page.
   * Used to serialize this page to disk.
   * <p>
   * The invariant here is that it should be possible to pass the byte
   * array generated by getPageData to the HeapPage constructor and
   * have it produce an identical HeapPage object.
   *
   * @see #HeapPage
   * @return A byte array correspond to the bytes of this page.
   */
  public byte[] getPageData() {
      int len = BufferPool.getPageSize();
      ByteArrayOutputStream baos = new ByteArrayOutputStream(len);
      DataOutputStream dos = new DataOutputStream(baos);

      // create the header of the page
      for (byte b : header) {
          try {
              dos.writeByte(b);
          } catch (IOException e) {
              // this really shouldn't happen
              e.printStackTrace();
          }
      }

      // create the tuples
      for (int i=0; i<tuples.length; i++) {

          // empty slot
          if (!isSlotUsed(i)) {
              for (int j=0; j<td.getSize(); j++) {
                  try {
                      dos.writeByte(0);
                  } catch (IOException e) {
                      e.printStackTrace();
                  }

              }
              continue;
          }

          // non-empty slot
          for (int j=0; j<td.numFields(); j++) {
              Field f = tuples[i].getField(j);
              try {
                  f.serialize(dos);
              
              } catch (IOException e) {
                  e.printStackTrace();
              }
          }
      }

      // padding
      int zerolen = BufferPool.getPageSize() - (header.length + td.getSize() * tuples.length); //- numSlots * td.getSize();
      byte[] zeroes = new byte[zerolen];
      try {
          dos.write(zeroes, 0, zerolen);
      } catch (IOException e) {
          e.printStackTrace();
      }

      try {
          dos.flush();
      } catch (IOException e) {
          e.printStackTrace();
      }

      return baos.toByteArray();
  }

  /**
   * Static method to generate a byte array corresponding to an empty
   * HeapPage.
   * Used to add new, empty pages to the file. Passing the results of
   * this method to the HeapPage constructor will create a HeapPage with
   * no valid tuples in it.
   *
   * @return The returned ByteArray.
   */
  public static byte[] createEmptyPageData() {
      int len = BufferPool.getPageSize();
      return new byte[len]; //all 0
  }

  /**
   * Delete the specified tuple from the page; the corresponding header bit should be updated to reflect
   *   that it is no longer stored on any page.
   * @throws DbException if this tuple is not on this page, or tuple slot is
   *         already empty.
   * @param t The tuple to delete
   */
  public void deleteTuple(Tuple t) throws DbException {
      // some code goes here
      // not necessary for lab1
      RecordId recordId = t.getRecordId();
      if(recordId!=null){
          PageId pageId = recordId.getPageId();
          int tupleno = recordId.getTupleNumber();
          if(pageId.equals(pid)&& tupleno<numSlots && isSlotUsed(tupleno)){
              //scan all slots via the bitmap, find t, set it to null and clear its slot bit
              for (int i = 0; i < numSlots; i++) {
                  if(isSlotUsed(i) && t.equals(tuples[i])){
                      markSlotUsed(i,false);
                      tuples[i]=null;
                      return;
                  }
              }
          }
          throw new DbException("can't find tuple in the page");
      }
      throw new DbException("can't find tuple in the page");
  }

  /**
   * Adds the specified tuple to the page;  the tuple should be updated to reflect
   *  that it is now stored on this page.
   * @throws DbException if the page is full (no empty slots) or tupledesc
   *         is mismatch.
   * @param t The tuple to add.
   */
  public void insertTuple(Tuple t) throws DbException {
      // some code goes here
      // not necessary for lab1
      //1. use the bitmap to check there is room left
      if(getNumEmptySlots()==0)throw new DbException("no empty slots");
      //2. check that the tuple's TupleDesc matches this page's schema
      if(!t.getTupleDesc().equals(this.td))throw new DbException("no match tupleDesc");
      //3. find the first unused slot and insert the tuple there
      for(int i=0;i<numSlots;i++){
          if(!isSlotUsed(i)){
              markSlotUsed(i,true);
              tuples[i]=t;
              //after inserting, set the tuple's RecordId: its location is now known
              tuples[i].setRecordId(new RecordId(pid,i));
              break;
          }
      }
  }

  /**
   * Marks this page as dirty/not dirty and record that transaction
   * that did the dirtying
   */
  public void markDirty(boolean dirty, TransactionId tid) {
      // some code goes here
  // not necessary for lab1
      this.dirty=dirty;
      this.tid=tid;
  }

  /**
   * Returns the tid of the transaction that last dirtied this page, or null if the page is not dirty
   */
  public TransactionId isDirty() {
      // some code goes here
  // Not necessary for lab1
      if(dirty){
          return tid;
      }
      return null;
  }

  /**
   * Returns the number of empty slots on this page.
   */
  public int getNumEmptySlots() {
      // some code goes here
      int i=0,sum=0;
      for(;i<numSlots;i++){
          int slot=i/8;
          int move=i%8;
          sum+=((header[slot]>>move) & 1)==0? 1 : 0;//bits within each header byte are used from the lowest (rightmost) bit up
      }
      return sum;
  }

  /**
   * Returns true if associated slot on this page is filled.
   */
  //is slot i in use?
  public boolean isSlotUsed(int i) {
      // some code goes here
      //byte index within the header
      int slot=i/8;
      //bit offset within that byte
      int move=i%8;

      return ((header[slot]>>move) & 1) ==1;//bits within each header byte are used from the lowest (rightmost) bit up
  }

  /**
   * Abstraction to fill or clear a slot on this page.
   */
  private void markSlotUsed(int i, boolean value) {
      // some code goes here
      // not necessary for lab1
      // byte index within the header
      int slot = i / 8;
      // bit offset within that byte
      int move = i % 8;
      // mask for the target bit; bits are used from the lowest (rightmost) bit up
      byte mask = (byte) (1 << move);
      // update the slot bit
      if(value){
          // mark as used: set the bit to 1
          header[slot] |= mask;
      }else{
          // mark as unused: ~mask is 1 everywhere except this bit,
          // so the AND clears only the target bit
          header[slot] &= ~mask;
      }
  }

  /**
   * @return an iterator over all tuples on this page (calling remove on this iterator throws an UnsupportedOperationException)
   * (note that this iterator shouldn't return tuples in empty slots!)
   */
  public Iterator<Tuple> iterator() {
      // some code goes here
      // collect the tuples stored in used slots
      ArrayList<Tuple> res = new ArrayList<>();
      for (int i = 0; i < numSlots; i++) {
          if(isSlotUsed(i)){
              res.add(tuples[i]);
          }
      }
      return res.iterator();
  }

}
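The bit manipulation in isSlotUsed / markSlotUsed is easy to get wrong, so it is worth checking in isolation. This small sketch (hypothetical class name, standalone) mirrors the same low-to-high bit layout within each header byte:

```java
public class SlotBits {
    // Mirrors isSlotUsed: slot i lives at bit (i % 8) of header byte (i / 8),
    // counting bits from the lowest (rightmost) bit.
    static boolean isUsed(byte[] header, int i) {
        return ((header[i / 8] >> (i % 8)) & 1) == 1;
    }

    // Mirrors markSlotUsed: set or clear that bit with a mask.
    static void mark(byte[] header, int i, boolean used) {
        byte mask = (byte) (1 << (i % 8));
        if (used) header[i / 8] |= mask;
        else      header[i / 8] &= ~mask;
    }

    public static void main(String[] args) {
        byte[] header = new byte[2];    // room for 16 slots
        mark(header, 0, true);
        mark(header, 9, true);
        System.out.println(isUsed(header, 0)); // true
        System.out.println(isUsed(header, 1)); // false
        System.out.println(header[1]);         // 2 (bit 1 of the second byte)
        mark(header, 0, false);
        System.out.println(isUsed(header, 0)); // false
    }
}
```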

Tests:
HeapPageIdTest 、HeapPageReadTest、RecordIdTest

Exercise 5

HeapFile: the file that stores a table

HeapFile has two fields:
		a File that stores the table's data
		a TupleDesc that describes the table's schema
numPages returns how many pages the file contains,
and readPage / writePage read and write individual pages.


Note: inserting and deleting tuples goes through insertTuple / deleteTuple,
and these should still be driven via the BufferPool.

There is also a HeapFileIterator implemented here;
it walks the file starting from a given page and extracts its tuples one by one.
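numPages itself is not shown in this excerpt; it is typically derived from the file length. A standalone sketch of that arithmetic (the method lives on HeapFile in the real code and uses BufferPool.getPageSize(); the numbers here are for illustration only):

```java
public class NumPagesSketch {
    // number of whole pages in a file = file length / page size (integer division)
    static int numPages(long fileLength, int pageSize) {
        return (int) (fileLength / pageSize);
    }

    public static void main(String[] args) {
        System.out.println(numPages(8192, 4096)); // 2
        System.out.println(numPages(4097, 4096)); // 1 (a trailing partial page is not counted)
    }
}
```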

    /**
     * A HeapFileIterator implementation,
     * used to iterate over the tuples of this file row by row
     */
    private static class HeapFileIterator implements DbFileIterator{

        //the heap file
        private final HeapFile heapFile;
        //transaction id
        private final TransactionId tid;
        // iterator over the tuples of the current page
        private Iterator<Tuple> iterator;
        //page number the iterator is currently on
        private int whichPage;

        public HeapFileIterator(HeapFile heapFile,TransactionId tid){
            this.heapFile=heapFile;
            this.tid=tid;
        }

        // returns a tuple iterator for the page with the given page number
        private Iterator<Tuple> getPageTuple(int pageNumber) throws TransactionAbortedException, DbException {
            //first check that the page number is within the file
            if(pageNumber>=0 && pageNumber<heapFile.numPages()){
                HeapPageId heapPageId = new HeapPageId(heapFile.getId(), pageNumber);
                // fetch the page from the buffer pool with read-only permission
                HeapPage page = (HeapPage)Database.getBufferPool().getPage(tid, heapPageId, Permissions.READ_ONLY);
                return page.iterator();
            }

            throw new DbException(String.format("heapFile %d not contain page %d",  heapFile.getId(),pageNumber));
        }

        @Override
        public void open() throws DbException, TransactionAbortedException {
            //start at page 0
            this.whichPage=0;
            //point the iterator at the first tuple of page 0 of this file
            iterator = getPageTuple(whichPage);
        }

        @Override
        public boolean hasNext() throws DbException, TransactionAbortedException {
            //not opened yet
            if(iterator==null){
                return false;
            }
            //current page exhausted: move on to the following pages
            if(!iterator.hasNext()){
                //some pages may be empty, so keep advancing in a loop
                while(whichPage< (heapFile.numPages()-1)){
                    whichPage++;
                    iterator=getPageTuple(whichPage);
                    if(iterator.hasNext()){
                        return true;
                    }
                }
                return false;
            }

            return true;

        }

        @Override
        public Tuple next() throws DbException, TransactionAbortedException, NoSuchElementException {
            //throw if the iterator is not open or no tuples remain
            if(iterator == null || !iterator.hasNext()){
                throw new NoSuchElementException();
            }
            // return the next tuple
            return iterator.next();
        }

        //restart from the beginning
        @Override
        public void rewind() throws DbException, TransactionAbortedException {
            // drop the previous iterator
            close();
            // and start over
            open();
        }

        @Override
        public void close() {
            iterator=null;
        }
    }
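The page-skipping logic in hasNext is essentially an "iterator over per-page iterators". The same pattern in isolation, with plain lists standing in for pages (hypothetical class, not part of the lab code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ChainedIterator {
    // Flattens a list of "pages" (lists of tuples) the way HeapFileIterator
    // does: when the current page runs out, keep advancing until a non-empty
    // page is found or the file ends.
    static List<Integer> drain(List<List<Integer>> pages) {
        List<Integer> out = new ArrayList<>();
        int whichPage = 0;
        Iterator<Integer> it = pages.isEmpty() ? null : pages.get(0).iterator();
        while (it != null) {
            if (it.hasNext()) {
                out.add(it.next());
            } else if (whichPage < pages.size() - 1) {
                it = pages.get(++whichPage).iterator(); // advance past an exhausted page
            } else {
                it = null; // no pages left
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> pages = Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.<Integer>asList(), // an empty page in the middle
                Arrays.asList(3));
        System.out.println(drain(pages)); // [1, 2, 3]
    }
}
```

The empty middle page is why hasNext needs a while loop rather than a single "try the next page" step.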

Full code:

package simpledb.storage;

import simpledb.common.Database;
import simpledb.common.DbException;
import simpledb.common.Debug;
import simpledb.common.Permissions;
import simpledb.transaction.TransactionAbortedException;
import simpledb.transaction.TransactionId;

import java.io.*;
import java.util.*;

/**
 * HeapFile is an implementation of a DbFile that stores a collection of tuples
 * in no particular order. Tuples are stored on pages, each of which is a fixed
 * size, and the file is simply a collection of those pages. HeapFile works
 * closely with HeapPage. The format of HeapPages is described in the HeapPage
 * constructor.
 * 
 * @see HeapPage#HeapPage
 * @author Sam Madden
 */
public class HeapFile implements DbFile {


    /**
     * the backing file
     */
    private final File file;

    /**
     * schema of the table
     */
    private final TupleDesc td;


    /**
     * A HeapFileIterator implementation,
     * used to iterate over the tuples of this file row by row
     */
    private static class HeapFileIterator implements DbFileIterator{

        //the heap file
        private final HeapFile heapFile;
        //transaction id
        private final TransactionId tid;
        // iterator over the tuples of the current page
        private Iterator<Tuple> iterator;
        //page number the iterator is currently on
        private int whichPage;

        public HeapFileIterator(HeapFile heapFile,TransactionId tid){
            this.heapFile=heapFile;
            this.tid=tid;
        }

        // returns a tuple iterator for the page with the given page number
        private Iterator<Tuple> getPageTuple(int pageNumber) throws TransactionAbortedException, DbException {
            //first check that the page number is within the file
            if(pageNumber>=0 && pageNumber<heapFile.numPages()){
                HeapPageId heapPageId = new HeapPageId(heapFile.getId(), pageNumber);
                // fetch the page from the buffer pool with read-only permission
                HeapPage page = (HeapPage)Database.getBufferPool().getPage(tid, heapPageId, Permissions.READ_ONLY);
                return page.iterator();
            }

            throw new DbException(String.format("heapFile %d not contain page %d",  heapFile.getId(),pageNumber));
        }

        @Override
        public void open() throws DbException, TransactionAbortedException {
            //start at page 0
            this.whichPage=0;
            //point the iterator at the first tuple of page 0 of this file
            iterator = getPageTuple(whichPage);
        }

        @Override
        public boolean hasNext() throws DbException, TransactionAbortedException {
            //如果迭代器为null
            if(iterator==null){
                return false;
            }
            //如果当前页码没有元素了  查看下一个页
            if(!iterator.hasNext()){
                //某些页可能没有存储的  所以需要while
                while(whichPage< (heapFile.numPages()-1)){
                    whichPage++;
                    iterator=getPageTuple(whichPage);
                    if(iterator.hasNext()){
                        return true;
                    }
                }
                return false;
            }

            return true;

        }

        @Override
        public Tuple next() throws DbException, TransactionAbortedException, NoSuchElementException {
            // no iterator, or no remaining tuples: throw
            if(iterator == null || !iterator.hasNext()){
                throw new NoSuchElementException();
            }
            // return the next tuple
            return iterator.next();
        }

        // restart the scan from the beginning
        @Override
        public void rewind() throws DbException, TransactionAbortedException {
            // discard the previous iterator
            close();
            // and start over
            open();
        }

        @Override
        public void close() {
            iterator=null;
        }
    }

    /**
     * Constructs a heap file backed by the specified file.
     * 
     * @param f
     *            the file that stores the on-disk backing store for this heap
     *            file.
     */
    public HeapFile(File f, TupleDesc td) {
        // some code goes here
        this.file=f;
        this.td=td;
    }

    /**
     * Returns the File backing this HeapFile on disk.
     * 
     * @return the File backing this HeapFile on disk.
     */
    public File getFile() {
        // some code goes here
        return file;
    }

    /**
     * Returns an ID uniquely identifying this HeapFile. Implementation note:
     * you will need to generate this tableid somewhere to ensure that each
     * HeapFile has a "unique id," and that you always return the same value for
     * a particular HeapFile. We suggest hashing the absolute file name of the
     * file underlying the heapfile, i.e. f.getAbsoluteFile().hashCode().
     *
     * The id generated from the absolute file path is used as the table id, so one file corresponds to one table.
     *
     * @return an ID uniquely identifying this HeapFile.
     */
    public int getId() {
        // some code goes here
        //throw new UnsupportedOperationException("implement this");
        return file.getAbsoluteFile().hashCode();
    }

    /**
     * Returns the TupleDesc of the table stored in this DbFile.
     * 
     * @return TupleDesc of this DbFile.
     */
    public TupleDesc getTupleDesc() {
        // some code goes here
        //throw new UnsupportedOperationException("implement this");
        return td;
    }

    // see DbFile.java for javadocs
    public Page readPage(PageId pid) {
        // some code goes here
        // the table this page belongs to is stored in this file
        int tableId = pid.getTableId();
        // position of the page within the file: page number pid.pgNo
        int pageNumber = pid.getPageNumber();
        // random access lets us seek straight to the page offset;
        // try-with-resources closes the file even when we throw
        try (RandomAccessFile p = new RandomAccessFile(file, "r")) {
            // if the requested page lies beyond the end of the file, fail
            if((pageNumber+1L)*BufferPool.getPageSize()>file.length()){
                throw new IllegalArgumentException(String.format("table %d has no page %d",tableId,pageNumber));
            }

            // seek to the page offset, then read one page worth of bytes
            byte[] bytes = new byte[BufferPool.getPageSize()];
            p.seek((long) pageNumber *BufferPool.getPageSize());

            // if fewer bytes were read than a full page, the page does not exist
            int read = p.read(bytes, 0, BufferPool.getPageSize());
            if(read<BufferPool.getPageSize()){
                throw new IllegalArgumentException(String.format("table %d has no page %d",tableId,pageNumber));
            }
            return new HeapPage(new HeapPageId(tableId,pageNumber),bytes);
        } catch (IOException e) {
            throw new IllegalArgumentException(String.format("table %d has no page %d",tableId,pageNumber), e);
        }
    }

    // see DbFile.java for javadocs
    public void writePage(Page page) throws IOException {
        // some code goes here
        // not necessary for lab1
        // get the page number and check that it is within the bounds of the file
        int pageNo = page.getId().getPageNumber();
        if(pageNo>numPages()){
            throw new IllegalArgumentException("page is not in the heap file or page id is wrong");
        }
        // seek to the page offset and write the page data
        RandomAccessFile randomAccessFile = new RandomAccessFile(this.file, "rw");
        randomAccessFile.seek((long) pageNo * BufferPool.getPageSize());
        byte[] pageData = page.getPageData();
        randomAccessFile.write(pageData);
        randomAccessFile.close();
    }

    /**
     * Returns the number of pages in this HeapFile,
     * i.e. the number of pages currently stored in the backing file.
     */
    public int numPages() {
        // some code goes here
        // file length / bytes per page
        return (int) Math.floor(file.length()*1.0/BufferPool.getPageSize());
    }
    }

    // see DbFile.java for javadocs
    // returns the dirtied page, or the newly written page
    public List<Page> insertTuple(TransactionId tid, Tuple t)
            throws DbException, IOException, TransactionAbortedException {
        // some code goes here
        ArrayList<Page> list = new ArrayList<>();
        // every CRUD operation goes through the buffer pool:
        // if the page is cached, the buffer pool returns it directly;
        // otherwise it reads the page from disk first, then returns it
        BufferPool bufferPool = Database.getBufferPool();
        int tableid=getId();
        // scan the existing pages for one with a free slot
        for(int i=0;i<numPages();i++){
            HeapPage page = (HeapPage)bufferPool.getPage(tid, new HeapPageId(tableid, i), Permissions.READ_WRITE);
            if(page.getNumEmptySlots() > 0){
                page.insertTuple(t);
                page.markDirty(true,tid);
                list.add(page);
                return list;
            }
        }

        // no page has an empty slot: append a new page
        HeapPage page = new HeapPage(new HeapPageId(tableid, numPages()), HeapPage.createEmptyPageData());
        page.insertTuple(t);
        writePage(page);
        list.add(page);
        return list;
        // not necessary for lab1
    }

    // see DbFile.java for javadocs
    // returns the dirtied page the tuple was deleted from
    public ArrayList<Page> deleteTuple(TransactionId tid, Tuple t) throws DbException,
            TransactionAbortedException {
        // some code goes here
        // fetch the page holding the tuple from the buffer pool, delete the tuple and mark the page dirty
        ArrayList<Page> list = new ArrayList<>();
        HeapPage page = (HeapPage) Database.getBufferPool().getPage(tid, t.getRecordId().getPageId(), Permissions.READ_WRITE);
        page.deleteTuple(t);

        page.markDirty(true,tid);

        list.add(page);
        return list;
        // not necessary for lab1
    }

    // see DbFile.java for javadocs
    public DbFileIterator iterator(TransactionId tid) {
        // some code goes here
        return new HeapFileIterator(this,tid);
    }

}


Test:
HeapFileReadTest
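The fixed-size page arithmetic that readPage and writePage rely on can be exercised outside SimpleDB: page N always starts at byte offset N * pageSize, so a seek plus a full-page read is all that is needed. A minimal self-contained sketch (the 64-byte page size and class name are made up for illustration; SimpleDB uses BufferPool.getPageSize()):

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.util.Arrays;

public class PageIO {
    static final int PAGE_SIZE = 64; // toy page size for the sketch

    // read page pageNo from the file, mirroring HeapFile.readPage's seek + read
    static byte[] readPage(File f, int pageNo) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            if ((pageNo + 1L) * PAGE_SIZE > f.length())
                throw new IllegalArgumentException("page " + pageNo + " does not exist");
            byte[] page = new byte[PAGE_SIZE];
            raf.seek((long) pageNo * PAGE_SIZE);   // offset = pageNo * pageSize
            raf.readFully(page);
            return page;
        }
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("pages", ".dat");
        f.deleteOnExit();
        // write two pages, filled with 1s and 2s respectively
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            byte[] p0 = new byte[PAGE_SIZE]; Arrays.fill(p0, (byte) 1);
            byte[] p1 = new byte[PAGE_SIZE]; Arrays.fill(p1, (byte) 2);
            raf.write(p0);
            raf.write(p1);
        }
        System.out.println(readPage(f, 1)[0]);      // first byte of page 1
        System.out.println(f.length() / PAGE_SIZE); // page count, as in numPages()
    }
}
```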

Exercise 6

SeqScan: sequential scan

SeqScan scans every tuple of a table in sequence.
It implements the OpIterator interface and holds:
        the transaction id            private TransactionId tid;
        the id of the table to scan   private int tableid;
        the table's alias             private String tableAlias;
        an iterator used to produce results   DbFileIterator dbFileIterator;

Since it implements OpIterator, it must override next and the other iterator methods.
DbFileIterator is HeapFile's iterator, and its next method returns the tuples one after another,
so SeqScan's next can simply return what DbFileIterator's next returns.

DbFileIterator is the iterator of a file: it returns the tuples stored in the file.
HeapFile stores its data in no particular order, and HeapFileIterator returns the file's tuples sequentially.
OpIterator is the iterator of an operator: its next returns each tuple produced by the operation.
The later operators (join, insert, delete) will all implement OpIterator too.
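The page-skipping strategy inside HeapFileIterator can be sketched generically as an iterator over a list of "pages" (lists) that transparently skips empty ones; the class and variable names below are made up for illustration, not SimpleDB API:

```java
import java.util.*;

// Minimal sketch of HeapFileIterator's strategy: iterate the tuples of one
// "page" at a time, advancing to the next non-empty page when the current one runs out.
public class PagedIterator<T> implements Iterator<T> {
    private final List<List<T>> pages;
    private int whichPage = 0;
    private Iterator<T> current;

    public PagedIterator(List<List<T>> pages) {
        this.pages = pages;
        this.current = pages.isEmpty() ? Collections.<T>emptyIterator()
                                       : pages.get(0).iterator();
    }

    @Override
    public boolean hasNext() {
        if (current.hasNext()) return true;
        // some pages may be empty, so keep advancing (the same while loop as hasNext above)
        while (whichPage < pages.size() - 1) {
            whichPage++;
            current = pages.get(whichPage).iterator();
            if (current.hasNext()) return true;
        }
        return false;
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return current.next();
    }

    public static void main(String[] args) {
        // the middle "page" is empty and gets skipped
        List<List<Integer>> pages = List.of(List.of(1, 2), List.of(), List.of(3));
        PagedIterator<Integer> it = new PagedIterator<>(pages);
        StringBuilder sb = new StringBuilder();
        while (it.hasNext()) sb.append(it.next());
        System.out.println(sb);   // 123
    }
}
```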

package simpledb.execution;

import simpledb.common.Database;
import simpledb.transaction.TransactionAbortedException;
import simpledb.transaction.TransactionId;
import simpledb.common.Type;
import simpledb.common.DbException;
import simpledb.storage.DbFileIterator;
import simpledb.storage.Tuple;
import simpledb.storage.TupleDesc;

import java.util.*;

/**
 * SeqScan is an implementation of a sequential scan access method that reads
 * each tuple of a table in no particular order (e.g., as they are laid out on
 * disk).
 *
 * Sequentially scans one table.
 */
public class SeqScan implements OpIterator {

    private static final long serialVersionUID = 1L;

    /**
     * id of the transaction this scan runs in
     */
    private TransactionId tid;
    /**
     * id and alias of the table to scan
     */
    private int tableid;
    private String tableAlias;
    /**
     * iterator that drives the sequential scan
     */
    DbFileIterator dbFileIterator;

    /**
     * Creates a sequential scan over the specified table as a part of the
     * specified transaction.
     *
     * @param tid
     *            The transaction this scan is running as a part of.
     * @param tableid
     *            the table to scan.
     * @param tableAlias
     *            the alias of this table (needed by the parser); the returned
     *            tupleDesc should have fields with name tableAlias.fieldName
     *            (note: this class is not responsible for handling a case where
     *            tableAlias or fieldName are null. It shouldn't crash if they
     *            are, but the resulting name can be null.fieldName,
     *            tableAlias.null, or null.null).
     */
    public SeqScan(TransactionId tid, int tableid, String tableAlias) {
        // some code goes here
        this.tid=tid;
        this.tableid = tableid;
        this.tableAlias=tableAlias;
    }

    /**
     * @return
     *       return the table name of the table the operator scans. This should
     *       be the actual name of the table in the catalog of the database
     * */
    public String getTableName() {
        return Database.getCatalog().getTableName(tableid);
    }

    /**
     * @return Return the alias of the table this operator scans.
     * */
    public String getAlias()
    {
        // some code goes here
        return tableAlias;
    }

    /**
     * Reset the tableid, and tableAlias of this operator.
     * @param tableid
     *            the table to scan.
     * @param tableAlias
     *            the alias of this table (needed by the parser); the returned
     *            tupleDesc should have fields with name tableAlias.fieldName
     *            (note: this class is not responsible for handling a case where
     *            tableAlias or fieldName are null. It shouldn't crash if they
     *            are, but the resulting name can be null.fieldName,
     *            tableAlias.null, or null.null).
     */
    public void reset(int tableid, String tableAlias) {
        // some code goes here
        this.tableid=tableid;
        this.tableAlias=tableAlias;
    }

    public SeqScan(TransactionId tid, int tableId) {
        this(tid, tableId, Database.getCatalog().getTableName(tableId));
    }

    public void open() throws DbException, TransactionAbortedException {
        // some code goes here
        dbFileIterator=Database.getCatalog().getDatabaseFile(tableid).iterator(tid);
        dbFileIterator.open();
    }

    /**
     * Returns the TupleDesc with field names from the underlying HeapFile,
     * prefixed with the tableAlias string from the constructor. This prefix
     * becomes useful when joining tables containing a field(s) with the same
     * name.  The alias and name should be separated with a "." character
     * (e.g., "alias.fieldName").
     *
     * @return the TupleDesc with field names from the underlying HeapFile,
     *         prefixed with the tableAlias string from the constructor.
     */
    public TupleDesc getTupleDesc() {
        // some code goes here
        String prefix = tableAlias != null? tableAlias : "null";
        TupleDesc tupleDesc = Database.getCatalog().getTupleDesc(tableid);
        int len=tupleDesc.numFields();
        Type[] types = new Type[len];
        String[] names = new String[len];
        for(int i=0;i<len;i++){
            types[i]=tupleDesc.getFieldType(i);
            names[i]=prefix+"."+tupleDesc.getFieldName(i);
        }
        return new TupleDesc(types,names);
    }

    public boolean hasNext() throws TransactionAbortedException, DbException {
        // some code goes here
        // return false if the scan has not been opened
        if(dbFileIterator==null){
            return false;
        }
        return dbFileIterator.hasNext();
    }

    public Tuple next() throws NoSuchElementException,
            TransactionAbortedException, DbException {
        // some code goes here
        // throw if the scan has not been opened
        if(dbFileIterator==null){
            throw new NoSuchElementException("no next Tuple");
        }
        Tuple next = dbFileIterator.next();
        if(next==null){
            throw new NoSuchElementException("no next Tuple");
        }
        return next;
    }

    public void close() {
        // some code goes here
        dbFileIterator.close();
        //dbFileIterator=null;
    }

    public void rewind() throws DbException, NoSuchElementException,
            TransactionAbortedException {
        // some code goes here
        dbFileIterator.rewind();
    }
}

Summary

Now let's walk through:
1. creating a table
2. inserting a row
3. reading all rows of the table
4. deleting a row

1. Creating a table
We need to specify the tupleDesc (schema), the table name and the primary-key field,
plus the physical HeapFile backing the table,
and then call Catalog's addTable method.
    /**
     * 	1. Creating a table
     * 	We need to specify the tupleDesc (schema), the table name and the primary-key field,
     * 	plus the physical HeapFile backing the table,
     * 	then call Catalog's addTable method.
     */
    @Test
     public void testCreate() throws IOException {
        Catalog catalog = Database.getCatalog();
        //1. specify the table schema (header)
        String[] names = {"id", "age", "name"};
        Type[] types = {Type.INT_TYPE, Type.INT_TYPE, Type.STRING_TYPE};
        TupleDesc tupleDesc = new TupleDesc(types,names);

        //create the physical file backing the table; note a table (database) file must be a binary .dat file
        //convert .txt to .dat
        File infile = new File("student.txt");
        File outfile = new File("student.dat");
        if(!infile.exists()){
            boolean infileNewFile = infile.createNewFile();
            System.out.println(infileNewFile);
        }
        if(!outfile.exists()){
            boolean outfileNewFile = outfile.createNewFile();
            System.out.println(outfileNewFile);
        }
        HeapFileEncoder.convert(infile,outfile, BufferPool.getPageSize(),3);

        //2. create the HeapFile that stores the table
        HeapFile heapFile = new HeapFile(outfile, tupleDesc);

        //3. add the table file to the database catalog, specifying the table name and primary-key field
        catalog.addTable(heapFile,"student_table","id");

    }
2. Inserting a row
We need to specify:
        the table name
        the Tuple
We use the catalog to look up the table id for the table name,
then call the buffer pool's insertTuple with that table id.
The buffer pool in turn calls DbFile's insert:
        DbFile fetches the page from the buffer pool (on a miss the buffer pool reads it from disk),
        writes the tuple onto the cached page and returns the dirty page to the caller.
After the insert we explicitly call flush to write the change to disk.
    /**
     * Inserting a row
     *  		We need to specify:
     *  					the table name
     *  					the Tuple
     *  		We use the catalog to look up the table id for the table name,
     *  		then call the buffer pool's insertTuple with that table id.
     *  		The buffer pool in turn calls DbFile's insert:
     *  					DbFile fetches the page from the buffer pool (reading from disk on a miss),
     *  					writes the tuple onto the cached page and returns the dirty page to the caller.
     *  		After the insert we explicitly call flush to write the change to disk.
     */

    @Test
    public void insert() throws IOException, TransactionAbortedException, DbException {
        //call testCreate above so the test table is registered with the database catalog
        testCreate();
        //1. specify the table name and the tuple
        String name="student_table";
        String[] names = {"id", "age", "name"};
        Type[] types = {Type.INT_TYPE, Type.INT_TYPE, Type.STRING_TYPE};
        TupleDesc tupleDesc = new TupleDesc(types,names);
        Tuple tuple = new Tuple(tupleDesc);
        tuple.setField(0,new IntField(1));
        tuple.setField(1,new IntField(18));
        tuple.setField(2,new StringField("xiaoming",10));

        //2. look up the table id through the catalog
        Catalog catalog = Database.getCatalog();
        int tableId = catalog.getTableId(name);
        //3. insert through the buffer pool using the table id:
        //   the buffer pool calls DbFile's insert, which fetches the page
        //   (reading from disk on a miss), writes the tuple onto it
        //   and returns the dirty page
        BufferPool bufferPool = Database.getBufferPool();
        bufferPool.insertTuple(new TransactionId(),tableId,tuple);
        //4. the page is marked dirty in the LRU cache but not yet on disk, so flush
        bufferPool.flushAllPages();
    }
3. Reading all rows of the table
The catalog maps the table name to a table id,
and the table id to the corresponding table file.
The DbFile's numPages tells us how many pages there are.

The buffer pool's getPage fetches each page:
        on an LRU-cache miss it reads from disk;
        HeapFile's readPage reads one page of data into a byte array
        and constructs a HeapPage from it.
For each page we just collect its occupied tuple slots
(I wasn't sure how to read them through the stock interface here, so I wrote my own get method).
    /**
     * 	3. Reading all rows of the table
     * 	The catalog maps the table name to a table id,
     * 	and the table id to the corresponding table file.
     * 	heapFile's numPages tells us how many pages there are.
     *
     * 	The buffer pool's getPage fetches each page:
     * 						on an LRU-cache miss it reads from disk;
     * 						HeapFile's readPage reads one page of data into a byte array
     * 						and constructs a HeapPage from it.
     * 	For each page we just collect its occupied tuple slots.
     */
    @Test
    public void selectAll() throws IOException, TransactionAbortedException, DbException {
        insert();
        String name="student_table";

        Catalog catalog = Database.getCatalog();
        BufferPool bufferPool = Database.getBufferPool();
        int tableId = catalog.getTableId(name);
        HeapFile heapFile = (HeapFile)catalog.getDatabaseFile(tableId);
        int numPages = heapFile.numPages();
        for (int i = 0; i < numPages; i++) {
            HeapPage page = (HeapPage)bufferPool.getPage(new TransactionId(), new HeapPageId(tableId, i), Permissions.READ_ONLY); //page numbers grow from 0 as tuples are inserted
            Tuple[] tuples = page.getTuples();
            Arrays.stream(tuples)
                    .filter(n -> n!=null)
                    .forEach(System.out::println);
        }
    }


4. Deleting a row
At this point insertion is easy: a tuple goes into any page with room.
But deletion is hard,
because nothing organizes how the data is laid out on disk:
every delete must scan everything, i.e. fetch all tuples, just to find the target tuple's recordId (its position),
and only then can it be deleted.
This is why the B+ tree in a later lab matters so much.
    /**
     * 	4. Deleting a row
     * 	Insertion is easy here: a tuple goes into any page with room.
     * 	But deletion is hard,
     * 	because nothing organizes how the data is laid out on disk:
     * 	every delete must scan everything to find the target tuple's recordId (its position)
     * 	before it can be deleted.
     * 	This is why the B+ tree in a later lab matters so much.
     */
    @Test
    public void delete() throws IOException, TransactionAbortedException, DbException {
            //insert xiaoming
            insert();
            System.out.println("before delete--------------");
            //query once
            selectAll();

            Catalog catalog = Database.getCatalog();
            BufferPool bufferPool = Database.getBufferPool();
            //build the tuple we want to delete
            String name="student_table";
            String[] names = {"id", "age", "name"};
            Type[] types = {Type.INT_TYPE, Type.INT_TYPE, Type.STRING_TYPE};
            TupleDesc tupleDesc = new TupleDesc(types,names);
            Tuple tuple = new Tuple(tupleDesc);
            tuple.setField(0,new IntField(1));
            tuple.setField(1,new IntField(18));
            tuple.setField(2,new StringField("xiaoming",10));
            //since we don't know its position, scan the whole table

            int tableId = catalog.getTableId(name);
            HeapFile heapFile = (HeapFile)catalog.getDatabaseFile(tableId);
            int numPages = heapFile.numPages();
            List<Tuple> collectAll=new ArrayList<>();
            for (int i = 0; i < numPages; i++) {
                HeapPage page = (HeapPage)bufferPool.getPage(new TransactionId(), new HeapPageId(tableId, i), Permissions.READ_ONLY); //page numbers grow from 0 as tuples are inserted
                Tuple[] tuples = page.getTuples();
                List<Tuple> collect = Arrays.stream(tuples)
                        .filter(tupleTemp -> {
                            if(tupleTemp==null){
                                return false;
                            }
                            boolean matches = tupleTemp.getTupleDesc().equals(tuple.getTupleDesc());
                            Iterator<Field> fields = tupleTemp.fields();
                            int counter=0;
                            while (fields.hasNext()){
                                Field next = fields.next();
                                // accumulate with && so one mismatching field rules the tuple out
                                matches = matches && next.equals(tuple.getField(counter));
                                counter++;
                            }
                            return matches;
                        } )
                        .collect(Collectors.toList());

                if (!collect.isEmpty()) collectAll.addAll(collect);
            }

            //take the matching tuple we found and delete it
            Tuple tuple1 = collectAll.get(0);
            bufferPool.deleteTuple(new TransactionId(),tuple1);

            System.out.println("after delete------------");
            //query again after the delete
            selectAll();
    }


In summary:
the catalog records all tables in the database and their backing files;
creating a table and its file goes through Catalog's addTable.

As for CRUD operations,
everything starts from the BufferPool.
The BufferPool calls DbFile's page read/write and tuple insert/delete operations, all of which work at page granularity; a page stores a collection of tuples plus their TupleDesc.
DbFile operations in turn fetch pages from the BufferPool's cache; on a miss the BufferPool calls DbFile's readPage and adds the page to the cache.

Dirty pages stay in the cache first; they are flushed when the cache fills up, and flushAll runs when the database shuts down.
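The buffer-pool flow described above (cache hit returns directly, miss loads from the backing store and caches, eldest page evicted once the cache is full) can be sketched with a LinkedHashMap-based LRU cache. The class name, loader, and 2-page capacity below are made up for illustration; this is not SimpleDB's BufferPool API:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Toy LRU page cache mirroring the BufferPool flow: getPage serves hits from
// the cache and loads misses from the "disk" loader; the least-recently-used
// page is evicted (a real buffer pool would flush it if dirty) when capacity is exceeded.
public class LruPageCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;
    private final Function<K, V> loader;   // stands in for DbFile.readPage

    public LruPageCache(int capacity, Function<K, V> loader) {
        super(16, 0.75f, true);            // access-order = LRU behaviour
        this.capacity = capacity;
        this.loader = loader;
    }

    public V getPage(K pageId) {
        // cache hit: return directly; miss: load from "disk" and cache
        return computeIfAbsent(pageId, loader);
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // evict the least-recently-used page once over capacity
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruPageCache<Integer, String> pool =
                new LruPageCache<>(2, pageNo -> "page-" + pageNo);
        pool.getPage(0);
        pool.getPage(1);
        pool.getPage(0);       // touch page 0 so page 1 becomes least recently used
        pool.getPage(2);       // evicts page 1
        System.out.println(pool.keySet());   // [0, 2]
    }
}
```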

