SimpleDB lab1 Ex5~Ex7

1 EX5 HeapFile

2020-10-25
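Following the lab's hint, this exercise implements reading the file from disk. To do that, we first have to compute the correct offset. Because pages must be read at arbitrary positions in the file, the file is accessed through RandomAccessFile.
From DbFile.java we can find how the offset is computed: offset = pageNo * pageSize, where pageNo is the number of the page within the file (counting from 0) and pageSize is the page size.
RandomAccessFile's seek() function jumps directly to the start of the requested page, from where one page of data is read. The HeapPage constructor then turns the raw byte data into a HeapPage instance.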
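
The offset arithmetic and the seek-then-read step can be tried outside SimpleDB. What follows is a minimal sketch rather than the lab's code: the class PageReadSketch and its methods pageOffset, numPages, readPage and writeDemoFile are hypothetical names, and a page is represented as a bare byte array instead of a HeapPage.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;

// Hypothetical helper, not part of SimpleDB: pages are plain byte arrays here.
final class PageReadSketch {

    // Byte offset of page pageNo (0-based); long arithmetic avoids int overflow.
    static long pageOffset(int pageNo, int pageSize) {
        return (long) pageNo * pageSize;
    }

    // Number of complete pages stored in the file.
    static int numPages(File f, int pageSize) {
        return (int) (f.length() / pageSize);
    }

    // Seeks to the page's offset and reads exactly pageSize bytes.
    static byte[] readPage(File f, int pageNo, int pageSize) throws IOException {
        if (pageNo < 0 || pageNo >= numPages(f, pageSize)) {
            throw new IllegalArgumentException("no such page: " + pageNo);
        }
        byte[] buffer = new byte[pageSize];
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            raf.seek(pageOffset(pageNo, pageSize));
            raf.readFully(buffer); // throws EOFException on a short read
        }
        return buffer;
    }

    // Writes a temporary file of `pages` pages; every byte of page i holds i.
    static File writeDemoFile(int pages, int pageSize) throws IOException {
        File f = File.createTempFile("pages", ".dat");
        f.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            for (int page = 0; page < pages; page++) {
                byte[] data = new byte[pageSize];
                Arrays.fill(data, (byte) page);
                raf.write(data);
            }
        }
        return f;
    }

    public static void main(String[] args) throws IOException {
        File f = writeDemoFile(3, 8);
        System.out.println(numPages(f, 8));       // prints 3
        System.out.println(readPage(f, 2, 8)[0]); // prints 2
    }
}
```

Opening the file inside try-with-resources means the handle is closed even when the read fails, which matters once many pages are read over the lifetime of a process.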

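The other task is building the iterator, which is comparatively simple. Its job is to traverse every Tuple in the HeapFile. The per-page iterator was already implemented in HeapPage, so we can fetch each HeapPage and return Tuples through that page's iterator. The lab document gives a hint here: pages should be read through BufferPool's getPage() function. In addition, do not load all pages into memory in open(); avoiding that gives better performance. Instead, once a page has been fully read and the file still has unread pages, load the next page into memory and release the page that was just read.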
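
The page-at-a-time strategy described above can be sketched independently of SimpleDB. In this sketch the class LazyPageIterator and its demo method are hypothetical; a "page" is just a List, and the loader function stands in for the call to BufferPool's getPage().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.IntFunction;

// Hypothetical stand-in for the HeapFile iterator: a "page" is a List<T> and
// the loader plays the role of BufferPool.getPage().
final class LazyPageIterator<T> implements Iterator<T> {
    private final int numPages;
    private final IntFunction<List<T>> loader;
    private int currentPage = -1;                       // no page loaded yet
    private Iterator<T> it = Collections.emptyIterator();

    LazyPageIterator(int numPages, IntFunction<List<T>> loader) {
        this.numPages = numPages;
        this.loader = loader;
    }

    @Override
    public boolean hasNext() {
        // Advance page by page until a tuple is found or the file is exhausted;
        // the loop also skips pages that contain no tuples.
        while (!it.hasNext()) {
            if (currentPage + 1 >= numPages) {
                return false;
            }
            currentPage++;
            it = loader.apply(currentPage).iterator();  // only one page is held
        }
        return true;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return it.next();
    }

    // Drains an iterator over three pages (the middle one empty) and records
    // the order in which pages were loaded.
    static List<Integer> demo(List<Integer> loadedPages) {
        List<List<Integer>> pages = Arrays.asList(
                Arrays.asList(1, 2), Collections.<Integer>emptyList(), Arrays.asList(3));
        LazyPageIterator<Integer> iter = new LazyPageIterator<>(pages.size(), pageNo -> {
            loadedPages.add(pageNo);
            return pages.get(pageNo);
        });
        List<Integer> out = new ArrayList<>();
        while (iter.hasNext()) {
            out.add(iter.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> loaded = new ArrayList<>();
        System.out.println(demo(loaded)); // prints [1, 2, 3]
        System.out.println(loaded);       // prints [0, 1, 2]
    }
}
```

The bound check uses currentPage + 1 so that the loader is never asked for a page number equal to the page count.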

The remaining functions are routine and are not covered in detail.

Putting this together, the code for this part is as follows:

package simpledb;

import java.io.*;
import java.util.*;

/**
 * HeapFile is an implementation of a DbFile that stores a collection of tuples
 * in no particular order. Tuples are stored on pages, each of which is a fixed
 * size, and the file is simply a collection of those pages. HeapFile works
 * closely with HeapPage. The format of HeapPages is described in the HeapPage
 * constructor.
 *
 * @see simpledb.HeapPage#HeapPage
 * @author Sam Madden
 */
public class HeapFile implements DbFile {


    private File file;

    private TupleDesc td;
    /**
     * Constructs a heap file backed by the specified file.
     *
     * @param f
     *            the file that stores the on-disk backing store for this heap
     *            file.
     */
    public HeapFile(File f, TupleDesc td) {
        // some code goes here
        this.file = f;
        this.td = td;
    }

    /**
     * Returns the File backing this HeapFile on disk.
     *
     * @return the File backing this HeapFile on disk.
     */
    public File getFile() {
        // some code goes here
        return this.file;
    }

    /**
     * Returns an ID uniquely identifying this HeapFile. Implementation note:
     * you will need to generate this tableid somewhere to ensure that each
     * HeapFile has a "unique id," and that you always return the same value for
     * a particular HeapFile. We suggest hashing the absolute file name of the
     * file underlying the heapfile, i.e. f.getAbsoluteFile().hashCode().
     *
     * @return an ID uniquely identifying this HeapFile.
     */
    public int getId() {
        // some code goes here
        return file.getAbsoluteFile().hashCode();
    }

    /**
     * Returns the TupleDesc of the table stored in this DbFile.
     *
     * @return TupleDesc of this DbFile.
     */
    public TupleDesc getTupleDesc() {
        // some code goes here
        return this.td;
    }

    // see DbFile.java for javadocs
    public Page readPage(PageId pid) {
        // some code goes here
        // compute the offset from pid, then read one page
        int pgno = pid.getPageNumber();
        int pageSize = Database.getBufferPool().getPageSize();
        // long arithmetic keeps the offset from overflowing on large files
        long offset = (long) pgno * pageSize;
        // try-with-resources closes the file handle on every path
        try (RandomAccessFile randomAccessFile = new RandomAccessFile(file, "r")) {
            // read one page's worth of data
            byte[] buffer = new byte[pageSize];
            randomAccessFile.seek(offset);
            randomAccessFile.read(buffer);
            return new HeapPage(new HeapPageId(pid.getTableId(), pgno), buffer);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }

    // see DbFile.java for javadocs
    public void writePage(Page page) throws IOException {
        // some code goes here
        // not necessary for lab1
    }

    /**
     * Returns the number of pages in this HeapFile.
     */
    public int numPages() {
        // some code goes here
        // integer division already rounds down to the number of whole pages
        return (int) (file.length() / Database.getBufferPool().getPageSize());
    }

    // see DbFile.java for javadocs
    public ArrayList<Page> insertTuple(TransactionId tid, Tuple t)
            throws DbException, IOException, TransactionAbortedException {
        // some code goes here
        return null;
        // not necessary for lab1
    }

    // see DbFile.java for javadocs
    public ArrayList<Page> deleteTuple(TransactionId tid, Tuple t) throws DbException,
            TransactionAbortedException {
        // some code goes here
        return null;
        // not necessary for lab1
    }

    // see DbFile.java for javadocs
    public DbFileIterator iterator(TransactionId tid) {
        // some code goes here
        return new DbFileIterator() {
            int pages = numPages();
            int readingPage;
            PageId readingPid;
            Page page;
            Iterator<Tuple> it;
            @Override
            public void open() throws DbException, TransactionAbortedException {
                readingPage=0;
                readingPid = new HeapPageId(getId(), readingPage);
                page = Database.getBufferPool().getPage(tid, readingPid, Permissions.READ_ONLY);
                it = ((HeapPage)page).iterator();
            }

            @Override
            public boolean hasNext() throws DbException, TransactionAbortedException {
                if (it == null)
                    return false;
                // In lab2's DeleteTest a page can be empty, so the page count alone
                // does not tell whether a tuple remains; whether to read the next
                // page depends on whether the current page still has tuples.
                while (!it.hasNext()) {
                    // stop once the current page is the last page of the file
                    if (readingPage + 1 >= pages)
                        return false;
                    Database.getBufferPool().releasePage(tid, readingPid);
                    readingPage++;
                    readingPid = new HeapPageId(getId(), readingPage);
                    page = Database.getBufferPool().getPage(tid, readingPid, Permissions.READ_ONLY);
                    it = ((HeapPage) page).iterator();
                }
                return true;
            }

            @Override
            public Tuple next() throws DbException, TransactionAbortedException, NoSuchElementException {
                if(it == null)
                    throw new NoSuchElementException("iterator is not open");
                return it.next();
            }

            @Override
            public void rewind() throws DbException, TransactionAbortedException {
                readingPage=0;
                readingPid = new HeapPageId(getId(), readingPage);
                page = Database.getBufferPool().getPage(tid, readingPid, Permissions.READ_ONLY);
                it = ((HeapPage)page).iterator();
            }

            @Override
            public void close() {
                readingPage = pages+1;
                it = null;
                Database.getBufferPool().releasePage(tid, readingPid);
            }
        };
    }

}
Run result

[Screenshot of the run result]

Ex6 Operator

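Only SeqScan has to be implemented here, which is straightforward.
SeqScan implements the statement select * from table table_alias.
We can use the iterator implemented in EX5 to traverse the data in the table.
The code is as follows: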

package simpledb;

import java.util.*;

/**
 * SeqScan is an implementation of a sequential scan access method that reads
 * each tuple of a table in no particular order (e.g., as they are laid out on
 * disk).
 */
public class SeqScan implements OpIterator {

    private static final long serialVersionUID = 1L;

    private TransactionId tid;

    private int tableId;

    private String tableAlias;

    private DbFileIterator iterator;
    /**
     * Creates a sequential scan over the specified table as a part of the
     * specified transaction.
     *
     * @param tid
     *            The transaction this scan is running as a part of.
     * @param tableid
     *            the table to scan.
     * @param tableAlias
     *            the alias of this table (needed by the parser); the returned
     *            tupleDesc should have fields with name tableAlias.fieldName
     *            (note: this class is not responsible for handling a case where
     *            tableAlias or fieldName are null. It shouldn't crash if they
     *            are, but the resulting name can be null.fieldName,
     *            tableAlias.null, or null.null).
     */
    public SeqScan(TransactionId tid, int tableid, String tableAlias) {
        // some code goes here
        this.tid = tid;
        this.tableId = tableid;
        this.tableAlias = tableAlias;
        this.iterator = Database.getCatalog().getDatabaseFile(tableid).iterator(tid);
    }

    /**
     * @return
     *       return the table name of the table the operator scans. This should
     *       be the actual name of the table in the catalog of the database
     * */
    public String getTableName() {
        return Database.getCatalog().getTableName(tableId);
    }

    /**
     * @return Return the alias of the table this operator scans.
     * */
    public String getAlias()
    {
        // some code goes here
        return this.tableAlias;
    }

    /**
     * Reset the tableid, and tableAlias of this operator.
     * @param tableid
     *            the table to scan.
     * @param tableAlias
     *            the alias of this table (needed by the parser); the returned
     *            tupleDesc should have fields with name tableAlias.fieldName
     *            (note: this class is not responsible for handling a case where
     *            tableAlias or fieldName are null. It shouldn't crash if they
     *            are, but the resulting name can be null.fieldName,
     *            tableAlias.null, or null.null).
     */
    public void reset(int tableid, String tableAlias) {
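        // some code goes here
        this.tableId = tableid;
        this.tableAlias = tableAlias;
        // rebuild the iterator so that the scan reads the new table
        this.iterator = Database.getCatalog().getDatabaseFile(tableid).iterator(tid);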
    }

    public SeqScan(TransactionId tid, int tableId) {
        this(tid, tableId, Database.getCatalog().getTableName(tableId));
    }

    public void open() throws DbException, TransactionAbortedException {
        // some code goes here
        iterator.open();
    }

    /**
     * Returns the TupleDesc with field names from the underlying HeapFile,
     * prefixed with the tableAlias string from the constructor. This prefix
     * becomes useful when joining tables containing a field(s) with the same
     * name.  The alias and name should be separated with a "." character
     * (e.g., "alias.fieldName").
     *
     * @return the TupleDesc with field names from the underlying HeapFile,
     *         prefixed with the tableAlias string from the constructor.
     */
    public TupleDesc getTupleDesc() {
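        // some code goes here
        // prefix every field name with the alias, as the javadoc above requires
        TupleDesc td = Database.getCatalog().getTupleDesc(tableId);
        int n = td.numFields();
        Type[] types = new Type[n];
        String[] names = new String[n];
        for (int i = 0; i < n; i++) {
            types[i] = td.getFieldType(i);
            names[i] = tableAlias + "." + td.getFieldName(i);
        }
        return new TupleDesc(types, names);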
    }

    public boolean hasNext() throws TransactionAbortedException, DbException {
        // some code goes here
        return this.iterator.hasNext();
    }

    public Tuple next() throws NoSuchElementException,
            TransactionAbortedException, DbException {
        // some code goes here
        return this.iterator.next();
    }

    public void close() {
        // some code goes here
        this.iterator.close();
    }

    public void rewind() throws DbException, NoSuchElementException,
            TransactionAbortedException {
        // some code goes here
        this.iterator.rewind();
    }
}
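
The getTupleDesc javadoc above asks for field names of the form alias.fieldName and says that a null alias or field name must not crash. The naming rule on its own can be sketched as follows; AliasPrefixSketch and prefixFieldNames are hypothetical names, and the sketch works on bare String arrays instead of a TupleDesc.

```java
import java.util.Arrays;

// Hypothetical helper, not part of SimpleDB: works on bare String arrays
// instead of a TupleDesc.
final class AliasPrefixSketch {

    // Returns a new array whose entries have the form alias.fieldName.
    // A null alias or field name is not rejected; string concatenation
    // renders it as the text "null", e.g. "null.id".
    static String[] prefixFieldNames(String alias, String[] fieldNames) {
        String[] prefixed = new String[fieldNames.length];
        for (int i = 0; i < fieldNames.length; i++) {
            prefixed[i] = alias + "." + fieldNames[i];
        }
        return prefixed;
    }

    public static void main(String[] args) {
        String[] names = {"field0", "field1", "field2"};
        System.out.println(Arrays.toString(prefixFieldNames("t1", names)));
        // prints [t1.field0, t1.field1, t1.field2]
    }
}
```

The input array is left untouched, so the catalog's original field names stay available to other scans over the same table.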

Test

[Screenshot of the test run]

EX7 (sp)

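At this point, everything in lab1 has been implemented. The lab document notes that we can work out for ourselves how SQL maps onto these functions; I will update this later.
Following the hint, I created a test class: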

package simpledb.client;

import simpledb.*;

import java.io.File;

public class Test{

        public static void main(String[] argv) {
            // construct a 3-column table schema
            Type types[] = new Type[]{ Type.INT_TYPE, Type.INT_TYPE, Type.INT_TYPE };
            String names[] = new String[]{ "field0", "field1", "field2" };
            TupleDesc descriptor = new TupleDesc(types, names);
            // create the table, associate it with some_data_file.dat
            // and tell the catalog about the schema of this table.
            HeapFile table1 = new HeapFile(new File("datafile.dat"), descriptor);
            Database.getCatalog().addTable(table1, "test");
            // construct the query: we use a simple SeqScan, which spoonfeeds
            // tuples via its iterator.
            TransactionId tid = new TransactionId();
            SeqScan f = new SeqScan(tid, table1.getId());
            // print the field names
            System.out.println(names[0] + " " + names[1] + " " + names[2]);
            try {
                // and run it
                f.open();
                while (f.hasNext()) {
                    Tuple tup = f.next();
                    System.out.println(tup);
                }
                f.close();
                Database.getBufferPool().transactionComplete(tid);
            } catch (Exception e) {
                System.out.println ("Exception : " + e);
            }
        }

}

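As shown, this class uses the SeqScan written earlier to traverse all the data in the table and print it.
To produce the data file, run the SimpleDB main function with the arguments (args) convert filename.txt n, where n is the number of columns. This generates filename.dat, which stores the data from filename.txt as a binary database file that SimpleDB can read. Running the Test class then prints the contents of the table.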

Run result:
[Screenshot of the run result]
This completes lab1.
