depth: int. Number of Transformer blocks. In the paper, a Transformer block is one Transformer encoder layer, so depth = 8 means 8 Transformer encoder layers stacked one after another.
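To make the meaning of depth concrete, here is a minimal sketch that stacks depth encoder layers. It assumes PyTorch is installed and uses the built-in `nn.TransformerEncoderLayer` as a stand-in for one Transformer block. The class name `TinyEncoder` and the sizes `dim=64`, `heads=4`, `mlp_dim=128` are hypothetical values chosen for illustration, not taken from the original.

```python
import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Hypothetical helper: a stack of `depth` Transformer encoder layers."""

    def __init__(self, dim: int, depth: int, heads: int, mlp_dim: int):
        super().__init__()
        # One entry per Transformer block; depth controls how many are created.
        # norm_first=True applies LayerNorm before attention/MLP (an assumption
        # made here to resemble the pre-norm layout commonly used in ViT).
        self.layers = nn.ModuleList([
            nn.TransformerEncoderLayer(
                d_model=dim,
                nhead=heads,
                dim_feedforward=mlp_dim,
                batch_first=True,
                norm_first=True,
            )
            for _ in range(depth)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pass the token sequence through each block in order.
        for layer in self.layers:
            x = layer(x)
        return x


encoder = TinyEncoder(dim=64, depth=8, heads=4, mlp_dim=128)
tokens = torch.randn(2, 17, 64)  # (batch, sequence length, embedding dim)
out = encoder(tokens)

print(len(encoder.layers))  # 8
print(out.shape)            # torch.Size([2, 17, 64])
```

Changing depth only changes how many layers are in the stack. The output keeps the same shape as the input, because each encoder layer maps a sequence of dim-sized tokens to a sequence of the same size.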