[Pytorch] Tensor底层机制

整体关系:(逻辑分层,每层负责一部分特色功能)

Python侧的Tensor       -->继承自

torch._C._TensorBase --> 就是

THPVariableType        --> 实际对应

THPVariable                --> 包含成员

at::Tensor                    --> 包含成员

TensorImpl                  --> 包含成员

Storage                       --> 包含成员

StorageImpl                --> 包含成员

DataPtr                       --> 包含成员

UniqueVoidPtr            --> 包含成员

void* data_

Python的Tensor: torch\_tensor.py

class Tensor(torch._C._TensorBase):

C和Python的衔接:torch\csrc\autograd\python_variable.cpp

PyModule_AddObject(module, "_TensorBase",   (PyObject *)&THPVariableType);

THPVariableType的成员函数挂载:(文件同上)

  THPUtils_addPyMethodDefs(methods, torch::autograd::variable_methods);
  ...
  THPVariableType.tp_methods = methods.data();

其成员函数位置:tools\autograd\templates\python_variable_methods.cpp

// XXX: ops that are bound here are not exposed to the C++ api nor the JIT.
// Any new ops added here should be accompanied with a comment why they are not
// being registered through native_functions.yaml, and be tagged cpp / JIT
PyMethodDef variable_methods[] = {
  // These magic methods are all implemented on python object to wrap NotImplementedError
  {"__add__", castPyCFunctionWithKeywords(TypeError_to_NotImplemented_<THPVariable_add>), METH_VARARGS | METH_KEYWORDS, NULL},
  {"__radd__", castPyCFunctionWithKeywords(TypeError_to_NotImplemented_<THPVariable_add>), METH_VARARGS | METH_KEYWORDS, NULL},
  {"__iadd__", castPyCFunctionWithKeywords(TypeError_to_NotImplemented_<THPVariable_add_>), METH_VARARGS | METH_KEYWORDS, NULL},
  {"__rmul__", castPyCFunctionWithKeywords(TypeError_to_NotImplemented_<THPVariable_mul>), METH_VARARGS | METH_KEYWORDS, NULL},
  {"__mul__", castPyCFunctionWithKeywords(TypeError_to_NotImplemented_<THPVariable_mul>), METH_VARARGS | METH_KEYWORDS, NULL},
  {"__imul__", castPyCFunctionWithKeywords(TypeError_to_NotImplemented_<THPVariable_mul_>), METH_VARARGS | METH_KEYWORDS, NULL},
  {"__sub__", castPyCFunctionWithKeywords(TypeError_to_NotImplemented_<THPVariable_sub>), METH_VARARGS | METH_KEYWORDS, NULL},
  {"__isub__", castPyCFunctionWithKeywords(TypeError_to_NotImplemented_<THPVariable_sub_>), METH_VARARGS | METH_KEYWORDS, NULL},
  {"__div__", castPyCFunctionWithKeywords(TypeError_to_NotImplemented_<THPVariable_div>), METH_VARARGS | METH_KEYWORDS, NULL},
  ...

THPVariableType定义: torch\csrc\autograd\python_variable.cpp

PyTypeObject THPVariableType = {
    PyVarObject_HEAD_INIT(
        &THPVariableMetaType,
        0) "torch._C._TensorBase", /* tp_name */
    sizeof(THPVariable), /* tp_basicsize */
    ...
    THPVariable_properties, /* tp_getset */
    ...
    THPVariable_pynew, /* tp_new */
};

THPVariable_properties定义:torch\csrc\autograd\python_variable.cpp

static struct PyGetSetDef THPVariable_properties[] = {
  ...
  {"_cdata", (getter)THPVariable_get_cdata, nullptr, nullptr, nullptr},
  {"_version", (getter)THPVariable_get_version, nullptr, nullptr, nullptr},
  {"grad_fn", (getter)THPVariable_get_grad_fn, nullptr, nullptr, nullptr},
  {"_grad_fn", (getter)THPVariable_get_grad_fn, (setter)THPVariable_set_grad_fn, nullptr, nullptr},
  {"is_leaf", (getter)THPVariable_is_leaf, nullptr, nullptr, nullptr},
  {"retains_grad", (getter)THPVariable_retains_grad, nullptr, nullptr, nullptr},
  {"data", (getter)THPVariable_get_data, (setter)THPVariable_set_data, nullptr, nullptr},
  {"_grad", (getter)THPVariable_get_grad, (setter)THPVariable_set_grad, nullptr, nullptr}, // Allows the python class to override .grad
  {"grad", (getter)THPVariable_get_grad, (setter)THPVariable_set_grad, nullptr, nullptr},
  ...
  {nullptr}
};

以下2处,可以看出THPVariableType实际对应里面的THPVariable类型

1. 从"data"对应的THPVariable_get_data来看,Python调用时实际传过来的是THPVariable类型的对象:

static PyObject * THPVariable_get_data(THPVariable *self, void *unused)
{...}

2. 从THPVariable_pynew里面调用的THPVariable_NewWithVar看,new的实际返回是THPVariable的指针:

static PyObject* THPVariable_NewWithVar(
    PyTypeObject* type,
    Variable _var,
    c10::impl::PyInterpreterStatus status) {
  PyObject* obj = type->tp_alloc(type, 0);
  if (obj) {
    auto v = (THPVariable*) obj;
    // TODO: named constructor to avoid default initialization
    new (&v->cdata) MaybeOwned<Variable>();
    v->cdata = MaybeOwned<Variable>::owned(std::move(_var));
    ...
  }
  return obj;
}

THPVariable的定义,at::Tensor类型的cdata是核心:torch\csrc\autograd\python_variable.h

// Python object that backs torch.autograd.Variable
// NOLINTNEXTLINE(cppcoreguidelines-pro-type-member-init)
// Python object that backs torch.autograd.Variable.
// This is the C-level layout of every Python `torch.Tensor` instance:
// a standard CPython object header followed by the wrapped at::Tensor.
// NOLINTNEXTLINE(cppcoreguidelines-pro-type-member-init)
struct THPVariable {
  // Standard CPython object header (refcount + type pointer); required
  // so this struct can be used wherever a PyObject* is expected.
  PyObject_HEAD;
  // Payload: the underlying C++ tensor. MaybeOwned lets the Python object
  // either own the at::Tensor or borrow one owned elsewhere.
  c10::MaybeOwned<at::Tensor> cdata;
  // Hooks to be run on backwards pass (corresponds to Python attr
  // '_backwards_hooks', set by 'register_hook')
  PyObject* backward_hooks = nullptr;
};

at::Tensor的定义:aten\src\ATen\templates\TensorBody.h

namespace at {
...

// Tensor is a "generic" object holding a pointer to the underlying TensorImpl object, which
// has an embedded reference count. In this way, Tensor is similar to boost::intrusive_ptr.
// ...
class TORCH_API Tensor {
  ...
  TensorImpl * unsafeGetTensorImpl() const {
    return impl_.get();
  }
  TensorImpl * unsafeReleaseTensorImpl() {
    return impl_.release();
  }
  ...
  void* data_ptr() const {
    return this->unsafeGetTensorImpl()->data();
  }
  ...
  c10::intrusive_ptr<TensorImpl, UndefinedTensorImpl> impl_;
};

TensorImpl实现(省略了所有成员函数和部分成员变量):c10\core\TensorImpl.h

// TensorImpl: the reference-counted implementation object behind at::Tensor.
// (This excerpt omits all member functions and some data members, per the
// surrounding text.) Fixed: removed a stray line-continuation backslash
// after the opening brace that was a copy/paste artifact.
struct C10_API TensorImpl : public c10::intrusive_ptr_target {
  // Underlying storage; holds (indirectly) the raw data pointer.
  Storage storage_;

  // Autograd metadata; presumably null when grad tracking is off — TODO confirm.
  std::unique_ptr<c10::AutogradMetaInterface> autograd_meta_ = nullptr;
  // Version counter, used by autograd to detect in-place mutation.
  c10::VariableVersion version_counter_;
  // Sizes and strides of the tensor view, stored together.
  c10::impl::SizesAndStrides sizes_and_strides_;
  // Offset (in elements) into storage_ where this tensor's data begins.
  int64_t storage_offset_ = 0;
  // Number of elements in the tensor.
  int64_t numel_ = 1;
  // Element data type (dtype).
  caffe2::TypeMeta data_type_;
  // Device the data lives on, if known.
  c10::optional<c10::Device> device_opt_;
  // Dispatch key set used to route operator calls to the right backend.
  DispatchKeySet key_set_;
};

Storage的实现:c10\core\Storage.h

struct C10_API Storage {
  ...
  c10::intrusive_ptr<StorageImpl> storage_impl_;
};

StorageImpl的实现: c10\core\StorageImpl.h

struct C10_API StorageImpl final : public c10::intrusive_ptr_target {
  ...
private:
  DataPtr data_ptr_;
  size_t size_bytes_;
  bool resizable_;
  // Identifies that Storage was received from another process and doesn't have
  // local to process cuda memory allocation
  bool received_cuda_;
  Allocator* allocator_;
};

DataPtr的实现:c10\core\Allocator.h

class C10_API DataPtr {
 private:
  c10::detail::UniqueVoidPtr ptr_;
  Device device_;
  ...
};

UniqueVoidPtr的实现:c10\util\UniqueVoidPtr.h

class UniqueVoidPtr {
 private:
  // Lifetime tied to ctx_
  void* data_;
  std::unique_ptr<void, DeleterFnPtr> ctx_;
  ...
};

torch::autograd::Variable等价于at::Tensor  :  torch\csrc\autograd\variable.h

namespace torch { namespace autograd {

/// `Variable` is exactly the same as `Tensor` (i.e. we have `using Variable = at::Tensor`).
/// This means you can perform all the usual mathematical and other
/// operations you can perform on `Tensor`s also on `Variable`s.
///
/// The only reason we are keeping the `Variable` class is backward compatibility
/// with external user's legacy C++ frontend code. Our intention is to eliminate
/// the `Variable` class in the near future.
using Variable = at::Tensor;

Reference:

byjang - 知乎 (zhihu.com)

通过跟踪torch.rand,看里面Tensor创建的过程;
