[NVIDIA Jetson TX2, a Deep Learning Workhorse] Getting Started with the Jetson TX2: Specifications and Performance

Gerwels_JI

0 Specifications at a Glance

Reference: https://elinux.org/Jetson

1 Jetson TX2 Overview

The Jetson TX2 is a new iteration of the Jetson Development Kit which doubles the computing power and power efficiency of the earlier Jetson TX1.

The Jetson TX1 Dev Kit introduced a new module format, where a standardized Tegra Module is plugged into a carrier board. While the Jetson TX2 uses the same carrier board as the Jetson TX1, the Tegra TX2 Module itself is all new.

2 Hardware

One GPU plus two CPU clusters:
The Jetson TX2 features an NVIDIA Pascal GPU with 256 CUDA-capable cores. The CPU complex consists of two ARMv8 64-bit CPU clusters connected by a high-performance coherent interconnect fabric. The dual-core Denver 2 CPU cluster is optimized for higher single-thread performance; the second CPU cluster is a quad-core ARM Cortex-A57, which is better suited to multi-threaded applications.
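To see which logical CPUs belong to which cluster, one option is to group the entries in /proc/cpuinfo by their implementer and part IDs. The sketch below assumes the ID values used by the Linux kernel for these cores (0x4e/0x003 for Denver 2, 0x41/0xd07 for Cortex-A57). The sample text is hand-written for illustration, not captured from a board, so the core numbering in it is an assumption.

```python
from collections import defaultdict

# (CPU implementer, CPU part) pairs as printed in /proc/cpuinfo on AArch64 Linux.
DENVER2 = ("0x4e", "0x003")
CORTEX_A57 = ("0x41", "0xd07")
KNOWN_CORES = {DENVER2: "NVIDIA Denver 2", CORTEX_A57: "ARM Cortex-A57"}


def group_cores(cpuinfo_text):
    """Return {core name: [logical CPU numbers]} parsed from /proc/cpuinfo text."""
    clusters = defaultdict(list)
    cpu = implementer = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip().lower()
        if key == "processor":
            cpu = int(value)
        elif key == "CPU implementer":
            implementer = value
        elif key == "CPU part" and cpu is not None:
            name = KNOWN_CORES.get((implementer, value), "unknown")
            clusters[name].append(cpu)
    return dict(clusters)


def _entry(cpu, ids):
    # Only the three fields the parser needs; a real file has more.
    return f"processor\t: {cpu}\nCPU implementer\t: {ids[0]}\nCPU part\t: {ids[1]}\n"


# Illustrative layout only: which CPU number maps to which cluster is assumed.
SAMPLE_LAYOUT = [CORTEX_A57, DENVER2, DENVER2, CORTEX_A57, CORTEX_A57, CORTEX_A57]
SAMPLE = "\n".join(_entry(i, ids) for i, ids in enumerate(SAMPLE_LAYOUT))

if __name__ == "__main__":
    for name, cpus in group_cores(SAMPLE).items():
        print(name, cpus)
```

On a real board you would pass the contents of /proc/cpuinfo instead of SAMPLE. Cores that are currently offline do not appear in that file, so the result reflects only the cores that are online.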
128-bit memory controller, 8 GB of RAM, and 32 GB of eMMC storage:
The memory subsystem incorporates a 128-bit memory controller, which provides high-bandwidth LPDDR4 support (32 Gbps is the LPDDR4 figure quoted in section 8.4.1). 8 GB of LPDDR4 main memory and 32 GB of eMMC flash memory are integrated on the Module. Moving from the TX1's 64-bit design to a 128-bit one is a major performance enhancement.
Hardware audio and video encoding/decoding:
The Module also has hardware video encoders and decoders that handle 4K ultra-high-definition video at 60 fps in several different formats. This differs from the hybrid approach of the Jetson TX1 module, which used both dedicated hardware and software running on the Tegra SoC for those tasks. Also included is an Audio Processing Engine with full hardware support for multi-channel audio.
Wi-Fi and Bluetooth:
The Jetson TX2 supports Wi-Fi and Bluetooth wireless connectivity. Wi-Fi is much improved over the earlier Jetson TX1. Gigabit Ethernet BASE-T is included. Here’s a comparison between the TX1 and the TX2.

2.1 Jetson TX2 vs Jetson TX1

The carrier board, which is common between both the Jetson TX2 and the Jetson TX1, has the following I/O connectors:

USB 3.0 Type A
USB 2.0 Micro AB (supports recovery and host mode)
HDMI
M.2 Key E
PCI-E x4
Gigabit Ethernet
Full size SD card reader
SATA data+power
Display expansion header
Camera expansion header

There are two expansion headers: a 40-pin header with 2.54 mm pitch and signals laid out similarly to the Raspberry Pi, and a 30-pin header with 2.54 mm pitch for extra GPIO.

The Jetson also includes a 5 MP camera on the camera expansion header, and a display expansion header for adding extra display panels.

The Jetson TX2 has added a CAN bus controller to the module. CAN is a network format that is frequently used in automobiles and other vehicles. The CAN bus signals are available directly on the GPIO Expansion Header.
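On Linux, CAN controllers are usually exposed through SocketCAN, where each classic CAN frame is a fixed 16-byte structure. As a rough sketch of what goes over that interface, the helpers below pack and unpack such a frame with Python's standard struct module. The CAN ID and payload are made-up values, and the interface name in the closing comment (can0) is an assumption about how the board has been configured.

```python
import struct

# Classic SocketCAN frame: 32-bit CAN ID, 8-bit payload length,
# 3 padding bytes, then 8 data bytes (16 bytes in total).
CAN_FRAME_FMT = "=IB3x8s"


def build_can_frame(can_id, data):
    """Pack a CAN ID and up to 8 payload bytes into a SocketCAN frame."""
    if len(data) > 8:
        raise ValueError("classic CAN carries at most 8 data bytes")
    return struct.pack(CAN_FRAME_FMT, can_id, len(data), data.ljust(8, b"\x00"))


def parse_can_frame(frame):
    """Unpack a SocketCAN frame into (can_id, payload)."""
    can_id, length, data = struct.unpack(CAN_FRAME_FMT, frame)
    return can_id, data[:length]


# Made-up example values.
frame = build_can_frame(0x123, b"\x11\x22\x33")

# To send it on a configured interface (name assumed to be "can0"):
#   import socket
#   s = socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW)
#   s.bind(("can0",))
#   s.send(frame)
```

Keeping the packing separate from the socket code means the frame layout can be checked on any machine, without CAN hardware attached.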

3 Sippy or Speedy

Jetson TX2 Dual Operating Modes: Max-Q & Max-P
This new generation lets you trade performance against power consumption, depending on your power requirements. NVIDIA has engineered two modes:

Max-Q is the name of the energy efficiency mode which clocks the Parker SoC for efficiency over performance and draws about 7.5W, right before the bend in the power/performance curve. The result of this mode is that the TX2 has similar performance to a TX1 in max performance mode, while drawing about half the power!
In Max-P mode, the TX2 just flat out goes for it in the power budget of 15W. This provides about twice the performance of the Jetson TX1 at its maximum clock rate.
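The two modes can be compared on performance per watt using only the figures quoted above. This is back-of-the-envelope arithmetic, not a benchmark: performance is normalized so that the TX1 at maximum clocks equals 1.0, and the TX1's 15 W figure is inferred from the "about half the power" statement rather than measured.

```python
# Relative performance is normalized to TX1 at max clocks = 1.0.
# The TX1 wattage is inferred from "about half the power"; it is an assumption.
MODES = {
    "TX1 max performance": {"perf": 1.0, "watts": 15.0},
    "TX2 Max-Q": {"perf": 1.0, "watts": 7.5},
    "TX2 Max-P": {"perf": 2.0, "watts": 15.0},
}


def perf_per_watt(mode):
    """Relative performance divided by power draw for the named mode."""
    entry = MODES[mode]
    return entry["perf"] / entry["watts"]


def efficiency_vs_tx1(mode):
    """Efficiency of a mode as a multiple of the TX1 baseline."""
    return perf_per_watt(mode) / perf_per_watt("TX1 max performance")


if __name__ == "__main__":
    for name in MODES:
        print(f"{name}: {efficiency_vs_tx1(name):.1f}x TX1 efficiency")
```

Under these assumptions both TX2 modes come out at roughly twice the TX1's efficiency. Max-Q spends the gain on lower power, and Max-P spends it on higher performance.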

4 Software

Ubuntu 16.04 comes preinstalled, and JetPack 3.0 is provided:
There are several changes to the Jetson TX2 software stack. The Jetson TX2 runs a Developer Preview of an Ubuntu 16.04 variant named L4T 27.1. The Linux kernel is 4.4, newer than the 3.10 kernel on the earlier Jetson TX1. There have been changes to the boot flow, with additional firmware managers added to the mix. The Jetson TX2 comes with a long list of software libraries, and a good selection of samples with source code.
The new JetPack 3.0 installer is available to flash and copy system software to the Jetson TX2.
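To check which L4T release a board is running, a common approach is to read the first line of /etc/nv_tegra_release. The sketch below assumes that line contains text of the form "R27 (release), REVISION: 1.0". The sample string is abbreviated and hand-written for illustration, not copied from a device.

```python
import re
from pathlib import Path

# Assumed shape of the release line: "# R<major> (release), REVISION: <minor>, ..."
RELEASE_RE = re.compile(r"R(\d+) \(release\), REVISION: ([\d.]+)")

# Abbreviated, hand-written sample; a real file carries more fields.
SAMPLE_LINE = "# R27 (release), REVISION: 1.0, EABI: aarch64"


def parse_l4t_release(first_line):
    """Return a version string such as '27.1.0', or None if the line does not match."""
    match = RELEASE_RE.search(first_line)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


def read_l4t_release(path="/etc/nv_tegra_release"):
    """Read the release file if it exists; return None on machines without it."""
    release_file = Path(path)
    if not release_file.exists():
        return None
    lines = release_file.read_text().splitlines()
    return parse_l4t_release(lines[0]) if lines else None


if __name__ == "__main__":
    print(parse_l4t_release(SAMPLE_LINE))
    print(read_l4t_release())
```

Returning None rather than raising keeps the helper usable in scripts that also run on a development PC, where the file does not exist.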

5 Initial Impressions

NVIDIA claims that the Jetson TX2 is twice as fast as the Jetson TX1. After booting the machine, this certainly seems to be the case. The entire experience feels very much like a desktop/laptop level machine. Doubling the memory (and the memory bus width) surely helps with that feeling. Previous Jetsons experienced quite a bit of memory pressure when running memory-intensive desktop applications like web browsers. The TX2 doesn’t even notice.

Running a handful of compiles and tests on applications like Caffe proved that the Jetson TX2 is indeed quite a bit faster than the earlier Jetson TX1 (see the video for one of the tests).

One of the fun samples that comes with the Jetson TX2 is an object recognition example which is demonstrated in the video. The deep learning sample uses Caffe along with ImageNet and uses the onboard camera to grab imagery.

Note that we haven’t performed any performance tuning for the demos; this is how it runs fresh out of the box!

If you want some hardcore numbers, go over to Phoronix and check out NVIDIA Jetson TX2 Linux Benchmarks (recommended reading 6).

6 Conclusion

Stay tuned as we begin working with the TX2 to better understand how to take advantage of the extra performance. Find out more on the NVIDIA Developers site.

7 Pictures, Natch!

Overall view
The heart of the TX2
Camera module

8 Appendix

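8.1 Tegra (图睿)

Tegra is a system-on-a-chip (SoC) that integrates ARM-architecture processors with an NVIDIA GeForce GPU, along with other built-in functions, and it mainly targets small devices. Compared with Intel's x86 architecture, which started from the PC, the ARM-based Tegra looks more like an evolution that started from mobile phone processors.
Note: it cannot run x86 PC operating systems such as Windows XP, but the lightweight ARM operating systems that have been used on phones for years are a better fit for its high-speed, low-power requirements.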

8.2 NVIDIA Pascal GPU

8.3 CPU Clusters

Denver 2 (dual-core) CPU cluster
ARM Cortex-A57 (quad-core) CPU cluster

8.4 LPDDR4

Low Power Double Data Rate 4

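8.4.1 Introduction

LPDDR is arguably the most widely used "working memory" for mobile devices worldwide. The new 20 nm 8 Gb LPDDR4 memory doubles both the performance and the density of 20 nm-class 4 Gb LPDDR3 memory. [1]
LPDDR4 can provide 32 Gbps of bandwidth, twice that of DDR3 RAM. At the time of writing, the Galaxy S5, Note 4, and Nexus 6 all used the DDR3 standard. Faster RAM means applications launch faster, which matters a great deal when starting heavyweight applications while multitasking.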

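8.4.2 Performance

Because its I/O data rate reaches up to 3200 Mbps, twice that of commonly used DDR3 DRAM, the newly introduced 8 Gb LPDDR4 memory can support ultra-high-definition video recording and playback, as well as continuous shooting of 20-megapixel high-resolution photos.
Compared with LPDDR3 chips, LPDDR4 lowers the operating voltage to 1.1 V, making it the lowest-power memory solution for large-screen smartphones, tablets, and high-performance network systems. Take a 2 GB package as an example: compared with a 2 GB package built from 4 Gb LPDDR3 chips, a 2 GB package built from 8 Gb LPDDR4 chips can save up to 40% in power, thanks to the lower operating voltage and faster processing. In addition, the new product's I/O signaling uses Samsung's proprietary Low Voltage Swing Terminated Logic (LVSTL), which further reduces the LPDDR4 chip's power consumption and lets it run at high frequency at low voltage, optimizing power efficiency.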
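To relate the per-pin rate quoted above to the bus widths mentioned in section 2, the theoretical peak bandwidth is simply the bus width multiplied by the per-pin data rate. The sketch below assumes both bus widths run at the full 3200 Mbps per pin, which is an illustrative assumption. The actual memory clocks on the TX1 and TX2 modules may differ, so treat the results as upper-bound arithmetic rather than specifications.

```python
def peak_bandwidth_gbytes(bus_width_bits, per_pin_mbps):
    """Theoretical peak in GB/s when every data pin moves per_pin_mbps megabits per second."""
    total_mbps = bus_width_bits * per_pin_mbps  # megabits per second across the whole bus
    return total_mbps / 8 / 1000                # bits -> bytes, then MB -> GB


if __name__ == "__main__":
    # 3200 Mbps per pin is the LPDDR4 maximum quoted in this section (assumed for both widths).
    for label, width in (("64-bit bus", 64), ("128-bit bus", 128)):
        print(f"{label}: {peak_bandwidth_gbytes(width, 3200):.1f} GB/s")
```

At the same per-pin rate, doubling the bus width from 64 to 128 bits doubles the theoretical peak, which is why the wider memory controller matters.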

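8.4.3 Differences between DDR3 and DDR4

This topic has been covered to death, so I won't go into it here. I found a CSDN post, listed as recommended reading 3 at the end of this article; if you're unfamiliar with the topic, take a look. In short, the two differ in frequency and in the motherboards they are compatible with, so DDR4 and DDR3 memory are not interchangeable. There are some 100-series motherboards on the market that support both DDR4 and DDR3, but these niche boards see little demand and I wouldn't recommend them. If you already have a 100- or 200-series motherboard, just use DDR4.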

8.5 eMMC & SSD

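eMMC and SSD are NAND applications that evolved to meet different needs.
eMMC
Tablets and phones need to be light and thin for mobility, and above all need very low power consumption; eMMC was created to meet that need. The eMMC interface is therefore defined in terms of I/O pins, which keeps the interface simple and the power low. eMMC also matters a great deal for the Apple iPad, Android tablets, and phones: because these devices are small, eMMC puts the controller and the NAND dies together in a single package, which also means an eMMC cannot hold many NAND dies, so capacity is relatively low. In summary, eMMC is characterized by low power, small capacity, and poor random read/write performance.
SSD
SSDs mainly target high-capacity storage, especially in data centers and similar settings. They have become a performance catalyst for PCs, with fast read/write performance, especially for random access. To reach that performance, SSD controllers use high-speed buses: initially SATA, now increasingly PCIe, and perhaps optical links in the future. The NAND dies are spread across multiple channels to increase capacity and read/write performance. As a result, SSDs also draw a lot of power. In summary, SSDs are characterized by high power, large capacity, and fast reads and writes.
Recommended reading 4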