TensorFlow (7): NumPy Illustrated

light169

Categories: Neural Networks, Deep Learning, Python. Tags: NumPy

Article link: https://blog.csdn.net/light169/article/details/124127567

1、Creating Arrays
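
A minimal sketch of the usual ways to create a one-dimensional array. The variable names and the fixed random seed are illustrative assumptions, not part of the original post.

```python
import numpy as np

# Build an array from a Python list.
data = np.array([1, 2, 3])

# Arrays pre-filled with ones or zeros (float64 by default).
ones = np.ones(3)
zeros = np.zeros(3)

# Uniform random values in [0, 1); the seed is fixed only for repeatability.
rng = np.random.default_rng(0)
random_values = rng.random(3)

print(data, ones, zeros, random_values)
```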

2、Array Arithmetic

Multiplying an array by a scalar, as in data * 1.6, applies the operation to every element (broadcasting).
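
As a small illustration, the sketch below shows element-wise arithmetic between two arrays of the same shape, followed by scalar multiplication. The array values are made up, and reading 1.6 as a rough miles-to-kilometers factor is an assumption.

```python
import numpy as np

data = np.array([1, 2])
ones = np.ones(2)

# Element-wise operations between arrays of the same shape.
total = data + ones        # [2., 3.]
difference = data - ones   # [0., 1.]
product = data * data      # [1, 4]
quotient = data / data     # [1., 1.]

# Scalar broadcasting: every element is multiplied by 1.6
# (assumed here to be an approximate miles-to-kilometers factor).
kilometers = data * 1.6

print(total, difference, product, quotient, kilometers)
```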

3、Indexing
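
Indexing and slicing work much like Python lists. A short sketch with made-up values:

```python
import numpy as np

data = np.array([1, 2, 3])

first = data[0]        # single element: 1
second = data[1]       # single element: 2
head = data[0:2]       # slice, end index exclusive: [1, 2]
tail = data[1:]        # from index 1 to the end: [2, 3]
last_two = data[-2:]   # negative indices count from the end: [2, 3]

print(first, second, head, tail, last_two)
```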

4、Aggregation

Besides min, max, and sum, NumPy offers plenty of other aggregations, such as mean, prod, and std.
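
A brief sketch of these aggregation methods on a small, made-up array:

```python
import numpy as np

data = np.array([1, 2, 3])

largest = data.max()     # 3
smallest = data.min()    # 1
total = data.sum()       # 6
average = data.mean()    # 2.0
product = data.prod()    # 6
spread = data.std()      # population standard deviation

print(largest, smallest, total, average, product, spread)
```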

5、Creating Matrices

np.array([[1,2],[3,4]])
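
The same creation functions accept a shape tuple to build two-dimensional arrays. A minimal sketch, where the (3, 2) shape is an arbitrary choice for illustration:

```python
import numpy as np

# A 2x2 matrix from a nested list.
matrix = np.array([[1, 2], [3, 4]])

# Pass a (rows, columns) tuple to the creation helpers.
ones = np.ones((3, 2))
zeros = np.zeros((3, 2))
random_matrix = np.random.default_rng(0).random((3, 2))

print(matrix.shape, ones.shape, zeros.shape, random_matrix.shape)
```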

6、Matrix Arithmetic
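
Arithmetic operators act element-wise on matrices of the same shape, and NumPy broadcasts when one operand has a dimension of size 1. A sketch with made-up values:

```python
import numpy as np

data = np.array([[1, 2], [3, 4]])
ones = np.ones((2, 2))

# Same shape: element-wise addition.
summed = data + ones

# Different shapes: the (1, 2) row is broadcast across all three rows.
tall = np.array([[1, 2], [3, 4], [5, 6]])
ones_row = np.ones((1, 2))
broadcast_sum = tall + ones_row

print(summed)
print(broadcast_sum)
```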

7、Dot Product

matrix multiplication
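
Unlike the element-wise `*`, the dot product requires the inner dimensions to match: a shape (3,) vector times a (3, 2) matrix yields a (2,) result. A sketch with illustrative values:

```python
import numpy as np

data = np.array([1, 2, 3])                   # shape (3,)
powers_of_ten = np.array([[1, 10],
                          [100, 1000],
                          [10000, 100000]])  # shape (3, 2)

# Each output entry is the sum of products down one column.
result = data.dot(powers_of_ten)

# The @ operator is equivalent for this case.
same_result = data @ powers_of_ten

print(result)
```

Here the first entry is 1*1 + 2*100 + 3*10000 and the second is 1*10 + 2*1000 + 3*100000.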

8、Matrix Indexing
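
With matrices, indices and slices are given per dimension, separated by a comma. A minimal sketch on a made-up 3x2 matrix:

```python
import numpy as np

data = np.array([[1, 2], [3, 4], [5, 6]])

element = data[0, 1]       # row 0, column 1: 2
rows = data[1:3]           # rows 1 and 2: [[3, 4], [5, 6]]
column_part = data[0:2, 0] # first column of rows 0 and 1: [1, 3]

print(element, rows, column_part)
```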

9、Matrix Aggregation
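
Matrices can be aggregated over all values or along one axis via the `axis` parameter. A sketch with made-up values:

```python
import numpy as np

data = np.array([[1, 2], [5, 3], [4, 6]])

overall_max = data.max()        # over every element: 6
column_max = data.max(axis=0)   # collapse rows, one value per column: [5, 6]
row_max = data.max(axis=1)      # collapse columns, one value per row: [2, 5, 6]

print(overall_max, column_max, row_max)
```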

10、Transposing and Reshaping
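
A short sketch of both operations on made-up data: `.T` swaps the axes, while `reshape` rearranges the same elements into a new shape, and `-1` asks NumPy to infer that dimension.

```python
import numpy as np

data = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)

transposed = data.T                          # shape (2, 3)

flat = np.arange(1, 7)                       # [1, 2, 3, 4, 5, 6]
two_by_three = flat.reshape(2, 3)            # [[1, 2, 3], [4, 5, 6]]
three_by_two = flat.reshape(3, -1)           # -1 is inferred as 2

print(transposed)
print(two_by_three)
print(three_by_two)
```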

11、Yet More Dimensions

np.ones((4,3,2))

array([[[1., 1.],
        [1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.],
        [1., 1.]]])

12、Applying Formulas

NumPy representation:
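
The formula this section refers to is not preserved in the text. As a stand-in, the sketch below assumes mean squared error, a formula commonly used to show how mathematical notation maps onto NumPy calls. The choice of formula, the variable names, and the values are all assumptions.

```python
import numpy as np

# Hypothetical model outputs and ground-truth values.
predictions = np.array([1.0, 1.0, 1.0])
labels = np.array([1.0, 2.0, 3.0])

n = len(predictions)

# MSE = (1/n) * sum((prediction_i - label_i)^2)
error = (1 / n) * np.sum(np.square(predictions - labels))

print(error)
```

Each step of the formula (subtract, square, sum, scale) becomes one vectorized operation, so no explicit loop is needed.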

13、Data Representation

13.1 Tables and Spreadsheets

A spreadsheet or a table of values is a two-dimensional matrix. Each sheet in a spreadsheet can be its own variable. The most popular abstraction for these in Python is the pandas DataFrame, which is built on top of NumPy.

13.2 Audio and Timeseries

An audio file is a one-dimensional array of samples. Each sample is a number representing a tiny chunk of the audio signal. CD-quality audio has 44,100 samples per second, and each sample is a 16-bit integer between -32,768 and 32,767. So a ten-second, single-channel WAVE file of CD quality can be loaded into a NumPy array of length 10 * 44,100 = 441,000 samples. Want to extract the first second of audio? Simply load the file into a NumPy array that we’ll call audio, and get audio[:44100].

Here’s a look at a slice of an audio file:

The same goes for time-series data (for example, the price of a stock over time).
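
The audio slicing described above can be sketched as follows. To keep the example self-contained, it generates a synthetic 440 Hz tone instead of loading a real WAVE file; the tone, its amplitude, and the constant names are assumptions.

```python
import numpy as np

SAMPLE_RATE = 44_100   # CD-quality samples per second
DURATION_S = 10        # length of the clip in seconds

# Time stamp of every sample, then a sine tone scaled into the int16 range.
t = np.arange(SAMPLE_RATE * DURATION_S) / SAMPLE_RATE
audio = (0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)

# The first second is simply the first SAMPLE_RATE samples.
first_second = audio[:SAMPLE_RATE]

print(audio.shape, first_second.shape)
```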

13.3 Images

An image is a matrix of pixels of size (height x width).

If the image is black and white (a.k.a. grayscale), each pixel can be represented by a single number, commonly between 0 (black) and 255 (white). Want to crop the top-left 10 x 10 pixel part of the image? Just tell NumPy to get you image[:10,:10].

Here’s a look at a slice of an image file:

If the image is in color, each pixel is represented by three numbers: one value each for red, green, and blue. In that case we need a third dimension (because each cell can contain only one number). So a color image is represented by an ndarray of dimensions (height x width x 3).
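
A minimal sketch of both image layouts, using blank synthetic images rather than a real file. The 64 x 48 size and the pixel values are arbitrary assumptions.

```python
import numpy as np

height, width = 64, 48   # arbitrary size for illustration

# Grayscale: one uint8 value per pixel, 0 (black) to 255 (white).
image = np.zeros((height, width), dtype=np.uint8)
image[:10, :10] = 255            # paint the top-left corner white
crop = image[:10, :10]           # crop that same 10 x 10 region

# Color: a third axis of length 3 holds the red, green, and blue values.
color_image = np.zeros((height, width, 3), dtype=np.uint8)
color_image[:, :, 0] = 255       # fill the red channel
red_channel = color_image[:, :, 0]

print(crop.shape, color_image.shape, red_channel.shape)
```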

13.4 Language

If we’re dealing with text, the story is a little different. The numeric representation of text requires a step of building a vocabulary (an inventory of all the unique words the model knows) and an embedding step. Let us see the steps of numerically representing this (translated) quote by an ancient spirit:

“Have the bards who preceded me left any theme unsung?”

A model needs to look at a large amount of text before it can numerically represent the anxious words of this warrior poet. We can proceed to have it process a small dataset and use it to build a vocabulary (of 71,290 words):

The sentence can then be broken into an array of tokens (words or parts of words based on common rules):

We then replace each word by its id in the vocabulary table:

These ids still don’t provide much information to a model. So before feeding a sequence of words to a model, the tokens/words need to be replaced with their embeddings (50-dimensional word2vec embeddings in this case):

You can see that this NumPy array has the dimensions [embedding_dimension x sequence_length]. In practice these would be the other way around, but I’m presenting it this way for visual consistency. For performance reasons, deep learning models tend to reserve the first dimension for batch size (because the model can be trained faster if multiple examples are processed in parallel). This is a clear case where reshape() becomes super useful. A model like BERT, for example, would expect its inputs in the shape [batch_size, sequence_length, embedding_size].
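
The pipeline above (tokens, ids, embeddings, batch dimension) can be sketched end to end. This is a toy version under stated assumptions: the tokenizer is a plain whitespace split, the vocabulary is built from the sentence itself rather than a 71,290-word dataset, and the embedding table holds random numbers standing in for trained word2vec vectors.

```python
import numpy as np

sentence = "have the bards who preceded me left any theme unsung ?"
tokens = sentence.split()                       # naive whitespace tokenizer

# Toy vocabulary: one id per unique token in this sentence only.
vocab = {word: idx for idx, word in enumerate(sorted(set(tokens)))}
token_ids = np.array([vocab[token] for token in tokens])

# Random stand-in for a trained embedding table (one row per vocabulary id).
embedding_size = 50
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(len(vocab), embedding_size))

# Integer-array indexing looks up one row per token.
embedded = embedding_table[token_ids]           # (sequence_length, embedding_size)

# The layout used in the illustration: embedding dimension first.
as_drawn = embedded.T                           # (embedding_size, sequence_length)

# Add a leading batch dimension of size 1 with reshape.
batch = embedded.reshape(1, *embedded.shape)    # (1, sequence_length, embedding_size)

print(embedded.shape, as_drawn.shape, batch.shape)
```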
