Tensorflow (7) 图解 NumPy

1、Creating Arrays

2、Array Arithmetic

 

data * 1.6:

3、Indexing

 

4、Aggregation

 

min, max, and sum, plenty of others.

5、Creating Matrices

np.array([[1,2],[3,4]])

 

6、Matrix Arithmetic

7、Dot Product

matrix multiplication

 

 8、Matrix Indexing

9、Matrix Aggregation

 

10、Transposing and Reshaping

 

 

 

 

11、Yet More Dimensions

np.ones((4,3,2))

array([[[1., 1.],
        [1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.],
        [1., 1.]]])

 

 

12、Formulas应用

 

NumPy表示:

13、Data Representation 

13.1 Tables and Spreadsheets

  • A spreadsheet or a table of values is a two dimensional matrix. Each sheet in a spreadsheet can be its own variable. The most popular abstraction in python for those is the pandas dataframe, which actually uses NumPy and builds on top of it.

13.2 Audio and Timeseries

  • An audio file is a one-dimensional array of samples. Each sample is a number representing a tiny chunk of the audio signal. CD-quality audio may have 44,100 samples per second and each sample is an integer between -32767 and 32768. Meaning if you have a ten-seconds WAVE file of CD-quality, you can load it in a NumPy array with length 10 * 44,100 = 441,000 samples. Want to extract the first second of audio? simply load the file into a NumPy array that we’ll call audio, and get audio[:44100].

Here’s a look at a slice of an audio file:

 

The same goes for time-series data (for example, the price of a stock over time).

13.3 Images

  • An image is a matrix of pixels of size (height x width).

    • If the image is black and white (a.k.a. grayscale), each pixel can be represented by a single number (commonly between 0 (black) and 255 (white)). Want to crop the top left 10 x 10 pixel part of the image? Just tell NumPy to get you image[:10,:10].

Here’s a look at a slice of an image file:

  • If the image is colored, then each pixel is represented by three numbers - a value for each of red, green, and blue. In that case we need a 3rd dimension (because each cell can only contain one number). So a colored image is represented by an ndarray of dimensions: (height x width x 3).

 13.4 Language

  • If we’re dealing with text, the story is a little different. The numeric representation of text requires a step of building a vocabulary (an inventory of all the unique words the model knows) and an embedding step. Let us see the steps of numerically representing this (translated) quote by an ancient spirit:

    “Have the bards who preceded me left any theme unsung?”

    A model needs to look at a large amount of text before it can numerically represent the anxious words of this warrior poet. We can proceed to have it process a small dataset and use it to build a vocabulary (of 71,290 words):

The sentence can then be broken into an array of tokens (words or parts of words based on common rules):

We then replace each word by its id in the vocabulary table:

These ids still don’t provide much information value to a model. So before feeding a sequence of words to a model, the tokens/words need to be replaced with their embeddings (50 dimension word2vec embedding in this case):

You can see that this NumPy array has the dimensions [embedding_dimension x sequence_length]. In practice these would be the other way around, but I’m presenting it this way for visual consistency. For performance reasons, deep learning models tend to preserve the first dimension for batch size (because the model can be trained faster if multiple examples are trained in parallel). This is a clear case where reshape() becomes super useful. A model like BERT, for example, would expect its inputs in the shape: [batch_size, sequence_length, embedding_size].

 

 

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值