1. The difference between sparse_categorical_crossentropy and categorical_crossentropy
This addresses errors like:
"logits and labels must have the same first dimension, got logits shape [32,28] and labels shape [3360]"
The error appears when the network's final output shape does not match the shape of the labels. In my case, the features and labels matrices have the same 2 dimensions (number of sentences, length of max sentence), i.e. both are (1257, 111).
See: categorical_crossentropy和sparse_categorical_crossentropy - 熊猫blue - 博客园
See: tensorflow - ValueError: Shapes (None, 1) and (None, 2) are incompatible - Stack Overflow
Both compute multi-class cross-entropy; they differ only in the format required for y (the labels):
1) With categorical_crossentropy, y must be one-hot encoded.
2) With sparse_categorical_crossentropy, y stays in its raw integer form, e.g. [1, 0, 2, 0, 2].
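A minimal sketch of the difference (the 3-class probabilities below are made up for illustration):

import tensorflow as tf

# Made-up softmax outputs for a 3-class problem.
y_pred = [[0.1, 0.8, 0.1],
          [0.7, 0.2, 0.1],
          [0.2, 0.2, 0.6]]

# sparse_categorical_crossentropy takes raw integer labels...
print(tf.keras.losses.sparse_categorical_crossentropy([1, 0, 2], y_pred))
# ...while categorical_crossentropy takes the same labels one-hot encoded;
# the per-sample losses are identical.
print(tf.keras.losses.categorical_crossentropy(
    [[0., 1., 0.], [1., 0., 0.], [0., 0., 1.]], y_pred))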
2. For a CNN, look into using padding="same" for your convolutional layer, which is sort of equivalent to using return_sequences=True in an RNN: both keep the time dimension in the output.
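A quick sketch with toy sizes (the shapes here are assumptions for illustration, not from the original error):

import tensorflow as tf

x = tf.random.normal((2, 10, 8))   # (batch, time, features), toy sizes

# padding="same" keeps the time dimension at 10...
print(tf.keras.layers.Conv1D(16, kernel_size=3, padding="same")(x).shape)   # (2, 10, 16)
# ...while the default padding="valid" shortens it to 10 - 3 + 1 = 8.
print(tf.keras.layers.Conv1D(16, kernel_size=3, padding="valid")(x).shape)  # (2, 8, 16)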
3. Apply tensorflow.keras.utils.to_categorical to convert y (the labels) to one-hot encoding.
See: python - ValueError: Shapes (None, 1) and (None, 3) are incompatible - Stack Overflow
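A sketch of that fix, assuming integer labels and a 3-class softmax output (the arrays are made up):

import numpy as np
from tensorflow.keras.utils import to_categorical

y = np.array([0, 2, 1, 2])                    # integer labels, shape (4,)
y_onehot = to_categorical(y, num_classes=3)   # one-hot labels, shape (4, 3)
print(y_onehot)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]
#  [0. 0. 1.]]
# y_onehot now matches a (None, 3) softmax output, so
# categorical_crossentropy no longer raises the shape error.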
4. For incompatible shapes, the main things to consider are: adding or removing a Flatten, GlobalMaxPooling1D, or MaxPooling layer; whether the CNN uses padding="same"; and whether every RNN layer sets return_sequences=True, or the last RNN layer omits return_sequences=True.
For example, the last RNN layer without return_sequences=True, keeping Flatten:
import tensorflow as tf

# Assumes a spaCy pipeline `nlp`, a max sequence length `max_tokens`,
# and a list of class names `labels` defined earlier.
model = tf.keras.models.Sequential([
    tf.keras.layers.Embedding(
        input_dim=len(nlp.vocab.vectors) + 1,
        input_length=max_tokens,
        output_dim=100),                                   # (None, max_tokens, 100)
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(
        units=64, activation="relu",
        return_sequences=True)),                           # (None, max_tokens, 128)
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(
        units=64, activation="relu")),                     # (None, 128): time axis dropped
    tf.keras.layers.Flatten(),                             # (None, 128): no-op on 2-D input
    tf.keras.layers.Dense(units=1024, activation='relu'),  # (None, 1024)
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Flatten(),                             # still (None, 1024): redundant
    tf.keras.layers.Dense(units=len(labels), activation='softmax'),  # (None, len(labels))
])
Its model.summary() shows the per-layer output shapes annotated in the comments above; the final output is (None, len(labels)).
For example, every RNN layer with return_sequences=True, keeping Flatten:
model2 = tf.keras.models.Sequential([
    tf.keras.layers.Embedding(
        input_dim=len(nlp.vocab.vectors) + 1,
        input_length=max_tokens,
        output_dim=100),                                   # (None, max_tokens, 100)
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(
        units=64, activation="relu",
        return_sequences=True)),                           # (None, max_tokens, 128)
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(
        units=64, activation="relu",
        return_sequences=True)),                           # (None, max_tokens, 128): time axis kept
    tf.keras.layers.Flatten(),                             # (None, max_tokens * 128)
    tf.keras.layers.Dense(units=1024, activation='relu'),  # (None, 1024)
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Flatten(),                             # still (None, 1024): redundant
    tf.keras.layers.Dense(units=len(labels), activation='softmax'),  # (None, len(labels))
])
Its model.summary() differs only at the Flatten step: (None, max_tokens, 128) is collapsed into (None, max_tokens * 128), and the final output is still (None, len(labels)).
For example, every RNN layer with return_sequences=True and Flatten removed:
model = tf.keras.models.Sequential([
    tf.keras.layers.Embedding(
        input_dim=len(nlp.vocab.vectors) + 1,
        input_length=max_tokens,
        output_dim=100),                                   # (None, max_tokens, 100)
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(
        units=64, activation="relu",
        return_sequences=True)),                           # (None, max_tokens, 128)
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(
        units=64, activation="relu",
        return_sequences=True)),                           # (None, max_tokens, 128)
    # tf.keras.layers.Flatten(),                           # removed: time axis survives
    tf.keras.layers.Dense(units=1024, activation='relu'),  # (None, max_tokens, 1024): Dense acts per timestep
    tf.keras.layers.Dropout(0.2),
    # tf.keras.layers.Flatten(),                           # removed
    tf.keras.layers.Dense(units=len(labels), activation='softmax'),  # (None, max_tokens, len(labels))
])
Its model.summary() shows the time dimension surviving all the way to the output: the final shape is (None, max_tokens, len(labels)), i.e. one prediction per token rather than one per sentence.
Summary:
return_sequences=True keeps the time dimension unchanged; omitting it on the last RNN layer drops that middle dimension.
Flatten multiplies the last two dimensions into one, i.e. it removes one dimension.
A Flatten or GlobalMaxPooling1D would squash your 1-vector-per-input into a single output, rather than leaving 1-output-per-input. MaxPooling1D, in contrast, only changes the size of the middle (time) dimension.
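These dimension rules can be checked directly; a minimal sketch with made-up sizes (batch=2, time=5, features=8):

import tensorflow as tf

x = tf.random.normal((2, 5, 8))   # (batch, time, features), toy sizes

print(tf.keras.layers.LSTM(4, return_sequences=True)(x).shape)  # (2, 5, 4): time axis kept
print(tf.keras.layers.LSTM(4)(x).shape)                         # (2, 4): time axis dropped
print(tf.keras.layers.Flatten()(x).shape)                       # (2, 40): last two axes multiplied
print(tf.keras.layers.GlobalMaxPooling1D()(x).shape)            # (2, 8): time axis pooled away
print(tf.keras.layers.MaxPooling1D(pool_size=2)(x).shape)       # (2, 2, 8): time axis shrunk, not removed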
5. Classifying not just one token at a time, but a series of tokens, requires changing the embedding layer's input_length.
For classifying just one token at a time: input_length=1;
for classifying a series of tokens: input_length=max_tokens:
tf.keras.layers.Embedding(
    input_dim=len(nlp.vocab.vectors) + 1,
    input_length=max_tokens,
    output_dim=100),
Classifying not just one token at a time, but a series of tokens, using a convolutional, recurrent, or transformer network. If you pursue this direction, you will likely want to replace the code

for token in tokens:
    ...
    train_features.append([token_index])

with code that groups tokens by sentence, e.g.:

for sent in nlp(text).sents:
    token_indexes = []
    for token in sent:
        ...
        token_indexes.append(token_index)
    train_features.append(token_indexes)

The shape of your matrices should now be (number of sentences, maximum number of tokens in a sentence), and you will need to take care to pad your matrices accordingly.
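A minimal sketch of that padding step, assuming train_features is the ragged list of per-sentence index lists built above (pad_sequences is the stock tf.keras utility; the padding/truncating choices here are assumptions):

from tensorflow.keras.preprocessing.sequence import pad_sequences

# Pad (or truncate) every sentence's index list to max_tokens so the
# feature matrix has shape (number of sentences, max_tokens).
train_features = pad_sequences(train_features, maxlen=max_tokens,
                               padding='post', truncating='post', value=0)
print(train_features.shape)   # (number of sentences, max_tokens)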
References:
The paper discusses how we can identify name, location, and time keywords in a sentence.
Fixed Positional Encodings: positional encoding for the Attention model.
Splitting an array into evenly sized chunks (helpful for creating training batches): python - How do you split a list into evenly sized chunks? - Stack Overflow
The annotation guidelines are here: Overleaf, Online LaTeX Editor
python - How to use F-score as error function to train neural networks? - Stack Overflow