Python predict function: Keras model.predict gives an input shape error

I have implemented the Universal Sentence Encoder in TensorFlow and now I am trying to predict class probabilities for a sentence. I am also converting the string to an array.

Code:

import numpy as np

if model.model_type == "universal_classifier_basic":
    class_probs = model.predict(np.array(['this is a random sentence'], dtype=object))

Error Message:

InvalidArgumentError (see above for traceback): input must be a vector, got shape: []

[[Node: lambda_1/module_apply_default/tokenize/StringSplit = StringSplit[skip_empty=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](lambda_1/module_apply_default/RegexReplace_1, lambda_1/module_apply_default/tokenize/Const)]]

Any leads, suggestions or explanations are welcome and highly appreciated.

Thank You :)

Solution

It is not as easy as you would like. Usually a model expects a vector of integers as input, where each integer represents the index of the corresponding word in a vocabulary. For example:

vocab = {"hello":0, "world":1}

If you want to give the sentence "hello world" as input to the network, you should build the vector as follows:

net_input = [vocab.get(word) for word in "hello world".split(" ")]
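
For the toy vocabulary above this produces [0, 1]. Be aware that vocab.get(word) returns None for a word that is not in the vocabulary; a minimal sketch using a hypothetical UNK index reserved for out-of-vocabulary words avoids that:

vocab = {"hello": 0, "world": 1}
UNK = len(vocab)  # hypothetical index reserved for out-of-vocabulary words

net_input = [vocab.get(word, UNK) for word in "hello there world".split(" ")]
print(net_input)  # [0, 2, 1]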

Note also that, if you trained the network with mini-batches, you will need to add an extra first dimension to the vector you feed to the network. You can easily do this with NumPy:

import numpy as np

net_input = np.expand_dims(net_input, 0)

This way your net_input has the shape [1, 2] and you can feed it into the network.
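
Putting the two steps together, a minimal end-to-end sketch (assuming the toy vocabulary above) looks like this:

import numpy as np

vocab = {"hello": 0, "world": 1}

# Map each word to its vocabulary index
net_input = [vocab.get(word) for word in "hello world".split(" ")]

# Add the batch dimension expected by a model trained on mini-batches
net_input = np.expand_dims(net_input, 0)

print(net_input.shape)  # (1, 2)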

There is still one problem that could stop you from feeding such a vector to the network. At training time you probably defined a placeholder for the input with a fixed length (say 30 or 40 tokens). At test time you need to match that size, padding your sentence if it doesn't fill the whole length, or cutting it if it is longer.

You can truncate or pad as follows:

net_input = [old_in[:max_len] + [vocab.get("PAD")] * (max_len - len(old_in[:max_len])) for old_in in net_input]

This line truncates the input if necessary (old_in[:max_len]) to the maximum possible length (note that Python does nothing here if the length is already less than max_len) and fills the remaining max_len - len(old_in[:max_len]) slots with padding tokens (+ [vocab.get("PAD")]).
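
Here is a small, self-contained demonstration of that truncate-and-pad line, assuming the vocabulary contains a "PAD" entry and net_input is a batch of index lists:

vocab = {"PAD": 0, "hello": 1, "world": 2}
max_len = 5

# A batch of two sentences already converted to index lists:
# one shorter than max_len, one longer
net_input = [[1, 2], [2, 1, 2, 1, 2, 1, 2]]

net_input = [old_in[:max_len] + [vocab.get("PAD")] * (max_len - len(old_in[:max_len]))
             for old_in in net_input]

print(net_input)  # [[1, 2, 0, 0, 0], [2, 1, 2, 1, 2]]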

Hope this helps.

If this is not the situation you are in, just leave a comment on the answer and I'll try to figure out other solutions.
