Text Classification of IMDB Movie Reviews with TensorFlow 2

1. Import the required libraries

import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_datasets as tfds

for i in [np, tf, hub, tfds]:
    print(i.__name__,": ",i.__version__,sep="")

print("Eager mode: ", tf.executing_eagerly())

Output:

numpy: 1.17.4
tensorflow: 2.2.0
tensorflow_hub: 0.8.0
tensorflow_datasets: 3.1.0
Eager mode: True

2. Download and load the data

# Split the 25,000 training reviews 60/40 into training and validation sets and keep
# the full test split; as_supervised=True yields (text, label) pairs.
train_data, validation_data, test_data = tfds.load(name="imdb_reviews", 
                                                   split=('train[:60%]', 'train[60%:]', 'test'),
                                                   as_supervised=True)
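
If you want to confirm how many examples end up in each split, tfds.load can also return the dataset metadata. A minimal sketch, assuming the dataset has already been downloaded to the default tfds cache:

(train_data, validation_data, test_data), info = tfds.load(
    name="imdb_reviews",
    split=('train[:60%]', 'train[60%:]', 'test'),
    as_supervised=True,
    with_info=True)                           # also returns a tfds.core.DatasetInfo object

print(info.splits['train'].num_examples)      # 25000 reviews in the full train split
print(info.splits['test'].num_examples)       # 25000 reviews in the test split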

3. A first look at the data

# Take one batch of 10 (text, label) pairs for inspection
train_examples_batch, train_labels_batch = next(iter(train_data.batch(10)))
train_examples_batch

Output:

<tf.Tensor: shape=(10,), dtype=string, numpy=
array([b'This is a big step down after the surprisingly enjoyable original. This sequel isn\'t nearly as fun as part one, and it instead spends too much time on plot development. Tim Thomerson is still the best thing about this series, but his wisecracking is toned down in this entry. The performances are all adequate, but this time the script lets us down. The action is merely routine and the plot is only mildly interesting, so I need lots of silly laughs in order to stay entertained during a "Trancers" movie. Unfortunately, the laughs are few and far between, and so, this film is watchable at best.',
       b"Perhaps because I was so young, innocent and BRAINWASHED when I saw it, this movie was the cause of many sleepless nights for me. I haven't seen it since I was in seventh grade at a Presbyterian school, so I am not sure what effect it would have on me now. However, I will say that it left an impression on me... and most of my friends. It did serve its purpose, at least until we were old enough and knowledgeable enough to analyze and create our own opinions. I was particularly terrified of what the newly-converted post-rapture Christians had to endure when not receiving the mark of the beast. I don't want to spoil the movie for those who haven't seen it so I will not mention details of the scenes, but I can still picture them in my head... and it's been 19 years.",
       b'Hood of the Living Dead had a lot to live up to even before the opening credits began. First, any play on "...of the living dead" invokes His Holiness Mr. Romero and instantly sets up a high standard to which many movies cannot afford to aspire. And second, my movie-watching companion professed doubt that any urban horror film would surpass the seminal Leprechaun In the Hood. Skeptical, we settled in to watch. <br /><br />We were rewarded with a surprisingly sincere and good-hearted zombie film. Oh, certainly the budget is low, and of course the directors\' amateurs friends populate the cast, but Hood of the Living Dead loves zombie cinema. Cheap? Yeah. But when it\'s this cheap, you can clearly see where LOVE holds it together. <br /><br />Ricky works in a lab during the day and as a surrogate parent to his younger brother at night. He dreams of moving out of Oakland. Before this planned escape, however, his brother is shot to death in a drive-by. Ricky\'s keen scientific mind presents an option superior to CPR or 911: injections of his lab\'s experimental regenerative formula. Sadly, little bro wakes up in an ambulance as a bloodthirsty Oakland zombie! Chaos and mayhem! I think it\'s more economical to eat your enemies than take vengeance in a drive-by, but then again, I\'m a poor judge of the complexities of urban life. (How poor a judge? In response to a gory scene involving four men, I opined "Ah-ha! White t-shirts on everyone so the blood shows up. Economical! I used the same technique in my own low-budget horror film." Jordan replied, "No, that\'s gang dress. White t-shirts were banned from New Orleans bars for a time as a result." Oh.)<br /><br />A lot of the movie is set in someone\'s living room, so there\'s a great deal of hanging out and waiting for the zombies. But the characters are sympathetic and the movie is sincere-- it surpasses its budget in spirit. <br /><br />Zombie explanation: When man plays God, zombies arise! Or, perhaps: Follow FDA-approved testing rules before human experimentation! <br /><br />Contribution to the zombie canon: This is the first zombie movie I\'ve seen with a drive-by shooting. As far as the actual zombies go, infection is spread with a bite as usual, but quite unusually head shots don\'t work-- it\'s heart shots that kill. Zombies have pulses, the absence of which proves true death. And these zombies make pretty cool jaguar-growl noises. <br /><br />Gratuitous zombie movie in-joke: A mercenary named Romero. Groan. <br /><br />Favorite zombie: Jaguar-noise little brother zombie, of course!',
       b"For me this is a story that starts with some funny jokes regarding Franks fanatasies when he is travelling with a staircase and when he is sitting in business meetings... The problem is that when you have been watching this movie for an hour you will see the same fantasies/funny situations again and again and again. It is to predictable. It is more done as a TV story where you can go away and come back without missing anything.<br /><br />I like Felix Herngren as Frank but that is not enough even when it is a comedy it has to have more variations and some kind of message to it's audience....<br /><br />",
       b'This is not a bad movie. It follows the new conventions of modern horror, that is the movie within a movie, the well known actress running for her life in the first scene. This movie takes the old convention of a psycho killer on he loose, and manage to do something new, and interesting with it. It is also always nice to see Molly Ringwald back for the attack.<br /><br />So this might be an example of what the genre has become. Cut hits all the marks, and is actually scary in some parts. I liked it I gave it an eight.',
       b"I just finished a marathon of this series, and it became agonising to watch as it progressed. From the fictionalising of the historical elements, to O'Herlihy's awful accent in later episodes, the show just slumps the further it goes. If you are looking for some low quality production generalised WW2 fluff, then I could recommend season 1, but avoid anything after that, it degenerates into being one step from a soap opera, with increasingly worse story lines and sensibility.<br /><br />The old B&W film is by far the best of any form of entertainment with the Colditz name attached to it, and even that is not what one could hope for.",
       b'I am very sorry that this charming and whimsical film (which I first saw soon after it was first released in the early fifties) has had such a poor reception more recently. In my opinion it has been greatly underrated - but perhaps it appeals more to the European sense of humour than to (for example) the American: maybe we in Europe can understand and appreciate its subtleties and situations more, since we are closer to some of them in real life! Particular mention should be made of the limited but good music - especially the catchy and memorable song "It\'s a fine, fine night", which was issued separately on an HMV 78rpm record (10 inch plum label, I think!) in the fifties. I would urge anyone interested to give it a try if you get the chance: you may have a pleasant surprise.',
       b"Well i am going to go against the grain on this film so it seems. Being a self confessed horror fan I sat down to this not quite knowing what to expect. After 2 or 3 mins i actually found myself scared (quite rare). The film obviously has a small budget and is set around charing cross station but the films lack of money does not distract from the story. Yes the story is a bit far fetched and doesn't explain itself very well but THE CREEP is a class act and proceeds to slash and dismember anything that comes its way. MESSAGE FOR LADIES !!! THERE ARE CERTAIN PARTS OF THE FILM YOU SHOULD CLOSE YOUR EYES AT OR AT LEAST CROSS YOUR LEGS !! you will understand when you see it.<br /><br />All in all a good film and it makes a change to see a good slasher movie that actually scares",
       b'Even 15 years after the end of the Vietnam war "Jacknife" came not too late or was even superfluous. It\'s one of the few that try to deal with the second sad side of the war: The time after. Different from movies like "Taxi driver" or "Rambo" which use to present their main characters as broken heroes in a bad after war environment this movie allows the audience to face a different view on the Vietnam vets. Their development is shown very precisely before and especially after the war. The problems are obvious but in all this tragic there is always the feeling of some hope on the basis of love and friendship. "Jacknife" might be the quietest Vietnam movie ever but after almost 15 years this is really plausible and therefor justified. Moreover, it can make us believe that the war has not finished, yet; at least for some of us.<br /><br />The three main characters are amazing. De Niro has done one of his best jobs but Ed Harris is the star of this movie. Possibly,this was his best performance ever.',
       b'Before I explain the "Alias" comment let me say that "The Desert Trail" is bad even by the standards of westerns staring The Three Stooges. In fact it features Carmen Laroux as semi- bad girl Juanita, when you hear her Mexican accent you will immediately recognize her as Senorita Rita from the classic Stooge short "Saved by the Belle". <br /><br />In "The Desert Trail" John Wayne gets to play the Moe Howard character and Eddy Chandler gets to play Curly Howard. Like their Stooge counterparts a running gag throughout the 53- minute movie is Moe hitting Curly. Wayne\'s character, a skirt chasing bully, is not very endearing, but is supposed to be the good guy. <br /><br />Playing a traveling rodeo cowboy Wayne holds up the rodeo box office at gunpoint and takes the prize money he would have won if the attendance proceeds had been good-the other riders have to settle for 25 cents on the dollar (actually even less after Wayne robs the box office). No explanation is given for Wayne\'s ripping off the riders and still being considered the hero who gets the girl. <br /><br />Things get complicated at this point because the villain (Al Ferguson) and his sidekick Larry Fine (played by Paul Fix-who would go on to play Sheriff Micah on television\'s "The Rifleman") see Wayne rob the box office and then steal the remainder of the money and kill the rodeo manager. Moe and Curly get blamed. <br /><br />So Moe and Curly move to another town to get away from the law and they change their names to Smith and Jones. Who do they meet first but their old friend Larry, whose sister becomes the 2nd half love interest (Senorita Rita is left behind it the old town and makes no further appearances in the movie). <br /><br />Larry\'s sister is nicely played by a radiantly beautiful Mary Kornman (now grown up but in her younger days she was one of the original cast members of Hal Roach\'s "Our Gang" shorts). Kornman is the main reason to watch the mega-lame western and her scenes with Moe and Curly are much better than any others in the production, as if they used an entirely different crew to film them. <br /><br />Even for 1935 the action sequences in this thing are extremely weak and the technical film- making is staggeringly bad. The two main chase scenes end with stock footage wide shots of a rider falling from a horse. Both times the editor cuts to a shot of one of the characters rolling on the ground, but there is no horse in the frame, the film stock is completely different, and the character has on different clothes than the stunt rider. There is liberal use of stock footage in other places, none of it even remotely convincing. <br /><br />One thing to watch for is a scene midway into the movie where Moe and Curly get on their horses and ride away (to screen right) from a cabin as the posse is galloping toward the cabin from the left. The cameraman follows the two stooges with a slow pan right and then does a whip pan to the left to reveal the approaching posse. Outside of home movies I have never seen anything like this, not because it is looks stupid (which it does) but because a competent director would never stage a scene in this manner. They would film the two riders leaving and then reposition the camera and film the posse approaching as a separate action. Or if they were feeling creative they would stage the sequence so the camera shows the riders in the foreground and the posse approaching in the background. <br /><br />Then again, what do I know? I\'m only a child.'],
      dtype=object)>
train_labels_batch

Output:

<tf.Tensor: shape=(10,), dtype=int64, numpy=array([0, 0, 1, 0, 1, 0, 1, 1, 1, 0], dtype=int64)>
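
The labels are integers, where 0 marks a negative review and 1 a positive one. If the dataset was loaded with with_info=True as sketched in section 2, this mapping can also be read from the metadata:

print(info.features['label'].names)   # ['neg', 'pos']: index 0 = negative, index 1 = positive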

4. Model construction

4.1 Data preprocessing

# Pre-trained 20-dimensional sentence embedding from TF Hub (Swivel embeddings trained on Google News);
# trainable=True lets the embedding weights be fine-tuned together with the classifier.
embedding = "https://hub.tensorflow.google.cn/google/tf2-preview/gnews-swivel-20dim/1"
hub_layer = hub.KerasLayer(embedding, input_shape=[], 
                           dtype=tf.string, trainable=True)
hub_layer(train_examples_batch[:3])

Output:

<tf.Tensor: shape=(3, 20), dtype=float32, numpy=
array([[ 2.209591  , -2.7093675 ,  3.6802928 , -1.0291991 , -4.1671185 ,
        -2.4566064 , -2.2519937 , -0.36589956,  1.9485804 , -3.1104462 ,
        -2.4610963 ,  1.3139242 , -0.9161584 , -0.16625322, -3.723651  ,
         1.8498232 ,  3.499562  , -1.2373022 , -2.8403084 , -1.213074  ],
       [ 1.9055302 , -4.11395   ,  3.6038654 ,  0.28555924, -4.658998  ,
        -5.5433393 , -3.2735848 ,  1.9235417 ,  3.8461034 ,  1.5882455 ,
        -2.64167   ,  0.76057523, -0.14820506,  0.9115291 , -6.45758   ,
         2.3990374 ,  5.0985413 , -3.2776263 , -3.2652326 , -1.2345369 ],
       [ 3.6510668 , -4.7066135 ,  4.71003   , -1.7002777 , -3.7708545 ,
        -3.709126  , -4.222776  ,  1.946586  ,  6.1182513 , -2.7392752 ,
        -5.4384456 ,  2.7078724 , -2.1263676 , -0.7084146 , -5.893995  ,
         3.1602864 ,  3.8389287 , -3.318196  , -5.1542974 , -2.4051712 ]],
      dtype=float32)>
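
Note that the output has shape (3, 20): no matter how long each review is, the layer maps it to a single 20-dimensional vector, so the downstream Dense layers always see a fixed-size input. A quick sketch to illustrate (the two sentences below are made up for demonstration):

short_and_long = tf.constant(["great movie",
                              "a much longer, rambling review that still ends up as one 20-dimensional vector"])
print(hub_layer(short_and_long).shape)   # (2, 20)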

4.2 Building the model

model = tf.keras.Sequential()
model.add(hub_layer)                                      # string -> 20-dim embedding
model.add(tf.keras.layers.Dense(16, activation='relu'))   # hidden layer
model.add(tf.keras.layers.Dense(1))                       # single output logit (no activation)

model.summary()

Output:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
keras_layer (KerasLayer)     (None, 20)                400020    
_________________________________________________________________
dense (Dense)                (None, 16)                336       
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 17        
=================================================================
Total params: 400,373
Trainable params: 400,373
Non-trainable params: 0
_________________________________________________________________
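
The parameter counts line up with the layer shapes: 400,020 corresponds to a 20-dimensional vector per token in the Hub module's vocabulary (assumed here to be 20,000 words plus one out-of-vocabulary bucket), and the Dense layers follow the usual inputs x units + biases rule:

print(20001 * 20)     # 400020 -- embedding table (assumed 20,000-word vocab + 1 OOV bucket)
print(20 * 16 + 16)   # 336 -- first Dense layer: 20 inputs x 16 units + 16 biases
print(16 * 1 + 1)     # 17 -- output layer: 16 inputs x 1 unit + 1 bias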

4.3 Compiling the model

model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])
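
Because the final Dense layer has no activation, the model outputs raw logits, and from_logits=True tells the loss to apply the sigmoid internally. The two formulations are equivalent, as this small sketch with made-up values shows:

loss_from_logits = tf.keras.losses.BinaryCrossentropy(from_logits=True)
loss_from_probs = tf.keras.losses.BinaryCrossentropy(from_logits=False)

print(loss_from_logits([1.0], [2.0]).numpy())             # label 1, logit 2.0 -> ~0.127
print(loss_from_probs([1.0], tf.sigmoid([2.0])).numpy())  # same value once the sigmoid is applied manually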

5. Training the model

# Shuffle and batch the training data (batches of 512) and train for 20 epochs,
# validating on the held-out 40% of the original training split.
history = model.fit(train_data.shuffle(10000).batch(512),
                    epochs=20,
                    validation_data=validation_data.batch(512),
                    verbose=1)

Output:

Epoch 1/20
30/30 [==============================] - 2s 64ms/step - loss: 0.8365 - accuracy: 0.5043 - val_loss: 0.7301 - val_accuracy: 0.5593
Epoch 2/20
30/30 [==============================] - 2s 59ms/step - loss: 0.6765 - accuracy: 0.6031 - val_loss: 0.6485 - val_accuracy: 0.6110
Epoch 3/20
30/30 [==============================] - 2s 57ms/step - loss: 0.6198 - accuracy: 0.6465 - val_loss: 0.6081 - val_accuracy: 0.6460
Epoch 4/20
30/30 [==============================] - 2s 60ms/step - loss: 0.5822 - accuracy: 0.6726 - val_loss: 0.5741 - val_accuracy: 0.6794
Epoch 5/20
30/30 [==============================] - 2s 58ms/step - loss: 0.5459 - accuracy: 0.7017 - val_loss: 0.5430 - val_accuracy: 0.7165
Epoch 6/20
30/30 [==============================] - 2s 58ms/step - loss: 0.5128 - accuracy: 0.7315 - val_loss: 0.5112 - val_accuracy: 0.7408
Epoch 7/20
30/30 [==============================] - 2s 58ms/step - loss: 0.4777 - accuracy: 0.7586 - val_loss: 0.4804 - val_accuracy: 0.7572
Epoch 8/20
30/30 [==============================] - 2s 59ms/step - loss: 0.4422 - accuracy: 0.7845 - val_loss: 0.4526 - val_accuracy: 0.7810
Epoch 9/20
30/30 [==============================] - 2s 58ms/step - loss: 0.4093 - accuracy: 0.8073 - val_loss: 0.4258 - val_accuracy: 0.7890
Epoch 10/20
30/30 [==============================] - 2s 58ms/step - loss: 0.3781 - accuracy: 0.8274 - val_loss: 0.4016 - val_accuracy: 0.8097
Epoch 11/20
30/30 [==============================] - 2s 60ms/step - loss: 0.3496 - accuracy: 0.8452 - val_loss: 0.3811 - val_accuracy: 0.8191
Epoch 12/20
30/30 [==============================] - 2s 61ms/step - loss: 0.3235 - accuracy: 0.8588 - val_loss: 0.3636 - val_accuracy: 0.8353
Epoch 13/20
30/30 [==============================] - 2s 60ms/step - loss: 0.2997 - accuracy: 0.8722 - val_loss: 0.3507 - val_accuracy: 0.8456
Epoch 14/20
30/30 [==============================] - 2s 62ms/step - loss: 0.2790 - accuracy: 0.8849 - val_loss: 0.3366 - val_accuracy: 0.8502
Epoch 15/20
30/30 [==============================] - 2s 59ms/step - loss: 0.2596 - accuracy: 0.8943 - val_loss: 0.3283 - val_accuracy: 0.8498
Epoch 16/20
30/30 [==============================] - 2s 59ms/step - loss: 0.2420 - accuracy: 0.9023 - val_loss: 0.3199 - val_accuracy: 0.8557
Epoch 17/20
30/30 [==============================] - 2s 58ms/step - loss: 0.2261 - accuracy: 0.9091 - val_loss: 0.3126 - val_accuracy: 0.8611
Epoch 18/20
30/30 [==============================] - 2s 59ms/step - loss: 0.2116 - accuracy: 0.9171 - val_loss: 0.3077 - val_accuracy: 0.8656
Epoch 19/20
30/30 [==============================] - 2s 59ms/step - loss: 0.1979 - accuracy: 0.9233 - val_loss: 0.3043 - val_accuracy: 0.8660
Epoch 20/20
30/30 [==============================] - 2s 61ms/step - loss: 0.1859 - accuracy: 0.9292 - val_loss: 0.3022 - val_accuracy: 0.8703
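
model.fit returns a History object whose history dictionary records the loss and accuracy for every epoch, which is handy for spotting overfitting. A minimal plotting sketch, assuming matplotlib is installed:

import matplotlib.pyplot as plt

plt.plot(history.history['accuracy'], label='training accuracy')
plt.plot(history.history['val_accuracy'], label='validation accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()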

6. Evaluating the model

results = model.evaluate(test_data.batch(512), verbose=2)

for name, value in zip(model.metrics_names, results):
    print("%s: %.3f" % (name, value))

Output:

49/49 - 1s - loss: 0.3218 - accuracy: 0.8552
loss: 0.322
accuracy: 0.855
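
Since the hub layer accepts raw strings, the trained model can score new review text directly. A minimal sketch (the two reviews below are invented for illustration); because the model outputs logits, tf.sigmoid converts them into probabilities of the review being positive:

sample_reviews = tf.constant([
    "A wonderful film, I enjoyed every minute of it.",
    "Dull, predictable, and far too long."
])
logits = model.predict(sample_reviews)
probs = tf.sigmoid(logits)   # probability that each review is positive (label 1)
print(probs.numpy())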

7. Appendix: an alternative approach with an LSTM

The same task can also be tackled without TensorFlow Hub by training a recurrent neural network (RNN) on the word-index version of the IMDB dataset that ships with Keras:

import tensorflow as tf
from tensorflow.keras.datasets import imdb
from tensorflow.keras.layers import Embedding, LSTM, Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing import sequence

# Hyperparameters
max_features = 20000   # keep only the 20,000 most frequent words
max_len = 80           # truncate/pad every review to 80 words
embedding_size = 128   # dimensionality of the word embeddings

# Load the IMDB movie review dataset
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)

# Preprocess the data: pad or truncate every review to the same length
x_train = sequence.pad_sequences(x_train, maxlen=max_len)
x_test = sequence.pad_sequences(x_test, maxlen=max_len)

# Build the model
model = Sequential()
model.add(Embedding(max_features, embedding_size, input_length=max_len))
model.add(LSTM(units=128))
model.add(Dense(units=1, activation='sigmoid'))

# Compile the model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test))

This code first imports the required libraries and classes and sets a few hyperparameters: the number of most frequent words to keep, the maximum number of words per review, and the embedding dimension. imdb.load_data() loads the IMDB movie review dataset bundled with Keras and splits it into training and test sets. The data is then preprocessed with sequence.pad_sequences(), which zero-pads (or truncates) every review to a fixed length of max_len. Next, a simple RNN is built from an embedding layer, an LSTM layer, and a fully connected layer, and compiled with model.compile(), specifying the loss function, optimizer, and metric. Finally, model.fit() trains the model and validates it on the test set; in this example the model is trained for 5 epochs with a batch size of 32.