Automatically Generating HTML Code with Deep Learning

weixin_39720865

Source: FloydHub

Author: Emil Wallner

Compiled by 机器之心 (Synced)

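How to generate code from a front-end page mockup is a question we have followed for a long time. Drawing on papers such as pix2code, the author built a powerful front-end code generation model and explains in detail how to use LSTMs and a CNN to turn design mockups into HTML and CSS websites.

Project link: github.com/emilwallner…

Within the next three years, deep learning will change front-end development. It will speed up prototyping and lower the barrier to building software.

Last year Tony Beltramelli published the paper "pix2code: Generating Code from a Graphical User Interface Screenshot", and Airbnb released Sketch2code (airbnb.design/sketching-i…).

At present, the biggest obstacle to automating front-end development is computing power. Even so, we can already use today's deep learning algorithms together with synthetic training data to explore how AI can build front ends automatically. In this article, the author teaches a neural network to write an HTML and CSS website from a single image and a design template. Here is a brief overview of the process: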

1) Feed a design image to the trained neural network

2) The neural network converts the image into HTML markup

3) Render the output

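We will build three models in three steps, from easy to hard. First, we build the simplest version to get a handle on the moving parts. The second version, HTML, focuses on automating all the steps and briefly explains the neural network layers. In the final version, Bootstrap, we create a model that can generalize, and explore the LSTM layer.

Code: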

github.com/emilwallner…
www.floydhub.com/emilwallner…

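All FloydHub notebooks are in the floydhub directory; the local notebooks are in the local directory.

The models in this article are based on Beltramelli's paper "pix2code: Generating Code from a Graphical User Interface Screenshot" and Jason Brownlee's image captioning tutorials, and are implemented in Python and Keras.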

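Core logic

Our goal is to build a neural network that generates the HTML/CSS markup corresponding to a screenshot.

To train the network, you give it several screenshots together with their matching HTML. It learns by predicting the matching HTML markup tags one at a time. When predicting the next tag, the network receives the screenshot along with all the correct tags up to that point.

Here is a simple example of the training data: docs.google.com/spreadsheet….

Building a model that predicts word by word is currently the most common approach, and it is the one used in this tutorial.

Note: the network receives the same screenshot for every prediction. If it has to predict 20 words, it gets the same design screenshot 20 times. For now, don't worry about how the neural network works; focus only on its inputs and outputs.

Let's look first at the preceding markup. Suppose we train the network to predict the sentence "I can code". When it receives "I", it predicts "can". The next time, it receives "I can" and predicts "code". It receives all the previous words but only predicts the next one.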
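The next-word setup described above can be sketched in plain Python. This is only an illustration, not the author's code: the helper name `make_training_pairs` and the literal `start`/`end` token strings are assumptions made for the example.

```python
def make_training_pairs(tokens):
    """Return (context, next_token) pairs for next-token prediction.

    Each context holds every token seen so far; the target is the single
    token that follows it.
    """
    pairs = []
    for i in range(1, len(tokens)):
        pairs.append((tokens[:i], tokens[i]))
    return pairs


# Hypothetical token sequence: the sentence wrapped in start/end markers.
sequence = ["start", "I", "can", "code", "end"]
pairs = make_training_pairs(sequence)
for context, target in pairs:
    print(context, "->", target)
```

During training, every one of these pairs would be accompanied by the same screenshot.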

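The neural network creates features from the data, building them to link the input data to the output data. It has to form representations to understand what is in each screenshot and the HTML syntax it needs to predict; all of this builds the knowledge needed to predict the next tag. Applying the trained model in the real world is similar to the training process.

We no longer supply the correct HTML markup. Instead, the network receives the markup it has generated so far and predicts the next tag. Prediction begins with a "start tag" and stops when the network predicts an "end tag" or reaches a maximum length.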
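The generation loop just described might look roughly like the sketch below. It assumes a hypothetical callable `predict_next` that stands in for the trained model; a real model would also receive the screenshot features on every call. The scripted "model" is only there to make the example runnable.

```python
def generate_markup(predict_next, max_len=10, start="start", end="end"):
    """Greedy decoding: feed the tokens generated so far back into the model.

    `predict_next` is a hypothetical callable that takes the current token
    list and returns the next token.
    """
    tokens = [start]
    while len(tokens) < max_len:
        next_token = predict_next(tokens)
        tokens.append(next_token)
        if next_token == end:
            break  # the model signalled that the markup is complete
    return tokens


# Stand-in "model" for illustration: replays a fixed script of tokens.
script = ["start", "<h1>", "Hello World!", "</h1>", "end"]


def scripted_model(tokens):
    return script[len(tokens)]


result = generate_markup(scripted_model)
print(result)
```

The `max_len` cap matters in practice: a model that never emits the end tag would otherwise loop forever.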

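Hello World version

Now let's build a Hello World implementation. We will feed the network a screenshot of a page displaying "Hello World!" and train it to generate the corresponding markup.

First, the neural network maps the design mockup to a set of pixel values. Each pixel has three RGB channels, and each channel value lies between 0 and 255.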
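To make the pixel representation concrete, here is a tiny sketch that uses nested Python lists in place of a real image array. The helper `solid_image` and the 2 x 3 size are assumptions chosen for illustration; the article's own code loads a 224 x 224 screenshot with Keras instead.

```python
def solid_image(height, width, rgb):
    """Build a height x width image where every pixel is the same RGB triple."""
    return [[list(rgb) for _ in range(width)] for _ in range(height)]


image = solid_image(2, 3, (255, 255, 255))  # tiny all-white "screenshot"
image[0][1] = [0, 0, 0]                     # one black pixel, e.g. part of a letter

# Dimensions are (height, width, channels); every channel value is in 0..255.
shape = (len(image), len(image[0]), len(image[0][0]))
print(shape)
```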

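To represent the markup in a form the network can understand, I use one-hot encoding. The sentence "I can code" can therefore be mapped as follows.

In the figure above, our encoding includes the start and end tags. These tags tell the network where to begin and where to stop predicting. Below are the various combinations of these tags and their corresponding one-hot encodings.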
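As a rough illustration of such an encoding, the sketch below one-hot encodes "I can code" wrapped in start and end tokens. The function name `one_hot` and the ordering of the vocabulary are assumptions for the example, not taken from the article.

```python
def one_hot(tokens, vocabulary):
    """Encode each token as a list with a single 1.0 at its vocabulary index."""
    index = {word: i for i, word in enumerate(vocabulary)}
    encoded = []
    for token in tokens:
        row = [0.0] * len(vocabulary)
        row[index[token]] = 1.0
        encoded.append(row)
    return encoded


# Assumed vocabulary order; any fixed order works as long as it stays consistent.
vocabulary = ["start", "I", "can", "code", "end"]
encoded = one_hot(["start", "I", "can", "code", "end"], vocabulary)
for token, row in zip(vocabulary, encoded):
    print(token, row)
```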

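We make each word change position in every training round, which lets the model learn the sequence instead of memorizing each word's position. The figure below shows four predictions, one per row. The left side shows the three RGB color channels and the previous words; the right side shows the prediction and the red end tag.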
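One way to picture the shifting positions is a left-padded, fixed-length context window: as more tokens are known, earlier ones slide to the left. The sketch below is an assumption-laden illustration (the helper `padded_contexts` and the `<pad>` marker are hypothetical); it uses the three Hello World tokens and mirrors the zero rows that pad the input arrays in the training code.

```python
def padded_contexts(tokens, max_len, pad="<pad>"):
    """Return (left-padded context, next token) rows of fixed context length."""
    rows = []
    for i in range(1, len(tokens)):
        context = tokens[:i][-max_len:]                       # keep the most recent tokens
        context = [pad] * (max_len - len(context)) + context  # pad on the left
        rows.append((context, tokens[i]))
    return rows


rows = padded_contexts(["start", "Hello World!", "end"], max_len=3)
for context, target in rows:
    print(context, "->", target)
```

Note how "start" sits in the last slot in the first row and in the middle slot in the second.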

    # Imports assume Keras 2.x
    import numpy as np
    from keras.applications.vgg16 import VGG16, preprocess_input
    from keras.preprocessing.image import img_to_array, load_img
    from keras.layers import Input, Dense, RepeatVector, LSTM, concatenate
    from keras.models import Model

    # Length of longest sentence
    max_caption_len = 3
    # Size of vocabulary
    vocab_size = 3

    # Load one screenshot for each word and turn them into digits
    images = []
    for i in range(2):
        images.append(img_to_array(load_img('screenshot.jpg', target_size=(224, 224))))
    images = np.array(images, dtype=float)
    # Preprocess input for the VGG16 model
    images = preprocess_input(images)

    # Turn start tokens into one-hot encoding
    html_input = np.array(
        [[[0., 0., 0.],   # start
          [0., 0., 0.],
          [1., 0., 0.]],
         [[0., 0., 0.],   # start Hello World!
          [1., 0., 0.],
          [0., 1., 0.]]])

    # Turn next word into one-hot encoding
    next_words = np.array(
        [[0., 1., 0.],    # Hello World!
         [0., 0., 1.]])   # end

    # Load the VGG16 model trained on imagenet and output the classification feature
    VGG = VGG16(weights='imagenet', include_top=True)
    # Extract the features from the image
    features = VGG.predict(images)

    # Load the feature to the network, apply a dense layer, and repeat the vector
    vgg_feature = Input(shape=(1000,))
    vgg_feature_dense = Dense(5)(vgg_feature)
    vgg_feature_repeat = RepeatVector(max_caption_len)(vgg_feature_dense)

    # Extract information from the input sequence
    # (shape is sequence length x vocabulary size; both happen to be 3 here)
    language_input = Input(shape=(max_caption_len, vocab_size))
    language_model = LSTM(5, return_sequences=True)(language_input)

    # Concatenate the information from the image and the input
    decoder = concatenate([vgg_feature_repeat, language_model])
    # Extract information from the concatenated output
    decoder = LSTM(5, return_sequences=False)(decoder)
    # Predict which word comes next
    decoder_output = Dense(vocab_size, activation='softmax')(decoder)

    # Compile and run the neural network
    model = Model(inputs=[vgg_feature, language_input], outputs=decoder_output)
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

    # Train the neural network
    model.fit([features, html_input], next_words, batch_size=2, shuffle=False, epochs=1000)

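In the Hello World version we use three tokens: "start", "Hello World" and "end". A character-level model needs a smaller vocabulary and a more constrained network, whereas word-level tokens may perform better here.

Here is the code that makes the prediction: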

    # Create an empty sentence and insert the start token
    sentence = np.zeros((1, 3, 3))  # [[0,0,0], [0,0,0], [0,0,0]]
    start_token = [1., 0., 0.]      # start
    sentence[0][2] = start_token    # place start in empty sentence

    # Making the first prediction with the start token
    second_word = model.predict([np.array([features[1]]), sentence])

    # Put the second word in the sentence and make the final prediction
    sentence[0][1] = start_token
    sentence[0][2] = np.round(second_word)
    third_word = model.predict([np.array([features[1]]), sentence])

    # Place the start token and our two predictions in the sentence
    sentence[0][0] = start_token
    sentence[0][1] = np.round(second_word)
    sentence[0][2] = np.round(third_word)

    # Transform our one-hot predictions into the final tokens
    vocabulary = ["start
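The last step named in the comments, turning one-hot predictions back into tokens, could be sketched as follows. This is a standalone illustration rather than the author's code: the helper `decode_one_hot` is hypothetical, and the vocabulary is assumed from the three tokens the article names ("start", "Hello World", "end"), not copied from the original listing.

```python
def decode_one_hot(rows, vocabulary):
    """Map each one-hot (or probability) row to the token at its largest entry."""
    return [vocabulary[max(range(len(row)), key=row.__getitem__)] for row in rows]


# Assumed vocabulary, based on the three tokens described in the text.
vocabulary = ["start", "Hello World!", "end"]

# Stand-in for the rounded predictions held in the sentence array.
sentence = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

tokens = decode_one_hot(sentence, vocabulary)
print(tokens)
```

Taking the index of the largest entry also works on raw softmax outputs, so rounding beforehand is not strictly required for this step.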
