Linear regression fitting with the NEAT neural network algorithm (Python)

爱听许嵩歌

Last modified: 2022-02-18 11:58:49

Categories: Data Analysis (Python), Machine Learning. Tags: deep learning, neat-python, data analysis

First published: 2020-12-21 20:14:37

Article link: https://blog.csdn.net/weixin_45092662/article/details/106526199

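Earlier I used the NEAT (NeuroEvolution of Augmenting Topologies) algorithm to evolve the network and weights for the XOR experiment. Since it could learn that nonlinear problem, a linear one should be no trouble, so here I try it on a linear regression.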

Software versions (conda environment):

python = 3.7.7
pandas = 1.0.3
neat-python = 0.92
numpy = 1.17.0
matplotlib = 3.1.1
(conda install)graphviz = 2.38.0
(pip install)graphviz = 0.13.2

The data, saved as data.csv:

(Image not reproduced: the contents of data.csv, whose columns x and y are read by the main script.)
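The data table itself is only available as an image, so here is a minimal sketch that recreates data.csv. The ten (x, y) pairs are copied from the "input ..., expected output ..." lines printed in the results later in this post; the column names x and y are inferred from data.x and data.y in the main script, and the helper name write_data_csv is made up for illustration.

```python
import csv

# (x, y) pairs copied from the "input ..., expected output ..." lines in the results.
ROWS = [(100, 45), (110, 51), (120, 54), (130, 61), (140, 66),
        (150, 70), (160, 74), (170, 78), (180, 85), (190, 89)]


def write_data_csv(path="data.csv"):
    """Write the sample data with the header row x,y (hypothetical helper)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["x", "y"])
        writer.writerows(ROWS)


if __name__ == "__main__":
    write_data_csv()
```

Running it once in the project folder should leave a data.csv that pd.read_csv('data.csv') can load.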

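First, because this is a linear regression, we change the activation function to relu (relu passes positive values through unchanged, so the output node behaves linearly wherever its input sum is positive). The configuration file is saved as config-feedforward: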

#--- parameters for the xianxing experiment ---#

[NEAT]
fitness_criterion     = max
fitness_threshold     = 99
pop_size              = 300
reset_on_extinction   = True

[DefaultGenome]
# node activation options
activation_default      = random
activation_mutate_rate  = 0.1
activation_options      = relu

# node aggregation options
aggregation_default     = sum
aggregation_mutate_rate = 0.1
aggregation_options     = sum

# node bias options
bias_init_mean          = 0.0
bias_init_stdev         = 1.0
bias_max_value          = 30.0
bias_min_value          = -30.0
bias_mutate_power       = 0.5
bias_mutate_rate        = 0.7
bias_replace_rate       = 0.1

# genome compatibility options
compatibility_disjoint_coefficient = 1.0
compatibility_weight_coefficient   = 0.5

# connection add/remove rates
conn_add_prob           = 0.5
conn_delete_prob        = 0.5

# connection enable options
enabled_default         = True
enabled_mutate_rate     = 0.1

feed_forward            = True
initial_connection      = full

# node add/remove rates
node_add_prob           = 0.2
node_delete_prob        = 0.2

# network parameters
num_hidden              = 0
num_inputs              = 1
num_outputs             = 1

# node response options
response_init_mean      = 1.0
response_init_stdev     = 0.0
response_max_value      = 30.0
response_min_value      = -30.0
response_mutate_power   = 0.0
response_mutate_rate    = 0.0
response_replace_rate   = 0.0

# connection weight options
weight_init_mean        = 0.0
weight_init_stdev       = 1.0
weight_max_value        = 30
weight_min_value        = -30
weight_mutate_power     = 0.5
weight_mutate_rate      = 0.8
weight_replace_rate     = 0.1

[DefaultSpeciesSet]
compatibility_threshold = 3.0

[DefaultStagnation]
species_fitness_func = max
max_stagnation       = 20
species_elitism      = 3

[DefaultReproduction]
elitism            = 3
survival_threshold = 0.3
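
neat-python config files use INI syntax, so a quick sanity check before a long run can be done with Python's standard configparser. This is only a sketch: the helper name summarize_config is hypothetical, and it reads just the few settings this experiment depends on.

```python
from configparser import ConfigParser


def summarize_config(path="config-feedforward"):
    """Return the settings this experiment relies on (hypothetical helper)."""
    parser = ConfigParser()
    if not parser.read(path):  # read() returns the list of files it could parse
        raise FileNotFoundError(path)
    neat_section = parser["NEAT"]
    genome = parser["DefaultGenome"]
    return {
        "fitness_threshold": neat_section.getfloat("fitness_threshold"),
        "pop_size": neat_section.getint("pop_size"),
        "activation_options": genome["activation_options"].split(),
        "num_inputs": genome.getint("num_inputs"),
        "num_outputs": genome.getint("num_outputs"),
    }
```

Called next to the file above, summarize_config() should report a threshold of 99, a population of 300, relu as the only activation option, and one input and one output.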

Next comes the main script, saved as neat线性回归.py:

import os
import neat
import visualize
import pandas as pd

data = pd.read_csv('data.csv')
x_input = data.x
y_output = data.y


def inputshuju():
    x = []
    for i in range(len(x_input)):
        x1 = x_input[i]
        x.append([x1])

    return x


def outputshuju():
    y = []
    for i in range(len(y_output)):
        y1 = y_output[i]
        y.append([y1])

    return y


# 1 input, 1 output
xor_inputs = inputshuju()
xor_outputs = outputshuju()


def eval_genomes(genomes, config):
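    # Fitness evaluation function
    for genome_id, genome in genomes:  # each individual in the population
        genome.fitness = 100.0  # start from 100.0, then subtract the squared errors
        net = neat.nn.FeedForwardNetwork.create(genome, config)  # build a feed-forward network from the genome
        for xi, xo in zip(xor_inputs, xor_outputs):  # zip pairs each input with its expected output, see https://www.runoob.com/python3/python3-func-zip.html
            output = net.activate(xi)
            genome.fitness -= (output[0] - xo[0]) ** 2  # final fitness = 100 - sum of squared errors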


def run(config_file):
    # Load the configuration file
    config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                         neat.DefaultSpeciesSet, neat.DefaultStagnation,
                         config_file)

    # Create the population
    p = neat.Population(config)

    # Report training progress to stdout
    p.add_reporter(neat.StdOutReporter(True))
    stats = neat.StatisticsReporter()
    p.add_reporter(stats)
    p.add_reporter(neat.Checkpointer())

    # Run for up to 1000 generations
    winner = p.run(eval_genomes, 1000)

    # Show the best genome
    print('\nBest genome:\n{!s}'.format(winner))
    print('\nOutput:')
    winner_net = neat.nn.FeedForwardNetwork.create(winner, config)
    for xi, xo in zip(xor_inputs, xor_outputs):
        output = winner_net.activate(xi)
        print("input {!r}, expected output {!r}, got {!r}".format(xi, xo, output))

    # Draw the network structure
    node_names = {-1: 'x', 0: 'y'}
    visualize.draw_net(config, winner, True, node_names=node_names)
    visualize.plot_stats(stats, ylog=False, view=True)
    visualize.plot_species(stats, view=True)

    # p = neat.Checkpointer.restore_checkpoint('neat-checkpoint-49')
    # p.run(eval_genomes, 10)


if __name__ == '__main__':
    local_dir = os.path.dirname(__file__)
    config_path = os.path.join(local_dir, 'config-feedforward')
    run(config_path)

The plotting file, saved as visualize.py:

from __future__ import print_function

import copy
import warnings

import graphviz
import matplotlib.pyplot as plt
import numpy as np


def plot_stats(statistics, ylog=False, view=False, filename='avg_fitness.svg'):
    """ Plots the population's average and best fitness. """
    if plt is None:
        warnings.warn("This display is not available due to a missing optional dependency (matplotlib)")
        return

    generation = range(len(statistics.most_fit_genomes))
    best_fitness = [c.fitness for c in statistics.most_fit_genomes]
    avg_fitness = np.array(statistics.get_fitness_mean())
    stdev_fitness = np.array(statistics.get_fitness_stdev())

    plt.plot(generation, avg_fitness, 'b-', label="average")
    plt.plot(generation, avg_fitness - stdev_fitness, 'g-.', label="-1 sd")
    plt.plot(generation, avg_fitness + stdev_fitness, 'g-.', label="+1 sd")
    plt.plot(generation, best_fitness, 'r-', label="best")

    plt.title("Population's average and best fitness")
    plt.xlabel("Generations")
    plt.ylabel("Fitness")
    plt.grid()
    plt.legend(loc="best")
    if ylog:
        plt.gca().set_yscale('symlog')

    plt.savefig(filename)
    if view:
        plt.show()

    plt.close()


def plot_spikes(spikes, view=False, filename=None, title=None):
    """ Plots the trains for a single spiking neuron. """
    t_values = [t for t, I, v, u, f in spikes]
    v_values = [v for t, I, v, u, f in spikes]
    u_values = [u for t, I, v, u, f in spikes]
    I_values = [I for t, I, v, u, f in spikes]
    f_values = [f for t, I, v, u, f in spikes]

    fig = plt.figure()
    plt.subplot(4, 1, 1)
    plt.ylabel("Potential (mv)")
    plt.xlabel("Time (in ms)")
    plt.grid()
    plt.plot(t_values, v_values, "g-")

    if title is None:
        plt.title("Izhikevich's spiking neuron model")
    else:
        plt.title("Izhikevich's spiking neuron model ({0!s})".format(title))

    plt.subplot(4, 1, 2)
    plt.ylabel("Fired")
    plt.xlabel("Time (in ms)")
    plt.grid()
    plt.plot(t_values, f_values, "r-")

    plt.subplot(4, 1, 3)
    plt.ylabel("Recovery (u)")
    plt.xlabel("Time (in ms)")
    plt.grid()
    plt.plot(t_values, u_values, "r-")

    plt.subplot(4, 1, 4)
    plt.ylabel("Current (I)")
    plt.xlabel("Time (in ms)")
    plt.grid()
    plt.plot(t_values, I_values, "r-o")

    if filename is not None:
        plt.savefig(filename)

    if view:
        plt.show()
        plt.close()
        fig = None

    return fig


def plot_species(statistics, view=False, filename='speciation.svg'):
    """ Visualizes speciation throughout evolution. """
    if plt is None:
        warnings.warn("This display is not available due to a missing optional dependency (matplotlib)")
        return

    species_sizes = statistics.get_species_sizes()
    num_generations = len(species_sizes)
    curves = np.array(species_sizes).T

    fig, ax = plt.subplots()
    ax.stackplot(range(num_generations), *curves)

    plt.title("Speciation")
    plt.ylabel("Size per Species")
    plt.xlabel("Generations")

    plt.savefig(filename)

    if view:
        plt.show()

    plt.close()


def draw_net(config, genome, view=False, filename=None, node_names=None, show_disabled=True, prune_unused=False,
             node_colors=None, fmt='svg'):
    """ Receives a genome and draws a neural network with arbitrary topology. """
    # Attributes for network nodes.
    if graphviz is None:
        warnings.warn("This display is not available due to a missing optional dependency (graphviz)")
        return

    if node_names is None:
        node_names = {}

    assert type(node_names) is dict

    if node_colors is None:
        node_colors = {}

    assert type(node_colors) is dict

    node_attrs = {
        'shape': 'circle',
        'fontsize': '9',
        'height': '0.2',
        'width': '0.2'}

    dot = graphviz.Digraph(format=fmt, node_attr=node_attrs)

    inputs = set()
    for k in config.genome_config.input_keys:
        inputs.add(k)
        name = node_names.get(k, str(k))
        input_attrs = {'style': 'filled',
                       'shape': 'box'}
        input_attrs['fillcolor'] = node_colors.get(k, 'lightgray')
        dot.node(name, _attributes=input_attrs)

    outputs = set()
    for k in config.genome_config.output_keys:
        outputs.add(k)
        name = node_names.get(k, str(k))
        node_attrs = {'style': 'filled'}
        node_attrs['fillcolor'] = node_colors.get(k, 'lightblue')

        dot.node(name, _attributes=node_attrs)

    if prune_unused:
        connections = set()
        for cg in genome.connections.values():
            if cg.enabled or show_disabled:
                connections.add(cg.key)  # cg.key is the (input, output) node-id pair

        used_nodes = copy.copy(outputs)
        pending = copy.copy(outputs)
        while pending:
            new_pending = set()
            for a, b in connections:
                if b in pending and a not in used_nodes:
                    new_pending.add(a)
                    used_nodes.add(a)
            pending = new_pending
    else:
        used_nodes = set(genome.nodes.keys())

    for n in used_nodes:
        if n in inputs or n in outputs:
            continue

        attrs = {'style': 'filled',
                 'fillcolor': node_colors.get(n, 'white')}
        dot.node(str(n), _attributes=attrs)

    for cg in genome.connections.values():
        if cg.enabled or show_disabled:
            #if cg.input not in used_nodes or cg.output not in used_nodes:
            #    continue
            input, output = cg.key
            a = node_names.get(input, str(input))
            b = node_names.get(output, str(output))
            style = 'solid' if cg.enabled else 'dotted'
            color = 'green' if cg.weight > 0 else 'red'
            width = str(0.1 + abs(cg.weight / 5.0))
            dot.edge(a, b, _attributes={'style': style, 'color': color, 'penwidth': width})

    dot.render(filename, view=view)

    return dot

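How it works:

The idea is simple: read the data with pandas, then let the NEAT algorithm iteratively evolve the topology and weights we need. With fitness_threshold = 99 and a limit of 1000 generations, the run ends as soon as the best fitness reaches 99 or after 1000 generations, whichever comes first. Part of the output from the run is shown below.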

 ****** Running generation 1 ****** 

Population's average fitness: -44417.02749 stdev: 40257.92716
Best fitness: 89.04762 - size: (1, 1) - species 1 - id 544
Average adjusted fitness: 0.902
Mean genetic distance 1.631, standard deviation 0.836
Population of 300 members in 2 species:
   ID   age  size  fitness  adj fit  stag
  ====  ===  ====  =======  =======  ====
     1    1   281     89.0    0.902     0
     2    0    19       --       --     0
Total extinctions: 0
Generation time: 0.034 sec (0.034 average)

****** Running generation 999 ****** 

Population's average fitness: -66597.44638 stdev: 271056.35588
Best fitness: 92.77414 - size: (1, 1) - species 64 - id 275343
Average adjusted fitness: 0.980
Mean genetic distance 2.519, standard deviation 0.711
Population of 301 members in 5 species:
   ID   age  size  fitness  adj fit  stag
  ====  ===  ====  =======  =======  ====
    21  756    56     92.8    0.989   417
    62  182    51     92.2    0.955   148
    64  154   101     92.8    0.981   101
    72   80    52     91.7    0.990     4
    73   68    41     92.0    0.987     4
Total extinctions: 0
Generation time: 0.066 sec (0.069 average)
Saving checkpoint to neat-checkpoint-999

Best genome:
Key: 254624
Fitness: 92.77413717482956
Nodes:
	0 DefaultNodeGene(key=0, bias=-2.67537926563292, response=1.0, activation=relu, aggregation=sum)
	3113 DefaultNodeGene(key=3113, bias=-2.00329012513426, response=1.0, activation=relu, aggregation=sum)
Connections:
	DefaultConnectionGene(key=(-1, 0), weight=0.48258719499769087, enabled=True)
	DefaultConnectionGene(key=(-1, 3113), weight=-1.74060320260046, enabled=True)

Output:
input [100], expected output [45], got [45.583340234136166]
input [110], expected output [51], got [50.40921218411307]
input [120], expected output [54], got [55.23508413408999]
input [130], expected output [61], got [60.060956084066895]
input [140], expected output [66], got [64.8868280340438]
input [150], expected output [70], got [69.7126999840207]
input [160], expected output [74], got [74.53857193399762]
input [170], expected output [78], got [79.36444388397452]
input [180], expected output [85], got [84.19031583395143]
input [190], expected output [89], got [89.01618778392834]

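The connection graph:

(Image not reproduced: the network graph drawn by visualize.draw_net.)

In the end, the run stopped because it reached 1000 generations, not because the fitness reached 99; the final best fitness is 92.77414. Let's see how far the network's outputs are from the expected outputs: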

Output:
input [100], expected output [45], got [45.583340234136166]
input [110], expected output [51], got [50.40921218411307]
input [120], expected output [54], got [55.23508413408999]
input [130], expected output [61], got [60.060956084066895]
input [140], expected output [66], got [64.8868280340438]
input [150], expected output [70], got [69.7126999840207]
input [160], expected output [74], got [74.53857193399762]
input [170], expected output [78], got [79.36444388397452]
input [180], expected output [85], got [84.19031583395143]
input [190], expected output [89], got [89.01618778392834]
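
As a cross-check on why the run never reached 99, here is a sketch (not part of the original experiment) that fits the same ten points by ordinary least squares and converts the result to the fitness used in eval_genomes, that is, 100 minus the sum of squared errors. It assumes a single straight line y = kx + b; a network with hidden relu nodes could in principle bend the line and score higher. The function names are made up for illustration.

```python
XS = [100, 110, 120, 130, 140, 150, 160, 170, 180, 190]
YS = [45, 51, 54, 61, 66, 70, 74, 78, 85, 89]


def least_squares_line(xs, ys):
    """Closed-form ordinary least squares for y = k*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx
    b = mean_y - k * mean_x
    return k, b


def fitness_of_line(k, b, xs, ys):
    """Same scoring as eval_genomes: 100 minus the sum of squared errors."""
    return 100.0 - sum((k * x + b - y) ** 2 for x, y in zip(xs, ys))


if __name__ == "__main__":
    k, b = least_squares_line(XS, YS)
    print("k = {:.5f}, b = {:.5f}".format(k, b))
    print("best straight-line fitness = {:.5f}".format(fitness_of_line(k, b, XS, YS)))
```

On this data the least-squares line scores a little under 92.78, so a fitness of 99 (a squared-error sum below 1) is out of reach for any single straight line, and the evolved genome at 92.774 is already very close to that ceiling.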

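As you can see, the network's outputs differ only slightly from the expected outputs, so the evolved network fits this linear regression well. So how do we read off the regression equation?

It is already in the results: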

Nodes:
	0 DefaultNodeGene(key=0, bias=-2.67537926563292, response=1.0, activation=relu, aggregation=sum)
	3113 DefaultNodeGene(key=3113, bias=-2.00329012513426, response=1.0, activation=relu, aggregation=sum)
Connections:
	DefaultConnectionGene(key=(-1, 0), weight=0.48258719499769087, enabled=True)
	DefaultConnectionGene(key=(-1, 3113), weight=-1.74060320260046, enabled=True)

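-1 is x and 0 is y. The extra node 3113 is just a leftover of evolution: it has no connection into node 0, so it does not affect the output and can be ignored. Assume the regression equation is y = kx + b. (Node 0 applies relu, but kx + b is positive for every input used here, so relu leaves it unchanged.)

From the following output line, k = weight = 0.48258719499769087: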

DefaultConnectionGene(key=(-1, 0), weight=0.48258719499769087, enabled=True)

Likewise, b = bias = -2.67537926563292:

0 DefaultNodeGene(key=0, bias=-2.67537926563292, response=1.0, activation=relu, aggregation=sum)

So the regression equation is

$y = 0.48258719499769087 x - 2.67537926563292$
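
To double-check this reading of the genome, the sketch below evaluates node 0 by hand as relu(bias + response * (weight * x)), with response = 1.0 and a sum aggregation over the single incoming connection, as listed in the output. The weight and bias are copied from the best genome; the function names are illustrative only.

```python
K = 0.48258719499769087   # weight of connection (-1, 0)
B = -2.67537926563292     # bias of node 0


def relu(z):
    """Rectified linear unit: passes positive values, clips the rest to 0."""
    return z if z > 0.0 else 0.0


def predict(x):
    """Output of node 0 for input x, with response fixed at 1.0."""
    return relu(B + 1.0 * (K * x))


if __name__ == "__main__":
    for x in range(100, 200, 10):
        print("input {}, predicted {:.6f}".format(x, predict(x)))
```

For x = 100 this gives about 45.58334, in line with the first row of the network output, and relu never clips because kx + b stays positive over the whole input range.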

With TensorFlow 2.0, the regression equation I obtained was (link: https://blog.csdn.net/weixin_45092662/article/details/101688614)

y = 0.4737 x - 1.33

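The two are quite close. Why aren't they identical? Because the result depends on the precision you choose and on the loss value at which training stops. As long as the fit captures the linear trend and the error is within an acceptable range, the fit is a success. So the NEAT algorithm can evolve networks for both linear and nonlinear problems, which is quite useful; I will look into it further.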
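
To put a number on "quite close", the following sketch compares the two fitted lines over the input range used here (x from 100 to 190). Both sets of coefficients are taken from this post; the names are illustrative.

```python
NEAT_LINE = (0.48258719499769087, -2.67537926563292)  # (k, b) from the best genome
TF_LINE = (0.4737, -1.33)                             # (k, b) from the TensorFlow 2.0 fit


def line(kb, x):
    """Evaluate y = k*x + b for a (k, b) pair."""
    k, b = kb
    return k * x + b


def max_gap(line_a, line_b, xs):
    """Largest absolute difference between two lines over the given x values."""
    return max(abs(line(line_a, x) - line(line_b, x)) for x in xs)


if __name__ == "__main__":
    xs = range(100, 200, 10)
    print("largest gap: {:.3f}".format(max_gap(NEAT_LINE, TF_LINE, xs)))
```

The gap stays under 0.5 across this range, which is small against y values between 45 and 89.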

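If this was useful, please give it a like!
All articles on this site are original. Reposting is welcome; please credit the source: https://blog.csdn.net/weixin_45092662. Baidu and scraper sites are not to be trusted, so check search results carefully. Technical articles tend to go out of date, and I revise my posts from time to time, so please visit the source for the latest version.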
