AutonomousDrivingCookbook from the Microsoft team

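This post introduces the Autonomous Driving Cookbook, focusing on its end-to-end deep learning tutorial and on distributed deep reinforcement learning for autonomous driving. It looks at the reinforcement learning algorithm, in particular the reward function and the network architecture, and at how transfer learning is applied. It also covers launching a local training job and running the model, and stresses the importance of distributed training.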

A self-service guide to autonomous driving

Good material, so I am noting it down here to push myself to keep studying.

End-to-end deep learning tutorial

To review later.

Distributed Deep Reinforcement Learning for Autonomous Driving

Distributed training and reinforcement learning in autonomous driving

ExploreAlgorithm: exploring the algorithm

Step 1 - Explore the Algorithm

In this notebook you will get an overview of the reinforcement learning algorithm being used for this experiment and the implementation of distributed learning.

The reward function

To compute our reward function, we begin by computing the distance to the center of the nearest road. We then pass that distance through an exponential weighting function to force this portion to the range [0, 1].
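As a minimal sketch of that weighting step on its own: the exact formula is not shown in this excerpt, so the `exp(-rate * distance)` form below is an assumption, and `distance_reward` is a hypothetical helper name. The default rate of 1.2 mirrors the `DISTANCE_DECAY_RATE` constant in the function that follows.

```python
import math

# Assumed decay rate; mirrors DISTANCE_DECAY_RATE in the post's compute_reward.
DISTANCE_DECAY_RATE = 1.2


def distance_reward(distance_to_center: float,
                    decay_rate: float = DISTANCE_DECAY_RATE) -> float:
    """Map a non-negative distance to the range (0, 1].

    Assumed form: exp(-decay_rate * distance). A car on the center line
    (distance 0) scores 1, and the score decays toward 0 as it drifts away.
    """
    if distance_to_center < 0:
        raise ValueError("distance must be non-negative")
    return math.exp(-decay_rate * distance_to_center)


if __name__ == "__main__":
    # Print the score at a few sample distances from the center line.
    for d in (0.0, 1.0, 3.5):
        print(d, distance_reward(d))
```

With this form the reward is exactly 1 on the center line and falls off smoothly, so small drifts cost little while large ones push the score close to zero.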

import numpy as np

def compute_reward(car_state, collision_info, road_points):
    # Constant parameters for the reward function
    THRESH_DIST = 3.5                # The maximum distance from the center of the road to compute the reward function
    DISTANCE_DECAY_RATE = 1.2        # The rate at which the reward decays for the distance function
    CENTER_SPEED_MULTIPLIER = 2.0    # The ratio at which we prefer the distance reward to the speed reward

    # If the car is stopped or nearly stopped (speed below 2), the reward is always zero
    speed = car_state.speed
    if speed < 2:
        return 0

    # Get the car position; kinematics_true is keyed by UTF-8 byte strings
    position_key = bytes('position', encoding='utf8')
    x_val_key = bytes('x_val', encoding='utf8')
    y_val_key = bytes('y_val', encoding='utf8')

    # Only x and y are used; the z coordinate is fixed to 0
    car_point = np.array([car_state.kinematics_true[position_key][x_val_key], car_state.kinematics_true[position_key][y_val_key], 0])

    # Distance component is exponential distance to nearest line.
    # Start from a large sentinel value.
    distance = 999

    # Compute the distance to the nearest center line
    for line in road_points:
        local_distance = 0
        length_squared = ((line[0][0]-line[1][0
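
The loop above measures how far the car is from each road center line. One common way to do that, sketched here as an assumption rather than the notebook's actual code, is to project the car position onto each segment and clamp the projection to the segment's endpoints. The helper names `point_to_segment_distance` and `nearest_center_line_distance` are hypothetical, and the sketch uses plain tuples and only the x and y coordinates to stay dependency-free.

```python
import math


def point_to_segment_distance(point, seg_start, seg_end):
    """Distance in the x-y plane from `point` to the segment seg_start..seg_end."""
    px, py = point[0], point[1]
    ax, ay = seg_start[0], seg_start[1]
    bx, by = seg_end[0], seg_end[1]

    dx, dy = bx - ax, by - ay
    length_squared = dx * dx + dy * dy

    # Degenerate segment: both endpoints coincide, so measure to that point.
    if length_squared == 0:
        return math.hypot(px - ax, py - ay)

    # Projection parameter of the point onto the infinite line, clamped to
    # [0, 1] so the closest point stays on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_squared
    t = max(0.0, min(1.0, t))

    closest_x = ax + t * dx
    closest_y = ay + t * dy
    return math.hypot(px - closest_x, py - closest_y)


def nearest_center_line_distance(car_point, road_points):
    """Smallest distance from the car to any (start, end) pair in road_points."""
    return min(
        point_to_segment_distance(car_point, line[0], line[1])
        for line in road_points
    )


if __name__ == "__main__":
    # Two parallel center lines; the car sits near the first one.
    roads = [((0.0, 0.0), (10.0, 0.0)), ((0.0, 5.0), (10.0, 5.0))]
    print(nearest_center_line_distance((3.0, 1.0, 0.0), roads))
```

Clamping the projection treats each road line as a finite segment. Whether the original notebook does the same, or measures to the infinite line, is not visible in this excerpt.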