[Learning 2 - Python Data Analysis and Applications] NumPy Array Basics

The NumPy array object: ndarray

1. Array Attributes

An ndarray is a multidimensional array whose elements all share a single data type.

Attribute   Description
ndim        Returns an int: the number of dimensions of the array.
shape       Returns a tuple: the dimensions of the array. For a matrix with n rows and m columns, the shape is (n, m).
size        Returns an int: the total number of elements, equal to the product of the entries in shape (e.g. 3 rows x 4 columns = 12).
dtype       Returns a data-type object: the type of the elements in the array.
itemsize    Returns an int: the size in bytes of each element of the array.
  • Create an array and view its attributes (see the sketch below)
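In place of the original screenshot, a minimal sketch of this step; the concrete values are illustrative, assuming a 3x4 integer array:

```python
import numpy as np

# Create a 3-row, 4-column array and read the attributes listed above
arr = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12]])

print(arr.ndim)      # 2        -- number of dimensions
print(arr.shape)     # (3, 4)   -- 3 rows, 4 columns
print(arr.size)      # 12       -- total elements: 3 * 4
print(arr.dtype)     # int64    -- inferred element type (platform dependent)
print(arr.itemsize)  # 8        -- bytes per element for int64
```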

2. Array Creation

numpy.array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0)

Parameter   Description
object      Receives an array-like object: the data for the array to create. Required; no default.
dtype       Receives a data-type: the desired type of the array's elements. If not given, the minimal type needed to hold the object is chosen. Defaults to None; an explicitly set dtype takes precedence over the inferred one.
ndmin       Receives an int: the minimum number of dimensions the resulting array should have. Defaults to 0.
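A short sketch of the dtype and ndmin parameters in action (the input values are chosen for illustration):

```python
import numpy as np

# dtype: override the inferred minimal type
a = np.array([1, 2, 3], dtype=np.float64)
print(a.dtype)   # float64

# ndmin: guarantee a minimum number of dimensions
b = np.array([1, 2, 3], ndmin=2)
print(b.shape)   # (1, 3) -- the 1-D input is promoted to 2-D
```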
  • Reset the array's shape to three rows and two columns (see the sketch below)
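A minimal sketch of this step, replacing the original screenshot; the source array here is an assumption:

```python
import numpy as np

arr = np.arange(6)      # [0 1 2 3 4 5]

# Assigning to .shape reshapes the array in place
arr.shape = (3, 2)
print(arr)
# [[0 1]
#  [2 3]
#  [4 5]]

# reshape() instead returns a reshaped view, leaving the original untouched
arr2 = np.arange(6).reshape(3, 2)
```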

  • Create an array with the arange function

The familiar built-in range(10) yields the 10 integers 0 through 9; arange works similarly, as sketched below.
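A short sketch of arange (example values are illustrative; unlike range, arange returns an ndarray and also accepts float steps):

```python
import numpy as np

print(np.arange(10))         # [0 1 2 3 4 5 6 7 8 9] -- like range(10)
print(np.arange(1, 10, 2))   # [1 3 5 7 9]           -- start, stop, step
print(np.arange(0, 1, 0.2))  # [0.  0.2 0.4 0.6 0.8] -- float steps are allowed
```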

  • Create an array with the linspace function

np.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None)

Unlike arange, which takes a step size, linspace takes the number of samples: it returns num evenly spaced values from start to stop, including the endpoint by default.
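A short sketch (values chosen for illustration):

```python
import numpy as np

# 5 evenly spaced values from 0 to 1; the endpoint is included by default
print(np.linspace(0, 1, num=5))
# [0.   0.25 0.5  0.75 1.  ]

# endpoint=False excludes the stop value, mirroring arange-style intervals
print(np.linspace(0, 1, num=5, endpoint=False))
# [0.  0.2 0.4 0.6 0.8]
```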
