<Python learning> Basic

Learn from Codecademy: https://www.codecademy.com/learn/python
& Liao Xuefeng's Python tutorial (廖雪峰 Python 教程)

  1. Python syntax
  2. Strings and console output
  3. Conditionals and control flow
  4. Functions
  5. Lists and dictionaries

List:

  1. Three ways to remove an element from a list (see the sketch after this list):

    • n.pop(index) will remove the item at index from the list and return the removed element to you
    • n.remove(item) will remove the first occurrence of the actual item (the element itself), not an index
    • del(n[1]) is like .pop in that it will remove the item at the given index, but it won't return the element
  2. Iterating over a list, two ways:

# Iterate over the elements directly
for item in my_list:
    print item
# Or iterate over the indices
for i in range(len(my_list)):
    print my_list[i]
  3. A string 'xxx' or a Unicode string u'xxx' can also be viewed as a kind of list whose elements are single characters. Strings therefore support slicing as well; the result of a slice is still a string.
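A minimal sketch of the three removal methods and string slicing above; the list n and string s are made-up examples:

n = ["a", "b", "c", "d"]
popped = n.pop(1)   # removes and returns "b"; n is now ["a", "c", "d"]
n.remove("c")       # removes the value "c"; n is now ["a", "d"]
del(n[0])           # removes index 0, returns nothing; n is now ["d"]

s = "hello"
print s[1:4]        # "ell" -- slicing a string yields a string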

Loop:

  1. While / else:
    while/else is similar to if/else; the difference is that
    the body of the else only executes when the loop condition is False. That is, if the loop exits via break, the else will not execute.

  2. For / else
    Same as while/else: the else only executes when the for loop exits normally (the iteration completes); if it exits via break, the else does not execute. A sketch follows this list.
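A minimal sketch of for/else behavior (the same rule applies to while/else); the list of odd numbers is a made-up example:

# else runs because the loop finishes without hitting break
for n in [1, 3, 5]:
    if n % 2 == 0:
        print "found an even number"
        break
else:
    print "no even numbers"   # printed: the loop never broke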

A trailing , at the end of a print statement (Python 2) makes the next print statement keep printing on the same line instead of starting a new one.
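For instance, a two-line sketch using the Python 2 print statement:

print "Spam",
print "and eggs"   # output: Spam and eggs (on one line)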

zip can help iterate over two lists at once. It will create pairs of elements when passed two or more lists, and will stop at the end of the shorter list.
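A minimal sketch of zip over two made-up lists of different lengths:

list_a = [1, 2, 3, 4]
list_b = ["a", "b", "c"]
for a, b in zip(list_a, list_b):
    print a, b   # stops after pairing 3 with "c", the end of the shorter list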


Filter and Lambda Syntax

Lambda functions are anonymous functions written inline; here one is passed to filter:

my_list = range(16)
# keep only the multiples of 3; in Python 2, filter returns a list
print filter(lambda x: x % 3 == 0, my_list)   # [0, 3, 6, 9, 12, 15]

Lambdas are useful when you need a quick function to do some work for you.

If you plan on creating a function you’ll use over and over, you’re better off using def and giving that function a name.
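A small made-up sketch of the contrast: a reusable named function via def versus a one-off lambda that produces the same result:

def is_multiple_of_3(x):   # named and reusable
    return x % 3 == 0

print filter(is_multiple_of_3, range(16))
print filter(lambda x: x % 3 == 0, range(16))   # same output, one-off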

In Python, you can write numbers in binary format by starting the number with 0b.
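A small sketch of binary literals, together with the related built-ins bin and int (added here for context):

print 0b1010            # 10
print bin(10)           # '0b1010'
print int("1010", 2)    # 10 -- parse a binary string back to an int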


Class Syntax

  • By convention, user-defined Python class names start with a capital letter.
  • Define the initializer using __init__() (you can think of __init__() as the method that "boots up" a class's instance object; remember it is two underscores on each side, not one, and the first parameter is usually self, referring to the instance itself)
  • We can access attributes of our objects using dot notation
  • When a class has its own functions, those functions are called methods.
  • self must also be written as the first parameter in each method's definition
  • Overriding an inherited method and calling the base class's version (see the fuller sketch below):
class Derived(Base):
    def m(self):
        return super(Derived, self).m()   # call Base's m()
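Pulling these points together, a minimal made-up sketch of a class with __init__, an attribute accessed via dot notation, a method, and inheritance with super (Python 2 style):

class Animal(object):
    def __init__(self, name):   # two underscores on each side
        self.name = name        # instance attribute

    def description(self):      # a method: self comes first
        return "I am " + self.name

class Dog(Animal):
    def description(self):
        # extend the inherited method via super
        return super(Dog, self).description() + " the dog"

d = Dog("Rex")
print d.name           # dot notation: Rex
print d.description()  # I am Rex the dog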

File Input/Output

my_list = [i**2 for i in range(1,11)]

# "r+" opens the file for both reading and writing
# (the file must already exist; "w" would create or overwrite it)
my_file = open("output.txt", "r+")

# Write each square into the file, one per line
for i in my_list:
    my_file.write(str(i) + "\n")

# Always remember to close the file.
# If you write to a file without closing, the data won't make it to the target file.
my_file.close()
  • Read from a file line by line: .readline()

  • When a file object's __exit__() method is invoked, it automatically closes the file. Use with and as to invoke it:

with open("text.txt", "w") as textfile:
    textfile.write("Success!")
  • Check whether a file is closed: the f.closed attribute returns True once the file is closed, False otherwise
f.closed
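A small sketch tying readline, with/as, and .closed together, assuming the text.txt created by the with example above:

with open("text.txt", "r") as textfile:
    first_line = textfile.readline()   # read one line at a time
    print first_line                   # Success!

print textfile.closed   # True -- __exit__() closed the file on exiting the block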