Gym 101149C Mathematical Field of Experiments (pattern / precomputed table)

C - Mathematical Field of Experiments
Time Limit: 2000 MS     Memory Limit: 262144 KB     64-bit IO Format: %I64d & %I64u

Description

standard input/output
Statements

Mathematician Michael is dreaming how he becomes a Minister of Education in Russia and tries his innovative experimental educational program in Maths in an elementary school. The main feature of this program is learning arithmetical operations in fields of integers modulo prime numbers instead of fields of real numbers. Impressed by his idea, Michael has started to write a Maths textbook for the 1st grade kids and is already preparing exercises to find a square root in modular arithmetic.

In each such exercise an integer x is given, and it's needed to find its square root modulo prime number p, which is also given. The correct answer for such task is an integer s, such that s·s and x have the same remainder after division by p. In other words, the number s·s − x has to leave no remainder after division by p. It must be said that the square root s doesn't exist for some numbers x.

To speed up the process of preparing tasks in this topic, Michael decided to write a program that finds square roots modulo given prime number p for all numbers x from 0 to p − 1, or tells that the corresponding square root doesn't exist.

Input

The first line contains a prime number p (2 ≤ p ≤ 106). A prime number has exactly two different divisors.

Output

Output p space-separated integers, the i-th of which must be equal to the square root of i − 1 modulo p. All numbers must be between 0 and p − 1. If some square root doesn't exist, output −1 instead of it, and if there are multiple square roots for some i, output any of them.

Sample Input

Input
5
Output
0 4 -1 -1 3

Input
7
Output
0 1 3 -1 5 -1 -1

Hint

In the first sample: 0·0 = 0, 4·4 = 16 ≡ 1 (mod 5), and 3·3 = 9 ≡ 4 (mod 5), while 2 and 3 have no square root modulo 5.


Approach: build a lookup table. For every i from 1 to ⌊p/2⌋, record i as a square root of i² mod p; this half range is enough because (p − i)² = p² − 2pi + i² ≡ i² (mod p), so i and p − i always have the same square. Every x from 0 to p − 1 is then answered by a direct table lookup, printing −1 when no root was recorded.


#include <iostream> 
#include <cstdio>
#include <cstdlib>
#include <cmath>
#include <algorithm>
#include <climits>
#include <cstring>
#include <string>
#include <set>
#include <map>
#include <queue>
#include <stack>
#include <vector>
#include <list>
#define rep(i,m,n) for(i=m;i<=n;i++)
#define rsp(it,s) for(set<int>::iterator it=s.begin();it!=s.end();it++)
const int inf_int = 2e9;
const long long inf_ll = 2e18;
#define inf_add 0x3f3f3f3f
#define mod 1000000007
#define vi vector<int>
#define pb push_back
#define mp make_pair
#define fi first
#define se second
#define pi acos(-1.0)
#define pii pair<int,int>
#define Lson L, mid, rt<<1
#define Rson mid+1, R, rt<<1|1
const int maxn=5e2+10;
using namespace std;
typedef  long long ll;
typedef  unsigned long long  ull; 
inline int read(){int ra,fh;char rx;rx=getchar(),ra=0,fh=1;
while((rx<'0'||rx>'9')&&rx!='-')rx=getchar();if(rx=='-')
fh=-1,rx=getchar();while(rx>='0'&&rx<='9')ra*=10,ra+=rx-48,
rx=getchar();return ra*fh;}
//#pragma comment(linker, "/STACK:102400000,102400000")
ll gcd(ll p,ll q){return q==0?p:gcd(q,p%q);}
ll qpow(ll p,ll q){ll f=1;while(q){if(q&1)f=f*p;p=p*p;q>>=1;}return f;}
int dir[4][2] = {{1,0},{-1,0},{0,1},{0,-1}};

ll a[2000005];            // a[x] stores one square root of x modulo p (0 means "not found"; x = 0 is handled separately)
int main()
{
	ll n;
	cin >> n;             // n is the prime p
	printf("0\n");        // the square root of 0 is always 0
	// i and p - i have the same square, so i = 1 .. p/2 already covers every quadratic residue.
	for(ll i=1;i<=n/2;i++)
	{
		a[i*i%n] = i;     // record i as a square root of i*i mod p
	}
	// Answer every x from 1 to p - 1 by table lookup; the newline-separated output is
	// still whitespace-separated, which the token-based checker accepts.
	for(ll i=1;i<n;i++)
	{
		if(a[i])
			printf("%lld\n",a[i]);
		else
			printf("-1\n");
	}

	return 0;
}
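
For small primes the table answers can be cross-checked against a naive scan. The standalone sketch below is only an illustration (the test prime 7 and everything in it are hypothetical, not part of the submitted solution): it rebuilds the same half-range table and verifies, for every x, that it agrees with an O(p) search for a witness s with s·s ≡ x (mod p).

#include <cstdio>
#include <vector>
using namespace std;

// Brute-force verification sketch for a small prime p (assumes p is tiny enough for an O(p^2) scan).
int main()
{
	long long p = 7;                       // hypothetical test prime
	vector<long long> root(p, -1);
	root[0] = 0;
	for(long long i = 1; i <= p/2; i++)    // same half-range tabulation as the solution
		root[i*i % p] = i;
	for(long long x = 0; x < p; x++)
	{
		bool exists = false;               // does any s in [0, p) satisfy s*s ≡ x (mod p)?
		for(long long s = 0; s < p; s++)
			if(s*s % p == x) { exists = true; break; }
		// The table must agree with the naive scan, and a stored root must actually square to x.
		if(exists != (root[x] != -1) || (exists && root[x]*root[x] % p != x))
			printf("mismatch at x = %lld\n", x);
	}
	printf("done\n");
	return 0;
}

The check passes because, for an odd prime p and a nonzero quadratic residue x, the two roots are s and p − s, and one of them always lies in 1 .. (p − 1)/2, so the half-range loop cannot miss it.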



