1267 - Network (Greedy)

Consider a tree network with n nodes where the internal nodes correspond to servers and the terminal nodes correspond to clients. The nodes are numbered from 1 to n. Among the servers, there is an original server S which provides VOD (Video On Demand) service. To ensure the quality of service for the clients, the distance from each client to the VOD server S should not exceed a certain value k. The distance from a node u to a node v in the tree is defined to be the number of edges on the path from u to v. If there is a nonempty subset C of clients such that the distance from each u in C to S is greater than k, then replicas of the VOD system have to be placed in some servers so that the distance from each client to the nearest VOD server (the original VOD system or one of its replicas) is k or less.

Given a tree network, a server S which has the VOD system, and a positive integer k, find the minimum number of replicas necessary so that each client is within distance k of the nearest server holding the original VOD system or a replica.

For example, consider the following tree network.

(Figure: the example tree network from the original problem statement; image omitted.)

In the above tree, the set of clients is {1, 6, 7, 8, 9, 10, 11, 13}, the set of servers is {2, 3, 4, 5, 12, 14}, and the original VOD server is located at node 12.

For k = 2, the quality of service is not guaranteed with one VOD server at node 12, because the clients in {6, 7, 8, 9, 10} are at a distance greater than k from it. Therefore, we need one or more replicas. When one replica is placed at node 4, the distance from each client to the nearest server in {12, 4} is less than or equal to 2. The minimum number of replicas needed in this example is one.
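To make the k = 2 example concrete, here is a small illustrative checker (not part of the solution below) that builds the sample tree from the edge list in the sample input and runs a multi-source BFS from the two servers {12, 4}; it confirms that every client is within distance 2. The edge list and client set are taken from the sample; everything else in the snippet is only for illustration.

#include <cstdio>
#include <queue>
#include <vector>
using namespace std;

// Illustrative check only: with VOD servers at nodes 12 and 4 and k = 2,
// verify that every client of the example tree is within distance k.
int main() {
    const int n = 14, k = 2;
    int edges[13][2] = {{1,2},{2,3},{3,4},{4,5},{5,6},{7,5},{8,5},
                        {4,9},{10,3},{2,12},{12,14},{13,14},{14,11}};
    vector<vector<int>> g(n + 1);
    for (auto &e : edges) {
        g[e[0]].push_back(e[1]);
        g[e[1]].push_back(e[0]);
    }

    // Multi-source BFS: dist[u] = distance from u to the nearest server.
    vector<int> dist(n + 1, -1);
    queue<int> q;
    for (int srv : {12, 4}) { dist[srv] = 0; q.push(srv); }
    while (!q.empty()) {
        int u = q.front(); q.pop();
        for (int w : g[u])
            if (dist[w] == -1) { dist[w] = dist[u] + 1; q.push(w); }
    }

    for (int c : {1, 6, 7, 8, 9, 10, 11, 13})   // the clients
        printf("client %d: distance %d (%s)\n", c, dist[c],
               dist[c] <= k ? "ok" : "too far");
    return 0;
}

Running it prints a distance of at most 2 for all eight clients, matching the claim that a single replica at node 4 suffices.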

Input 

Your program is to read the input from standard input. The input consists of T test cases. The number of test cases (T) is given in the first line of the input. The first line of each test case contains an integer n (3 ≤ n ≤ 1,000), the number of nodes of the tree network. The next line contains two integers s (1 ≤ s ≤ n) and k (k ≥ 1), where s is the VOD server and k is the distance value for ensuring the quality of service. Each of the following n - 1 lines contains a pair of nodes which represents an edge of the tree network.

Output 

Your program is to write to standard output. Print exactly one line for each test case. The line should contain an integer that is the minimum number of replicas needed.

Sample Input 

2
14
12 2 
1 2 
2 3 
3 4 
4 5 
5 6 
7 5 
8 5 
4 9 
10 3 
2 12 
12 14 
13 14 
14 11 
14 
3 4 
1 2 
2 3 
3 4 
4 5 
5 6 
7 5 
8 5 
4 9 
10 3 
2 12 
12 14 
13 14 
14 11

Sample Output 

1 
0

Problem summary: given the tree, place the minimum number of replica servers so that every client is within distance k of some VOD server.

Approach: greedy. Process the client leaves from deepest to shallowest; whenever a leaf is not yet covered, place a replica at its k-th ancestor. That ancestor still covers the leaf, and because deeper leaves are handled first, placing the server as high as possible reaches the widest region of still-uncovered nodes, so this choice is never worse than any other feasible placement.

Code:

#include <stdio.h>
#include <string.h>
#include <vector>
using namespace std;
const int N = 1005;

int T, n, s, k;
int vis[N];          // vis[u] = 1 if u is already within distance k of some VOD server
int f[N];            // f[u] = parent of u in the tree rooted at the original server s
int vi[N];           // temporary "visited" marks for the current DFS
vector<int> g[N];    // adjacency lists
vector<int> node[N]; // node[d] = client leaves whose depth from s is exactly d

// Root the tree at s: record parents, bucket the leaves by depth,
// and mark everything within distance k of s as already covered.
void dfs(int u, int d) {
    if (d <= k) vis[u] = 1;
    vi[u] = 1;
    if (g[u].size() == 1) node[d].push_back(u);   // degree-1 node = client leaf
    for (int i = 0; i < (int)g[u].size(); i++) {
        int w = g[u][i];
        if (!vi[w]) {
            f[w] = u;
            dfs(w, d + 1);
        }
    }
}

// Mark every node within distance k of u as covered.
void dfs2(int u, int d) {
    if (d > k) return;
    vis[u] = 1;
    vi[u] = 1;
    for (int i = 0; i < (int)g[u].size(); i++)
        if (!vi[g[u][i]]) dfs2(g[u][i], d + 1);
}

void init() {
    memset(f, 0, sizeof(f));
    memset(vi, 0, sizeof(vi));
    memset(vis, 0, sizeof(vis));
    for (int i = 0; i < N; i++) {   // vectors must be cleared, not memset
        g[i].clear();
        node[i].clear();
    }
    scanf("%d%d%d", &n, &s, &k);
    int a, b;
    for (int i = 0; i < n - 1; i++) {
        scanf("%d%d", &a, &b);
        g[a].push_back(b);
        g[b].push_back(a);
    }
    dfs(s, 0);
}

// Greedy: take uncovered leaves from deepest to shallowest; for each one,
// place a replica at its k-th ancestor and mark the covered neighborhood.
// Worst case O(n^2), which is fine for n <= 1000.
int solve() {
    int ans = 0;
    for (int d = n - 1; d > k; d--) {
        for (int i = 0; i < (int)node[d].size(); i++) {
            int u = node[d][i];
            if (vis[u]) continue;                   // already served
            int v = u;
            for (int j = 0; j < k; j++) v = f[v];   // climb to the k-th ancestor
            memset(vi, 0, sizeof(vi));
            dfs2(v, 0);                             // new replica at v
            ans++;
        }
    }
    return ans;
}

int main() {
    scanf("%d", &T);
    while (T--) {
        init();
        printf("%d\n", solve());
    }
    return 0;
}
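To try the solution on the sample above, compile it with any standard C++ compiler and feed the sample input on standard input, for example g++ -O2 main.cpp && ./a.out < sample.txt (the file names here are only placeholders); it should print 1 and 0, matching the sample output.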

