Data Mining (UVA 1591): Problem Statement and Solution

BEconfidence

Category: UVA

Article link: https://blog.csdn.net/a197p/article/details/42322543

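#include <iostream>
using namespace std;

int main()
{
    // n: record count N, x: SP (bytes per record in P), y: SQ (bytes per record in Q)
    long long n, x, y;
    while (cin >> n >> x >> y) {
        long long ansK = (n * y) << 10;  // loose upper bound on the answer
        long long ansA = 0, ansB = 0;
        long long last = (n - 1) * x;    // Pofs(n - 1), offset of the last record in P
        for (long long A = 0; A < 32; A++) {
            for (long long B = 0; B < 32; B++) {
                // K = Qofs'(n - 1) + SQ: end of the last record under formula (4)
                long long K = ((last + (last << A)) >> B) + y;
                // Feasible only if K >= N * SQ; the strict '<' keeps the
                // smallest A, then the smallest B, among equal K
                if (K >= n * y && K < ansK) {
                    ansA = A;
                    ansB = B;
                    ansK = K;
                }
            }
        }
        cout << ansK << " " << ansA << " " << ansB << "\n";
    }
    return 0;
}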

Dr. Tuple is working on the new data-mining application for Advanced Commercial Merchandise Inc.

One of the subroutines for this application works with two arrays P and Q, each containing N records of data (records are numbered from 0 to N - 1).

Array P contains a hash-like structure with keys.

Array P is used to locate a record for processing, and the data for the corresponding record is later retrieved from array Q.

All records in array P have a size of SP bytes, and records in array Q have a size of SQ bytes.

Dr. Tuple needs to implement this subroutine with the highest possible performance because it is a hot spot of the whole data-mining application.

However, SP and SQ are only known at run time, which complicates or rules out certain well-known compile-time optimizations.

The straightforward way to find the byte offset of the i-th record in array P is to use the following formula:

Pofs(i) = SP * i    (1)

and likewise for array Q:

Qofs(i) = SQ * i    (2)

However, multiplication is much slower than addition or subtraction on modern processors. Dr. Tuple avoids multiplication while scanning array P by keeping the computed byte offset Pofs(i) of the i-th record, instead of its index i, in all other data structures of the data-mining application.

(I was thoroughly lost by this point, but let's push on to the end of the statement.)

He uses the following simple formulae when he needs to compute the byte offset of the record that precedes or follows the i-th record in array P:

Pofs(i + 1) = Pofs(i) + SP
Pofs(i - 1) = Pofs(i) - SP
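As a minimal sketch of the scanning idea above, the offsets of consecutive records in P can be produced with additions only. The helper name `scan_offsets` is hypothetical and not part of the problem statement.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the problem statement): byte offsets of
// records 0..n-1 in array P, produced with additions only.
std::vector<std::uint64_t> scan_offsets(std::uint64_t n, std::uint64_t sp) {
    std::vector<std::uint64_t> offsets;
    offsets.reserve(n);
    std::uint64_t pofs = 0;          // Pofs(0) = 0
    for (std::uint64_t i = 0; i < n; ++i) {
        offsets.push_back(pofs);
        pofs += sp;                  // Pofs(i + 1) = Pofs(i) + SP
    }
    return offsets;
}
```

For example, with SP = 3 the first four offsets come out as 0, 3, 6, 9, matching formula (1) without a single multiplication.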

Whenever a record from array P is located, either by scanning the array or by taking Pofs(i) from other data structures, Dr. Tuple needs to retrieve information from the corresponding record in array Q.

To access the record in array Q, its byte offset Qofs(i) needs to be computed. One can immediately derive a formula to compute Qofs(i) from a known Pofs(i) using formulae (1) and (2):

Qofs(i) = Pofs(i) / SP * SQ    (3)

Unfortunately, this formula contains not only a multiplication but also a division.

Even though only integer division is required here, it is still an order of magnitude slower than multiplication on modern processors.
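Formula (3) can be sketched directly as below. The function name `qofs_exact` is an assumption made for illustration, and the sketch assumes `pofs` is an exact multiple of `sp`, as it always is when it comes from formula (1).

```cpp
#include <cstdint>

// Formula (3): recover the record index with an integer division,
// then scale it by SQ. pofs is assumed to be an exact multiple of sp.
std::uint64_t qofs_exact(std::uint64_t pofs, std::uint64_t sp, std::uint64_t sq) {
    return pofs / sp * sq;
}
```

With SP = 3 and SQ = 5, record 19 sits at Pofs = 57 in P, and the formula places it at 57 / 3 * 5 = 95 in Q.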

If coded this way, its computation is going to consume most of the CPU time in the data-mining application for ACM Inc.

After some research, Dr. Tuple has discovered that he can replace formula (3) with the following fast formula:

Qofs'(i) = (Pofs(i) + (Pofs(i) << A)) >> B    (4)

where A and B are non-negative integers, "<< A" is a left shift by A bits (equivalent to integer multiplication by 2^A), and ">> B" is a right shift by B bits (equivalent to integer division by 2^B).

This formula is an order of magnitude faster to compute than (3), but in general it cannot produce the same result as (3), regardless of the choice of values for A and B. It can still be used if one is willing to sacrifice some extra memory.
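A small sketch of formula (4) makes the trade-off concrete. The function name `qofs_fast` is hypothetical; the shift-and-add expression is the one from the statement, with the inner shift parenthesized explicitly.

```cpp
#include <cstdint>

// Formula (4): one left shift, one addition, one right shift.
std::uint64_t qofs_fast(std::uint64_t pofs, unsigned a, unsigned b) {
    return (pofs + (pofs << a)) >> b;
}
```

Take SP = 3, SQ = 5, A = 0, B = 0. Record i has Pofs(i) = 3i, so the fast formula places it at 6i instead of the exact 5i. Every record then gets 6 bytes of room for its 5 bytes of data: nothing overlaps, but the last of 20 records starts at 114 and the array needs 114 + 5 = 119 bytes instead of 100.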

The conventional layout of array Q in memory (using formula (2)) requires N * SQ bytes to store the entire array.

Dr. Tuple has found that one can always choose a K such that, if he allocates K bytes of memory for array Q (where K ≥ N * SQ) and carefully selects values for A and B, the fast formula (4) will give non-overlapping storage locations for each of the N records of array Q.

Your task is to write a program that finds the minimal possible amount of memory K that needs to be allocated for array Q when formula (4) is used.

The corresponding values for A and B are also to be found. If multiple pairs of values for A and B give the same minimal amount of memory K, then the pair where A is minimal has to be found, and if there are still several possibilities, the one where B is minimal.

You shall assume that the integer registers used to compute formula (4) are wide enough that overflow will never occur.

Input

Input consists of several datasets. Each dataset consists of three integer numbers N, SP, and SQ separated by spaces (1 ≤ N ≤ 2^20, 1 ≤ SP ≤ 2^10, 1 ≤ SQ ≤ 2^10).

Output

For each dataset, write to the output a single line with three integer numbers K, A, and B separated by spaces.

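Even after working through the whole statement, I still did not really understand it, so I searched and found a blog post that explains it as follows:

The following is reposted: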

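First, read the statement patiently. The problem is not especially complicated; it just comes with some technical background.

Next, from a practical point of view, neither A nor B should be very large. Looking at the data limits, Pofs can reach 2^20 * 2^10 = 2^30, so with 64-bit integers we can shift left by thirty-odd bits at most. This is where the problem starts to feel off: the setter really should have stated the range of A and B. Is it a deliberate trap that leaves ACMers to guess? Traps like this are not unheard of. At this year's Shanghai invitational, the problem that asked for an equation to be printed never said that a coefficient of 1 must be omitted (for example, 1x+3y-1z-4 should be printed as x+3y-z-4) and left contestants to infer it from common sense, which cost our team dearly.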
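The width argument above can be checked at compile time. This is only a sketch: the constant names are made up for illustration, the limits come from the statement, and the maximum shift of 31 is an assumption rather than something the statement specifies.

```cpp
#include <cstdint>

// Limits taken from the statement: N <= 2^20, SP <= 2^10.
// The largest offset in P is (N - 1) * SP, which is below 2^30.
constexpr std::uint64_t kMaxPofs = ((std::uint64_t{1} << 20) - 1) << 10;

// Assumed upper limit on the shift amount A (not given in the statement).
constexpr unsigned kMaxShift = 31;

// Largest intermediate value of formula (4) before the right shift.
constexpr std::uint64_t kWorst = kMaxPofs + (kMaxPofs << kMaxShift);

// Below 2^62, so it fits in a signed 64-bit integer with room to spare.
static_assert(kWorst < (std::uint64_t{1} << 62), "intermediate fits in 64 bits");
```

Shifting by 33 or more could push a value near 2^30 up to 2^63, which no longer fits in a signed 64-bit integer, so shifts in the low thirties are the practical limit.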

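Rather than guessing, it is better to dig up the test data. Looking through the inputs and outputs in the NEERC 2003 data shows that all A and B values do indeed fall within 0 to 31.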

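Solution
After that the problem becomes almost silly: just brute-force every combination of A and B, check whether each one is feasible, and pick the best.

(Let there be n elements, with each record of P taking x bytes and each record of Q taking y bytes.)

How do we decide whether a combination is feasible? If Q were stored contiguously it would need at least n*y bytes, so any (A, B) choice whose computed size is smaller than that is infeasible. Otherwise, the smaller the memory, the better the solution, and this is the criterion used to update the best answer ansA, ansB, ansN.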
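The size test above looks only at the last record, so it is worth checking that it really rules out overlaps. One way to see it, for n > 1: write r = x * (1 + 2^A) / 2^B, so that record i lands at floor(i * r). If the total size is at least n*y, then floor((n-1) * r) ≥ (n-1) * y, which forces r ≥ y, and then floor((i+1) * r) ≥ floor(i * r) + floor(r) ≥ floor(i * r) + y for every i. The sketch below compares the shortcut with a direct scan; all function names are hypothetical and n ≥ 1 is assumed.

```cpp
#include <cstdint>

// Formula (4) applied to Pofs(i) = i * x.
std::uint64_t qofs_fast(std::uint64_t i, std::uint64_t x, unsigned a, unsigned b) {
    const std::uint64_t pofs = i * x;
    return (pofs + (pofs << a)) >> b;
}

// Shortcut from the text: compare the total size K with n * y.
bool feasible_by_size(std::uint64_t n, std::uint64_t x, std::uint64_t y,
                      unsigned a, unsigned b) {
    return qofs_fast(n - 1, x, a, b) + y >= n * y;
}

// Direct check: every record must start at least y bytes after the previous one.
bool feasible_by_scan(std::uint64_t n, std::uint64_t x, std::uint64_t y,
                      unsigned a, unsigned b) {
    for (std::uint64_t i = 0; i + 1 < n; ++i) {
        if (qofs_fast(i + 1, x, a, b) - qofs_fast(i, x, a, b) < y) return false;
    }
    return true;
}
```

For n = 20, x = 3, y = 5, both checks accept A = 0, B = 0 (records 6 bytes apart) and both reject A = 0, B = 1 (records only 3 bytes apart). The direct scan costs O(n) per pair, which is why the one-line size test is the one used in the brute-force search.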
