1093. Statistics from a Large Sample
Medium
You are given a large sample of integers in the range [0, 255]. Since the sample is so large, it is represented by an array count where count[k] is the number of times that k appears in the sample.
Calculate the following statistics:
- minimum: The minimum element in the sample.
- maximum: The maximum element in the sample.
- mean: The average of the sample, calculated as the total sum of all elements divided by the total number of elements.
- median:
  - If the sample has an odd number of elements, then the median is the middle element once the sample is sorted.
  - If the sample has an even number of elements, then the median is the average of the two middle elements once the sample is sorted.
- mode: The number that appears the most in the sample. It is guaranteed to be unique.
Return the statistics of the sample as an array of floating-point numbers [minimum, maximum, mean, median, mode]. Answers within 10^-5 of the actual answer will be accepted.
Example 1:
Input: count = [0,1,3,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0] Output: [1.00000,3.00000,2.37500,2.50000,3.00000] Explanation: The sample represented by count is [1,2,2,2,3,3,3,3]. The minimum and maximum are 1 and 3 respectively. The mean is (1+2+2+2+3+3+3+3) / 8 = 19 / 8 = 2.375. Since the size of the sample is even, the median is the average of the two middle elements 2 and 3, which is 2.5. The mode is 3 as it appears the most in the sample.
Example 2:
Input: count = [0,4,3,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0] Output: [1.00000,4.00000,2.18182,2.00000,1.00000] Explanation: The sample represented by count is [1,1,1,1,2,2,2,3,3,4,4]. The minimum and maximum are 1 and 4 respectively. The mean is (1+1+1+1+2+2+2+3+3+4+4) / 11 = 24 / 11 = 2.18181818... (for display purposes, the output shows the rounded number 2.18182). Since the size of the sample is odd, the median is the middle element 2. The mode is 1 as it appears the most in the sample.
Constraints:
- count.length == 256
- 0 <= count[i] <= 10^9
- 1 <= sum(count) <= 10^9
- The mode of the sample that count represents is unique.
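Problem: given a very large sample whose values lie in the range [0, 255]; because there are too many sample points, count[i] records the number of times i appears. Return the statistics.
Approach: at first I wanted to find the median with left and right pointers. But since 1 <= sum(count) <= 10^9, I instead count the total number of sample points first and then scan from left to right, updating each statistic along the way. One thing to note: when computing the mean, the product i * count[i] can reach 255 * 10^9, which overflows int, so the operands are converted to long double first. The code is as follows: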
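#include <numeric>
#include <vector>
using namespace std;

class Solution {
public:
    vector<double> sampleStats(vector<int>& count) {
        // sum(count) <= 10^9, so the total fits in an int.
        int total = accumulate(count.begin(), count.end(), 0);
        // res = [minimum, maximum, mean, median, mode]; -1 marks "not set yet".
        vector<double> res = {-1.0, 0.0, 0.0, -1.0, 0.0};
        int sum = 0;  // elements counted so far, updated only until the median is found
        for (int i = 0; i < (int)count.size(); i++) {
            if (count[i] == 0) continue;
            if (res[3] < 0) {
                // Exactly half of the sample lies before i: the two middle elements
                // are the previous non-zero value (still held in res[1]) and i.
                if (sum == total - sum) res[3] = (res[1] + (double)i) / 2;
                else {
                    sum += count[i];
                    // More than half of the sample is now covered, so the median is i.
                    if (sum > total - sum) res[3] = (double)i;
                }
            }
            if (res[0] < 0) res[0] = (double)i;  // first non-zero count is the minimum
            res[1] = (double)i;                  // latest non-zero count is the maximum
            // Multiply in long double: i * count[i] can reach 255 * 10^9, too large for int.
            res[2] += ((long double)i * (long double)count[i]) / ((long double)total);
            if (count[i] > count[int(res[4])]) res[4] = (double)i;
        }
        return res;
    }
};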
Time: O(1), space: O(1).
The code only iterates over count, which always has 256 entries, and it uses only a constant amount of extra space.