Sorting Problems: H-Index

伟大的车尔尼

Published on 2024-09-16 18:00:00

Article link: https://blog.csdn.net/stormsunshine/article/details/125573171

Contents

Problem
Solution 1
Solution 2

Problem

Title and Source

Title: H-Index

Source: 274. H-Index

Difficulty

Level 5

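Problem Description

Requirements

Given an integer array $\texttt{citations}$, where $\texttt{citations[i]}$ is the number of citations received by the researcher's $\texttt{i}$-th paper, compute and return the researcher's h-index.

By the definition of the h-index, a researcher has h-index $\texttt{h}$ if $\texttt{h}$ of their $\texttt{n}$ published papers have at least $\texttt{h}$ citations each, and the remaining $\texttt{n} - \texttt{h}$ papers have no more than $\texttt{h}$ citations each.

If there are several possible values of $\texttt{h}$, the largest one is taken as the h-index.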
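To make the definition concrete, here is a minimal brute-force sketch that tries candidate values of h from largest to smallest. The class name `HIndexByDefinition` is made up for illustration and is not part of the original problem. The sketch only tests whether at least h papers have at least h citations; for the largest such h, fewer than h + 1 papers have more than h citations, so the condition on the remaining papers holds automatically.

```java
class HIndexByDefinition {
    // Brute force straight from the definition: O(n^2) time, O(1) extra space.
    static int hIndex(int[] citations) {
        int n = citations.length;
        // Try the largest candidate first so the first match is the answer.
        for (int h = n; h > 0; h--) {
            int atLeastH = 0;
            for (int c : citations) {
                if (c >= h) {
                    atLeastH++;
                }
            }
            if (atLeastH >= h) {
                return h;
            }
        }
        // h = 0 always satisfies the definition.
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(hIndex(new int[]{3, 0, 6, 1, 5})); // prints 3
        System.out.println(hIndex(new int[]{1, 3, 1}));       // prints 1
    }
}
```

This is far slower than the two solutions in this article, but it is handy as a reference implementation when testing them.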

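Examples

Example 1:

Input: $\texttt{citations = [3,0,6,1,5]}$
Output: $\texttt{3}$
Explanation: $\texttt{[3,0,6,1,5]}$ means the researcher has $\texttt{5}$ papers in total, cited $\texttt{3, 0, 6, 1, 5}$ times respectively.
Since the researcher has $\texttt{3}$ papers with at least $\texttt{3}$ citations each, and the remaining $\texttt{2}$ papers have no more than $\texttt{3}$ citations each, the h-index is $\texttt{3}$.

Example 2:

Input: $\texttt{citations = [1,3,1]}$
Output: $\texttt{1}$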

Constraints

$\texttt{n} = \texttt{citations.length}$
$\texttt{1} \le \texttt{n} \le \texttt{5000}$
$\texttt{0} \le \texttt{citations[i]} \le \texttt{1000}$

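Solution 1

Approach and Algorithm

First sort the array $\textit{citations}$, then traverse it from largest to smallest to compute the h-index.

By the definition of the h-index, if the $j$ elements traversed so far are all greater than or equal to $j$, then the h-index is at least $j$. Because the traversal is in descending order, it suffices to check that the $j$-th largest element is at least $j$. The largest such $j$ is the h-index.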
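As a worked illustration (not part of the original write-up), take Example 1. Sorted in descending order, the citation counts are $6, 5, 3, 1, 0$, and the check on the $j$-th largest value proceeds as follows:

$$
\begin{array}{c|c|c}
j & j\text{-th largest value} & \text{value} \ge j\,? \\
\hline
1 & 6 & \text{yes} \\
2 & 5 & \text{yes} \\
3 & 3 & \text{yes} \\
4 & 1 & \text{no}
\end{array}
$$

The check first fails at $j = 4$, so the largest valid $j$ is $3$, which matches the expected output.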

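Code

import java.util.Arrays;

class Solution {
    public int hIndex(int[] citations) {
        // Sort ascending (in place), then walk from the largest element down.
        Arrays.sort(citations);
        int h = 0;
        // j is the number of papers visited so far; stop once the j-th largest has fewer than j citations.
        for (int i = citations.length - 1, j = 1; i >= 0 && citations[i] >= j; i--, j++) {
            h = j;
        }
        return h;
    }
}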

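Complexity Analysis

Time complexity: $O(n \log n)$, where $n$ is the length of the array $\textit{citations}$. Sorting takes $O(n \log n)$ time and the descending traversal takes $O(n)$ time, so the overall time complexity is $O(n \log n)$.
Space complexity: $O(\log n)$, where $n$ is the length of the array $\textit{citations}$. Sorting needs $O(\log n)$ space for the recursion call stack.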

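Solution 2

Approach and Algorithm

Solution 1 has a time complexity of $O(n \log n)$. Replacing the sort with counting lowers the time complexity.

By the definition of the h-index, the h-index cannot exceed the total number of papers, so changing every element of $\textit{citations}$ that is greater than the array length $n$ to $n$ does not change the h-index. To simplify the computation, the counting step restricts each paper's citation count to the range $[0, n]$: any citation count greater than $n$ is counted as $n$ citations.

After counting, traverse the counting array in reverse while maintaining $\textit{sum}$, the total number of papers traversed so far. On reaching index $i$ of the counting array, $\textit{sum}$ is the number of papers with at least $i$ citations. If $\textit{sum} \ge i$, then $i$ is a valid h-index value. Since the h-index should be the largest possible value, the first $i$ for which $\textit{sum} \ge i$ is the h-index.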
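To see what $\textit{sum}$ looks like at every index, the following sketch materializes all the running totals at once. The class and method names (`HIndexSuffixCounts`, `papersWithAtLeast`) are hypothetical and chosen only for this illustration; the solution itself does not need the full array, only the running total.

```java
import java.util.Arrays;

class HIndexSuffixCounts {
    // Returns an array whose entry i is the number of papers with at least
    // i citations, after capping every citation count at n.
    static int[] papersWithAtLeast(int[] citations) {
        int n = citations.length;
        int[] result = new int[n + 1];
        for (int c : citations) {
            result[Math.min(c, n)]++;      // first use the array as plain counts
        }
        for (int i = n - 1; i >= 0; i--) {
            result[i] += result[i + 1];    // then turn the counts into suffix sums
        }
        return result;
    }

    public static void main(String[] args) {
        // Example 1 from the problem statement.
        System.out.println(Arrays.toString(papersWithAtLeast(new int[]{3, 0, 6, 1, 5})));
        // prints [5, 4, 3, 3, 2, 2]
    }
}
```

For Example 1, scanning the result from index 5 downward gives 2 < 5, 2 < 4, and then 3 >= 3, so the first index that satisfies the condition is 3.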

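Code

class Solution {
    public int hIndex(int[] citations) {
        int n = citations.length;
        // counts[i] is the number of papers with i citations, capped at n.
        int[] counts = new int[n + 1];
        for (int citation : citations) {
            counts[Math.min(citation, n)]++;
        }
        // sum is the number of papers with at least i citations.
        int h = 0, sum = 0;
        for (int i = n; i >= 0; i--) {
            sum += counts[i];
            // The first i (from the top) with sum >= i is the largest valid h.
            if (sum >= i) {
                h = i;
                break;
            }
        }
        return h;
    }
}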

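Complexity Analysis

Time complexity: $O(n)$, where $n$ is the length of the array $\textit{citations}$. Counting over the array $\textit{citations}$ and traversing the counting array each take $O(n)$ time.
Space complexity: $O(n)$, where $n$ is the length of the array $\textit{citations}$. The counting array needs $O(n)$ space.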
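As a sanity check, the two approaches can be compared on random inputs. The sketch below is an assumption-laden test harness, not part of the original article: the class name `HIndexCrossCheck`, the method names, the array sizes, and the seed are all arbitrary choices. The sorting version works on a copy so that the input array is not modified.

```java
import java.util.Arrays;
import java.util.Random;

class HIndexCrossCheck {
    // Sorting approach, applied to a copy so the caller's array stays untouched.
    static int bySorting(int[] citations) {
        int[] sorted = citations.clone();
        Arrays.sort(sorted);
        int h = 0;
        for (int i = sorted.length - 1, j = 1; i >= 0 && sorted[i] >= j; i--, j++) {
            h = j;
        }
        return h;
    }

    // Counting approach with citation counts capped at n.
    static int byCounting(int[] citations) {
        int n = citations.length;
        int[] counts = new int[n + 1];
        for (int c : citations) {
            counts[Math.min(c, n)]++;
        }
        int sum = 0;
        for (int i = n; i >= 0; i--) {
            sum += counts[i];
            if (sum >= i) {
                return i;
            }
        }
        return 0; // not reached in practice: i = 0 always satisfies sum >= i
    }

    // Returns true if both approaches agree on `trials` random arrays.
    static boolean agree(int trials, long seed) {
        Random random = new Random(seed);
        for (int t = 0; t < trials; t++) {
            int n = 1 + random.nextInt(50);
            int[] citations = new int[n];
            for (int k = 0; k < n; k++) {
                // Values in [0, 2n] so that some exceed n and exercise the cap.
                citations[k] = random.nextInt(2 * n + 1);
            }
            if (bySorting(citations) != byCounting(citations)) {
                return false;
            }
        }
        return true;
    }
}
```

Calling `HIndexCrossCheck.agree(1000, 42L)` runs 1000 random trials with a fixed seed, so any disagreement would be reproducible.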