Merging k timestamp-sorted files of n records each into one sorted file: implementation and time complexity

bingbingYang_88

Last modified 2024-02-01 17:56:35

Category: Algorithms. Tags: algorithms, data structures

First published 2024-02-01 17:55:04

Article link: https://blog.csdn.net/weixin_30409927/article/details/135979637

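Problem: Systems today are typically distributed. There are k files, each containing n records,
and each file is already sorted by timestamp.
Merge the k files into a single file sorted by timestamp. Write code that implements this and consider its time complexity.

Note: ignore the file-reading step and use whatever data structures you are comfortable with.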

package com.example.springbootdemo2;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/**
 * @author evanYang
 * @since 2024/2/1 11:35 AM
 */
public class FIleDemo {
    public static void main(String[] args) {
        List<List<Integer>> files = new ArrayList<>();
        
        List<Integer> integers = Arrays.asList(2, 3, 5, 7, 9);
        files.add(integers);
        List<Integer> integers1 = Arrays.asList(1, 1, 1, 2, 2);
        files.add(integers1);
        List<Integer> integers2 = Arrays.asList(2, 4, 6, 8, 10);
        files.add(integers2);
        List<Integer> integers3 = fileMerge(files);
        List<Integer> integers4 = mergeFileDouble(files);
        // Both approaches should print the same merged, sorted list.
        System.out.println(integers4);
        System.out.println(integers3);
    
    }
    
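    /* Problem: given k files of n records each, where every file is already
    sorted by timestamp, merge them into a single file sorted by timestamp.
    Write the implementation and consider its time complexity.
    
    Note: file reading is ignored; each file is modeled as a sorted List<Integer>.
    This version uses a min-heap that holds at most one entry per file.
    */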
    public static List<Integer> fileMerge(List<List<Integer>> files) {
        List<Integer> mergeFile = new ArrayList<>();
        PriorityQueue<Node> minHeap = new PriorityQueue<>(Comparator.comparingInt(Node::getValue));
        // Seed the heap with the first element of each non-empty file
        for (int i = 0; i < files.size(); i++) {
            List<Integer> file = files.get(i);
            if (!file.isEmpty()) {
                Integer value = file.get(0);
                minHeap.offer(new Node(i, 0, value));
            }
        }
        // Repeatedly pop the smallest element, append it to the merged list,
        // and push the next element from the same file onto the heap
        while (!minHeap.isEmpty()) {
            Node node = minHeap.poll();
            mergeFile.add(node.getValue());
            int nextIndex = node.getIndex() + 1;
            List<Integer> file = files.get(node.getFileIndex());
            int size = file.size();
            if (nextIndex < size) {
                int nextValue = file.get(nextIndex);
                minHeap.offer(new Node(node.getFileIndex(), nextIndex, nextValue));
            }
        }
        
        return mergeFile;
    }
    
    static class Node {
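        private final int fileIndex; // which file this element came from
        private final int index;     // position of the element within that file
        private final int value;     // the timestamp itself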
        
        public Node(int fileIndex, int index, int value) {
            this.fileIndex = fileIndex;
            this.index = index;
            this.value = value;
        }
        
        public int getFileIndex() {
            return fileIndex;
        }
        
        public int getIndex() {
            return index;
        }
        
        public int getValue() {
            return value;
        }
    }
    
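    // Linear-scan alternative: keep one read pointer per file and, on each round,
    // scan all k pointers for the smallest current element.
    public static List<Integer> mergeFileDouble(List<List<Integer>> files) {
        List<Integer> mergeResult = new ArrayList<>();
        int[] points = new int[files.size()]; // points[i] = next unread index in file i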
        
        while (true) {
            int min = Integer.MAX_VALUE;
            int minFileIndex = -1;
            for (int i = 0; i < files.size(); i++) {
                List<Integer> file = files.get(i);
                int point = points[i];
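                // Accept the first candidate unconditionally so that a value equal to
                // Integer.MAX_VALUE is not silently dropped.
                if (point < file.size() && (minFileIndex == -1 || file.get(point) < min)) {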
                    min = file.get(point);
                    minFileIndex = i;
                }
            }
            
            if (minFileIndex == -1) {
                break;
            }
            mergeResult.add(min);
            points[minFileIndex]++;
        }
        return mergeResult;
    }
    
}

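This method (mergeFileDouble) keeps an array of pointers, one per file, recording the position of each file's current element. On every iteration we find the smallest element among those the pointers reference and append it to the merged list, then advance that file's pointer to its next element. The merge is complete once every pointer has moved past the end of its file.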

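The time complexity of this implementation is O(k²n), where k is the number of files and n is the average number of elements per file. Without a priority queue, every iteration scans all k files to find the minimum, and there are kn elements to output, so the total work is O(k · kn) = O(k²n), not O(kn).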
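To make the cost of the linear scan concrete, here is a minimal sketch that re-implements the same pointer-array merge with a counter added. The class name LinearScanCost, the method mergeCounting, and the inspections field are hypothetical names introduced only for this illustration; they are not part of the original code. With k files of n elements each, the loop runs kn emitting rounds plus one final round that finds nothing, so the counter should end at k(kn + 1).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of the original FIleDemo class.
final class LinearScanCost {
    /** Number of pointer inspections made by the most recent mergeCounting call. */
    static long inspections;

    /** Same linear-scan merge as mergeFileDouble, with an inspection counter added. */
    static List<Integer> mergeCounting(List<List<Integer>> files) {
        inspections = 0;
        List<Integer> result = new ArrayList<>();
        int[] points = new int[files.size()]; // next unread index per file
        while (true) {
            int minFileIndex = -1;
            for (int i = 0; i < files.size(); i++) {
                inspections++; // one inspection per file per round
                List<Integer> file = files.get(i);
                if (points[i] < file.size()
                        && (minFileIndex == -1
                            || file.get(points[i]) < files.get(minFileIndex).get(points[minFileIndex]))) {
                    minFileIndex = i;
                }
            }
            if (minFileIndex == -1) {
                break; // every pointer is past the end of its file
            }
            result.add(files.get(minFileIndex).get(points[minFileIndex]++));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> files = Arrays.asList(
                Arrays.asList(2, 3, 5, 7, 9),
                Arrays.asList(1, 1, 1, 2, 2),
                Arrays.asList(2, 4, 6, 8, 10));
        System.out.println(mergeCounting(files));
        // k = 3, n = 5: 16 rounds of 3 inspections each
        System.out.println("inspections = " + inspections); // inspections = 48
    }
}
```

For the sample input from main (k = 3, n = 5) the counter reaches 3 × 16 = 48, which grows as k²n rather than kn.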

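For the heap-based version (fileMerge): initializing the heap inserts the first element of each file with one offer call per file, which costs O(k log k) in the worst case.
During the merge, each element is added to and removed from the heap at most once, and each heap operation costs O(log k) because the heap never holds more than k entries. The merge phase therefore costs O(kn log k), where n is the average number of elements per file.
Appending the kn merged elements to the output list costs O(kn) in total.
Overall, the time complexity is O(kn log k).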
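One way to gain confidence in the heap-based merge is to compare it with a naive baseline that concatenates every file and sorts the result, which costs O(kn log(kn)). The sketch below does that on pseudo-random input. The class HeapMergeCheck and its methods heapMerge, sortAll, and randomFiles are hypothetical names chosen for this example, and the heap entries are plain int[] triples rather than the Node class used above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

// Hypothetical helper for illustration only; not part of the original FIleDemo class.
final class HeapMergeCheck {
    /** Heap-based k-way merge; each heap entry is {value, fileIndex, indexInFile}. */
    static List<Integer> heapMerge(List<List<Integer>> files) {
        PriorityQueue<int[]> heap = new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int i = 0; i < files.size(); i++) {
            if (!files.get(i).isEmpty()) {
                heap.offer(new int[] {files.get(i).get(0), i, 0});
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] top = heap.poll();          // smallest current element across all files
            out.add(top[0]);
            List<Integer> file = files.get(top[1]);
            int next = top[2] + 1;
            if (next < file.size()) {
                heap.offer(new int[] {file.get(next), top[1], next});
            }
        }
        return out;
    }

    /** Baseline: concatenate everything and sort, O(kn log(kn)). */
    static List<Integer> sortAll(List<List<Integer>> files) {
        List<Integer> all = new ArrayList<>();
        for (List<Integer> f : files) {
            all.addAll(f);
        }
        Collections.sort(all);
        return all;
    }

    /** Builds k sorted lists of n pseudo-random values each. */
    static List<List<Integer>> randomFiles(int k, int n, long seed) {
        Random rnd = new Random(seed);
        List<List<Integer>> files = new ArrayList<>();
        for (int i = 0; i < k; i++) {
            List<Integer> f = new ArrayList<>();
            for (int j = 0; j < n; j++) {
                f.add(rnd.nextInt(1000));
            }
            Collections.sort(f);
            files.add(f);
        }
        return files;
    }

    public static void main(String[] args) {
        List<List<Integer>> files = randomFiles(8, 100, 42L);
        System.out.println(heapMerge(files).equals(sortAll(files))); // true
    }
}
```

The baseline is simpler but ignores the fact that each input is already sorted; the heap version exploits that ordering and only ever keeps k entries in memory at once.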
