I'm using this hash function: balanceLoad = lambda x: bisect.bisect_left(boundary_array, -keyfunc(x))
where boundary_array is [-64, -10, 35].
It tells me which partition each element should be assigned to.
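To make the partition assignment concrete, here is a minimal plain-Python sketch with no Spark involved. The names `balance_load` and `partitions` are only for illustration, and `keyfunc` is assumed to be the identity, as in my example.

```python
import bisect

# Boundaries from the question; they are compared against the negated key.
boundary_array = [-64, -10, 35]

def balance_load(key, keyfunc=lambda k: k):
    # Negating the key sends larger keys to lower-numbered partitions.
    return bisect.bisect_left(boundary_array, -keyfunc(key))

# Same keys as in the example RDD: even numbers stay positive, odd ones flip sign.
keys = [x * ((-1) ** x) for x in range(100)]

# Group the keys by the partition index they are assigned to.
partitions = {}
for k in keys:
    partitions.setdefault(balance_load(k), []).append(k)

for index in sorted(partitions):
    print(index, min(partitions[index]), max(partitions[index]))
```

Three boundaries give four partitions, and the partition index decides only where an element goes, not how the elements are ordered once they are there.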
But is there a way to determine or control how the elements are ordered within each partition? That is, {1,2,3} versus {3,2,1}.
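For example, when I run:
rdd = CleanRDD(sc.parallelize(range(100), 4).map(lambda x: (x * ((-1) ** x), x)))
sortByKey(rdd, keyfunc=lambda key: key, ascending=False).collect()
the partitions themselves come out in descending order, but the elements inside each partition are in reverse (ascending) order: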
[(64,64),
(66,66),
(68,68),
(70,70),
(72,72),
(74,74),
(76,76),
(78,78),
(80,80),
(82,82),
(84,84),
(86,86),
(88,88),
(90,90),
(92,92),
(94,94),
(96,96),
(98,98),
(10,10),
(12,12),
(14,14),
(16,16),
(18,18),
(20,20),
(22,22),
(24,24),
(26,26),
(28,28),
(30,30),
(32,32),
(34,34),
(36,36),
(38,38),
(40,40),
(42,42),
(44,44),
(46,46),
(48,48),
(50,50),
(52,52),
(54,54),
(56,56),
(58,58),
(60,60),
(62,62),
(-35,35),
(-33,33),
(-31,31),
(-29,29),
(-27,27),
(-25,25),
(-23,23),
(-21,21),
(-19,19),
(-17,17),
(-15,15),
(-13,13),
(-11,11),
(-9,9),
(-7,7),
(-5,5),
(-3,3),
(-1,1),
(0,0),
(2,2),
(4,4),
(6,6),
(8,8),
(-99,99),
(-97,97),
(-95,95),
(-93,93),
(-91,91),
(-89,89),
(-87,87),
(-85,85),
(-83,83),
(-81,81),
(-79,79),
(-77,77),
(-75,75),
(-73,73),
(-71,71),
(-69,69),
(-67,67),
(-65,65),
(-63,63),
(-61,61),
(-59,59),
(-57,57),
(-55,55),
(-53,53),
(-51,51),
(-49,49),
(-47,47),
(-45,45),
(-43,43),
(-41,41),
(-39,39),
(-37,37)]
Note that the elements within each of the four groups are in reverse order.
How can I correct this?
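For reference, this is the per-partition behavior I am after, sketched in plain Python. `sort_partition` is a hypothetical helper, not part of my sortByKey or of Spark. I assume something like it could be applied to each partition (for instance through `mapPartitions`), but I have not confirmed that this is where the ordering goes wrong.

```python
def sort_partition(pairs, keyfunc=lambda k: k, ascending=True):
    # Sort the (key, value) pairs of a single partition by key.
    # reverse=not ascending flips the order when a descending sort is requested.
    return sorted(pairs, key=lambda kv: keyfunc(kv[0]), reverse=not ascending)

# A few elements from the third group of the output above.
partition = [(-35, 35), (-1, 1), (0, 0), (8, 8)]

print(sort_partition(partition, ascending=False))
```

With `ascending=False` the small partition above comes back as `[(8, 8), (0, 0), (-1, 1), (-35, 35)]`, which is the order I expected inside each group.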