I have a list with millions of numbers which are always increasing to the end, I need to find and return numbers within a specified range e.g. numbers greater than X but less than Y, the numbers in the list can change and the values I'm searching for change as well
I have been using this method, please note this is a basic example the numbers are not uniform or the same as shown below in my program
l = [i for i in range(2000000)]
nums = []
for element in l:
if element > 950004:
break
if element > 950000:
nums.append(element)
#[950001, 950002, 950003, 950004]
Although fast, I kind of need it to be a bit faster for what my program is doing, the numbers change a lot so I'm wondering if there's a better way to do this with a pandas series or a numpy array? but so far all I've done is make an example in numpy:
a = numpy.array(l,dtype=numpy.int64)
Would a pandas series be more functional? Making use of query()? what would be the best way to approach this with an array as opposed to a python list of python objects
解决方案
Here is a solution using binary search. You are speaking of millions of numbers. Technically binary search will make the algorithm faster by decreasing the runtime complexity to O(log n) neglecting the final slicing step.
import bisect
l = [i for i in range(2000000)]
lower_bound = 950000
upper_bound = 950004
lower_bound_i = bisect.bisect_left(l, lower_bound)
upper_bound_i = bisect.bisect_right(l, upper_bound, lo=lower_bound_i)
nums = l[lower_bound_i:upper_bound_i]