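I have a script with two classes. (Obviously, I've stripped out a lot of material that I don't think is relevant to the error I'm dealing with.) The eventual task is to create a decision tree, as I mentioned in this question.

Unfortunately, I'm getting an infinite loop, and I'm having trouble identifying why. I've found the line of code that goes wrong, but I would have thought that the iterator and the list I'm appending to would be different objects. Does list.append have some side effect that I'm not aware of? Or am I making some other obvious mistake?

class Dataset: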
    individuals = []  # Becomes a list of dictionaries, in which each dictionary is a row from the CSV with the headers as keys
    def field_set(self):  # Returns a list of the fields in individuals[] that can be used to split the data (i.e. have more than one value amongst the individuals)
    def classified(self, predicted_value):  # Returns True if all the individuals have the same value for predicted_value
    def fields_exhausted(self, predicted_value):  # Returns True if all the individuals are identical except for predicted_value
    def lowest_entropy_value(self, predicted_value):  # Returns the field that will reduce entropy the most
    def __init__(self, individuals=[]):
and
^{pr2}$
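
For readers unfamiliar with what lowest_entropy_value is described as doing, here is a minimal sketch of picking the field whose split leaves the least entropy. It is not the code from the script: the function names entropy, split_entropy and lowest_entropy_field, and the sample rows, are hypothetical, and it assumes each individual is a dictionary keyed by the CSV headers.

```python
import math
from collections import Counter, defaultdict


def entropy(rows, target):
    """Shannon entropy (in bits) of the target column over rows."""
    counts = Counter(row[target] for row in rows)
    total = len(rows)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def split_entropy(rows, field, target):
    """Weighted average entropy of target after partitioning rows by field."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[field]].append(row)
    return sum(len(g) / len(rows) * entropy(g, target) for g in groups.values())


def lowest_entropy_field(rows, target):
    """Field (other than target) whose split leaves the least entropy."""
    fields = [f for f in rows[0] if f != target]
    return min(fields, key=lambda f: split_entropy(rows, f, target))


# Hypothetical sample data: "colour" separates the classes perfectly, "size" does not.
rows = [
    {"colour": "red", "size": "big", "# class": "a"},
    {"colour": "red", "size": "small", "# class": "a"},
    {"colour": "blue", "size": "big", "# class": "b"},
    {"colour": "blue", "size": "small", "# class": "b"},
]
best = lowest_entropy_field(rows, "# class")
```

In this sample, splitting on "colour" leaves two pure groups, so it is the field selected.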
My initialization code is currently:

import sys

if __name__ == '__main__':
    filename = sys.argv[1]  # Takes in a CSV file
    predicted_value = "# class"  # Identifies the field from the CSV file that should be predicted
    base_dataset = parse_csv(filename)  # Turns the CSV file into a list of lists
    parsed_dataset = individual_list(base_dataset)  # Turns the list of lists into a list of dictionaries
    root = Node(0, Dataset(parsed_dataset))  # Creates a root node, passing it the full dataset
    root.split_dataset(root.ds.lowest_entropy_value(predicted_value))  # Performs the first split, creating multiple subnodes
    n = root.links[0]
    n.split_dataset(n.ds.lowest_entropy_value(predicted_value))  # Attempts to split the first subnode.
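
On the question of whether the iterator and the list being appended to are really different objects: list.append itself has no hidden side effect, but a list declared at class level (like individuals = [] above) and a mutable default argument (like individuals=[]) are each created once and then shared. The sketch below illustrates that behaviour. It is only a possible explanation, assuming the omitted Node code iterates over one Dataset's individuals while appending to another's; SharedDataset, OwnDataset and make_list are hypothetical names, not part of the script.

```python
class SharedDataset:
    individuals = []  # class attribute: one list object shared by every instance


class OwnDataset:
    def __init__(self, individuals=None):
        # None as the default means no single default list is reused across calls;
        # each instance gets its own copy.
        self.individuals = list(individuals) if individuals is not None else []


def make_list(items=[]):
    # The default list is built once, when the function is defined.
    return items


a, b = SharedDataset(), SharedDataset()
a.individuals.append({"id": 1})
shared = a.individuals is b.individuals  # both names refer to the class-level list

same_default = make_list() is make_list()  # the same default object each call

c, d = OwnDataset([{"id": 1}]), OwnDataset()
d.individuals.append({"id": 2})
separate = c.individuals is not d.individuals

# Appending to a list while iterating over that same list never terminates,
# because the iterator keeps finding new items. The loop is capped for the demo.
steps = 0
for row in a.individuals:
    b.individuals.append(row)  # b.individuals is the very list being iterated
    steps += 1
    if steps >= 5:
        break
```

If this is the cause, assigning self.individuals inside __init__ (as OwnDataset does) gives every Dataset its own list, so the loop would iterate over one list while appending to another.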