2022 Hung-yi Lee Machine Learning HW1

梦想的小鱼

Last modified 2022-10-29 21:07:59

First published 2022-10-27 12:21:57

Article link: https://blog.csdn.net/snajdansa/article/details/127549669

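This post documents in detail how the author completed Homework 1 of Hung-yi Lee's 2022 machine learning course, covering data processing, the baseline code, model improvements, and possible directions for optimization. The baseline code section covers setting the random seed, splitting the dataset, feature selection, and model training. The improvements section discusses feature selection, neural network model optimization (such as adding BN layers), and hyperparameter tuning. Although the result reaches the strong baseline, the author also raises questions for further study about the overly high test-set accuracy, the optimization method, and the training time.

Summary generated by CSDN using AI technology.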

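3.1 How to tell whether a feature is useful, and how to exclude useless features

3.2 Neural network model design (adding BN layers, adding hidden layers)

3.3 Hyperparameter design (number of neurons per layer, batch_size, learning rate for parameter updates, weight decay coefficient, number of training epochs)

3.4 Changing the optimizer

4. Directions for further improvement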

Machine Learning HW1 COVID-19 Cases Prediction

1. Task

Given survey results from the past 5 days in a specific U.S. state, predict the percentage of new tested positive cases on the 5th day.

Data

Results

Passed the strong baseline on every score; the public score is about 0.01 off the boss baseline.

2. Baseline code

Random seed setup (including the GPU)

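import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import Dataset, random_split
from tqdm import tqdm

def same_seed(seed):
    '''Fixes random number generator seeds for reproducibility.'''
    # Ask cuDNN for deterministic algorithms and turn off its autotuner.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        # Seed every visible GPU as well.
        torch.cuda.manual_seed_all(seed)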
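
One way to check that the seeding behaves as intended is to seed twice with the same value and compare the random draws. The following is only a minimal sketch: the seed value 42 and the helper `draw` are illustrative assumptions and do not come from the post.

```python
import numpy as np
import torch


def same_seed(seed):
    '''Fixes random number generator seeds for reproducibility.'''
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)


def draw():
    '''Hypothetical helper: one NumPy draw and one CPU PyTorch draw.'''
    return np.random.rand(3), torch.rand(3)


same_seed(42)            # 42 is an arbitrary example seed
np_a, torch_a = draw()
same_seed(42)            # re-seeding restarts both generators
np_b, torch_b = draw()

print(np.array_equal(np_a, np_b), torch.equal(torch_a, torch_b))  # prints: True True
```

If the two draws ever differ, some source of randomness is not covered by the seeds set here.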

Splitting the dataset

def train_valid_split(data_set, valid_ratio, seed):
    '''Split provided training data into training set and validation set'''
    valid_set_size = int(valid_ratio * len(data_set))
    train_set_size = len(data_set) - valid_set_size
    train_set, valid_set = random_split(data_set, [train_set_size, valid_set_size],
                                        generator=torch.Generator().manual_seed(seed))
    return np.array(train_set), np.array(valid_set)
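
The function above converts the `Subset` objects returned by `random_split` back to arrays with `np.array`. As a sketch of an equivalent approach, assuming `data_set` is a 2-D NumPy array, the rows can also be selected through `Subset.indices`. The name `split_by_indices` and the toy data are hypothetical and not part of the post.

```python
import numpy as np
import torch
from torch.utils.data import random_split


def split_by_indices(data_set, valid_ratio, seed):
    '''Hypothetical variant of train_valid_split for a 2-D NumPy array.'''
    valid_set_size = int(valid_ratio * len(data_set))
    train_set_size = len(data_set) - valid_set_size
    train_subset, valid_subset = random_split(
        data_set, [train_set_size, valid_set_size],
        generator=torch.Generator().manual_seed(seed))
    # Subset.indices holds the shuffled row numbers assigned to each part.
    return data_set[train_subset.indices], data_set[valid_subset.indices]


# Toy data: 10 rows, 3 columns; the first column acts as a unique row id.
toy = np.arange(30, dtype=np.float32).reshape(10, 3)
train_data, valid_data = split_by_indices(toy, valid_ratio=0.2, seed=0)

print(train_data.shape, valid_data.shape)  # (8, 3) (2, 3)
```

Because the generator is seeded, calling the function again with the same seed yields the same partition.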

Prediction function

def predict(test_loader, model, device):
    model.eval()  # Set your model to evaluation mode.
    preds = []
    for x in tqdm(test_loader):
        x = x.to(device)
        with torch.no_grad():
            pred = model(x)
            preds.append(pred.detach().cpu())
    preds = torch.cat(preds, dim=0).numpy()
    return preds
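
To see what `predict` returns, it can be run on a toy model. This is a sketch under stated assumptions: the progress bar is left out so the snippet needs only PyTorch, and `toy_model`, `toy_features`, and `toy_loader` are made-up stand-ins for the trained model and the real test loader.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader


def predict(test_loader, model, device):
    '''Same logic as the function above, without the tqdm progress bar.'''
    model.eval()  # evaluation mode: no dropout, BatchNorm uses running stats
    preds = []
    for x in test_loader:
        x = x.to(device)
        with torch.no_grad():  # no gradients are needed for inference
            preds.append(model(x).detach().cpu())
    return torch.cat(preds, dim=0).numpy()


device = 'cuda' if torch.cuda.is_available() else 'cpu'
toy_model = nn.Linear(4, 1).to(device)   # stand-in for a trained model
toy_features = torch.randn(10, 4)        # 10 samples, 4 features
toy_loader = DataLoader(toy_features, batch_size=4, shuffle=False)

preds = predict(toy_loader, toy_model, device)
print(preds.shape)  # (10, 1)
```

The toy model does not squeeze its output, so the result keeps a trailing dimension of size 1; a model that squeezes in `forward` would give shape `(10,)` instead.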

Dataset definition

class COVID19Dataset(Dataset):
    '''
    x: Features.
    y: Targets, if none, do prediction.
    '''

    def __init__(self, x, y=None):
        if y is None:
            self.y = y
        else:
            self.y = torch.FloatTensor(y)
        self.x = torch.FloatTensor(x)

    def __getitem__(self, idx):
        if self.y is None:
            return self.x[idx]
        else:
            return self.x[idx], self.y[idx]

    def __len__(self):
        return len(self.x)
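
The two modes of the dataset (with and without targets) can be illustrated with a `DataLoader`. The sketch below repeats the class so it runs on its own; the random arrays `x_toy` and `y_toy` and the batch size of 4 are assumptions chosen only for illustration.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class COVID19Dataset(Dataset):
    '''x: Features. y: Targets, if none, do prediction.'''

    def __init__(self, x, y=None):
        self.y = None if y is None else torch.FloatTensor(y)
        self.x = torch.FloatTensor(x)

    def __getitem__(self, idx):
        if self.y is None:
            return self.x[idx]
        return self.x[idx], self.y[idx]

    def __len__(self):
        return len(self.x)


# Toy data: 6 samples with 4 features each, plus one target per sample.
x_toy = np.random.rand(6, 4).astype(np.float32)
y_toy = np.random.rand(6).astype(np.float32)

train_set = COVID19Dataset(x_toy, y_toy)   # training mode: (x, y) pairs
test_set = COVID19Dataset(x_toy)           # prediction mode: x only

x_batch, y_batch = next(iter(DataLoader(train_set, batch_size=4, shuffle=True)))
x_only = next(iter(DataLoader(test_set, batch_size=4, shuffle=False)))

print(x_batch.shape, y_batch.shape, x_only.shape)
```

With targets the loader yields a pair of tensors per batch; without them it yields a single feature tensor, which is what the prediction function expects.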

Model definition

class My_Model(nn.Module):
    def __init__(self, input_dim):
        super(My_Model, self).__init__()
        # TODO: modify model's structure, be aware of dimensions.
        self.layers = nn.Sequential(
            nn.Linear(input_dim, 16),
            nn.ReLU(),
            nn.Linear(16, 8),
            nn.ReLU(),
            nn.Linear(8, 1)
        )
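    def forward(self, x):
        x = self.layers(x)
        # The source is cut off at "x.squee"; squeeze(1) and the return are an assumed completion.
        x = x.squeeze(1)  # (B, 1) -> (B)
        return x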
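
The outline of this post lists adding BN layers as one of the model improvements, but the author's modified network is not visible in this excerpt. The following is therefore only a sketch of what such a variant might look like: the class name `BNModel`, the placement of `nn.BatchNorm1d` after each hidden linear layer, the reuse of the baseline layer widths (16 and 8), and `input_dim=5` are all assumptions.

```python
import torch
import torch.nn as nn


class BNModel(nn.Module):
    '''Hypothetical variant of My_Model with BatchNorm1d after each hidden layer.'''

    def __init__(self, input_dim):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(input_dim, 16),
            nn.BatchNorm1d(16),
            nn.ReLU(),
            nn.Linear(16, 8),
            nn.BatchNorm1d(8),
            nn.ReLU(),
            nn.Linear(8, 1),
        )

    def forward(self, x):
        # (B, 1) -> (B), so predictions line up with a 1-D target tensor.
        return self.layers(x).squeeze(1)


model = BNModel(input_dim=5)   # 5 features is an arbitrary example size
batch = torch.randn(4, 5)      # batch of 4 samples

model.train()                  # BatchNorm normalizes with batch statistics
train_out = model(batch)

model.eval()                   # BatchNorm switches to its running statistics
with torch.no_grad():
    eval_out = model(batch)

print(train_out.shape, eval_out.shape)
```

In training mode `BatchNorm1d` needs more than one sample per batch, which is worth keeping in mind when the last batch of an epoch can be of size 1.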
