Machine Learning Programming Exercise 1: Handling Missing Values

Link-ll

Published 2024-07-19 17:41:10

Tags: machine learning, artificial intelligence

Article link: https://blog.csdn.net/2301_80203671/article/details/140553815

Import the modules

import pandas as pd
import numpy as np
from sklearn.impute import SimpleImputer

Define the data

data = {
      'size': ['XL', 'L', 'M', np.nan, 'M', 'M'],
      'color': ['red', 'green', 'blue', 'green', 'red', 'green'],
      'gender': ['female', 'male', np.nan, 'female', 'female', 'male'],
      'price': [199.0, 89.0, np.nan, 129.0, 79.0, 89.0],
      'weight': [500, 450, 300, np.nan, 410, np.nan],
      'bought': ['yes', 'no', 'yes', 'no', 'yes', 'no']
}

Build a pandas.DataFrame and display it

df = pd.DataFrame(data)
df

Count the missing values in each column

df.isnull().sum()

Get the number of rows in the data

len(df)

Proportion of missing values in each column

df.isnull().sum() / len(df)
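
The same proportion can also be computed in one step, because the mean of a boolean mask equals the count of True values divided by the number of rows. The snippet below is a minimal sketch that rebuilds the post's data under the hypothetical name `df_demo` so it runs on its own.

```python
import numpy as np
import pandas as pd

# Same toy data as in the post, rebuilt so the snippet is self-contained.
# The name df_demo is only for illustration.
df_demo = pd.DataFrame({
    'size': ['XL', 'L', 'M', np.nan, 'M', 'M'],
    'color': ['red', 'green', 'blue', 'green', 'red', 'green'],
    'gender': ['female', 'male', np.nan, 'female', 'female', 'male'],
    'price': [199.0, 89.0, np.nan, 129.0, 79.0, 89.0],
    'weight': [500, 450, 300, np.nan, 410, np.nan],
    'bought': ['yes', 'no', 'yes', 'no', 'yes', 'no'],
})

# isnull() gives a True/False mask; averaging it per column yields
# the fraction of missing entries in that column.
missing_ratio = df_demo.isnull().mean()
print(missing_ratio)
```

Columns with a high ratio may be better dropped than imputed; where to draw that line depends on the dataset.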

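Fill missing values with the mean

Create an imputer object with two arguments: the first tells it which value counts as missing, and the second sets the fill strategy (here, the column mean).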

imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
df[["weight"]] = imputer.fit_transform(df[["weight"]])
df
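
To see what the mean strategy actually fills in, the fitted imputer can be inspected. This is a minimal sketch on the `weight` column alone; the names `weight_demo`, `mean_imputer` and `filled` are hypothetical, and the output is assumed to be a NumPy array (scikit-learn's default).

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# The weight column from the post, rebuilt so the snippet runs on its own.
weight_demo = pd.DataFrame({'weight': [500, 450, 300, np.nan, 410, np.nan]})

mean_imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
filled = np.asarray(mean_imputer.fit_transform(weight_demo[['weight']])).ravel()

# statistics_ holds one learned fill value per column. Here it is the
# mean of the four observed weights (500, 450, 300, 410).
print(mean_imputer.statistics_)
print(filled)
```

The mean is computed from the non-missing entries only, so the two missing weights receive the same value.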

Fill missing values with a constant

imputer = SimpleImputer(missing_values=np.nan, strategy='constant', fill_value=99.0)
df[["price"]] = imputer.fit_transform(df[["price"]])
df
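
For a single column, constant filling can also be done directly in pandas with `fillna`. The sketch below compares the two approaches on the `price` column; `price_demo`, `via_sklearn` and `via_pandas` are hypothetical names used only here.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# The price column from the post, rebuilt so the snippet runs on its own.
price_demo = pd.DataFrame({'price': [199.0, 89.0, np.nan, 129.0, 79.0, 89.0]})

const_imputer = SimpleImputer(missing_values=np.nan, strategy='constant', fill_value=99.0)
via_sklearn = np.asarray(const_imputer.fit_transform(price_demo[['price']])).ravel()

# pandas equivalent: replace every NaN in the column with 99.0.
via_pandas = price_demo['price'].fillna(99.0).to_numpy()

print(np.array_equal(via_sklearn, via_pandas))
```

`SimpleImputer` is the more convenient choice when the same fill has to be reapplied to new data, since the fitted object can be reused with `transform`.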

Fill missing values with the most frequent value (typically used for non-numeric columns)

imputer = SimpleImputer(missing_values=np.nan, strategy='most_frequent')
df[["size"]] = imputer.fit_transform(df[["size"]])
df
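
The value chosen by the most-frequent strategy can be checked the same way. This is a minimal sketch on the `size` column; `size_demo`, `mode_imputer` and `filled_size` are hypothetical names, and the column is given an explicit `object` dtype as an assumption so that the result does not depend on how pandas infers string types.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# The size column from the post, stored explicitly as object dtype.
size_demo = pd.DataFrame({
    'size': pd.Series(['XL', 'L', 'M', np.nan, 'M', 'M'], dtype=object)
})

mode_imputer = SimpleImputer(missing_values=np.nan, strategy='most_frequent')
filled_size = np.asarray(mode_imputer.fit_transform(size_demo[['size']])).ravel()

# statistics_ holds the most frequent observed value of the column.
print(mode_imputer.statistics_)
print(filled_size)
```

Note that the steps in this exercise fill `weight`, `price` and `size`; the `gender` column still contains one missing value and could be handled with the same most-frequent strategy.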
