Julia : DataFrame与replace、delete

https://stackoverflow.com/questions/34611109/julia-dataframe-replacing-missing-values

一、replace

Probably the easiest approach is to use replace! or replace from base Julia. Here is an example with replace!:

julia> using DataFrames

julia> df = DataFrame(x = [1, missing, 3])
3×1 DataFrame
│ Row │ x       │
│     │ Int64⍰  │
├─────┼─────────┤
│ 1   │ 1       │
│ 2   │ missing │
│ 3   │ 3       │

julia> replace!(df.x, missing => 0);

julia> df
3×1 DataFrame
│ Row │ x      │
│     │ Int64⍰ │
├─────┼────────┤
│ 1   │ 1      │
│ 2   │ 0      │
│ 3   │ 3      │

However, note that at this point the type of column x still allows missing values:

julia> typeof(df.x)
Array{Union{Missing, Int64},1}

This is also indicated by the question mark following Int64 in column x when the data frame is printed out. You can change this by using disallowmissing! (from the DataFrames.jl package):

julia> disallowmissing!(df, :x)
3×1 DataFrame
│ Row │ x     │
│     │ Int64 │
├─────┼───────┤
│ 1   │ 1     │
│ 2   │ 0     │
│ 3   │ 3     │

Alternatively, if you use replace (without the exclamation mark) as follows, then the output will already disallow missing values:

julia> df = DataFrame(x = [1, missing, 3]);

julia> df.x = replace(df.x, missing => 0);

julia> df
3×1 DataFrame
│ Row │ x     │
│     │ Int64 │
├─────┼───────┤
│ 1   │ 1     │
│ 2   │ 0     │
│ 3   │ 3     │

Base.ismissing with logical indexing

You can use ismissing with logical indexing to assign a new value to all missing entries of an array:

julia> df = DataFrame(x = [1, missing, 3]);

julia> df.x[ismissing.(df.x)] .= 0;

julia> df
3×1 DataFrame
│ Row │ x      │
│     │ Int64⍰ │
├─────┼────────┤
│ 1   │ 1      │
│ 2   │ 0      │
│ 3   │ 3      │

Base.coalesce

Another approach is to use coalesce:

julia> df = DataFrame(x = [1, missing, 3]);

julia> df.x = coalesce.(df.x, 0);

julia> df
3×1 DataFrame
│ Row │ x     │
│     │ Int64 │
├─────┼───────┤
│ 1   │ 1     │
│ 2   │ 0     │
│ 3   │ 3     │

DataFramesMeta

Both replace and coalesce can be used with the @transform macro from the DataFramesMeta.jl package:

julia> using DataFramesMeta

julia> df = DataFrame(x = [1, missing, 3]);

julia> @transform(df, x = replace(:x, missing => 0))
3×1 DataFrame
│ Row │ x     │
│     │ Int64 │
├─────┼───────┤
│ 1   │ 1     │
│ 2   │ 0     │
│ 3   │ 3     │

julia> df = DataFrame(x = [1, missing, 3]);

julia> @transform(df, x = coalesce.(:x, 0))
3×1 DataFrame
│ Row │ x     │
│     │ Int64 │
├─────┼───────┤
│ 1   │ 1     │
│ 2   │ 0     │
│ 3   │ 3     │

二、delete

删除某列:

julia> df = DataFrame(A = 1:4, B = ["M", "F", "F", "M"], C = 2:5)
4x3 DataFrame
|-------|---|-----|---|
| Row # | A | B   | C |
| 1     | 1 | "M" | 2 |
| 2     | 2 | "F" | 3 |
| 3     | 3 | "F" | 4 |
| 4     | 4 | "M" | 5 |

julia> delete!(df, :B)
4x2 DataFrame
|-------|---|---|
| Row # | A | C |
| 1     | 1 | 2 |
| 2     | 2 | 3 |
| 3     | 3 | 4 |
| 4     | 4 | 5 |


如果不知道列名的话:

df = df[:,[1:2,4:end]] # remove column 3
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值