Is there any method or function to fill missing multiple columns and rows for pivot table in python?
import pandas as pd
import numpy as np
from io import StringIO
csvfile = StringIO("""Date;Cat;Type;Value
01-Jan;AA;S;1
02-Jan;AA;F;2
02-Jan;BB;T;3
04-Jan;BB;T;3
05-Jan;CC;T;2
05-Jan;DD;T;1
05-Jan;BB;S;4
05-Jan;AA;S;2
05-Jan;DD;S;4""")
df = pd.read_csv(csvfile, sep = ';')
pt = pd.pivot_table(df, values = 'Value', index=['Cat', 'Type'], columns= ['Date'], aggfunc = np.sum, fill_value = 0)
pt
The above code result shows as below, the Type columns are missing values (F,S,T) for some Cat, 03-Jan column is missing:
Cat|Type|01-Jan|02-Jan|04-Jan|05-Jan|
---+----+------+------+------+------+
AA |F | | 2| | |
|S | 1| | | 2|
BB |S | | | | 4|
|T | | 3| 3| |
CC |T | | | | 2|
DD |S | | | | 4|
|T | | | | 1|
But the expected result want as:
Cat|Type|01-Jan|02-Jan|03-Jan|04-Jan|05-Jan|
---+----+------+------+------+------+------+
AA |F | | 2| | | |
|S | 1| | | | 2|
|T | | | | | |
BB |F | | | | | |
|S | | | | | 4|
|T | | 3| | 3| |
CC |F | | | | | |
|S | | | | | |
|T | | | | | 2|
DD |F | | | | | |
|S | | | | | 4|
|T | | | | | 1|
解决方案
Just convert df['Type'] to Categorical first:
df['Type'] = df['Type'].astype('category')
This forces Pandas to show every value with pivot_table. It's also good practice to trust Pandas conversion of strings such as 'sum' to optimised functions. Here's a demo:
df['Type'] = df['Type'].astype('category')
pt = pd.pivot_table(df, values='Value', index=['Cat', 'Type'],
columns='Date', aggfunc='sum', fill_value=0)
print(pt)
Date 01-Jan 02-Jan 04-Jan 05-Jan
Cat Type
AA F 0 2 0 0
S 1 0 0 2
T 0 0 0 0
BB F 0 0 0 0
S 0 0 0 4
T 0 3 3 0
CC F 0 0 0 0
S 0 0 0 0
T 0 0 0 2
DD F 0 0 0 0
S 0 0 0 4
T 0 0 0 1