shekhar pandey
Nov 21, 2021


import numpy as np
import pandas as pd

crosstab, pivot_table

titanic = pd.read_csv('https://raw.githubusercontent.com/shekhar270779/Learn_ML/main/datasets/Titanic.csv')
titanic.head()

Cross Tabulation / Contingency table

pd.crosstab(titanic.Survived, titanic.Pclass)
rows = titanic.Survived
cols = titanic.Pclass
pd.crosstab(rows, cols, margins=True)
pd.crosstab(titanic.Survived, titanic.Pclass, normalize=True, margins=True)
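As a rough sketch of what the margins and normalize options do, the example below runs crosstab on a tiny hand-made frame. The `toy` DataFrame and its values are made up for illustration and are not the Titanic data; only the column names match.

```python
import pandas as pd

# Hypothetical toy data standing in for the Titanic columns used above
toy = pd.DataFrame({
    'Survived': [0, 0, 1, 1, 1, 0],
    'Pclass':   [3, 3, 1, 2, 1, 2],
})

# Raw counts, with an 'All' row and column holding the totals
counts = pd.crosstab(toy.Survived, toy.Pclass, margins=True)

# normalize='index': each row sums to 1 (class mix within each survival group)
by_row = pd.crosstab(toy.Survived, toy.Pclass, normalize='index')

# normalize='columns': each column sums to 1 (survival share within each class)
by_col = pd.crosstab(toy.Survived, toy.Pclass, normalize='columns')

print(counts)
print(by_row)
print(by_col)
```

With normalize=True, as in the call above, every cell is divided by the grand total instead, so the whole table sums to 1.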

Pivoting

pd.pivot_table(index='Survived', columns='Pclass', values=['Fare'], aggfunc='mean', data=titanic)
pd.pivot_table(index='Survived', columns='Pclass', values=['Age'], aggfunc='mean', data=titanic)
titanic.pivot_table(index='Pclass', columns='Sex', values='Age', aggfunc='mean').unstack()

Sex     Pclass
female  1         34.611765
        2         28.722973
        3         21.750000
male    1         41.281386
        2         30.740707
        3         26.507589
dtype: float64
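To see why the result above is a Series keyed by (Sex, Pclass), here is a minimal sketch on made-up data. The `toy` frame and its ages are hypothetical; the point is only that unstack() on the pivoted table moves the column level into the index.

```python
import pandas as pd

# Hypothetical toy rows, one per (Pclass, Sex) pair
toy = pd.DataFrame({
    'Pclass': [1, 1, 2, 2],
    'Sex':    ['female', 'male', 'female', 'male'],
    'Age':    [30.0, 40.0, 20.0, 50.0],
})

# Wide table: one row per Pclass, one column per Sex
wide = toy.pivot_table(index='Pclass', columns='Sex', values='Age', aggfunc='mean')

# unstack() turns the columns into the outer index level,
# giving a Series indexed by (Sex, Pclass)
long = wide.unstack()

print(wide)
print(long)
```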

Challenge on pivot_table: compare average age for each of these pairs

  • Survived vs Pclass
  • Survived vs Sex
  • Pclass vs Sex
titanic.pivot_table(index='Survived', columns='Pclass', values=['Age'], aggfunc='mean')
titanic.groupby(['Survived', 'Pclass']).agg({'Age': 'mean'})
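One possible way to cover all three comparisons in the challenge is sketched below on a small made-up frame. The `toy` data and the `avg_age` helper are hypothetical names introduced for illustration; on the real dataset you would pass `titanic` instead. The last line shows that groupby followed by unstack() produces the same wide layout as pivot_table.

```python
import pandas as pd

# Hypothetical toy rows; the real Titanic columns have the same names
toy = pd.DataFrame({
    'Survived': [0, 0, 1, 1, 1, 0],
    'Pclass':   [3, 3, 1, 2, 1, 2],
    'Sex':      ['male', 'male', 'female', 'female', 'male', 'female'],
    'Age':      [22.0, 30.0, 38.0, 26.0, 40.0, 28.0],
})

def avg_age(df, rows, cols):
    """Mean Age for each (rows, cols) combination as a wide table."""
    return df.pivot_table(index=rows, columns=cols, values='Age', aggfunc='mean')

survived_vs_pclass = avg_age(toy, 'Survived', 'Pclass')
survived_vs_sex = avg_age(toy, 'Survived', 'Sex')
pclass_vs_sex = avg_age(toy, 'Pclass', 'Sex')

# groupby gives the long form; unstack() pivots Pclass into columns
via_groupby = toy.groupby(['Survived', 'Pclass'])['Age'].mean().unstack()

print(survived_vs_pclass)
print(survived_vs_sex)
print(pclass_vs_sex)
print(via_groupby)
```

Combinations with no rows, such as Survived 0 in Pclass 1 here, show up as NaN in both layouts.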
