Seaborn¶
Seaborn provides advanced graphical capabilities for creating sophisticated statistical visualizations with ease. It simplifies the process of generating complex plots from pandas DataFrames using simple commands. Let consider the CHD_test.csv,
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
CHD=pd.read_csv('./data/CHD_test.csv',index_col=False)
CHD.head()

Histogram¶
Standardize the 'median_income' and 'median_house_value' and plot the
import seaborn as sns
sns.set(color_codes=True)
CHD['median_income'] = (CHD['median_income'] -CHD['median_income'].mean()) / CHD['median_income'].std()
CHD['median_house_value'] = (CHD['median_house_value'] -CHD['median_house_value'].mean()) / CHD['median_house_value'].std()
for col in ['median_income','median_house_value']:
plt.hist(CHD[col], density=True)
plt.show(block=False)

We can get a smooth estimate of the distribution using a kernel density estimation (KDE):
import warnings
warnings.filterwarnings("ignore")
sns.kdeplot(data=CHD, x='median_income', y='median_house_value')
plt.show(block=False)

You can create a hexagonally-based histogram using jointplot:


The following illustrates how to draw a box plot for different family levels.
g=sns.catplot(data=CHD, x='median_income', y='famlev', kind="box")
g.set_axis_labels("Income", "Family level");

Pairplots¶
We can generalize joint plots for multidimensional data, which is very useful for exploring correlations between multiple dimensions of data.

Joyplot¶
Joyplot is a useful plot to compare distributions, the following show how to plot
Create the data
rs = np.random.RandomState(1979)
x = rs.randn(500)
g = np.tile(list("ABCDEFGHIJ"), 50)
df = pd.DataFrame(dict(x=x, g=g))
m = df.g.map(ord)
df["x"] += m
Initialize the FacetGrid object
pal = sns.cubehelix_palette(10, rot=-.25, light=.7)
g = sns.FacetGrid(df, row="g", hue="g", aspect=15, height=.5, palette=pal)
Draw the densities in a few steps
g.map(sns.kdeplot, "x",bw_adjust=.5, clip_on=False,fill=True, alpha=1, linewidth=1.5)
g.map(sns.kdeplot, "x", clip_on=False, color="w", lw=2, bw_adjust=.5)
Define and use a simple function to label the plot in axes coordinates
def label(x, color, label):
ax = plt.gca()
ax.text(0, .2, label, fontweight="bold", color=color,
ha="left", va="center", transform=ax.transAxes)
g.map(label, "x")
Set the subplots to overlap
Remove axes details that don't play well with overlap
g.set_titles("")
g.set(yticks=[], ylabel="")
g.despine(bottom=True, left=True)
plt.show(block=False)

Displot¶
This function can be used for visualizing the univariate or bivariate distribution of data,
import seaborn as sns
import matplotlib.pyplot as plt
data1 = np.random.normal(size = 100)
data2 = np.random.normal(size = 100)
var={"A": data1, "B": data2}
df= pd.DataFrame(data=var,index=(range(100)))
sns.displot(data=df,kde=True)
plt.show(block=False)
