
plotnine - need to work

ggplot2 is a widely used R package for building plots layer by layer. In Python, the plotnine library provides a ggplot2-like interface. Import it with import plotnine as p9. Building a plot in plotnine follows a structured series of steps:

  • Initialize the plot by passing your DataFrame to ggplot():
import plotnine as p9
CHD_plot=p9.ggplot(data=CHD)

  • Define aesthetics with aes(), mapping DataFrame columns to visual properties. The most important aesthetics include x, y, alpha, color (colour is accepted as an alias), fill, linetype, shape, size, and stroke. Assigning the plot to a variable lets you reuse it as the base for several variations.
CHD_plot=CHD_plot + p9.aes(x='median_income', y='median_house_value')
# or CHD_plot=p9.ggplot(data=CHD,mapping=p9.aes(x='median_income', y='median_house_value'))
CHD_plot.show()

  • Choose how to display the data by adding a geom layer with the + operator. Further layers and customizations are added the same way.
CHD_plot=CHD_plot+p9.geom_point()
CHD_plot.show()

You can also add scales, axis labels, and a theme:

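# Rebuild from the data so the points are drawn only once; the parentheses allow the expression to span several lines.
CHD_plot=(p9.ggplot(data=CHD,mapping=p9.aes(x='median_income', y='median_house_value'))
 + p9.geom_point(alpha=0.15) + p9.xlab("median_income")
 + p9.ylab("median_house_value") + p9.scale_x_log10()
 + p9.theme_bw() + p9.theme(text=p9.element_text(size=10)))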

  • After creating your plot, you can save it to a file in your favourite format:
# CHD_plot already contains its point layer, so it can be saved as it is
CHD_plot.save("CHD_plot.png", dpi=300)

Bar chart

To generate a bar chart, use geom_bar(). By default it counts the rows in each category of x, so only the x aesthetic is needed:

CHD_bar=(p9.ggplot(data=CHD,mapping=p9.aes(x='famlev'))+ p9.geom_bar())

Plotting distributions

  • A boxplot can be created using geom_boxplot():
CHD_dist=(p9.ggplot(data=CHD,
           mapping=p9.aes(x='famlev',
                          y='median_income'))
    + p9.geom_boxplot()
    + p9.scale_y_log10()
 )

  • To show the individual observations behind the boxplot, add geom_jitter() before geom_boxplot(). Layers are drawn in the order they are added, and the jitter applies a little random noise so that points do not overlap. Setting alpha=0 on the boxplot makes its fill transparent, so the points stay visible:
CHD_dist=(p9.ggplot(data=CHD,
           mapping=p9.aes(x='famlev',
                          y='median_income'))
    + p9.geom_jitter(alpha=0.1, color="green")
    + p9.geom_boxplot(alpha=0)
    + p9.scale_y_log10()
 )