Data Analysis and Visualization with Matplotlib and Seaborn | TOP 10 code snippets for practice

Data visualization is an essential aspect of data analysis. It helps us understand the underlying patterns, trends, and insights within a dataset. Two of the most popular Python libraries for data visualization are Matplotlib and Seaborn. Both can produce a wide variety of plots that help researchers, analysts, and data scientists present data visually.

In this article, we will discuss the basics of both libraries, followed by the top 10 most used code snippets for visualization. We'll also provide links to free resources and documentation to help you dive deeper into these libraries.

Matplotlib and Seaborn: A Quick Overview

Matplotlib

Matplotlib is a low-level plotting library in Python. It allows you to create static, animated, and interactive plots. It provides a lot of flexibility but may require more code to create complex plots compared to Seaborn.

Matplotlib is especially useful when you need full control over the visual elements of a plot, like adjusting the axis, colors, legends, titles, and more.
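
As a minimal sketch of that kind of control, the example below uses Matplotlib's object-oriented interface (plt.subplots) to set line styles, axis limits, a legend, and a title explicitly. The data and styling choices are illustrative assumptions and are not taken from the snippets later in this article.

```python
import matplotlib.pyplot as plt

# Illustrative data (assumed for this sketch)
x = [1, 2, 3, 4, 5]
squares = [v ** 2 for v in x]
cubes = [v ** 3 for v in x]

# Create a figure and a single axes object to customize explicitly
fig, ax = plt.subplots(figsize=(6, 4))

# Each line gets its own color, style, and legend label
ax.plot(x, squares, color='tab:blue', linestyle='--', marker='o', label='x squared')
ax.plot(x, cubes, color='tab:red', linewidth=2, label='x cubed')

# Axis limits, labels, title, legend, and a light grid
ax.set_xlim(0, 6)
ax.set_ylim(0, 140)
ax.set_title('Customized Line Plot')
ax.set_xlabel('X Axis')
ax.set_ylabel('Y Axis')
ax.legend(loc='upper left')
ax.grid(True, alpha=0.3)

plt.show()
```

Working with the ax object directly, rather than the plt functions, tends to scale better once a figure contains more than one subplot.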

Official Documentation: Matplotlib Documentation

Seaborn

Seaborn is built on top of Matplotlib and is designed to make it easier to create visually appealing and informative statistical graphics. It comes with a variety of high-level functions to create complex plots with fewer lines of code.

Seaborn is particularly helpful for statistical visualizations, such as correlation plots, box plots, and heatmaps.

Official Documentation: Seaborn Documentation

Top 10 Most Used Code Snippets for Data Visualization

1. Basic Line Plot with Matplotlib

A line plot is one of the most common plots for showing trends over time.

import matplotlib.pyplot as plt

# Example data
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

plt.plot(x, y)
plt.title('Basic Line Plot')
plt.xlabel('X Axis')
plt.ylabel('Y Axis')
plt.show()

2. Basic Scatter Plot with Matplotlib

A scatter plot is useful to visualize the relationship between two continuous variables.

import matplotlib.pyplot as plt

# Example data
x = [1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1]

plt.scatter(x, y)
plt.title('Basic Scatter Plot')
plt.xlabel('X Axis')
plt.ylabel('Y Axis')
plt.show()

3. Histogram with Matplotlib

Histograms are used to display the distribution of a dataset.

import matplotlib.pyplot as plt
import numpy as np

data = np.random.randn(1000)

plt.hist(data, bins=30, edgecolor='black')
plt.title('Histogram')
plt.xlabel('Values')
plt.ylabel('Frequency')
plt.show()
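
plt.hist also returns the bin counts and bin edges it computed, which can help when checking what the plot actually shows. The sketch below assumes a fixed random seed (not part of the snippet above) so that the data is reproducible.

```python
import matplotlib.pyplot as plt
import numpy as np

# Seeded generator so the example is reproducible (the seed is an assumption)
rng = np.random.default_rng(seed=42)
data = rng.standard_normal(1000)

# plt.hist returns the counts per bin, the bin edges, and the drawn bar patches
counts, bin_edges, patches = plt.hist(data, bins=30, edgecolor='black')
plt.title('Histogram')
plt.xlabel('Values')
plt.ylabel('Frequency')

print('Number of bins:', len(counts))
print('Total observations:', int(counts.sum()))
print('Bin width:', bin_edges[1] - bin_edges[0])

plt.show()
```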

4. Box Plot with Seaborn

A box plot visualizes the distribution of numerical data and highlights outliers.

import seaborn as sns
import matplotlib.pyplot as plt

# Example data
tips = sns.load_dataset("tips")

sns.boxplot(x='day', y='total_bill', data=tips)
plt.title('Box Plot')
plt.show()
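
By default, the points drawn as outliers are those lying more than 1.5 times the interquartile range (IQR) beyond the quartiles. The sketch below applies that rule by hand with NumPy; the values are made up for illustration and are not from the tips dataset.

```python
import numpy as np

# Illustrative values (assumed); 48 is deliberately far from the rest
values = np.array([10, 12, 13, 14, 15, 16, 18, 19, 21, 48])

# Quartiles and interquartile range
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the quartiles (the default whisker setting)
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Anything outside the fences would be drawn as an individual outlier point
outliers = values[(values < lower_fence) | (values > upper_fence)]

print('Q1:', q1, 'Q3:', q3, 'IQR:', iqr)
print('Fences:', lower_fence, upper_fence)
print('Outliers:', outliers)  # → [48]
```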

5. Heatmap with Seaborn

A heatmap is a great way to visualize a correlation matrix or any other grid of data.

import seaborn as sns
import matplotlib.pyplot as plt

# Example data
flights = sns.load_dataset("flights")

# pivot() keeps the integer passenger counts, so the integer format "d" works for annotations
pivot_flights = flights.pivot(index='month', columns='year', values='passengers')
sns.heatmap(pivot_flights, cmap="YlGnBu", annot=True, fmt="d")
plt.title('Heatmap')
plt.show()
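
For the correlation-matrix case mentioned above, a minimal sketch might look like the following. The DataFrame is synthetic and its column names (height, weight, noise) are assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic data (assumed): weight depends on height, noise is unrelated
rng = np.random.default_rng(seed=0)
n = 200
height = rng.normal(170, 10, n)
df = pd.DataFrame({
    'height': height,
    'weight': 0.5 * height + rng.normal(0, 5, n),
    'noise': rng.normal(0, 1, n),
})

# Pairwise Pearson correlations between the numeric columns
corr = df.corr()

# Fix the color scale to [-1, 1] so colors are comparable across plots
fig, ax = plt.subplots()
sns.heatmap(corr, cmap='coolwarm', vmin=-1, vmax=1, annot=True, fmt='.2f', ax=ax)
ax.set_title('Correlation Heatmap')
plt.show()
```

A diverging colormap such as coolwarm is a common choice here because correlations are centered on zero.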

6. Pair Plot with Seaborn

A pair plot displays pairwise relationships between several variables in a dataset.

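import seaborn as sns
import matplotlib.pyplot as plt

# Example data
iris = sns.load_dataset("iris")

# pairplot returns a grid of subplots; set the title on the whole figure, not on one subplot
g = sns.pairplot(iris, hue='species')
g.figure.suptitle('Pair Plot', y=1.02)
plt.show()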

7. Bar Plot with Seaborn

Bar plots are commonly used to visualize categorical data.

import seaborn as sns
import matplotlib.pyplot as plt

# Example data
tips = sns.load_dataset("tips")

sns.barplot(x="day", y="total_bill", data=tips)
plt.title('Bar Plot')
plt.show()
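
By default, the bar heights drawn by sns.barplot are the mean of the y values within each category. As a rough sketch, the same numbers can be reproduced with a pandas groupby; the small DataFrame here is made up for illustration and is not the tips dataset.

```python
import pandas as pd

# Illustrative data (assumed), two bills per day
df = pd.DataFrame({
    'day': ['Thur', 'Thur', 'Fri', 'Fri', 'Sat', 'Sat'],
    'total_bill': [10.0, 20.0, 15.0, 25.0, 30.0, 50.0],
})

# Mean bill per day: these are the heights a default bar plot would show
mean_by_day = df.groupby('day', sort=False)['total_bill'].mean()
print(mean_by_day)
```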

8. Violin Plot with Seaborn

A violin plot is a combination of a box plot and a kernel density plot, useful for comparing distributions.

import seaborn as sns
import matplotlib.pyplot as plt

# Example data
tips = sns.load_dataset("tips")

sns.violinplot(x="day", y="total_bill", data=tips)
plt.title('Violin Plot')
plt.show()

9. Pie Chart with Matplotlib

A pie chart is used to show proportions or percentages of a whole.

import matplotlib.pyplot as plt

# Example data
labels = ['A', 'B', 'C', 'D']
sizes = [15, 30, 45, 10]

plt.pie(sizes, labels=labels, autopct='%1.1f%%', startangle=90)
plt.title('Pie Chart')
plt.show()

10. Regplot (Regression Plot) with Seaborn

A regression plot shows the relationship between two variables and fits a regression line to the data.

import seaborn as sns
import matplotlib.pyplot as plt

# Example data
tips = sns.load_dataset("tips")

sns.regplot(x="total_bill", y="tip", data=tips)
plt.title('Regression Plot')
plt.show()
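
regplot draws the fitted line but does not return its slope and intercept. If the numbers are needed, one option is to fit the same kind of straight line separately, for example with np.polyfit. The sketch below uses synthetic data with an assumed true slope of 0.15 and intercept of 1.0, not the tips dataset.

```python
import numpy as np

# Synthetic data (assumed): tip is roughly 1.0 + 0.15 * total_bill plus noise
rng = np.random.default_rng(seed=1)
total_bill = rng.uniform(5, 50, 100)
tip = 1.0 + 0.15 * total_bill + rng.normal(0, 0.5, 100)

# Least-squares fit of a degree-1 polynomial: returns [slope, intercept]
slope, intercept = np.polyfit(total_bill, tip, deg=1)

print('Slope:', round(slope, 3))
print('Intercept:', round(intercept, 3))
```

For standard errors or confidence intervals on the coefficients, a dedicated statistics library is usually a better fit than np.polyfit.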

Additional Resources

For anyone looking to learn more about data visualization with Matplotlib and Seaborn, here are some great free resources and documentation:

1. Matplotlib Documentation

  • Link: Matplotlib Documentation
  • This resource provides a comprehensive guide to all the functionalities of Matplotlib, including detailed tutorials, examples, and an API reference.

2. Seaborn Documentation

  • Link: Seaborn Documentation
  • Seaborn's official website includes an easy-to-follow user guide, gallery, and API reference for creating high-level statistical visualizations.

3. Kaggle: Python Data Visualization

  • Link: Kaggle Data Visualization
  • Kaggle offers free courses that teach the fundamentals of data visualization using Python libraries like Matplotlib and Seaborn.

4. Matplotlib Tutorial on W3Schools

  • Link: Matplotlib Tutorial
  • This tutorial on W3Schools is beginner-friendly, covering the basics of Matplotlib and providing easy-to-follow examples.

5. Seaborn Tutorial on TutorialsPoint

  • Link: Seaborn Tutorial
  • A free, beginner-friendly guide to Seaborn, explaining its features and how to use it for creating beautiful statistical plots.

6. Python Data Science Handbook (by Jake VanderPlas)

  • Link: Python Data Science Handbook
  • This book (available online for free) includes excellent sections on data visualization using Matplotlib and Seaborn.

Conclusion

Matplotlib and Seaborn are two of the most powerful libraries for data visualization in Python. While Matplotlib provides flexibility for custom visualizations, Seaborn simplifies the creation of complex statistical plots. The code snippets provided in this article should give you a solid foundation to start exploring these libraries. Don't forget to check out the recommended resources for further learning and practice. Happy visualizing!
