
Digital Marketing Ads Clustering Using Machine Learning


ads24x7 is a digital marketing company that has received seed funding of $10 million and is expanding into marketing analytics. The company collected data from its Marketing Intelligence team and now wants you, its newly appointed data analyst, to segment ad types based on the features provided. Use a clustering procedure to group the ads into homogeneous segments.


🔍 Project Objective

This project focuses on applying unsupervised machine learning and dimensionality reduction techniques to solve two real-world analytical problems:

  1. Segment digital advertisements based on performance metrics to optimize marketing strategy.

  2. Reduce high-dimensional census data using PCA to extract meaningful population insights efficiently.

The project demonstrates strong skills in EDA, clustering, PCA, business interpretation, and actionable recommendations.


🧠 Part 1: Digital Marketing Ads Clustering (Business Analytics + ML)

📌 Problem Statement

A digital marketing company wanted to segment advertisements into homogeneous groups based on performance indicators such as CTR, CPM, CPC, revenue, spend, device type, and platform.

⚙️ Approach

  • Performed detailed EDA (univariate & bivariate analysis)

  • Treated missing values using domain-specific formulas for CTR, CPM, and CPC

  • Detected and treated outliers using the IQR method

  • Applied z-score scaling to improve clustering performance

  • Used:

    • Hierarchical Clustering (Ward + Euclidean)

    • K-Means Clustering

  • Identified optimal clusters using:

    • Elbow method

    • Silhouette score
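The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names and toy values are hypothetical, the formulas are the standard definitions (CTR as a percentage, CPM per 1,000 impressions, CPC per click), and capping at the IQR fences is only one possible way to "treat" outliers.

```python
import numpy as np
import pandas as pd

# Toy data with hypothetical column names; the real dataset may differ.
ads = pd.DataFrame({
    "Impressions": [1000, 2000, 4000, 5000, 8000],
    "Clicks":      [10, 40, 20, 100, 80],
    "Spend":       [5.0, 10.0, 8.0, 50.0, 16.0],
    "CTR":         [1.0, np.nan, 0.5, 2.0, np.nan],
    "CPM":         [np.nan, 5.0, 2.0, np.nan, 2.0],
    "CPC":         [0.5, 0.25, np.nan, 0.5, 0.2],
})

def fill_from_formulas(df: pd.DataFrame) -> pd.DataFrame:
    """Fill missing CTR, CPM and CPC from their standard definitions."""
    out = df.copy()
    out["CTR"] = out["CTR"].fillna(out["Clicks"] / out["Impressions"] * 100)
    out["CPM"] = out["CPM"].fillna(out["Spend"] / out["Impressions"] * 1000)
    out["CPC"] = out["CPC"].fillna(out["Spend"] / out["Clicks"])
    return out

def cap_outliers_iqr(s: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR] (assumed treatment: capping)."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return s.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)

def zscore(df: pd.DataFrame) -> pd.DataFrame:
    """Scale each column to mean 0 and (population) standard deviation 1."""
    return (df - df.mean()) / df.std(ddof=0)

filled = fill_from_formulas(ads)
capped = filled.apply(cap_outliers_iqr)   # column-wise capping
scaled = zscore(capped)
print(scaled.round(2))
```

Scaling matters here because K-Means and Ward linkage both rely on Euclidean distance, so an unscaled column such as impressions would otherwise dominate the clusters.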

📊 Key Results

  • Optimal number of clusters identified as 5

  • Each cluster represented a distinct ad performance pattern

  • Certain clusters delivered high revenue with low CPC

  • Large ad sizes did not necessarily translate to better performance
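For readers curious how a cluster count like this is usually chosen, here is a hedged sketch of the elbow and silhouette checks, followed by Ward linkage with Euclidean distance. It runs on synthetic blobs from `make_blobs`, not on the ad data, so the `best_k` it prints illustrates the procedure and does not reproduce the project's result. The range of candidate k values is also an assumption.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the z-scored ad metrics (assumption, not real data).
X, _ = make_blobs(n_samples=300, centers=5, cluster_std=0.6, random_state=42)
X = StandardScaler().fit_transform(X)

inertia, silhouette = {}, {}
for k in range(2, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertia[k] = km.inertia_                         # within-cluster sum of squares (elbow)
    silhouette[k] = silhouette_score(X, km.labels_)  # mean silhouette, in [-1, 1]

# Elbow: look for the k after which inertia stops dropping sharply.
# Silhouette: prefer the k with the highest mean score.
best_k = max(silhouette, key=silhouette.get)
km_labels = KMeans(n_clusters=best_k, n_init=10, random_state=42).fit_predict(X)

# Hierarchical clustering: Ward linkage operates on Euclidean distances.
Z = linkage(X, method="ward")
ward_labels = fcluster(Z, t=best_k, criterion="maxclust")

print("best k by silhouette:", best_k)
print({k: round(v, 3) for k, v in silhouette.items()})
```

In practice the inertia values would be plotted against k to look for the elbow, and the Ward dendrogram inspected alongside it. Checking that both methods point to a similar count is a useful sanity check.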

💡 Business Insights

  • Video ads generated the highest average revenue

  • Mobile ads had lower CPM and higher reach

  • Poster-sized vertical ads showed the best CTR and efficiency

  • A significant portion of ads consumed budget with poor ROI

📈 Recommendations

  • Increase investment in mobile video ads

  • Prioritize poster-sized creatives for higher conversions

  • Reduce spend on clusters with high CPC & low revenue

  • Use clustering as a recurring optimization strategy


📉 Part 2: Census Data Analysis Using PCA (Data Science)

📌 Problem Statement

The Indian Census dataset contained 57+ highly correlated variables, making analysis complex and inefficient.
The objective was to reduce dimensionality while retaining as much variance as possible.

⚙️ Approach

  • Conducted EDA on selected demographic and workforce variables

  • Treated outliers and scaled data using z-score normalization

  • Verified suitability using:

    • Bartlett’s Test of Sphericity

    • KMO Test (0.93 – excellent adequacy)

  • Applied Principal Component Analysis (PCA) using Scikit-learn

  • Used a scree plot and cumulative explained variance to select the number of PCs
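The two suitability checks in the approach can be computed directly from the correlation matrix. The sketch below implements the textbook formulas with NumPy and SciPy on synthetic correlated data. The function names are my own, and the project itself may have used a dedicated library instead. Bartlett's statistic is -(n - 1 - (2p + 5)/6) · ln|R| with p(p - 1)/2 degrees of freedom, and the overall KMO compares squared correlations with squared partial correlations.

```python
import numpy as np
from scipy import stats

def bartlett_sphericity(X: np.ndarray):
    """Bartlett's test that the correlation matrix is the identity."""
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    _, logdet = np.linalg.slogdet(R)          # ln|R|, numerically stable
    stat = -(n - 1 - (2 * p + 5) / 6) * logdet
    dof = p * (p - 1) / 2
    return stat, stats.chi2.sf(stat, dof)     # statistic and p-value

def kmo_overall(X: np.ndarray) -> float:
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    R = np.corrcoef(X, rowvar=False)
    inv = np.linalg.inv(R)
    # Partial correlations from the inverse correlation matrix.
    partial = -inv / np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    off = ~np.eye(R.shape[0], dtype=bool)     # off-diagonal mask
    r2 = (R[off] ** 2).sum()
    p2 = (partial[off] ** 2).sum()
    return r2 / (r2 + p2)

# Synthetic data: 8 observed variables driven by 2 latent factors (assumption).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
X = latent @ rng.normal(size=(2, 8)) + 0.3 * rng.normal(size=(500, 8))

chi2_stat, p_value = bartlett_sphericity(X)
kmo = kmo_overall(X)
print(f"Bartlett chi2 = {chi2_stat:.1f}, p = {p_value:.3g}, KMO = {kmo:.2f}")
```

A small Bartlett p-value means the variables are correlated enough for PCA to be worthwhile, and a KMO near 1 means those correlations are largely shared across variables, not confined to isolated pairs.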

📊 Key Results

  • Reduced 57 variables → 5 principal components

  • These 5 PCs explained 90.6% of total variance

  • Principal components captured:

    • Population size

    • Workforce composition

    • Agricultural labor patterns

    • Gender-based employment distribution
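Selecting the number of components from cumulative explained variance might look like the sketch below. It uses synthetic data instead of the census variables, and the 0.90 cutoff is an assumed threshold chosen to echo the roughly 90% figure above, so the component count it prints is illustrative only.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 12 correlated variables driven by 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(400, 3))
X = latent @ rng.normal(size=(3, 12)) + 0.2 * rng.normal(size=(400, 12))

X_scaled = StandardScaler().fit_transform(X)   # z-score scaling before PCA

# Fit all components first to inspect the variance profile (scree data).
pca_full = PCA().fit(X_scaled)
cum_var = np.cumsum(pca_full.explained_variance_ratio_)

# Smallest number of components whose cumulative variance reaches the cutoff.
threshold = 0.90                               # assumed cutoff
n_components = int(np.searchsorted(cum_var, threshold) + 1)

scores = PCA(n_components=n_components).fit_transform(X_scaled)
corr = np.corrcoef(scores, rowvar=False)       # component scores are uncorrelated

print("components kept:", n_components)
print("cumulative variance:", cum_var[:n_components].round(3))
```

Plotting `explained_variance_ratio_` against the component index gives the scree plot. The near-zero off-diagonal entries of `corr` show that the retained components carry no multicollinearity.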

💡 Insights

  • Strong correlation between male & female population metrics

  • Workforce participation patterns varied significantly across states

  • PCA successfully eliminated multicollinearity while preserving structure


🛠 Skills Demonstrated

Technical Skills

  • Python, Pandas, NumPy

  • Scikit-learn

  • Clustering (K-Means, Hierarchical)

  • PCA & linear algebra concepts

  • Data preprocessing & scaling

Analytics & Business Skills

  • Exploratory Data Analysis

  • Marketing analytics

  • KPI interpretation

  • Insight generation

  • Recommendation framing


🚀 Business Impact

  • Enables data-driven ad optimization

  • Reduces marketing spend inefficiency

  • Improves campaign ROI and targeting

  • Simplifies complex demographic datasets for faster decision-making


🏁 Conclusion

This project showcases my ability to:

  • Translate business problems into data science solutions

  • Apply machine learning to practical problems, not just in theory

  • Convert complex analysis into clear business recommendations



🔗 Project Resources

  • 📁 Dataset & Code

  • 📊 Report

















