ads24x7 is a digital marketing company that has received $10 million in seed funding and is expanding into marketing analytics. The company has collected data from its Marketing Intelligence team and wants you, its newly appointed data analyst, to segment ads based on the features provided. Use clustering to group the ads into homogeneous segments.
🔍 Project Objective
This project focuses on applying unsupervised machine learning and dimensionality reduction techniques to solve two real-world analytical problems:
Segment digital advertisements based on performance metrics to optimize marketing strategy.
Reduce high-dimensional census data using PCA to extract meaningful population insights efficiently.
The project demonstrates strong skills in EDA, clustering, PCA, business interpretation, and actionable recommendations.
🧠 Part 1: Digital Marketing Ads Clustering (Business Analytics + ML)
📌 Problem Statement
A digital marketing company wanted to segment advertisements into homogeneous groups based on performance indicators such as CTR, CPM, CPC, revenue, spend, device type, and platform.
⚙️ Approach
Performed detailed EDA (univariate & bivariate analysis)
Treated missing values using domain-specific formulas for CTR, CPM, and CPC
Detected and treated outliers using the IQR method
Applied z-score scaling so that features measured on large scales do not dominate the distance calculations used in clustering
Used:
Hierarchical Clustering (Ward + Euclidean)
K-Means Clustering
Identified optimal clusters using:
Elbow method
Silhouette score
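The preprocessing and cluster-selection steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (Impressions, Clicks, Spend, CTR, CPM, CPC) are assumed, CTR is assumed to be stored as a percentage, outliers are assumed to be capped at the IQR fences rather than dropped, and every helper name is hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def fill_rate_metrics(df):
    """Fill missing CTR, CPM and CPC from their standard definitions.

    Assumed columns: Impressions, Clicks, Spend, CTR, CPM, CPC.
    Rows with zero clicks or impressions would produce inf/NaN and
    need separate handling; that case is not covered here.
    """
    out = df.copy()
    out["CTR"] = out["CTR"].fillna(out["Clicks"] / out["Impressions"] * 100)
    out["CPM"] = out["CPM"].fillna(out["Spend"] / out["Impressions"] * 1000)
    out["CPC"] = out["CPC"].fillna(out["Spend"] / out["Clicks"])
    return out


def cap_outliers_iqr(df, cols, k=1.5):
    """Clip each column to the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    out = df.copy()
    for col in cols:
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out[col] = out[col].clip(q1 - k * iqr, q3 + k * iqr)
    return out


def prepare_features(df, cols):
    """Impute, cap outliers, then z-score the selected numeric columns."""
    cleaned = cap_outliers_iqr(fill_rate_metrics(df), cols)
    return StandardScaler().fit_transform(cleaned[cols])


def score_k(X, k_values, seed=0):
    """Return {k: (inertia, silhouette)} for each candidate k.

    Inertia feeds the elbow plot; silhouette is higher for
    better-separated clusterings.
    """
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        scores[k] = (model.inertia_, silhouette_score(X, model.labels_))
    return scores


def ward_labels(X, k):
    """Cut a Ward-linkage (Euclidean) hierarchy into k flat clusters."""
    tree = linkage(X, method="ward")
    return fcluster(tree, t=k, criterion="maxclust")
```

A typical use would be to call prepare_features on the numeric columns, pass the result to score_k for a range such as 2 to 10, and then compare the inertia curve with the silhouette values before fixing the number of clusters.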
📊 Key Results
Optimal number of clusters identified as 5
Each cluster represented a distinct ad performance pattern
Certain clusters delivered high revenue with low CPC
Large ad sizes did not necessarily translate to better performance
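One way to read off patterns such as "high revenue with low CPC" is to average each metric within a cluster. The sketch below assumes a DataFrame that already carries a cluster label column; the function name and the column names in the usage note are hypothetical.

```python
import pandas as pd


def profile_clusters(df, label_col, metric_cols):
    """Mean of each metric per cluster, plus the number of ads in it."""
    grouped = df.groupby(label_col)
    profile = grouped[metric_cols].mean()
    profile["n_ads"] = grouped.size()
    return profile
```

Calling profile_clusters(ads, "cluster", ["Revenue", "CPC", "CTR"]) would return one row per cluster, which makes it easy to compare segments side by side.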
💡 Business Insights
Video ads generated the highest average revenue
Mobile ads had lower CPM and higher reach
Poster-sized vertical ads showed best CTR and efficiency
A significant portion of ads consumed budget with poor ROI
📈 Recommendations
Increase investment in mobile video ads
Prioritize poster-sized vertical creatives, which showed the best CTR and efficiency
Reduce spend on clusters with high CPC & low revenue
Use clustering as a recurring optimization strategy
📉 Part 2: Census Data Analysis Using PCA (Data Science)
📌 Problem Statement
The Indian Census dataset contained 57+ highly correlated variables, making analysis complex and inefficient.
The objective was to reduce dimensionality while retaining as much of the variance as possible.
⚙️ Approach
Conducted EDA on selected demographic and workforce variables
Treated outliers and scaled data using z-score normalization
Verified suitability using:
Bartlett’s Test of Sphericity
Kaiser-Meyer-Olkin (KMO) test (0.93, indicating excellent sampling adequacy)
Applied Principal Component Analysis (PCA) using Scikit-learn
Used Scree plot and cumulative explained variance for PC selection
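The suitability check and component selection listed above could look roughly like this. It is a sketch under assumptions rather than the project's code: Bartlett's statistic is computed directly from the correlation matrix using the standard chi-square approximation, the 90% variance target is an illustrative threshold, and both function names are hypothetical. The KMO statistic is not computed here.

```python
import numpy as np
from scipy.stats import chi2
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def bartlett_sphericity(X):
    """Bartlett's test that the correlation matrix is the identity.

    Returns (chi-square statistic, p-value). A small p-value suggests
    the variables are correlated enough for PCA to be worthwhile.
    """
    n, p = X.shape
    corr = np.corrcoef(X, rowvar=False)
    # slogdet avoids underflow when the determinant is very small.
    _, logdet = np.linalg.slogdet(corr)
    stat = -(n - 1 - (2 * p + 5) / 6) * logdet
    dof = p * (p - 1) / 2
    return stat, chi2.sf(stat, dof)


def components_for_variance(X, target=0.90):
    """Smallest number of PCs whose cumulative explained variance
    reaches the target, plus the full cumulative curve."""
    Z = StandardScaler().fit_transform(X)
    pca = PCA().fit(Z)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    n_components = int(np.searchsorted(cumulative, target) + 1)
    return n_components, cumulative
```

Plotting the returned cumulative curve against the component index gives the cumulative explained variance chart, while the individual explained_variance_ratio_ values give the scree plot.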
📊 Key Results
Reduced 57 variables → 5 principal components
These 5 PCs explained 90.6% of total variance
Principal components captured:
Population size
Workforce composition
Agricultural labor patterns
Gender-based employment distribution
💡 Insights
Strong correlation between male & female population metrics
Workforce participation patterns varied significantly across states
PCA removed multicollinearity, since principal components are uncorrelated by construction, while retaining 90.6% of the total variance
🛠 Skills Demonstrated
Technical Skills
Python, Pandas, NumPy
Scikit-learn
Clustering (K-Means, Hierarchical)
PCA & linear algebra concepts
Data preprocessing & scaling
Analytics & Business Skills
Exploratory Data Analysis
Marketing analytics
KPI interpretation
Insight generation
Recommendation framing
🚀 Business Impact
Enables data-driven ad optimization
Reduces marketing spend inefficiency
Improves campaign ROI and targeting
Simplifies complex demographic datasets for faster decision-making
🏁 Conclusion
This project showcases my ability to:
Translate business problems into data science solutions
Apply machine learning to practical problems, not just in theory
Convert complex analysis into clear business recommendations