Vibe Coding Part 2: I Built an SEO Content Audit Tool using ChatGPT
I built a tool that clusters similar content for faster SEO audits, content pruning, and strategy planning.
Farlyn Lucas
4/19/2025 · 3 min read


A while back, I challenged myself to build a simple SEO tool even though I wasn’t a technical coder.
That’s how I fell into the rabbit hole of vibe coding and ChatGPT-assisted projects.
Now, here’s another one.
Summary
I audited a chaotic site with more than 1,500 blog posts, most of them underperforming or thin.
Manually checking for duplicate or overlapping topics was too slow.
I built a content audit assistant tool that groups similar pages based on semantic meaning.
It helps prioritize content pruning, consolidation, and repurposing opportunities.
Built with help from ChatGPT and common sense.
The Reality of Messy Content
I recently audited a website with more than 1,500 blog posts, no clear content structure, and no editorial system. It was just years of random publishing.
As you can imagine, duplicate and near-duplicate content was everywhere.
The client had been publishing for years without:
A content calendar
A topic hierarchy
A keyword targeting system
For instance, they had articles like:
Fentanyl is the Deadliest Drug in the Country
The Deadly Effects of Fentanyl
The Dangers of Fentanyl Use: What You Need to Know
These clearly cover the same ground but are scattered across different URLs.
None of these pages were building real topical authority on their own.
I needed a way to:
Detect near-duplicate articles
Group them into clusters
Prioritize consolidation and pruning work
The Solution
Instead of manually opening thousands of pages, I decided to build a tool to automate the most painful part:
Identifying similar or duplicate posts to speed up content pruning, consolidation, and repurposing decisions.
After a lot of back-and-forth with ChatGPT and some research, I landed on an approach that fit the problem:
Use a semantic embedding model (MiniLM-L6-v2 from Hugging Face)
Process each page's Title, Meta Description, H1, and URL path
Compare all pages using cosine similarity based on meaning, not just keywords
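The three steps above could look roughly like this in Python. This is a sketch, not ClusterAudit's actual source. The column names follow the Screaming Frog-style headers listed in the output section (Address, Title 1, Meta Description 1, H1-1), the helper names are made up for illustration, and the model ID sentence-transformers/all-MiniLM-L6-v2 is an assumption about which MiniLM-L6-v2 build is meant.

```python
import math
from urllib.parse import urlparse


def page_text(row):
    """Join Title, Meta Description, H1, and URL path into one string to embed."""
    # Turn "/blog/dangers-of-fentanyl/" into "blog dangers of fentanyl".
    path = urlparse(row.get("Address", "")).path.replace("-", " ").replace("/", " ")
    parts = [
        row.get("Title 1", ""),
        row.get("Meta Description 1", ""),
        row.get("H1-1", ""),
        path,
    ]
    # Skip empty fields so missing metadata does not add noise.
    return " | ".join(p.strip() for p in parts if p and p.strip())


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def embed_with_minilm(texts):
    """Embed texts with MiniLM (assumed model ID; needs `pip install sentence-transformers`)."""
    # Imported lazily so the helpers above work without the third-party package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
    return model.encode(texts).tolist()
```

Calling embed_with_minilm on the page_text of every row and then cosine_similarity on each pair of vectors gives one score per pair of pages. Scores close to 1 suggest two pages cover the same topic even when their wording differs.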
What This Tool Does Well
Quickly finds near-duplicate and overlapping pages.
Groups them into clear clusters for review.
Helps prioritize merge, consolidate, or prune decisions.
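One simple way to turn pairwise similarity scores into review clusters is to link every pair above a cutoff and collect the connected pages. The sketch below does that with a small union-find. The post doesn't say which grouping method ClusterAudit uses, so both the method and the 0.75 threshold are assumptions chosen for illustration.

```python
def cluster_by_similarity(similarity, threshold=0.75):
    """Group page indices linked by a pairwise similarity >= threshold.

    `similarity` is a square list of lists where similarity[i][j] is the
    score between page i and page j. The threshold is a hypothetical default.
    """
    n = len(similarity)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the group's root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if similarity[i][j] >= threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    # Merge the two groups, keeping the smaller index as root.
                    parent[max(root_i, root_j)] = min(root_i, root_j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    # Pages with no close match are left out; only multi-page groups need review.
    return [members for members in groups.values() if len(members) > 1]
```

Linking this way can chain pages together: if A is close to B and B is close to C, all three land in one group even when A and C are not very similar to each other. That is one concrete reason the clusters still need a human pass.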
What This Tool Doesn't Do
It doesn't scan live web pages. It only works from the crawl data you export.
It doesn't magically decide for you what action to take.
It doesn't fully replace human review.
You still need to audit the clusters and decide whether to merge, redirect, rewrite, or delete.
Again, this tool accelerates the audit but doesn’t replace thinking.
How to Use the Tool
Export your data from Screaming Frog
Place the CSV in the same folder as ClusterAudit.exe
Run ClusterAudit.exe and follow the prompts


Review your final grouped results in the new CSV
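For readers curious what happens to the exported CSV once the tool picks it up, here is a minimal sketch of the loading step. It is not taken from ClusterAudit. The function names are hypothetical, and the required columns are assumed from the Screaming Frog-style headers that appear in the tool's output.

```python
import csv

# Assumed input columns, matching the headers shown in the tool's output CSV.
REQUIRED_COLUMNS = ("Address", "Title 1", "Meta Description 1", "H1-1")


def read_crawl_rows(lines):
    """Parse CSV lines into dicts, failing early if an expected column is missing."""
    reader = csv.DictReader(lines)
    fieldnames = reader.fieldnames or []
    missing = [column for column in REQUIRED_COLUMNS if column not in fieldnames]
    if missing:
        raise ValueError("Export is missing columns: " + ", ".join(missing))
    # Drop rows without a URL; there is nothing to cluster for them.
    return [row for row in reader if (row.get("Address") or "").strip()]


def load_crawl_export(path):
    """Open an exported CSV file and return its usable rows."""
    # utf-8-sig tolerates a byte-order mark if the export starts with one.
    with open(path, newline="", encoding="utf-8-sig") as handle:
        return read_crawl_rows(handle)
```

Checking the columns up front means that exporting the wrong report fails with a clear message instead of producing empty or misleading clusters.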






The Output
After running ClusterAudit, the tool automatically generates a new CSV showing all clustered pages.
Each grouped result includes:
Group ID
Core Topic
Address (URL)
Title 1
Meta Description 1
H1-1
H2-1
Status
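As a rough illustration of how rows with these columns could be assembled, the sketch below tags each page with its group number and topic label, then writes everything with Python's csv module. The function names are hypothetical. Because the post doesn't define what Status holds, the sketch simply copies it through from the input when present and leaves it blank otherwise.

```python
import csv

# Column order as listed in the post.
OUTPUT_COLUMNS = [
    "Group ID",
    "Core Topic",
    "Address",
    "Title 1",
    "Meta Description 1",
    "H1-1",
    "H2-1",
    "Status",
]


def build_output_rows(pages, groups, core_topics):
    """Return one output row per clustered page.

    `pages` is a list of dicts from the crawl export, `groups` is a list of
    lists of page indices, and `core_topics` holds one label per group.
    """
    rows = []
    for group_id, (members, topic) in enumerate(zip(groups, core_topics), start=1):
        for index in members:
            page = pages[index]
            # Copy known columns from the crawl row; missing ones become blank.
            row = {column: page.get(column, "") for column in OUTPUT_COLUMNS}
            row["Group ID"] = group_id
            row["Core Topic"] = topic
            rows.append(row)
    return rows


def write_output(rows, handle):
    """Write the rows to an open text file (open it with newline='')."""
    writer = csv.DictWriter(handle, fieldnames=OUTPUT_COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
```

Keeping every page of a group on consecutive rows with a shared Group ID makes it easy to sort or filter the sheet one cluster at a time during review.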
Important:
Some automatically generated Core Topic labels might not perfectly reflect all articles within a group.
While ClusterAudit does a strong job semantically grouping related content, final Core Topic naming still benefits from human SEO review.


Final Thoughts
THIS TOOL IS NOT PERFECT.
It still needs a human to review the clusters, make judgment calls, and plan actual content actions.
But it turned a chaotic, 1,500-page manual audit into a 2-hour strategic review.
Instead of wasting days opening URLs, cross-checking titles, and second-guessing duplicate topics, I could now focus immediately on strategy, consolidation plans, and content pruning decisions.
Sometimes that’s all you need: not a perfect system, but a tool that gets you unstuck and pushes you forward faster.
If you want access to the tool, it’s about 171 MB, which is a bit big to host directly. Just leave your email address and I’ll send it over!
Quick Disclaimer
This is a personal project. It's not a full enterprise tool.
It's just a small, scrappy solution that helped me work smarter and faster, and maybe it can help you too.
© 2025 Farlyn Lucas. All Rights Reserved.