Vibe Coding Part 2: I Built an SEO Content Audit Tool using ChatGPT

I built a tool that clusters similar content for faster SEO audits, content pruning, and strategy planning.

Farlyn Lucas

4/19/2025 | 3 min read

A while back, I challenged myself to build a simple SEO tool even though I wasn’t a technical coder.

That’s how I fell into the rabbit hole of vibe coding and ChatGPT-assisted projects.

Now, here’s another one.

Summary

  • I audited a chaotic site with more than 1500 blog posts, most of them underperforming or thin.

  • Manually checking for duplicate or overlapping topics was too slow.

  • I built a content audit assistant tool that groups similar pages based on semantic meaning.

  • It helps prioritize content pruning, consolidation, and repurposing opportunities.

  • Built with help from ChatGPT and common sense.

The Reality of Messy Content

I recently audited a website with more than 1500 blog posts, no clear content structure, and no editorial system. It was just years of random publishing.

As you can imagine, duplicate and near-duplicate content was everywhere.

The client had been publishing for years without:

  • A content calendar

  • A topic hierarchy

  • A keyword targeting system

For instance, they had articles like:

  • Fentanyl is the Deadliest Drug in the Country

  • The Deadly Effects of Fentanyl

  • The Dangers of Fentanyl Use: What You Need to Know

These articles clearly cover the same ground, but they were scattered across different URLs.

None of these pages were building real topical authority on their own.

I needed a way to:

  • Detect near-duplicate articles

  • Group them into clusters

  • Prioritize consolidation and pruning work

The Solution

Instead of manually opening thousands of pages, I decided to build a tool to automate the most painful part:

Identifying similar or duplicate posts to speed up content pruning, consolidation, and repurposing decisions.

After a lot of back-and-forth with ChatGPT and some research, I settled on this approach:

  • Use a semantic embedding model (MiniLM-L6-v2 from Hugging Face)

  • Process each page's Title, Meta Description, H1, and URL path

  • Compare all pages using cosine similarity based on meaning, not just keywords
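To make those three steps concrete, here is a minimal Python sketch. It is not ClusterAudit's actual code. The column names follow a typical Screaming Frog export, the function names are made up for illustration, and I'm assuming "all-MiniLM-L6-v2" from the sentence-transformers library is the MiniLM-L6-v2 checkpoint in question.

```python
import math
from urllib.parse import urlparse

# Screaming Frog-style column names; assumed to match your export.
FIELDS = ("Title 1", "Meta Description 1", "H1-1")


def page_text(row):
    """Join Title, Meta Description, H1, and URL path into one string to embed."""
    path = urlparse(row.get("Address", "")).path
    # Turn "/blog/dangers-of-fentanyl/" into "blog dangers of fentanyl".
    path_words = path.replace("/", " ").replace("-", " ").replace("_", " ")
    parts = [row.get(field, "") for field in FIELDS] + [path_words]
    # Collapse repeated whitespace left over from empty fields.
    return " ".join(" ".join(parts).split())


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def embed(texts):
    """Turn texts into vectors. Needs `pip install sentence-transformers`."""
    # Imported lazily so the helpers above work without the library installed.
    from sentence_transformers import SentenceTransformer

    # Assumption: this is the MiniLM-L6-v2 checkpoint meant in this post.
    model = SentenceTransformer("all-MiniLM-L6-v2")
    return model.encode(texts).tolist()
```

In practice you would call `embed([page_text(row) for row in rows])` once, then compare the resulting vectors pair by pair with `cosine_similarity`. Two pages that say the same thing in different words should score close to 1.0 even when their titles share few keywords.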

What This Tool Does Well

  • Quickly finds near-duplicate and overlapping pages.

  • Groups them into clear clusters for review.

  • Helps prioritize merge, consolidate, or prune decisions.
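How the grouping might work, as a rough sketch: treat any two pages whose similarity clears a threshold as linked, then collect linked pages into clusters. This is my assumption about one reasonable way to do it, not a description of ClusterAudit's internals. The 0.8 threshold and the `cluster_pages` name are placeholders.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_pages(vectors, threshold=0.8):
    """Group page indexes whose similarity meets the threshold (union-find)."""
    parent = list(range(len(vectors)))

    def find(i):
        # Walk up to the root of i's group, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair once and merge the groups of similar pages.
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine_similarity(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    # Keep only real clusters; a page on its own has no near-duplicate.
    return [members for members in groups.values() if len(members) > 1]


# Toy 2-D vectors standing in for real embeddings.
toy_vectors = [[1.0, 0.1], [0.9, 0.2], [0.0, 1.0], [1.0, 0.0]]
toy_clusters = cluster_pages(toy_vectors)
```

One caveat with this kind of linking: if A is similar to B and B is similar to C, all three land in one group even when A and C are not that close. That is one more reason the clusters need a human pass. For 1500 pages, comparing every pair is roughly 1.1 million comparisons, which is still manageable on a laptop.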

What This Tool Doesn't Do

  • It doesn't scan live web pages.

  • It doesn't magically decide for you what action to take.

  • It doesn't fully replace human review.

You still need to audit the clusters and decide whether to merge, redirect, rewrite, or delete.

Again, this tool accelerates the audit but doesn’t replace thinking.

How to Use the Tool

  1. Export your crawl data from Screaming Frog as a CSV

  2. Place the CSV in the same folder as ClusterAudit.exe

  3. Run ClusterAudit.exe and follow the prompts

  4. Review your final grouped results in the new CSV
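If you're curious what reading that export looks like in code, here is a small sketch using only Python's standard library. The column names are the ones a Screaming Frog export typically uses. The `read_pages` helper and the sample rows are made up for illustration, and this is not ClusterAudit's own loader.

```python
import csv
import io

# Columns the audit relies on; assumed to match your export's header row.
REQUIRED = ("Address", "Title 1", "Meta Description 1", "H1-1")


def read_pages(handle):
    """Read export rows, keeping only pages that have a URL."""
    reader = csv.DictReader(handle)
    missing = [col for col in REQUIRED if col not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"Export is missing columns: {missing}")
    return [row for row in reader if (row["Address"] or "").strip()]


# Tiny stand-in for an export file (made-up rows, for illustration only).
sample = io.StringIO(
    "Address,Title 1,Meta Description 1,H1-1\n"
    "https://example.com/a,Post A,About A,Heading A\n"
    ",Orphan row,,\n"
)
pages = read_pages(sample)

# With a real file (hypothetical name), open it like this instead:
# with open("internal_html.csv", newline="", encoding="utf-8-sig") as f:
#     pages = read_pages(f)
```

The `utf-8-sig` encoding strips a byte-order mark if the file starts with one, which otherwise ends up glued to the first column name and makes "Address" look like it is missing.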

The Output

After running ClusterAudit, the tool automatically generates a new CSV showing all clustered pages.

Each grouped result includes:

  • Group ID

  • Core Topic

  • Address (URL)

  • Title 1

  • Meta Description 1

  • H1-1

  • H2-1

  • Status
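For reference, a CSV with exactly those columns could be produced with a few lines of standard-library Python. This is only a sketch of the output shape. The `write_clusters` helper is hypothetical, and how ClusterAudit actually fills in Core Topic and Status is not shown here.

```python
import csv
import io

OUTPUT_COLUMNS = [
    "Group ID", "Core Topic", "Address", "Title 1",
    "Meta Description 1", "H1-1", "H2-1", "Status",
]


def write_clusters(handle, pages, clusters):
    """Write one row per clustered page; `clusters` holds lists of page indexes."""
    writer = csv.DictWriter(handle, fieldnames=OUTPUT_COLUMNS, extrasaction="ignore")
    writer.writeheader()
    for group_id, members in enumerate(clusters, start=1):
        for index in members:
            row = dict(pages[index])
            row["Group ID"] = group_id
            # Core Topic and Status keep any existing value, otherwise stay blank.
            row.setdefault("Core Topic", "")
            row.setdefault("Status", "")
            writer.writerow(row)


# Made-up pages; only the first and third belong to a cluster.
demo_pages = [
    {"Address": "https://example.com/a", "Title 1": "A"},
    {"Address": "https://example.com/b", "Title 1": "B"},
    {"Address": "https://example.com/c", "Title 1": "C"},
]
buffer = io.StringIO(newline="")
write_clusters(buffer, demo_pages, [[0, 2]])
output_lines = buffer.getvalue().splitlines()
```

Pages that share a Group ID sit next to each other in the file, so sorting or filtering by that column in a spreadsheet gives you one cluster at a time to review.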

Important:

Some automatically generated Core Topic labels might not perfectly reflect all articles within a group.

While ClusterAudit does a strong job semantically grouping related content, final Core Topic naming still benefits from human SEO review.
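To see why automatic labels can miss, here is a deliberately naive labeler: it picks the words that show up in the most titles within a group. I don't know how ClusterAudit names its Core Topics, so treat this purely as an illustration. The stopword list and the `core_topic` name are my own.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list; a real one would be much longer.
STOPWORDS = {
    "the", "a", "an", "of", "is", "in", "to", "and",
    "you", "what", "need", "know", "for", "on",
}


def core_topic(titles, top_n=2):
    """Label a group with the words that appear in the most titles."""
    counts = Counter()
    for title in titles:
        # Count each word once per title so one long title can't dominate.
        words = set(re.findall(r"[a-z0-9]+", title.lower()))
        counts.update(words - STOPWORDS)
    # Most frequent first; ties broken alphabetically to keep output stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return " ".join(word for word, _ in ranked[:top_n])


fentanyl_titles = [
    "Fentanyl is the Deadliest Drug in the Country",
    "The Deadly Effects of Fentanyl",
    "The Dangers of Fentanyl Use: What You Need to Know",
]
label = core_topic(fentanyl_titles)  # "fentanyl country"
```

"fentanyl" is the right anchor, but "country" only got in by winning an alphabetical tie. A person looking at the three titles would call this group something like "fentanyl dangers" right away, which is exactly the kind of fix the manual review is for.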

Final Thoughts

THIS TOOL IS NOT PERFECT.

It still needs a human to review the clusters, make judgment calls, and plan actual content actions.

But it turned a chaotic, 1500-page manual audit into a 2-hour strategic review.

Instead of wasting days opening URLs, cross-checking titles, and second-guessing duplicate topics, I could now focus immediately on strategy, consolidation plans, and content pruning decisions.

Sometimes that’s all you need: not a perfect system, but a tool that gets you unstuck and pushes you forward faster.

If you want access to the tool, it’s about 171 MB, which is a bit big to host directly. Just leave your email address and I’ll send it over!

Quick Disclaimer

This is a personal project. It's not a full enterprise tool.

It's just a small, scrappy solution that helped me work smarter and faster, and maybe it can help you too.