One Year of Hex Setups: What Survived, What Changed, What We Learned

Over the past year, we published eleven case studies of people and teams who downloaded the Hex Constraint PDF and actually used it. A solo publisher, a freelancer, a five-person agency, a solo developer, a YouTuber, a consultant, an educator, a podcaster, an e-commerce operator, a nonprofit, and a remote team. Different roles, different industries, different starting stacks. The same constraint applied to all of them. This is the retrospective — the patterns that emerged across every case, the places where the hex held, and the places where it bent.

The Survival Pattern

The first thing that jumps out across all eleven case studies is the same survival pattern: general-purpose LLMs won, and specialized AI tools mostly lost. Nine out of eleven cases kept a general-purpose LLM — Claude or ChatGPT — as a core tool. Only two kept a specialized AI tool, and only because it had no general-purpose alternative. The pattern was consistent enough to be a finding, not a coincidence.

The reason is straightforward. A general-purpose LLM with good prompts and saved templates does 80-90% of what five specialized tools do. The specialized tool might be marginally better at its one thing — a dedicated grant writing assistant produces slightly more structured first drafts than Claude with a grant template, a dedicated ad copy tool produces slightly more polished variants than Claude with an advertising prompt. But the marginal improvement doesn't justify the cost of an additional subscription, an additional interface, an additional login, an additional set of failure modes, and an additional drain on attention. When the hex forced people to choose, the general-purpose tool won because it was good enough at everything while being great at consolidation.

The tools that survived alongside the general-purpose LLM were almost always domain-specific production tools — not AI tools per se, but tools required by the medium. Shopify for the e-commerce operator. Ghost for the publisher. An editing suite for the YouTuber. A DAW or editing tool for the podcaster. These tools earned their slots not by being AI-powered but by being essential to the output format. You can't publish a blog without a CMS. You can't sell products without a storefront. The hex doesn't argue against necessary tools — it argues against tools that duplicate what a simpler tool already does.

The tool categories that consistently failed to survive the hex: dedicated AI writing assistants (replaced by the general-purpose LLM), AI-powered analytics and forecasting tools (replaced by simpler reporting or manual analysis), AI chatbots for customer-facing use (removed without replacement in most cases), and multi-tool automation stacks (simplified or eliminated). Each of these categories shares a characteristic — they promise specialized intelligence but deliver general-purpose text generation wrapped in a specialized interface. Once people realized that the underlying capability was the same LLM they already had, the specialized wrapper stopped justifying its cost.

The Cheat Pattern

Nobody maintained perfect constraint discipline for a full year. Every single case study had at least one moment where the person or team added a tool back or brought in something new. The question is whether the addition was justified or whether it was a relapse into tool-collecting.

The pattern split roughly 60/40. About 60% of tool additions were what we'd call justified swaps — removing one tool and adding a different one that served the same function better. The publisher who swapped image generation tools after the first month. The freelancer who switched from one LLM to another when pricing changed. The agency that replaced their project management tool after the team outgrew the original. These swaps respected the constraint — the total tool count stayed the same or decreased. The hex doesn't demand loyalty to specific tools. It demands discipline about the total count.

The other 40% were relapses — adding a tool without removing one, justified by some version of "but this one is different." The most common trigger was a product launch or feature announcement. A new AI tool would ship with a feature that seemed purpose-built for the person's use case, and the constraint would feel unreasonable. "The hex says five tools, but this sixth one is clearly essential." In most cases, the person recognized the pattern within 2-4 weeks, audited the addition, and either removed it or removed something else to make room.

The honest finding is that the hex works better as a speed bump than a wall. It doesn't prevent every impulsive tool addition. It makes you notice that you're making one. The person who adds a sixth tool while running the hex is aware they're breaking the constraint, which means they're more likely to evaluate the addition critically than someone who never had the constraint at all. The tool might stay, but it stays because it earned its slot, not because it accumulated by default.

The Constraint Fatigue Pattern

Nearly every case study reported the same timeline: the first two weeks of the hex felt liberating, weeks three through five felt productive, and weeks six through eight were the danger zone. That 6-8 week mark is where constraint fatigue sets in — the initial energy of "I'm simplifying" fades, the novelty of the constrained stack wears off, and the "just one more tool" impulse returns in force.

The trigger at the fatigue point was usually external. A colleague mentions a tool. A newsletter features a new release. A YouTube video demonstrates something that looks relevant. The internal dialogue shifts from "I don't need that" to "maybe I should test it" to "I'll just try it for a week." The people who survived the fatigue point without relapsing had one thing in common: they had a written hex — an actual document listing their tools, the justification for each one, and the rule for adding or swapping. The constraint was externalized, not just felt.

The people who didn't write it down were more likely to drift. The constraint lived in their head, which meant it was subject to the same rationalization that got them to eleven tools in the first place. "I said five tools, but what I really meant was five core tools, and this new one is supplementary, so it doesn't count." Written constraints are harder to negotiate with than mental ones.

After the fatigue point, the constraint typically stabilized. People who made it past eight weeks without relapsing reported that the impulse to add tools diminished significantly — not because the impulse disappeared, but because the habit of asking "does this earn a slot" became automatic. The constraint became a reflex rather than a discipline. That's when it sticks.

The Unexpected Benefit

We expected the biggest reported benefit of the hex to be productivity — more output, less time lost to tool-switching. That's what the constraint promises, and it's what most people said they wanted when they downloaded the PDF.

It wasn't what they reported as the biggest benefit.

Across nine of eleven case studies, the most frequently cited benefit was reduced decision fatigue. Not "I got more done" but "I stopped spending mental energy deciding which tool to use for each task." The freelancer who cut from eleven tools to five didn't report writing more copy. They reported that the copy came easier because they weren't choosing between three writing tools before starting. The educator who cut to one LLM didn't report better course prep. They reported that course prep felt less draining because the tool choice was already made.

This makes sense when you think about what tool sprawl actually costs. The direct cost is subscription dollars and context-switching time. The indirect cost — the one that's harder to measure but might be larger — is the ongoing decision overhead of maintaining a complex stack. Which tool for this task? Am I using the right one? Should I switch? Is the new one better? These questions don't show up in time tracking because they happen in the background, as a persistent low-level cognitive tax. Removing the questions freed up bandwidth that people redirected — often unconsciously — toward the actual work.

The developer case study put it most clearly: "I didn't ship more code per hour. I shipped more hours of code per day, because I wasn't losing an hour every morning deciding which AI tool configuration to use." The hex didn't make the tools better. It made the person more available to use them.

The Tools That Never Earned Their Slot Back

Some tool categories were cut in the initial hex audit and never came back — not in any of the eleven case studies, not at the three-month check-in, not at six months, not at a year. These are the categories that the hex suggests are genuinely unnecessary for most individual practitioners and small teams.

AI-powered meeting note tools (Otter, Fireflies, and similar). Every case study that used these tools cut them, and none added them back. The replacement was consistent: one person takes notes, and if the notes need cleanup, the general-purpose LLM handles it. The dedicated meeting AI was producing outputs that nobody read consistently — a problem that better transcription couldn't fix because the problem was consumption, not production.

AI chatbots for customer-facing use. The e-commerce operator's experience was representative — the chatbot was actively losing sales. The consultant dropped their client-facing AI assistant. The nonprofit never had one. No case study in the series reported a customer-facing AI chatbot that clearly earned its slot. The common finding was that customers preferred static resources (FAQ pages, documentation) over AI interactions that might hallucinate.

Demand forecasting and AI analytics tools. The e-commerce operator, the agency, and the consultant all used some form of AI-powered analytics or forecasting. All cut them. The replacement was usually a spreadsheet or the platform's native analytics. The finding across cases was that the AI forecasting added a layer of false precision — the tools produced confident-looking projections that weren't meaningfully better than the human's informed estimates.

These three categories share a pattern: they produce impressive-looking outputs that don't survive contact with actual use. The meeting notes look thorough but don't get read. The chatbot sounds helpful but frustrates customers. The forecast looks data-driven but isn't more accurate than intuition. The hex reveals these tools because the constraint forces you to evaluate outputs, not capabilities.

The Hex as Practice vs. One-Time Audit

The most important finding from the retrospective is about how the hex works over time. The initial hex — the moment you audit your stack, cut the extras, and commit to a constraint — is valuable but temporary. It's a one-time purge. The real value of the hex is what comes after: the ongoing practice of evaluating every tool against the constraint before it earns a slot.

The case studies that reported the best long-term outcomes treated the hex as a quarterly practice, not a one-time event. Every three months, review the stack. For each tool, ask: did I use this meaningfully in the last 90 days? Did it produce output that I shipped? Would I re-subscribe if the subscription lapsed? The quarterly review prevents drift — the slow re-accumulation of tools that happens when the initial constraint energy fades and the normal tool-discovery impulse resumes.

The case studies that treated the hex as a one-time event — cut the stack, feel good, move on — showed tool creep returning within 4-6 months. Not back to the original sprawl, but trending in that direction. The constraint without maintenance is a diet without follow-up. It works for a while, and then the old patterns return.

What We'd Change

After eleven case studies and a year of follow-up, here's what we'd modify about the hex framework.

First, the initial tool count recommendation should be softer. The hex suggests a hard cap, but several case studies found that the right number depends on the role. A solo publisher needs fewer tools than an agency team. An e-commerce operator needs a different configuration than a developer. The constraint should be "the minimum that produces your output" rather than a specific number — though having a specific number as a starting point is useful for forcing the initial audit.

Second, the framework should more explicitly separate "production tools" from "AI tools." The publisher's CMS isn't an AI tool. The developer's IDE isn't an AI tool. The e-commerce operator's Shopify account isn't an AI tool. These are production infrastructure. The hex constraint applies to the AI layer on top of that infrastructure — the tools you added because AI. Conflating the two leads to unnecessary stress about whether your CMS "counts" toward the hex.

Third, the framework needs a stronger emphasis on the quarterly review. Too many people treated the hex as a one-time event. The value is in the practice, not the purge. We'd build the quarterly review into the PDF itself — a specific date, a specific set of questions, a specific process for evaluating whether a tool still earns its slot.

The hex isn't a rule. It's a discipline. And like any discipline, it works when you practice it, not when you announce it. The eleven case studies in this series — the publisher, the freelancer, the agency, the developer, the YouTuber, the consultant, the educator, the podcaster, the e-commerce operator, the nonprofit, the remote team — all started with the same PDF and the same constraint. What separated the ones who got lasting value from the ones who drifted back wasn't the initial cut. It was whether they kept asking the question.


This is part of CustomClanker's Hex in the Wild series — real setups from real people.