GMP & Manufacturing · April 27, 2026

Cleaning Validation Acceptance Criteria: The Worst-Case Approach FDA Expects — and Where Manufacturers Keep Stumbling

Beyond the 10 ppm default: how HBEL-derived limits, MACO calculations, and recovery studies hold up under FDA's 21 CFR 211.67 cleaning validation scrutiny.

Sam Sammane
Founder & CEO, Aurora TIC | Founder, Qalitex Group

Cleaning validation is one of those areas where a facility can be technically compliant — written procedures, completed batch records, signed protocols — and still collect a Form 483 observation. Equipment cleaning and maintenance citations under 21 CFR 211.67 have appeared among FDA’s top 10 most-cited cGMP violations for nine consecutive years. The gap between having a cleaning validation program and having one that survives modern FDA scrutiny is wider than most quality teams realize.

The worst-case approach is where that gap shows up most often.

Why the Old 10 ppm Limit Is No Longer Your Safe Harbor

For decades, the 10 ppm carryover limit was the industry default — a rule of thumb that emerged from PDA technical reports in the early 1990s and found its way into internal SOPs across thousands of facilities worldwide. Pair it with visual cleanliness and 0.1% (one-thousandth) of the minimum therapeutic dose, and you had what most firms considered a defensible acceptance criterion.

FDA and EMA have moved on from that framework. The EMA’s 2014 Guideline on Setting Health-Based Exposure Limits (HBELs) for cleaning validation — formally EMA/CHMP/CVMP/SWP/169430/2012 — established that acceptance criteria should derive from toxicological data specific to each active pharmaceutical ingredient, not from a blanket concentration limit. The core concept is the Permitted Daily Exposure (PDE): a substance-specific dose below which no adverse effects are expected, even with lifetime exposure.

FDA hasn’t issued equivalent standalone guidance on HBELs for shared facilities, but investigators have been asking about HBEL-derived limits since at least 2018. We’ve seen 483 observations cite “inadequate scientific justification” for acceptance criteria where facilities relied solely on the 10 ppm default without a toxicological risk assessment. That observation used to be rare. It isn’t anymore.

If your facility manufactures highly potent APIs — oncology compounds, hormones, or beta-lactam antibiotics — the HBEL approach isn’t optional. The math simply doesn’t support a blanket 10 ppm limit for a compound with a PDE of 0.01 mg/day. Conversely, for some lower-potency molecules, 10 ppm is unnecessarily conservative. The HBEL approach cuts both ways, and that nuance is actually useful when you’re defending limits to an investigator.

In practice, this means every product manufactured in a shared facility needs a documented PDE value, derived from either an internal toxicology review or a qualified external source. ICH Q3C provides the methodology for residual solvents; ISPE’s Risk-Based Manufacture of Pharmaceutical Products (Risk-MaPP) guide remains a solid reference framework for APIs. Both are routinely cited in FDA audit responses that hold up.
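To make the denominator of that review concrete: the ICH Q3C-style PDE derivation divides a NOAEL-based dose by a chain of modifying factors. A minimal sketch — the function name, NOAEL, and factor values below are purely illustrative, not taken from any real compound:

```python
def pde_mg_per_day(noael_mg_per_kg_day, body_weight_kg=50.0,
                   f1=5.0, f2=10.0, f3=1.0, f4=1.0, f5=1.0):
    """ICH Q3C-style PDE: NOAEL x body weight / (F1 x F2 x F3 x F4 x F5).

    F1: interspecies extrapolation (e.g., 5 for rat-to-human)
    F2: inter-individual variability (conventionally 10)
    F3: adjustment for short study duration
    F4: adjustment for severe toxicity
    F5: adjustment when a LOAEL is used instead of a NOAEL
    """
    return (noael_mg_per_kg_day * body_weight_kg) / (f1 * f2 * f3 * f4 * f5)

# Hypothetical rat-study NOAEL of 0.5 mg/kg/day:
print(pde_mg_per_day(0.5))  # 0.5 mg/day
```

A production PDE comes from a toxicologist's monograph, not a five-line function; the point is that every factor in the denominator should be individually traceable in the controlled document.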

Building Your MACO: Where the Calculation Actually Gets Audited

The Maximum Allowable Carryover (MACO) is the quantitative backbone of a cleaning validation acceptance criterion. The standard calculation is straightforward in concept:

MACO (mg) = PDE (mg/day) × Minimum Batch Size of Next Product (mg) ÷ Maximum Daily Dose of Next Product (mg/day)
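The formula above can be sketched in a few lines; the figures are hypothetical, not from any real product pairing:

```python
def maco_mg(pde_mg_day, min_batch_next_mg, max_daily_dose_next_mg_day):
    """Maximum Allowable Carryover from the formula above."""
    return pde_mg_day * min_batch_next_mg / max_daily_dose_next_mg_day

# Hypothetical: PDE 0.5 mg/day, minimum next-product batch 50 kg (5e7 mg),
# maximum daily dose of the next product 500 mg/day
print(maco_mg(0.5, 5e7, 500.0))  # 50000.0 mg, i.e., 50 g total carryover
```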

In practice, the inputs are where firms get into trouble — and where regulatory compliance consulting reviews consistently find the most significant gaps.

The most common error is using an “average” or “representative” batch size rather than the minimum. Using the average inflates the MACO and makes your acceptance criterion look more achievable than it is. But if FDA asks why you didn’t use the minimum batch size, there’s no defensible answer. The minimum batch size is what protects the most vulnerable patient — the one receiving the highest theoretical dose of contaminated product from the smallest possible batch.

The second frequent misstep is calculating MACO without correcting for analytical recovery. If your swab recovery study shows a mean recovery of 73%, the residue actually present on the surface is 1/0.73 times what your analytical result indicates. Failing to apply that correction means you systematically understate the residue on the surface — the limit you apply to analytical results is more permissive than your protocol claims. This comes up in audits more often than it should. It's documented in ISPE guidance, it's logically necessary, and yet protocols that calculate a surface area limit from MACO alone — without the recovery factor — are still common.
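The correction itself is one line — the discipline is applying it everywhere an analytical result feeds a decision. A sketch with an illustrative 73% recovery:

```python
def corrected_residue_ug(measured_ug, recovery_fraction):
    # Actual surface residue = measured / recovery: a 73% recovery means
    # the swab result understates what is on the surface by 1/0.73.
    return measured_ug / recovery_fraction

print(round(corrected_residue_ug(10.0, 0.73), 2))  # 13.7
```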

Even if your primary acceptance criterion is HBEL-derived, cross-check your MACO against all three traditional criteria as a safety net:

  • Is the carryover below 10 ppm in the next product’s batch?
  • Is it below 0.1% of the minimum therapeutic dose?
  • Is the equipment visually clean?

All three should be satisfied. If your HBEL-derived limit is more stringent — which it often is for potent compounds — use that as your criterion. If the MACO passes the HBEL check but fails the 0.1% dose cross-check, that’s a signal to take a second look at your toxicological inputs.
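The three-way safety net is easy to automate. A sketch — all names and figures are hypothetical, and the dose cross-check here compares the carryover delivered in one maximum daily dose of the next product against 0.1% of the contaminant's minimum therapeutic dose:

```python
def cross_check(maco_mg, min_batch_next_mg, max_daily_dose_next_mg_day,
                min_ther_dose_mg, visually_clean):
    # Criterion 1: concentration in the next batch below 10 ppm
    ppm_in_next_batch = maco_mg / min_batch_next_mg * 1e6
    # Criterion 2: carryover in one daily dose of the next product
    # below 0.1% of the contaminant's minimum therapeutic dose
    carryover_per_dose_mg = maco_mg * max_daily_dose_next_mg_day / min_batch_next_mg
    return {
        "below_10_ppm": ppm_in_next_batch <= 10.0,
        "below_0.1_pct_dose": carryover_per_dose_mg <= 0.001 * min_ther_dose_mg,
        "visually_clean": visually_clean,
    }

# Hypothetical: MACO 400 mg, 50 kg minimum next batch, 500 mg/day next-product
# dose, 50 mg minimum therapeutic dose of the residual API
checks = cross_check(400.0, 5e7, 500.0, 50.0, True)
print(checks)  # all three criteria satisfied for these inputs
```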

Worst-Case Selection: More Rigorous Than “Hardest to Clean”

The worst-case approach requires identifying the combination of factors presenting the greatest contamination risk. Most cleaning SOPs describe this in broad terms: worst-case equipment, worst-case product, worst-case location. The concepts are correct. The documentation is usually what falls short.

Worst-case equipment is typically selected via a matrix or bracketing strategy. For a facility running the same active across five different granulators, testing a single representative unit may be acceptable — but only with documented rationale explaining exactly why that unit represents the worst case. Surface material, geometry, number of contact points, internal weld quality, and the presence of dead legs all factor in. FDA investigators routinely ask to see the bracketing rationale, not just the conclusion.

Worst-case product involves two dimensions that can point in different directions. First, the product with the lowest PDE — the hardest to clear down to its limit, because the acceptable residue is smallest. Second, the product with the worst cleanability — lowest solubility, stickiest excipient matrix, most heat-sensitive API. The former is a hazard-based criterion; the latter is process-based. Both need to be addressed. For facilities using matrix validation, the worst-case product selection methodology deserves its own controlled document rather than being buried in a validation protocol narrative.

Worst-case location is where visual inspection creates false confidence. Dead legs in piping, internal seams on blenders, the undersides of agitator shafts, and gasket interfaces are exactly where residue concentrates — and precisely where routine swabbing is most difficult. If your sampling plan doesn’t specifically address these critical zones, with documented rationale for how they’re sampled or why a rinse sample is representative, an investigator has a clear basis for a 483.

The practical test: could a quality auditor who has never seen your facility pick up your cleaning validation risk assessment and identify exactly which equipment location, which product combination, and which acceptance limit represents the worst case? If the answer is no, the documentation isn’t there yet.

Swab vs. Rinse Sampling: Choosing the Method That Will Hold Up

Both swab and rinse sampling are scientifically acceptable. The choice depends on the surface, the residue, and what you’re measuring — and the protocol must explicitly justify the choice, not just state it.

Swab sampling is preferred for surfaces with poor drainage, irregular geometry, or insoluble residues. A swab physically removes material from a defined area, which makes it the most direct measurement of surface contamination. The limitation is operator variability: swabbing technique directly affects recovery, which is exactly why recovery studies are non-negotiable — and why SOPs must specify pressure, stroke pattern, and swab saturation.

Rinse sampling is appropriate for large, smooth, well-drained surfaces and residues that are readily soluble. The risk is treating rinse samples as a universal surrogate for direct surface contact when they aren’t. A rinse sample from a 500 L tank tells you what’s coming off in solution; it doesn’t capture what remains in corners, at the liquid-air interface, or behind fittings.

The analytical method matters as well. Total Organic Carbon (TOC) per USP <643> has become the default non-specific method for detecting API residues and cleaning agent carryover — it’s sensitive, broadly applicable, and faster than HPLC for routine post-cleaning testing. But TOC cannot differentiate between API residue and residual cleaning agent. Your cleaning agent must itself be validated as TOC-negligible at the concentrations expected in rinse water, or you need an analytical method that separates the signals.

HPLC with UV detection remains appropriate for specific detection, particularly when you need to distinguish between multiple potential residues. If your cleaning agent is also carbon-based and you’re using TOC as your only analytical tool, the method is not scientifically justified for that product combination.

Recovery Studies: The Data Your Protocol Must Generate Internally

Recovery studies cannot be sourced from literature or borrowed from peer facilities. They must be generated using your swabs, your solvents, your surfaces, your analysts, and your analytical instruments. This is the most consistently underbuilt section of cleaning validation packages we review — and the one most likely to draw a 483 observation when an investigator asks for the underlying data.

The widely accepted criterion for swab recovery is a mean of ≥70% with a relative standard deviation ≤20%. Those thresholds aren’t codified in a CFR section, but they’re consistent with PIC/S PI 006 guidance and represent what investigators use as a reference point when evaluating study design. If your mean recovery lands at 62%, you have two options: optimize your swabbing methodology until you reach the threshold, or apply the actual recovery factor in your limit calculation and document why the lower recovery represents your validated condition.
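Checking replicate coupon recoveries against the mean ≥70% / RSD ≤20% thresholds is simple to automate. A sketch with hypothetical replicate data:

```python
import statistics

def recovery_ok(recoveries_pct, min_mean=70.0, max_rsd=20.0):
    """Return (pass/fail, mean, %RSD) for replicate coupon recoveries."""
    mean = statistics.mean(recoveries_pct)
    rsd = statistics.stdev(recoveries_pct) / mean * 100.0  # sample %RSD
    return (mean >= min_mean and rsd <= max_rsd), mean, rsd

ok, mean, rsd = recovery_ok([72.0, 78.0, 69.0, 75.0, 71.0, 74.0])
print(ok, round(mean, 1), round(rsd, 1))  # passes: mean ~73.2%, RSD ~4.4%
```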

What FDA is really looking for is the traceable link between recovery data and the final acceptance limit. The formula should be explicit in the protocol:

Surface Limit (μg/cm²) = MACO (μg) ÷ Shared Surface Area (cm²) × Recovery (as a fraction)

Equivalently, divide by a correction factor of 1/recovery: a 73% recovery tightens the limit applied to analytical results by a factor of 0.73.

If that chain of custody — from PDE to MACO to surface limit to analytical result — is documented, clean, and traceable, an investigator can follow it. If any link is missing, the entire cleaning validation program is exposed.
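That chain can be made explicit in a few lines — useful as a protocol appendix calculation, not a substitute for the protocol itself. All figures are hypothetical; recovery is expressed here as a fraction, so multiplying by 0.73 is the same as dividing by a correction factor of 1/0.73:

```python
def surface_limit_ug_cm2(maco_mg, shared_area_cm2, recovery_fraction):
    """Acceptance limit applied to the analytical (swab) result.

    Multiplying by recovery keeps the limit conservative: a measured
    result at this limit corresponds to actual residue at MACO/area.
    """
    maco_ug = maco_mg * 1000.0
    return maco_ug / shared_area_cm2 * recovery_fraction

# Hypothetical: MACO 400 mg, 20 m2 (2e5 cm2) shared contact area, 73% recovery
print(surface_limit_ug_cm2(400.0, 2e5, 0.73))  # 1.46 ug/cm2
```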

One more detail that comes up frequently in regulatory compliance consulting engagements: the recovery study must cover the concentration range of your acceptance limit. If you’re validating that residues below 2 μg/cm² won’t carry over, you need to spike coupons at or near that level and demonstrate quantitative recovery at that concentration. Spiking at five times your limit and extrapolating downward isn’t scientifically defensible. Some FDA investigators will let it pass; others won’t. It’s not worth the exposure.

What to Review Before Your Next Inspection

Pull your current cleaning validation protocols and look for four specific gaps. A documented HBEL-derived acceptance criterion for each API in shared equipment. A MACO calculation that uses the minimum batch size and applies the analytical recovery correction. A worst-case selection rationale that stands as a standalone, controlled document. And internal coupon recovery data generated at or near your acceptance limit.

If any of those are missing or thin, address them before an investigator asks. Cleaning validation 483 observations are almost never about surprise findings — they’re about documentation gaps that were visible internally and left unresolved. Getting it right isn’t about cleaning better. It’s about documenting what you already know how to do.


Written by Sam Sammane, Founder & CEO, Aurora TIC | Founder, Qalitex Group.

