30-second answer: Account fit scoring rates how closely a company matches an ideal customer profile, expressed as a number or letter grade so revenue teams can rank-order accounts. The vocabulary covers model classes (rule-based, weighted, machine-learned), input classes (firmographic, technographic, behavioural, intent), output classes (fit score, engagement score, propensity, MQA), and operating terms (decay, calibration, threshold, retraining, lookalike). This glossary defines 23 fit-scoring terms.
To see account fit scoring driving routing inside Abmatic AI, book a demo.
Rule-based scoring uses explicit if-then rules: industry equals SaaS, headcount between 200 and 5000, revenue between 50M and 500M. The scoring is transparent and easy to debug but brittle as ICP definitions evolve.
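For illustration, a minimal Python sketch of a rule-based check; the attribute names and cutoffs are hypothetical, not a prescribed schema:

```python
# Minimal rule-based fit check; attribute names and cutoffs are illustrative only.
def rule_based_fit(account: dict) -> bool:
    return (
        account.get("industry") == "SaaS"
        and 200 <= account.get("headcount", 0) <= 5000
        and 50_000_000 <= account.get("revenue", 0) <= 500_000_000
    )

print(rule_based_fit({"industry": "SaaS", "headcount": 800, "revenue": 120_000_000}))  # True
```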
Weighted scoring assigns numeric weights to each fit attribute (industry weight, size weight, stack weight) and sums them into a composite score. It is more flexible than rule-based and remains transparent.
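A minimal sketch of the same idea in Python, assuming hypothetical per-attribute sub-scores in the 0 to 1 range and weights that sum to 100:

```python
# Illustrative weighted fit score: weights and sub-scores are assumptions, not benchmarks.
WEIGHTS = {"industry": 40, "size": 35, "stack": 25}

def weighted_fit(sub_scores: dict) -> float:
    return sum(WEIGHTS[attr] * sub_scores.get(attr, 0.0) for attr in WEIGHTS)

print(weighted_fit({"industry": 1.0, "size": 0.7, "stack": 0.5}))  # 77.0
```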
Machine-learned scoring trains a model (logistic regression, gradient-boosted trees) on historical closed-won versus closed-lost data, learning the attribute weights automatically. It captures non-linear interactions but requires training volume and ongoing recalibration.
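A sketch of the machine-learned approach using scikit-learn, with synthetic stand-in data; the features, labels, and split are placeholders, not a production recipe:

```python
# Logistic regression trained on closed-won (1) vs closed-lost (0) accounts,
# with a holdout split so scores are evaluated out-of-sample.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X = np.random.rand(500, 4)           # stand-in firmographic/technographic features
y = np.random.randint(0, 2, 500)     # stand-in win/loss labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression().fit(X_train, y_train)
fit_scores = model.predict_proba(X_test)[:, 1] * 100  # scale win probability to a 0-100 score
```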
Lookalike modelling builds a fit score by training on existing high-value customers as positive examples and finding accounts with similar attributes. It is the standard cold-start approach when explicit fit definitions are weak.
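One way to sketch a lookalike score is mean cosine similarity to a seed set of existing customers; the vectors below are synthetic stand-ins, and the similarity measure is an assumption, since lookalike systems vary:

```python
# Lookalike sketch: score prospects by average similarity to high-value customers.
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

seed_customers = np.random.rand(50, 6)   # stand-in attribute vectors for existing customers
prospects = np.random.rand(1000, 6)      # stand-in attribute vectors for candidate accounts

lookalike_score = cosine_similarity(prospects, seed_customers).mean(axis=1)  # higher = more similar
top_prospects = np.argsort(lookalike_score)[::-1][:25]                       # 25 closest lookalikes
```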
A composite score combines fit and engagement (or fit and intent) into a single rank-orderable number. The composite drives MQA thresholds and routing.
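As a sketch, the composite is often a weighted blend; the 60/40 split below is an assumption to illustrate the shape, not a recommended weighting:

```python
# Illustrative composite: blend fit and engagement into one rank-orderable number.
def composite(fit: float, engagement: float, fit_weight: float = 0.6) -> float:
    return fit_weight * fit + (1 - fit_weight) * engagement

print(composite(fit=85, engagement=40))  # 67.0
```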
Firmographic inputs include industry, employee count, revenue, geography, and ownership type. They are the foundation of fit scoring because product-market fit usually maps to firmographic attributes.
Technographic inputs describe the company's technology stack and feed fit scoring when the product depends on stack compatibility (a Salesforce-native product weights the presence of Salesforce in the stack heavily).
Behavioural inputs include site visits, content consumption, demo requests, and ad engagement. Behavioural inputs often feed engagement scoring rather than fit scoring; the distinction is that fit is structural and engagement is observed.
Intent inputs include first-party, third-party, and predictive intent signals. They time the outreach but rarely belong inside the fit score itself; intent answers when, fit answers whether.
CRM disposition inputs include past opportunity outcomes (won, lost, no-decision), churn history, and prior account-team designation. They are powerful predictors but introduce path dependence in models that retrain on them.
The fit score is the structural-match number, typically 0 to 100 or a letter grade from A through F. It answers "should we sell to this account?" See account fit score.
The engagement score is the activity-based number, summing weighted contact actions across an account. It answers "is this account paying attention right now?"
A propensity score is a forward-looking probability that an account will convert in a defined window. It is usually machine-learned. Propensity scores are the most actionable single number for prioritization.
The MQA threshold is the composite-score cutoff above which an account is marked qualified for sales engagement. The right threshold balances precision and recall for the sales capacity available. See marketing qualified account.
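One way to pick the cutoff is to sweep thresholds and read off precision and recall at each; the sketch below uses synthetic labels and scores as stand-ins:

```python
# Threshold sweep: pick the MQA cutoff where precision/recall matches sales capacity.
import numpy as np
from sklearn.metrics import precision_recall_curve

y_true = np.random.randint(0, 2, 300)   # stand-in: 1 = converted, 0 = did not
scores = np.random.rand(300)            # stand-in composite scores scaled to [0, 1]

precision, recall, thresholds = precision_recall_curve(y_true, scores)
for p, r, t in list(zip(precision, recall, thresholds))[::50]:
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
```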
Tier thresholds split scored accounts into 1:1, 1:few, and 1:many cohorts. Tier 1 is reserved for the highest-fit, highest-propensity accounts.
Calibration tunes the scoring function so that probability predictions match observed conversion rates. A score of 80 should imply roughly an 80 percent conversion rate; if it implies 30 percent, the model is miscalibrated.
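A quick calibration check, sketched with scikit-learn on synthetic stand-in data: bucket the predicted probabilities and compare each bucket's mean prediction to its observed conversion rate:

```python
# Calibration check: predictions and outcomes should track bucket by bucket.
import numpy as np
from sklearn.calibration import calibration_curve

y_true = np.random.randint(0, 2, 500)   # stand-in conversion outcomes
y_prob = np.random.rand(500)            # stand-in model probabilities

observed, predicted = calibration_curve(y_true, y_prob, n_bins=10)
for pred, obs in zip(predicted, observed):
    print(f"predicted {pred:.2f} -> observed {obs:.2f}")  # well calibrated when these track
```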
Score decay reduces an account's engagement or intent score as time passes since the most recent activity. Without decay, dormant accounts retain inflated scores.
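Decay is commonly implemented as an exponential half-life; the 30-day half-life below is an assumption for illustration:

```python
# Exponential score decay with an assumed 30-day half-life.
def decayed_score(score: float, days_since_last_activity: float, half_life_days: float = 30) -> float:
    return score * 0.5 ** (days_since_last_activity / half_life_days)

print(decayed_score(80, days_since_last_activity=60))  # 20.0 after two half-lives
```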
Retraining cadence is how often the model is refit on fresh data. Most production fit-scoring models retrain monthly; some retrain weekly when signal volume permits.
Drift is the gradual divergence between the data the model was trained on and the data it now sees in production. Drift detection triggers retraining.
Holdout validation reserves a sample of accounts for testing the model out-of-sample. Without holdout, scoring metrics are over-optimistic.
The confusion matrix tabulates true-positives (high score, won), false-positives (high score, lost), false-negatives (low score, won), and true-negatives (low score, lost). It is the basic diagnostic for fit-score quality.
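A minimal tabulation in Python, using the same quadrant labels; the cutoff and account records are hypothetical:

```python
# Tabulate the four confusion-matrix quadrants for a fit-score cutoff.
def confusion(accounts, cutoff=70):
    tp = sum(a["score"] >= cutoff and a["won"] for a in accounts)      # high score, won
    fp = sum(a["score"] >= cutoff and not a["won"] for a in accounts)  # high score, lost
    fn = sum(a["score"] < cutoff and a["won"] for a in accounts)       # low score, won
    tn = sum(a["score"] < cutoff and not a["won"] for a in accounts)   # low score, lost
    return tp, fp, fn, tn

print(confusion([{"score": 85, "won": True}, {"score": 90, "won": False}, {"score": 40, "won": True}]))
```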
Score-driven routing assigns accounts and leads to sales reps based on the composite score, sending high-fit, high-engagement accounts to the highest-tier sellers. See how to route leads from intent signals.
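A sketch of a score-driven routing rule; the tier cutoffs and team names are assumptions for illustration, not Abmatic AI defaults:

```python
# Route accounts to teams by composite score; cutoffs here are illustrative.
def route(composite_score: float) -> str:
    if composite_score >= 85:
        return "enterprise_team"    # 1:1 motion
    if composite_score >= 60:
        return "mid_market_team"    # 1:few motion
    return "nurture_queue"          # 1:many motion

print(route(92))  # enterprise_team
```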
Score-based sales plays trigger pre-defined outreach sequences when an account crosses a threshold. The play is the action; the threshold is the trigger. See how to set up account scoring and how to score account fit without a data team.
Ready to see fit scoring driving real routing rules? Book a demo of Abmatic AI.
Lead scoring is contact-level and traditionally combines fit and engagement into one number. Fit scoring is account-level and usually separates fit from engagement so the two can be tuned and weighted independently. See lead scoring.
Whether to start with rules or a machine-learned model depends on training data and category maturity. Programs with under 200 closed-won deals usually start with weighted rules because there is not enough data to fit a model reliably. Above 200 closed-won deals, machine-learned scoring usually outperforms rules.
Quarterly is the standard cadence for revisiting the MQA threshold. The threshold drifts as ICP, sales capacity, and category dynamics change. A threshold set in Q1 may saturate sales by Q3 if the program scales.
Reporting fit and engagement separately and combining them at routing time is the cleaner design. It lets revenue teams see why an account is or is not qualified (high fit but low engagement is different from low fit but high engagement) and allows independent threshold tuning.
The most common mistake is training only on closed-won data without including closed-lost or no-decision outcomes. Without negatives, the model cannot learn what differentiates wins from losses; it just learns to recognize past winners. Always include the loss class.
Two patterns produce the most reliable fit scoring in production. The first is the explicit-then-learned pattern: launch with a documented weighted rule set, run it for two quarters to build labelled outcomes, then introduce a machine-learned overlay that retains the explicit rules as transparency anchors. The second is the per-segment pattern: train separate fit models for distinct go-to-market motions (new business versus expansion, tier 1 versus tier 3) rather than forcing a single model across all motions. Both patterns trade simplicity for accuracy and tend to outperform single-model production deployments.
The most common anti-patterns are training on a too-narrow time window (only last quarter's wins, ignoring seasonality), conflating fit and engagement into a single weight (which makes both untunable), and refreshing the model in place without maintaining a champion-challenger comparison (which obscures regression). Avoiding these three patterns alone tends to materially improve fit-scoring program reliability.
Fit scoring is a foundational construct for B2B revenue programs, and one of the most error-prone. The cleanest stacks separate fit from engagement, retrain on fresh data, validate against a holdout, and revisit thresholds quarterly. Use this glossary as a reference when reading fit-scoring vendor documentation and designing routing rules.
See fit scoring built into orchestration inside Abmatic AI. Book a demo.