
Image by Editor
# Introduction
For many organizations, notably those in regulated industries or with restricted technical infrastructure, Excel and its XLMiner add-in serve as the primary platform for predictive modeling and machine learning workflows.
But Excel's accessibility masks a critical gap: the difference between running models and building trustworthy analytics systems. While working on a loan approval prediction project, I discovered that Excel-based machine learning fails not because of algorithmic limitations, but because of a handful of practices that are frequently overlooked.
This article distills those hard-won lessons into five comprehensive frameworks that can elevate your Excel-based machine learning work.
# Lesson 1: Multiple Methods for Outlier Detection
Outlier handling is more art than science, and premature removal can eliminate legitimate extreme values that carry important information. In one case, all residential asset values above the 95th percentile were removed using a simple IQR calculation, under the assumption that they were errors. Later analysis revealed that the removal had discarded legitimate ultra-high-value properties, a relevant segment for large loan approvals.
The lesson: Use multiple detection methods and manual review before removal. Create a comprehensive outlier detection framework.
In a new sheet adjacent to the main data, create detection columns:
- Column A: Original Value (residential_assets_value)
- Column B: IQR Method
=IF(A2 > QUARTILE.INC($A$2:$A$4270,3) + 1.5*(QUARTILE.INC($A$2:$A$4270,3)-QUARTILE.INC($A$2:$A$4270,1)), "Outlier_IQR", "Normal")
- Column C: 3-Sigma Method
=IF(ABS(A2-AVERAGE($A$2:$A$4270)) > 3*STDEV($A$2:$A$4270), "Outlier_3SD", "Normal")
- Column D: Percentile Method
=IF(A2 > PERCENTILE.INC($A$2:$A$4270,0.99), "Outlier_P99", "Normal")
- Column E: Combined Flag
=IF(COUNTIF(B2:D2,"Outlier*")>=2, "INVESTIGATE", "OK")
- Column F: Manual Review [Notes after investigating]
- Column G: Final Decision [Keep/Remove/Transform]
This multi-method approach revealed patterns in my loan data:
- Values flagged by all three methods (IQR, 3-sigma, and percentile): Likely errors
- Values flagged by IQR but not 3-sigma: Legitimate extreme values in skewed distributions
- Values flagged only by percentile: The extreme but valid cases I almost lost
The "Manual Review" column is crucial. For each flagged observation, document findings such as: "Legitimate luxury property, verified against public records" or "Likely data entry error, value exceeds market maximum by 10x."
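To see how often the methods agree before making any removal decisions, a small summary block helps (a minimal sketch; the cell ranges assume the 4,269-row layout above):
- IQR flags: =COUNTIF($B$2:$B$4270,"Outlier_IQR")
- 3-Sigma flags: =COUNTIF($C$2:$C$4270,"Outlier_3SD")
- Percentile flags: =COUNTIF($D$2:$D$4270,"Outlier_P99")
- Rows needing investigation: =COUNTIF($E$2:$E$4270,"INVESTIGATE")
If the three counts diverge sharply, the distribution is probably skewed, and the IQR threshold alone should not drive removals.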
# Lesson 2: Always Set Random Seeds
Few experiences are more frustrating than presenting excellent model results, then being unable to reproduce those exact numbers when preparing the final report. This happened with a classification tree model: one day's validation accuracy was 97.3%, but the next day's was 96.8%. The difference seems small, but it undermines credibility and leaves the audience wondering which number is real and how much the analysis can be trusted.
The lesson: The culprit is random partitioning with no fixed seed. Most machine learning algorithms involve randomness at some stage:
- Data partitioning: Which observations go into training vs. validation vs. test sets
- Neural networks: Initial weight randomization
- Some ensemble methods: Random feature selection
XLMiner uses random processes for partitioning the data. Running the same model twice with identical parameters yields slightly different results because the training/validation split differs each time.
The solution is simple but non-obvious. When using XLMiner's partition functionality (found in most model dialogs):
- Check the box labeled "Set seed" (it is unchecked by default)
- Enter a specific integer: 12345, 42, 2024, or any memorable number
- Document this seed value in the Model Log
Now, every time the model is run with this seed:
- Same training/validation/test splits
- Same model performance metrics
- Same predictions for the same observations
- Perfect reproducibility (a quick verification formula follows this list)
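A quick way to confirm reproducibility is to paste the predicted classes from two runs into adjacent sheets and count mismatches (a minimal sketch; the sheet names Run1 and Run2 and the prediction column D are assumptions):
=SUMPRODUCT(--(Run1!D2:D855<>Run2!D2:D855))
With the seed set, this returns 0; without it, any positive count shows exactly how many observations changed predictions between runs.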
Here is an example from the loan approval dataset without a seed (three runs of an identical logistic regression):
- Run 1: Validation Accuracy = 92.4%, F1 = 0.917
- Run 2: Validation Accuracy = 91.8%, F1 = 0.923
- Run 3: Validation Accuracy = 92.1%, F1 = 0.919
And with seed=12345 (three runs of an identical logistic regression):
- Run 1: Validation Accuracy = 92.1%, F1 = 0.928
- Run 2: Validation Accuracy = 92.1%, F1 = 0.928
- Run 3: Validation Accuracy = 92.1%, F1 = 0.928
The difference matters enormously for credibility. When tasked with recreating the analysis, it can be done with confidence, knowing the numbers will match.
Important caveat: The seed controls randomness in partitioning and initialization, but it does not make the analysis immune to other changes. If the data is modified (adding observations, altering transformations) or model parameters are adjusted, results will still differ, as they should.
# Lesson 3: Proper Data Partitioning: The Three-Way Split
Related to reproducibility is partition strategy. XLMiner's default settings create a 60/40 training/validation split. This seems reasonable until the question arises: where's the test set?
A common mistake involves building a neural network, tuning it based on validation performance, then reporting those validation metrics as final results.
The lesson: Without a separate test set, the optimization occurs directly on the data being reported, inflating performance estimates. The correct partitioning strategy uses three sets.
1. Training Set (50% of Data)
- Where the model learns patterns
- Used to fit parameters, coefficients, or weights
- For the loan dataset: ~2,135 observations
2. Validation Set (30% of Data)
- For model selection and hyperparameter tuning
- Used to compare different models or configurations
- Helps select the best pruned tree, optimal cutoff value, or ideal neural network architecture
- For the loan dataset: ~1,280 observations
3. Test Set (20% of Data)
- "Final exam" – only score once
- Used ONLY after all modeling decisions are complete
- Provides an unbiased estimate of real-world performance
- For the loan dataset: ~854 observations
Crucial rule: Never iterate on test set performance. The moment a model is chosen because "it performs better on the test set," that test set becomes a second validation set, and performance estimates become biased.
This is my workflow now:
- Set the seed to 12345
- Partition 50/30/20 (training/validation/test); a manual sketch of this step follows this list
- Build multiple model variants, evaluating each on the validation set only
- Select the best model based on validation performance and business requirements
- Score the test set exactly once with the chosen model
- Report test set performance as the expected real-world outcome
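XLMiner handles the split automatically once the percentages are entered, but the same idea can be sketched by hand if needed (the helper columns H and I are assumptions, not part of the dataset):
- Column H: =RAND(), then immediately copy the column and Paste Special → Values, because RAND() has no seed and recalculates on every change
- Column I: =IF(H2<=0.5,"Training",IF(H2<=0.8,"Validation","Test"))
Freezing the random numbers as values is the manual equivalent of setting a seed: the assignment stays fixed from then on.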
Here is an example from the loan approval project:
| Model Version | Training Acc | Validation Acc | Test Acc | Chosen? |
|---|---|---|---|---|
| Logistic Regression (all vars) | 90.6% | 89.2% | Not scored | No |
| Logistic Regression (stepwise) | 91.2% | 92.1% | Not scored | No |
| Classification Tree (depth=7) | 98.5% | 97.3% | Not scored | Yes |
| Classification Tree (depth=5) | 96.8% | 96.9% | Not scored | No |
| Neural Network (7 nodes) | 92.3% | 90.1% | Not scored | No |
After selecting the Classification Tree (depth=7) based on validation performance, the test set was scored exactly once: 97.4% accuracy. This test accuracy represents the expected production performance.
# Lesson 4: The Training/Validation Gap: Spotting Overfitting Before It Hurts
The problem: At first glance, the classification tree results from the project report look promising.
Training Data Performance:
- Accuracy: 98.45%
- Precision: 99%
- Recall: 96%
- F1 Score: 98.7%
The model appeared successful until attention shifted to the validation results.
Validation Data Performance:
- Accuracy: 97.27%
- Precision: 98%
- Recall: 94%
- F1 Score: 97.3%
The gap appeared small, just a 1.18 percentage point difference in accuracy. But determining whether that gap constituted a problem required a systematic framework.
The lesson: It is critical to recognize when the model memorizes instead of learns.
The practical solution: Create an overfitting monitor. Build a simple but systematic comparison sheet that makes overfitting obvious.
Step 1: Create the Comparison Framework
Here is the model performance comparison in the "Overfitting_Monitor" sheet:
| Metric | Training | Validation | Gap | Gap % | Status |
|---|---|---|---|---|---|
| Accuracy | 98.45% | 97.27% | 1.18% | 1.20% | ✓ Good |
| Precision | 99.00% | 98.00% | 1.00% | 1.01% | ✓ Good |
| Recall | 96.27% | 94.40% | 1.87% | 1.94% | ✓ Good |
| F1 Score | 98.76% | 97.27% | 1.49% | 1.51% | ✓ Good |
| Specificity | 96.56% | 92.74% | 3.82% | 4.06% | ? Watch |
And here are the interpretation rules:
- Gap < 3%: ✅ Good – Model generalizing well
- Gap 3-5%: ❓ Watch – Acceptable but monitor closely
- Gap 5-10%: ⚠️ Concerning – Possible overfit, consider simplification
- Gap > 10%: ❌ Problem – Definite overfit, must address
And this is the detailed assessment:
- Overall Assessment: GOOD
- Reasoning: All major metrics fall within a 2% gap. The specificity gap is slightly higher but still acceptable. The model appears to generalize well.
- Recommendation: Proceed with test set evaluation.
Step 2: Add the calculation formulas
These use structured table references, so format the monitor range as an Excel table first.
Cell: Gap (for Accuracy)
=[@Training] - [@Validation]
Cell: Gap % (for Accuracy)
=([@Training] - [@Validation]) / [@Training]
Cell: Status (for Accuracy)
=IF([@[Gap %]]<0.03, "✓ Good",
  IF([@[Gap %]]<0.05, "? Watch",
    IF([@[Gap %]]<0.10, "⚠ Concerning", "✗ Problem")))
Step 3: Create a Visual Overfitting Chart
Build a side-by-side bar chart comparing training vs. validation for each metric. This makes patterns instantly visible:

When the bars are close, the model generalizes well. When the training bars are much longer than the validation bars, there is overfitting.
Comparing Across Different Models
The real value comes from comparing overfitting patterns across candidate models. Here is the comparison from the "Model_Overfitting_Comparison" sheet:
| Model | Train Acc | Val Acc | Gap | Overfitting Risk |
|---|---|---|---|---|
| Logistic Regression | 91.2% | 92.1% | -0.9% | Low (negative gap) |
| Classification Tree | 98.5% | 97.3% | 1.2% | Low |
| Neural Network (5 nodes) | 90.7% | 89.8% | 0.9% | Low |
| Neural Network (10 nodes) | 95.1% | 88.2% | 6.9% | High – Reject |
| Neural Network (14 nodes) | 99.3% | 85.4% | 13.9% | Very High – Reject |
Interpretation: The neural networks with 10+ nodes are clearly overfitting. Despite the 14-node network's high training accuracy (99.3%), its validation accuracy drops to 85.4%. The model memorized training data patterns that do not generalize.
Best choice: Classification tree
- High performance (97.3% validation)
- Minimal overfitting (1.2% gap)
- Good balance of complexity and generalization
Here are some simple ways to reduce overfitting when it appears:
- For Classification Trees: Reduce maximum depth or increase minimum samples per node
- For Neural Networks: Reduce the number of nodes or layers
- For Logistic Regression: Remove variables or use stepwise selection
- For All Models: Add more training data if possible
# Lesson 5: Enforce Data Validation for Categorical Variables
Data entry errors are silent killers of machine learning projects. A single typo, such as "gradute" instead of "graduate", creates a third category in what should be a binary variable. The model then encounters an unexpected feature value it has never seen during training, potentially causing errors during deployment or, worse, silently producing incorrect predictions.
The prevention: Excel's data validation feature. Here is the implementation protocol for categorical variables:
In a hidden sheet (name it "Validation_Lists"), create lists of valid values:
- For education: Create a column with the entries "Graduate" and "Not Graduate"
- For self-employment: Create a column with the entries "Yes" and "No"
- For loan status: Create a column with the entries "Approved" and "Rejected"
In the data entry sheet:
- Select the entire column for a categorical variable (e.g., the column containing education data)
- Data → Data Validation → Settings tab
- Allow: List
- Source: Navigate to the hidden validation sheet and select the appropriate list
- Error Alert tab: Style = Stop, with a clear message: "Only 'Graduate' or 'Not Graduate' are valid entries"
Now it is impossible to enter invalid values. Users see a dropdown with valid choices, eliminating typographical errors entirely.
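Validation only blocks future bad entries, so data already in the sheet still needs an audit. A helper column like this flags existing values that are not in the approved list (a minimal sketch; the education data is assumed to sit in column C, with the valid list in Validation_Lists!$A$2:$A$3):
=IF(COUNTIF(Validation_Lists!$A$2:$A$3,C2)=0,"INVALID","OK")
Filter the helper column for "INVALID" to catch every legacy typo such as "gradute" before training.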
For numeric variables with known ranges, apply similar validation to prevent impossible values (an audit formula for rows entered before validation follows the settings below):
- Credit scores: Must be between 300 and 900
- Loan term: Must be between 1 and 30 years
- Annual income: Must be greater than 0
Select the column, apply data validation, and set:
- Allow: Whole Number (or Decimal)
- Data: between
- Minimum: 300 (for credit scores)
- Maximum: 900
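The same audit idea works for numeric ranges already in the sheet (a sketch; the credit score column E and the 300-900 bounds are illustrative):
=IF(OR(NOT(ISNUMBER(E2)),E2<300,E2>900),"OUT_OF_RANGE","OK")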
# Final Thoughts
Here is a summary of the lessons outlined in this article.

5 Excel AI Lessons I Learned the Hard Way
The five practices in this article (multi-method outlier detection, setting random seeds, three-way data partitioning, monitoring the training/validation gap, and enforcing data validation) share a common thread: they are all simple to implement but devastating to omit.
None of these practices requires advanced statistical knowledge or complex programming. They do not demand additional software or expensive tools. Excel with XLMiner is a powerful tool for accessible machine learning.
Rachel Kuznetsov has a Master's in Business Analytics and thrives on tackling complex data puzzles and searching for fresh challenges to take on. She is committed to making intricate data science concepts easier to understand and is exploring the various ways AI makes an impact on our lives. On her continuous quest to learn and grow, she documents her journey so others can learn alongside her. You can find her on LinkedIn.
