Khorikov Unit Testing Guide
A practical guide to Vladimir Khorikov’s unit testing principles, extracted from Unit Testing: Principles, Practices, and Patterns (2020). Go-focused examples throughout.
I recently read Khorikov’s book and found myself returning to the same concepts repeatedly. This is my distillation—the ideas I actually use when writing and reviewing tests.
1. The Goal: Sustainable Growth
Unit testing exists to enable sustainable growth of the software project. Not coverage. Not checking boxes. Not proving you did work.
The challenge: as projects grow, development slows. New features take longer. Bugs multiply. Code becomes fragile. Tests should prevent this decay by:
- Catching regressions before they reach production
- Enabling confident refactoring
- Serving as living documentation
The trap: Teams chase coverage metrics, accumulating tests that don’t actually help. A project with 90% coverage can still have:
- Tests that break on every refactor (false positives)
- Tests that miss real bugs (false negatives)
- Tests that take too long to run
- Tests no one can understand
The goal is sustainable growth, not test quantity.
2. Four Pillars of a Good Test
Every test can be evaluated on four dimensions:
| Pillar | Description |
|---|---|
| Protection against regressions | Catches bugs when code changes |
| Resistance to refactoring | No false positives when internals change |
| Fast feedback | Quick execution |
| Maintainability | Easy to read and change |
You cannot maximize all four—there are inherent trade-offs. But resistance to refactoring is non-negotiable.
Why Resistance to Refactoring Matters Most
False positives devastate teams. Khorikov tells the “wolf crying” story:
A test suite with many false positives conditions developers to ignore test failures. “Oh, that test always fails after changes—just re-run it.” Soon, real bugs slip through because failures are dismissed. The test suite becomes useless, then abandoned.
Root cause of false positives: Tests coupled to implementation details. When tests verify how code works rather than what it accomplishes, any internal change triggers failures—even when behavior is correct.
The solution: Test observable behavior only.
3. Observable Behavior vs Implementation Details
This is the core distinction. Get this right and your tests will be valuable. Get it wrong and they’ll be a burden.
Observable behavior = operations OR state that help clients achieve their goals.
Everything else = implementation details.
Unit of Behavior, Not Unit of Code
A common mistake: treating “unit test” as “test one class.” This couples tests to class structure, which is an implementation detail.
“The number of classes it takes to implement such a unit of behavior is irrelevant.”
Khorikov uses the dog analogy:
Correct: “When I call my dog, he comes to me.”
Wrong: “When I call my dog, his front left leg moves first, then his right leg, then…”
The dog’s leg movement is an implementation detail. The observable behavior is: dog comes when called.
Well-Designed API Automatically Improves Tests
The connection between good API design and good tests is direct:
- Hide implementation details → Tests can only verify observable behavior (because that’s all they can see)
- Expose minimum public API → Fewer things to test, less coupling
- Encapsulate invariants → Tests verify correct behavior, not internal state
“Making the API well-designed automatically improves unit tests.”
If your tests need to access private fields or mock internal collaborators, that’s a sign your API is leaking implementation details.
Encapsulation Connection
Encapsulation isn’t just about hiding data—it’s about protecting against invariant violations.
When you leak implementation details:
- Clients can violate invariants → bugs
- Tests couple to internals → false positives
Tell-don’t-ask: Bundle data with the functions that operate on it. Don’t ask an object for its state, do calculations, then tell it what its new state should be. Tell the object what you want and let it manage its own state.
Go Example: Testing Result vs Algorithm
// BAD: Testing the algorithm (implementation detail)
func TestCalculateDiscount_UsesCorrectFormula(t *testing.T) {
// This test verifies HOW the discount is calculated
customer := Customer{TotalPurchases: 1000}
// Checking intermediate steps or formula application
rate := customer.discountRate() // Exposes internal method
assert.Equal(t, 0.1, rate) // Testing implementation
expected := 100.0 * rate // Replicating the algorithm
actual := customer.CalculateDiscount(100.0)
assert.Equal(t, expected, actual)
}
// GOOD: Testing the result (observable behavior)
func TestCalculateDiscount_GoldCustomer(t *testing.T) {
// This test verifies WHAT the discount is
customer := Customer{TotalPurchases: 1000} // Gold tier
discount := customer.CalculateDiscount(100.0)
// We don't care HOW it's calculated, just that it's correct
assert.Equal(t, 10.0, discount) // Gold customers get 10% off
}
The bad test will break if you change the formula implementation, even if the result stays correct. The good test only breaks if actual behavior changes.
4. Three Styles of Unit Testing
| Style | Description | Example |
|---|---|---|
| Output-based | Feed input, check return value | result := Calculate(input) |
| State-based | Check state after operation | obj.DoSomething(); assert(obj.Field) |
| Communication-based | Verify interactions with collaborators | mock.AssertCalled("Method") |
Output-based produces the highest quality tests because:
- No coupling to internal state (state-based)
- No coupling to collaborator interactions (communication-based)
- Pure function in, result out
Go Example: Prefer Output-Based
// OUTPUT-BASED (best): Pure function, check return value
func TestCalculateTotal(t *testing.T) {
order := Order{
Items: []Item{
{Product: "Book", Price: 10.00, Quantity: 2},
{Product: "Pen", Price: 1.50, Quantity: 5},
},
}
total := CalculateTotal(order)
assert.Equal(t, 27.50, total)
}
// STATE-BASED (acceptable): Check object state after operation
func TestCart_AddItem(t *testing.T) {
cart := Cart{}
item := Item{Product: "Book", Price: 10.00}
cart = cart.Add(item) // Value semantics: returns new cart
assert.Equal(t, 1, cart.ItemCount())
}
// COMMUNICATION-BASED (avoid): Verify mock interactions
func TestOrderService_CallsRepository(t *testing.T) {
mockRepo := &MockOrderRepository{}
service := NewOrderService(mockRepo)
service.PlaceOrder(order)
mockRepo.AssertCalled(t, "Save", order) // Testing interaction
}
Design your code to enable output-based testing. Push side effects to the boundaries.
5. The Four Quadrants (What to Test)
Few Collaborators Many Collaborators
───────────────── ──────────────────
High Complexity Domain/Algorithms Overcomplicated
or Domain Sig → UNIT TEST → REFACTOR
Low Complexity Trivial Controllers
→ DON'T TEST → INTEGRATION TEST
Key insight: “Complexity OR domain significance”—only one is required to warrant unit testing.
Go Examples for Each Quadrant
// DOMAIN/ALGO (unit test): High domain significance, few collaborators
// Business logic that decides pricing rules
func (u User) CalculateDiscount(purchaseAmount float64) float64 {
if u.Type == CustomerTypeGold {
return purchaseAmount * 0.10
}
if u.Type == CustomerTypeSilver {
return purchaseAmount * 0.05
}
return 0
}
// Test it thoroughly:
func TestUser_CalculateDiscount(t *testing.T) {
tests := []struct {
name string
userType CustomerType
amount float64
want float64
}{
{"gold customer", CustomerTypeGold, 100.0, 10.0},
{"silver customer", CustomerTypeSilver, 100.0, 5.0},
{"regular customer", CustomerTypeRegular, 100.0, 0.0},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
u := User{Type: tt.userType}
assert.Equal(t, tt.want, u.CalculateDiscount(tt.amount))
})
}
}
// TRIVIAL (don't test): Low complexity, low significance
// Simple getter—no logic, no business rules
func (u User) Email() string { return u.email }
// Don't waste time testing this
// CONTROLLER (integration test): Low complexity, many collaborators
// Orchestrates other components—no complex logic itself
func (c UserController) ChangeEmail(userID int, newEmail string) error {
user, err := c.userRepo.GetByID(userID) // collaborator
if err != nil {
return err
}
company, err := c.companyRepo.Get() // collaborator
if err != nil {
return err
}
user = user.ChangeEmail(newEmail, company) // domain logic (value semantics)
return c.userRepo.Save(user) // collaborator
}
// Integration test this—verify it correctly wires components together
// OVERCOMPLICATED (refactor!): High complexity + many collaborators
// This function is trying to do too much
func ProcessOrder(orderID string, db Database, paymentAPI PaymentGateway,
emailService EmailGateway, inventory InventoryService) error {
// 50 lines of validation logic
// payment processing rules
// inventory checks
// email notifications
// database updates
}
// Split into: domain logic (deep) + controller (wide)
6. Mocks: When OK vs Design Smell
Classical vs London Schools
| School | Approach | Unit Definition |
|---|---|---|
| London (mockist) | Mock all collaborators | Single class |
| Classical | Use real collaborators | Unit of behavior |
Khorikov sides with classical. The London school couples tests to class structure, which is an implementation detail.
Two Types of Communication
| Type | Definition | Mock? |
|---|---|---|
| Intra-system | Between classes inside your application | No—implementation detail |
| Inter-system | Between your application and external systems | Yes—observable behavior |
When Mocks Are OK: External Boundaries
Mock only inter-system communications—interactions with the external world:
- Email gateway
- Payment processor
- External APIs
- Message queues
These are observable behavior of your system as a whole. The outside world can see that you sent an email.
When Mocks Are a Design Smell: Internal Collaborators
If you need mocks for intra-system communications, that’s a code smell:
“Instead of finding ways to test a large, complicated graph of interconnected classes, you should focus on not having such a graph in the first place.”
“The use of mocks only hides this problem; it doesn’t tackle the root cause.”
The fix is to redesign, not to reach for mocks.
Go Example
// OK: Mocking external email service
// This is inter-system communication—observable to outside world
func TestOrderProcessor_SendsConfirmationEmail(t *testing.T) {
mockGateway := &SpyEmailGateway{}
processor := OrderProcessor{EmailGateway: mockGateway}
processor.CompleteOrder(order)
assert.True(t, mockGateway.SendCalled)
assert.Equal(t, order.CustomerEmail, mockGateway.LastRecipient)
}
// SMELL: Mocking internal repository
// If you need this, your domain logic is too coupled to infrastructure
func TestOrderService_MocksRepository(t *testing.T) {
mockRepo := &MockOrderRepository{} // ← Design smell
mockRepo.On("FindByID", "123").Return(order)
service := OrderService{Repo: mockRepo}
result := service.GetOrderTotal("123")
mockRepo.AssertCalled(t, "FindByID", "123")
}
// BETTER: Refactor so domain logic doesn't need repository
func TestCalculateOrderTotal(t *testing.T) {
order := Order{Items: []Item{{Price: 10}, {Price: 20}}}
total := order.CalculateTotal() // Pure function, value receiver
assert.Equal(t, 30.0, total)
}
7. Architecture for Testability
Hexagonal Architecture
┌─────────────────────────────────────────┐
│ Application Service │ ← Integration test
│ (Controller/Orchestrator) │
├─────────────────────────────────────────┤
│ Domain Layer │ ← Unit test
│ (Business logic, algorithms) │
├─────────────────────────────────────────┤
│ Infrastructure │
│ (DB, APIs, file system, etc.) │
└─────────────────────────────────────────┘
- Domain layer: Pure business logic, no external dependencies → unit test
- Application services: Orchestration, talks to external world → integration test
- Only mock at the boundary between your system and external systems
Humble Object Pattern
When code has both high complexity AND many collaborators (the “overcomplicated” quadrant), split it:
- Deep code: Complex logic, few/no collaborators → unit test
- Wide code: Simple orchestration, many collaborators → integration test
“Code depth vs code width—never both.”
Go Example
// DEEP: User.ChangeEmail - domain logic, few collaborators
// Complex business rules, but no external dependencies
// Uses value semantics: returns modified copy
func (u User) ChangeEmail(newEmail string, company Company) User {
if company.IsEmailCorporate(newEmail) {
u.Type = CustomerTypeEmployee
} else if u.Type == CustomerTypeEmployee {
u.Type = CustomerTypeRegular
}
u.Email = newEmail
return u
}
// Unit test this exhaustively—it's your domain logic
// WIDE: UserController.ChangeEmail - orchestration, many collaborators
// Simple coordination, talks to database and message bus
func (c UserController) ChangeEmail(userID int, newEmail string) error {
user, _ := c.userRepo.GetByID(userID) // collaborator: database
company, _ := c.companyRepo.Get() // collaborator: database
user = user.ChangeEmail(newEmail, company) // domain logic (tested separately)
c.userRepo.Save(user) // collaborator: database
c.messageBus.Send("email_changed", user.ID) // collaborator: message bus
return nil
}
// Integration test this—one happy path + critical error cases
8. Fractal Nature of Tests
Tests at different layers verify the same behavior at different granularities.
- Domain class test: verifies a subgoal
- Application service test: verifies the full use case
- Both trace back to business requirements
Key insight: If you can’t trace a test to a business requirement, it’s probably testing implementation details.
Go Example
// Domain layer test (subgoal): "corporate emails are identified"
func TestCompany_IsEmailCorporate(t *testing.T) {
company := Company{Domain: "mycorp.com"}
assert.True(t, company.IsEmailCorporate("user@mycorp.com"))
assert.False(t, company.IsEmailCorporate("user@gmail.com"))
}
// Domain layer test (subgoal): "user type changes based on email"
func TestUser_ChangeEmail_UpdatesType(t *testing.T) {
user := User{Type: CustomerTypeRegular}
company := Company{Domain: "mycorp.com"}
user = user.ChangeEmail("new@mycorp.com", company) // Value semantics
assert.Equal(t, CustomerTypeEmployee, user.Type)
}
// Application layer test (full use case): "user email is changed"
func TestUserController_ChangeEmail(t *testing.T) {
// Setup: real database with test data
db := createTestDatabase(t)
db.InsertUser(User{ID: 1, Email: "old@gmail.com", Type: CustomerTypeRegular})
db.InsertCompany(Company{Domain: "mycorp.com"})
controller := UserController{UserRepo: db, CompanyRepo: db}
// Act
err := controller.ChangeEmail(1, "new@mycorp.com")
// Assert: user updated correctly
require.NoError(t, err)
user := db.GetUserByID(1)
assert.Equal(t, "new@mycorp.com", user.Email)
assert.Equal(t, CustomerTypeEmployee, user.Type)
}
All three tests verify aspects of the same business requirement: “change user email.” They test it at different granularities.
9. The Two-Phase Workflow
Development (TDD)
During development, tests guide design:
- Red: Write a failing test
- Green: Make it pass (minimal code)
- Refactor: Clean up
At this stage, write tests for everything. Tests help you think through the design. Some of these tests will be scaffolding.
Pre-PR (Khorikov Rebalancing)
Before merging, reassess your test portfolio with Khorikov’s lens:
| Quadrant | TDD Produced | Khorikov Action |
|---|---|---|
| Trivial | Unit tests | Delete |
| Domain/Algo | Unit tests | Keep |
| Controllers | Unit tests | Replace with integration tests |
This is not a subset—it’s a rebalancing.
Khorikov may also reveal tests to add that TDD didn’t produce:
- Integration tests for controller happy paths
- Edge cases only visible at system boundaries
Why Two Phases?
- TDD targets local development: Fast feedback, design exploration
- Khorikov targets CI: Tests that provide value over time, don’t break on refactors
A test valuable during development (helped you think) may be harmful in CI (breaks on every refactor, tests implementation details).
10. Coverage Metrics
Coverage is a negative indicator, not a positive one.
“Low coverage numbers—say, below 60%—are a certain sign of trouble. They mean there’s a lot of untested code in your code base. But high numbers don’t mean anything.”
Never make coverage a target. When coverage is a goal:
- Teams write tests for trivial code
- Tests verify implementation details (easy coverage)
- Valuable edge cases are ignored (hard to cover)
High coverage with low-quality tests is worse than moderate coverage with good tests.
11. Integration Tests for Controllers
Controllers (application services) should be tested with integration tests:
- One happy path: Verify components wire together correctly
- Edge cases unit tests can’t cover: Boundary conditions, error paths
Don’t over-test controllers. They should be “humble”—minimal logic, just coordination. If a controller needs extensive testing, it probably has too much logic (move it to domain layer).
Go Example
func TestUserController_ChangeEmail_Integration(t *testing.T) {
// Happy path: email changes correctly
t.Run("changes email from non-corporate to corporate", func(t *testing.T) {
db := createTestDatabase(t)
messageBus := &SpyMessageBus{}
db.InsertUser(User{ID: 1, Email: "user@gmail.com", Type: CustomerTypeRegular})
db.InsertCompany(Company{Domain: "mycorp.com"})
controller := UserController{UserRepo: db, CompanyRepo: db, MessageBus: messageBus}
err := controller.ChangeEmail(1, "user@mycorp.com")
require.NoError(t, err)
user := db.GetUserByID(1)
assert.Equal(t, "user@mycorp.com", user.Email)
assert.Equal(t, CustomerTypeEmployee, user.Type)
assert.True(t, messageBus.SentEmailChanged)
})
// Edge case: user doesn't exist
t.Run("returns error for unknown user", func(t *testing.T) {
db := createTestDatabase(t)
controller := UserController{UserRepo: db, CompanyRepo: db}
err := controller.ChangeEmail(999, "new@example.com")
assert.Error(t, err)
})
}
12. Quick Reference
A condensed version for team wikis, PR templates, or coding guidelines.
Testing: Two-Phase Workflow
Development (TDD): Write tests first—red/green/refactor. Tests guide design.
Pre-PR (Khorikov): Rebalance test portfolio. Not a subset—a reassessment.
| Quadrant | TDD Produced | Khorikov Action |
|---|---|---|
| Trivial | Unit tests | Delete |
| Domain/Algo | Unit tests | Keep |
| Controllers | Unit tests | Replace with integration |
Four Pillars of a Good Test
| Pillar | Priority |
|---|---|
| Protection against regressions | Required |
| Resistance to refactoring | MOST IMPORTANT |
| Fast feedback | Required |
| Maintainability | Required |
What to Test (Four Quadrants)
Few Collaborators Many Collaborators
───────────────── ──────────────────
High Complexity Domain/Algorithms Overcomplicated
or Domain Sig → UNIT TEST → REFACTOR
Low Complexity Trivial Controllers
→ DON'T TEST → INTEGRATION TEST
Mocks
| Communication | Mock? |
|---|---|
| Inter-system (external APIs, email, payment) | Yes—observable behavior |
| Intra-system (between your classes) | No—fix design instead |
Needing internal mocks = code architecture problem
Observable Behavior Only
Test what code does, not how it does it:
- Observable = operations OR state that help clients achieve goals
- Test the result, not the algorithm
- If you can’t trace a test to a business requirement, delete it
Coverage
- Below 60%: Certain sign of trouble
- High numbers: Mean nothing about quality
- Never make coverage a target
Based on Khorikov, Vladimir. “Unit Testing: Principles, Practices, and Patterns.” Manning, 2020. ISBN 978-1617296277.