Updates & Patching

Purpose

Updates and patches fix vulnerabilities and bugs. Not patching exposes you to attacks. Patching carelessly breaks systems. This guide explains how to balance the two.

Context & Assumptions

Who this is for:

IT administrators managing multiple systems
Business owners evaluating patch urgency
Operations managers scheduling downtime

Key assumptions:

You have a technology stack requiring updates (operating systems, applications, libraries)
You understand that all patches carry some risk of breaking things
You have a change management process in place

The Three Types of Updates

1. Security Patches (Critical)

Fix known security vulnerabilities
Examples: zero-day exploits, privilege escalation flaws, data leaks
Urgency: Deploy within days, not weeks

Decision rule: If a vendor labels an update "critical" or "security," treat it as urgent unless you have specific reason not to (e.g., known compatibility issue with critical system).

2. Stability Patches (High)

Fix bugs that impact availability or performance
Examples: memory leaks, database corruption, service crashes
Urgency: Deploy within weeks

Decision rule: Monitor vendor release notes. If a patch fixes something that affects your business (e.g., your e-commerce platform crashes), prioritize it.

3. Feature Updates (Medium)

Add new capabilities or cosmetic improvements
Examples: UI redesigns, new reporting options, performance improvements
Urgency: Deploy on a regular schedule, not in emergency mode

Decision rule: These can wait for a planned maintenance window. Test thoroughly before deploying to production.

Safe Patching: The Three-Environment Model

Do not patch production directly. Use this model:

Development → Staging → Production

Environment 1: Development

Your testing environment
Low cost, high risk acceptable
Deploy patches here first
Run your standard tests

Environment 2: Staging

Mirror of production (or close to it)
Medium cost, low risk acceptable
Deploy patches after development passes
Run full production-like tests with real-world data volume
This is where you catch problems before production sees them

Environment 3: Production

Your live business systems
High cost, zero risk acceptable
Deploy patches only after staging validation
Monitor closely for issues

For very small businesses with limited infrastructure: At minimum, have a recent backup before patching production. This allows you to roll back if something breaks.

Patch Deployment Strategy

Security Patches

Step	Timeframe	Action
1. Release announced	Day 1	Subscribe to vendor security bulletins. Assess impact to your systems.
2. Deploy to development	Day 1–2	Test on non-critical systems. Look for compatibility issues.
3. Deploy to staging	Day 2–3	Run full test suite. Verify your critical workflows.
4. Deploy to production	Day 3–7	Plan maintenance window if needed. Deploy and monitor closely.

For zero-day exploits (no known workaround): Faster timeline. Consider 24-hour deployment if possible.

Stability & Feature Patches

Step	Timeframe	Action
1. Release announced	Day 1	Review release notes. Is this relevant to your business?
2. Deploy to development	Week 1	Test. Look for regressions.
3. Deploy to staging	Week 2	Run production-like tests.
4. Deploy to production	Week 2–4	Schedule maintenance window. Deploy during low-traffic hours if possible.

Automation: When and How

Automating patches reduces workload but increases risk if something breaks unexpectedly.

Safe Automation Rules

Scenario	Automate?	Rationale
Security patches on non-critical systems	Yes	Reduces delay. Impact is low if something breaks.
Security patches on critical systems	No	Too risky. Manual control necessary.
Feature updates	No	Test first. These often introduce unexpected changes.
OS patches (servers)	No	Manual control. Often requires restart. Schedule downtime.
Application patches (SaaS)	Maybe	If vendor tests it for you and rollback is automatic, consider it. Otherwise, manual.

Recommendation: Automate security patches for non-critical systems (e.g., development boxes, non-revenue services). Control critical systems manually.

Rollback & Recovery

Before patching, know how to undo it:

Scenario	Rollback Method
Database update breaks queries	Restore from backup taken before patch
Application update crashes	Revert to previous version (usually one command if using package manager)
OS patch causes boot failure	Use recovery mode or restore from snapshot
SaaS platform has issue	Contact vendor; usually they maintain previous version for 24–48 hours

Action: For any critical system, test your rollback procedure before you need it.

Change Management & Communication

Document every patch:

What: Specific patch name and version
When: Deployment date/time
Why: Business justification
Tested: Which environments? What tests ran?
Approved by: Who authorized this?
Result: Any issues? Performance changes?

Communicate with stakeholders before patching critical systems:

"We will patch [system] on [date] at [time]. Expected downtime: [X] minutes."
Send reminder 24 hours before
Send summary after, including any issues encountered

Patch Schedule & Calendar

Create a regular maintenance window:

Example: Monthly Patch Tuesday

When: Second Tuesday of each month, 2 AM–4 AM (off-hours)
What: Deploy pending patches to non-critical systems
Who: Operations team
Process: 1 hour development, 1 hour staging, 0.5 hour production deployment, 0.5 hour monitoring

For emergencies: Maintain an out-of-cycle process. Example:

Critical security patch discovered → Assessment same day → Deployment within 24–48 hours

Monitoring During & After Patching

During Deployment

Monitor system dashboards (CPU, memory, disk, network)
Watch application logs for errors
Have rollback ready if needed
Communicate status to team

After Deployment (24–72 hours)

Check functionality: Do your core business processes still work?
Monitor performance: Any slowdowns or crashes?
Review errors: Any new error messages in logs?
Validate backups: Confirm backups completed successfully
Measure impact: If applicable, any changes in user experience or performance metrics?

If something is wrong, investigate. Don't assume it's unrelated.

Common Pitfalls

Patching without testing — Skipping staging/development "to save time" causes outages. Not worth it.
No rollback plan — Patching without knowing how to undo it is reckless.
Patching all systems at once — If something breaks, you can't isolate it. Stagger patches.
Ignoring release notes — Vendors often warn about known issues with specific versions. Read them.
No communication — Users find out about downtime when their system stops working, not from you.
Forgetting to update documentation — After patching, update runbooks, playbooks, and technical documentation.

Practical Example: 50-Person Business

Current systems:

Microsoft 365 (email, teams)
Shopify e-commerce
Custom database on AWS
Various SaaS tools

Patch strategy:

System	Type	Frequency	Window	Owner
Microsoft 365	Cloud (auto-patched)	Continuous	N/A	Microsoft
Shopify	Cloud (auto-patched)	Continuous	N/A	Shopify
AWS Database	On-premises (managed)	Monthly	2 AM Sunday	Operations Manager
SaaS tools	Cloud (auto-patched)	Continuous	N/A	Vendor

Process:

Subscribe to vendor security bulletins
Monthly patch window: Sunday 2–3 AM, 30 minutes to deploy, 30 minutes to verify
Test critical queries in staging before deployment
Roll back via AWS snapshot if needed (tested quarterly)

Result: Systematic approach, minimal downtime, clear responsibility