The procedures that separate Shopify Plus brands that scale cleanly from those that drown in operational chaos.
Below $5M ARR, you can run a Shopify brand mostly through intuition and Slack. Above $5M, intuition stops scaling. Process becomes the multiplier, or the bottleneck.
I've worked with dozens of Shopify Plus brands between $5M and $50M. The ones that scale cleanly all have a similar core set of SOPs. The ones that scale chaotically tend to be missing the same handful. This post is that core set.
Fourteen SOPs across four functions. Each one specific enough to be useful, not so granular that you'll never write it. If you only have time to build five, the prioritization section at the end tells you which five.
At $1M ARR, you have 3-5 people. Everyone knows everything. Process lives in your head. SOPs feel premature.
At $5M, you have 8-15 people. Some of them are contractors. Some joined last month. The CEO can't be in every customer interaction. The marketing manager can't review every email. Suddenly the question is 'how do we usually do this?' and the answer is 'depends who you ask.'
At $10M, you have 25+ people and the chaos has compounded. New hires take 3 months to ramp instead of 3 weeks. Customers notice inconsistency. Investors ask hard questions about ops maturity. The brand stalls.
$5M is the last point where you can install SOPs cleanly. After that you're retrofitting against habit.
The refund policy SOP defines who can issue refunds, for what reasons, up to what amount, and with what documentation. Without it, every agent improvises, CSAT becomes inconsistent, and customers compare notes on Reddit.
What good looks like: one-touch refund limit ($200 standard), required documentation per refund reason, escalation path for edge cases, monthly audit cadence.
Approval tiers, documentation requirements, edge case handling.
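A refund policy like this is small enough to sketch in a few lines of Python. The $200 one-touch limit comes from the SOP above; the refund reasons and the documents required for each are hypothetical placeholders, so treat this as an illustration of the shape of the rule rather than a finished policy.

```python
from dataclasses import dataclass

ONE_TOUCH_LIMIT = 200.00  # from the SOP: agents can refund up to $200 on their own

# Hypothetical mapping of refund reason -> documentation the agent must attach.
REQUIRED_DOCS = {
    "damaged": {"photo"},
    "not_received": {"tracking_check"},
    "changed_mind": set(),
}


@dataclass
class RefundRequest:
    amount: float
    reason: str
    docs: frozenset = frozenset()


def route_refund(req: RefundRequest) -> str:
    """Return 'agent', 'needs_docs', or 'escalate' for a refund request."""
    if req.reason not in REQUIRED_DOCS:
        return "escalate"  # unknown reason: treat as an edge case
    if REQUIRED_DOCS[req.reason] - set(req.docs):
        return "needs_docs"  # documentation is missing for this reason
    if req.amount <= ONE_TOUCH_LIMIT:
        return "agent"  # within the one-touch limit
    return "escalate"  # over the limit: follow the escalation path
```

Logging every `route_refund` result alongside the ticket gives the monthly audit something concrete to sample.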
The escalation SOP defines what triggers escalation from tier 1 to tier 2 to manager. Without it, complaints sit in the wrong queue and amplify.
What good looks like: keyword triggers (lawyer, attorney, BBB, social media), VIP customer flag routing, response SLA per tier (tier 1: 4 hours, tier 3: 30 minutes).
Tier definitions, response SLAs, social media protocol.
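The keyword and VIP triggers can be sketched as a simple router. The keywords and the two SLAs are taken from the list above; sending every trigger straight to tier 3 (assumed here to be the manager tier) is an assumption, and a real helpdesk would configure this in its own rules engine rather than in code.

```python
import re

# Keywords from the SOP; matching is case-insensitive and on whole words.
TRIGGER_KEYWORDS = ("lawyer", "attorney", "bbb", "social media")
_TRIGGER_RE = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in TRIGGER_KEYWORDS) + r")\b",
    re.IGNORECASE,
)

# Response SLAs in hours, as stated in the SOP (tier 2 is not specified there).
SLA_HOURS = {"tier_1": 4.0, "tier_3": 0.5}


def route_ticket(body: str, is_vip: bool = False) -> str:
    """Return the tier a new ticket should land in."""
    # Assumption: any trigger keyword or VIP flag skips straight to tier 3.
    if is_vip or _TRIGGER_RE.search(body):
        return "tier_3"
    return "tier_1"
```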
The macro governance SOP defines who creates, approves, edits, and retires macros. Without it, your Gorgias account becomes a graveyard of stale macros that agents stop using.
What good looks like: macro approval workflow (manager signs off on new macros), quarterly macro audit (delete unused), naming convention (intent-action-variant), variable usage discipline.
Macro setup, intent training, escalation rules, chatbot configuration.
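A naming convention only holds if something checks it. As a rough sketch, the intent-action-variant pattern could be validated like this. The exact character rules (lowercase letters, digits, and underscores inside each segment, hyphens between segments) are an assumption, since the SOP only names the three parts.

```python
import re

# Assumption: each segment is lowercase letters, digits, or underscores.
_SEGMENT = re.compile(r"[a-z0-9_]+")


def is_valid_macro_name(name: str) -> bool:
    """Check that a macro name has exactly three hyphen-separated segments."""
    parts = name.split("-")
    return len(parts) == 3 and all(_SEGMENT.fullmatch(p) for p in parts)
```

Running a check like this over an export of macro names during the quarterly audit surfaces the ones that slipped past approval.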
The BFCM (Black Friday/Cyber Monday) prep SOP defines the playbook for Q4 peak. Without it, BFCM is improvised and CSAT tanks.
What good looks like: ticket forecast (3-8x normal volume), surge hiring by October 15, macro refresh by November 1, escalation team on standby, daily monitoring dashboard.
Pre-launch prep, surge staffing, macros, escalation, post-event review.
The inventory reconciliation SOP defines how physical inventory at the 3PL is reconciled against the quantities Shopify displays. Without it, oversells happen and customer trust erodes.
What good looks like: monthly full reconciliation, weekly cycle counts on top SKUs, variance threshold (under 1 percent OK, over 3 percent escalate), variance categorization (damage vs theft vs sync error).
Cadence, variance handling, audit trail.
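The variance thresholds translate into a short calculation. The 1 percent and 3 percent cutoffs are from the SOP above; the SOP does not say what happens between them, so the "investigate" band in this sketch is an assumption, as is measuring variance relative to the Shopify quantity.

```python
def variance_pct(shopify_qty: int, counted_qty: int) -> float:
    """Absolute variance as a percentage of the Shopify quantity."""
    if shopify_qty == 0:
        # Assumption: any stock found against a zero record counts as 100%.
        return 0.0 if counted_qty == 0 else 100.0
    return abs(counted_qty - shopify_qty) / shopify_qty * 100


def classify_variance(shopify_qty: int, counted_qty: int) -> str:
    """Return 'ok', 'investigate', or 'escalate' for one SKU."""
    pct = variance_pct(shopify_qty, counted_qty)
    if pct < 1:
        return "ok"  # under 1 percent is acceptable per the SOP
    if pct > 3:
        return "escalate"  # over 3 percent escalates per the SOP
    return "investigate"  # the 1-3 percent band is not defined in the SOP
```

For example, a SKU showing 1,000 units in Shopify with 980 counted at the 3PL has a 2 percent variance and lands in the middle band.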
The fraud review SOP defines what signals trigger fraud review and how chargebacks are disputed. Without it, you lose either margin (loose rules) or real customers (tight rules).
What good looks like: a medium-risk review queue (CVV mismatch, address inconsistency), velocity rules (max orders per email per 24 hours), Shopify 3DS (3D Secure) for high-risk countries, chargeback disputes filed within the 7-day window.
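A velocity rule is a sliding-window count. Here is a minimal in-memory sketch, assuming orders are checked in chronological order; the limit of 3 orders per email is a hypothetical default, since the SOP leaves the number to you, and the class is illustrative rather than anything Shopify provides.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class VelocityRule:
    """Flag an email that places too many orders inside a rolling window."""

    def __init__(self, max_orders: int = 3, window: timedelta = timedelta(hours=24)):
        self.max_orders = max_orders  # hypothetical default, tune per brand
        self.window = window
        self._seen = defaultdict(deque)  # email -> timestamps of recent orders

    def should_review(self, email: str, placed_at: datetime) -> bool:
        """Record the order and return True if it exceeds the limit."""
        recent = self._seen[email.strip().lower()]
        cutoff = placed_at - self.window
        # Drop orders that have aged out of the window.
        while recent and recent[0] <= cutoff:
            recent.popleft()
        recent.append(placed_at)
        return len(recent) > self.max_orders
```

Orders that return `True` belong in the medium-risk review queue instead of being auto-cancelled, which keeps the rule from costing you real customers.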
The 3PL integration SOP defines how orders flow from Shopify to the 3PL, how exceptions are handled, and how returns are received. Without it, problems compound across departments.
What good looks like: order routing rules (zone-based), inventory sync (real-time or 15-min max), exception alerts via Slack, monthly performance review with 3PL.
Order routing, inventory sync, returns receiving.
The DOA (dead on arrival) claims SOP defines how DOA claims are verified, refunded, and tracked. Without it, agents either fall for fraud or stiff real customers.
What good looks like: photo evidence required (product + packaging), default to replacement (not refund), carrier dispute within 7 days for shipping damage, fraud flag for customers with 3+ DOA claims in 6 months.
Photo verification, replacement vs refund, carrier disputes.
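The repeat-claim flag is easy to automate once claims are tracked per customer. A sketch, assuming each customer's claim dates are available as a list and approximating "6 months" as 183 days:

```python
from datetime import date, timedelta

LOOKBACK = timedelta(days=183)  # assumption: "6 months" approximated as 183 days


def flag_for_fraud_review(claim_dates: list, today: date, threshold: int = 3) -> bool:
    """Return True if the customer has `threshold` or more DOA claims in the lookback."""
    recent = [d for d in claim_dates if timedelta(0) <= today - d <= LOOKBACK]
    return len(recent) >= threshold
```

A flag here means a human looks at the claim history before approving the next replacement; it is not an automatic denial.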
The Klaviyo flow audit SOP defines how often each flow is reviewed and which audit checks are run. Without it, flows go stale and revenue per recipient drops invisibly.
What good looks like: quarterly audit of every active flow, trigger config check, exclusion segment verification, attribution check against GA4 or Triple Whale, deliverability monitoring.
Abandoned cart, welcome series, segmentation, audit cadence.
The campaign launch SOP defines pre-launch verification for any marketing campaign. Without it, broken UTMs and untested links leak into production.
What good looks like: copy and creative approved by stakeholder, UTMs configured on every link, audience segment verified, send-time confirmed, post-launch monitoring scheduled for first 4 hours.
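The "UTMs configured on every link" check is one of the few items on this list a script can do for you. A minimal sketch using only the Python standard library; the three required parameters are a common baseline and an assumption here, so adjust the tuple to match your own tracking plan.

```python
from urllib.parse import parse_qs, urlparse

# Assumption: these three parameters are the minimum your tracking plan requires.
REQUIRED_UTMS = ("utm_source", "utm_medium", "utm_campaign")


def missing_utms(url: str) -> list:
    """Return the required UTM parameters that are absent or empty in the URL."""
    params = parse_qs(urlparse(url).query)  # blank values are dropped by default
    return [key for key in REQUIRED_UTMS if not params.get(key)]


def audit_links(urls: list) -> dict:
    """Map each failing URL to the list of UTM parameters it is missing."""
    return {u: m for u in urls if (m := missing_utms(u))}
```

Run `audit_links` over every link in the campaign before approval; an empty result means every link passed.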
The creator payment SOP defines how creators are paid, on what cadence, and with what verification. Without it, late payments damage relationships and creators blacklist your brand.
What good looks like: net 15 or net 30 cadence, verification that the creator's post is complete before payment, tax forms collected at onboarding, payment method documented per creator, batch processing weekly or biweekly.
Verification, tracking, payment, tax compliance.
The month-end close SOP defines how the books are closed each month. Without it, financial reporting drifts and you can't make data-driven decisions.
What good looks like: bank reconciliations by business day 3, accruals posted by day 4, statements drafted by day 5, CFO review by day 7. Material adjustments require CFO sign-off.
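To turn the close calendar into actual dates, you need the Nth business day of the month following the one being closed. A sketch, assuming all four milestones are counted in business days and ignoring public holidays (both are assumptions, since the SOP only says "business day 3" explicitly):

```python
from datetime import date, timedelta

# Milestones from the SOP, keyed by business-day number.
CLOSE_MILESTONES = {
    3: "bank reconciliations",
    4: "accruals posted",
    5: "statements drafted",
    7: "CFO review",
}


def nth_business_day(year: int, month: int, n: int) -> date:
    """Return the nth weekday (Mon-Fri) of the month; holidays are not considered."""
    day = date(year, month, 1)
    count = 0
    while True:
        if day.weekday() < 5:  # Monday=0 ... Friday=4
            count += 1
            if count == n:
                return day
        day += timedelta(days=1)


def close_calendar(year: int, month: int) -> dict:
    """Map each milestone to its due date within the given month."""
    return {task: nth_business_day(year, month, n) for n, task in CLOSE_MILESTONES.items()}
```

Calling `close_calendar(2026, 6)` gives the due dates for closing May's books in June 2026.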
The onboarding SOP defines the first 30 days for every new hire. Without it, every onboarding is improvised and ramp time is unpredictable.
What good looks like: pre-day-1 welcome packet, day 1 orientation, week 1 product immersion, weeks 2-3 shadowed work, week 4 independence, day 30 ramp evaluation.
The quarterly business review SOP defines how the leadership team reviews the business each quarter. Without it, strategy reviews are ad hoc and tactical decisions crowd out strategic ones.
What good looks like: pre-read distributed 1 week ahead, fixed format (revenue, ops, marketing, retention, headcount), action items captured with owners, follow-up cadence between quarters.
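Realistically, you won't build 14 SOPs in a quarter. The five highest-leverage ones to start with are refund policy, inventory reconciliation, Klaviyo flow audit, new-hire onboarding, and BFCM prep.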
Build these five over the next quarter, one every two weeks. By the end of the quarter you'll have working documentation for the highest-leverage operations in your brand.
The other 9 can come over the following 2 quarters as the team grows and you start hitting the limits of the first five.
If you want to build these SOPs faster, ReccordSOP lets you record any workflow and AI generates the structured SOP. Drift detection flags when reality diverges from documentation. Free tier covers 3 SOPs per month.
DTC SOP templates organized by tool and procedure.
No, you don't need all 14 at once. Start with 5 (refund policy, inventory reconciliation, Klaviyo audit, onboarding, BFCM prep). Add the rest as the team grows and you find specific gaps.
Writing an SOP takes 2-4 hours for the first draft. The bigger investment is keeping it current through quarterly review. Plan an hour per SOP per quarter for maintenance.
Below 20 SOPs, Notion works. Above 20, dedicated tooling pays for itself in maintenance time and drift detection. ReccordSOP is built for this case.
A runbook is for incident response (something is broken, what do we do). An SOP is for routine operations (we do this every week, how exactly). Both matter, but they belong in separate documents.
I built ReccordSOP after watching too many DTC ops teams lose months to undocumented workflows. These SOPs are battle-tested with Shopify operators running $1M to $50M brands.
Last reviewed June 2, 2026