The Risks of Replatforming

  • While we intend to move most of the development team to NextGen, we rediscover that 100% of our current revenue comes from customers on our current platform. There’s a huge list of commitments, urgently needed improvements, compliance issues and current-quarter revenue drivers that can’t be dropped. Engineering time-slices everyone (inefficient and demotivating). Or we assign a much smaller-than-planned team to NextGen…
    Or we assign the “best” developers to NextGen, massively demotivating the developers “left behind” on revenue-producing LegacyTech… Or we hire an entirely new team to work on NextGen which not only demotivates everyone already on board, but requires them to teach the fresh hires everything we already know about our platform/processes/tools/ company/strategy/culture/market.
  • We don’t have a complete spec for the current system. It’s 10 years old, the original architects are long gone, we’ve grown and grafted lots of obscure capabilities, and portions have not been touched for years. No one person actually knows how the entire system works. Some modules are too scary or delicate to touch.
    But we nonetheless tell ourselves that reverse engineering requirements from the existing product should be easy. (“How hard could it be to just make the new product work just like the old one?”) So our project estimates are wildly optimistic. As we work, we keep unearthing surprising functionality and technical ugliness that pushes out a final delivery date.
  • We should eliminate assorted features that aren’t widely used, don’t work well, address outdated needs, or otherwise weigh down the current version. This is a strategic product decision, since it will inevitably p*ss off a handful of long-time customers even though it’s the right thing to do. But we don’t think through these hard trade-offs in advance, since NextGen is imagined as a perfect substitute.
Tempus fugit
  • Major systems projects never deliver everything all at once, magically, on a single day. In reality, we’ll get a v1.0 that covers 60% of use cases and existing features in February; then v1.1 that covers 70% in April; then 78% coverage in v1.35 in July; then 82% in November… eventually we creep up on 90%. (We never reach 100% compatibility.) And over-optimism means that February doesn’t happen until August — or possibly the following January — as schedules slip and we add more features to both versions.
  • Current customers — the people paying our salaries while we replatform — hear about NextGen and immediately want all of the details. “I need a document right now that walks me through every step of the migration and every change in feature/function so that I can do detailed planning now for Q4 2022.” “We just licensed LegacyTech and expect 4+ years of support on that architecture. Our auditors take a full year to review major upgrades, so quick replacements are not an option.” “What about the third-party integration we put on top of your data extraction API? Can you promise 100% parameter-level compatibility and zero recoding?” “I have 8 urgent P1 tickets waiting to be fixed. You can’t reduce engineering support for production systems.” Executives panic about these entirely expected responses.
  • Sales and Support take us at our word, and tell customers that NextGen will be 100% plug-compatible with LegacyTech: no feature changes, no data migrations, nothing new to learn, no incremental work for them, and includes unlimited free assistance. Promised for the original February delivery date. After all, that was our stated goal for the project. They eventually look foolish in front of customers, and product/engineering are the scapegoats.
  • As replatforming takes much longer than planned (as it always does), there’s increasing pressure to add new features to the current product. Customers demand continuous improvement, and our competitors are not asleep. So we back-fit just one new feature into LegacyTech, then a second, then a third. Each of these has to be implemented in NextGen to keep 100% compatibility, further delaying its arrival.
  • We discover that our six-month project has taken two years and isn’t yet complete. We’re always one last feature away from perfection, and the shiny tech choices we made two years ago are outdated. Internal pressure builds to cancel it (again) and start fresh (again). We fire some architects and product managers, hoping that it goes better next time. Replatforming becomes our codeword for catastrophe.
  • Migrating our entire customer base of 1000 companies will take 2–3 years starting once we deliver UpgradedPlatform v1.0. We’ll be running two development efforts in parallel for at least 18 months for our six-month project (or 3 years for our 1-year project). Sorry, them’s the facts.
  • We’ll migrate customers in well-defined cohorts. First, we set a date for all new customers to start on UpgradedPlatform v1; then a segment of current customers using only core/vanilla features; then those who need some handholding and prodding; then those with complex migrations, etc.
  • Customer Success will create a dedicated migration assistance squad. We will also recruit and train some external development partners to take on bespoke integrations.
  • We will undoubtedly have 40 or 50 long-standing customers who can’t migrate or refuse to move forward. Development’s savings from completely retiring CurrentPlatform may be $10M+/year, but only when we terminate support for the very last customer. So let’s agree now (instead of 3 years from now) that we’re willing to lose a few big customers and their ARR. (If not, let’s cancel this project before we start.)
upgrades/migration over time
  • The KPI/success metric is a stepwise migration of customers. We hope to have 100% of new customers starting on UpgradedPlatform by September, and the easiest 30% of our installed base by December. Each new version will unlock another segment, so we’ll move another 10% per quarter as each improved release arrives.
  • We’ll push back hard on any enhancements to CurrentPlatform but know that a few will be required. (And customers who need those will be unwilling to migrate until UpgradedPlatform has those exact improvements.) So we’ll escalate requests to the Exec Team and implement those later into UpgradedPlatform. These customers go to the end of the migration queue even though it costs us current-quarter revenue.
  • Initially, we don’t make a big public deal about UpgradedPlatform — since it’s a subset of CurrentPlatform. Once we hit 35%, we’ll make a formal End-of-Support announcement for CurrentPlatform with 2 years’ notice. Hard stop on that date with no exceptions, no extensions, no excuses even when HugeCorp calls our CEO and complains. (Otherwise, we’ll be maintaining it for another decade and lose all of the intended efficiencies.)
  • CurrentPlatform goes into maintenance-only mode when UpgradedPlatform hits 75% adoption: that’s when we stop all work except truly critical P1/system down issues. We redeploy the whole CurrentPlatform team to UpgradedPlatform at 95% and have a ceremonial burial for that codeline.

An Example?

  • 40% of customers are single-location American ranchers raising only cattle (cows), submitting paperwork via email/PDF, and using the basic version of our RFID animal tracking system. They use roughly one third of Hoofbeats’ total capabilities. If we ruthlessly prioritize, this could be v1.0 of HoofbeatsTNG. Early migrations will include some manual data transfers, so upgrading 2000 smaller customers will take Q3 and Q4.
  • Another 15% are similar but raise cows and horses and bison. That requires report summaries by species, additional regulatory forms, and breed-specific medication recommendations. Let’s add those to v1.1 and migrate those ranchers next.
  • At this point, we stop selling Hoofbeats-Current to new customers. (“As of November 1, all fresh proposals will be for HoofbeatsTNG. As of January 1, no new signed deals for Hoofbeats-Current. Sales teams get zero commission on any exceptions.”) And we formally announce our end-of-support plan for Hoofbeats-Current. (“Last day of product support for Current will be 30 October 2023. No extensions, no special side maintenance contracts.”)
  • 20% of customers are outside the US, and each country has its own regulatory filings: 16% in Canada, Australia, UK, Brazil and Argentina — so we’ll limit v1.2 to those five countries. In addition to reporting, we must support local languages and currency, date format, and wider phone number/postal code fields.
  • 5% of our customers (but 25% of total revenue) are large agricultural corporations with many locations and management hierarchies. So v1.3 will need multi-site reporting, complex forecasting tools, online form submissions, delegated user permissions, and integration with the top two agricultural accounting systems. Etc.
  • The last 4% are customers we will actively drop or refuse to accommodate. A few raise llamas, some have integrations with obsolete financial software, and we’ll limit ourselves to 6 languages. We have clear executive agreement not to serve these groups going forward — no matter how much they complain. Our account teams will notify them at the appropriate time and recommend alternatives.

Sound Byte

Replatforming or re-architecting production software is a complex, expensive, risky proposition. Most of the challenges are around dealing with internal expectations and paying external customers, not the underlying technical details. So let’s not think of these as development-only exercises and assign sole responsibility to Engineering.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Rich Mironov

Tech start-up veteran, smokejumper CPO/product management VP, writer, coach for product leaders, analogy wrangler, product camp founder.