May 3, 2020

Migration of 6k AEM pages in batches with zero downtime



Old structure (left) to the new, simpler structure (right)

Changing designs of thousands of topics on telegraph.co.uk whilst maintaining 100% uptime
What is this blog about
At Telegraph Engineering we recently created a new topic page design and migrated 6k+ pages from their existing layout to a new layout that is fast, simple and engaging. This blog describes what we did and how we did it, from the perspective of the content repository (JCR).

Note: This blog assumes a basic understanding of Adobe Experience Manager architecture and its data structure.

Glossary
JCR: Java Content Repository, specified as the Content Repository for Java Technology API.
AEM: Adobe Experience Manager, our underlying content management system (CMS).

What are Topic Pages?
Topic pages are subject-specific landing pages that filter news articles based on their relevant topics. The screenshots below show the old and new topic page designs.

Before:

The old-look topic pages

After Migration:

After migration

Business Value — Why did we do it?
We decided to change the layout of some page types in AEM to help increase our readers’ engagement, drive registrations and support our vision.

How — Our Phased Approach
Phase 1: Component rewrite
We started by rewriting existing components. For example, the old list component was difficult to monitor and maintain, could not be called from services outside AEM, and had a dated layout. We rewrote it as a list component that supports the new layout and monitoring and can be used as a service (i.e. external services can call the list service with configurable parameters), which also makes maintenance much easier.

This required us to modify the architecture; however, this blog focuses only on the migration. If you have any questions, please add them in the comments section.

Phase 2: Create content packages in Pre-Prod
After the component rewrite, we removed the old page from the staging (pre-production) environment. We manually created the new pages using the newly-built layout. We then created and downloaded the content packages in AEM.

Phase 3: Content Migration in Live
Firstly we backed up existing pages (as packages). Then we installed the content package for the new topic pages — which we’d created on staging in phase 2 — on to production and activated these pages. This flushed the dispatcher cache and the page was ready with its new layout on production.

Easy, right? Not quite… The manual process worked for one topic page. We have more than 6,000! And they all require migration.

Phase 4: Automating the process
To migrate 6,000 pages we had to develop and adopt an automated solution. So we decided to migrate them in batches to reduce overhead and to make rollback easier if required.
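
As a rough illustration of the batching idea, the sketch below splits a list of page paths into fixed-size groups. The batch size of 500 matches the steps that follow; the function name and the example paths are hypothetical and only stand in for the real page list.

```python
from typing import Iterator, List


def batches(paths: List[str], size: int = 500) -> Iterator[List[str]]:
    """Yield consecutive slices of `paths`, each holding at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(paths), size):
        yield paths[start:start + size]


# Hypothetical page paths standing in for the list of pages to migrate.
example_paths = [f"/content/example/topics/topic-{i}" for i in range(1200)]
batch_sizes = [len(batch) for batch in batches(example_paths)]
print(batch_sizes)  # [500, 500, 200]
```

Because each batch is an independent list, a failed batch can be rolled back from its own backup package without touching the batches that were already migrated.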

Here are the steps that we followed:

  1. Write and then run a Groovy script to find and build a list of pages that require migration (based on business logic).
  2. Run a bash script to generate a content package of 500 pages (from different locations in AEM) on prod. The script also downloads the package to a local directory.
  3. Manually upload the package to the staging environment so that staging is in line with production.
  4. Run a Java program to migrate those pages from the old JCR structure to the new JCR structure.
  5. Run the bash script again on the staging environment to create the content package and download it to a local directory. This time the package contains the migrated pages.
  6. Install the content package on the prod author and publishers.
  7. Run a Python script that rebuilds those pages on the dispatcher to refresh its cache. Reactivating 500 pages from different locations in AEM is not easy, and we can’t clear the whole dispatcher cache, so the only option is a script that automates this and rebuilds the pages on the dispatcher before the first request reaches it.
  8. Repeat steps 1 to 7 until all 6k pages are migrated.
  9. Finally, celebrate your success!!
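
Steps 2 and 5 package pages that live in different parts of the repository. One way to express that, assuming the package is defined by a standard FileVault workspace filter (`filter.xml`), is one `filter` entry per page root. The sketch below only builds that XML; the helper name and the paths are hypothetical, and it is an illustration rather than the bash script mentioned above.

```python
import xml.etree.ElementTree as ET
from typing import Iterable


def build_filter_xml(page_paths: Iterable[str]) -> str:
    """Return a FileVault workspace filter with one root per page path."""
    workspace_filter = ET.Element("workspaceFilter", {"version": "1.0"})
    for path in page_paths:
        if not path.startswith("/"):
            raise ValueError(f"expected an absolute JCR path, got {path!r}")
        ET.SubElement(workspace_filter, "filter", {"root": path})
    return ET.tostring(workspace_filter, encoding="unicode")


# Hypothetical paths from two different sections of the content tree.
filter_xml = build_filter_xml([
    "/content/example/news/topics/economy",
    "/content/example/sport/topics/cricket",
])
print(filter_xml)
```
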
The whole process worked with zero downtime, and it happened so seamlessly that neither editors nor subscribers noticed it.
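
The cache step (step 7) can be pictured as two requests per page: one that invalidates the cached copy and one that fetches the page again, so the dispatcher stores a fresh copy before a reader asks for it. The sketch below assumes the dispatcher's standard `/dispatcher/invalidate.cache` endpoint with `CQ-Action` and `CQ-Handle` headers. The host name, the `.html` suffix and the function names are hypothetical, and this is only an illustration, not the actual script.

```python
import urllib.request
from typing import Dict, List, NamedTuple


class PlannedRequest(NamedTuple):
    method: str
    url: str
    headers: Dict[str, str]


def plan_rebuild(dispatcher_base: str, page_path: str) -> List[PlannedRequest]:
    """Describe the invalidate-then-refetch requests for one page."""
    base = dispatcher_base.rstrip("/")
    invalidate = PlannedRequest(
        method="POST",
        url=f"{base}/dispatcher/invalidate.cache",
        headers={"CQ-Action": "Activate", "CQ-Handle": page_path},
    )
    # Assumption: pages are served at <path>.html on the dispatcher.
    warm = PlannedRequest(method="GET", url=f"{base}{page_path}.html", headers={})
    return [invalidate, warm]


def send(planned: PlannedRequest, timeout: float = 10.0) -> int:
    """Send one planned request and return the HTTP status code."""
    request = urllib.request.Request(
        planned.url, method=planned.method, headers=planned.headers
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Dry run: print the plan instead of calling send().
for step in plan_rebuild(
    "http://dispatcher.example.internal", "/content/example/topics/cricket"
):
    print(step.method, step.url)
```

Keeping the planning separate from the sending makes it easy to review the full list of requests for a batch of 500 pages before anything touches the dispatcher.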


By aem4beginner
