You have just finished deploying the latest version to your production environment, but something went wrong, and after a few minutes or hours the business people ask you to revert to the previous version. You were smart and had backed up the version running in production as a tar.gz. Next you manually copied the backup and deployed it to all your servers… without making mistakes, because you had perfect documentation for this procedure and you remembered every rollback step… didn’t you?
Antipattern: using a backup to roll back an application to a previous version.
The recipe: automate the rollback in the same way you automate the deploy
In the last post we saw how to use business metrics to validate a deploy, using a mix of software instrumentation and PaaS services. But what should you do when a deploy validation fails?
You have at least 2 different aspects to consider:
- Changes at “behaviour” level (your source code)
- Changes at “data” level (your database)
Rollback of Source Code
At onebip.com we use a simple trick: every deploy goes into a new folder with a unique generated name (provided by our deployment pipeline tool). After a successful, validated deploy we update a symbolic link, “onebip”, to point to the working directory of the latest version of the application. To roll back the code, we just point the symbolic link back to the previous version, which takes a few seconds.
But how do we know which version is the previous one? We create a previousVersion file with a very simple command:
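The symlink trick can be sketched in plain shell. This is a minimal illustration, not the actual pipeline: the `activate_release` function, the `releases` folder and the `build-101`/`build-102` names are made up for the example, and it runs in a throwaway temporary directory.

```shell
#!/bin/sh
# Sketch: each release lives in its own folder; a symlink selects the live one.
# APP_ROOT, the "releases" folder and the build names are assumptions.
set -eu

# activate_release APP_ROOT RELEASE_NAME
# Points APP_ROOT/onebip at APP_ROOT/releases/RELEASE_NAME.
activate_release() {
    app_root=$1
    release=$2
    target="$app_root/releases/$release"
    if [ ! -d "$target" ]; then
        echo "release $target does not exist" >&2
        return 1
    fi
    # -n stops ln from following the old link into the old release folder
    ln -sfn "$target" "$app_root/onebip"
}

# Demo in a scratch directory so nothing real is touched.
APP_ROOT=$(mktemp -d)
mkdir -p "$APP_ROOT/releases/build-101" "$APP_ROOT/releases/build-102"

activate_release "$APP_ROOT" build-101
activate_release "$APP_ROOT" build-102
readlink "$APP_ROOT/onebip"   # shows which release is live
```

Because the old release folder is left in place, going back is only a matter of repointing the link. Note that `ln -sfn` replaces the link in two steps (unlink, then create), so it is quick but not strictly atomic.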
<echo msg="Preparing version info file in ${deploy.basedir}" />
<phingcall target="exec-pssh">
    <property name="ssh.host.remote" value="${ws.hosts.argument}" />
    <property name="ssh.command" value="echo `readlink ${deploy.basedir}` > ${deploy.basedir.number}/previousVersion" />
</phingcall>
<echo msg="Version info OK!" />
Now the rollback becomes easy:
<target name="rollback">
    <echo msg="Rollback" />
    <exec command="echo '-H ${ssh.username}@${ws.host}'" outputProperty="ws.hosts.argument" />
    <phingcall target="exec-pssh">
        <property name="ssh.host.remote" value="${ws.hosts.argument}" />
        <!-- Read the recorded previous version, check that it exists, repoint the symlink and verify the result -->
        <property name="ssh.command" value="
            cd ${deploy.basedir} &amp;&amp;
            PREVIOUS_VERSION=`cat previousVersion` &amp;&amp;
            echo PREVIOUS_VERSION is $PREVIOUS_VERSION &amp;&amp;
            if [ ! -d $PREVIOUS_VERSION ]; then
                echo FAILURE PREVIOUS_VERSION $PREVIOUS_VERSION does not exist, we can not rollback;
                exit 255;
            fi &amp;&amp;
            cd .. &amp;&amp;
            rm -fR ${deploy.basedir} &amp;&amp;
            ln -s $PREVIOUS_VERSION ${deploy.basedir} &amp;&amp;
            ACTUAL_VERSION=`readlink ${deploy.basedir}` &amp;&amp;
            if [ $ACTUAL_VERSION != $PREVIOUS_VERSION ]; then
                echo FAILURE ACTUAL_VERSION: $ACTUAL_VERSION, PREVIOUS_VERSION: $PREVIOUS_VERSION;
            else
                echo OK link ok actual: $ACTUAL_VERSION;
            fi
        " />
    </phingcall>
    <echo msg="Rollback done" />
</target>
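To make the logic of that command easier to follow, here is roughly the same sequence as a standalone script for a single local host, without pssh. It is a sketch under assumptions: the `rollback` function name and the `build-101`/`build-102` folders are invented for the demo. Unlike the target above, it returns a non-zero status when the final link check fails.

```shell
#!/bin/sh
# Sketch of the rollback steps as a local script (single host, no pssh).
# Folder names in the demo are made up for illustration.
set -eu

# rollback LINK: repoint LINK to the path stored in LINK/previousVersion.
rollback() {
    link=$1
    previous=$(cat "$link/previousVersion")
    if [ ! -d "$previous" ]; then
        echo "FAILURE: previous version '$previous' does not exist" >&2
        return 255
    fi
    rm -f "$link"                 # removes only the symlink, not the release
    ln -s "$previous" "$link"
    actual=$(readlink "$link")
    if [ "$actual" != "$previous" ]; then
        echo "FAILURE: link points to $actual, expected $previous" >&2
        return 255
    fi
    echo "OK: link now points to $actual"
}

# Demo: two releases; the second one records the first as its predecessor.
ROOT=$(mktemp -d)
mkdir "$ROOT/build-101" "$ROOT/build-102"
ln -s "$ROOT/build-101" "$ROOT/onebip"
# At deploy time: record where the link points before switching it.
readlink "$ROOT/onebip" > "$ROOT/build-102/previousVersion"
ln -sfn "$ROOT/build-102" "$ROOT/onebip"

rollback "$ROOT/onebip"
```

The failed release folder is kept on disk, which makes it possible to inspect it after the rollback.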
Rollback of Persisted Data
Managing data is a bit harder, because you have to handle schema changes both in the persisted data and at the application level.
A specific post on the subject is coming soon… for now let’s assume that you can simply run a rollback script and revert the schema and data in your database.
Test the Rollback automation
An important point to remember is that the rollback must be tested, just like any other feature. The process we currently use is very simple: at every deploy to the “Integration Test” environment, we perform the following steps:
- deploy new version
- rollback
- deploy new version again
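The three steps above could be exercised with a small smoke test like the one below. It is only a sketch: `deploy`, `rollback` and `expect_link` are simplified, hypothetical stand-ins for the real pipeline targets, and everything happens in a scratch directory. The redeploy gets a fresh folder name, since every deploy goes into its own folder.

```shell
#!/bin/sh
# Sketch of the deploy -> rollback -> deploy cycle on a scratch directory.
# deploy/rollback are simplified stand-ins, not the pipeline's real targets.
set -eu

ROOT=$(mktemp -d)
LINK="$ROOT/onebip"

deploy() {   # deploy RELEASE_NAME
    new="$ROOT/$1"
    mkdir -p "$new"
    # Remember the current version (if any) before switching the link.
    if [ -L "$LINK" ]; then
        readlink "$LINK" > "$new/previousVersion"
    fi
    ln -sfn "$new" "$LINK"
}

rollback() {
    previous=$(cat "$LINK/previousVersion")
    [ -d "$previous" ] || { echo "cannot roll back to '$previous'" >&2; return 255; }
    ln -sfn "$previous" "$LINK"
}

expect_link() {   # expect_link RELEASE_NAME
    actual=$(readlink "$LINK")
    if [ "$actual" != "$ROOT/$1" ]; then
        echo "FAILURE: expected $ROOT/$1, got $actual" >&2
        return 1
    fi
}

deploy build-101            # the version already running
deploy build-102            # 1. deploy new version
expect_link build-102
rollback                    # 2. rollback
expect_link build-101
deploy build-103            # 3. deploy new version again (fresh folder)
expect_link build-103
echo "rollback cycle OK"
```

If any step leaves the link pointing at the wrong folder, the script stops with a non-zero exit status, which a pipeline can treat as a failed build.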
Conclusions
This is a lesson learned from the first “not working” rollback we did in production… Now we want to be sure that the rollback works every time, so we automated it and we dry-run the rollback procedure in the deployment pipeline.
What are your thoughts on the subject? How do you manage rollbacks? Let me know 🙂