Category Archives: Software Engineering

Database Migrations For Zero Downtime Deployments

One of the biggest challenges with blue-green deployments is migrating the database. While it’s easy to run two different versions of your software at the same time, you can only run one database (i.e. schema) at a time. For example, if you change a table to use a different column name, and then try to run both versions of your software at the same time (as you’d want to with a blue-green deployment) then the old software will break because it doesn’t know about the new column name.

This situation presents a large but not insurmountable challenge. We can resolve this challenge by following some basic principles.

Basic Principles

  • The most basic principle is that each version N of the software MUST work with both version N and version N+1 of the database.
  • Version N and version N+1 of the software must both be able to safely run at the same time on version N+1 of the database.
  • Additionally, migrating the database and migrating the code are independent. The database usually migrates first because version N+1 of the software is not required to work with version N of the database.
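To make the first two principles concrete, here is a minimal sketch that models versions as plain integers. The class and method names are invented for this illustration; they are not part of any library.

```java
final class CompatibilityRules {
    private CompatibilityRules() {}

    /** Software version N must work with database version N and N+1. */
    static boolean softwareWorksWithDatabase(int softwareVersion, int databaseVersion) {
        return databaseVersion == softwareVersion
            || databaseVersion == softwareVersion + 1;
    }

    /** Old and new software may run side by side only if both work with the live database. */
    static boolean canRunSideBySide(int oldSoftware, int newSoftware, int databaseVersion) {
        return softwareWorksWithDatabase(oldSoftware, databaseVersion)
            && softwareWorksWithDatabase(newSoftware, databaseVersion);
    }

    public static void main(String[] args) {
        // Database already migrated to version 2: software 1 and 2 can overlap.
        System.out.println(canRunSideBySide(1, 2, 2)); // true
        // Database still on version 1: software 2 is not required to work with it.
        System.out.println(canRunSideBySide(1, 2, 1)); // false
        // Database two versions ahead: software 1 is no longer covered.
        System.out.println(canRunSideBySide(1, 2, 3)); // false
    }
}
```

The second case is why the database usually migrates first: the new code is only guaranteed to work once the schema it expects is in place.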

This diagram illustrates the directions in which compatibility must be maintained between the code and the database.

[Diagram: directions in which compatibility must be maintained between code versions and database versions]

The consequence of these conditions is that we can do any database refactoring we like with zero downtime. The refactoring just needs to be broken down into smaller refactorings and spread across multiple deployments. The individual refactorings also don’t need to be in consecutive deployments; they can be spread across more deployments as long as they are done in order.
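As an illustration, here is one possible way to break the column rename from the introduction into small steps. This is a sketch under assumptions, not the only valid breakdown: the table name customer, the columns name and full_name, and the class ColumnRenamePlan are all hypothetical, the SQL is PostgreSQL-flavored, and it assumes that name was declared NOT NULL and that the old code is retired before the next release's database migration runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ColumnRenamePlan {
    private ColumnRenamePlan() {}

    /** One release: the database changes that run first, then what the new code does. */
    static final class Release {
        final List<String> databaseChanges;
        final String softwareBehavior;

        Release(String softwareBehavior, String... databaseChanges) {
            this.softwareBehavior = softwareBehavior;
            this.databaseChanges =
                Collections.unmodifiableList(Arrays.asList(databaseChanges));
        }
    }

    /** Hypothetical plan for renaming customer.name to customer.full_name. */
    static List<Release> renameNameToFullName() {
        List<Release> releases = new ArrayList<>();
        // Expand: add the new column; old code simply ignores it.
        releases.add(new Release("read name; write both name and full_name",
            "ALTER TABLE customer ADD COLUMN full_name VARCHAR(255)",
            "UPDATE customer SET full_name = name"));
        // Catch up on rows the retired code wrote to name only, then switch reads.
        releases.add(new Release("read full_name; write both name and full_name",
            "UPDATE customer SET full_name = name"));
        // Let new code stop writing the old column without breaking inserts.
        releases.add(new Release("read and write full_name only",
            "ALTER TABLE customer ALTER COLUMN name DROP NOT NULL"));
        // Contract: no running code references the old column any more.
        releases.add(new Release("no change",
            "ALTER TABLE customer DROP COLUMN name"));
        return releases;
    }

    public static void main(String[] args) {
        List<Release> releases = renameNameToFullName();
        for (int i = 0; i < releases.size(); i++) {
            Release release = releases.get(i);
            System.out.println("Release N+" + (i + 1));
            for (String sql : release.databaseChanges) {
                System.out.println("  database: " + sql);
            }
            System.out.println("  software: " + release.softwareBehavior);
        }
    }
}
```

Each step keeps the previous software version working against the new schema, and the destructive DROP COLUMN comes only once no running code refers to the old column.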

One complication is that this approach becomes very difficult if your releases are very far apart. Imagine a release schedule of once every three months, and imagine a database refactoring spread over three releases… It becomes increasingly likely that things will get forgotten or de-prioritized, and you will end up with a lot of half-finished database refactorings in your system.

The closer together your refactorings are in time, the easier they are to manage. In fact, if you are practicing continuous deployment, the database cleanup steps in subsequent deployments can happen almost immediately after the release, once you are confident in the release and the old code has been retired. Blue-green deployment operates in a feedback loop: it enables you to do smaller and more frequent deployments, and it works much better and is much easier if you do smaller and more frequent deployments.

In the table below we see some sample refactorings and how they can be reduced to a set of compatible changes. Note that each column represents the changes you would make for an individual release N, and the refactorings (rows) are split by database change and software change. Imagine overlaying the diagram above onto the table below and seeing in which directions compatibility needs to be maintained. Finally, the set of database changes for a single release should of course be applied in a single transaction so that the changes are atomic.
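A minimal JDBC sketch of applying one release's database changes as a single transaction might look like the following. The class name TransactionalMigration is hypothetical. Note the assumption that the database supports transactional DDL (PostgreSQL does); on databases where DDL statements commit implicitly (MySQL, for example), a rollback will not undo schema changes that already ran.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

final class TransactionalMigration {
    private TransactionalMigration() {}

    /** Runs all statements for one release, committing only if every one succeeds. */
    static void apply(Connection connection, List<String> statements) throws SQLException {
        boolean previousAutoCommit = connection.getAutoCommit();
        connection.setAutoCommit(false); // start an explicit transaction
        try (Statement statement = connection.createStatement()) {
            for (String sql : statements) {
                statement.execute(sql);
            }
            connection.commit();
        } catch (SQLException e) {
            connection.rollback(); // undo whatever ran before the failure
            throw e;
        } finally {
            connection.setAutoCommit(previousAutoCommit);
        }
    }
}
```

A caller would obtain the Connection from its own DataSource or driver and pass in the list of statements for that release.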

[Table: sample refactorings broken down into compatible database and software changes per release]

Testing

Database migrations should always be tested against production-scale data, in a performance environment, while the database is in use. That way you can see how long the schema migration takes, measure the performance impact of the migration on a system under load, and find solutions to these impacts if necessary.

Conclusion

We’ve outlined here just one technical component of zero downtime deployments. There are many other important non-technical components of zero downtime deployments, including considering the culture of your organization, the maturity of your devops practice, and the release cadence expected by customers or mandated by your industry. There are even other possible technical components such as how to migrate non-database data stores, collaborating service migration, and code that depends on data not on schema.

However, with the basic principles above, I am confident that one of the more difficult components of zero downtime deployment (database migrations) can be solved, and that the other components that apply to your own situation can similarly be solved if you are willing to do the work.

Happy deploying!

Filed under Software Engineering

The Business Case For Zero Downtime Deployments

This post describes the business advantages of zero-downtime (a.k.a. blue-green) deployments. If you are not familiar with blue-green deployments, you can read the excellent Martin Fowler article about them. In a nutshell, the idea is to release to production by deploying a release to a separate production environment and, behind the scenes, gradually redirecting user activity from the old environment to the new one.

Why Do I Want This?

  • Eliminate downtime: Done correctly, your users will literally see zero downtime. Many organizations do deployments at midnight local time so that they can take the entire service offline “when nobody’s using it.” But if your software is successful in a global environment (or if your target demographic’s usage time is split between midnight and mid-day), there is no time when “nobody’s using it” and the best time for one group of users may be the worst time for another.
  • Increase support: Production deployment is potentially the single most dangerous activity your team does. Instead of midnight deployments when everybody is exhausted or asleep, deploy during business hours when the entire team is available and alert.
  • Reduce risk: blue-green deployment allows easy and safe rollback. If something unexpected happens with your release, you can immediately and safely roll back to the last version by simply directing user traffic back to the previous environment.
  • Provide for staging: when the new environment is active, the previous environment becomes the staging environment for the next deployment. If you didn’t have a staging environment, you should probably have one anyway.
  • Hot backup for disaster recovery: after we deploy and are satisfied that the release is stable, we can deploy the new release to the previous environment too. This gives us a hot backup in case of disaster.

What’s The Catch?

Zero-downtime deployment isn’t free: it requires a certain diligence and maturity of process and devops, but not any more than you really need to play with the big kids in the world of software engineering anyway. Details at the engineering level to follow in a future post!

Conclusion

The benefits of this kind of deployment are actually huge. If you can deploy with zero downtime once per month, why not once per sprint? Why not every day? Why not as you finish features and fix bugs so that your customers see their bug fixes and feature requests in a fraction of the time?

If you can do deployments this way, your users will thank you. And your boss will thank you. 🙂

Joining the JCP and Working with JSRs

Recently I was inspired to join the Java Community Process (JCP) and try to support a Java Specification Request (JSR). In some ways it was easy, but in other ways there were parts of the process that were confusing. This post is intended to help anybody who wants to join the JCP and adopt a JSR.

Prologue: Working With Your JUG

If all you want to do is help out a JSR that you’re excited about, and your home JUG has a JCP membership, then stop here. The fact is, you don’t personally need to join the JCP or sign any paperwork to participate with a JSR. If you work in the context of your home JUG, your efforts and those of your fellow JUG members can be submitted to a JSR working group as a contribution of the JUG, so it is covered under the JUG’s JCP membership.

However, if you wish to join the JCP program in your own right as an Associate Member and be listed individually as a Contributor on a JSR, you can join by following the instructions for individual membership on JCP.org.

The above link has detailed steps for joining. It is a lot of text and it looks intimidating, but it’s really easier than it looks!

The rest of this blog post is about joining the JCP as an individual. Essentially you create an Oracle Account if you don’t already have one, fill out a form and submit it, they send you an agreement to sign, you sign it and you’re done!

Step 1: Your Oracle Account

Once you create an Oracle Account and verify your email address, you will need to create your JCP account. On your first login to jcp.org you will see

“Your SSO login is not yet associated with any JCP account. So click the “Create New JCP Account” button below to create your account. If you are trying to link your SSO login to an existing JCP account, please contact the PMO: admin@jcp.org. In the message, describe both the e-mail address for your SSO login and the JCP account. If you are trying to create a new jcp.org account, please click this “Create New JCP Account” button.”

Click the button to create your account and you’re done!

One caveat is that your SSO login can only be associated with a single JCP membership. This is described on the membership description page, but I didn’t see it at first because it was in small print at the bottom of the page. This tripped me up because I tried to use a single login to create a JCP account both for a JUG and for myself as an individual. Each membership requires a distinct user account.

Step 2: Legalese

They will send you the Associate Membership Agreement (AMA) form to sign, which really is just two pages of legalese and a couple clicks. It takes just a few minutes, and when you submit it, it then goes back to them for approval.

Approval can take a week or more because this is not an automated process, so don’t worry if you don’t hear anything for a little while. But if multiple weeks go by, try pinging admin@jcp.org.

Step 3: Adopt A JSR

Ok, here is where we need some more explanation. The JCP / JSR relationship is a little confusing at first.

Let’s say you sign the AMA and log in to jcp.org. If you go to the Java SE 9 JSR to join it, there is a “join” link in a tiny font halfway down the page which is easy to miss. And if you attempt to join and sign up as a Contributor, they will respond that you can’t be a Contributor because you haven’t contributed anything. Also if you go to adoptajsr.org, that’s a different login (for java.net) and it doesn’t mention the JCP at all. So how do you “join” or “adopt” a JSR and start contributing?

The JCP is an organization, and joining it gives you certain abilities like voting rights. The legal agreement you sign when you join covers the legal ownership of any contributions to a JSR. But except for login and joining/voting, you don’t do much on jcp.org. The JCP doesn’t host any repositories or code, and it doesn’t host the actual JSR home pages. The JSR page on the JCP site (such as Java EE 8, JSR 366) is not the JSR’s home page (such as the Java EE JSR Home). So to join a JSR, you will have to go to the individual JSR’s home page and get involved from there. Each JCP page has a link called “Public Project Page” that takes you to that JSR’s actual home page, but if you don’t know that this is what you’re looking for, it might be hard to find.

Each JSR hosts its own code and has its own set of rules. Joining a JSR means different things depending on the JSR, because the individual JSRs are managed by different people and different organizations, so each one could be run differently. For instance, OpenJDK is on java.net, some independent JSRs like JSR 107 (Java Caching) are on GitHub, and Red Hat and IBM each use their own repos. If you want to work on multiple JSRs, you’ll need to create separate logins.

In the case of OpenJDK (for the Java SE JSR) it operates more like an open source project. There is no “I hit the Join button and now I’m a member, or now I’ve adopted it.” It’s more like “I’ve been monitoring the email list, picked out a favorite bug, submitted patches, and gradually become part of the community of people working on this thing.” The JDK contribution page outlines how to get involved with OpenJDK. Getting started would involve as little as getting familiar with the conversations going on in the mailing lists, such as the Java 9 observer mailing list.

Conclusion

For more information, go to the Adopt A JSR Home Page.

Joining the JCP and working with a JSR is not difficult, but getting started can be unclear for the uninitiated. Hopefully this post clears the way for you to become active in the Java Community!

Drawbacks of Sprint Zero, or, How To Do Just Enough Design Up Front

I read this article recently about drawbacks of sprint zero and thought it was highly relevant to us as Software Engineers. Please read this article then come back here and we can talk 🙂

There is a balance to be had between doing ALL design up front, NO design up front, and (as this article suggests) SOME design up front. I have seen all of these approaches, and it is admittedly tricky at times to pick the right balance. For new product development a design phase does seem to work well in my experience. And for existing product development I would say a healthy dose of “big picture thinking” is appropriate when designing the feature (perhaps at story grooming time) and an entire design phase might be a bit much at that point. Again, as with most things, I think the answer to the question of how much design to do up front is “It depends.”

How do you design your products? How much time do you spend designing your features and how do you know it’s the right amount of time? Let me know what you think!

Reactions: A First Look at Couchbase

The Philly JUG had a great meeting this week about Couchbase. I casually knew about Couchbase as a NoSQL solution, but there were a couple really cool things that surprised me.

First: There’s a Couchbase Lite that is suitable for running directly on mobile devices. So you could use this instead of SQLite on Android! Using NoSQL on your phone sounded pretty exciting!

Also: There is a Sync Gateway for client-server replication and peer-to-peer replication. From their site, Couchbase Lite “Supports replication with compatible database servers. This gives your app best-of-breed sync capabilities. Not only can the user’s data stay in sync across multiple devices, but multiple users’ data can be synced together. Supports peer-to-peer replication. By adding an extra HTTP listener component, your app can accept connections from other devices running Couchbase Lite and exchange data with them.”

Finally, it’s worth noting that Spring Data supports Couchbase directly. So if you are in the Spring world, it should make using Couchbase even easier.

When a Philly JUG meeting ends with everybody learning something new, I’ll call it a rousing success!

Using EclipseLink with Spring Boot

The Promise

Spring Boot advertises in its self-introduction that one of its goals is to “Be opinionated out of the box, but get out of the way quickly as requirements start to diverge from the defaults.”

One thing that we might want to change from the defaults is the choice of JPA implementation. Hibernate is Spring Boot’s default JPA implementation, but EclipseLink (which is in fact the JPA Reference Implementation) is a good choice as well, and many people will have compelling reasons to choose it over Hibernate.

Does Spring Boot keep its promise to get out of our way if we want to deviate from the defaults? Let’s find out!

Put It To The Test

At one point there was an issue filed and resolved in the spring data examples project: people wanted an example demonstrating how to use EclipseLink in Spring Boot. So the eclipselink sample that came out of that is probably the best place to start.

It is a good starting point, but if we use Gradle we still have some work to do because the example uses Maven. We can instead start with the JPA project from spring.io’s Accessing Data With JPA guide and incorporate the basic changes from the eclipselink sample.

The first thing we’ll need to do is add some configuration to provide the EclipseLink vendor adapter. Note that this configuration extends the JpaBaseConfiguration from Spring Boot 1.4 (the class for Spring Boot 1.3 is a little different). In this class, the “eclipselink.weaving” property is set to false to turn weaving off. This is maybe not what we’d want in production, but it makes the demo a bit simpler.

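import java.util.Collections;
import java.util.Map;

import javax.sql.DataSource;

import org.springframework.beans.factory.ObjectProvider;
import org.springframework.boot.autoconfigure.orm.jpa.JpaBaseConfiguration;
import org.springframework.boot.autoconfigure.orm.jpa.JpaProperties;
import org.springframework.context.annotation.Configuration;
import org.springframework.orm.jpa.vendor.AbstractJpaVendorAdapter;
import org.springframework.orm.jpa.vendor.EclipseLinkJpaVendorAdapter;
import org.springframework.transaction.jta.JtaTransactionManager;

@Configuration
public class EclipseLinkConfiguration extends JpaBaseConfiguration {

   protected EclipseLinkConfiguration(DataSource dataSource, JpaProperties properties,
                     ObjectProvider<JtaTransactionManager> jtaTransactionManagerProvider) {
      super(dataSource, properties, jtaTransactionManagerProvider);
   }

   // Use EclipseLink instead of the default Hibernate vendor adapter
   @Override
   protected AbstractJpaVendorAdapter createJpaVendorAdapter() {
      return new EclipseLinkJpaVendorAdapter();
   }

   @Override
   protected Map<String, Object> getVendorProperties() {

      // Turn off dynamic weaving to disable LTW (Load Time Weaving) lookup in static weaving mode
      return Collections.singletonMap("eclipselink.weaving", "false");
   }
}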

Changes to the build script are essentially excluding the Hibernate implementation of JPA and including the EclipseLink implementation.


dependencies {

   compile("org.springframework.boot:spring-boot-starter-data-jpa:1.4.0.M3") {
      exclude group: "org.hibernate", module: "hibernate-entitymanager"
   }
   compile("org.eclipse.persistence:org.eclipse.persistence.jpa:2.6.3")

   // .. other dependencies here
}

Finally, one issue with the original sample’s JPA Entity is that EclipseLink doesn’t like using a primitive long as a primary key, so we can change it to Long.

You can get all the source for this project on GitHub.

Results

On running, we see initialization output we’d expect, starting with

Building JPA container EntityManagerFactory for persistence unit 'default'
Initialized JPA EntityManagerFactory for persistence unit 'default'

and ending with

Closing JPA EntityManagerFactory for persistence unit 'default'

Running “gradle dependencies” and searching for hibernate in the output shows zero results, while doing the same for eclipselink clearly shows EclipseLink being used:

gradle dependencies | grep eclipse.persistence 

+--- org.eclipse.persistence:org.eclipse.persistence.jpa:2.6.3
| +--- org.eclipse.persistence:javax.persistence:2.1.1
| +--- org.eclipse.persistence:org.eclipse.persistence.asm:2.6.3
| +--- org.eclipse.persistence:org.eclipse.persistence.antlr:2.6.3
| +--- org.eclipse.persistence:org.eclipse.persistence.jpa.jpql:2.6.3
| \--- org.eclipse.persistence:org.eclipse.persistence.core:2.6.3
| +--- org.eclipse.persistence:org.eclipse.persistence.asm:2.6.3

Conclusion

Spring Boot got out of our way pretty quickly and easily once we wanted to move away from the default technology. We can consider the promise kept!

This is not the end, of course. There is more to the configuration if we want our application to be ready for production, but this is a first step to using EclipseLink in a Spring Boot application. We can build on this in a future post!

Smaller Stories, Agile Journeys

This is a story of one company’s continuing journey to discover agility.

Episode I: WATERFALL

Let’s pretend there’s a company that has no concept of a “Story”, small or otherwise. The description of work for the entire software product is “get it done.” Well, there are specs/etc, but you know what I mean. No part of the software is done until EVERYTHING is done. In this world, there are no stories, there’s just “The Thing You’re Working On” and the developers code like crazy until it’s done from their point of view. Then the developers “throw it over the wall” for QA to test it, and hope nothing comes back to bite them.

The result of this situation is that you can’t deliver anything until you can deliver everything. So if something changes or there’s a bug, or something is not done on time… The company is unable to deliver incremental value and they are basically sitting on lost revenue and lost feedback until they can deliver. Obviously this is not good from the company’s point of view. Can they do better?

Episode II: MINIFALLS

The company then finds out about “Agile” and “Scrum.” So they start using sprints, user stories, product owners, etc, and everything is great! They even discover that smaller stories are easier to describe and easier to finish, so there is a push to make smaller stories, which is good right? That way you can develop one thing, QA it, develop another thing, QA it, and so on until the sprint is complete.

This is an improvement over waterfall, but they’ve just gone from throwing it over the wall to throwing it over smaller walls. In the beginning of the sprint, QA does not have much to do because they are waiting on development of stories to finish, and at the end of the sprint, developers do not have much to do because they are waiting on any bugs to come back and can’t start work yet on the next sprint. Can they do better?

Episode III: AGILE

Finally, the company finds out how to get the members of its co-located cross-functional teams to actually work together. With smaller well-defined stories in hand, team members with development skills and team members with testing skills are working together on Day One to get each story over the finish line. From the moment the team breaks from sprint planning to begin the sprint, they are engaged together in clarifying edge cases, coding, designing tests, more coding, writing automated tests, and even more coding. They recognize that if they have nothing to do, it’s because they’re missing something in how they should be working and collaborating in their team.

Epilog

I really recommend reading “What Does QA Do on the First Day of a Sprint?” It paints what I think is a great picture with all the details about what this world can look like.

Which stage is your company in? How do developers and testers work together on your team?
