Protecting a Core Data Migration from Corruption

Craig Marvelley

July 7, 2017

“If something can corrupt you, you’re corrupted already.”

Bob Marley

It’s fair to say that Bob was a pretty relaxed guy, at least where his music was concerned. I suspect though that if he’d used a note-taking app to jot down the lyrics to 1977’s “Exodus” only to later discover them lost after an app update, even his composure would’ve been tested.

It’s an unfortunate fact that when it comes to software, things often go wrong. The applications we use consist of millions of lines of code, which rely on further billions of lines in the operating system they live upon. That leaves plenty of opportunity for bugs, which can sometimes have undesirable consequences.

Take data storage as an example. We’ve previously discussed how in the Bipsync Notes iOS app we persist application data using Apple’s Core Data framework, which stores our content in an SQLite database on disk. From time to time we change the database schema as we introduce new features to the app. When this happens, we need to update all the databases within all the instances of the app installed across our userbase so their schema matches the latest version.

The Core Data library provides two methods with which we can migrate database schemas from one version to another. The original method, known as ‘heavyweight’ migration, would involve us writing numerous migration mapping classes which describe how data moves between the old and new versions of the entities within the Core Data entity model. This approach is onerous and adds a substantial amount of work to any version of the app that involves a schema change. Unsurprisingly, developers were less than enthused. So to improve matters Apple introduced what is now known as a ‘lightweight’ migration option. This is the method we prefer.

Lightweight Migrations

The beauty of lightweight migrations is that they largely take care of themselves. When setting up an NSPersistentStoreCoordinator to manage the database, we opt in to lightweight migrations by setting the NSMigratePersistentStoresAutomaticallyOption option to true, together with NSInferMappingModelAutomaticallyOption so that Core Data infers the mapping between versions itself. If Core Data detects any differences between the schema of the existing database and the incoming entity model, and those differences can be resolved by a series of SQL statements, it will issue those statements against the database until the schema matches that of the entity model.
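As a rough illustration of that opt-in, here is a minimal Swift sketch. The function name and its `model` and `storeURL` parameters are placeholders of ours, not code from the Bipsync app; the option keys and the coordinator call are standard Core Data API.

```swift
import CoreData

/// Creates a coordinator and adds a SQLite store with lightweight migration enabled.
/// `model` and `storeURL` are supplied by the caller (hypothetical names).
func makeCoordinator(model: NSManagedObjectModel,
                     storeURL: URL) throws -> NSPersistentStoreCoordinator {
    let coordinator = NSPersistentStoreCoordinator(managedObjectModel: model)

    // Both options are needed for an automatic, inferred (lightweight) migration.
    let options: [AnyHashable: Any] = [
        NSMigratePersistentStoresAutomaticallyOption: true,
        NSInferMappingModelAutomaticallyOption: true
    ]

    // This call blocks until any required migration has finished,
    // so it should not be made on the main thread for large stores.
    _ = try coordinator.addPersistentStore(ofType: NSSQLiteStoreType,
                                           configurationName: nil,
                                           at: storeURL,
                                           options: options)
    return coordinator
}
```

If the differences cannot be inferred, the call throws rather than migrating, which is the point at which a heavyweight migration would be needed.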

It’s not foolproof – some schema differences can’t be dealt with via SQL, in which case heavyweight migrations are the only way forward. However, this lightweight approach is able to handle all of our migrations. There are some tradeoffs to taking this lightweight path; we aren’t able to gauge its progress, which can be problematic, as we’ll later reveal. But overall, the amount of development time lightweight migrations save is compelling.

When Migrations Go Bad

Most of the time these migrations complete without a hitch, but there are reports of databases being left unusable afterward. Indeed, we recently experienced this issue ourselves. In these situations there’s little one can do after the fact; the database is corrupted. Reading from or writing to it is unreliable at best and impossible at worst.

In our case, we believe the cause of the corruption was the user quitting the app prematurely while a migration was in progress. iOS should be able to cope with such an event without harming the database, but that does not appear to always be the case. Unfortunately there’s no way to prevent the user from force quitting the app, but we can take steps to minimise damage if they do.

Back up your database

This is essential. Having a safe copy of your database before attempting to migrate it means that if anything does go wrong, you have the option to revert to a pristine version.

Take a backup before you attempt to load the store – otherwise Core Data will begin a lightweight migration automatically before you have a chance. In our case, once we have a backup, we can load the store and deal with the outcome as appropriate. Our process is:

1. Determine if a migration is required
2. If it is, copy the database to a safe location
3. Load the Core Data store, which will automatically begin to migrate the database
4. If migration is successful, delete the copied database
5. If not, remove the semi-migrated database and replace it with the backup, so that the process is repeatable

Most of these steps are straightforward for a seasoned iOS developer, and are concerned with copying files around or initialising stores. When taking a backup of the store, be sure to also copy the journal files that go with the SQLite database (in the default write-ahead logging mode, the -wal and -shm files that sit alongside it), otherwise data will be lost.
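The copy and restore steps might look something like the following Swift sketch. It assumes the store uses SQLite's write-ahead logging mode, so the companion files carry the `-wal` and `-shm` suffixes; the function names and the `backupDirectory` parameter are hypothetical, and the copy must happen before the store is loaded.

```swift
import Foundation

/// The main database file plus the sidecar files SQLite keeps in WAL mode.
let storeFileSuffixes = ["", "-wal", "-shm"]

/// Copies the store and any sidecar files into `backupDirectory`.
func backUpStore(at storeURL: URL, to backupDirectory: URL) throws {
    let fm = FileManager.default
    try fm.createDirectory(at: backupDirectory, withIntermediateDirectories: true)
    for suffix in storeFileSuffixes {
        let source = URL(fileURLWithPath: storeURL.path + suffix)
        guard fm.fileExists(atPath: source.path) else { continue }
        let destination = backupDirectory.appendingPathComponent(source.lastPathComponent)
        if fm.fileExists(atPath: destination.path) {
            try fm.removeItem(at: destination) // discard a stale backup
        }
        try fm.copyItem(at: source, to: destination)
    }
}

/// Replaces the (possibly half-migrated) store with the backup copy.
func restoreStore(at storeURL: URL, from backupDirectory: URL) throws {
    let fm = FileManager.default
    for suffix in storeFileSuffixes {
        let target = URL(fileURLWithPath: storeURL.path + suffix)
        let backup = backupDirectory.appendingPathComponent(target.lastPathComponent)
        // Remove the damaged file even if the backup has no counterpart,
        // so a leftover journal is never applied to the restored database.
        if fm.fileExists(atPath: target.path) {
            try fm.removeItem(at: target)
        }
        if fm.fileExists(atPath: backup.path) {
            try fm.copyItem(at: backup, to: target)
        }
    }
}
```

After a successful load the backup directory can simply be removed; after a failed one, calling `restoreStore` puts the files back so the whole process can be retried on the next launch.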

The first step, determining if you require a migration, is probably the least obvious task. It’s easy to do though, and can be accomplished in a few extra lines of code:

https://gist.github.com/craigmarvelley/bb4cc49561d48edf3a060e482fd51cd7#file-migration1-m
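For readers who cannot view the gist, the check can be sketched in Swift roughly as follows. This is our own approximation rather than a transcription of the linked Objective-C file, and the function name is hypothetical; it compares the metadata stored in the database with the current model.

```swift
import CoreData

/// Returns true when the store on disk does not match `model`
/// and would therefore be migrated when loaded.
func storeNeedsMigration(at storeURL: URL,
                         to model: NSManagedObjectModel) throws -> Bool {
    // A store that does not exist yet has nothing to migrate.
    guard FileManager.default.fileExists(atPath: storeURL.path) else {
        return false
    }

    // Read the store's metadata without loading (and so without migrating) it.
    let metadata = try NSPersistentStoreCoordinator.metadataForPersistentStore(
        ofType: NSSQLiteStoreType,
        at: storeURL,
        options: nil)

    // The model is compatible only if its entity version hashes match the store's.
    return !model.isConfiguration(withName: nil,
                                  compatibleWithStoreMetadata: metadata)
}
```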

This code can also be reused by the app to work out if we need to show any additional UI as the migration takes place, especially since the copying of the database could add significant time to the migration task. Which brings us nicely on to the next consideration…

Keep the user informed

Depending on how large your database is, a migration can take anywhere from several seconds to several minutes to complete. We regularly see times of around 20 seconds to back up and migrate databases of several hundred megabytes in size. It’s crucial that during this time your app remains responsive, even if it is technically unusable until the migration completes. Should the user assume an app launch failure, they are likely to terminate it themselves, and the database is at risk.

If you’ve followed what is now best practice, you’ll be initialising your store on a background thread as described here. If you’re targeting iOS 10 and up, you could take advantage of NSPersistentContainer, which can do this for you. Either way, it’s critical that the main thread, which manages the app’s UI, is not blocked waiting for the migration to finish. User experience aside, should the app fail to launch in a timely manner, the watchdog process will kill the app anyway.

With the UI free to update during the migration, we can now display a loading UI to differentiate this particular startup routine from the average one. A progress bar would be ideal here, but lightweight migrations don’t expose an API to measure their progress (though it’s possible to do this if you are using heavyweight migrations). In the absence of a progress bar an indeterminate activity indicator, i.e. a spinner, will do. We also present a message to the user, as you can see below.
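With NSPersistentContainer, loading off the main thread could be sketched like this. The function and its `modelName` parameter are illustrative names of ours, and the sketch assumes a single store description (the default); the completion handler is hopped back to the main queue so the UI can react.

```swift
import CoreData

/// Loads (and, if needed, migrates) the store without blocking the caller.
func loadStoreInBackground(
    modelName: String,
    completion: @escaping (Result<NSPersistentContainer, Error>) -> Void
) {
    let container = NSPersistentContainer(name: modelName)

    for description in container.persistentStoreDescriptions {
        // Without this flag the store is added synchronously on the calling thread.
        description.shouldAddStoreAsynchronously = true
        description.shouldMigrateStoreAutomatically = true
        description.shouldInferMappingModelAutomatically = true
    }

    // The handler runs once per store description; one description is assumed here.
    container.loadPersistentStores { _, error in
        DispatchQueue.main.async {
            if let error = error {
                completion(.failure(error))
            } else {
                completion(.success(container))
            }
        }
    }
}
```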

The screen the user sees during migration.

It’s simple but effective, and makes it clear to the user that the wait is abnormal and requires some patience. If they nonetheless continue to terminate the app, at least we can say that they were warned.

Employ a background task

Even if the user doesn’t quit the app, they may get bored waiting for the migration to finish and decide to switch to another app in the meantime. When an app goes into the background, iOS usually allows it a few seconds to clean up before it is suspended, after which it may be terminated without warning. To an ongoing migration which can easily last more than a few seconds, this presents the same danger as if it were killed by the user.

To protect against this we can start a background task before the migration. This ensures that if the app goes into the background as the user switches to another one, iOS will grant it extra time to finish what it’s doing. The exact allowance varies between iOS versions and isn’t guaranteed, but it is considerably more than the few seconds the app would otherwise get, and has been ample time for our migrations to complete. When the user returns to the app they’ll either pick up where they left off on the loading screen or, if the initialisation process completed, they’ll find the app migrated and ready to use.
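A minimal Swift sketch of wrapping the migration in a background task follows. The function name, the task name string and the `migrate` closure are our own placeholders; the sketch assumes it is called on the main thread.

```swift
import UIKit

/// Runs `migrate` on a background queue while holding a background task,
/// so the work can continue for a while if the user switches apps.
/// Must be called on the main thread.
func runProtectedMigration(_ migrate: @escaping () -> Void) {
    let application = UIApplication.shared
    var taskID = UIBackgroundTaskIdentifier.invalid

    taskID = application.beginBackgroundTask(withName: "CoreDataMigration") {
        // Called on the main thread shortly before the extra time runs out.
        // The task must be ended here or iOS will kill the app.
        application.endBackgroundTask(taskID)
        taskID = .invalid
    }

    DispatchQueue.global(qos: .userInitiated).async {
        migrate()
        DispatchQueue.main.async {
            // End the task as soon as the work is done, unless it already expired.
            if taskID != .invalid {
                application.endBackgroundTask(taskID)
                taskID = .invalid
            }
        }
    }
}
```

The expiration handler cannot stop the migration itself, which is one more reason to keep the backup until the store has loaded successfully.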

Be vigilant for corruption

Even with the above recommendations in place, you have to accept that sometimes events beyond your control can conspire to corrupt your database and prevent automatic restoration. In such a scenario it’s important that you’re immediately aware of the situation and prevent it from doing further damage to the user experience.

Say, for example, the database can be read from but cannot be written to. Here, there is a danger that while the app looks fully functional to the user, none of their changes to their data are persisted to disk. Rather than let the user waste time on the app in a broken state, we can detect related errors emerging from calls to the NSManagedObjectContext API and put the app into an error state until we can investigate in more detail.
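One way to detect such a failure is to funnel saves through a single helper, as in this Swift sketch. The notification name and the rule for deciding which errors indicate an unusable store are assumptions of ours, not the app's actual logic; a real app would tune that rule to the errors it sees in practice.

```swift
import CoreData

extension Notification.Name {
    /// Hypothetical notification the UI layer observes to show an error screen.
    static let persistentStoreBecameUnusable =
        Notification.Name("PersistentStoreBecameUnusable")
}

/// Saves `context` and signals an error state if the store itself rejects the write.
func saveOrEnterErrorState(_ context: NSManagedObjectContext) {
    context.perform {
        guard context.hasChanges else { return }
        do {
            try context.save()
        } catch let error as NSError {
            // Assumption: SQLite-level errors and generic store save failures
            // point at the database, whereas validation errors point at the data.
            let isStoreFailure = error.domain == NSSQLiteErrorDomain
                || error.code == NSPersistentStoreSaveError
            guard isStoreFailure else { return }

            DispatchQueue.main.async {
                NotificationCenter.default.post(name: .persistentStoreBecameUnusable,
                                                object: nil,
                                                userInfo: ["error": error])
            }
        }
    }
}
```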

This screen is presented if the database is unusable.

While the user may find a paused app an inconvenience, especially if it comes out of the blue, that feeling is nothing compared to how they would feel should data be lost.

In conclusion

Murphy’s Law dictates that things will always go wrong in software development, but when it comes to Core Data migrations we can take steps to keep our users’ data safe.

No corruption, no cry.