CloudPhotos – Apple CloudKit Sample

CloudPhotos: Using CloudKit with iOS is an interesting Apple sample project, last updated a year ago, so I thought I would blog about some observations I had.

The table view controller uses an array of CKRecords for its data. I sort of remember Apple saying not to use CKRecord as your model class, but all I could find was this in the docs:

Customizing Records
The CKRecord class does not support any special customizations and should not be subclassed. Use this class as-is to manage data coming from or going to the server.

This doesn’t exactly say outright “don’t use it as the model”, but if they say don’t subclass it, then to add your own methods I suppose you might try a category. The thing is, CKRecord contains useful logic for change tracking, e.g. when you modify a property it stores the key in changedKeys. Then when you come to use a CKModifyRecordsOperation with the savePolicy CKRecordSaveChangedKeys, it optimises by only sending those keys that were changed (I think). I suppose the issue here is that this is a lot of useful behaviour you wouldn’t want to re-implement yourself in your own model class, so you do in fact want to be using CKRecord as your model class for things that will be sent back to the server. So why Apple say not to subclass it is a bit of a mystery. Anyway, once you start to implement a local cache, or would like to make use of a Core Data fetched results controller and sorting, you will likely be using NSManagedObjects as your model classes. CKRecord’s encodeSystemFieldsWithCoder: does not include the changedKeys array, so I suppose what you can do is track the changed keys yourself on the managed object, and then when you come to transmit the record only set the keys that you want to be sent, and CKRecord probably won’t mind that it is missing the other information.
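As a minimal sketch of the changed-keys upload described above, the save policy can be set on a CKModifyRecordsOperation as follows. The function name `ExampleSaveChangedKeys` is hypothetical and not part of the Apple sample; the CloudKit calls themselves are from the public headers.

```objc
#import <CloudKit/CloudKit.h>

// Hypothetical helper, not part of CloudPhotos: uploads only the fields
// that were modified on `record` since it was fetched or created.
static void ExampleSaveChangedKeys(CKRecord *record, CKDatabase *database) {
    // changedKeys lists the fields CKRecord has tracked as modified.
    NSLog(@"Keys to upload: %@", record.changedKeys);

    CKModifyRecordsOperation *operation =
        [[CKModifyRecordsOperation alloc] initWithRecordsToSave:@[record]
                                              recordIDsToDelete:nil];
    // Send only the changed keys, without comparing record change tags.
    operation.savePolicy = CKRecordSaveChangedKeys;
    operation.modifyRecordsCompletionBlock =
        ^(NSArray<CKRecord *> *savedRecords,
          NSArray<CKRecordID *> *deletedRecordIDs,
          NSError *operationError) {
            if (operationError != nil) {
                NSLog(@"Save failed: %@", operationError);
            }
        };
    [database addOperation:operation];
}
```

The default policy is CKRecordSaveIfServerRecordUnchanged, so the changed-keys behaviour has to be opted into explicitly.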

APLCloudManager is actually a controller, and is initialised from the app delegate. It is sort of the MVC-N design (amazing video, well worth the watch), where all network code is inside one controller and not in the view controllers. I say sort of because you are supposed to only manipulate the model inside here, and then the model should notify the view controller that new data is available; in Apple’s case their model is in their view controller and is updated via the completionHandler. It’s funny because in the video he says some people choose “controller” (others disagree and think controller should only be used for view controllers) and some use “manager”, and right here is a manager example! So in this class we see all the methods for doing cloud network calls. At first I thought saveRecord was strange because it looks like a direct wrapper around CKDatabase’s saveRecord. However, it also has a dispatch back to the main queue within. This is cool because it means the UI view controller callbacks don’t need to worry at all about threading! Perhaps a good design tip to make use of. Maybe they should have at least named it savePhotoRecord though? That’s the only thing it’s used for. I suppose keeping it generic is OK, given the record itself is what contains the type. Actually one of the weirder things about CloudKit is that you can save records of all different types in one batch, whereas CRUD API consumers would usually be saving one record type to each endpoint, which would need multiple requests.
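The main-queue hop described above can be sketched like this. It is an illustration of the pattern rather than a copy of Apple’s code, and `ExampleSaveRecord` is a hypothetical name.

```objc
#import <CloudKit/CloudKit.h>

// Hypothetical wrapper in the spirit of the manager's saveRecord:
// the completion handler always runs on the main queue.
static void ExampleSaveRecord(CKDatabase *database,
                              CKRecord *record,
                              void (^completionHandler)(CKRecord *savedRecord,
                                                        NSError *error)) {
    [database saveRecord:record
       completionHandler:^(CKRecord *savedRecord, NSError *error) {
        // CloudKit calls back on a background queue; hop to the main
        // queue so callers can touch UIKit directly.
        dispatch_async(dispatch_get_main_queue(), ^{
            completionHandler(savedRecord, error);
        });
    }];
}
```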

Again on this manager class, all the methods have void returns, meaning unfortunately there is no cancellation capability. I think Apple should have included cancellation on at least one of them to show people how it is done. Basically I think the method could return the NSOperation (this wouldn’t work with the convenience methods, it would always need full operations) or a protocol (to hide that scary class from UI devs), which then has the cancel method. So for example if the app is refreshing a table and the user moves away, the request could be cancelled in viewWillDisappear.
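One way the cancellation idea could look, as a sketch under my own assumptions: the `ExampleCancellable` protocol and `ExampleFetchRecord` function are hypothetical names, and the Apple sample contains nothing like them.

```objc
#import <CloudKit/CloudKit.h>

// Hypothetical protocol that hides NSOperation from view controllers.
@protocol ExampleCancellable <NSObject>
- (void)cancel;
@end

// NSOperation already implements -cancel, so an empty category is enough
// to declare conformance.
@interface NSOperation (ExampleCancellable) <ExampleCancellable>
@end
@implementation NSOperation (ExampleCancellable)
@end

// Starts a fetch and hands back something the caller can cancel.
static id<ExampleCancellable> ExampleFetchRecord(
        CKDatabase *database,
        CKRecordID *recordID,
        void (^completionHandler)(CKRecord *record, NSError *error)) {
    CKFetchRecordsOperation *operation =
        [[CKFetchRecordsOperation alloc] initWithRecordIDs:@[recordID]];
    operation.fetchRecordsCompletionBlock =
        ^(NSDictionary<CKRecordID *, CKRecord *> *recordsByRecordID,
          NSError *operationError) {
            dispatch_async(dispatch_get_main_queue(), ^{
                completionHandler(recordsByRecordID[recordID], operationError);
            });
        };
    [database addOperation:operation];
    return operation;
}
```

A view controller would keep the returned object in a property and call `cancel` on it from viewWillDisappear:. As far as I know a cancelled operation still calls its completion block, with a CKErrorOperationCancelled error, so the handler should be prepared for that.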

The subscribe method in the manager is a particularly nasty one: there is no completion handler on it. It’s called from the main view controller’s viewDidLoad, so if it fails the user carries on none the wiser. Inside the method it does look for a not authenticated error, in which case it keeps retrying every 3 seconds. At this point I noticed the operation’s qualityOfService was not set to user interactive. This means it operates in the NSURLSession discretionary mode, where it doesn’t fail with network errors and just keeps retrying on its own. I just wonder, is this enough to mean that you don’t need any error handling at all? Is that why they didn’t even bother with a completion handler? Will need to look more into that. I think though that at least the user could have known not to leave the app until the subscription had been set up. I was thinking this kind of initialisation (checking for an account, creating zones, creating subscriptions) might be best done in an on-boarding UI screen. You know the kind that is like a welcome screen where there is some info, maybe a spinner while things are being set up, then a button? The Apple News app has this on first launch.

In the APLMainTableViewController I’m disappointed that upon a pull-to-refresh it calls loadPhotos, which does a full download of all the photos again, and then it does a full table reload. Ideally we would want to download the changes and then insert/update records. Then we realise they are using the public database, which doesn’t support the fetch changes feature, so now we know why they are doing a full download. They still could have done the table delta updates, however, but maybe that was too much code for this intro to CloudKit.

Finally I’d like to make a general comment on this old project and why perhaps it is no longer a good beginner sample. CloudKit is all about privacy, so sharing photos to a public database does seem to conflict with that just a bit. In iOS 10 they added CKShare and secure sharing capabilities between friends, it would be great to see a photo sharing sample that uses that instead. I think most people that are struggling with CloudKit are trying to do caching or synchronisation, this sample doesn’t touch on that at all. Maybe there are other Apple CloudKit samples I haven’t discovered yet, but on my wish list would definitely be a proper sharing one, and a sync one (with silent notifications).

CloudKit Sync Nightmare

So after the previous post on analysing the web requests, where we found out the classes involved and sort of when they should be used, we are still clueless as to how to structure our app to enable CloudKit syncing. We know we shouldn’t subclass the fetch operation, but then how should we gather our information to pass to the operation? Should we create our own operation that then starts the fetch operation within? Or use queues, or dependencies? In general should we use our own queues or always use the container or database’s built-in queue? How should we structure our retry logic? Does NSURLSession’s discretionary mode handle CloudKit retry or just network failure retry? Should we be using long-lived operations for sync? Should we be queuing or coalescing fetches between multiple pushes, and between anything background and anything user-initiated like pull-to-refresh? So many unanswered questions. What a nightmare. Apple has been absolutely useless in their code samples at the WWDC talk (e.g. at 16m 48s, not a useful example, and they didn’t even bother to demonstrate record sync). And what the hell do they mean by the requests sent last 24 hours / 7 days stuff? I think I need to class-dump the headers to figure out how Apple are doing this. So it turns out the app uses an embedded framework for the CloudKit code; here is the dump of NewsCore on my Github, let’s take a look.

First of all we notice 336 files. 336 files just for the networking and caching, not the UI. Unbelievable, but let’s carry on anyway.

A lot of the CloudKit related files are prefixed with FCCK, 27 of them! And there are many more like controllers etc. that use these objects. Maybe half the files have a reference to some aspect of CloudKit one way or another.

FCAppConfiguration looks like a possible entry point for this framework, it contains a FCCloudContext which looks like the heart of the CloudKit stuff, with many controllers, centres, managers and queues. Looks like it stores the controllers for syncing the reading list, FCReadingList which is a subclass of FCPersonalizationData : FCPrivateZoneController.

From FCModifyRecordsCommand we notice the coalesceWithCommand method so they are coalescing commands, this might eventually prevent duplicate network requests for the same thing, but doing it at the command level rather than the network level.

The operation subclass hierarchy is as follows FCCKFetchRecordChangesOperation : FCCKOperation : FCOperation : NSOperation

FCOperation isAsynchronous (no other reason for implementing that method) and childOperations suggests they are implementing operation grouping not with sub-queues but instead with dependencies between operations and tracking the group using an array, which would allow the group to be cancelled after being added to more global queue.

FCCKFetchRecordChangesOperation is like a wrapper around CKFetchRecordChangesOperation. Since that class isn’t an ivar anywhere, it suggests it’s declared locally, which means they didn’t need to deal with self retain cycle block bugs. It has properties for everything needed to perform a fetch changes operation including: database, zone, desiredKeys, allChangedRecordsByID and allDeletedRecordIDs. It even has _continueFetchingRecords and resetForRetry methods, which makes it look like the operation is being retried from within this operation. Is it added to the database queue and then to addChildOperations from FCOperation though?

FCCancelHandler is an interesting class that likely is for wrapping the operation object into something that is safe for a UI developer to work with. It doesn’t appear to be used for CloudKit operations though.

FCAsyncSerialQueue shows they are using serial queues; however, it isn’t obvious whether they are implemented via addDependency on the last operation or with maxConcurrentOperationCount = 1. It is used in the FCCKRecordZoneManager.
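The two ways of getting serial behaviour out of NSOperationQueue mentioned above can be sketched as follows. This is not taken from NewsCore, whose implementation is unknown; both function names are hypothetical.

```objc
#import <Foundation/Foundation.h>

// Option 1: let the queue enforce serial execution.
static NSOperationQueue *ExampleMakeSerialQueue(void) {
    NSOperationQueue *queue = [[NSOperationQueue alloc] init];
    queue.maxConcurrentOperationCount = 1;
    return queue;
}

// Option 2: leave the queue concurrent and chain each new operation to
// the last one currently queued. Assumes it is always called from the
// same thread; concurrent callers could both pick the same `last`.
static void ExampleAddSerially(NSOperationQueue *queue, NSOperation *operation) {
    NSOperation *last = queue.operations.lastObject;
    if (last != nil) {
        [operation addDependency:last];
    }
    [queue addOperation:operation];
}
```

Option 2 has the advantage that the same queue can still run unrelated operations concurrently, which might be why a dedicated class like FCAsyncSerialQueue exists, though that is a guess.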

FCPushNotificationCenter is the place where push notifications are managed.

They have added some convenience methods via FCCKDatabase+Additions.h which have completion handlers but no cancellation object returned.

As we saw in the filenames of the flat binary files used for caching modifications to be uploaded, they are using the word command in the class names, e.g. FCModifyReadingListCommand, and FCCommandQueue looks to be for managing the binary file with its loadFromDisk and serializeCommands methods.

FCCKTestDatabase makes me think they are taking advantage of dependency injection for testing purposes, where they can pass in a mocked cloud database for testing instead of doing real network requests.

To be honest I wasn’t expecting this complicated an app, this looks like at least a year of coding, maybe even more. Remember the app only syncs a few record types, reading list (bookmarks), history (read articles) and personalisation data like favourite feeds. Implementing syncing is going to be no easy task at all; no wonder we are all clueless after a few hours of WWDC talks! Hopefully some of my research has helped you though.

Sorry I didn’t link every class name, WordPress’s post editor stopped letting me set links for some reason.

Investigating CloudKit Sync in the Apple News App

Introduction

Syncing data across a user’s own devices in a privacy-sensitive way is one of the flagship features of CloudKit. Unfortunately many agree “it isn’t as documented as it might be”: there is no sync sample from Apple; the WWDC 2014 talk and WWDC 2015 talk didn’t quite get to the topic, although they did offer some vague hints; and in the WWDC 2016 talk the presenters gave lots of unfinished sentences, possibly due to lack of time, e.g. you have to do it this way “for a variety of reasons.” Looking at the CloudKit framework headers, none of the classes have sync in their name, or even mention the word “sync”. Slightly better is that more recently the class doc for CKFetchRecordChangesOperation has been updated to mention this is the one to use for synchronising; however, having information spread across multiple places is really hard work. Furthermore, fully understanding CloudKit behaviour requires knowledge of other things like NSOperation quality of service and NSURLSession discretionary mode. To add to the confusion, the open-source projects out there are all attempting to achieve sync in different ways, e.g. some are attempting to sync with the public database, which is missing the required features, or they use a periodic sync rather than the realtime, push-notification-driven sync CloudKit was designed for, and zones are a real stumbling block. These common mistakes are clearly a result of the lack of good samples and documentation, and in particular the lack of some absolutely vital information, for example about how CKFetchRecordZoneChangesOperation behaves. I happened to come across a Stack Overflow answer by a CloudKit engineer who shared that not all the changes are returned: they are coalesced to remove unnecessary ones, e.g. if a record was added and then deleted since the previous request (tracked by a token) it isn’t included. This was quite eye-opening because it shows the server is a lot smarter than expected, and knowing this, yes, it could be used for an efficient sync.
What also helped me is that, as we see later, they do hit an HTTP endpoint that is named sync, which helped confirm this must be the right path. I think Apple could have put the word sync in the class name, or at the very least put in the header: use this class for syncing!

In a situation like this where there is much ambiguity, it is useful to look at how Apple do things for some ground truth. They are using CloudKit sync in the News app, so let’s take an in-depth look at that, using two devices for testing: an iPod Touch and an iPhone 6s, both on 10.1.1.

News Article Download

When the app starts up it shows news articles, since all users can see these articles we would expect them to be in the public database. We will use a web proxy to analyse the requests, this won’t give us exact detail of how the framework classes are being configured, but it will give us an idea of the general algorithm. We will be using the 6s to monitor the requests.

As we can see, it performs a query to the container com.apple.news.public, which is the public database. The record contains an articleID, title, thumbnail, contentURL etc., so the full article is downloaded in separate requests that go directly to the news organisations’ own servers. The thumbnail is a URL to an image on the icloud-content server you can see in the screenshot. The query contains minOrder and maxOrder, which are integers, which suggests they are querying for new articles. This isn’t a sync; it’s usually called a delta download, i.e. it only downloads the new information. The articles have an order field like an auto-increment or sequence number, where newer articles have a higher number, which is better than using timestamps where two articles might have the same time. This is possible because Apple is the only one inserting records. This kind of delta download is a great bandwidth saver; however, it only allows new articles to be downloaded, which has a limitation you can see in the screenshots below.

 

As you can see, the first article had its title changed between 4h ago and 13h ago. The limitation of their design is that it doesn’t allow old articles to be updated, which doesn’t fit well with the news industry, where headlines can be changed frequently as a result of errors or updated information. This would require existing downloaded articles to be updated, perhaps even deleted. This isn’t possible with the public database because it doesn’t have the required features to support a full sync. The alternative design would be to clear the cache and re-download all the articles every time, which would ensure the user is seeing the latest list; however, that may have higher bandwidth requirements. Apple must have run the numbers for their number of records and the data involved and decided an append-only delta download was the way to go. There is another interesting usability aspect here: as you can see we are on the history list, and if a user is browsing the list to find a previously read article it would certainly be harder to find if the title had changed. Now it becomes a very interesting problem, because you have a trade-off between what is technically optimal and what is best for users.
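To make the append-only delta download concrete, here is a rough sketch of how such a query could be issued from the public API. The record type name `Article`, the use of a single `order > lastSeenOrder` bound, and the function name are my assumptions; the proxy only showed that the real query carries minOrder and maxOrder values.

```objc
#import <CloudKit/CloudKit.h>

// Hypothetical delta download: fetch only articles whose sequence number
// is higher than the highest one already cached.
static void ExampleFetchNewArticles(
        CKDatabase *publicDatabase,
        long long lastSeenOrder,
        void (^completionHandler)(NSArray<CKRecord *> *articles,
                                  NSError *error)) {
    // "order" is the sequence field seen on the article records.
    NSPredicate *predicate =
        [NSPredicate predicateWithFormat:@"order > %@", @(lastSeenOrder)];
    CKQuery *query = [[CKQuery alloc] initWithRecordType:@"Article"
                                               predicate:predicate];
    query.sortDescriptors =
        @[[NSSortDescriptor sortDescriptorWithKey:@"order" ascending:YES]];

    NSMutableArray<CKRecord *> *articles = [NSMutableArray array];
    CKQueryOperation *operation =
        [[CKQueryOperation alloc] initWithQuery:query];
    operation.recordFetchedBlock = ^(CKRecord *record) {
        [articles addObject:record];
    };
    operation.queryCompletionBlock =
        ^(CKQueryCursor *cursor, NSError *operationError) {
            // A non-nil cursor means more results are available; a fuller
            // version would continue with -initWithCursor:.
            completionHandler(articles, operationError);
        };
    [publicDatabase addOperation:operation];
}
```

Because the predicate only ever asks for higher sequence numbers, an edited headline on an already-downloaded article is never seen again, which is exactly the limitation discussed above.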

Bookmark & Reading List Sync

Now that we have covered how articles are downloaded, let’s look at the feature we are really interested in: how it performs the sync between devices. The News app has two features that are synced, bookmarks and history. The history view has already been shown in the above screenshots. An item is added to the history after an article is viewed and the user has scrolled down a bit, or maybe spends some time in the article, or perhaps a combination of both. Articles can be bookmarked when viewing them, by tapping the bookmark icon on the bottom right. Now the really interesting part: if you have two devices side-by-side, these two sets of data are updated almost instantly (~5 seconds) when changes happen. For example, bookmark an article on one device and it appears in the bookmark list on the other device; un-bookmark the article and it disappears from the other device’s list. So there we have the feature we are looking for, a full sync between devices, so let’s see how it is implemented. We will tackle the push part later; we’ll focus on the News app’s requests just now.

So we have the 6s connected to the proxy and open at the saved articles page. On the iPod touch we open an article and bookmark it. 5 seconds later this happens:

As we can see, the first request hits an endpoint called “zone/sync” on a container called com.apple.news.private. So now we know they are using both public and private databases for this app, which is interesting to me because obviously articles need to be referenced, and we were told many times that CKReference doesn’t work across zones or databases. All they’ve done here is simply make the articleID a string field rather than a reference; I suppose they aren’t bothered about referential integrity. Next, fortunately for us, a familiar-looking class name is included in the request. It looks like a CKFetchDatabaseChangesOperation (the one in the log has the prefix CKD, which is because the actual request goes through the CloudKit daemon, so it’s like an RPC or remote class). On the response tab (not shown in the screenshot) we see ReadingList and ReadingHistory, which are the names of the two zones that have changed. Let’s take a look at what happens next:

These next two requests are to the endpoint “record/sync”, again on the private container, and again we can see the class looks like the familiar CKFetchRecordZoneChangesOperation (note: pre iOS 10 the class was CKFetchRecordChangesOperation). The first sync request contains the zone name ReadingList and, as expected, the second contains ReadingHistory. From more testing we see that record/sync is only requested for a zone if its name is contained in the zone/sync response.
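The two-stage flow seen in the proxy (zone/sync first, then record/sync only for the zones that changed) maps onto the iOS 10 public API roughly as below. This is a sketch of the technique, not the News app’s code: the function names and the way tokens are passed in are hypothetical, and token persistence is left as comments. Note that CKFetchRecordZoneChangesOptions is the iOS 10 era API and has since been deprecated in favour of per-zone configurations.

```objc
#import <CloudKit/CloudKit.h>

// Stage 2: fetch record changes for just the zones that changed.
static void ExampleFetchZoneChanges(
        CKDatabase *database,
        NSArray<CKRecordZoneID *> *zoneIDs,
        NSDictionary<CKRecordZoneID *, CKServerChangeToken *> *zoneTokens) {
    if (zoneIDs.count == 0) {
        return; // nothing changed, so no record/sync request is needed
    }
    NSMutableDictionary<CKRecordZoneID *, CKFetchRecordZoneChangesOptions *>
        *options = [NSMutableDictionary dictionary];
    for (CKRecordZoneID *zoneID in zoneIDs) {
        CKFetchRecordZoneChangesOptions *zoneOptions =
            [[CKFetchRecordZoneChangesOptions alloc] init];
        // nil token means "give me everything in this zone".
        zoneOptions.previousServerChangeToken = zoneTokens[zoneID];
        options[zoneID] = zoneOptions;
    }
    CKFetchRecordZoneChangesOperation *operation =
        [[CKFetchRecordZoneChangesOperation alloc]
            initWithRecordZoneIDs:zoneIDs
            optionsByRecordZoneID:options];
    operation.fetchAllChanges = YES;
    operation.recordChangedBlock = ^(CKRecord *record) {
        NSLog(@"Changed: %@", record.recordID);   // update the local cache
    };
    operation.recordWithIDWasDeletedBlock =
        ^(CKRecordID *recordID, NSString *recordType) {
            NSLog(@"Deleted: %@", recordID);      // remove from the cache
        };
    operation.recordZoneFetchCompletionBlock =
        ^(CKRecordZoneID *recordZoneID,
          CKServerChangeToken *serverChangeToken,
          NSData *clientChangeTokenData,
          BOOL moreComing,
          NSError *recordZoneError) {
            if (recordZoneError == nil) {
                // Persist serverChangeToken for recordZoneID here.
            }
        };
    [database addOperation:operation];
}

// Stage 1: ask which zones changed, then fetch only those zones.
static void ExampleSync(
        CKDatabase *database,
        CKServerChangeToken *databaseToken,
        NSDictionary<CKRecordZoneID *, CKServerChangeToken *> *zoneTokens) {
    NSMutableArray<CKRecordZoneID *> *changedZoneIDs = [NSMutableArray array];
    CKFetchDatabaseChangesOperation *operation =
        [[CKFetchDatabaseChangesOperation alloc]
            initWithPreviousServerChangeToken:databaseToken];
    operation.fetchAllChanges = YES;
    operation.recordZoneWithIDChangedBlock = ^(CKRecordZoneID *zoneID) {
        [changedZoneIDs addObject:zoneID];
    };
    operation.fetchDatabaseChangesCompletionBlock =
        ^(CKServerChangeToken *serverChangeToken,
          BOOL moreComing,
          NSError *operationError) {
            if (operationError != nil) {
                NSLog(@"Database fetch failed: %@", operationError);
                return;
            }
            // Persist serverChangeToken for the database here.
            ExampleFetchZoneChanges(database, changedZoneIDs, zoneTokens);
        };
    [database addOperation:operation];
}
```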

Background Sync

Another feature is these lists are already up-to-date when the app is re-opened. If the app is killed and restarted then it already has the previous data which sparks my interest to see how they are achieving caching, but for now we will focus on what technique they are using for updating e.g. update on coming to foreground, background fetch or push notification. So to find out we can test this with the 6s connected to the proxy just on the homescreen, and then on the iPod Touch using news to read and bookmark an article. We’ll use the proxy and also the new Sierra Console so we can gain an insight into what the 6s is doing.

A push notification! And it has the content-available flag, which shows they are using silent push, implemented using a CKNotificationInfo with shouldSendContentAvailable set. We also see the private container and the zid, which is the short version of zone name. Apple shorten the JSON key names in pushes because packet size is limited. Because the push contains the zoneID, this would suggest they are using CKRecordZoneSubscription. Now let’s see what the proxy logged:
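For reference, a zone subscription that produces this kind of silent push could be set up along these lines. The function names and the subscription ID scheme are hypothetical; how the News app actually registers its subscriptions is not visible from the proxy.

```objc
#import <CloudKit/CloudKit.h>

// Hypothetical helper: subscribe to all changes in one private zone and
// ask for silent (content-available) pushes.
static void ExampleSubscribeToZone(CKDatabase *privateDatabase,
                                   NSString *zoneName) {
    CKRecordZoneID *zoneID =
        [[CKRecordZoneID alloc] initWithZoneName:zoneName
                                       ownerName:CKCurrentUserDefaultName];
    // Assumption: one subscription per zone, named after the zone.
    NSString *subscriptionID = [zoneName stringByAppendingString:@"-changes"];
    CKRecordZoneSubscription *subscription =
        [[CKRecordZoneSubscription alloc] initWithZoneID:zoneID
                                          subscriptionID:subscriptionID];

    CKNotificationInfo *info = [[CKNotificationInfo alloc] init];
    info.shouldSendContentAvailable = YES; // silent push, no alert or badge
    subscription.notificationInfo = info;

    CKModifySubscriptionsOperation *operation =
        [[CKModifySubscriptionsOperation alloc]
            initWithSubscriptionsToSave:@[subscription]
                subscriptionIDsToDelete:nil];
    operation.modifySubscriptionsCompletionBlock =
        ^(NSArray<CKSubscription *> *savedSubscriptions,
          NSArray<NSString *> *deletedSubscriptionIDs,
          NSError *operationError) {
            if (operationError != nil) {
                NSLog(@"Subscribe failed: %@", operationError);
            }
        };
    [privateDatabase addOperation:operation];
}

// Call from application:didReceiveRemoteNotification:fetchCompletionHandler:
// Returns the zone that changed, or nil if this is not a zone notification.
static CKRecordZoneID *ExampleChangedZoneID(NSDictionary *userInfo) {
    CKNotification *notification =
        [CKNotification notificationFromRemoteNotificationDictionary:userInfo];
    if (notification.notificationType != CKNotificationTypeRecordZone) {
        return nil;
    }
    return ((CKRecordZoneNotification *)notification).recordZoneID;
}
```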

Looks like the exact same requests as when the app is in the foreground. Finding out what zones changed and then what records within them changed. In fact the same push is used when the app is in the foreground, so now we know how the info is kept up-to-date. I don’t know about you but what crosses my mind is if multiple pushes are received do they coalesce the fetch changes requests? I might investigate that later on.

Pull-to-Refresh

Next, I noticed there is also a pull-to-refresh enabled on the bookmark and history table views. This was perhaps implemented as a fall-back in case the push notification doesn’t arrive for some reason, so it allows the user to force a refresh of new data. Or maybe they are using the feature to clear the cache and re-download all the bookmarks to clear up any inconsistencies? Let’s do a pull-to-refresh on the 6s and see what the proxy shows:

We see only a CKFetchRecordZoneChanges this time, no record download. Scrolling down the request shows it is for the ReadingList zone name. Similarly if we pull-to-refresh on the History tab we see the same request but for the ReadingHistory zone name. This proves they are only using this feature to do a sync, so as a replacement for a missing push notification, rather than a full clear-cache and re-download everything.

Caching

Now it wouldn’t be a proper sync without caching, which allows the app to be killed and restarted and still show the previous info. So let’s see how they achieved that. To do that we will use a jailbroken iPod Touch on iOS 9.3.3, so there is a chance that the caching has changed on iOS 10, but hopefully this is still interesting. We will connect over SSH and browse the file system to find where the News app stores its data.

The data is in /private/var/mobile/Containers/Data/Application/ContainerID, which was found by a process of elimination: first the mobile user’s Library folder was browsed, and when nothing was found there I looked to the containers. I think storing platform (or built-in) app data in the container folder is a relatively new concept. Since container folders have a UUID, it can be tricky to find the right one; how I achieve it is to modify something in the app, like saving an article to the reading list, and then sort all the containers by date. So in the screenshot above we have found the folder and we can see the private data in a folder, where I have highlighted the reading-list file. There is also a reading-list-commands file, and we also see a CloudKit folder with an sqlite database named Records.db. Let’s look at these files in editors, beginning with “reading-list”.

Looks like a binary file made with NSKeyedArchiver, and it contains the list of articleIDs and date added. The file named “reading-list-commands” just looks like an array of article IDs stored in binary; my guess would be that user actions that need to be sent to the server are cached in here. At first glance flat files look really, really bad: it means all the data is stored in memory and written out whenever it changes. It’s possible they chose flat files rather than a database because fetch changes might fail with CKErrorChangeTokenExpired, for which the documentation says to toss the cache and start with a fresh download using a nil token, so in that case it does make sense to just delete a file rather than empty and rebuild a database. It would be good to know how common this scenario is though; Apple definitely need to provide more information on what at the moment is very black-box like. To aid my development, I have asked on Stack Overflow if there is a way to simulate CKErrorChangeTokenExpired. The best alternative to flat files for the model is Core Data, with automatic UI updates and table sorting, so there would need to be a very good reason for us to give all that up and use flat files. Or maybe they just didn’t have time?
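The “delete the file and start again” reaction to an expired change token could be as small as the sketch below. The function name and the idea of keeping the token in its own file are my assumptions, not something observed in the News app.

```objc
#import <CloudKit/CloudKit.h>

// Hypothetical helper: if the server rejected our change token, throw
// away the flat-file cache and the stored token. Returns YES when the
// caller should fetch again with a nil token.
static BOOL ExampleResetCacheIfTokenExpired(NSError *error,
                                            NSURL *cacheFileURL,
                                            NSURL *tokenFileURL) {
    BOOL expired = [error.domain isEqualToString:CKErrorDomain] &&
                   error.code == CKErrorChangeTokenExpired;
    if (!expired) {
        return NO;
    }
    NSFileManager *fileManager = [NSFileManager defaultManager];
    // Failures are ignored here; a missing file is already "reset".
    [fileManager removeItemAtURL:cacheFileURL error:NULL];
    [fileManager removeItemAtURL:tokenFileURL error:NULL];
    return YES;
}
```

The error may also arrive nested inside a partial failure error for a batch operation; that case is not handled in this sketch.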

So now we know they are using flat files for all the private syncing, which is very interesting given other developers have attempted to sync to Core Data. Let’s see what secrets are hiding in the Records.db by opening it in a SQLite editor.

No Core Data here either! The way to tell would be lots of capitalised “Z”-prefixed tables and fields, and there are none, so here they are using sqlite directly. We also notice the recordID is being encoded with a colon separator, e.g. recordName:zone:owner, which is interesting because I’ve seen other developers attempt to encode all the properties of the recordID in different ways, some even storing different zones in different databases. The owner might be the creatorUserRecordID.recordName (or maybe the modifier), because when looking at records your own account creates it is usually __defaultOwner__ rather than your own user UUID. This table even contains the containerIdentifier, which you would think would be redundant information, since the app knows what containers it contacts. This reminded me that the last time I looked at the CloudKit headers I did see Sqlite mentioned, so let’s open that now.

In the file list on the left we see a CKSQLite class which is an Obj-C wrapper around the sqlite library. Opening CKRecordID we see it has methods sqliteRepresentation and initWithSqliteRepresentation likely for the colon separator parsing. Finally, by searching for what class is using CKSQLite we find a large class CKPackage (pictured), which looks like it is responsible for the database we saw and caching all the records. How it is actually used isn’t clear, like if it is loosely-coupled to the operations, in that they decide which records get cached, or if it is tightly-coupled in that the operations automatically are caching records. That would require more investigation but it could suggest caching features are coming to CloudKit APIs of the future. But at least we know now they are using flat files for sync of uploads and downloads of private records, and sqlite for caching public records. This is sad in a way that we didn’t see any core data database perhaps with a sync status flag we could have looked to for inspiration, but also reminds us not to over think a problem and using tried-and-tested techniques of saving files is still fine.
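The colon-separated record ID encoding seen in Records.db is easy to reproduce with public API. The sketch below is my own guess at the format (recordName:zone:owner) and is not the private sqliteRepresentation implementation; both function names are hypothetical.

```objc
#import <CloudKit/CloudKit.h>

// Encodes a record ID as "recordName:zoneName:ownerName".
// Assumption: none of the three parts contains a colon.
static NSString *ExampleStringFromRecordID(CKRecordID *recordID) {
    return [NSString stringWithFormat:@"%@:%@:%@",
            recordID.recordName,
            recordID.zoneID.zoneName,
            recordID.zoneID.ownerName];
}

// Decodes the string form back into a record ID, or returns nil if the
// string does not have exactly three colon-separated parts.
static CKRecordID *ExampleRecordIDFromString(NSString *string) {
    NSArray<NSString *> *parts = [string componentsSeparatedByString:@":"];
    if (parts.count != 3) {
        return nil;
    }
    CKRecordZoneID *zoneID =
        [[CKRecordZoneID alloc] initWithZoneName:parts[1]
                                       ownerName:parts[2]];
    return [[CKRecordID alloc] initWithRecordName:parts[0] zoneID:zoneID];
}
```

A single string column like this keeps the record ID usable as a primary key, at the cost of having to guarantee that record names never contain the separator.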

Conclusion

There we have it. It’s been a long journey but we have learned a lot. Now we know the two classes we should be focussing on to perform a sync are CKFetchDatabaseChangesOperation and CKFetchRecordZoneChangesOperation. We also know that for realtime sync we should be using silent push notifications via CKRecordZoneSubscription and CKNotificationInfo with shouldSendContentAvailable, and that upon receiving a push we should first sync database changes and then only the record zone changes for the zones included. We also know we should implement a manual refresh just in case push notifications don’t arrive. We learned about using flat files for caching private database sync, and how CKRecords could be encoded in sqlite database fields. Did we learn anything else? Probably! In the future, we might see if they coalesce sync requests originating from multiple pushes. Another question that came to mind is whether they re-download records to the same device that created them, which is a common dilemma in sync solutions. I hope this post has helped and now you are starting off on the right foot for building the perfect multi-device sync app!

Footnote

There is one final thing I’d like to mention concerning a new feature of the iOS 10 API. They added the ability for the sync classes to repeat themselves to get all data, via the fetchAllChanges properties. This is great news, since one of the big developer complaints about the CloudKit API was that it was very complicated to make the required repeat requests. The strange thing is they only added it to the classes involved in syncing, not to CKQueryOperation for example. On the one hand this shows Apple’s focus with CloudKit is towards improving syncing, which is great, but it is also bad in that they got the API wrong the first time, and have subsequently had to rename classes (CKFetchRecordChangesOperation -> CKFetchRecordZoneChangesOperation) and add essential properties like fetchAllChanges. It also seems rushed compared to normal. For example, a block on CKFetchRecordZoneChangesOperation is named recordZoneChangeTokensUpdatedBlock; that pluralisation just seems strange to me, i.e. a zone only has one token, and it stands out as inconsistent with the other names used. This is the kind of thing that should get cleaned up as they iterate over the API design; maybe Scott Forstall’s perfectionist strategy of iterating API designs ten times before release is no longer being implemented.

YapDatabase Disappointing CloudKit Example

It’s not often you see code as disappointing as this, particularly because CloudKit is so well designed for the situation being described, but it’s actually very common. I think it happens when you have a developer with speciality in one area, trying to apply it to their understanding of how to implement something in a new area, and just getting the job done as quickly as possible.

I was searching Github for examples of how people are handling CKErrorChangeTokenExpired, and came across this CloudKitManager.m by YapDatabase. In scrolling through I noticed this function that looks to be chaining some CloudKit setup operations together:

- (void)continueCloudKitFlow
{
    DDLogInfo(@"%@ - %@", THIS_FILE, THIS_METHOD);
    
    if (self.needsCreateZone)
    {
        [self createZone];
    }
    else if (self.needsCreateZoneSubscription)
    {
        [self createZoneSubscription];
    }
    else if (self.needsFetchRecordChangesAfterAppLaunch)
    {
        [self fetchRecordChangesAfterAppLaunch];
    }

First thought is, since these methods are usually all async and dependent on one another how can they all be called in line like this? Taking a look at one of these methods gives an ugly surprise! (And I don’t mean them using tabs instead of spaces but yeh that’s ugly too! Had to even fix them just to paste into WordPress.)

- (void)createZone
{
    dispatch_async(setupQueue, ^{ @autoreleasepool {
        
        // Suspend the queue.
        // We will resume it upon completion of the operation.
        // This ensures that there is only one outstanding operation at a time.
        dispatch_suspend(setupQueue);
        
        [self _createZone];
    }});
}

So all of the methods are called one after another instantly, but to make the actual methods (prefixed with underscore) run serially, the developer runs them as blocks on a GCD dispatch queue, and when each block starts it suspends the queue to prevent the other ones being called. Unbelievable!

The question you are all asking is: why on earth didn’t the developer simply make use of CloudKit’s beautiful use of NSOperations and just set up dependencies via an NSOperationQueue to run them serially automatically? I would guess the answer is that this particular developer is an expert in GCD (based on the information in the Readme of the project CocoaAsyncSocket), so, put simply, they used what they knew to solve the problem.

It’s one of those situations where they got the job done, so that’s great, but I think if everyone takes a moment to learn the NSOperation API properly and see how amazingly well CloudKit builds upon it, it would bring much more enjoyment to developers.

In case you were wondering how it should be done: CloudKit operations don’t have to be added to a database or container via addOperation to be processed. A database property can be set on each operation, which can then be added to a custom operation queue. This lets you group your operations together and, with dependencies, order them serially. It is easy to miss, but in the CKDatabase header it says “or schedule operations on your own queue”; that’s the secret.
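Here is a minimal sketch of that approach for the same three setup steps (create zone, create subscription, fetch changes). It is an illustration and not a drop-in replacement for the YapDatabase code; `ExampleRunSetup` is a hypothetical name, and cancelling from the completion blocks assumes those blocks run before the operation is marked finished.

```objc
#import <CloudKit/CloudKit.h>

// Hypothetical setup flow: the three operations run in order on our own
// queue because of the dependencies, not because of any queue suspension.
static NSOperationQueue *ExampleRunSetup(CKDatabase *database,
                                         CKRecordZone *zone,
                                         CKRecordZoneSubscription *subscription) {
    NSOperationQueue *queue = [[NSOperationQueue alloc] init];

    CKModifyRecordZonesOperation *createZone =
        [[CKModifyRecordZonesOperation alloc]
            initWithRecordZonesToSave:@[zone]
                recordZoneIDsToDelete:nil];
    createZone.database = database; // required when using our own queue

    CKModifySubscriptionsOperation *subscribe =
        [[CKModifySubscriptionsOperation alloc]
            initWithSubscriptionsToSave:@[subscription]
                subscriptionIDsToDelete:nil];
    subscribe.database = database;
    [subscribe addDependency:createZone];

    CKFetchDatabaseChangesOperation *fetch =
        [[CKFetchDatabaseChangesOperation alloc]
            initWithPreviousServerChangeToken:nil];
    fetch.database = database;
    [fetch addDependency:subscribe];

    // Dependencies only control ordering; a failed operation still counts
    // as finished, so cancel the rest of the group explicitly on error.
    createZone.modifyRecordZonesCompletionBlock =
        ^(NSArray<CKRecordZone *> *savedZones,
          NSArray<CKRecordZoneID *> *deletedZoneIDs,
          NSError *operationError) {
            if (operationError != nil) {
                [queue cancelAllOperations];
            }
        };
    subscribe.modifySubscriptionsCompletionBlock =
        ^(NSArray<CKSubscription *> *savedSubscriptions,
          NSArray<NSString *> *deletedSubscriptionIDs,
          NSError *operationError) {
            if (operationError != nil) {
                [queue cancelAllOperations];
            }
        };

    [queue addOperations:@[createZone, subscribe, fetch]
       waitUntilFinished:NO];
    return queue; // the caller can cancel the whole group via this queue
}
```

Compared with suspending and resuming a GCD queue, the ordering here is declared up front and the whole group can be cancelled in one call.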

CKRecordZoneSubscription Notes

If you attempt to set desiredKeys on a notificationInfo for the new iOS 10 CKRecordZoneSubscription class, saving results in the runtime error:

<CKError 0x17424b610: "Invalid Arguments" (12/2006); server message = "cannot add additionalFields to this subscription type"; uuid = 4E08C616-97AA-4E9A-B584-B7972B1CD99B; container ID = "iCloud.com.x">

Which makes sense given the notification is about a zone changing and not a record. One might assume it could still send the record that was responsible for the change, but looking at CKRecordZoneNotification shows no record properties are available. It’s always kind of disappointing to see subclassing not work out, i.e. where the subclass is denied features of the parent. That can sometimes point to bad class design; in this case Apple could at least have handled it better, either with client-side validation or a note in the header. It’s extra disappointing given that in iOS 10 they refactored from flat CKSubscription inits for the different subscription types (which had their own limitations, e.g. initWithZoneID only allowed zero for options) to subclasses. I hope they’ve thought this new design through, because if it is refactored again it could be a real headache for framework backwards compatibility.
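For reference, this is roughly how I would set up a zone subscription that avoids the error: leave desiredKeys alone and just ask for a silent push, then fetch the zone's changes when the notification arrives. It is only a sketch; the subscription ID and function names are invented.

```objc
#import <CloudKit/CloudKit.h>

// Hypothetical helper: the subscription ID is a placeholder.
static CKRecordZoneSubscription *MakeZoneSubscription(CKRecordZoneID *zoneID)
{
    CKRecordZoneSubscription *subscription =
        [[CKRecordZoneSubscription alloc] initWithZoneID:zoneID
                                          subscriptionID:@"photos-zone-changes"];

    CKNotificationInfo *info = [[CKNotificationInfo alloc] init];
    info.shouldSendContentAvailable = YES; // silent push, no alert
    // Deliberately NOT setting info.desiredKeys: the server rejects it
    // for this subscription type (see the error above).
    subscription.notificationInfo = info;

    return subscription;
}

// Hypothetical helper, called from the app delegate's remote notification callback.
// Returns the zone that changed, or nil if this was not a zone notification.
static CKRecordZoneID *ChangedZoneIDFromPush(NSDictionary *userInfo)
{
    CKNotification *notification =
        [CKNotification notificationFromRemoteNotificationDictionary:userInfo];
    if (notification.notificationType != CKNotificationTypeRecordZone) {
        return nil;
    }
    // All we get is the zone; the records themselves have to be fetched separately.
    return ((CKRecordZoneNotification *)notification).recordZoneID;
}
```

The subscription would then be saved with a CKModifySubscriptionsOperation (or the saveSubscription convenience method on the database).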

 

CloudKit Syncing References

Here are some references from around the web on how to implement CloudKit syncing. It will be added to…

9th August 2017

In the Build Better Apps with CloudKit Dashboard WWDC 2017 video, at 4:30 Dave Browning presents a ToDo list app that syncs Core Data with CloudKit (at 6:15). I wrote to Dave in June and he told me that they do plan to release the source for that app; they just need to clean it up a bit and review it internally first. He said he’d let me know once it has been published, but it has been a few months now. I’ll update this post if it gets released.

5th January 2017

As I Learn CloudKit Syncing by Eric Allam

Probably my favourite tutorial so far. Part 1 follows the example given in the Advanced CloudKit talk from 2014; unfortunately the talk and this tutorial don’t get as far as using push notifications via a CKRecordZoneSubscription and resort to fetch only. It correctly uses CKFetchRecordChangesOperation for this, however it contains a mistake in part 4. It says that only changed properties are included in the CKRecord given by the recordChangedBlock; in fact the full record is included by default, and the newer version of this class, CKFetchRecordZoneChangesOperation, now also supports desiredKeys. It correctly identifies using NSOperation dependencies as the right approach for queuing related operations, however in search of queue error handling Eric looks to the Advanced NSOperations talk and its corresponding open source project, which unfortunately is of a different design to CloudKit’s NSOperations, and he hits some problems.
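To illustrate the desiredKeys point, here is a minimal sketch of a fetch using the iOS 10 names for CKFetchRecordZoneChangesOperation (I believe later SDKs rename the options class, so check your headers). The "title" key and the function name are placeholders, and where you persist the change token is up to you.

```objc
#import <CloudKit/CloudKit.h>

// Hypothetical helper: pass nil for previousToken on the very first sync.
static CKFetchRecordZoneChangesOperation *MakeFetchChangesOperation(CKRecordZoneID *zoneID,
                                                                    CKServerChangeToken *previousToken)
{
    CKFetchRecordZoneChangesOptions *options = [[CKFetchRecordZoneChangesOptions alloc] init];
    options.previousServerChangeToken = previousToken;
    // Without this line each changed record arrives in full;
    // with it, only the listed keys are downloaded.
    options.desiredKeys = @[@"title"];

    CKFetchRecordZoneChangesOperation *operation =
        [[CKFetchRecordZoneChangesOperation alloc] initWithRecordZoneIDs:@[zoneID]
                                                   optionsByRecordZoneID:@{zoneID: options}];

    operation.recordChangedBlock = ^(CKRecord *record) {
        NSLog(@"Changed: %@ keys: %@", record.recordID.recordName, record.allKeys);
    };
    operation.recordWithIDWasDeletedBlock = ^(CKRecordID *recordID, NSString *recordType) {
        NSLog(@"Deleted: %@ (%@)", recordID.recordName, recordType);
    };
    operation.recordZoneFetchCompletionBlock = ^(CKRecordZoneID *recordZoneID,
                                                 CKServerChangeToken *serverChangeToken,
                                                 NSData *clientChangeTokenData,
                                                 BOOL moreComing,
                                                 NSError *recordZoneError) {
        if (recordZoneError) {
            NSLog(@"Fetch failed for %@: %@", recordZoneID.zoneName, recordZoneError);
            return;
        }
        // Persist serverChangeToken here so the next fetch only gets newer changes.
    };

    return operation;
}
```

Like any other CKDatabaseOperation, the returned operation still needs its database set (or to be added to a database) before it will do anything.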

5th January 2017

Seam – Seamless CloudKit Sync with CoreData

This project couples tightly to CoreData, which seems to go against Apple’s design of keeping frameworks loosely coupled. For example, it wraps conflict handling, which might be something the developer requires full control over. It’s always the same with wrapping: you end up hiding away something needed and then having to write more code to expose it again. Opting certain entities out of syncing requires extra effort, and it appears to support only an all-or-nothing sync, rather than, say, just up or just down. The project was coded in Swift so unfortunately fell victim to the constant syntax changes of the language, and has been left broken on Swift 3.0. Furthermore, NSOperation is designed around returning errors rather than the thrown errors that Swift uses. I’m not sure how well the developer handles this, but it does seem a potential area of weakness.

5th January 2017

CloudKit + Core Data + NSOperations – Syncing by Nick Harris

This is the top tutorial on Google for CloudKit syncing, but sadly it has mistakes and leaves many questions unanswered. Uniquely, Nick uses a record zone per record type, which obviously isn’t the right approach for this kind of record. He realises this later, and the reason is that CKRecordZoneSubscription can be scoped to a recordType. Another mistake is subclassing CKFetchRecordsOperation to perform processing; strangely, however, he does then correctly use NSOperation dependencies for the sync method. One interesting approach is to use a separate entity to store deletes, so that the real entity can be deleted as normal rather than just soft deleted; this has the advantage that predicates don’t need the additional where status != deleted. There doesn’t appear to be any error handling if any one of the operations fails; furthermore, the synchronous processing NSOperation just fatal errors if anything goes wrong with the Core Data method calls. I believe if he had access to an asynchronous NSOperationQueue class that supports errors and cancellation it would have helped a lot. It is written in Swift, so again it falls victim to syntax changes.
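The separate-entity-for-deletes idea could look something like the sketch below. This is my own illustration rather than Nick's code: the "PendingDelete" entity and its recordName and recordType attributes are hypothetical, and I am assuming the object being deleted stores its CloudKit record name in a recordName attribute.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: records a tombstone, then deletes the real object as normal.
static void DeleteObjectLeavingTombstone(NSManagedObject *object,
                                         NSManagedObjectContext *context)
{
    // Remember what has to be deleted on the server at the next sync.
    NSManagedObject *tombstone =
        [NSEntityDescription insertNewObjectForEntityForName:@"PendingDelete"
                                      inManagedObjectContext:context];
    [tombstone setValue:[object valueForKey:@"recordName"] forKey:@"recordName"];
    [tombstone setValue:object.entity.name forKey:@"recordType"];

    // A real delete, so fetch requests on the main entity need no
    // extra "status != deleted" predicate.
    [context deleteObject:object];
}
```

At sync time you would fetch the PendingDelete objects, turn their record names into CKRecordIDs for the recordIDsToDelete of a CKModifyRecordsOperation, and remove each tombstone once the server confirms the delete.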

5th January 2017

Apple CloudKit Engineer on Stack Overflow

A developer named farktronix gives great insight into the inner workings of CKFetchRecordZoneChangesOperation on the server. From the name of the class you might think that this returns every change and delete, which at first glance would appear to be far too much data to be useful for an initial sync, so you might think to use a query to get the first set instead. However, the developer states that the server coalesces changes before sending them (which gives rise to the bug being discussed). This means that if a record was created and then deleted, the server is smart enough not to send either of those events in the fetch changes operation. This is a really great piece of insider info, and hopefully this user shares more in the future.