How We Modularized Medium’s iOS codebase
Without interrupting workflow
After we launched the Medium iOS app, we wanted to make it easy for any engineer in the company to quickly experiment with, and contribute to, the codebase. Having a more modular codebase enables us to experiment more: for example, we could spin up a prototype that has the same core components for authentication and downloading posts, but explores different navigation or post displays.
We wanted a solution that would allow us to split up our codebase into modules without disrupting our workflow, which relies heavily on git feature branches and GitHub pull requests for code reviews. We looked at several different options and decided on using git subtrees. Here’s how we came to that decision.
#### CocoaPods
CocoaPods is the best known of the options we considered. It’s very easy to use, both as a library consumer and as a library author, and is similar to other dependency management systems like RubyGems.
Our first attempt to modularize Medium’s iOS app was to split the codebase into a library (or “pod”) and use CocoaPods to manage the dependencies. Briefly, here’s what we did:
- Split the main project into separate git repositories, one for the client code and one for the library.
- Integrate CocoaPods in the library by restructuring the workspace and the project files and creating its podspec.
- Remove the “vendored” libraries in the library workspace and use their CocoaPods versions instead.
- Restructure the client app project workspace to use the pod version of the library.
There was a lot of upfront work in splitting up the codebase and preparing the podspec. Setting up the compilation steps was also a bit complicated, especially when you are new (as I am) to the many knobs and buttons of Xcode’s build configurations. But after that everything seemed to work smoothly.
However, we ran into trouble because we were using CocoaPods on a rapidly changing codebase rather than a seldom-changing library, which is perhaps not its intended usage. Xcode seems to heavily cache the state of config and file indexes, which caused a lot of confusion: files were not found at compile time, and autocomplete did not work properly. Adding new files to the library required a lot of ‘pod update’ commands. We decided that this was too disruptive to our workflow, and looked at other options.
#### Git submodules
Using git submodules requires learning a different set of commands and incorporating them into your workflow and scripts. For example, whenever you check out the repository for the first time, you have to run ‘git submodule update --init’ to fetch all the dependencies.
Also, the programmer’s workflow needs to change so that every time you make changes to submodules, separate commits get created for all of them. Then, you need to update the references in the parent repository to point to the new commits.
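To make the two-commit dance concrete, here is a minimal sketch that runs entirely against throwaway local repositories. The repository names, file names, and commit messages are hypothetical stand-ins, and the `protocol.file.allow` override is only there because the "remote" is a local path rather than a hosted repository.

```shell
#!/bin/sh
# Sketch of the submodule workflow: one commit in the submodule,
# then a second commit in the parent to move its pinned reference.
set -eu

# Fixed identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)

# 1. A stand-in library repository (hypothetical; replaces a real remote).
git init -q "$work/lib"
git -C "$work/lib" symbolic-ref HEAD refs/heads/master
echo "v1" > "$work/lib/lib.txt"
git -C "$work/lib" add lib.txt
git -C "$work/lib" commit -q -m "Library: initial commit"

# 2. A parent app that embeds the library as a submodule.
git init -q "$work/app"
cd "$work/app"
git symbolic-ref HEAD refs/heads/master
# protocol.file.allow is needed only because the source is a local path.
git -c protocol.file.allow=always submodule --quiet add "$work/lib" my-lib
git commit -q -m "Add my-lib as a submodule"

# 3. Changing library code means a commit inside the submodule...
echo "v2" > my-lib/lib.txt
git -C my-lib commit -q -am "Library: bump to v2"

# 4. ...and a second commit in the parent to update the reference.
git add my-lib
git commit -q -m "Point my-lib at the new library commit"

# The parent now pins exactly the commit the submodule is on.
lib_head=$(git -C my-lib rev-parse HEAD)
pinned=$(git rev-parse HEAD:my-lib)
echo "submodule HEAD:   $lib_head"
echo "pinned by parent: $pinned"
```

Note that the library change still lives only in the submodule checkout at this point; it would also have to be pushed to the library's own repository, which is where the extra pull requests come from.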
Here at Medium we rely on GitHub’s pull request feature to do code reviews, so we only merge a feature branch into master once someone else has reviewed the code. Git submodules make this process much more complicated: a developer would have to create a separate pull request for each submodule and link them together during code review.
Again, the disruption to our familiar workflow eliminated this option.
#### Git subtrees
In the end it turned out that the right answer was a more barebones solution, the lesser-known git subtrees. Git subtree is, in essence, a script written on top of git (now shipped in git’s contrib directory). It fetches the subproject’s repository and merges it into the parent project’s repository. This means that either all commits from the library get copied into the parent repository, or all of them get squashed into one big merge commit.
To add a subproject, you execute these arcane-looking commands:
    git remote add my-lib-remote git@github.com:Medium/my-lib.git
    git subtree add --prefix=my-lib-folder/ my-lib-remote master
First, you add your library’s remote as if it were your own. Then, you use subtree add to add that repo’s code at a path in the parent project, specified by --prefix. The last parameter, master, is the branch you are pulling code from.
These two commands will pull all the commits from the subproject repository and lay all the code into the prefix path. This has a significant advantage over git submodules—you can see the history of changes of all the files that matter to your project, which gives you much more visibility into the changes going into the project and overall code quality.
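As a rough, self-contained illustration of that behavior, the sketch below runs the same two commands against a throwaway local repository standing in for the real remote. The paths, file names, and commit messages are hypothetical, and it assumes your git installation includes the subtree command.

```shell
#!/bin/sh
# Sketch: add a library as a subtree and confirm that its commits
# become part of the parent repository's history.
set -eu

# Fixed identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)

# A stand-in library repository (hypothetical; replaces a hosted remote).
git init -q "$work/my-lib"
git -C "$work/my-lib" symbolic-ref HEAD refs/heads/master
echo "v1" > "$work/my-lib/lib.txt"
git -C "$work/my-lib" add lib.txt
git -C "$work/my-lib" commit -q -m "Library: initial commit"

# The parent app. It gets one commit first, since the subtree is
# merged into existing history.
git init -q "$work/app"
cd "$work/app"
git symbolic-ref HEAD refs/heads/master
echo "app" > README
git add README
git commit -q -m "App: initial commit"

# The two commands from the text, pointed at the local stand-in.
# Adding --squash to the second command would collapse the library
# history into a single commit instead of importing every commit.
git remote add my-lib-remote "$work/my-lib"
git subtree add --prefix=my-lib-folder/ my-lib-remote master

# The library's files are ordinary files in the parent's working tree,
# and its commits show up in the parent's log.
ls my-lib-folder
git log --format='%s'
```

Because the library's files are plain tracked files under my-lib-folder/, a feature branch that touches both app and library code can still go through a single pull request.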
From there on out, you can resume your normal workflow disruption-free as if there were no subproject at all!
Synchronizing code between the client app codebase and the library is a tiny bit more complicated. For a more comprehensive walkthrough of the git subtree workflow, refer to “git subtrees: a tutorial.”
As our codebase and team evolve, it’s possible that our needs and capacity for customizing our own development tools will change. But for now, git subtrees is serving us quite well.