Lessons learned about structuring iOS applications: Why we went from VIPER to MVVM 

Developing iOS applications is not easy. As one begins to think about the user interface, network connection, business logic and every little detail in between, it quickly becomes clear that an ordered and structured approach is necessary.

Hence, over the years, iOS developers have spent quite some time thinking about which software architecture is the best. “Best” here, of course, means the one that produces results quickly and reliably without putting an unnecessary burden on developers.

What follows are a few of the lessons we at Centroida learned during some of our latest projects. As always, all the standard caveats apply. Namely, your mileage may vary and none of this is the one final Truth (we are still learning every single day, after all!). Still, for what it’s worth, here are some of the pain points we were consistently hitting in our development process.

#1. Complexity matters

When we started the project, we initially chose to structure the app according to the commonly known VIPER (View-Interactor-Presenter-Entity-Router) architecture. As its name suggests, it models the app as a set of interacting modules, each composed of single-responsibility objects.

For instance, with VIPER, a Register module could have a View which displays a form with fields for name and phone, a Presenter that validates the values entered, an Interactor that takes the form values and sends them to the backend via HTTP, and a Router which lets us properly navigate to the next module, for example Login. Each of these layers talks to its neighboring layers (e.g. view with presenter) via a well-defined protocol that creates a meaningful abstraction over the purely technical details of Apple’s APIs.
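To make this concrete, here is a minimal sketch of what such a Register module might look like in Swift. All protocol, type and method names are our own illustrative assumptions rather than a canonical VIPER API, and the "Spy" classes are stand-ins for the real view controller, network layer and navigation code.

```swift
// Hypothetical protocols describing each layer of a Register module.
protocol RegisterView: AnyObject {
    func showValidationError(_ message: String)
}

protocol RegisterInteractorInput {
    func register(name: String, phone: String)
}

protocol RegisterRouter {
    func showLogin()
}

// The presenter validates input and only knows its neighbors via protocols.
final class RegisterPresenter {
    weak var view: RegisterView?
    private let interactor: RegisterInteractorInput
    private let router: RegisterRouter

    init(interactor: RegisterInteractorInput, router: RegisterRouter) {
        self.interactor = interactor
        self.router = router
    }

    func submit(name: String, phone: String) {
        guard !name.isEmpty, !phone.isEmpty else {
            view?.showValidationError("Name and phone are required")
            return
        }
        interactor.register(name: name, phone: phone)
    }

    // Assumed to be called by the interactor once the backend accepts the form.
    func registrationSucceeded() {
        router.showLogin()
    }
}

// Stand-ins that simply record what they were asked to do.
final class SpyRegisterView: RegisterView {
    private(set) var errors: [String] = []
    func showValidationError(_ message: String) { errors.append(message) }
}

final class SpyRegisterInteractor: RegisterInteractorInput {
    private(set) var submissions: [String] = []
    func register(name: String, phone: String) {
        submissions.append("\(name) / \(phone)")
    }
}

final class SpyRegisterRouter: RegisterRouter {
    private(set) var didShowLogin = false
    func showLogin() { didShowLogin = true }
}

let registerView = SpyRegisterView()
let registerInteractor = SpyRegisterInteractor()
let registerRouter = SpyRegisterRouter()
let registerPresenter = RegisterPresenter(interactor: registerInteractor,
                                          router: registerRouter)
registerPresenter.view = registerView

registerPresenter.submit(name: "", phone: "555-0100")     // rejected by validation
registerPresenter.submit(name: "Ada", phone: "555-0100")  // forwarded to the interactor
registerPresenter.registrationSucceeded()                 // router navigates to Login
```

Even in this stripped-down form, a single screen already needs three protocols and four concrete types.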

All in all, there is unquestionable beauty in this approach. The simplicity that such fine separation of concerns offers is certainly alluring. Consequently, the initial days were a breeze. We were adding modules to the app, each living in its own silo, each component doing its own job in the chain.

However, nothing is perfect and there certainly was a price to pay.

For one, adding modules took time. When you have many interacting objects, each of which only knows about a specific interface (as defined by a Swift protocol) implemented by another, that’s a lot of protocols, files and classes to create and manage, let alone keep track of.

And so, lesson number one is to simply be mindful of the fact that theoretical simplicity can easily come to clash with the practical realities of the human mind. As an app grows, so do the features it has to offer. Consequently, unless the number of developers grows in a similar manner, there will certainly come a point where keeping track of the state of the project becomes next to impossible. And when this happens, adding new features becomes more and more difficult as moving parts multiply and the codebase begins to exceed comprehension. (Especially after a few days off!)

In fact, it is worth mentioning here that complexity is harmful not only because it overwhelms the working memory. It is also harmful because of the vicious cycle of frustration that builds up over time as a result. Which brings us to lesson #2.

#2. The cost of architectural purity / consistency

It won’t be news to anyone well-versed in iOS development that UIKit and Apple’s other APIs do not follow a VIPER-like architecture (the closest pattern they follow is MVC, or Model-View-Controller). Nor do they have the same level of separation of concerns that VIPER tries to bring.

Here again, generally speaking, VIPER seems like a way to bring order to the madness (no more UI event handling in the view controller, let alone network requests). And in many cases VIPER did bring order. Nonetheless, over time the friction between Apple’s APIs and the VIPER architecture kept growing.

By way of illustration, for part of the app we needed a way to present the user with a multi-step form involving several view controllers that together build up one complete form state to be sent off to the backend.

A situation like this presented us with architectural problems because it suddenly broke one of the assumptions of VIPER as we were practicing it in the beginning. Namely, that there would always be a one to one correspondence between views and presenters. Suddenly, we needed to have either:

  1. Several views (each step of the form) with one presenter (for the whole form) or
  2. Several view/presenter pairs that shared a common state.

And while each of those could work, there admittedly was something a bit dissatisfying about either of the approaches. With both there was a palpable feeling of loss of clear separation.

At this point, many philosophical questions began to arise. For instance, “are two presenters that share a common state really two different objects?” and/or “does a single presenter that manages several views really have a single responsibility?” (i.e. is the presenter’s single responsibility the management of the form as a whole, or is it taking on too much work by managing multiple form screens?).

Relatedly, should the logic for going from one view to another be part of the single-presenter-to-rule-them-all, or should it be encoded in the router associated with each view/presenter pair? Or should it live in the router even in the former case? And what if instead of separate view controllers (living in e.g. the navigation stack), we instead used a composite view controller such as UIPageViewController to change from screen to screen? Is it then still the router’s job to change pages or is it now part of the view’s concerns?

The upshot of all that, at least in the author’s experience, was that just as much time was spent thinking about our ideological VIPER purity as was spent actually getting our job done.

Of course, one might say that this is a feature and not a bug; that thinking carefully about structure is important; that VIPER is simply pointing out potential issues and that, much like Swift’s type checker, it made us write better code in the end.

And while it’s tempting to agree completely (after all, “this way is the true way and any issues are only there for your own good” carries an almost religious attraction), the reality simply felt different. We were doing a lot of architectural work for little discernible gain. In fact, the gain was often questionable given VIPER’s tendency to multiply files and entities. Not only that, but as the UIPageViewController dilemma showed, it often wasn’t clear-cut which entity was supposed to do what. Every abstraction leaks at a certain point, and we were increasingly reaching a point where the stakes were growing without us having found a go-to solution that just works.

Naturally and consequently, the cost of mistakes rose and stayed high. The fact is, in the context of the example above, whichever of the two approaches we had decided to take (many views/one presenter or many view/presenters + common state), we would have guaranteed ourselves that any potential refactoring into the other approach would have taken much mental effort and many frustrating business hours – not great when you have a product to deliver and developers to manage.

And unless one believes that the best and optimal solution is always obvious (and so refactoring is never needed), the upshot of all of the above was disincentivized experimentation / slower feature development. All of which ultimately fed into the vicious psychological cycle that got us started in this section in the first place.

#3. The ubiquitous callbacks and a few more practical examples

One of the defining characteristics of protocol-driven development with such a long chain of entities as in VIPER (e.g. view->presenter->interactor->api and back) is the overwhelming number of function calls happening for even the simplest of things.

For instance, for a view to display the current user’s name, the presenter needs to get it from somewhere and tell the view about it. But since the presenter does not deal with this part of the app’s state, it has to go to the interactor, which in turn talks to some kind of identity manager (e.g. backed by UserDefaults) that has the current user’s data. And of course, if any of those steps needs to happen asynchronously, none of these function calls can simply return the name. Instead, they should all accept a callback or use some equivalent (such as a fixed callback encoded in the interface of the caller, e.g. userNameReceived() or similar) to propagate the information back to the view.

At least in our case, the code that resulted from this was more or less:

view (somewhere, often viewDidLoad): presenter.requestCurrentUserName()
presenter (requestCurrentUserName): interactor.requestCurrentUserName()
interactor (requestCurrentUserName): identityManager.getCurrentUserName() and presenter.currentUserNameReceived(name)
presenter (currentUserNameReceived): view.displayCurrentUserName(name)
view (displayCurrentUserName): sets some label’s text and quits
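
Spelled out in Swift, that chain might look roughly like the sketch below. The protocol and type names are hypothetical, the identity manager is an in-memory stand-in instead of one backed by UserDefaults, and everything is synchronous to keep the example short.

```swift
// One protocol per hop in the chain (all names are illustrative assumptions).
protocol UserNameView: AnyObject {
    func displayCurrentUserName(_ name: String)
}

protocol UserNamePresenting: AnyObject {
    func requestCurrentUserName()
    func currentUserNameReceived(_ name: String)
}

protocol UserNameInteracting {
    func requestCurrentUserName()
}

protocol IdentityManaging {
    func getCurrentUserName() -> String
}

final class UserNameInteractor: UserNameInteracting {
    weak var presenter: UserNamePresenting?
    private let identityManager: IdentityManaging

    init(identityManager: IdentityManaging) {
        self.identityManager = identityManager
    }

    func requestCurrentUserName() {
        // Fetch the name and push it back up the chain via the fixed callback.
        let name = identityManager.getCurrentUserName()
        presenter?.currentUserNameReceived(name)
    }
}

final class UserNamePresenter: UserNamePresenting {
    weak var view: UserNameView?
    private let interactor: UserNameInteracting

    init(interactor: UserNameInteracting) {
        self.interactor = interactor
    }

    func requestCurrentUserName() {
        interactor.requestCurrentUserName()
    }

    func currentUserNameReceived(_ name: String) {
        view?.displayCurrentUserName(name)
    }
}

// Stand-ins for the real identity manager and view controller.
struct InMemoryIdentityManager: IdentityManaging {
    func getCurrentUserName() -> String { "Ada" }
}

final class LabelOnlyView: UserNameView {
    private(set) var labelText = ""
    func displayCurrentUserName(_ name: String) { labelText = name }
}

let userNameInteractor = UserNameInteractor(identityManager: InMemoryIdentityManager())
let userNamePresenter = UserNamePresenter(interactor: userNameInteractor)
let userNameView = LabelOnlyView()
userNameInteractor.presenter = userNamePresenter
userNamePresenter.view = userNameView

// What the view would do in viewDidLoad.
userNamePresenter.requestCurrentUserName()
```

That is four protocols and five methods to move a single string into a label.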

Besides being incredibly boring to read and write, this code exemplifies precisely the frustration that #1 and #2 already touched on. When you have so many more or less equivalent functions in so many protocols for even the simplest of things, it quickly becomes hard to keep track of and maintain the codebase, especially when a refactoring needs to happen. What if we now need the user’s email or ID too? Whether that means 8 new functions or 8 changed function signatures (per module), the end result is a decreased pace of development either way.

This hit us especially hard when we needed to support refresh tokens, which could occasionally trigger a UI update. As a way of introducing the problem we were trying to solve, consider the following: our backend gives us a short-lived access token and a relatively long-lived refresh token that can generate new access tokens if necessary. Whenever we attempt a network request, we may discover that our access token has expired. In that case, we try to refresh it. But since we encrypt the refresh token using the iPhone’s Secure Enclave, accessing the refresh token triggers biometric authentication that has an associated UI prompt (Face ID, Touch ID, passcode prompts). And even if all goes to plan up to this point, we could still discover that the refresh token itself has expired. In that case, we need to propagate this back to the view and display the login form.
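
The decision logic of that flow can be sketched as follows. This is a simplified, synchronous illustration under our own assumptions: the type and method names are hypothetical, the token store is an in-memory stand-in, and the biometric prompt is only hinted at in a comment rather than implemented against the real Keychain / Secure Enclave APIs.

```swift
enum AuthError: Error {
    case accessTokenExpired
    case refreshTokenExpired
}

enum RequestOutcome: Equatable {
    case success(String)
    case loginRequired   // the view layer must show the login form
    case failed
}

protocol TokenStore {
    // In a real app, reading this value is where Face ID / Touch ID would appear.
    func loadRefreshToken() throws -> String
}

protocol AuthAPI {
    func perform(request: String, accessToken: String) throws -> String
    func refreshAccessToken(using refreshToken: String) throws -> String
}

final class AuthenticatedClient {
    private var accessToken: String
    private let store: TokenStore
    private let api: AuthAPI

    init(accessToken: String, store: TokenStore, api: AuthAPI) {
        self.accessToken = accessToken
        self.store = store
        self.api = api
    }

    func send(_ request: String) -> RequestOutcome {
        do {
            let response = try api.perform(request: request, accessToken: accessToken)
            return .success(response)
        } catch AuthError.accessTokenExpired {
            return refreshAndRetry(request)
        } catch {
            return .failed
        }
    }

    private func refreshAndRetry(_ request: String) -> RequestOutcome {
        do {
            let refreshToken = try store.loadRefreshToken()
            accessToken = try api.refreshAccessToken(using: refreshToken)
            let response = try api.perform(request: request, accessToken: accessToken)
            return .success(response)
        } catch AuthError.refreshTokenExpired {
            return .loginRequired
        } catch {
            return .failed
        }
    }
}

// Stand-ins for the backend and the encrypted token storage.
struct ScriptedAuthAPI: AuthAPI {
    let validAccessToken: String
    let refreshTokenIsValid: Bool

    func perform(request: String, accessToken: String) throws -> String {
        guard accessToken == validAccessToken else { throw AuthError.accessTokenExpired }
        return "response to \(request)"
    }

    func refreshAccessToken(using refreshToken: String) throws -> String {
        guard refreshTokenIsValid else { throw AuthError.refreshTokenExpired }
        return validAccessToken
    }
}

struct InMemoryTokenStore: TokenStore {
    func loadRefreshToken() throws -> String { "refresh-token" }
}

let refreshingClient = AuthenticatedClient(
    accessToken: "stale",
    store: InMemoryTokenStore(),
    api: ScriptedAuthAPI(validAccessToken: "fresh", refreshTokenIsValid: true))
let refreshedOutcome = refreshingClient.send("GET /profile")

let expiredClient = AuthenticatedClient(
    accessToken: "stale",
    store: InMemoryTokenStore(),
    api: ScriptedAuthAPI(validAccessToken: "fresh", refreshTokenIsValid: false))
let expiredOutcome = expiredClient.send("GET /profile")
```

Note how the loginRequired outcome and the biometric prompt are both UI concerns that originate deep inside what is nominally the network layer.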

Now, pretty much every architecture worth its salt will have some kind of separation between the view and network layers. However, practical issues like the one just described at times require that this separation not be insurmountable or absolutely total (as we saw already, biometric authentication has a UI but is required in the network layer).

At least for us, VIPER often seemed to err a bit too much on the side of absolute separation. Which is all fine, until there’s no getting around Apple’s Secure Enclave’s API that can trigger its UI at any point and thus undermine, at least conceptually, the intended separation of concerns between view and interactor / api manager.

This was a con for VIPER in our book because it was this exact separation of concerns that necessitated the multiplication of architectural layers; it was also this separation that justified the endless boilerplate that we were writing. It seemed to us that without a strict separation, we were losing some of the benefits that VIPER promised us while still continuing to pay all of the associated costs.

#4. Testing

Which gets us to the topic of testing.

One of the selling points of VIPER is the ability to mock any of the architectural layers (by implementing its defining protocols) and thus be able to unit test that everything along each chain in each module is working properly.

Unfortunately, Swift is not the world’s greatest language when it comes to mocking capabilities and frameworks. At first, we experimented with Cuckoo, but we hit some build issues and ultimately had to give up. We went back to writing our mocks ourselves, but this quickly made any refactoring even more painful than before. We now had to update not only the protocols for each layer, but the mocks and the tests too. As a result, a simple experiment to see if some new idea might work could sometimes take a full day of fixing protocol conformance errors.
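
For readers who have not written such mocks by hand, here is a minimal sketch of the pattern. The protocol and class names are hypothetical, and a real test would live in an XCTest case rather than at the top level of a file.

```swift
// The protocol a presenter depends on (illustrative name).
protocol ProfileInteracting {
    func requestCurrentUserName()
}

// A hand-written mock: it only counts how often it was called.
final class MockProfileInteractor: ProfileInteracting {
    private(set) var requestCurrentUserNameCallCount = 0

    func requestCurrentUserName() {
        requestCurrentUserNameCallCount += 1
    }
}

final class ProfilePresenter {
    private let interactor: ProfileInteracting

    init(interactor: ProfileInteracting) {
        self.interactor = interactor
    }

    func viewDidLoad() {
        interactor.requestCurrentUserName()
    }
}

// The "test": the presenter should ask the interactor exactly once.
let mockProfileInteractor = MockProfileInteractor()
let profilePresenter = ProfilePresenter(interactor: mockProfileInteractor)
profilePresenter.viewDidLoad()
```

Changing the signature of requestCurrentUserName() now means touching the protocol, the real interactor, the mock and every test that uses it.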

It was the combination of this and many other smaller problems that tempted us into looking for alternatives. Put simply, given the ever-changing project specification, it seemed that VIPER was simply too opinionated and too stubborn to deal with. We wanted something more practical. Something that could maintain the good sides of VIPER while minimizing the bad ones.

Enter MVVM

A few months into the project, we decided that we had learned enough lessons that a rewrite at that point would bring real business value. Scarred by some of the downsides of our previous architecture, we decided to take a new approach this time.

In particular, we decided that instead of trying to commit to countless protocols and multiple layers of separation right from the get go, we should focus our efforts instead on the core functionality of each view controller. Put differently, we decided to let the abstractions suggest themselves along the way instead of forcing them right from the start.

This bought us a few nice things. Firstly, it made progress fast and easy. There was much less time wasted thinking about where exactly a piece of code should go. Secondly, it broke the conceptual mold that VIPER had created and allowed us to be more practically creative.

For instance, to avoid recreating the callback hell, we researched and ended up adopting RxSwift. This helped us because it made callbacks unnecessary in many cases (replaced by subscriptions to observables). Moreover, it allowed us to use many powerful functional primitives like map, filter, reduce, etc. out of the box. This led to more succinct and yet readable code. (The succinctness was key for us here as VIPER+Rx is certainly a possibility too but it’d end up verbose and difficult to follow!)
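
To illustrate the idea without pulling in a dependency, the sketch below uses a tiny hand-rolled observable. It is explicitly not RxSwift’s API (the real library is far richer and also handles errors, completion and disposal); MiniObservable and its methods are our own stand-ins meant only to show how subscriptions plus map and filter can replace a chain of callbacks.

```swift
// A deliberately tiny stand-in for an observable stream (not RxSwift).
final class MiniObservable<Value> {
    private var observers: [(Value) -> Void] = []

    func subscribe(_ observer: @escaping (Value) -> Void) {
        observers.append(observer)
    }

    func emit(_ value: Value) {
        observers.forEach { $0(value) }
    }

    // Returns a new stream whose values are transformed by `transform`.
    func map<T>(_ transform: @escaping (Value) -> T) -> MiniObservable<T> {
        let mapped = MiniObservable<T>()
        subscribe { value in mapped.emit(transform(value)) }
        return mapped
    }

    // Returns a new stream that only forwards values passing `isIncluded`.
    func filter(_ isIncluded: @escaping (Value) -> Bool) -> MiniObservable<Value> {
        let filtered = MiniObservable<Value>()
        subscribe { value in
            if isIncluded(value) { filtered.emit(value) }
        }
        return filtered
    }
}

// The "identity manager" publishes names; the "view" subscribes to greetings.
let currentUserNames = MiniObservable<String>()
var greetingLabelTexts: [String] = []

currentUserNames
    .filter { !$0.isEmpty }
    .map { "Hello, \($0)!" }
    .subscribe { greetingLabelTexts.append($0) }

currentUserNames.emit("")     // filtered out
currentUserNames.emit("Ada")  // reaches the subscriber as "Hello, Ada!"
```

Compared with the five-step callback chain from earlier, the data flow is declared once, in one place.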

As for the multiple views/presenters dilemma, we resolved it by creating a separate class (sometimes a UIPageViewController, sometimes a plain Swift class) that manages the whole multi-screen flow and keeps track of the state. The views underneath it were tasked with displaying relevant data and propagating appropriate UI events back to the parent.
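
A plain Swift class of that kind could look roughly like the sketch below. The two-step form, its fields and all names are hypothetical. The point is only that one object owns the accumulated form state and decides which step comes next, while the individual screens report events to it.

```swift
// The state that the individual screens build up together.
struct MultiStepForm: Equatable {
    var name: String?
    var phone: String?
}

enum FormStep: Equatable {
    case name
    case phone
    case done
}

// Owns the flow: child screens call in, the flow advances and stores state.
final class FormFlow {
    private(set) var form = MultiStepForm()
    private(set) var step: FormStep = .name

    // Called once every step has been completed.
    var onComplete: ((MultiStepForm) -> Void)?

    func nameEntered(_ name: String) {
        guard step == .name else { return }
        form.name = name
        step = .phone
    }

    func phoneEntered(_ phone: String) {
        guard step == .phone else { return }
        form.phone = phone
        step = .done
        onComplete?(form)
    }
}

let formFlow = FormFlow()
var submittedForm: MultiStepForm?
formFlow.onComplete = { submittedForm = $0 }  // e.g. send to the backend here

formFlow.nameEntered("Ada")
formFlow.phoneEntered("555-0100")
```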

Speaking generally, our new approach was a sign of our changing attitudes towards complexity. If before we wanted to be CLEAN at any cost, now we preferred to get there over time without taking on all of the costs upfront. Which is to say, if some view ever became too complicated, we would not hesitate to introduce an associated presenter to lift any and all non-rendering tasks off the view’s shoulders. But we wouldn’t start there from day one. I guess one could say we adopted a pragmatic, complexity-on-demand approach.

This is probably why we ended up with MVVM in the end. In the beginning we had our views and our models and we needed a way to connect the two together without unnecessary complexity. View-Models seemed like an appropriate tool for the job given how reusable and simple they can be. Not only that, but they also seemed to play well with RxSwift and, at least in our case, resemble simple functions that could be easily tested.
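
As a rough illustration of what we mean by view models resembling simple functions, consider the sketch below. The User model and the ProfileViewModel are hypothetical examples, not code from our app.

```swift
// A plain model, as it might come from the backend.
struct User {
    let firstName: String
    let lastName: String
}

// The view model turns the model into exactly what the view displays.
struct ProfileViewModel {
    let user: User

    var greeting: String {
        "Hello, \(user.firstName)!"
    }

    var initials: String {
        "\(user.firstName.prefix(1))\(user.lastName.prefix(1))"
    }
}

let profileViewModel = ProfileViewModel(
    user: User(firstName: "Ada", lastName: "Lovelace"))
```

Because the view model is a value with no dependencies on UIKit, testing it needs no mocks at all: construct it, read a property, compare.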

That is ultimately why, all things considered, we consider the rewrite successful. It’s not because there aren’t any issues remaining. Rather, it is because few of those issues would have been made any easier with VIPER or any of the more complicated architectural patterns out there.

Conclusion

In conclusion, it’s important to remind ourselves that the function of software architecture is to help the developer write quality software faster. Our experiment with VIPER showed us that, at least for our team, it was over-engineered and brought along with it a significant overhead.

That said, the same experiment showed us that VIPER has many positives too and therefore could certainly work for others. Indeed, Uber have recently migrated to a VIPER-inspired architecture which definitely looks intriguing.

At the end of the day, if it works for Uber, then it certainly does many things right. Unfortunately for us, we have neither the scale nor the resources Uber has, so VIPER just wasn’t a good match for us.

On the other hand, MVVM fit the bill perfectly. It combined a moderate separation between the different app layers with the succinctness and the flexibility that is necessary in the beginning stages of a project. It allowed us to experiment and move quickly exactly when it mattered.

This flexibility is why we ultimately believe most iOS projects are better off started with an architecture closer to MVVM than to VIPER. Even the above-mentioned case of Uber confirms this rule of thumb. In particular, their move to VIPER followed years of successful operations in the global market using a much simpler architecture.

In any case, the take-home message here is this: don’t be afraid to experiment, and never simply follow somebody else’s “best practices” blindly. In software architecture, as in much else in life, do more of what works and let go of that which doesn’t. Inevitably, success will follow.

 

 

Author: Blagovest Gospodinov
