COMBINE — X — FILE

Andrea Finollo
9 min read · Mar 9, 2021


When I started on this crazy idea of creating a Bluetooth Low Energy library for Apple hardware using Combine, I really had no idea what I was getting into.

INTRODUCTION

Some stuff that blew my mind:

  • First, the paradigm: Combine follows the functional reactive programming (FRP) paradigm.
  • Second, “time”: in imperative programming we of course take time into account, but in FRP we must go a step further. Time applies to the sequence of events and to how we react to, transform, and stream values over it. “Callback hell” keeps us from focusing on time; in Combine pipelines time becomes so clear and important that you start to see things in a different way.
  • Third, Output and Failure: we are used to passing success and failure completion blocks, nesting them, and sometimes ignoring errors. In Combine streams, results are delivered through the generic Output and Failure types, which is super convenient.
  • Fourth, fancy operators.

Are you using Combine without really understanding how it works under the hood? I will try to help with that, with references to the research I did.

PRECONDITION

This article is not for newbies: you need a basic knowledge of FRP and of Apple’s Combine framework.

SUBSCRIPTION FLOW

This is one of the most important parts of using Combine. Understanding the subscription flow is key to mastering the concept of backpressure.
In Combine it is the subscriber that controls the flow of data.
When a subscriber subscribes to a publisher, it requests data according to a specific demand. The demand is sent up through the pipeline.
This process is called backpressure.

Publisher-Subscriber dance

In this image, taken from the WWDC documentation, you can clearly see how this works.

1. As soon as a subscriber is attached, the subscribe call travels up the chain until it reaches the top
2. The publisher, in a sort of handshake, informs the subscriber that it has received the subscription
3. The subscriber then requests, based on its demand, N (or unlimited) values
4. The publisher sends <= N (or unlimited) values

A publisher may also send a completion event that could be a failure or a normal termination.
The subscriber can also cancel the current subscription.

But what happens if the publisher has more values available once the demand is fulfilled?
Basically, everything is paused until the subscriber asks for more values. I recommend watching this video, where the concept is well explained.

You can also read this article about how to adjust the backpressure with a custom subscriber.
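To make the handshake concrete, here is a minimal sketch of a custom subscriber that asks for only two values. The TwoValueSubscriber type is a hypothetical name invented for this illustration; only the Subscriber protocol methods and Subscribers.Demand come from Combine.

```swift
import Combine

// Hypothetical subscriber for illustration: it asks for two values and never asks for more.
final class TwoValueSubscriber: Subscriber {
    typealias Input = Int
    typealias Failure = Never

    private(set) var received: [Int] = []
    private(set) var didComplete = false

    // Step 2 of the handshake: the publisher hands over the subscription.
    func receive(subscription: Subscription) {
        // Step 3: declare the demand.
        subscription.request(.max(2))
    }

    // Step 4: the publisher delivers at most the demanded number of values.
    func receive(_ input: Int) -> Subscribers.Demand {
        received.append(input)
        return .none // no additional demand
    }

    func receive(completion: Subscribers.Completion<Never>) {
        didComplete = true
    }
}

let subscriber = TwoValueSubscriber()
(1...5).publisher.subscribe(subscriber)
print(subscriber.received) // [1, 2]
```

The publisher still has 3, 4 and 5 available, but since no further demand arrives they are never sent, and the completion is not delivered either. Returning a non-zero demand from receive(_:) is how a subscriber would ask for more.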

The back pressure

MARBLE DIAGRAMS

If you search the internet for FRP, you will probably also come across some weird drawings, the so-called marble diagrams.
Marble diagrams are really helpful for understanding what happens in an FRP pipeline over time.
There is a wonderful app, RxMarbles, that can be downloaded from the App Store and the Google Play Store. Although the app was made for ReactiveX, it has a lot in common with Combine, and I strongly suggest you download it and play with it.
Marble diagrams use these conventions:

Marble diagrams convention

Sometimes you will also find them represented as ASCII.

--5------6----------|-->

  • Timeline: the line along which events occur. It is read from left to right and drawn as a sequence of dashes, where a single “-” can represent a time slot.
  • Emitted values: usually represented by the value itself
  • End of stream: represented by the pipe character “|”
  • Error: represented by the character “#” or “X”
  • Operator: a big ASCII-art box with the operator type written inside

This kind of representation is super helpful when you need to design a pipeline.
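As a small sketch of how such a diagram maps to code, the comment below draws a map operator in ASCII and the pipeline underneath implements it. The values and the multiplication by 10 are arbitrary choices for this example.

```swift
import Combine

// input:   --1--2--3--|-->
//          [ map { $0 * 10 } ]
// output:  --10-20-30-|-->
var output: [Int] = []
var finished = false

let cancellable = [1, 2, 3].publisher
    .map { $0 * 10 }
    .sink(receiveCompletion: { _ in finished = true },   // the "|" in the diagram
          receiveValue: { output.append($0) })           // the marbles

print(output) // [10, 20, 30]
```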

HOT PUBLISHER OR COLD PUBLISHER?

Sometimes this distinction is also known as hot observables and cold observables.
What’s the difference? In a cold publisher model, values are sent only once you attach a subscriber: publishers are “lazy” and you are in charge of pulling the trigger.
By contrast, in a hot publisher model, publishers start firing as soon as they are created, which usually means you must cache the result to make it available once a subscriber is attached.
Depending on the context, sometimes a cold publisher is the better choice and sometimes a hot one.
In Combine the default model is “cold”, with the exceptions of promises (Future) and subjects. If you have ever seen Deferred wrapped around a promise, it is there to turn it from hot to cold.
In a cold publisher situation there is no need to cache results: a value exists only when a subscriber requests it, and there are no old values to ask for, because the same set is produced again each time you attach a subscriber. This has important consequences for how the publisher graph is handled by Combine.
You can read more about this concept on CocoaWithLove.
I agree with that reference when it says that cold publishers are useful when you program in a fully reactive environment, but when you have to interface with imperative code hot publishers are a better fit, and that is where Subjects in Combine play a major role.
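The difference between a hot promise and a Deferred one can be sketched as follows. The log array and the messages are made up for this example; Future and Deferred are the Combine types.

```swift
import Combine

var log: [String] = []

// A Future runs its closure as soon as it is created (hot) and caches the result.
let hot = Future<Int, Never> { promise in
    log.append("hot started")
    promise(.success(1))
}

// Wrapping the Future in Deferred postpones its creation until a subscriber arrives (cold).
let cold = Deferred {
    Future<Int, Never> { promise in
        log.append("cold started")
        promise(.success(2))
    }
}

let logBeforeSubscribing = log // only the hot work has run so far

var values: [Int] = []
let hotCancellable = hot.sink { values.append($0) }   // receives the cached result
let coldCancellable = cold.sink { values.append($0) } // triggers the work now

print(logBeforeSubscribing) // ["hot started"]
print(log)                  // ["hot started", "cold started"]
```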

SUBSCRIPTION SHADOW GRAPH AND EXPERIMENTS

There is something in Combine that at first glance seems very strange.
Let’s say I have a publisher that publishes one value per second, from 1 to 5. I attach a subscriber (A) to it, and then after 2 seconds I attach another one (B).
Can you tell me what is going to happen?
You would probably answer that I will see the first two values on A and the later values on both A and B.
Or at least that is what I thought.
In reality it does not work like that: the default behaviour of most Combine operators is a resubscribe mechanism.
What really happens is that when you add subscriber B, the publisher starts working as if there were two different publishers: even though they overlap in time, subscriber B starts receiving values from the first one.

Resubscribe mechanism

Is there a way to make the second subscriber receive only the values emitted from the moment it is attached? Yes, by using the share operator.

Sharing the publisher
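One way to observe the difference without timers is to count how many times an upstream operator runs when two subscribers are attached. This is only a sketch: the transformCalls function and the use of a PassthroughSubject as the source are assumptions made for the example.

```swift
import Combine

// Counts how often the upstream map runs when two subscribers are attached.
func transformCalls(sharing: Bool) -> Int {
    var calls = 0
    let source = PassthroughSubject<Int, Never>()
    let mapped = source.map { value -> Int in
        calls += 1
        return value * 2
    }
    let pipeline: AnyPublisher<Int, Never> = sharing
        ? mapped.share().eraseToAnyPublisher()
        : mapped.eraseToAnyPublisher()

    let a = pipeline.sink { _ in }
    let b = pipeline.sink { _ in }
    source.send(1)
    a.cancel()
    b.cancel()
    return calls
}

let withoutShare = transformCalls(sharing: false) // 2: each subscriber gets its own chain
let withShare = transformCalls(sharing: true)     // 1: both subscribers share one chain
```

Without share() every subscriber gets its own copy of the pipeline, so the map closure runs once per subscriber for the same value. With share() the upstream is subscribed once and the result is fanned out.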

Quoting Matt Gallagher, this happens because:

Despite the programmer creating a single graph of Publishers, there is a shadow graph of other instances that really performs the value processing and sending. We can see this shadow graph in the last function in Subscriber protocol.
`func receive(subscription: Subscription)`
Every Publisher in your graph is shadowed by one instance of Subscription per active Subscriber.

I strongly recommend reading this article from CocoaWithLove.

FLATMAP

At the beginning flatMap was a really esoteric operator for me: I read a ton of articles about it without really understanding how it works.
In the Apple documentation we find something like this:

Transforms all elements from an upstream publisher into a new publisher up to a maximum number of publishers you specify.

It doesn’t sound very helpful.

That is, until I came across the concept of a metastream here.
Most articles draw analogies with the map, flatMap, and compactMap operations on arrays.
We know that if we have an array of arrays like [[0, 1], [2, 3]] and we apply flatMap, the resulting array is [0, 1, 2, 3]. But how does this concept apply to flatMap in Combine?
In this article Donny Wals (who has also written a fantastic book about Combine) writes:

This means that when you map over a publisher you transform its published values one by one. Using compactMap leads to the omission of nil from the published values. If publishers in Combine are analogous to collections when using map and compactMap, then publishers that we can flatten nested publishers with flatMap.

This is better, but what really opened my mind is the concept of the metastream, which is well explained here.
Let’s say that we need to chain multiple publishers.
We have an array of URLs pointing to resources to be downloaded from the internet, and for each URL in the array we must create a URLSession publisher to download the data. Of course, we are interested in the values received at the end of each download.
A naive approach would be to publish the values contained in the array and then map each one to a URLSession publisher. In this case the sink receives the URLSession publisher instances instead of their responses.

This is a metastream, and it is represented in this image.

Metastream

This is not very helpful: we would like to receive the responses, not the metastream. What we have built with the map operator is a set of nested publishers, so shall we flatten it?
By using flatMap we get what we want inside the sink: the response of each URLSession publisher.

flatMap in action

Now you can see that what is printed are the sizes of the responses.
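The same idea can be sketched without touching the network. Here fakeDownload is a hypothetical stand-in for a URLSession publisher that simply publishes the length of a name as the “response size”, and the names are arbitrary.

```swift
import Combine

// Array analogy: flatMap flattens one level of nesting.
let flatArray = [[0, 1], [2, 3]].flatMap { $0 } // [0, 1, 2, 3]

// Hypothetical stand-in for a download: publishes the "size" of the response.
func fakeDownload(_ name: String) -> Just<Int> {
    Just(name.count)
}

let names = ["mew", "kadabra"]
var innerPublisherCount = 0
var sizes: [Int] = []

// map: the sink receives publishers, not values (a metastream).
let meta = names.publisher
    .map { fakeDownload($0) }
    .sink { (_: Just<Int>) in innerPublisherCount += 1 }

// flatMap: the sink receives the values of the inner publishers.
let flat = names.publisher
    .flatMap { fakeDownload($0) }
    .sink { sizes.append($0) }

print(sizes) // [3, 7]
```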

flatMap is super powerful and can do more. In the code above all the requests run concurrently. What if we don’t want that?
We can work on the demand using the maxPublishers argument, which defaults to .unlimited. If we set it to .max(1), only one request at a time is executed. This is the famous backpressure concept.

Cap the demand

The printed responses are completely different from before, because everything is paused until the current request finishes and only then does the next one start.
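A sketch of this pausing, assuming two PassthroughSubject instances play the role of slow requests: with maxPublishers set to .max(1), the second inner publisher should not even be created until the first one completes.

```swift
import Combine

// Two hypothetical "requests" that we can complete by hand.
let requests = [PassthroughSubject<String, Never>(),
                PassthroughSubject<String, Never>()]

var started: [Int] = []
var received: [String] = []

let cancellable = [0, 1].publisher
    .flatMap(maxPublishers: .max(1)) { index -> PassthroughSubject<String, Never> in
        started.append(index) // records when each request is started
        return requests[index]
    }
    .sink { received.append($0) }

let startedWhileFirstIsRunning = started // only request 0 so far

requests[0].send("first response")
requests[0].send(completion: .finished)  // frees the slot, request 1 starts
requests[1].send("second response")
```

With the default .unlimited, both requests would be started immediately and their responses could interleave in any order.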

Now let’s say that we have a refresh button that, each time it is pressed, publishes a URL to a URLSession publisher to refresh a list, and that we have a user who compulsively presses this button 3 or 4 times to be sure that the list is reloaded. In this case we are interested only in the last tap, and that is where switchToLatest() can help us. Consider the same example, modified a little:

switchToLatest() in action

Only the Kadabra Pokémon is downloaded.
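A network-free sketch of the same behaviour, where the taps subject and the two hand-driven “downloads” are assumptions made for this example: the earlier inner publisher is cancelled as soon as a newer one arrives.

```swift
import Combine

// Hypothetical downloads, completed by hand to simulate slow responses.
let abraDownload = PassthroughSubject<String, Never>()
let kadabraDownload = PassthroughSubject<String, Never>()

let taps = PassthroughSubject<String, Never>()
var delivered: [String] = []

let cancellable = taps
    .map { name in name == "abra" ? abraDownload : kadabraDownload }
    .switchToLatest() // keeps only the most recent inner publisher
    .sink { delivered.append($0) }

taps.send("abra")
taps.send("kadabra")                 // cancels the abra subscription

abraDownload.send("abra data")       // ignored: no longer the latest
kadabraDownload.send("kadabra data")

print(delivered) // ["kadabra data"]
```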

DISPOSE BAG AND TRICKS

If you are working with asynchronous functions in Combine, you will probably find yourself in a situation where you build your beautiful pipeline, launch it, and nothing happens.

I would like to save you some time debugging this: most probably your pipeline is destroyed as soon as the scope of the function where you declared it ends. You are in charge of keeping a strong reference to the AnyCancellable of your pipeline, and you can do that in different ways:

  • create an instance variable that points to it.
  • create a dispose bag, which is a fancy name for a collection that holds all your AnyCancellable instances. This is so common that at the end of the sink you can add a .store(in:) call, passing the dispose bag directly.
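A minimal sketch of the dispose bag approach, next to a subscription that is lost because nothing keeps it. The Counter class and its property names are invented for this example.

```swift
import Combine

final class Counter {
    let increments = PassthroughSubject<Int, Never>()
    private(set) var total = 0
    private var cancellables = Set<AnyCancellable>() // the "dispose bag"

    init() {
        increments
            .sink { [weak self] in self?.total += $0 }
            .store(in: &cancellables) // lives as long as the Counter instance
    }
}

let counter = Counter()
counter.increments.send(2)
counter.increments.send(3)
print(counter.total) // 5

// Without a stored reference the AnyCancellable is released immediately,
// which cancels the subscription before any value arrives.
var lost: [Int] = []
let subject = PassthroughSubject<Int, Never>()
_ = subject.sink { lost.append($0) }
subject.send(1)
print(lost) // []
```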

I suggest reading this book by Matt Neuburg, which also suggests a trick to avoid declaring variables for one-shot subscribers.

Courtesy Matt Neuburg Understanding Combine

As you can see, the completion closure holds a reference to the cancellable variable, so the subscription is kept alive until the process ends.
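A sketch of this kind of trick, not necessarily the exact code from the book: observeOnce is a hypothetical helper, and the subject stands in for any asynchronous publisher that eventually completes.

```swift
import Combine

// Hypothetical helper: subscribes once without storing the cancellable anywhere else.
func observeOnce(_ publisher: AnyPublisher<Int, Never>,
                 onValue: @escaping (Int) -> Void) {
    var cancellable: AnyCancellable?
    cancellable = publisher.sink(
        receiveCompletion: { _ in
            // Capturing the variable here keeps the subscription alive
            // after the function returns, until the publisher completes.
            cancellable?.cancel()
        },
        receiveValue: onValue
    )
}

let subject = PassthroughSubject<Int, Never>()
var received: [Int] = []

observeOnce(subject.eraseToAnyPublisher()) { received.append($0) }

// The function has already returned, yet the value still arrives.
subject.send(42)
subject.send(completion: .finished)
print(received) // [42]
```

This only suits publishers that are guaranteed to complete, since the subscription is released in the completion handler.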

To debug Combine streams, the print() operator is a must, but sometimes it is very noisy and there is no way to disable it temporarily. So here is a custom operator that behaves like print() but can be switched off.

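import Combine

extension Publisher {
    /// Behaves like print(_:to:) when isEnabled is true; otherwise passes values through untouched.
    func customPrint(_ prefix: String = "", to: TextOutputStream? = nil, isEnabled: Bool = true) -> AnyPublisher<Self.Output, Self.Failure> {
        if isEnabled {
            return print(prefix, to: to).eraseToAnyPublisher()
        }
        // Printing disabled: only erase the type so both branches return the same thing.
        return AnyPublisher(self)
    }
}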

CONCLUSIONS

I really hope this helped you somehow in dealing with Combine. Combine is a super powerful technology, and it is really amazing what you can do with it.

I have also added some references and books that I used.

Articles:

Books:
