In the last post I listed the ways you should not split your monolith. But before splitting it the right way, I'd like to answer the following question: what properties should the resulting services have? You can divide your system in a myriad of ways, but you have to choose only one. To make this decision you need to understand which principles should guide you. And to do that you need to know where you want to arrive, i.e., what characteristics your services should possess.
Low coupling
You've seen the examples of wrong monolith splitting. What problem do these examples share? It is tight coupling, exactly what I try to get rid of: if you need to modify one service, it's likely that you'll have to modify a couple more of them. Low coupling usually comes along with high cohesion. When the coupling is tight, the cohesion is low. Interestingly, the opposite is true as well, with one caveat: given the right service granularity, high cohesion results in loose coupling. It's not a generally recognized fact or rule, it's just my observation. And it is the one I use for finding service boundaries that leave services loosely coupled. For me personally this is simpler, as the notion of "loose coupling" seems too ephemeral to be used as a beacon for identifying service boundaries.
High cohesion
In a monolith the cohesion is low because one monolith, one piece of code, does the whole thing. If you split your monolith the way I described in the "Wrong reuse" chapter, the cohesion is low as well, although all those services were created with this property in mind. For example, the usual mindset goes like "Well, the Ticket service contains all the logic related to tickets. Why isn't it cohesive?". The problem is that "all the logic" is quite a blurred notion. It's just a set of functionality somehow related to tickets that is used by the other services throughout the whole ticket life cycle. Such a Ticket service inherently cannot be cohesive.
It reminds me of splitting a project into modules around design patterns: singletons, factories, strategies, etc. Another way could be to split the system around program constructs: classes, interfaces, objects (if the language supports them), etc. Whereas the principle one should follow is semantics. Classes belonging to a module are used together, forming a coherent piece of functionality that tells something about the domain. This is what modules are used for, this is what they are needed for. Talking about service cohesion I keep the same approach in mind: "a coherent piece of functionality".
Probably I should've started with it but, nonetheless, just in case, let's dispel all doubts and clarify the word "cohesion". That's what Wikipedia has to say about it:
(...) cohesion measures the strength of relationship between pieces of functionality within a given module. For example, in highly cohesive systems functionality is strongly related.
That's what I'm striving for.
Correct granularity
The approach I use is based on the notion of bounded context taken from the book Domain-Driven Design: Tackling Complexity in the Heart of Software by Eric Evans.
If some concept that I use in the code of my service is becoming ambiguous (is the customer someone who browses the page or someone who has made a purchase?), if the same concept is used in semantically very different places that are interested in different data and behavior ("order" while making a purchase and "order" while delivering it), or if the same entity reflects some domain concept at every stage of its life cycle (both examples apply), then chances are that the service is too coarse-grained and its cohesion is low. I highly recommend splitting it further to get more cohesive parts. As soon as you get services with monosemantic bounded contexts, where every concept is unambiguous, you're done. No further splitting is required.
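To make the "order" example a bit more tangible, here is a minimal sketch of the same word modeled separately in two bounded contexts. It assumes PHP 8.1 or later, and the class names, fields and sample values are hypothetical, chosen only for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical names: the Sales and Delivery prefixes stand for two bounded contexts.

// "Order" as the purchasing context sees it: items and money.
final class SalesOrder
{
    /** @param array<string, int> $pricesInCents item name => price */
    public function __construct(
        public readonly string $orderId,
        private array $pricesInCents,
    ) {
    }

    public function totalInCents(): int
    {
        return array_sum($this->pricesInCents);
    }
}

// "Order" as the delivery context sees it: an address and a parcel status.
final class DeliveryOrder
{
    private bool $shipped = false;

    public function __construct(
        public readonly string $orderId,
        public readonly string $address,
    ) {
    }

    public function markShipped(): void
    {
        $this->shipped = true;
    }

    public function isShipped(): bool
    {
        return $this->shipped;
    }
}

$sale = new SalesOrder('order-1', ['book' => 1500, 'pen' => 250]);
$parcel = new DeliveryOrder('order-1', '1 Example Street');
$parcel->markShipped();
```

The two classes share nothing but an identifier. Neither needs the other's data or behavior, so within each context the word "order" stays unambiguous.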
High autonomy
Low coupling results in full conceptual service autonomy. By autonomy I mean that a service's ability to do its job doesn't depend on the availability of other services. To do its job, a service needs neither the functionality nor the data of other services. Moreover, a service might not even know about other services (and in most cases should not).
Service autonomy manifests itself in the way services store their data and the way they communicate with each other. But of course it doesn't come by itself. You should pursue autonomy deliberately by identifying correct service boundaries, and you won't regret the time spent, as with high autonomy comes high business agility, the key business property.
Autonomy manifests itself in communication via events and decentralized data storage.
Services communicate via events
Services don't live in a vacuum, so they communicate with each other. How to implement that? I advocate the use of behavior-centric, business-driven event messages, as opposed to synchronous requests and command messages. Such an architecture is called event-driven architecture (EDA). Published events should reflect business concepts, real things happening in the domain: order completed, transaction processed, invoice paid. Usually business policies don't require an immediate and transactional reaction. If you nevertheless think that an event should be processed inside a database transaction and that the result should consequently be "all or nothing, and immediately!", think again. For me personally, it helps to imagine how this business worked (or would have worked) a hundred years ago, when there were no transactions and not even computers to make data exchange, data processing and communication extremely quick. By the way, this is partly why identifying service boundaries is such a difficult task. But even now, with all these technologies at hand, our life still remains not so transactional and rarely synchronous; it is mostly message-driven.
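As a rough illustration of events named after business facts, here is a small sketch assuming PHP 8.1 or later. The event classes, their fields and the InMemoryEventBus are all hypothetical. The bus delivers synchronously and in-process only to keep the example runnable; in a real system a message broker would sit in its place and delivery would be asynchronous.

```php
<?php
declare(strict_types=1);

// Events are named after business facts, in the past tense, and are immutable.
final class OrderCompleted
{
    public function __construct(public readonly string $orderId)
    {
    }
}

final class InvoicePaid
{
    public function __construct(
        public readonly string $invoiceId,
        public readonly int $amountInCents,
    ) {
    }
}

// A toy in-process bus. A real system would put a message broker here.
final class InMemoryEventBus
{
    /** @var array<string, list<callable>> handlers keyed by event class */
    private array $handlers = [];

    public function subscribe(string $eventClass, callable $handler): void
    {
        $this->handlers[$eventClass][] = $handler;
    }

    public function publish(object $event): void
    {
        // The publisher never inspects who, if anyone, is listening.
        foreach ($this->handlers[$event::class] ?? [] as $handler) {
            $handler($event);
        }
    }
}

$bus = new InMemoryEventBus();
$log = [];

$bus->subscribe(OrderCompleted::class, function (OrderCompleted $e) use (&$log): void {
    $log[] = "prepare shipment for {$e->orderId}";
});
$bus->subscribe(InvoicePaid::class, function (InvoicePaid $e) use (&$log): void {
    $log[] = "record {$e->amountInCents} cents for {$e->invoiceId}";
});

$bus->publish(new OrderCompleted('order-1'));
$bus->publish(new InvoicePaid('invoice-7', 1750));
```

Note that the event names say what happened in the business, not which table row changed.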
Take a look at something that seems transactional at first sight, like buying a house. The realtor prepares the documents. You make the first deposit (it already gets less transactional, little by little, doesn't it?). But suddenly your realtor cancels the deal. Perhaps he's found a more profitable offer. Or it turns out that the house is in an unsafe condition and can't be sold. Perhaps he even wrote you an email after failing to reach you by phone, but unfortunately you saw it after the deposit had been made. In this case the realtor will, I hope, give you your money back and you'll be off to look for a new house. The email here is a command-based message whose semantics is fire and forget. The realtor wrote you an email, but it doesn't matter what the response would be. The deal is canceled anyway, and you should do all you can to get your money back and find a new house. The realtor is not much involved in your activities. As for transactionality: firstly, you haven't paid the full price, only the first deposit. Secondly, the deal is canceled, but for now you don't have your money. This can hardly be called a transactional interaction.
Or take the classic example of a money transfer. When the cards are issued by different banks, the transfer can take up to several days, while the sender can be debited at once. Is this transfer period spanned by a database transaction? No. Very roughly, the process can go like this (it can go another way, depending on the particular bank): first the sender's account is debited, then the bank that issued the sender's card starts the clearing process, a physical process of money movement between banks. After this the receiver's account is credited. As you see, it is far from being transactional.
In other words, quite a few things in our life that seem to be transactional and synchronous are not. And it's fine.
But sometimes you really need a command
There are some cases when a service is inherently request/reply-like. For example, this is a valid case for a service doing some analysis job: it requires some input data and reports a result as output. I'm talking about situations when this really is a separate service, representing some business value at the same level of abstraction as the others, so it cannot be put inside any of the existing services. In this case command messages or asynchronous request/reply come in handy. But such services should not be blocking: the client service invoking them should not wait for them to complete their job.
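One possible shape of asynchronous request/reply is sketched below. It is a single-process simulation: the two SplQueue objects stand in for broker queues, and the message fields (correlationId, numbers, average) are hypothetical names picked for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical message shapes; the two SplQueue objects stand in for broker queues.
$requests = new SplQueue();
$replies = new SplQueue();

// Client side: enqueue a request with a correlation id, remember it, then carry on.
$pending = [];
$correlationId = 'req-1';
$requests->enqueue(['correlationId' => $correlationId, 'numbers' => [3, 5, 10]]);
$pending[$correlationId] = 'average basket size';

// Analysis service side: runs later, at its own pace.
while (!$requests->isEmpty()) {
    $request = $requests->dequeue();
    $replies->enqueue([
        'correlationId' => $request['correlationId'],
        'average' => array_sum($request['numbers']) / count($request['numbers']),
    ]);
}

// Client side again: match replies to pending requests by correlation id.
$results = [];
while (!$replies->isEmpty()) {
    $reply = $replies->dequeue();
    if (isset($pending[$reply['correlationId']])) {
        $results[$pending[$reply['correlationId']]] = $reply['average'];
        unset($pending[$reply['correlationId']]);
    }
}
```

The client never waits on the analysis service. It only remembers which correlation ids are still open and picks up the replies whenever they arrive.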
Decentralized data
Besides communication via events, service autonomy implies that there cannot (and should not!) be a shared database.
Well, when the service boundaries are specified correctly and each service is highly cohesive, services simply don't need other services' data. But if some service needs another service's data (fetching it is a synchronous operation by nature), it's likely that they should be a single service. I like to compare it with the feature envy smell, which is a clear violation of the Information Expert principle from the GRASP guidelines. It's just another manifestation of the unity of concepts on different levels, be it an object or a service.
When data is decentralized, under EDA it is modified in a qualitatively different way. No other service can reach in and modify a service's data; this can be done only through events. Why is it a qualitatively different way?
Firstly, in most cases it means that the publisher has to understand what happened from the business perspective. Otherwise publishing an event can even feel unnatural: it is not a simple CRUD operation that we are all used to, it's something different that requires a different approach. And this different approach takes us closer to Domain-Driven Design, which is itself a huge benefit. Secondly, the subscriber decides how to react to an event, what data to modify and how. The publisher doesn't even know about its subscribers.
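Here is a small sketch of what "data is modified only through events" can look like. BillingService, ShippingService and their methods are hypothetical, and a plain list of callables stands in for broker subscriptions.

```php
<?php
declare(strict_types=1);

// Hypothetical services; each keeps its data private and changes it only in
// reaction to an event.
final class BillingService
{
    /** @var array<string, string> order id => invoice status */
    private array $invoices = [];

    public function onOrderCompleted(string $orderId): void
    {
        $this->invoices[$orderId] = 'issued';
    }

    public function invoiceStatus(string $orderId): ?string
    {
        return $this->invoices[$orderId] ?? null;
    }
}

final class ShippingService
{
    /** @var list<string> order ids waiting for a courier */
    private array $toShip = [];

    public function onOrderCompleted(string $orderId): void
    {
        $this->toShip[] = $orderId;
    }

    public function pendingShipments(): array
    {
        return $this->toShip;
    }
}

$billing = new BillingService();
$shipping = new ShippingService();

// Stand-in for broker subscriptions; the publisher only sees a list of callables.
$subscribers = [
    [$billing, 'onOrderCompleted'],
    [$shipping, 'onOrderCompleted'],
];

// Publishing "order completed": the publisher passes the fact along and
// touches neither service's storage.
foreach ($subscribers as $subscriber) {
    $subscriber('order-1');
}
```

Each subscriber decides on its own what the same fact means for its data: billing issues an invoice, shipping queues a parcel.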
A publisher that doesn't know its subscribers reminds me of the Dependency Inversion principle. Consider the following code:
class Server
{
}

class Client
{
    public function do(Server $server)
    {
        // ...
    }
}
In synchronous request/reply the client depends directly on the concrete service it calls, on its API and availability.
In an event-driven architecture based on publish-subscribe, the subscriber doesn't care about the publisher. The subscriber doesn't know who's publishing an event. It takes the position of an experienced meditator who accepts reality as it is, just watching it. It doesn't care about the publisher's availability: never mind, messages can be delivered a bit later. Put another way, the publisher is totally abstracted from the subscriber, hiding behind the set of events it can emit, i.e., its contract.
So the previous piece of code transforms into the following:
interface IEvent
{
}

class Subscriber
{
    public function subscribe(IEvent $event)
    {
        // ...
    }
}
Decentralized data makes high scalability possible. Very often it is the database that becomes a performance bottleneck, because you usually have to lock some data while processing a request. When all the data is located in a single database, the probability rises that a request needs data already locked by a parallel request's transaction. So each request waits for the previous one to finish. When data is decentralized, locks are decentralized too: transactions span less data because request processing is split across different services, and there are fewer of the "accidental" locks that I wrote about in the "Centralized data" chapter of my previous post.
Service choreography
Service choreography is a natural consequence of rejecting synchronous communication, using business events and rejecting centralized data storage. A governing authority in EDA looks like an archaism from a synchronous past.
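A choreographed flow can be sketched roughly as follows. QueueingBus and the event names are hypothetical, and the internal queue merely simulates a broker inside one process. The point is that there is no coordinator: each handler reacts to one fact and announces its own.

```php
<?php
declare(strict_types=1);

// Hypothetical event names; the queue stands in for a broker.
final class QueueingBus
{
    /** @var array<string, list<callable>> handlers keyed by event name */
    private array $handlers = [];
    private SplQueue $queue;
    /** @var list<string> names of events in the order they were handled */
    public array $history = [];

    public function __construct()
    {
        $this->queue = new SplQueue();
    }

    public function subscribe(string $eventName, callable $handler): void
    {
        $this->handlers[$eventName][] = $handler;
    }

    public function publish(string $eventName, string $orderId): void
    {
        $this->queue->enqueue([$eventName, $orderId]);
    }

    // Deliver queued events until no handler publishes anything new.
    public function drain(): void
    {
        while (!$this->queue->isEmpty()) {
            [$eventName, $orderId] = $this->queue->dequeue();
            $this->history[] = $eventName;
            foreach ($this->handlers[$eventName] ?? [] as $handler) {
                $handler($orderId);
            }
        }
    }
}

$bus = new QueueingBus();

// Billing reacts to a completed order and announces its own fact.
$bus->subscribe('order completed', fn (string $id) => $bus->publish('payment received', $id));
// Shipping reacts to the received payment; it knows nothing about billing.
$bus->subscribe('payment received', fn (string $id) => $bus->publish('parcel shipped', $id));

$bus->publish('order completed', 'order-1');
$bus->drain();
```

The overall process exists only as the sum of these independent reactions, and nowhere in the code is the whole sequence written down in one place.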
Profits
So with event-driven architecture we don't have the disadvantages of synchronous communication, command-based communication, distributed transactions and orchestration. It's a win-win situation: with this fire-and-forget approach we decouple our services both logically and technologically, as the messaging infrastructure promotes non-blocking communication.
Concerning reuse, the unit of reuse in EDA is an event, not a service. If you need new functionality upon some event, all you need is to add a new event subscriber.
Besides agility, reliability and availability, this approach opens up almost unlimited scalability prospects: services don't need to cope with peak loads. When we're flooded with messages, they simply reside in our ESB or broker until they get processed.
This approach is nothing but common sense based on experience with failed ways of designing system architecture. It is not a trend. It is not tied to any concrete technology. And surely it's not new. You can call it SOA, microservice architecture, reactive programming or self-contained systems. To me it looks like a bunch of people wanting to take (financial) advantage of a solid set of principles, coming up with new catchy labels in an attempt to write their names in history.
In the next post I'll talk about how exactly I identify service boundaries.
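The buffering effect can be illustrated with a toy simulation. The SplQueue stands in for a broker, and the figures (ten messages, two per tick) are arbitrary assumptions.

```php
<?php
declare(strict_types=1);

// The SplQueue stands in for a broker; all numbers here are arbitrary assumptions.
$broker = new SplQueue();

// Peak load: ten messages arrive at once. Publishing only appends to the queue.
for ($i = 1; $i <= 10; $i++) {
    $broker->enqueue("message-$i");
}

// The consumer handles at most two messages per tick, whatever the backlog is.
$perTick = 2;
$ticks = 0;
$processed = [];
while (!$broker->isEmpty()) {
    for ($n = 0; $n < $perTick && !$broker->isEmpty(); $n++) {
        $processed[] = $broker->dequeue();
    }
    $ticks++;
}
```

The burst costs the consumer more ticks, not more capacity: nothing is lost, and messages are handled in arrival order.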
What Characteristics My Services Should Possess was originally published in Hacker Noon on Medium.