Torque Cloud Framework Blog


How Torque Solves the Complexity Problem in Cloud Software Systems

Vinko Buble
Oct 28, 2022

As a cloud software engineer, you know that the common way of managing cloud systems is instance by instance: engineers have to touch each and every service, and then each and every interconnected service, to configure their systems. The complexity of the system can be represented as the number of services multiplied by the number of links per service, and it multiplies again with the number of deployments. This means that even small changes require more and more work as the system grows.

Current tools do not solve this complexity problem

Now, you might say that we have Terraform, Kubernetes, and other great tools to manage it all for us. While they do provide a number of benefits, they do not change the order of complexity: humans still need to interact with each and every service, instance by instance, and with all interconnected services. The complexity remains multiplicative.

Figure. The manual nature of cloud infrastructure management creates multiplicative complexity.

We must move to Loose Coupling

To reduce this complexity, Torque developed a Python framework for cloud system management that decouples system designs from specific deployments. This reduces a system's complexity from multiplicative to linear while still allowing any kind of complex configuration.

Here’s how.

First, to make it concrete: it all comes as Python code that uses Torque's framework, living in a directory in your codebase.

There are two layers: one that describes your system design, and one that describes how the system is deployed.

Now, let’s go a level deeper to see how these layers work together to reduce the order of complexity from multiplicative to linear.

Next, we bring in abstract types that represent components, such as application services, databases, and queues, as well as the links between these components. These are simply Python classes built on the Torque Framework. Torque’s team provides a collection of abstract type implementations in our GitHub repository, but since this is code, you are not limited to what is currently available: implementations can be open source, or you can write your own.
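To make this tangible, here is a minimal sketch of what such component and link classes could look like. The class names and shapes are purely illustrative, not Torque's actual API:

```python
# Illustrative sketch only: these class names are hypothetical,
# not taken from the Torque Framework's real API.
from abc import ABC, abstractmethod


class Component(ABC):
    """An abstract node in the system design, e.g. a service or a database."""

    def __init__(self, name: str):
        self.name = name

    @abstractmethod
    def kind(self) -> str:
        """Return the kind of component, for deployment code to dispatch on."""


class Service(Component):
    def kind(self) -> str:
        return "service"


class Database(Component):
    def kind(self) -> str:
        return "database"


class Link:
    """A directed connection between two components, e.g. api -> postgres."""

    def __init__(self, source: Component, target: Component):
        self.source = source
        self.target = target
```

Because components and links are ordinary classes, specializing one for your own stack is just subclassing.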

(Eventually, there will be a platform to help you assemble this directory, so you do not need to do this mundane work either. Code provides a lot of flexibility.)

Then, we use components and links to assemble the DAG (directed acyclic graph) that represents our cloud system. Currently, we use a CLI tool to describe and visualize the DAG.
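As a rough illustration of the idea (using Python's standard library rather than Torque's own tooling), components can be modeled as nodes and links as directed edges; a topological sort then both validates the graph and yields a sensible deploy order. The component names here are made up:

```python
# Hypothetical sketch: the component names and edge map are invented
# for illustration; Torque's own CLI handles this for you.
from graphlib import TopologicalSorter

# Each component maps to the set of components it depends on.
links = {
    "api-service": {"postgres", "task-queue"},
    "worker-service": {"task-queue"},
    "postgres": set(),
    "task-queue": set(),
}

# graphlib raises CycleError if the graph is not a DAG, so a successful
# sort doubles as validation. Dependencies come out first.
deploy_order = list(TopologicalSorter(links).static_order())
```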

Before we can deploy the DAG, we need code that converts it into concrete configurations: YAML files, HCL scripts, or direct calls to a cloud provider’s API. For that, Torque gives you code that implements different deployment providers. Again, since this is code, you can use providers prepared by Torque’s team or write your own. That means an expert who has been there many times can create a deployment provider that anybody else can use.
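Conceptually, a deployment provider is just code that turns a component into target-specific configuration. The sketch below is an assumption about the shape of such providers, not Torque's real interface:

```python
# Illustrative only: the Provider protocol and these classes are
# hypothetical, not Torque's actual deployment provider API.
from typing import Protocol


class Provider(Protocol):
    def render(self, component: str) -> str:
        ...


class KubernetesYamlProvider:
    """Renders a component as a minimal Kubernetes Deployment manifest."""

    def render(self, component: str) -> str:
        return (
            "apiVersion: apps/v1\n"
            "kind: Deployment\n"
            "metadata:\n"
            f"  name: {component}\n"
        )


class DockerComposeProvider:
    """Renders a component as a docker-compose service stanza."""

    def render(self, component: str) -> str:
        return f"services:\n  {component}:\n    image: {component}:latest\n"
```

The same component can be rendered by either provider, which is exactly what lets one design serve many deployments.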

As the last step before deploying, you create a deployment object, which is essentially a list of the deployment providers you want to use for deploying your DAG. You hand the directory to the Torque engine, tell it torque deployment apply staging, and Torque converts the DAG into cloud instances using the code from the deployment providers listed in the deployment object.
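A deployment object, then, is little more than a named binding from a deployment to its providers. The sketch below mimics what the apply step does conceptually; every function, variable, and component name in it is invented for illustration and is not Torque's actual code:

```python
# Hypothetical sketch of a deployment object and an "apply" step.

def kubernetes_provider(component: str) -> str:
    # Stand-in for a real provider that would emit YAML or call an API.
    return f"deployed {component} to kubernetes"


def compose_provider(component: str) -> str:
    return f"started {component} via docker-compose"


# A deployment object: a deployment name bound to the providers it uses.
deployments = {
    "staging": kubernetes_provider,
    "dev": compose_provider,
}

components = ["api-service", "postgres", "task-queue"]


def apply(deployment_name: str) -> list[str]:
    """Roughly what applying a deployment means conceptually: run every
    component in the DAG through that deployment's providers."""
    provider = deployments[deployment_name]
    return [provider(component) for component in components]
```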

Figure. The top-level structure of Torque's code directory.

And now comes the part where we can clearly see how the complexity of the system stays the same after we add an entirely new deployment, like production. All we need to do is create another deployment object with a list of production-grade deployment providers, and voila: one command and the production deployment is up and running.

Of course, the same goes for your development environment, too. That’s because, for Torque’s Framework, your development environment is just another deployment with docker-compose deployment providers optimized for developer experience.

Figure. Deploying another environment is as simple as adding a deployment object.

The resulting order of complexity is O(c + l + p + d) = O(N), where:

  • c – number of components
  • l – number of links
  • p – number of provider types
  • d – number of deployment objects
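To put numbers on the formula, here is a toy comparison; all of the counts are invented purely for illustration:

```python
# All counts below are made up for illustration only.
c = 20  # components
l = 35  # links
p = 4   # provider types
d = 3   # deployments (e.g. dev, staging, production)

# Instance-by-instance management: every component and link must be
# configured separately for every deployment.
per_instance_work = (c + l) * d   # 165 touchpoints

# Torque's model: one shared design, shared providers, plus one
# deployment object per environment.
torque_work = c + l + p + d       # 62 touchpoints

assert torque_work < per_instance_work
```

The gap widens further with every deployment added, since d multiplies the first figure but only adds one to the second.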

And that’s how the order of complexity in your cloud system goes from multiplicative to linear, making it much easier to manage even as it grows.

I know this might have raised a number of questions for you. Please check out our GitHub, use our contact form, or follow us here for more blog posts. We’d love to hear from you!
