How Walrus Simplifies Infrastructure Management with Terraform
Platform Engineering has been a hot technical concept in recent years. It is a software engineering approach that aims to improve developer productivity by reducing the complexity and uncertainty of modern software delivery. One specific implementation of Platform Engineering is the Internal Developer Platform (IDP). This article uses the principles of Platform Engineering to introduce the basic concepts and usage of Terraform, covers some of the challenges encountered when using it, and shows how Walrus AppManager addresses them.
What is an Internal Developer Platform (IDP)?
An Internal Developer Platform, or IDP, is a self-service layer built on top of an engineering team's existing tools. Built by the platform team, an IDP lets developers configure, deploy, and launch application infrastructure without depending on the operations team. It further automates operational workflows and improves efficiency by simplifying application configuration and infrastructure management. It also gives developers more autonomy, allowing them to handle everything from coding to software delivery.
In traditional infrastructure deployment and management, manual configuration and management methods are often employed. This means administrators need to manually operate infrastructure components like servers, network devices, and storage, install and configure software, and deal with various dependencies and environmental changes. This approach consumes a substantial amount of time and effort and is susceptible to configuration errors and inconsistencies.
One of the primary objectives of Platform Engineering is to simplify the usage of infrastructure for developers. The majority of developers are not concerned with the intricate concepts and creation methods of underlying services. For example, developers may need to use storage services, but they are typically not interested in how these services are created. Similarly, terms like object storage and disk arrays are usually outside the scope of developers' concerns.
In most business scenarios, developers are primarily interested in how to use these services and whether data can be persistently stored in designated locations. Platform Engineering can therefore provide a storage service that covers the needs of most scenarios. Developers simply choose the type of service they require, use the platform's capabilities to create it, obtain the service's address, and start using it.
Platform Engineering achieves this by abstracting the definition of services and introducing the concept of applications built on top of these services. Multiple applications can be combined to form a business scenario, with explicit or implicit relationships between them. Through these relationships, the applications come together to form a complete business system that offers services to external consumers.
What is Terraform?
Terraform is an Infrastructure as Code (IaC) tool that allows you to build, change, and update infrastructure securely and efficiently. This encompasses low-level components like compute instances, storage, and networking, as well as high-level components like DNS entries and SaaS functionalities.
Before the emergence of Terraform, infrastructure management was a labor-intensive, complex, and error-prone task that relied on manual operations. Terraform has made infrastructure management simpler, more efficient, and more reliable. It is particularly well-suited for cloud environments such as AWS, GCP, Azure, Alibaba Cloud, and others. Terraform can manage various types of resources through its extensive set of providers, which function like plugins and make it easy to extend.
Terraform utilizes HashiCorp Configuration Language (HCL) to manage and maintain infrastructure resources. Before execution, you can use the terraform plan command to preview changes to resources. These changes are managed through Terraform's state files.
The concept of Terraform State is crucial for keeping track of the current state and configuration of your infrastructure. When deploying with Terraform, it tracks the resources created and their configuration states, storing this information in a local .tfstate file or managing it remotely using storage solutions like AWS S3 or Azure Blob Storage.
Because the state file records the last known state of each resource, Terraform can compare it with your configuration and preview the resulting changes before executing them. The state file can be stored locally or in a remote backend such as S3, Consul, GCS, Kubernetes, or a custom HTTP backend.
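As a minimal sketch of remote state storage, the block below configures the S3 backend. The bucket name, key, and region are placeholders chosen for illustration, not values from this article, and the bucket is assumed to exist already.

```hcl
terraform {
  backend "s3" {
    bucket  = "example-terraform-state" # hypothetical bucket; must already exist
    key     = "demo/terraform.tfstate"  # object path of the state file inside the bucket
    region  = "us-east-1"               # assumed region
    encrypt = true                      # request server-side encryption of the state object
  }
}
```

After adding or changing a backend block, terraform init has to be run again so that Terraform can initialize the new backend.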
Installing Terraform
macOS
    brew tap hashicorp/tap
    brew install hashicorp/tap/terraform
Linux Ubuntu
    wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
    echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
    sudo apt update && sudo apt install terraform
Linux CentOS
    sudo yum install -y yum-utils
    sudo yum-config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo
    sudo yum -y install terraform
Linux binary installation
Linux users can also install Terraform from a prebuilt binary by downloading the corresponding version from the official website.
Install Terraform version 1.4.6 from the binary release (writing to /usr/bin requires root privileges, hence sudo):
    curl -sfL https://releases.hashicorp.com/terraform/1.4.6/terraform_1.4.6_linux_amd64.zip -o /tmp/terraform.zip
    sudo unzip /tmp/terraform.zip -d /usr/bin/
    rm -f /tmp/terraform.zip
Windows
Windows users can install Terraform using Chocolatey, which is a package manager for Windows, similar to Linux's yum or apt-get. Chocolatey enables you to quickly install and uninstall software, making it a convenient way to manage software packages on Windows.
    choco install terraform
Alternatively, you can download the appropriate version of Terraform from the official Terraform website, then extract it and configure the PATH environment variable. Finally, you can verify whether the installation was successful by using the terraform version command. This method provides manual control over the installation process and is suitable for users who prefer not to use package managers like Chocolatey.
Example of Terraform Managing Kubernetes Resources
Configure the Kubernetes provider so that Terraform can manage Kubernetes resources. It can use the ~/.kube/config file or another kubeconfig file of your choice.
    provider "kubernetes" {
      config_path = "~/.kube/config"
    }
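The provider block above works on its own, but it can help to pin the provider so that terraform init resolves a predictable version. The following is an optional sketch; both version constraints are assumptions chosen for illustration rather than requirements stated in this article.

```hcl
terraform {
  required_version = ">= 1.4.6" # assumed minimum, matching the binary install shown earlier

  required_providers {
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.0" # assumed constraint: any 2.x release
    }
  }
}
```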
Create a Kubernetes Deployment.
    resource "kubernetes_deployment" "nginx" {
      metadata {
        name = "nginx"
        labels = {
          app = "nginx"
        }
      }

      spec {
        replicas = 1

        selector {
          match_labels = {
            app = "nginx"
          }
        }

        template {
          metadata {
            labels = {
              app = "nginx"
            }
          }

          spec {
            container {
              image = "nginx:latest"
              name  = "nginx"

              port {
                container_port = 80
              }
            }
          }
        }
      }
    }
Create a Kubernetes Service that exposes the Deployment:
    resource "kubernetes_service" "nginx" {
      metadata {
        name = "nginx"
      }

      spec {
        # Select the pods created by the Deployment above
        selector = {
          app = kubernetes_deployment.nginx.metadata[0].labels.app
        }

        type = "NodePort"

        port {
          port        = 80
          target_port = 80
          node_port   = 30080
        }
      }
    }
Create a working directory such as ~/terraform-demo to hold the configuration.
    mkdir ~/terraform-demo && cd ~/terraform-demo
Create a main.tf file in that directory, copy the provider, Deployment, and Service configuration above into it, and save it.
Open the terminal, enter the target directory, and initialize the Terraform environment.
    cd ~/terraform-demo && terraform init
Run terraform plan to preview resource changes.
Run terraform apply to create the resources.
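Finally, open your web browser and navigate to http://localhost:30080 to see the nginx welcome page. This address assumes a local cluster that exposes NodePorts on localhost; on other clusters, use the address of a cluster node instead.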
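To avoid hard-coding the port when checking the deployment, an output can expose the value Terraform recorded for the Service. This is a small sketch that could be appended to main.tf; the output name nginx_node_port is chosen here for illustration.

```hcl
output "nginx_node_port" {
  description = "NodePort on which the nginx Service is exposed"
  # spec and port are nested blocks, so they are addressed as single-element lists
  value = kubernetes_service.nginx.spec[0].port[0].node_port
}
```

After terraform apply, running terraform output nginx_node_port should print the port set in the Service block.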
If you want to delete resources, you can execute the terraform destroy command to remove them.
From the example above, we can see that managing Kubernetes resources with Terraform is straightforward. You only need to describe the resource configurations in HCL and then create the resources using terraform apply. Additionally, Terraform supports importing existing resources using terraform import, allowing you to bring resources that were created outside Terraform under its management.
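As a sketch of bringing an existing resource under management, Terraform 1.5 and later also accept a declarative import block alongside the terraform import command. The Deployment name legacy-nginx and the default namespace are hypothetical, and the namespace/name ID format is assumed from the Kubernetes provider's convention for namespaced resources.

```hcl
# Requires Terraform 1.5 or later.
import {
  to = kubernetes_deployment.legacy
  id = "default/legacy-nginx" # assumed ID format: <namespace>/<name>
}

# The resource block should describe the Deployment as it already exists
# in the cluster; the values below are placeholders.
resource "kubernetes_deployment" "legacy" {
  metadata {
    name      = "legacy-nginx"
    namespace = "default"
  }

  spec {
    replicas = 1

    selector {
      match_labels = {
        app = "legacy-nginx"
      }
    }

    template {
      metadata {
        labels = {
          app = "legacy-nginx"
        }
      }

      spec {
        container {
          name  = "nginx"
          image = "nginx:latest"
        }
      }
    }
  }
}
```

With an older release such as the 1.4.6 binary installed earlier, the equivalent step is the command terraform import kubernetes_deployment.legacy default/legacy-nginx, run after the resource block has been written.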
Terraform State Management
Terraform manages the state of resources and determines whether they have been created, modified, or need to be deleted through its state management mechanism.
Every time Terraform performs an infrastructure change operation, it records the state information in a state file. By default, this state file is named terraform.tfstate and is saved in the current working directory. Terraform relies on the state file to decide how to apply changes to resources.
In the example mentioned earlier, you can see that the generated state file contains the resources created, such as kubernetes_deployment and kubernetes_service. These correspond to the resources defined in our HCL files. Both the managed resources (resource blocks) and the data sources (data blocks) we declare are recorded in this file.
The state file also includes metadata information, such as resource IDs and attribute values. This metadata is crucial for Terraform to manage resources effectively.
By maintaining this state information, Terraform can understand the current state of the infrastructure and compute the necessary actions to achieve the desired state defined in your configuration files. This allows Terraform to create, modify, or delete resources as needed, as well as manage dependencies between resources to ensure they are provisioned in the correct order.
Terraform's choice of managing resources through state files rather than using API calls to inspect resource status directly from cloud providers is based on several advantages:
Efficiency and Performance: The state file acts as a cache of the attributes of every resource Terraform manages. For small infrastructures, Terraform can afford to query the provider for the latest attributes of every resource on each plan or apply and compare them with the configuration; if they match, no action is needed, and if they don't, Terraform creates, updates, or deletes resources accordingly. For large infrastructures, querying each resource individually is slow, and many cloud providers enforce API rate limits that Terraform can hit when it queries a large number of resources in a short time. In those cases, relying on the cached state instead of a full refresh significantly improves performance.
Resource Dependency Management: Terraform records resource dependencies within the state file. This allows Terraform to manage the order in which resources are created or destroyed based on their dependencies. It ensures that resources are provisioned in the correct sequence.
Collaboration: Remote backends can manage state files, facilitating collaboration among team members. By storing the state remotely (e.g., in AWS S3, GCS, HTTP), multiple team members can work together on the same infrastructure project and share the state file.
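The dependency tracking described above can be sketched with two common patterns: an implicit dependency created by referencing another resource, and an explicit one declared with depends_on. The namespace, ConfigMap, and ServiceAccount names are hypothetical.

```hcl
resource "kubernetes_namespace" "demo" {
  metadata {
    name = "demo" # hypothetical namespace
  }
}

resource "kubernetes_config_map" "settings" {
  metadata {
    name = "settings"
    # Referencing the namespace resource creates an implicit dependency,
    # so the namespace is created first and destroyed last.
    namespace = kubernetes_namespace.demo.metadata[0].name
  }

  data = {
    LOG_LEVEL = "info"
  }
}

resource "kubernetes_service_account" "app" {
  metadata {
    name      = "app"
    namespace = "demo" # a literal string, so Terraform cannot infer the ordering
  }

  # Explicit dependency for cases where no attribute reference exists
  depends_on = [kubernetes_namespace.demo]
}
```

Referencing attributes is usually preferable, since the ordering then follows from the configuration itself; depends_on is kept for cases where no such reference is available.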
While managing resources through state files offers many benefits, it has some potential drawbacks:
State File Security: The state file contains sensitive information in plain text. If this file is leaked or compromised, it could expose critical information, including passwords and access keys, posing a significant security risk. Care must be taken to protect this file from unauthorized access.
State File Reliability: If the state file becomes corrupted or is lost due to factors like hardware failure, it can result in resource information being lost. This can lead to resource leaks and difficulties in resource management.
To address these concerns, Terraform provides a solution through remote backends, where the state file is stored in a remote, secure location. This approach mitigates the risk of state file exposure and ensures better collaboration among team members.
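The security concern above is worth illustrating, because marking a value as sensitive does not keep it out of the state file. In the sketch below, the variable name db_password and the Secret name db-credentials are hypothetical.

```hcl
variable "db_password" {
  type      = string
  sensitive = true # redacts the value in plan and apply output
}

resource "kubernetes_secret" "db" {
  metadata {
    name = "db-credentials" # hypothetical Secret name
  }

  # The value is hidden in CLI output but is still written to the state file
  data = {
    password = var.db_password
  }
}
```

This is why access to the backend that stores the state needs to be restricted, regardless of how values are marked in the configuration.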
Challenges in Using Terraform
The example provided above demonstrates that Terraform makes managing Kubernetes resources relatively simple. However, as previously mentioned, Terraform does come with some challenges that can impact the user experience in resource management. These challenges include:
HCL Language: Using HCL (HashiCorp Configuration Language) to describe resource configurations means that developers need to learn a new language. Additionally, if you want to use Terraform to manage resources in other platforms such as AWS, GCP, etc., you'll need to learn the configurations for those providers, increasing the learning curve. Syntax issues and complexity can also add to the difficulty.
State Management: Terraform manages resources through state files, and the information stored in these state files is in plain text. If the state file is leaked or compromised, it poses a significant risk to resource management and security.
Knowledge and Experience: Infrastructure resource managers need to have knowledge and experience with the resources they are configuring. Otherwise, misconfigurations may lead to resource creation failures or configurations that do not meet expectations.
Configuration Overhead: Managing a large number of resources often requires writing a significant amount of HCL files. Resource users may spend considerable time searching for resources and configuration files, increasing the management overhead.
Real-time Resource Status: Terraform does not provide real-time access to operational details such as Kubernetes resource states or logs, and it cannot execute commands in a terminal. These tasks require other tools.
Despite these challenges, Terraform remains a powerful tool for infrastructure management. Mitigating these issues often involves careful planning, adhering to best practices, and leveraging additional tools and processes to complement Terraform's capabilities.
Using Walrus to Simplify Infrastructure Management
Walrus is an application deployment and management platform based on the concept of platform engineering, built on Terraform technology at its core. It enables developers and operations teams to rapidly set up production or test environments and efficiently manage them. Walrus leverages the capabilities of platform engineering to address the previously mentioned challenges.
Walrus abstracts resources into services and uses applications to control these services, separating the underlying resource configurations from their actual usage. This simplifies infrastructure management.
By managing multiple environments and configurations, Walrus keeps development, testing, and production environments consistent, reducing the risk of errors and inconsistencies so that applications behave the same way in each of them.
With governance and control features provided by the platform, developers can also ensure that the environments they use are secure and compliant with best practices and security standards. Resource users can focus on using resources without concerning themselves with the underlying details and configurations.
Through the defined resource templates, developers no longer need to worry about HCL syntax, how to configure Terraform provider parameters, or the underlying implementation details of the infrastructure. They can simply use the resources through the platform's UI by filling in the parameters of the predefined modules. This significantly reduces the complexity of resource usage for developers and improves overall development efficiency.
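In plain Terraform terms, a predefined module with fillable parameters can be sketched as a set of input variables plus a module call. This is an illustration of the general idea only, not the actual format of a Walrus template; the module path and the variable names image and replicas are hypothetical.

```hcl
# modules/webservice/variables.tf (hypothetical module)
variable "image" {
  type        = string
  description = "Container image to deploy"
}

variable "replicas" {
  type        = number
  default     = 1
  description = "Number of pod replicas"

  validation {
    condition     = var.replicas >= 1
    error_message = "The replicas value must be at least 1."
  }
}
```

```hcl
# Root configuration: the caller only fills in parameters
module "web" {
  source   = "./modules/webservice" # hypothetical local path
  image    = "nginx:latest"
  replicas = 2
}
```

The person using the module sees only the two inputs, while the resource definitions stay inside the module.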
Additionally, the inconveniences of Terraform's state management have been addressed. Walrus stores state files remotely in an HTTP backend, mitigating the risk of state file exposure. Different services automatically manage their respective states, allowing team members to use the backend to solve state storage and sharing issues.
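For reference, this is roughly what pointing Terraform at an HTTP backend looks like in a hand-written configuration. The URLs are placeholders and do not represent Walrus endpoints; the sketch assumes a server that implements the HTTP backend protocol, including locking.

```hcl
terraform {
  backend "http" {
    address        = "https://state.example.com/terraform/demo"      # hypothetical state endpoint
    lock_address   = "https://state.example.com/terraform/demo/lock" # hypothetical lock endpoint
    unlock_address = "https://state.example.com/terraform/demo/lock"
    lock_method    = "POST"   # assumed to match what the server expects
    unlock_method  = "DELETE" # assumed to match what the server expects
  }
}
```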
Walrus defines a series of abstractions at the upper level of resources to simplify the management of application services. It manages the lifecycle of business services through concepts like Project, Environment, Connector, Service, and Resource.
Internally, it primarily uses three major components to manage resources: App Manager, Deployer, and Operator. App Manager is responsible for resource management, Deployer handles resource deployment, and Operator manages resource states. Before managing and deploying application resources, connectors need to be created first. Connectors are used to connect to resource providers such as Kubernetes, AWS, and Alibaba Cloud; custom connectors can also be defined to manage other cloud resources.
App Manager manages the lifecycle of services, including creating, updating, and deleting services. It also controls the versions of application instances, which can likewise be created, updated, and deleted.
Deployer can automatically generate resource configuration files based on the application's configuration and deploy resources using Terraform. Deployer identifies the Provider defined in templates and generates resource configuration files based on the Connector configuration in the environment. Deployer enables resource deployment, creation, updates, deletion, rollback, and more.
Operator manages resource states, including viewing resource states and logs and executing terminal commands. For each resource type, Operator defines a set of operations and carries them out through the corresponding service provider's API. For example, for Kubernetes resources, Operator uses the Kubernetes API server to view resource states and logs, execute terminal commands, and more. Currently, Walrus supports operations for Kubernetes, AWS, and Alibaba Cloud resources, with support for more cloud providers' resource operations planned for the future.
How to deploy with Walrus
Before creating a service, you need to create a connector. You can create a Kubernetes connector and assign this connector to the default environment.
In this case, select the default project, create a default environment in the default project, and then create Kubernetes-related services in that environment. You can also create other types of connectors, such as AWS or Alibaba Cloud, depending on the cloud services you use. Walrus also supports multiple environments: you can create environments such as dev and prod according to your needs and manage different resources in each.
After finishing the environment configuration, it is time to create services. The module used by a service can be selected from the built-in templates, or you can define a custom template in module management to suit your own requirements. When creating a service, select the module and version and configure the template parameters for your service.
The following example creates a web application using the built-in webservice and mysql modules.
Create a database service in the default environment, fill in the module configuration and save it.
Once the database service has been created, create a web service in the default environment. Here you can inject the database service's connection information into the web service as environment variables, so that the program can access the database through them.
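At the Terraform level, injecting connection details as environment variables can be sketched as follows. This is not the configuration Walrus generates; the variable db_host, the environment variable name DB_HOST, and the web Deployment are hypothetical.

```hcl
variable "db_host" {
  type        = string
  description = "Address of the database service (hypothetical input)"
}

resource "kubernetes_deployment" "web" {
  metadata {
    name = "web"
  }

  spec {
    selector {
      match_labels = {
        app = "web"
      }
    }

    template {
      metadata {
        labels = {
          app = "web"
        }
      }

      spec {
        container {
          name  = "web"
          image = "nginx:latest" # placeholder image

          # The application reads the database address from DB_HOST
          env {
            name  = "DB_HOST"
            value = var.db_host
          }
        }
      }
    }
  }
}
```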
Deployer then automatically deploys the resources for the created instance based on the application configuration. After the deployment is done, Operator automatically synchronizes the state of the resources, and other operations such as viewing logs and opening a terminal are available from the instance detail page.
Resources within an environment can be viewed through the service list page, and Walrus provides service-level and environment-level dependency diagrams. These top-level environment views help developers better understand the business architecture. In the dependency diagram, you can clearly see the dependencies between different services and the resources they deploy. You can also access logs and terminals on the dependency diagram, making it easier for developers to view service logs and execute business-related commands, ultimately improving development efficiency.
Walrus also supports quick service creation, such as bulk cloning of services to different environments. Building on the previous example, we can create a new dev environment and quickly clone the database and web services into it. We can also carry their dependency relationships over into the new environment. Bulk cloning allows us to quickly create a development or testing environment for other developers or testers in the team.
With Walrus, resource management and deployment of business systems become simple, and developers without DevOps experience or even QA engineers can quickly manage and deploy resources on Walrus.
Besides Kubernetes, Walrus also supports other resource types, such as AWS and Alibaba Cloud. Support for these resources is enabled by defining the corresponding connector: you only need to create and configure the connector in the DevOps center. When creating a service, select the corresponding module, such as the built-in aws-rds or alicloud-rds, to complete the deployment and management of the corresponding cloud resources. Walrus also provides AIGC capabilities to help platform engineers quickly create infrastructure modules.
With the powerful ecosystem of Terraform, Walrus can support various resource types, and platform engineers can define different modules in the system to realize consistent resource management and deployment experience for different systems.
Conclusion
In the era of cloud-native computing, application deployment and management have become increasingly complex. Traditional operations and maintenance methods no longer suffice for rapid business iterations. How to enhance the productivity of development teams and reduce operational costs is a challenge faced by every enterprise. Terraform, as an open-source infrastructure as code tool, can help organizations achieve automated infrastructure management. However, Terraform alone is not a complete solution; it needs to be complemented by other tools to achieve resource governance and deployment.
Walrus is a resource management and deployment platform built on Terraform. It assists enterprises in automating resource management and deployment, thus boosting the productivity of development teams and reducing operational costs. With the support of Walrus, developers can free themselves from tedious operational tasks and focus on developing business solutions.