<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Vivek Dhami]]></title><description><![CDATA[I'm a technologist. I write about cloud-native architecture, Kubernetes, security, and the future of technology.]]></description><link>https://blog.vivekdhami.com</link><generator>RSS for Node</generator><lastBuildDate>Sun, 26 Apr 2026 16:52:14 GMT</lastBuildDate><atom:link href="https://blog.vivekdhami.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Terraform: Eliminating phantom diffs using ignore_changes and replace_triggered_by]]></title><description><![CDATA[The Problem: Write-Only Fields and Perpetual Diffs
A particularly common source of phantom diffs is Azure DevOps service endpoint resources. Passwords and API keys are write-only — the API accepts the]]></description><link>https://blog.vivekdhami.com/terraform-eliminating-phantom-diffs-using-ignore-changes-and-replace-triggered-by</link><guid isPermaLink="true">https://blog.vivekdhami.com/terraform-eliminating-phantom-diffs-using-ignore-changes-and-replace-triggered-by</guid><category><![CDATA[Terraform]]></category><category><![CDATA[Devops]]></category><category><![CDATA[Infrastructure as code]]></category><dc:creator><![CDATA[Vivek Dhami]]></dc:creator><pubDate>Fri, 20 Mar 2026 11:16:04 GMT</pubDate><content:encoded><![CDATA[<h2>The Problem: Write-Only Fields and Perpetual Diffs</h2>
<p>A particularly common source of phantom diffs is Azure DevOps service endpoint resources. Passwords and API keys are write-only — the API accepts them on create/update but never returns them on read, so state never holds the real value. Every <code>terraform plan</code> compares your configured secret against that empty state and reports a diff.</p>
<pre><code class="language-hcl">resource "azuredevops_serviceendpoint_nuget" "this" {
  project_id            = var.project_id
  service_endpoint_name = "nuget-feed"
  url                   = "https://artifacts.corp.example.com/repository/nuget-hosted/"
  username              = "deployer"
  password              = var.nuget_api_key
}
</code></pre>
<p>Every <code>terraform plan</code>, without fail:</p>
<pre><code># azuredevops_serviceendpoint_nuget.this will be updated in-place
~ resource "azuredevops_serviceendpoint_nuget" "this" {
      id                    = "..."
    ~ password              = (sensitive value)
      # (4 unchanged attributes hidden)
  }
</code></pre>
<h2>The Naive Fix (and Its Blind Spot)</h2>
<p>The obvious answer is <code>ignore_changes</code>:</p>
<pre><code class="language-hcl">resource "azuredevops_serviceendpoint_nuget" "this" {
  # ...
  password = var.nuget_api_key

  lifecycle {
    ignore_changes = [password]
  }
}
</code></pre>
<p>This kills the phantom diff, but creates a new problem: <strong>when you actually rotate the API key</strong>, Terraform silently ignores the change. You'd have to remember to manually run:</p>
<pre><code class="language-bash">terraform apply -replace="azuredevops_serviceendpoint_nuget.this"
</code></pre>
<p>That's fragile, easy to forget, and doesn't scale across dozens of service connections.</p>
<h2>The Better Fix: <code>ignore_changes</code> + <code>replace_triggered_by</code></h2>
<p>Combine both lifecycle features to get the best of both worlds — no phantom diffs during normal operation, but automatic recreation when the secret actually changes.</p>
<pre><code class="language-hcl"># A trigger resource that tracks the secret value.
# Only changes when the actual secret value changes.
resource "terraform_data" "nuget_api_key_version" {
  input = var.nuget_api_key
}

resource "azuredevops_serviceendpoint_nuget" "this" {
  project_id            = var.project_id
  service_endpoint_name = "nuget-feed"
  url                   = "https://artifacts.corp.example.com/repository/nuget-hosted/"
  username              = "deployer"
  password              = var.nuget_api_key

  lifecycle {
    ignore_changes = [password]
    replace_triggered_by = [
      terraform_data.nuget_api_key_version
    ]
  }
}
</code></pre>
<h2>How This Works</h2>
<ol>
<li><p><strong><code>ignore_changes = [password]</code></strong> eliminates the phantom diff on every plan — Terraform stops comparing the write-only password field against state.</p>
</li>
<li><p><strong><code>terraform_data.nuget_api_key_version</code></strong> stores the current value of the secret in state via its <code>input</code> attribute. It only reports a change when <code>var.nuget_api_key</code> <em>actually</em> changes.</p>
</li>
<li><p><strong><code>replace_triggered_by</code></strong> watches the trigger resource. When the secret rotates and <code>terraform_data</code> detects a real change, it forces a full replacement of the service endpoint — ensuring the new credential is applied.</p>
</li>
</ol>
<h3>Behavior Matrix</h3>
<table>
<thead>
<tr>
<th>Scenario</th>
<th>Behavior</th>
</tr>
</thead>
<tbody><tr>
<td>Normal plan, no changes</td>
<td>Clean plan, zero diffs</td>
</tr>
<tr>
<td>Secret rotated in <code>var.nuget_api_key</code></td>
<td><code>terraform_data</code> changes → service endpoint is <strong>replaced</strong> with new password</td>
</tr>
<tr>
<td>Other attributes changed (URL, name)</td>
<td>Normal in-place update, no interference</td>
</tr>
<tr>
<td><code>terraform plan</code> run repeatedly</td>
<td>Consistently clean — no phantom diff</td>
</tr>
</tbody></table>
<h2>Scaling the Pattern</h2>
<p>This pattern works for any write-only field: Docker registry passwords, Maven deploy tokens, PyPI API keys, generic service connection secrets, etc.</p>
<pre><code class="language-hcl"># --- Trigger resources (one per secret) ---

resource "terraform_data" "docker_password_version" {
  input = var.docker_password
}

resource "terraform_data" "maven_password_version" {
  input = var.maven_deploy_token
}

resource "terraform_data" "pypi_password_version" {
  input = var.pypi_api_token
}

resource "terraform_data" "npm_password_version" {
  input = var.npm_auth_token
}

# --- Service connections ---

resource "azuredevops_serviceendpoint_dockerregistry" "this" {
  project_id            = var.project_id
  service_endpoint_name = "docker-registry"
  docker_registry       = "https://registry.corp.example.com"
  docker_username       = "deployer"
  docker_password       = var.docker_password
  registry_type         = "Others"

  lifecycle {
    ignore_changes       = [docker_password]
    replace_triggered_by = [terraform_data.docker_password_version]
  }
}

resource "azuredevops_serviceendpoint_maven" "this" {
  project_id            = var.project_id
  service_endpoint_name = "maven-repo"
  url                   = "https://artifacts.corp.example.com/repository/maven-releases/"
  username              = "deployer"
  password              = var.maven_deploy_token

  lifecycle {
    ignore_changes       = [password]
    replace_triggered_by = [terraform_data.maven_password_version]
  }
}

resource "azuredevops_serviceendpoint_npm" "this" {
  project_id            = var.project_id
  service_endpoint_name = "npm-registry"
  url                   = "https://artifacts.corp.example.com/repository/npm-hosted/"
  access_token          = var.npm_auth_token

  lifecycle {
    ignore_changes       = [access_token]
    replace_triggered_by = [terraform_data.npm_password_version]
  }
}

resource "azuredevops_serviceendpoint_generic" "pypi" {
  project_id            = var.project_id
  service_endpoint_name = "pypi-repo"
  server_url            = "https://artifacts.corp.example.com/repository/pypi-hosted/"
  password              = var.pypi_api_token

  lifecycle {
    ignore_changes       = [password]
    replace_triggered_by = [terraform_data.pypi_password_version]
  }
}
</code></pre>
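<p>When the list of feeds grows further, the per-secret trigger resources can be generated with <code>for_each</code>. A minimal sketch, assuming a map-shaped variable (<code>generic_feeds</code> is an illustrative name, not part of the original config); <code>replace_triggered_by</code> accepts <code>each.key</code> indexing for <code>for_each</code> resources:</p>
<pre><code class="language-hcl"># Hypothetical input shape; adapt to your real feed definitions.
variable "generic_feeds" {
  type = map(object({
    url    = string
    secret = string
  }))
  sensitive = true
}

# One trigger per feed, keyed identically to the endpoints.
resource "terraform_data" "feed_secret_version" {
  for_each = var.generic_feeds
  input    = each.value.secret
}

resource "azuredevops_serviceendpoint_generic" "feed" {
  for_each              = var.generic_feeds
  project_id            = var.project_id
  service_endpoint_name = each.key
  server_url            = each.value.url
  password              = each.value.secret

  lifecycle {
    ignore_changes       = [password]
    replace_triggered_by = [terraform_data.feed_secret_version[each.key]]
  }
}
</code></pre>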
<h2>Modularizing the Pattern</h2>
<p>If you manage many service connections through a module, you can encapsulate the pattern:</p>
<pre><code class="language-hcl"># modules/service_connection_with_secret/main.tf

variable "project_id" {}
variable "endpoint_name" {}
variable "server_url" {}
variable "username" { default = "" }
variable "password" { sensitive = true }

resource "terraform_data" "password_version" {
  input = var.password
}

resource "azuredevops_serviceendpoint_generic" "this" {
  project_id            = var.project_id
  service_endpoint_name = var.endpoint_name
  server_url            = var.server_url
  username              = var.username
  password              = var.password

  lifecycle {
    ignore_changes       = [password]
    replace_triggered_by = [terraform_data.password_version]
  }
}

output "endpoint_id" {
  value = azuredevops_serviceendpoint_generic.this.id
}
</code></pre>
<p>Usage:</p>
<pre><code class="language-hcl">module "pypi_connection" {
  source        = "./modules/service_connection_with_secret"
  project_id    = azuredevops_project.this.id
  endpoint_name = "pypi-repo"
  server_url    = "https://artifacts.corp.example.com/repository/pypi-hosted/"
  password      = var.pypi_api_token
}
</code></pre>
<h2>Important Caveats</h2>
<h3>Replacement is Destructive</h3>
<p><code>replace_triggered_by</code> forces a <strong>destroy + create</strong>, not an in-place update. For service connections this is usually fine, but be aware that:</p>
<ul>
<li>The service endpoint ID changes on replacement</li>
<li>Pipelines referencing the endpoint by ID (not name) may briefly fail during apply</li>
<li>Any <code>azuredevops_resource_authorization</code> tied to the old endpoint ID needs to be recreated</li>
</ul>
<p>If zero-downtime is critical, add <code>create_before_destroy</code>. Note that this only helps if the provider allows the new and old endpoints to coexist briefly; where endpoint names must be unique within a project, the create step will fail on the name collision.</p>
<pre><code class="language-hcl">  lifecycle {
    ignore_changes         = [password]
    replace_triggered_by   = [terraform_data.nuget_api_key_version]
    create_before_destroy  = true
  }
</code></pre>
<h3>Sensitive Values in <code>terraform_data</code></h3>
<p>The <code>terraform_data</code> resource stores its <code>input</code> in state. If your state backend isn't encrypted, the secret will be visible in the state file. Ensure you:</p>
<ul>
<li>Use an encrypted state backend (Azure Blob with encryption, S3 with SSE, etc.)</li>
<li>Restrict state file access with proper IAM/RBAC policies</li>
<li>Consider using <code>sensitive = true</code> on the variable to suppress console output</li>
</ul>
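<p>To keep the raw secret out of the trigger's state entry, one variation (a sketch, not required by the pattern) is to store a hash of the secret; the digest still changes exactly when the key does:</p>
<pre><code class="language-hcl">resource "terraform_data" "nuget_api_key_version" {
  # Only the SHA-256 digest of the key lands in state.
  input = sha256(var.nuget_api_key)
}
</code></pre>
<p>The service endpoint resource itself may still record the password in state (that's provider-dependent), so an encrypted backend remains essential either way.</p>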
<h3>When NOT to Use This Pattern</h3>
<ul>
<li><strong>The provider correctly handles the field.</strong> Some providers do read secrets back from the API — no phantom diff means no need for this pattern.</li>
<li><strong>Terraform v1.11+ write-only arguments.</strong> If your provider exposes a write-only variant of the secret field, use that instead — it's the native solution to this problem.</li>
<li><strong>You want in-place updates, not replacements.</strong> If the resource is expensive to recreate (e.g., databases with data), the replacement semantics of <code>replace_triggered_by</code> may be too aggressive.</li>
</ul>
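<p>For comparison, here's what the write-only style looks like. This is illustrative only: it uses the <code>_wo</code>/<code>_wo_version</code> convention some HashiCorp providers have adopted (for example, the AWS provider's <code>aws_db_instance</code>), so check whether your provider exposes an equivalent before relying on it:</p>
<pre><code class="language-hcl">resource "aws_db_instance" "example" {
  # ... other required arguments elided ...
  password_wo         = var.db_password # never persisted to state
  password_wo_version = 1               # bump this to send a new value
}
</code></pre>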
<h2>Summary</h2>
<table>
<thead>
<tr>
<th>Approach</th>
<th>Phantom Diff?</th>
<th>Detects Secret Rotation?</th>
<th>Automation</th>
</tr>
</thead>
<tbody><tr>
<td>No lifecycle rules</td>
<td>Yes (every plan)</td>
<td>Yes</td>
<td>Automatic</td>
</tr>
<tr>
<td><code>ignore_changes</code> only</td>
<td>No</td>
<td>No</td>
<td>Manual <code>-replace</code> needed</td>
</tr>
<tr>
<td><code>ignore_changes</code> + <code>replace_triggered_by</code></td>
<td>No</td>
<td>Yes</td>
<td>Fully automatic</td>
</tr>
</tbody></table>
<p>The <code>ignore_changes</code> + <code>replace_triggered_by</code> pattern gives you clean plans <em>and</em> automatic secret rotation handling. It's the practical middle ground until write-only attributes become widespread across providers.</p>
]]></content:encoded></item><item><title><![CDATA[Managing Terraform Phantom Diffs: A Practical Guide]]></title><description><![CDATA[What Are Phantom Diffs?
Phantom diffs — also called perpetual diffs or ghost changes — are changes that appear in your terraform plan output every single run, even when you haven't modified any code o]]></description><link>https://blog.vivekdhami.com/managing-terraform-phantom-diffs-a-practical-guide</link><guid isPermaLink="true">https://blog.vivekdhami.com/managing-terraform-phantom-diffs-a-practical-guide</guid><category><![CDATA[Terraform]]></category><category><![CDATA[Devops]]></category><category><![CDATA[Infrastructure as code]]></category><dc:creator><![CDATA[Vivek Dhami]]></dc:creator><pubDate>Fri, 20 Mar 2026 11:06:53 GMT</pubDate><content:encoded><![CDATA[<h2>What Are Phantom Diffs?</h2>
<p>Phantom diffs — also called <strong>perpetual diffs</strong> or <strong>ghost changes</strong> — are changes that appear in your <code>terraform plan</code> output every single run, even when you haven't modified any code or infrastructure. They create noise, erode trust in your plans, and can mask real changes hiding among the false positives.</p>
<pre><code># azurerm_resource_group.example will be updated in-place
~ resource "azurerm_resource_group" "example" {
      id       = "/subscriptions/.../resourceGroups/my-rg"
      name     = "my-rg"
    ~ tags     = {
        - "created_by" = "terraform" -&gt; null
      }
  }
</code></pre>
<p>You stare at the diff. You didn't change anything. You run <code>plan</code> again — same diff. Welcome to the world of phantom diffs.</p>
<hr />
<h2>Why Do Phantom Diffs Happen?</h2>
<p>There are several root causes, and understanding them is the key to fixing them.</p>
<h3>1. Provider Defaults and API Normalization</h3>
<p>Cloud APIs often return values in a different format than what you send. The provider compares your config to the API response and sees a "difference" every time.</p>
<p><strong>Example: Case sensitivity</strong></p>
<pre><code class="language-hcl">resource "azurerm_resource_group" "example" {
  name     = "My-Resource-Group"
  location = "East US"
}
</code></pre>
<p>Azure might normalize <code>location</code> to <code>eastus</code> internally. Every plan, Terraform sees <code>"East US"</code> in your config vs <code>"eastus"</code> in state, and reports a diff.</p>
<p><strong>Fix:</strong> Match the API's canonical form:</p>
<pre><code class="language-hcl">resource "azurerm_resource_group" "example" {
  name     = "My-Resource-Group"
  location = "eastus"
}
</code></pre>
<h3>2. Computed Attributes Conflicting with Config</h3>
<p>Some attributes are both configurable <em>and</em> computed by the provider. If you set a value that the API overrides or augments, you'll get a perpetual diff.</p>
<p><strong>Example: Tags merged by Azure Policy</strong></p>
<pre><code class="language-hcl">resource "azurerm_resource_group" "example" {
  name     = "my-rg"
  location = "eastus"

  tags = {
    environment = "dev"
  }
}
</code></pre>
<p>If an Azure Policy automatically adds <code>"created_by" = "policy"</code>, every plan shows:</p>
<pre><code>~ tags = {
    - "created_by" = "policy" -&gt; null    # Terraform wants to remove this
      "environment" = "dev"
  }
</code></pre>
<p>Terraform wants to enforce <em>your</em> declared state, which doesn't include that tag.</p>
<p><strong>Fix:</strong> Use <code>ignore_changes</code> in a lifecycle block:</p>
<pre><code class="language-hcl">resource "azurerm_resource_group" "example" {
  name     = "my-rg"
  location = "eastus"

  tags = {
    environment = "dev"
  }

  lifecycle {
    ignore_changes = [tags["created_by"]]
  }
}
</code></pre>
<p>Or, if the external system adds many unpredictable tags:</p>
<pre><code class="language-hcl">  lifecycle {
    ignore_changes = [tags]
  }
</code></pre>
<h3>3. Attribute Ordering and Serialization</h3>
<p>Some resources contain list or set attributes where order is semantically irrelevant but syntactically significant to Terraform.</p>
<p><strong>Example: Security group rules</strong></p>
<pre><code class="language-hcl">resource "aws_security_group" "example" {
  name = "my-sg"

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/8", "172.16.0.0/12"]
  }
}
</code></pre>
<p>If the AWS API returns <code>cidr_blocks</code> as <code>["172.16.0.0/12", "10.0.0.0/8"]</code>, you get a phantom diff every run.</p>
<p><strong>Fix:</strong> Sort your values to match API return order, or use <code>ignore_changes</code>:</p>
<pre><code class="language-hcl">  cidr_blocks = ["172.16.0.0/12", "10.0.0.0/8"]  # Match API order
</code></pre>
<h3>4. Default Values Applied Server-Side</h3>
<p>Resources often have optional attributes that the API fills in with defaults. If the provider reads these back but you didn't set them, Terraform may show an unwanted diff.</p>
<p><strong>Example: AWS Launch Template</strong></p>
<pre><code class="language-hcl">resource "aws_launch_template" "example" {
  name_prefix   = "my-lt-"
  image_id      = "ami-0abcdef1234567890"
  instance_type = "t3.micro"
}
</code></pre>
<p>The API might populate <code>metadata_options</code> with defaults. Next plan:</p>
<pre><code>~ metadata_options {
    - http_endpoint               = "enabled" -&gt; null
    - http_put_response_hop_limit = 1 -&gt; null
    - http_tokens                 = "optional" -&gt; null
  }
</code></pre>
<p><strong>Fix:</strong> Explicitly declare the defaults in your config:</p>
<pre><code class="language-hcl">resource "aws_launch_template" "example" {
  name_prefix   = "my-lt-"
  image_id      = "ami-0abcdef1234567890"
  instance_type = "t3.micro"

  metadata_options {
    http_endpoint               = "enabled"
    http_put_response_hop_limit = 1
    http_tokens                 = "optional"
  }
}
</code></pre>
<h3>5. Sensitive or Computed-Only Values in State</h3>
<p>Some providers mark attributes as sensitive or don't store them in state properly. This causes Terraform to believe the value is missing or changed.</p>
<p><strong>Example: Passwords and secrets</strong></p>
<pre><code class="language-hcl">resource "azuredevops_serviceendpoint_generic" "example" {
  project_id            = azuredevops_project.example.id
  service_endpoint_name = "my-endpoint"
  server_url            = "https://example.com"
  password              = "my-secret"
}
</code></pre>
<p>If the provider can't read the password back from the API (it's write-only), every plan shows:</p>
<pre><code>~ password = (sensitive value)
</code></pre>
<p><strong>Fix:</strong> Use <code>ignore_changes</code> for write-only secrets:</p>
<pre><code class="language-hcl">  lifecycle {
    ignore_changes = [password]
  }
</code></pre>
<blockquote>
<p><strong>Note:</strong> Terraform v1.10 introduced ephemeral values, and v1.11 added write-only arguments, which handle this pattern more elegantly where providers support them.</p>
</blockquote>
<h3>6. Timestamp and Auto-Generated Fields</h3>
<p>Resources that include timestamps (<code>last_modified</code>, <code>updated_at</code>) or auto-generated IDs that change on read will always show diffs.</p>
<p><strong>Example:</strong></p>
<pre><code>~ last_modified = "2026-03-19T10:00:00Z" -&gt; "2026-03-20T08:30:00Z"
</code></pre>
<p><strong>Fix:</strong></p>
<pre><code class="language-hcl">  lifecycle {
    ignore_changes = [last_modified]
  }
</code></pre>
<h3>7. Inconsistent <code>jsonencode</code> / JSON Formatting</h3>
<p>When you pass JSON as a string (e.g., IAM policies, API definitions), whitespace or key-ordering differences between your config and the API cause phantom diffs.</p>
<p><strong>Example:</strong></p>
<pre><code class="language-hcl">resource "aws_iam_role" "example" {
  name = "my-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}
</code></pre>
<p>The API may return the policy with different key ordering or whitespace, causing:</p>
<pre><code>~ assume_role_policy = jsonencode(
    ~ {
        - Statement = [...]
        + Statement = [...]  # Same content, different formatting
      }
  )
</code></pre>
<p><strong>Fix:</strong> Use the dedicated policy resources or data sources:</p>
<pre><code class="language-hcl">data "aws_iam_policy_document" "assume_role" {
  statement {
    actions = ["sts:AssumeRole"]
    effect  = "Allow"
    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "example" {
  name               = "my-role"
  assume_role_policy = data.aws_iam_policy_document.assume_role.json
}
</code></pre>
<hr />
<h2>Strategies for Handling Phantom Diffs</h2>
<h3>Strategy 1: <code>lifecycle { ignore_changes = [...] }</code></h3>
<p>The most common and direct fix. Use it when an external system modifies attributes outside Terraform's control.</p>
<pre><code class="language-hcl">resource "kubernetes_deployment" "example" {
  metadata {
    name = "my-app"
  }

  lifecycle {
    ignore_changes = [
      metadata[0].annotations["kubectl.kubernetes.io/last-applied-configuration"],
      spec[0].template[0].metadata[0].annotations,
    ]
  }
}
</code></pre>
<p><strong>When to use:</strong> External systems (policies, operators, other tools) modify resources after Terraform applies them.</p>
<p><strong>When NOT to use:</strong> You're hiding a real configuration problem. <code>ignore_changes</code> is a scalpel, not a sledgehammer.</p>
<h3>Strategy 2: Match the API's Canonical Form</h3>
<p>Before reaching for <code>ignore_changes</code>, check if you can simply adjust your config to match what the API returns.</p>
<pre><code class="language-bash"># Check what the API actually stores:
terraform show -json | jq '.values.root_module.resources[] | select(.address == "aws_s3_bucket.example")'
</code></pre>
<p>Then update your config to match. This is the cleanest solution.</p>
<h3>Strategy 3: Use <code>terraform state show</code> to Inspect</h3>
<pre><code class="language-bash">terraform state show 'module.my_module.azuredevops_serviceendpoint_nuget.this'
</code></pre>
<p>Compare the state values with your config to identify exactly which attribute drifts.</p>
<h3>Strategy 4: Pin Provider Versions</h3>
<p>Provider updates can introduce or fix phantom diffs. Always pin your versions:</p>
<pre><code class="language-hcl">terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~&gt; 3.85.0"
    }
  }
}
</code></pre>
<p>Check provider changelogs — phantom diff fixes are common in patch releases.</p>
<h3>Strategy 5: Use <code>replace_triggered_by</code> Instead of Fighting Diffs</h3>
<p>Sometimes a resource will always show a diff because a dependency changed. Rather than suppress it, embrace it:</p>
<pre><code class="language-hcl">resource "null_resource" "config_reload" {
  lifecycle {
    replace_triggered_by = [
      azuredevops_variable_group.secrets
    ]
  }
}
</code></pre>
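<p>On Terraform v1.4+, the built-in <code>terraform_data</code> resource covers this without pulling in the null provider. A sketch reusing the same names; <code>triggers_replace</code> here keys off the variable group's ID, so pick whichever attribute should actually drive the reload:</p>
<pre><code class="language-hcl">resource "terraform_data" "config_reload" {
  # Replaced whenever the referenced value changes.
  triggers_replace = [azuredevops_variable_group.secrets.id]
}
</code></pre>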
<h3>Strategy 6: Automate Phantom Diff Detection in CI/CD</h3>
<p>Add a check to your pipeline that flags plans with <em>only</em> known phantom diffs:</p>
<pre><code class="language-bash">#!/bin/bash
terraform plan -detailed-exitcode -out=plan.tfplan 2&gt;&amp;1
EXIT_CODE=$?

if [ $EXIT_CODE -eq 2 ]; then
  # Changes detected — check if they're all known phantoms
  terraform show -json plan.tfplan | jq -r '
    .resource_changes[]
    | select(.change.actions != ["no-op"])
    | .address
  ' &gt; changed_resources.txt

  KNOWN_PHANTOMS="azuredevops_serviceendpoint_generic.password_reset"

  while read -r resource; do
    if ! echo "$KNOWN_PHANTOMS" | grep -qF "$resource"; then
      echo "REAL CHANGE DETECTED: $resource"
      exit 2
    fi
  done &lt; changed_resources.txt

  echo "Only phantom diffs detected — safe to skip apply."
  exit 0
fi
</code></pre>
<hr />
<h2>Decision Framework</h2>
<table>
<thead>
<tr>
<th>Symptom</th>
<th>Root Cause</th>
<th>Fix</th>
</tr>
</thead>
<tbody><tr>
<td>Case/format differences</td>
<td>API normalization</td>
<td>Match canonical form</td>
</tr>
<tr>
<td>External tags/annotations</td>
<td>External system modification</td>
<td><code>ignore_changes</code> on specific keys</td>
</tr>
<tr>
<td>Sensitive values always changing</td>
<td>Write-only API fields</td>
<td><code>ignore_changes</code> on secret fields</td>
</tr>
<tr>
<td>JSON/policy reformatting</td>
<td>Serialization differences</td>
<td>Use dedicated data sources</td>
</tr>
<tr>
<td>Ordering changes in lists</td>
<td>Non-deterministic API responses</td>
<td>Sort values or use <code>ignore_changes</code></td>
</tr>
<tr>
<td>New attributes appearing after upgrade</td>
<td>Provider version change</td>
<td>Pin provider, set explicit defaults</td>
</tr>
</tbody></table>
<hr />
<h2>Key Takeaways</h2>
<ol>
<li><strong>Diagnose before suppressing.</strong> Use <code>terraform state show</code> and <code>terraform show -json</code> to understand <em>why</em> a diff appears before reaching for <code>ignore_changes</code>.</li>
<li><strong><code>ignore_changes</code> is a trade-off.</strong> It solves the noise problem but creates a blind spot — Terraform will never manage that attribute again until you remove the lifecycle rule.</li>
<li><strong>Match the API, not the docs.</strong> The API's canonical representation is the source of truth. Adjust your config to match it.</li>
<li><strong>Pin provider versions.</strong> Phantom diff behavior can change between provider versions. Pin versions and upgrade deliberately.</li>
<li><strong>Automate detection.</strong> In CI/CD, distinguish phantom diffs from real changes to prevent pipeline fatigue and rubber-stamp approvals.</li>
<li><strong>Report bugs upstream.</strong> Many phantom diffs are provider bugs. File issues — the maintainers fix them regularly.</li>
</ol>
<p>Phantom diffs are an inevitable part of working with Terraform at scale, but they don't have to be a source of constant frustration. A methodical approach to diagnosing and resolving them keeps your plans clean and trustworthy.</p>
]]></content:encoded></item><item><title><![CDATA[Understanding WASM and WASI: A complete guide]]></title><description><![CDATA[If you've been following the infrastructure and cloud-native space, you've probably heard whispers about WebAssembly (WASM) being the "next big thing" beyond just browser sandboxing. And honestly? The hype isn't entirely unfounded. When you pair WASM...]]></description><link>https://blog.vivekdhami.com/understanding-wasm-and-wasi-a-complete-guide</link><guid isPermaLink="true">https://blog.vivekdhami.com/understanding-wasm-and-wasi-a-complete-guide</guid><category><![CDATA[wasm]]></category><category><![CDATA[WebAssembly]]></category><dc:creator><![CDATA[Vivek Dhami]]></dc:creator><pubDate>Mon, 04 Aug 2025 22:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/8OyKWQgBsKQ/upload/89483122fb0cf9c4f92ffc454f324b41.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you've been following the infrastructure and cloud-native space, you've probably heard whispers about WebAssembly (WASM) being the "next big thing" beyond just browser sandboxing. And honestly? The hype isn't entirely unfounded. When you pair WASM with WASI (the WebAssembly System Interface), you get a genuinely compelling alternative to traditional container runtimes in certain scenarios.</p>
<p>This post breaks down what WASM and WASI actually are, why they're gaining traction in backend and edge computing, and where they fit in your infrastructure toolkit.</p>
<h2 id="heading-the-problem-runtime-portability-is-still-messy">THE PROBLEM: RUNTIME PORTABILITY IS STILL MESSY</h2>
<p>Let's be real—containers solved a lot of problems. Docker gave us "build once, run anywhere" at the OS level. But containers still carry baggage: you're shipping an entire userspace, you need a container runtime (Docker, containerd, CRI-O), and you're still fundamentally tied to the kernel your container host is running.</p>
<p>For lightweight, short-lived workloads—think serverless functions, edge compute, or plugin architectures—spinning up a full container can be overkill. You're paying the startup cost, the memory overhead, and the attack surface of a Linux userspace even when your actual code is a 2MB binary.</p>
<p>WASM offers a different trade-off: near-native performance, sub-millisecond cold starts, and true portability across architectures and operating systems. But to really unlock WASM outside the browser, you need WASI.</p>
<h2 id="heading-what-is-webassembly-wasm">WHAT IS WEBASSEMBLY (WASM)?</h2>
<p>WebAssembly is a low-level bytecode format designed to be a compilation target for higher-level languages. Originally created to let languages like C, C++, and Rust run in the browser at near-native speeds, WASM has evolved into a portable runtime standard.</p>
<h3 id="heading-key-characteristics">Key Characteristics</h3>
<ul>
<li><p><strong>Binary instruction format</strong>: Compact, fast to decode and execute</p>
</li>
<li><p><strong>Stack-based virtual machine</strong>: Executes in a sandboxed environment</p>
</li>
<li><p><strong>Language-agnostic</strong>: You can compile C, Rust, Go, C++, even Python to WASM</p>
</li>
<li><p><strong>Portable</strong>: The same <code>.wasm</code> binary runs on x86, ARM, in the browser, or on the server</p>
</li>
<li><p><strong>Secure by default</strong>: Runs in a memory-safe sandbox with no default access to the host system</p>
</li>
</ul>
<p>Here's a simple example. Take this Rust function:</p>
<pre><code class="lang-rust"><span class="hljs-meta">#[no_mangle]</span>
<span class="hljs-keyword">pub</span> <span class="hljs-keyword">extern</span> <span class="hljs-string">"C"</span> <span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">add</span></span>(a: <span class="hljs-built_in">i32</span>, b: <span class="hljs-built_in">i32</span>) -&gt; <span class="hljs-built_in">i32</span> {
    a + b
}
</code></pre>
<p>Compile it to WASM:</p>
<pre><code class="lang-bash">rustc --target wasm32-unknown-unknown -O add.rs
</code></pre>
<p>You now have a <code>.wasm</code> module that can execute in any WASM runtime—browser, Wasmtime, Wasmer, or even embedded systems. No OS dependencies, no libc mismatches, just the bytecode.</p>
<h2 id="heading-what-is-wasi-webassembly-system-interface">WHAT IS WASI (WEBASSEMBLY SYSTEM INTERFACE)?</h2>
<p>WASM by itself is intentionally isolated—it has no concept of filesystems, network sockets, or environment variables. That's great for browser security, but it's limiting if you want to use WASM for backend services or CLI tools.</p>
<p>Enter WASI: a standardized system interface for WASM modules. Think of it as a "POSIX for WASM"—a capability-based API that lets WASM code interact with the host system in a controlled, secure way.</p>
<h3 id="heading-core-wasi-capabilities">Core WASI Capabilities</h3>
<p>WASI provides access to:</p>
<ul>
<li><p><strong>File I/O</strong>: Read and write files via preopened directories</p>
</li>
<li><p><strong>Networking</strong>: Socket APIs for TCP/UDP (still evolving)</p>
</li>
<li><p><strong>Environment variables</strong>: Access to configuration</p>
</li>
<li><p><strong>Command-line arguments</strong>: Standard argc/argv handling</p>
</li>
<li><p><strong>Clocks and random numbers</strong>: System time and cryptographic randomness</p>
</li>
</ul>
<p>The key difference from traditional system calls? <strong>Capabilities</strong>. Instead of a process having blanket access to the filesystem, you explicitly grant access to specific directories. It's the principle of least privilege baked into the runtime.</p>
<h3 id="heading-wasi-in-action">WASI in Action</h3>
<p>Let's look at a Rust program using WASI to read a file:</p>
<pre><code class="lang-rust"><span class="hljs-keyword">use</span> std::fs;

<span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">main</span></span>() {
    <span class="hljs-comment">// This only works if the runtime preopens access to the directory</span>
    <span class="hljs-keyword">let</span> contents = fs::read_to_string(<span class="hljs-string">"/data/config.json"</span>)
        .expect(<span class="hljs-string">"Failed to read file"</span>);
​
    <span class="hljs-built_in">println!</span>(<span class="hljs-string">"Config: {}"</span>, contents);
}
</code></pre>
<p>Compile it to WASM with WASI support:</p>
<pre><code class="lang-bash">cargo build --target wasm32-wasi --release
</code></pre>
<p>Run it using Wasmtime (a WASM runtime):</p>
<pre><code class="lang-bash">wasmtime --dir=/host/path/to/data:/data target/wasm32-wasi/release/myapp.wasm
</code></pre>
<p>The <code>--dir</code> flag explicitly maps <code>/host/path/to/data</code> on the host to <code>/data</code> inside the WASM sandbox. Without that flag, the WASM module can't access the filesystem at all. This is capability-based security in action.</p>
<h2 id="heading-why-wasm-wasi-matters-for-infrastructure">WHY WASM + WASI MATTERS FOR INFRASTRUCTURE</h2>
<p>Alright, so WASM is portable and WASI makes it practical. Why should you care as someone running infrastructure?</p>
<h3 id="heading-1-cold-start-performance">1. Cold Start Performance</h3>
<p>WASM modules start in microseconds, not milliseconds or seconds. For serverless and edge computing, this is huge. Compare:</p>
<ul>
<li><p><strong>Container cold start</strong>: 100ms-1s (includes pulling layers, initializing userspace)</p>
</li>
<li><p><strong>WASM cold start</strong>: &lt;1ms (just instantiate the module)</p>
</li>
</ul>
<p>If you're running ephemeral workloads with high request-per-second variability, WASM can dramatically reduce idle resource consumption.</p>
<h3 id="heading-2-smaller-footprint">2. Smaller Footprint</h3>
<p>A typical container image for a Go or Rust service might be 50-200MB. The equivalent WASM binary? Often under 5MB. Less to transfer, less to store, less to scan for vulnerabilities.</p>
<h3 id="heading-3-true-multi-arch-support">3. True Multi-Arch Support</h3>
<p>Compile once to WASM, run on x86, ARM64, RISC-V, whatever. No more maintaining separate container images for different architectures or dealing with emulation layers.</p>
<h3 id="heading-4-enhanced-security-posture">4. Enhanced Security Posture</h3>
<p>WASM's sandbox is more restrictive than a container's by default. No filesystem access, no network, no syscalls unless explicitly granted via WASI capabilities. This makes it attractive for multi-tenant environments and plugin systems where you're running untrusted code.</p>
<h2 id="heading-the-wasm-runtime-ecosystem">THE WASM RUNTIME ECOSYSTEM</h2>
<p>To run WASM outside the browser, you need a runtime. Here are the main players:</p>
<h3 id="heading-wasmtime">Wasmtime</h3>
<p>Developed by the Bytecode Alliance, Wasmtime is a fast, secure WASM and WASI runtime. It's the reference implementation for WASI and integrates well with Rust, C, and Python.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Install Wasmtime</span>
curl https://wasmtime.dev/install.sh -sSf | bash
​
<span class="hljs-comment"># Run a WASM module</span>
wasmtime myapp.wasm
</code></pre>
<h3 id="heading-wasmer">Wasmer</h3>
<p>Another popular runtime with a focus on server-side WASM. Wasmer supports WASI and has a package manager (WAPM) for distributing WASM modules.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Install Wasmer</span>
curl https://get.wasmer.io -sSfL | sh
​
<span class="hljs-comment"># Run with filesystem access</span>
wasmer run myapp.wasm --dir=./data
</code></pre>
<h3 id="heading-wasmedge">WasmEdge</h3>
<p>Optimized for edge computing and cloud-native workloads. It's particularly fast and has Kubernetes integration via crun and containerd shims.</p>
<h3 id="heading-spin-fermyon">Spin (Fermyon)</h3>
<p>A framework specifically for building WASM-based microservices. Think of it as a lightweight alternative to running services in containers.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Install Spin</span>
curl -fsSL https://developer.fermyon.com/downloads/install.sh | bash
​
<span class="hljs-comment"># Create a new Spin app</span>
spin new http-rust my-service
<span class="hljs-built_in">cd</span> my-service
spin build
spin up  <span class="hljs-comment"># Runs locally, starts in milliseconds</span>
</code></pre>
<h2 id="heading-how-wasm-fits-into-your-stack">HOW WASM FITS INTO YOUR STACK</h2>
<p>Let's visualize where WASM sits relative to traditional deployments:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770331516839/d08b9480-a313-4235-b36c-0a52989986f3.png" alt class="image--center mx-auto" /></p>
<p>WASM isn't replacing containers across the board. It's carving out use cases where its characteristics—fast startup, small size, strong isolation—provide meaningful advantages.</p>
<h2 id="heading-real-world-use-cases">REAL-WORLD USE CASES</h2>
<h3 id="heading-edge-computing">Edge Computing</h3>
<p>Deploying compute to the edge (CDN nodes, IoT gateways) benefits massively from WASM's portability and low overhead. Cloudflare Workers, Fastly Compute@Edge, and Fermyon Cloud all use WASM under the hood.</p>
<h3 id="heading-plugin-systems">Plugin Systems</h3>
<p>If you're building software that supports user-provided extensions (think VS Code plugins, database UDFs, or API middleware), WASM provides a safe sandbox. The host can grant specific capabilities without risking arbitrary code execution.</p>
<h3 id="heading-serverless-functions">Serverless Functions</h3>
<p>WASM-based FaaS platforms (like Spin or Wasmer Edge) can scale to zero more aggressively and respond faster than traditional container-based Lambda.</p>
<h3 id="heading-polyglot-microservices">Polyglot Microservices</h3>
<p>Run services written in Rust, Go, C++, and AssemblyScript side-by-side in a WASM runtime without worrying about language-specific runtimes or dependency conflicts.</p>
<h2 id="heading-wasm-in-kubernetes">WASM IN KUBERNETES</h2>
<p>Yes, you can run WASM workloads in Kubernetes. Projects like <strong>runwasi</strong> and <strong>containerd-wasm-shims</strong> let you treat WASM modules as container images.</p>
<p>Example: Running a WASM workload with containerd</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Pod</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">wasm-demo</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">runtimeClassName:</span> <span class="hljs-string">wasmtime</span>  <span class="hljs-comment"># Use WASM runtime instead of runc</span>
  <span class="hljs-attr">containers:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">wasm-app</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">myregistry/wasm-app:latest</span>
    <span class="hljs-attr">command:</span> [<span class="hljs-string">"/app.wasm"</span>]
</code></pre>
<p>Under the hood, containerd uses a WASM runtime shim to execute the module instead of spawning a traditional container. You get Kubernetes orchestration with WASM's performance characteristics.</p>
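<p>For <code>runtimeClassName</code> to resolve, the cluster also needs a matching RuntimeClass object whose handler points at the installed shim. A minimal sketch; the handler name depends on which shim you registered with containerd, and <code>wasmtime</code> here is an assumption:</p>

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: wasmtime    # referenced by pods via runtimeClassName
handler: wasmtime   # must match the runtime name registered in containerd's config
```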
<h2 id="heading-limitations-and-gotchas">LIMITATIONS AND GOTCHAS</h2>
<p>WASM and WASI aren't a silver bullet. Here's where they fall short (for now):</p>
<h3 id="heading-limited-language-support">Limited Language Support</h3>
<p>While Rust, C/C++, and AssemblyScript have excellent WASM support, languages with heavy runtimes (Java, Python, Ruby) produce large binaries or have limited functionality. Go's WASM support is improving but still has constraints around goroutines and syscalls.</p>
<h3 id="heading-wasi-spec-is-still-evolving">WASI Spec Is Still Evolving</h3>
<p>Networking support in WASI is incomplete. Socket APIs exist but aren't finalized. If your app does heavy network I/O, you might hit limitations.</p>
<h3 id="heading-ecosystem-maturity">Ecosystem Maturity</h3>
<p>Compared to Docker and containers, the WASM ecosystem is younger. Tooling, debugging, and observability aren't as polished. You'll encounter rough edges.</p>
<h3 id="heading-not-a-container-replacement-yet">Not a Container Replacement (Yet)</h3>
<p>For long-running stateful services with complex dependencies, containers are still the safer bet. WASM shines for stateless, ephemeral, or compute-bound workloads.</p>
<h2 id="heading-getting-started-a-practical-example">GETTING STARTED: A PRACTICAL EXAMPLE</h2>
<p>Let's build a simple HTTP service in Rust, compile it to WASM, and run it locally.</p>
<h3 id="heading-step-1-create-a-new-spin-app">Step 1: Create a New Spin App</h3>
<pre><code class="lang-bash">spin new http-rust hello-wasm
<span class="hljs-built_in">cd</span> hello-wasm
</code></pre>
<h3 id="heading-step-2-edit-srclibrshttplibrs">Step 2: Edit <code>src/lib.rs</code></h3>
<pre><code class="lang-rust"><span class="hljs-keyword">use</span> anyhow::<span class="hljs-built_in">Result</span>;
<span class="hljs-keyword">use</span> spin_sdk::{
    http::{Request, Response},
    http_component,
};

<span class="hljs-meta">#[http_component]</span>
<span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">handle_request</span></span>(_req: Request) -&gt; <span class="hljs-built_in">Result</span>&lt;Response&gt; {
    <span class="hljs-literal">Ok</span>(http::Response::builder()
        .status(<span class="hljs-number">200</span>)
        .header(<span class="hljs-string">"Content-Type"</span>, <span class="hljs-string">"application/json"</span>)
        .body(<span class="hljs-literal">Some</span>(<span class="hljs-string">r#"{"message": "Hello from WASM!"}"#</span>.into()))?)
}
</code></pre>
<h3 id="heading-step-3-build-and-run">Step 3: Build and Run</h3>
<pre><code class="lang-bash">spin build
spin up
</code></pre>
<p>Hit <a target="_blank" href="http://localhost:3000"><code>http://localhost:3000</code></a> and you'll see your JSON response. The entire service compiles to a few MB and starts in under 10ms.</p>
<h3 id="heading-step-4-deploy-to-the-edge-optional">Step 4: Deploy to the Edge (Optional)</h3>
<pre><code class="lang-bash">spin deploy  <span class="hljs-comment"># Pushes to Fermyon Cloud or your configured WASM platform</span>
</code></pre>
<p>No Dockerfile, no Kubernetes manifest (unless you want it), no container registry juggling. Just a portable WASM module running at the edge.</p>
<h2 id="heading-wrapping-up">WRAPPING UP</h2>
<p>WebAssembly and WASI represent a genuine shift in how we think about runtime portability and isolation. They're not going to replace containers overnight, but for specific workloads—edge functions, plugin systems, fast-scaling microservices—they offer compelling advantages.</p>
<p><strong>Key takeaways:</strong></p>
<ul>
<li><p><strong>WASM</strong> is a portable, secure bytecode format with near-native performance</p>
</li>
<li><p><strong>WASI</strong> provides a standardized system interface, making WASM practical outside the browser</p>
</li>
<li><p><strong>Cold start times</strong> and <strong>binary sizes</strong> are orders of magnitude better than containers</p>
</li>
<li><p><strong>Capability-based security</strong> gives you fine-grained control over what code can access</p>
</li>
<li><p><strong>Ecosystem is maturing</strong> fast, but still has gaps in networking, language support, and tooling</p>
</li>
</ul>
<p>If you're building for serverless, edge, or multi-tenant environments, it's worth experimenting with WASM. Start small—convert a stateless API endpoint or a CLI tool—and see how it fits your stack.</p>
<p>And if you're already running Kubernetes, check out the WASM runtime shims. You might be surprised how easily WASM slots into your existing orchestration layer.</p>
]]></content:encoded></item><item><title><![CDATA[MicroVMs: the security-first alternative to containers]]></title><description><![CDATA[If you've ever deployed untrusted workloads—think CI/CD runners, serverless functions, or multi-tenant SaaS—you've probably lost sleep over container escape vulnerabilities. Containers share the kernel with the host, and that shared boundary is a con...]]></description><link>https://blog.vivekdhami.com/microvms-the-security-first-alternative-to-containers</link><guid isPermaLink="true">https://blog.vivekdhami.com/microvms-the-security-first-alternative-to-containers</guid><category><![CDATA[Security]]></category><category><![CDATA[containers]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[virtualization]]></category><category><![CDATA[DevSecOps]]></category><dc:creator><![CDATA[Vivek Dhami]]></dc:creator><pubDate>Mon, 05 May 2025 22:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/UMgsYLI3if4/upload/32fea17fd860b6d1038678919ea6105f.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you've ever deployed untrusted workloads—think CI/CD runners, serverless functions, or multi-tenant SaaS—you've probably lost sleep over container escape vulnerabilities. Containers share the kernel with the host, and that shared boundary is a constant security concern. Virtual machines solve the isolation problem but come with heavyweight overhead that makes them impractical for modern workload densities.</p>
<p>MicroVMs sit right in that sweet spot: near-container speed with VM-level isolation. Let's dig into what they are, how they compare to traditional containers and VMs, and when you should actually use them.</p>
<h2 id="heading-the-isolation-problem-with-containers">THE ISOLATION PROBLEM WITH CONTAINERS</h2>
<p>Containers revolutionized deployment, but they weren't designed with strong isolation in mind. When you run a container, you're really just running a process on the host with some Linux kernel features (namespaces, cgroups, seccomp) creating the illusion of isolation.</p>
<p>The kernel is the shared attack surface. A kernel exploit in one container can potentially compromise the entire host and every other container running on it. We've seen this play out with real CVEs like the runC vulnerability (CVE-2019-5736) that allowed container escapes through malicious images.</p>
<p>For internal workloads where you trust your code, this risk is manageable. But if you're running untrusted code—customer-provided functions, arbitrary CI jobs, or hostile multi-tenant workloads—the shared kernel becomes a serious liability.</p>
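<p>You can see the "just a process" reality from any Linux shell. A quick sketch using only standard <code>/proc</code> paths:</p>

```bash
# Every Linux process, containerized or not, runs inside a set of kernel
# namespaces, visible as symlinks under /proc/<pid>/ns.
ls -l /proc/self/ns/

# Each symlink resolves to a namespace ID. Two processes "in the same
# container" simply share these IDs; the kernel underneath is one and the same.
readlink /proc/self/ns/pid
readlink /proc/self/ns/net
```

<p>Run the same commands inside a container and you'll see different IDs but the exact same mechanism: the boundary is kernel bookkeeping, not hardware.</p>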
<h2 id="heading-what-exactly-is-a-microvm">WHAT EXACTLY IS A MICROVM?</h2>
<p>A microVM is a lightweight virtual machine optimized for speed and density. It provides the strong isolation boundary of a full VM (separate kernel, virtualized hardware) but strips away everything unnecessary: legacy device support, BIOS emulation, firmware bloat.</p>
<p>The result? Boot times measured in milliseconds, memory overhead in single-digit megabytes, and the ability to pack thousands of microVMs on a single host—all while maintaining hardware-enforced isolation through the hypervisor.</p>
<p>Here's how the three approaches stack up:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770333454262/d82b6e84-7678-4f8f-ae75-44069820a488.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-comparing-containers-vms-and-microvms">COMPARING CONTAINERS, VMS, AND MICROVMS</h2>
<p>Let's get specific about the tradeoffs:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Metric</strong></td><td><strong>Containers</strong></td><td><strong>MicroVMs</strong></td><td><strong>Traditional VMs</strong></td></tr>
</thead>
<tbody>
<tr>
<td><strong>Boot time</strong></td><td>&lt;100ms</td><td>100-300ms</td><td>5-30s</td></tr>
<tr>
<td><strong>Memory overhead</strong></td><td>~1-5 MB</td><td>~5-20 MB</td><td>~500+ MB</td></tr>
<tr>
<td><strong>Density</strong></td><td>1000s per host</td><td>1000s per host</td><td>10s-100s per host</td></tr>
<tr>
<td><strong>Isolation</strong></td><td>Process-level (shared kernel)</td><td>Hardware virtualization</td><td>Hardware virtualization</td></tr>
<tr>
<td><strong>Attack surface</strong></td><td>Large (entire kernel)</td><td>Minimal (hypervisor + minimal kernel)</td><td>Moderate (hypervisor + full OS)</td></tr>
<tr>
<td><strong>Image size</strong></td><td>Small (OCI layers)</td><td>Small (kernel + rootfs)</td><td>Large (full OS)</td></tr>
<tr>
<td><strong>Ecosystem</strong></td><td>Docker, containerd, CRI-O</td><td>Firecracker, Cloud Hypervisor, Kata Containers</td><td>KVM, Xen, VMware</td></tr>
</tbody>
</table>
</div><p>The key insight: microVMs give you VM security at near-container performance. You're trading a slight increase in boot time and memory overhead for a massive reduction in blast radius.</p>
<h2 id="heading-real-world-microvm-implementations">REAL-WORLD MICROVM IMPLEMENTATIONS</h2>
<h3 id="heading-firecracker">Firecracker</h3>
<p>Developed by AWS for Lambda and Fargate, Firecracker is the most mature microVM implementation. It's built on KVM but strips out everything except what's needed to run a single workload.</p>
<p>Here's how you'd launch a simple Firecracker microVM:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Start the Firecracker process (it exposes a REST API on the socket)</span>
firecracker --api-sock /tmp/firecracker.sock

<span class="hljs-comment"># Point it at a kernel</span>
curl --unix-socket /tmp/firecracker.sock -X PUT \
  -H <span class="hljs-string">"Content-Type: application/json"</span> \
  -d <span class="hljs-string">'{
    "kernel_image_path": "/path/to/vmlinux",
    "boot_args": "console=ttyS0 reboot=k panic=1"
  }'</span> \
  http://localhost/boot-source

<span class="hljs-comment"># Attach the root filesystem</span>
curl --unix-socket /tmp/firecracker.sock -X PUT \
  -H <span class="hljs-string">"Content-Type: application/json"</span> \
  -d <span class="hljs-string">'{
    "drive_id": "rootfs",
    "path_on_host": "/path/to/rootfs.ext4",
    "is_root_device": true,
    "is_read_only": false
  }'</span> \
  http://localhost/drives/rootfs

<span class="hljs-comment"># Size the VM</span>
curl --unix-socket /tmp/firecracker.sock -X PUT \
  -H <span class="hljs-string">"Content-Type: application/json"</span> \
  -d <span class="hljs-string">'{ "vcpu_count": 2, "mem_size_mib": 512 }'</span> \
  http://localhost/machine-config

<span class="hljs-comment"># Boot it</span>
curl --unix-socket /tmp/firecracker.sock -X PUT \
  -H <span class="hljs-string">"Content-Type: application/json"</span> \
  -d <span class="hljs-string">'{ "action_type": "InstanceStart" }'</span> \
  http://localhost/actions
</code></pre>
<p>Each Firecracker microVM runs its own kernel, completely isolated from the host. A kernel vulnerability in your workload's guest kernel doesn't give an attacker access to the host kernel.</p>
<h3 id="heading-gvisor">gVisor</h3>
<p>Google's gVisor takes a different approach: it's a user-space kernel that intercepts system calls before they reach the host kernel. It's not technically a microVM, but it solves the same problem—reducing the kernel attack surface.</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># Using gVisor (runsc) as a Docker runtime</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">node.k8s.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">RuntimeClass</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">gvisor</span>
<span class="hljs-attr">handler:</span> <span class="hljs-string">runsc</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Pod</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">secure-workload</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">runtimeClassName:</span> <span class="hljs-string">gvisor</span>  <span class="hljs-comment"># Use gVisor for this pod</span>
  <span class="hljs-attr">containers:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">app</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">untrusted-image:latest</span>
</code></pre>
<p>gVisor implements a substantial portion of the Linux kernel in Go, running in user space. System calls from your container go through gVisor's kernel implementation instead of directly to the host kernel. It's faster than Firecracker for some workloads but provides a slightly different security model.</p>
<h3 id="heading-kata-containers">Kata Containers</h3>
<p>Kata combines the OCI container workflow with lightweight VMs. Each pod runs in its own microVM, but you still use standard container images and orchestrators.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Install the Kata runtime components</span>
kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/main/tools/packaging/kata-deploy/kata-deploy.yaml
</code></pre>
<pre><code class="lang-yaml"><span class="hljs-comment"># Use Kata for untrusted workloads</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Pod</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">kata-secured</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">runtimeClassName:</span> <span class="hljs-string">kata</span>  <span class="hljs-comment"># Runs in a microVM</span>
  <span class="hljs-attr">containers:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">workload</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">potentially-malicious:latest</span>
</code></pre>
<p>Kata is particularly useful in Kubernetes environments where you want to mix trusted and untrusted workloads on the same cluster—use the default runtime for trusted workloads, Kata for untrusted ones.</p>
<h2 id="heading-security-benefits-for-application-deployment">SECURITY BENEFITS FOR APPLICATION DEPLOYMENT</h2>
<p>MicroVMs shine in scenarios where the cost of compromise is high:</p>
<h3 id="heading-multi-tenant-saas-platforms">Multi-tenant SaaS platforms</h3>
<p>If you're running customer code (think Shopify apps, Salesforce functions, or CI/CD platforms like GitHub Actions), you need defense-in-depth. MicroVMs ensure that a malicious tenant can't escape their sandbox to access other tenants' data or the underlying infrastructure.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770333504391/90a6cccc-dd8d-4f81-b7a5-26444032f86c.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-serverless-and-function-as-a-service">Serverless and function-as-a-service</h3>
<p>AWS Lambda executes trillions of function invocations per month, often from completely untrusted sources. Firecracker enables this scale by booting a fresh microVM in roughly 125ms and giving each function instance its own kernel and complete isolation.</p>
<h3 id="heading-cicd-runners">CI/CD runners</h3>
<p>Running arbitrary build and test jobs is inherently risky. Projects like GitLab Runner and GitHub Actions use microVMs to ensure a compromised build can't poison the runner or access other builds' secrets.</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># Conceptual sketch: a CI job routed to a Firecracker-backed executor</span>
<span class="hljs-attr">job:</span>
  <span class="hljs-attr">script:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">npm</span> <span class="hljs-string">install</span>  <span class="hljs-comment"># Could be malicious dependencies</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">npm</span> <span class="hljs-string">test</span>
  <span class="hljs-attr">runner:</span>
    <span class="hljs-attr">executor:</span> <span class="hljs-string">firecracker</span>
    <span class="hljs-attr">isolation:</span> <span class="hljs-string">kernel-level</span>
    <span class="hljs-attr">memory:</span> <span class="hljs-string">2048MB</span>
    <span class="hljs-attr">timeout:</span> <span class="hljs-string">30m</span>
</code></pre>
<h3 id="heading-zero-trust-architecture">Zero-trust architecture</h3>
<p>MicroVMs fit naturally into zero-trust models. Even if an attacker compromises your application, they're still trapped inside a VM with no access to the host network, storage, or other workloads. You can further restrict with seccomp-bpf filters and minimal guest kernels that only include the syscalls your workload needs.</p>
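<p>In Kubernetes terms, stacking those layers can look like the sketch below. Pod and image names are hypothetical; <code>seccompProfile</code> and the security context fields are standard Kubernetes API:</p>

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: locked-down        # hypothetical name
spec:
  runtimeClassName: kata   # hardware-enforced isolation via a microVM runtime
  securityContext:
    seccompProfile:
      type: RuntimeDefault # filter syscalls even inside the guest
  containers:
  - name: app
    image: untrusted-image:latest
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
```

<p>Each layer assumes the one below it can fail: even a seccomp bypass still lands the attacker inside a guest kernel behind the hypervisor.</p>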
<h2 id="heading-when-not-to-use-microvms">WHEN NOT TO USE MICROVMS</h2>
<p>They're not a silver bullet. Here's when containers are still the right choice:</p>
<ul>
<li><p><strong>Trusted internal workloads</strong>: If you control all the code and trust your developers, the isolation overhead isn't justified</p>
</li>
<li><p><strong>Cost-sensitive deployments</strong>: MicroVMs consume slightly more resources; at massive scale, this adds up</p>
</li>
<li><p><strong>Extremely latency-sensitive applications</strong>: That extra 100-200ms boot time matters for some real-time workloads</p>
</li>
<li><p><strong>Windows workloads</strong>: Most microVM implementations are Linux-focused</p>
</li>
</ul>
<h2 id="heading-getting-started-with-microvms">GETTING STARTED WITH MICROVMS</h2>
<p>If you're convinced microVMs are worth exploring:</p>
<ol>
<li><p><strong>Start with gVisor</strong> if you're already on Kubernetes—it's the lowest-friction option for testing isolation without changing your orchestration</p>
</li>
<li><p><strong>Use Firecracker directly</strong> if you're building a custom platform (FaaS, CI/CD) and need maximum control</p>
</li>
<li><p><strong>Try Kata Containers</strong> for a drop-in OCI-compatible solution that works with existing container tooling</p>
</li>
</ol>
<p>The ecosystem is still maturing, but major cloud providers (AWS, Google, Azure) are betting heavily on microVM technology. As the tooling improves and performance gaps close, we'll likely see microVMs become the default for any workload where isolation matters.</p>
<h2 id="heading-key-takeaways">KEY TAKEAWAYS</h2>
<p>MicroVMs bridge the gap between containers and VMs: you get near-container speed and density with VM-level security isolation. The hypervisor enforces a hard boundary that protects against kernel exploits and container escapes.</p>
<p>Use them when running untrusted code—multi-tenant platforms, serverless functions, CI/CD jobs—or anywhere the blast radius of a compromise would be catastrophic. Stick with containers when you trust your code and need maximum efficiency.</p>
<p>The security model is fundamentally different: instead of trusting the kernel and a stack of user-space isolation primitives, you're trusting the hypervisor—a much smaller, more auditable attack surface. For high-stakes deployments, that trade is worth making.</p>
]]></content:encoded></item><item><title><![CDATA[Terraform state: The good, the bad, and the ugly]]></title><description><![CDATA[If you've worked with Terraform for more than a day, you've encountered the state file. Maybe you've cursed at it when it locked during a critical deployment. Maybe you've panicked when someone accidentally deleted it. Or maybe—just maybe—you've comm...]]></description><link>https://blog.vivekdhami.com/terraform-state-the-good-the-bad-and-the-ugly</link><guid isPermaLink="true">https://blog.vivekdhami.com/terraform-state-the-good-the-bad-and-the-ugly</guid><category><![CDATA[Terraform]]></category><dc:creator><![CDATA[Vivek Dhami]]></dc:creator><pubDate>Tue, 03 Sep 2024 22:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/YCgpoP7BP1Q/upload/d1a2f62d76fe4eaae15bcd73ccc5f4b0.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you've worked with Terraform for more than a day, you've encountered the state file. Maybe you've cursed at it when it locked during a critical deployment. Maybe you've panicked when someone accidentally deleted it. Or maybe—just maybe—you've committed one with Azure credentials baked in and had a very awkward conversation with your security team.</p>
<p>Terraform state is simultaneously one of the most elegant solutions and one of the biggest operational headaches in infrastructure-as-code. Let's break down why it exists, where it hurts, and how it can spectacularly blow up in your face.</p>
<h2 id="heading-the-good-why-state-exists-in-the-first-place">THE GOOD: WHY STATE EXISTS IN THE FIRST PLACE</h2>
<p>Terraform's state file is basically a snapshot of your infrastructure at a given point in time. It maps your HCL configuration to real-world resources and stores metadata that Terraform needs to manage those resources effectively.</p>
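<p>Stripped to its essentials, that snapshot is just JSON. A hypothetical, heavily trimmed excerpt (real state files carry much more metadata, including provider details and dependency lists):</p>

```json
{
  "version": 4,
  "terraform_version": "1.7.0",
  "serial": 12,
  "resources": [
    {
      "mode": "managed",
      "type": "azurerm_linux_virtual_machine",
      "name": "web",
      "instances": [
        {
          "attributes": {
            "id": "/subscriptions/.../virtualMachines/web-vm",
            "size": "Standard_B1s"
          }
        }
      ]
    }
  ]
}
```

<p>The <code>id</code> is the mapping that matters: it ties the HCL block named <code>web</code> to one concrete Azure resource.</p>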
<h3 id="heading-it-solves-the-whats-actually-deployed-problem">It solves the "what's actually deployed" problem</h3>
<p>Without state, Terraform would have to query your cloud provider every single time to figure out what exists. That sounds reasonable until you're managing 500+ resources across multiple regions. State makes <code>terraform plan</code> fast by maintaining a local cache of your infrastructure.</p>
<pre><code class="lang-hcl"><span class="hljs-comment"># Your config says this:</span>
<span class="hljs-string">resource</span> <span class="hljs-string">"azurerm_linux_virtual_machine"</span> <span class="hljs-string">"web"</span> {
  <span class="hljs-string">name</span>                <span class="hljs-string">=</span> <span class="hljs-string">"web-vm"</span>
  <span class="hljs-string">resource_group_name</span> <span class="hljs-string">=</span> <span class="hljs-string">azurerm_resource_group.main.name</span>
  <span class="hljs-string">location</span>            <span class="hljs-string">=</span> <span class="hljs-string">"East US"</span>
  <span class="hljs-string">size</span>                <span class="hljs-string">=</span> <span class="hljs-string">"Standard_B1s"</span>
  <span class="hljs-comment"># ... other config</span>
}
<span class="hljs-string">​</span>
<span class="hljs-comment"># State file knows this exists with its Azure resource ID</span>
<span class="hljs-comment"># and can detect when you change size or other attributes</span>
</code></pre>
<h3 id="heading-it-tracks-resource-dependencies">It tracks resource dependencies</h3>
<p>Terraform builds a dependency graph from your configuration and state. This lets it know that your VM's network interface depends on your subnet, which depends on your VNet. When you destroy resources, it tears them down in the correct order. Try doing that manually at 2 AM during an incident—I'll wait.</p>
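<p>Those graph edges come from references between resources; Terraform infers them automatically, and <code>depends_on</code> exists for the rare cases it can't. A minimal sketch with hypothetical names:</p>

```hcl
resource "azurerm_virtual_network" "main" {
  name                = "vnet-main"
  address_space       = ["10.0.0.0/16"]
  location            = "East US"
  resource_group_name = azurerm_resource_group.main.name
}

resource "azurerm_subnet" "app" {
  name                 = "snet-app"
  resource_group_name  = azurerm_resource_group.main.name
  # Referencing the VNet by expression creates an implicit graph edge:
  # the subnet is created after, and destroyed before, the VNet.
  virtual_network_name = azurerm_virtual_network.main.name
  address_prefixes     = ["10.0.1.0/24"]
}
```

<p><code>terraform graph</code> will dump the whole graph in DOT format if you want to see the edges yourself.</p>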
<h3 id="heading-it-enables-collaboration-sort-of">It enables collaboration (sort of)</h3>
<p>When multiple engineers work on the same infrastructure, state acts as the source of truth. Combined with remote backends and locking, it prevents the classic "we both applied changes at the same time and now everything's broken" scenario.</p>
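<p>Wiring up that remote backend is only a few lines. With the <code>azurerm</code> backend, locking comes for free via blob leases (storage names below are hypothetical):</p>

```hcl
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-tfstate"
    storage_account_name = "companytfstate"
    container_name       = "tfstate"
    key                  = "prod.terraform.tfstate"
    # Locking uses Azure Blob leases automatically; no extra lock
    # table or service is required.
  }
}
```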
<p>The state file also stores output values, which other Terraform configurations can reference:</p>
<pre><code class="lang-hcl"><span class="hljs-comment"># In your networking module</span>
<span class="hljs-string">output</span> <span class="hljs-string">"vnet_id"</span> {
  <span class="hljs-string">value</span> <span class="hljs-string">=</span> <span class="hljs-string">azurerm_virtual_network.main.id</span>
}
<span class="hljs-string">​</span>
<span class="hljs-comment"># Another team's config can use this</span>
<span class="hljs-string">data</span> <span class="hljs-string">"terraform_remote_state"</span> <span class="hljs-string">"network"</span> {
  <span class="hljs-string">backend</span> <span class="hljs-string">=</span> <span class="hljs-string">"azurerm"</span>
  <span class="hljs-string">config</span> <span class="hljs-string">=</span> {
    <span class="hljs-string">storage_account_name</span> <span class="hljs-string">=</span> <span class="hljs-string">"companytfstate"</span>
    <span class="hljs-string">container_name</span>       <span class="hljs-string">=</span> <span class="hljs-string">"tfstate"</span>
    <span class="hljs-string">key</span>                  <span class="hljs-string">=</span> <span class="hljs-string">"network.terraform.tfstate"</span>
  }
}
<span class="hljs-string">​</span>
<span class="hljs-string">resource</span> <span class="hljs-string">"azurerm_linux_virtual_machine"</span> <span class="hljs-string">"app"</span> {
  <span class="hljs-comment"># Reference the VNet from remote state</span>
  <span class="hljs-string">subnet_id</span> <span class="hljs-string">=</span> <span class="hljs-string">data.terraform_remote_state.network.outputs.subnet_id</span>
  <span class="hljs-comment"># ... other config</span>
}
</code></pre>
<p>This is actually pretty slick when it works. The problem is all the ways it doesn't work.</p>
<h2 id="heading-the-bad-common-pain-points-thatll-make-you-question-your-life-choices">THE BAD: COMMON PAIN POINTS THAT'LL MAKE YOU QUESTION YOUR LIFE CHOICES</h2>
<h3 id="heading-state-locking-conflicts">State locking conflicts</h3>
<p>You know that feeling when you run <code>terraform apply</code> and get hit with "Error acquiring the state lock"? That's state locking doing its job—preventing concurrent modifications. But when Jenkins crashes mid-apply or someone's laptop dies, that lock stays in place.</p>
<pre><code class="lang-plaintext"># The dreaded error
Error: Error acquiring the state lock
​
Error message: blob is already locked
Lock Info:
  ID:        a1b2c3d4-5678-90ef-ghij-klmnopqrstuv
  Path:      companytfstate/tfstate/prod.terraform.tfstate
  Operation: OperationTypeApply
  Who:       jenkins@ci-runner-42
  Created:   2026-02-04 14:23:17.123456789 +0000 UTC
</code></pre>
<p>Now you're stuck deciding: is that lock legitimate, or is it a zombie from a failed run? Force-unlock and you might corrupt state if someone's actually using it. Don't unlock and your deployment is blocked.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># The nuclear option</span>
terraform force-unlock a1b2c3d4-5678-90ef-ghij-klmnopqrstuv
​
<span class="hljs-comment"># Proceed with extreme caution</span>
</code></pre>
<h3 id="heading-state-drift-when-reality-diverges-from-code">State drift: When reality diverges from code</h3>
<p>Someone made a "quick fix" in the Azure Portal. Someone else manually scaled a VMSS. Now your state file thinks you have 3 instances, but you actually have 5. Terraform has no idea what happened.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Running refresh shows the horror</span>
$ terraform refresh

azurerm_linux_virtual_machine_scale_set.web: Refreshing state... [id=/subscriptions/.../virtualMachineScaleSets/web-vmss]

<span class="hljs-comment"># State now records 5 instances; your config still says 3.</span>
<span class="hljs-comment"># The next apply wants to scale back down. Keep them? Revert? Cry?</span>
</code></pre>
<p>Drift detection is possible with <code>terraform plan -refresh-only</code>, but it only tells you about resources Terraform knows about. If someone created resources outside of Terraform, you're blind to them until they cause an outage.</p>
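<p>The refresh-only workflow lets you review drift and then accept it into state without touching any infrastructure. A minimal sketch of the sequence:</p>
<pre><code class="lang-bash"># Show what changed outside Terraform, without updating state
terraform plan -refresh-only

# Accept the real-world values into state (no infrastructure changes)
terraform apply -refresh-only
</code></pre>
<p>After that, a normal <code>terraform plan</code> shows only the delta between your config and the now-accurate state.</p>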
<h3 id="heading-team-collaboration-nightmares">Team collaboration nightmares</h3>
<p>Even with remote state and locking, you'll hit these scenarios:</p>
<p><strong>Scenario 1: The race condition</strong></p>
<ul>
<li><p>Alice runs <code>terraform plan</code> at 3:00 PM</p>
</li>
<li><p>Bob runs <code>terraform apply</code> at 3:05 PM</p>
</li>
<li><p>Alice runs <code>terraform apply</code> at 3:10 PM using her stale plan</p>
</li>
<li><p>State is now inconsistent with both their changes</p>
</li>
</ul>
<p><strong>Scenario 2: The branch problem</strong></p>
<ul>
<li><p>Feature branches create competing versions of state</p>
</li>
<li><p>Merging code is easy; merging state files is... not a thing</p>
</li>
<li><p>You end up with orphaned resources nobody remembers creating</p>
</li>
</ul>
<p><strong>Scenario 3: The "who changed what" mystery</strong> State files don't have audit logs. Someone deleted your load balancer and you have no idea who or when. Good luck with that post-mortem.</p>
<h3 id="heading-performance-degradation-at-scale">Performance degradation at scale</h3>
<p>Once you hit thousands of resources, <code>terraform plan</code> slows to a crawl even with state caching. You'll start breaking up monolithic state files into smaller modules just to keep things manageable:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># This used to take 10 seconds, now takes 5 minutes</span>
$ terraform plan
Refreshing Terraform state in-memory prior to plan...
azurerm_virtual_network.main: Refreshing state... [id=/subscriptions/.../virtualNetworks/main-vnet]
azurerm_subnet.public[0]: Refreshing state... [id=/subscriptions/.../subnets/public-subnet-0]
<span class="hljs-comment"># ... 2,437 more resources to go</span>
</code></pre>
<h2 id="heading-the-ugly-security-disasters-and-state-corruption">THE UGLY: SECURITY DISASTERS AND STATE CORRUPTION</h2>
<p>This is where things get scary. State files are ticking time bombs if you don't handle them properly.</p>
<h3 id="heading-sensitive-data-exposure">Sensitive data exposure</h3>
<p>Here's the thing nobody tells you: <strong>Terraform state files store everything in plaintext</strong>. Passwords, API keys, database connection strings—all sitting there unencrypted.</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"version"</span>: <span class="hljs-number">4</span>,
  <span class="hljs-attr">"terraform_version"</span>: <span class="hljs-string">"1.6.0"</span>,
  <span class="hljs-attr">"resources"</span>: [
    {
      <span class="hljs-attr">"type"</span>: <span class="hljs-string">"azurerm_mssql_database"</span>,
      <span class="hljs-attr">"name"</span>: <span class="hljs-string">"main"</span>,
      <span class="hljs-attr">"instances"</span>: [
        {
          <span class="hljs-attr">"attributes"</span>: {
            <span class="hljs-attr">"administrator_login"</span>: <span class="hljs-string">"sqladmin"</span>,
            <span class="hljs-attr">"administrator_login_password"</span>: <span class="hljs-string">"SuperSecretPassword123!"</span>,
            <span class="hljs-attr">"fully_qualified_domain_name"</span>: <span class="hljs-string">"mydb.database.windows.net"</span>
          }
        }
      ]
    }
  ]
}
</code></pre>
<p>If you commit this to Git (don't laugh, it happens constantly), you've just handed attackers your database credentials. Even if you remove it in the next commit, it's in Git history forever.</p>
<p>If your Azure Storage Account permissions are misconfigured, anyone with read access can download your state blob and extract secrets. The same goes for Terraform Cloud: if everyone in your organization has broad workspace access, everyone can read your state.</p>
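<p>A cheap guardrail is a check that greps anything that looks like a state file for attribute names that commonly hold secrets. A heuristic sketch (the pattern list is illustrative, not exhaustive):</p>
<pre><code class="lang-bash"># scan_state: print lines in a state file whose attribute names
# suggest secrets (heuristic; extend the patterns for your providers)
scan_state() {
  grep -nE '"[a-z_]*(password|secret|access_key|connection_string)[a-z_]*"' "$1"
}
</code></pre>
<p>Wire it into a pre-commit hook or CI so a <code>*.tfstate</code> file showing up in a commit fails the build before it ever reaches Git history.</p>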
<h3 id="heading-state-corruption-and-the-recovery-nightmare">State corruption and the recovery nightmare</h3>
<p>State corruption usually happens when:</p>
<ul>
<li><p>A Terraform run gets interrupted mid-apply</p>
</li>
<li><p>Two applies run simultaneously (locking failed)</p>
</li>
<li><p>Manual edits go wrong</p>
</li>
<li><p>The state backend has issues (Azure Blob Storage transient errors, network timeouts)</p>
</li>
</ul>
<p>When state corrupts, you're in for a bad time:</p>
<pre><code class="lang-bash">$ terraform plan
Error: Failed to load state: state snapshot was created by Terraform v1.6.0,
<span class="hljs-built_in">which</span> is newer than current v1.5.0; upgrade to Terraform v1.6.0 or greater to work with this state
​
<span class="hljs-comment"># Or worse...</span>
Error: Failed to decode state: invalid character <span class="hljs-string">'x'</span> looking <span class="hljs-keyword">for</span> beginning of value
</code></pre>
<p>Now you're trying to restore from backups (you have backups, right?), manually editing JSON to fix corruption, or using <code>terraform import</code> to rebuild state from scratch. For every. Single. Resource.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># The tedious recovery process</span>
$ terraform import azurerm_linux_virtual_machine.web /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/mygroup/providers/Microsoft.Compute/virtualMachines/web-vm
$ terraform import azurerm_network_security_group.web /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/mygroup/providers/Microsoft.Network/networkSecurityGroups/web-nsg
<span class="hljs-comment"># ... repeat 500 more times</span>
</code></pre>
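<p>If you're on Terraform 1.5 or later, <code>import</code> blocks make this less painful: declare the imports in configuration and apply them in one plan instead of running the CLI once per resource. A sketch (resource names and IDs are illustrative):</p>
<pre><code class="lang-hcl">import {
  to = azurerm_linux_virtual_machine.web
  id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/mygroup/providers/Microsoft.Compute/virtualMachines/web-vm"
}

import {
  to = azurerm_network_security_group.web
  id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/mygroup/providers/Microsoft.Network/networkSecurityGroups/web-nsg"
}
</code></pre>
<p><code>terraform plan -generate-config-out=generated.tf</code> can even scaffold the matching resource blocks for you.</p>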
<h3 id="heading-the-someone-deleted-production-disaster">The "someone deleted production" disaster</h3>
<p>The ultimate nightmare: someone runs <code>terraform destroy</code> on production state by accident. Or they delete the Azure Storage Account container holding state. Or they push broken state that marks everything for deletion.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># This should require two-factor authentication and a blood oath</span>
$ terraform destroy
<span class="hljs-comment"># ...</span>
Destroy complete! Resources: 437 destroyed.
​
<span class="hljs-comment"># Congratulations, you've just deleted production</span>
</code></pre>
<p>Without backups and versioning enabled on your state backend, recovery is somewhere between "extremely difficult" and "update your resume."</p>
<h2 id="heading-best-practices-making-peace-with-state">BEST PRACTICES: MAKING PEACE WITH STATE</h2>
<p>Alright, enough doom and gloom. Here's how to avoid these disasters.</p>
<h3 id="heading-always-use-remote-backends">Always use remote backends</h3>
<p>Never, ever keep state files local. Use Azure Storage, Terraform Cloud, or another remote backend with versioning and encryption.</p>
<pre><code class="lang-hcl">terraform {
  backend "azurerm" {
    resource_group_name  = "terraform-state-rg"
    storage_account_name = "companytfstate"
    container_name       = "tfstate"
    key                  = "prod.terraform.tfstate"

    # State locking is built-in with blob leases
    # No separate lock table needed like DynamoDB
  }
}
</code></pre>
<p>Make sure your Azure Storage Account has:</p>
<ul>
<li><p>Versioning enabled (recover from accidental deletes/corruption)</p>
</li>
<li><p>Encryption at rest (enabled by default with Microsoft-managed keys or use CMK)</p>
</li>
<li><p>Strict RBAC policies (principle of least privilege)</p>
</li>
<li><p>Soft delete enabled (provides undelete capability)</p>
</li>
<li><p>Diagnostic logging enabled (audit trail)</p>
</li>
</ul>
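<p>Since the state backend is itself infrastructure, you can codify most of that checklist. A hedged sketch for the azurerm provider (names, location, and retention periods are illustrative):</p>
<pre><code class="lang-hcl">resource "azurerm_storage_account" "tfstate" {
  name                     = "companytfstate"
  resource_group_name      = "terraform-state-rg"
  location                 = "westeurope"
  account_tier             = "Standard"
  account_replication_type = "GRS"
  min_tls_version          = "TLS1_2"

  blob_properties {
    versioning_enabled = true

    delete_retention_policy {
      days = 30 # soft delete window for blobs
    }
  }
}
</code></pre>
<p>Manage this bootstrap stack in its own small, rarely-changing state so a bad apply elsewhere can't take out the backend it lives in.</p>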
<h3 id="heading-state-locking-with-azure-blob-leases">State locking with Azure blob leases</h3>
<p>State locking prevents concurrent modifications. The good news? Azure Storage handles this automatically using blob leases—no separate infrastructure required.</p>
<p>When Terraform acquires a lock, it creates a lease on the state blob. If another process tries to modify the same state, it fails immediately:</p>
<pre><code class="lang-plaintext"># Azure handles locking automatically via blob leases
# No additional resources to provision
# The lease is held on the state blob for the duration of the operation
# and released when the run completes (or left dangling if it crashes)
</code></pre>
<p>You can verify locking behavior by checking blob properties in the Azure Portal or via CLI:</p>
<pre><code class="lang-bash">az storage blob show \
  --account-name companytfstate \
  --container-name tfstate \
  --name prod.terraform.tfstate \
  --query <span class="hljs-string">"properties.lease"</span> -o table
</code></pre>
<p>The lease status will show "locked" when Terraform is actively modifying state.</p>
<h3 id="heading-never-commit-secrets-to-terraform-config">Never commit secrets to Terraform config</h3>
<p>Use secret management tools and reference them dynamically:</p>
<pre><code class="lang-hcl"># BAD: Hardcoded secrets
resource "azurerm_mssql_server" "main" {
  administrator_login_password = "SuperSecretPassword123!"  # Don't do this
}

# GOOD: Reference from Azure Key Vault
data "azurerm_key_vault" "main" {
  name                = "company-keyvault"
  resource_group_name = "shared-services-rg"
}

data "azurerm_key_vault_secret" "db_password" {
  name         = "sql-admin-password"
  key_vault_id = data.azurerm_key_vault.main.id
}

resource "azurerm_mssql_server" "main" {
  administrator_login_password = data.azurerm_key_vault_secret.db_password.value
}
</code></pre>
<p>The secret still ends up in state, but at least it's not in your Git repo. For extra paranoia, use customer-managed keys (CMK) from Azure Key Vault to encrypt the storage account that holds state, and configure RBAC to restrict access.</p>
<h3 id="heading-master-state-manipulation-commands">Master state manipulation commands</h3>
<p>Sometimes you need to manually fix state. Learn these commands:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Import existing resources into state</span>
terraform import azurerm_linux_virtual_machine.example /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/mygroup/providers/Microsoft.Compute/virtualMachines/example-vm
​
<span class="hljs-comment"># Remove a resource from state (without deleting the actual resource)</span>
terraform state rm azurerm_linux_virtual_machine.old_server
​
<span class="hljs-comment"># Move a resource to a different address</span>
terraform state mv azurerm_linux_virtual_machine.old azurerm_linux_virtual_machine.new
​
<span class="hljs-comment"># List all resources in state</span>
terraform state list
​
<span class="hljs-comment"># Show details about a specific resource</span>
terraform state show azurerm_linux_virtual_machine.web
</code></pre>
<p>Use these carefully—they directly modify state. Always back up state before manual operations:</p>
<pre><code class="lang-bash">terraform state pull &gt; backup.tfstate
</code></pre>
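<p>Restoring goes the other way with <code>terraform state push</code>, which refuses a mismatched lineage or an older serial number unless you force it. A cautious sketch:</p>
<pre><code class="lang-bash"># Take a local backup before any manual surgery
terraform state pull &gt; backup.tfstate

# ...if things go wrong, restore it (verify the file contents first!)
terraform state push backup.tfstate
</code></pre>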
<h3 id="heading-implement-proper-cicd-workflows">Implement proper CI/CD workflows</h3>
<p>Don't let humans run <code>terraform apply</code> from their laptops. Use CI/CD pipelines with:</p>
<ul>
<li><p>Automatic <code>terraform plan</code> on pull requests</p>
</li>
<li><p>Required approvals before apply</p>
</li>
<li><p>State locking handled automatically</p>
</li>
<li><p>Audit logs of who applied what</p>
</li>
<li><p>Separate state files per environment</p>
</li>
</ul>
<pre><code class="lang-yaml"><span class="hljs-comment"># Example GitHub Actions workflow</span>
<span class="hljs-attr">name:</span> <span class="hljs-string">Terraform</span>
<span class="hljs-attr">on:</span>
  <span class="hljs-attr">pull_request:</span>
    <span class="hljs-attr">paths:</span> [<span class="hljs-string">'terraform/**'</span>]
<span class="hljs-string">​</span>
<span class="hljs-attr">jobs:</span>
  <span class="hljs-attr">plan:</span>
    <span class="hljs-attr">runs-on:</span> <span class="hljs-string">ubuntu-latest</span>
    <span class="hljs-attr">steps:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/checkout@v3</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">uses:</span> <span class="hljs-string">hashicorp/setup-terraform@v2</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">run:</span> <span class="hljs-string">terraform</span> <span class="hljs-string">init</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">run:</span> <span class="hljs-string">terraform</span> <span class="hljs-string">plan</span> <span class="hljs-string">-out=tfplan</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/upload-artifact@v3</span>
        <span class="hljs-attr">with:</span>
          <span class="hljs-attr">name:</span> <span class="hljs-string">tfplan</span>
          <span class="hljs-attr">path:</span> <span class="hljs-string">tfplan</span>
</code></pre>
<h3 id="heading-split-state-into-manageable-pieces">Split state into manageable pieces</h3>
<p>Don't put everything in one giant state file. Break it up by:</p>
<ul>
<li><p>Environment (dev, staging, prod)</p>
</li>
<li><p>Layer (networking, compute, data)</p>
</li>
<li><p>Team ownership (platform, security, apps)</p>
</li>
</ul>
<pre><code class="lang-plaintext">terraform/
├── networking/
│   ├── backend.tf
│   └── main.tf
├── compute/
│   ├── backend.tf
│   └── main.tf
└── data/
    ├── backend.tf
    └── main.tf
</code></pre>
<p>This limits blast radius and improves performance. Just be careful with cross-stack dependencies.</p>
<h2 id="heading-wrapping-up">WRAPPING UP</h2>
<p>Terraform state is a necessary evil. It's the price you pay for declarative infrastructure management. The good news is that the benefits—dependency tracking, collaboration, drift detection—outweigh the operational overhead once you've got proper workflows in place.</p>
<p>The bad news is that you'll definitely learn these lessons the hard way at least once. When you do, remember:</p>
<ol>
<li><p>Remote state with versioning and locking is non-negotiable</p>
</li>
<li><p>Secrets in state files are a security risk—treat state like sensitive data</p>
</li>
<li><p>State corruption happens—have backups and know how to restore</p>
</li>
<li><p>Manual state operations are powerful and dangerous—measure twice, cut once</p>
</li>
</ol>
<p>And if you ever find yourself running <code>terraform destroy</code> in production, maybe take a coffee break first and verify you're in the right directory. Your future self will thank you.</p>
]]></content:encoded></item><item><title><![CDATA[Managing Azure DevOps pipeline templates: patterns and best practices]]></title><description><![CDATA[If you've worked with Azure DevOps for any length of time, you've probably noticed that pipeline YAML files have a tendency to sprawl. What starts as a simple build-and-deploy pipeline quickly turns into hundreds of lines of duplicated logic scattere...]]></description><link>https://blog.vivekdhami.com/managing-azure-devops-pipeline-templates-patterns-and-best-practices</link><guid isPermaLink="true">https://blog.vivekdhami.com/managing-azure-devops-pipeline-templates-patterns-and-best-practices</guid><category><![CDATA[azure-devops]]></category><category><![CDATA[Devops]]></category><category><![CDATA[ci-cd]]></category><category><![CDATA[CI/CD pipelines]]></category><dc:creator><![CDATA[Vivek Dhami]]></dc:creator><pubDate>Sun, 14 Apr 2024 22:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/9AxFJaNySB8/upload/2cc72cd8dc8e19cdac111c9821014623.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you've worked with Azure DevOps for any length of time, you've probably noticed that pipeline YAML files have a tendency to sprawl. What starts as a simple build-and-deploy pipeline quickly turns into hundreds of lines of duplicated logic scattered across dozens of repositories. Enter pipeline templates—ADO's answer to the DRY principle for CI/CD workflows.</p>
<p>In this post, we'll walk through the fundamentals of Azure DevOps pipelines, explore why centralized templates matter, and dive into practical patterns for structuring, versioning, and managing templates at scale.</p>
<h2 id="heading-azure-devops-pipelines-the-building-blocks">AZURE DEVOPS PIPELINES: THE BUILDING BLOCKS</h2>
<p>Before we get into templates, let's establish the core concepts. Azure DevOps pipelines are built on three main abstractions:</p>
<p><strong>Pipelines</strong> are the top-level containers. They define when and how your CI/CD process runs. A pipeline can be triggered by code commits, pull requests, scheduled runs, or manual invocations.</p>
<p><strong>Jobs</strong> represent units of work that run on an agent. Each job executes independently and can run in parallel with other jobs. Jobs are where you specify the agent pool, define dependencies, and set conditions.</p>
<p><strong>Tasks</strong> are the individual steps within a job—think "run this shell script," "build this Docker image," or "deploy to Kubernetes." Microsoft provides a bunch of built-in tasks, and you can also create custom ones.</p>
<p>Here's a minimal pipeline to illustrate:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">trigger:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-string">main</span>
<span class="hljs-string">​</span>
<span class="hljs-attr">pool:</span>
  <span class="hljs-attr">vmImage:</span> <span class="hljs-string">'ubuntu-latest'</span>
<span class="hljs-string">​</span>
<span class="hljs-attr">jobs:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">job:</span> <span class="hljs-string">Build</span>
    <span class="hljs-attr">steps:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">task:</span> <span class="hljs-string">UseDotNet@2</span>
        <span class="hljs-attr">inputs:</span>
          <span class="hljs-attr">version:</span> <span class="hljs-string">'8.x'</span>
<span class="hljs-string">​</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">dotnet</span> <span class="hljs-string">build</span>
        <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Build application'</span>
<span class="hljs-string">​</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">dotnet</span> <span class="hljs-string">test</span>
        <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Run tests'</span>
</code></pre>
<p>Simple enough. But once you've got ten repos with similar build processes, you'll be copy-pasting this YAML everywhere. And when you need to update the .NET version or add a security scan? Good luck hunting down every instance.</p>
<h2 id="heading-why-centralize-with-templates">WHY CENTRALIZE WITH TEMPLATES</h2>
<p>Pipeline templates let you extract common logic into reusable files that multiple pipelines can reference. This gives you a few critical benefits:</p>
<p><strong>Consistency across teams.</strong> When everyone uses the same template for Docker builds or Kubernetes deployments, you eliminate configuration drift. Security policies, compliance checks, and best practices get baked in automatically.</p>
<p><strong>Easier maintenance.</strong> Need to add container scanning to every build? Update the template once, and it propagates to all consumers. No PR marathons across dozens of repos.</p>
<p><strong>Reduced cognitive load.</strong> Developers shouldn't need to be YAML experts or remember the incantations for artifact publishing. Templates abstract away complexity and let teams focus on their application logic.</p>
<p><strong>Governance and guardrails.</strong> Templates can enforce organizational standards—like requiring approval stages for production deploys or mandating specific test coverage thresholds.</p>
<h3 id="heading-a-simple-template-example">A Simple Template Example</h3>
<p>Let's say you want to standardize how your teams build and push Docker images. You'd create a template file in a central repository:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># templates/docker-build.yml</span>
<span class="hljs-attr">parameters:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">dockerfilePath</span>
    <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
    <span class="hljs-attr">default:</span> <span class="hljs-string">'./Dockerfile'</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">imageName</span>
    <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">tags</span>
    <span class="hljs-attr">type:</span> <span class="hljs-string">object</span>
    <span class="hljs-attr">default:</span> [<span class="hljs-string">'latest'</span>]
<span class="hljs-string">​</span>
<span class="hljs-attr">steps:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">task:</span> <span class="hljs-string">Docker@2</span>
    <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Build Docker image'</span>
    <span class="hljs-attr">inputs:</span>
      <span class="hljs-attr">command:</span> <span class="hljs-string">'build'</span>
      <span class="hljs-attr">Dockerfile:</span> <span class="hljs-string">${{</span> <span class="hljs-string">parameters.dockerfilePath</span> <span class="hljs-string">}}</span>
      <span class="hljs-attr">tags:</span> <span class="hljs-string">${{</span> <span class="hljs-string">join(',',</span> <span class="hljs-string">parameters.tags)</span> <span class="hljs-string">}}</span>
<span class="hljs-string">​</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">task:</span> <span class="hljs-string">Docker@2</span>
    <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Push to registry'</span>
    <span class="hljs-attr">inputs:</span>
      <span class="hljs-attr">command:</span> <span class="hljs-string">'push'</span>
      <span class="hljs-attr">repository:</span> <span class="hljs-string">${{</span> <span class="hljs-string">parameters.imageName</span> <span class="hljs-string">}}</span>
      <span class="hljs-attr">tags:</span> <span class="hljs-string">${{</span> <span class="hljs-string">join(',',</span> <span class="hljs-string">parameters.tags)</span> <span class="hljs-string">}}</span>
<span class="hljs-string">​</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">|
      docker scan ${{ parameters.imageName }}:latest
</span>    <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Scan image for vulnerabilities'</span>
</code></pre>
<p>Now any pipeline can consume this template:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># azure-pipelines.yml</span>
<span class="hljs-attr">resources:</span>
  <span class="hljs-attr">repositories:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">repository:</span> <span class="hljs-string">templates</span>
      <span class="hljs-attr">type:</span> <span class="hljs-string">git</span>
      <span class="hljs-attr">name:</span> <span class="hljs-string">DevOps/pipeline-templates</span>
      <span class="hljs-attr">ref:</span> <span class="hljs-string">refs/tags/v1.2.0</span>
<span class="hljs-string">​</span>
<span class="hljs-attr">trigger:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-string">main</span>
<span class="hljs-string">​</span>
<span class="hljs-attr">pool:</span>
  <span class="hljs-attr">vmImage:</span> <span class="hljs-string">'ubuntu-latest'</span>
<span class="hljs-string">​</span>
<span class="hljs-attr">jobs:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">job:</span> <span class="hljs-string">BuildAndPush</span>
    <span class="hljs-attr">steps:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">template:</span> <span class="hljs-string">templates/docker-build.yml@templates</span>
        <span class="hljs-attr">parameters:</span>
          <span class="hljs-attr">imageName:</span> <span class="hljs-string">'myapp'</span>
          <span class="hljs-attr">tags:</span> [<span class="hljs-string">'latest'</span>, <span class="hljs-string">'$(Build.BuildId)'</span>]
</code></pre>
<p>Notice the <code>@templates</code> reference—that tells ADO to pull the template from the external repository we defined in <code>resources.repositories</code>. This pattern decouples template definitions from consuming pipelines, which is key for centralized management.</p>
<h2 id="heading-versioning-and-update-management">VERSIONING AND UPDATE MANAGEMENT</h2>
<p>One of the trickiest parts of managing templates is handling updates without breaking existing pipelines. Here's how to approach it:</p>
<h3 id="heading-use-git-tags-for-template-versions">Use Git Tags for Template Versions</h3>
<p>Store your templates in a dedicated Git repository and tag releases using semantic versioning (e.g., <code>v1.0.0</code>, <code>v1.1.0</code>, <code>v2.0.0</code>). Consuming pipelines reference specific tags:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">resources:</span>
  <span class="hljs-attr">repositories:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">repository:</span> <span class="hljs-string">templates</span>
      <span class="hljs-attr">type:</span> <span class="hljs-string">git</span>
      <span class="hljs-attr">name:</span> <span class="hljs-string">DevOps/pipeline-templates</span>
      <span class="hljs-attr">ref:</span> <span class="hljs-string">refs/tags/v1.2.0</span>  <span class="hljs-comment"># Pin to a specific version</span>
</code></pre>
<p>This gives teams control over when they adopt new template versions. You can test changes in a staging pipeline before rolling them out broadly.</p>
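<p>Cutting a release is then just an annotated tag from <code>main</code> (version number and message are illustrative):</p>
<pre><code class="lang-bash">git checkout main
git pull
git tag -a v1.3.0 -m "docker-build: add image scan step"
git push origin v1.3.0
</code></pre>
<p>Consumers upgrade by bumping the <code>ref</code> in their <code>resources.repositories</code> block on their own schedule.</p>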
<h3 id="heading-branch-strategies-for-development">Branch Strategies for Development</h3>
<p>For active template development, maintain separate branches:</p>
<ul>
<li><p><code>main</code>: Production-ready, stable templates</p>
</li>
<li><p><code>develop</code>: Integration branch for new features</p>
</li>
<li><p><code>hotfix/*</code>: Quick fixes for critical bugs</p>
</li>
</ul>
<p>Tag releases from <code>main</code> and document breaking changes in your release notes. Use conventional commits to make changelog generation easier.</p>
<h3 id="heading-communicating-breaking-changes">Communicating Breaking Changes</h3>
<p>When you need to introduce breaking changes (like renaming a parameter or changing default behavior), follow this pattern:</p>
<ol>
<li><p><strong>Deprecate, don't delete.</strong> Mark old parameters as deprecated and support them for at least one major version.</p>
</li>
<li><p><strong>Provide migration guides.</strong> Document what teams need to change in their pipelines.</p>
</li>
<li><p><strong>Use major version bumps.</strong> Go from <code>v1.x.x</code> to <code>v2.0.0</code> to signal incompatibility.</p>
</li>
</ol>
<p>Example deprecation:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">parameters:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">buildConfiguration</span>  <span class="hljs-comment"># New parameter</span>
    <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
    <span class="hljs-attr">default:</span> <span class="hljs-string">'Release'</span>
<span class="hljs-string">​</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">configuration</span>  <span class="hljs-comment"># Deprecated</span>
    <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
    <span class="hljs-attr">default:</span> <span class="hljs-string">''</span>
<span class="hljs-string">​</span>
<span class="hljs-attr">steps:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">|
      if [ -n "${{ parameters.configuration }}" ]; then
        echo "##vso[task.logissue type=warning]Parameter 'configuration' is deprecated. Use 'buildConfiguration' instead."
      fi
</span>    <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Check for deprecated parameters'</span>
</code></pre>
<h3 id="heading-tracking-template-usage">Tracking Template Usage</h3>
<p>Azure DevOps doesn't have built-in analytics for template usage, so you'll need to get creative. Some options:</p>
<ul>
<li><p><strong>Search across repos</strong> using Git grep or ADO's code search to find references to specific template versions.</p>
</li>
<li><p><strong>Pipeline analytics dashboard</strong> that parses pipeline definitions and surfaces which templates are in use.</p>
</li>
<li><p><strong>Automated dependency scanning</strong> as part of your template repository's CI to flag pipelines still using old versions.</p>
</li>
</ul>
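<p>For the first option, a scheduled audit pipeline can do the searching for you. This is a rough sketch, not a drop-in solution: it assumes the <code>azure-devops</code> CLI extension is available on the agent, that the build identity can read every repository, and that consumers pin templates with <code>refs/tags/vX.Y.Z</code> references:</p>
<pre><code class="lang-yaml"># Sketch of a weekly template-usage audit (repo access and grep pattern are assumptions)
schedules:
  - cron: '0 6 * * 1'
    displayName: 'Weekly template usage audit'
    branches:
      include:
        - main
    always: true

steps:
  - script: |
      az extension add --name azure-devops --only-show-errors
      export AZURE_DEVOPS_EXT_PAT=$(System.AccessToken)
      # Shallow-clone each repo in the project and report pinned template versions
      for url in $(az repos list \
          --organization "$(System.CollectionUri)" \
          --project "$(System.TeamProject)" \
          --query '[].remoteUrl' -o tsv); do
        repo=$(basename "$url")
        git -c http.extraHeader="Authorization: Bearer $(System.AccessToken)" \
          clone --quiet --depth 1 "$url" "$repo" || continue
        echo "== $repo =="
        grep -rhoE 'refs/tags/v[0-9]+\.[0-9]+\.[0-9]+' "$repo" --include='*.yml' | sort | uniq -c
      done
    displayName: 'Report template versions in use'
</code></pre>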
<h3 id="heading-automating-version-management-with-git">Automating Version Management with Git</h3>
<p>Manual version bumping is tedious and error-prone. Instead, leverage Git and conventional commits to automate versioning and release note generation. Here's how to set it up:</p>
<p><strong>Use Conventional Commits</strong></p>
<p>Adopt the <a target="_blank" href="https://www.conventionalcommits.org/">Conventional Commits</a> specification for your template repository. This standard formats commit messages in a machine-readable way:</p>
<pre><code class="lang-plaintext">feat: add terraform-apply job template
fix: correct parameter validation in docker-build template
docs: update README with usage examples
feat!: rename buildConfig parameter to buildConfiguration
</code></pre>
<p>The commit type (<code>feat</code>, <code>fix</code>, <code>docs</code>, <code>chore</code>, etc.) determines how the version number gets bumped:</p>
<ul>
<li><p><code>feat</code> → minor version bump (1.2.0 → 1.3.0)</p>
</li>
<li><p><code>fix</code> → patch version bump (1.2.0 → 1.2.1)</p>
</li>
<li><p><code>BREAKING CHANGE</code> footer or <code>!</code> suffix → major version bump (1.2.0 → 2.0.0)</p>
</li>
</ul>
<p><strong>Automate Versioning with GitVersion</strong></p>
<p><a target="_blank" href="https://gitversion.net/">GitVersion</a> is a tool that derives semantic version numbers from your Git history. It works great with Azure DevOps pipelines:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># azure-pipelines.yml (in your template repo)</span>
<span class="hljs-attr">trigger:</span>
  <span class="hljs-attr">branches:</span>
    <span class="hljs-attr">include:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">main</span>
<span class="hljs-string">​</span>
<span class="hljs-attr">pool:</span>
  <span class="hljs-attr">vmImage:</span> <span class="hljs-string">'ubuntu-latest'</span>
<span class="hljs-string">​</span>
<span class="hljs-attr">steps:</span>
  <span class="hljs-comment"># GitVersion needs full history; the tag push needs persisted credentials</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">checkout:</span> <span class="hljs-string">self</span>
    <span class="hljs-attr">fetchDepth:</span> <span class="hljs-number">0</span>
    <span class="hljs-attr">persistCredentials:</span> <span class="hljs-literal">true</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">task:</span> <span class="hljs-string">gitversion/setup@0</span>
    <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Install GitVersion'</span>
    <span class="hljs-attr">inputs:</span>
      <span class="hljs-attr">versionSpec:</span> <span class="hljs-string">'5.x'</span>
<span class="hljs-string">​</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">task:</span> <span class="hljs-string">gitversion/execute@0</span>
    <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Calculate version'</span>
    <span class="hljs-attr">inputs:</span>
      <span class="hljs-attr">useConfigFile:</span> <span class="hljs-literal">true</span>
      <span class="hljs-attr">configFilePath:</span> <span class="hljs-string">'GitVersion.yml'</span>
<span class="hljs-string">​</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">|
      echo "Version: $(GitVersion.SemVer)"
      echo "##vso[build.updatebuildnumber]$(GitVersion.SemVer)"
</span>    <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Display version'</span>
<span class="hljs-string">​</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">|
      git tag v$(GitVersion.SemVer)
      git push origin v$(GitVersion.SemVer)
</span>    <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Create and push Git tag'</span>
    <span class="hljs-attr">condition:</span> <span class="hljs-string">and(succeeded(),</span> <span class="hljs-string">eq(variables['Build.SourceBranch'],</span> <span class="hljs-string">'refs/heads/main'</span><span class="hljs-string">))</span>
</code></pre>
<p>Configure GitVersion with a <code>GitVersion.yml</code> file:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">mode:</span> <span class="hljs-string">ContinuousDeployment</span>
<span class="hljs-attr">branches:</span>
  <span class="hljs-attr">main:</span>
    <span class="hljs-attr">tag:</span> <span class="hljs-string">''</span>
  <span class="hljs-attr">develop:</span>
    <span class="hljs-attr">tag:</span> <span class="hljs-string">'beta'</span>
  <span class="hljs-attr">feature:</span>
    <span class="hljs-attr">tag:</span> <span class="hljs-string">'alpha'</span>
<span class="hljs-attr">ignore:</span>
  <span class="hljs-attr">sha:</span> []
</code></pre>
<p>Now every merge to <code>main</code> automatically calculates the next version and creates a Git tag. No manual intervention required.</p>
<p><strong>Generate Changelogs Automatically</strong></p>
<p>Once you're using conventional commits, generating release notes is straightforward. You can use tools like <a target="_blank" href="https://github.com/conventional-changelog/conventional-changelog">conventional-changelog</a> or <a target="_blank" href="https://git-cliff.org/">git-cliff</a>.</p>
<p>Here's a pipeline task that generates a changelog:</p>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">task:</span> <span class="hljs-string">NodeTool@0</span>
  <span class="hljs-attr">inputs:</span>
    <span class="hljs-attr">versionSpec:</span> <span class="hljs-string">'18.x'</span>
  <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Install Node.js'</span>
<span class="hljs-string">​</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">|
    npm install -g conventional-changelog-cli
    conventional-changelog -p angular -i CHANGELOG.md -s -r 0
</span>  <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Generate changelog'</span>
<span class="hljs-string">​</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">|
    git config user.name "Pipeline Bot"
    git config user.email "pipeline@company.com"
    git add CHANGELOG.md
    git commit -m "docs: update changelog for $(GitVersion.SemVer)"
    git push origin HEAD:main
</span>  <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Commit changelog'</span>
  <span class="hljs-attr">condition:</span> <span class="hljs-string">and(succeeded(),</span> <span class="hljs-string">eq(variables['Build.SourceBranch'],</span> <span class="hljs-string">'refs/heads/main'</span><span class="hljs-string">))</span>
</code></pre>
<p>This generates a nicely formatted changelog grouped by commit type:</p>
<pre><code class="lang-plaintext">## [1.3.0] - 2026-02-05
​
### Features
* add terraform-apply job template
* support for multi-cloud deployments
​
### Bug Fixes
* correct parameter validation in docker-build template
* fix kubectl installation on Windows agents
​
### BREAKING CHANGES
* rename buildConfig parameter to buildConfiguration
  Migration: Update all pipeline references from `buildConfig` to `buildConfiguration`
</code></pre>
<p><strong>Create Azure DevOps Releases</strong></p>
<p>Take it a step further and automatically create releases with the generated changelog using the Azure DevOps REST API:</p>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">task:</span> <span class="hljs-string">PowerShell@2</span>
  <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Create Azure DevOps Release'</span>
  <span class="hljs-attr">inputs:</span>
    <span class="hljs-attr">targetType:</span> <span class="hljs-string">'inline'</span>
    <span class="hljs-attr">script:</span> <span class="hljs-string">|
      $changelog = Get-Content CHANGELOG.md -Raw
      $body = @{
        name = "v$(GitVersion.SemVer)"
        description = $changelog
        tagName = "v$(GitVersion.SemVer)"
      } | ConvertTo-Json
</span><span class="hljs-string">​</span>
      <span class="hljs-string">$headers</span> <span class="hljs-string">=</span> <span class="hljs-string">@{</span>
        <span class="hljs-string">Authorization</span> <span class="hljs-string">=</span> <span class="hljs-string">"Bearer $(System.AccessToken)"</span>
        <span class="hljs-string">"Content-Type"</span> <span class="hljs-string">=</span> <span class="hljs-string">"application/json"</span>
      <span class="hljs-string">}</span>
<span class="hljs-string">​</span>
      <span class="hljs-string">$url</span> <span class="hljs-string">=</span> <span class="hljs-string">"$(System.CollectionUri)$(System.TeamProject)/_apis/git/repositories/$(Build.Repository.ID)/releases?api-version=7.1-preview.1"</span>
      <span class="hljs-string">Invoke-RestMethod</span> <span class="hljs-string">-Uri</span> <span class="hljs-string">$url</span> <span class="hljs-string">-Method</span> <span class="hljs-string">Post</span> <span class="hljs-string">-Headers</span> <span class="hljs-string">$headers</span> <span class="hljs-string">-Body</span> <span class="hljs-string">$body</span>
</code></pre>
<p><strong>Enforce Commit Message Standards</strong></p>
<p>To ensure everyone follows conventional commits, add validation to your PR pipeline:</p>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">task:</span> <span class="hljs-string">NodeTool@0</span>
  <span class="hljs-attr">inputs:</span>
    <span class="hljs-attr">versionSpec:</span> <span class="hljs-string">'18.x'</span>
<span class="hljs-string">​</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">|
    npm install -g @commitlint/cli @commitlint/config-conventional
    npx commitlint --from HEAD~1 --to HEAD --verbose
</span>  <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Validate commit messages'</span>
</code></pre>
<p>Create a <code>commitlint.config.js</code> file:</p>
<pre><code class="lang-javascript"><span class="hljs-built_in">module</span>.exports = {
  <span class="hljs-attr">extends</span>: [<span class="hljs-string">'@commitlint/config-conventional'</span>],
  <span class="hljs-attr">rules</span>: {
    <span class="hljs-string">'type-enum'</span>: [<span class="hljs-number">2</span>, <span class="hljs-string">'always'</span>, [
      <span class="hljs-string">'feat'</span>, <span class="hljs-string">'fix'</span>, <span class="hljs-string">'docs'</span>, <span class="hljs-string">'style'</span>, <span class="hljs-string">'refactor'</span>,
      <span class="hljs-string">'test'</span>, <span class="hljs-string">'chore'</span>, <span class="hljs-string">'revert'</span>
    ]]
  }
};
</code></pre>
<p>This fails the PR if commit messages don't follow the standard, keeping your Git history clean and machine-readable.</p>
<p><strong>The Complete Template Repository Pipeline</strong></p>
<p>Putting it all together, here's a full CI/CD pipeline for your template repository:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">trigger:</span>
  <span class="hljs-attr">branches:</span>
    <span class="hljs-attr">include:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">main</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">develop</span>
<span class="hljs-string">​</span>
<span class="hljs-attr">pr:</span>
  <span class="hljs-attr">branches:</span>
    <span class="hljs-attr">include:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">main</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">develop</span>
<span class="hljs-string">​</span>
<span class="hljs-attr">pool:</span>
  <span class="hljs-attr">vmImage:</span> <span class="hljs-string">'ubuntu-latest'</span>
<span class="hljs-string">​</span>
<span class="hljs-attr">stages:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">stage:</span> <span class="hljs-string">Validate</span>
    <span class="hljs-attr">jobs:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">job:</span> <span class="hljs-string">ValidateTemplates</span>
        <span class="hljs-attr">steps:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-attr">task:</span> <span class="hljs-string">NodeTool@0</span>
            <span class="hljs-attr">inputs:</span>
              <span class="hljs-attr">versionSpec:</span> <span class="hljs-string">'18.x'</span>
<span class="hljs-string">​</span>
          <span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">|
              npm install -g @commitlint/cli @commitlint/config-conventional
              npx commitlint --from origin/main --verbose
</span>            <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Validate commit messages'</span>
            <span class="hljs-attr">condition:</span> <span class="hljs-string">eq(variables['Build.Reason'],</span> <span class="hljs-string">'PullRequest'</span><span class="hljs-string">)</span>
<span class="hljs-string">​</span>
          <span class="hljs-comment"># Run your template tests here</span>
          <span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">./scripts/test-templates.sh</span>
            <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Test templates'</span>
<span class="hljs-string">​</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">stage:</span> <span class="hljs-string">Release</span>
    <span class="hljs-attr">condition:</span> <span class="hljs-string">and(succeeded(),</span> <span class="hljs-string">eq(variables['Build.SourceBranch'],</span> <span class="hljs-string">'refs/heads/main'</span><span class="hljs-string">))</span>
    <span class="hljs-attr">jobs:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">job:</span> <span class="hljs-string">CreateRelease</span>
        <span class="hljs-attr">steps:</span>
          <span class="hljs-comment"># Full history for GitVersion; persisted credentials for the tag push</span>
          <span class="hljs-bullet">-</span> <span class="hljs-attr">checkout:</span> <span class="hljs-string">self</span>
            <span class="hljs-attr">fetchDepth:</span> <span class="hljs-number">0</span>
            <span class="hljs-attr">persistCredentials:</span> <span class="hljs-literal">true</span>
          <span class="hljs-bullet">-</span> <span class="hljs-attr">task:</span> <span class="hljs-string">gitversion/setup@0</span>
            <span class="hljs-attr">inputs:</span>
              <span class="hljs-attr">versionSpec:</span> <span class="hljs-string">'5.x'</span>
<span class="hljs-string">​</span>
          <span class="hljs-bullet">-</span> <span class="hljs-attr">task:</span> <span class="hljs-string">gitversion/execute@0</span>
<span class="hljs-string">​</span>
          <span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">|
              npm install -g conventional-changelog-cli
              conventional-changelog -p angular -i CHANGELOG.md -s
</span>            <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Generate changelog'</span>
<span class="hljs-string">​</span>
          <span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">|
              git config user.name "Pipeline Bot"
              git config user.email "pipeline@company.com"
              git tag v$(GitVersion.SemVer)
              git push origin v$(GitVersion.SemVer)
</span>            <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Create version tag'</span>
<span class="hljs-string">​</span>
          <span class="hljs-bullet">-</span> <span class="hljs-attr">task:</span> <span class="hljs-string">PowerShell@2</span>
            <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Create Azure DevOps Release'</span>
            <span class="hljs-attr">inputs:</span>
              <span class="hljs-attr">targetType:</span> <span class="hljs-string">'inline'</span>
              <span class="hljs-attr">script:</span> <span class="hljs-string">|
                $changelog = Get-Content CHANGELOG.md -Raw
                $body = @{
                  name = "v$(GitVersion.SemVer)"
                  description = $changelog
                  tagName = "v$(GitVersion.SemVer)"
                } | ConvertTo-Json
</span><span class="hljs-string">​</span>
                <span class="hljs-string">$headers</span> <span class="hljs-string">=</span> <span class="hljs-string">@{</span>
                  <span class="hljs-string">Authorization</span> <span class="hljs-string">=</span> <span class="hljs-string">"Bearer $(System.AccessToken)"</span>
                  <span class="hljs-string">"Content-Type"</span> <span class="hljs-string">=</span> <span class="hljs-string">"application/json"</span>
                <span class="hljs-string">}</span>
<span class="hljs-string">​</span>
                <span class="hljs-string">$url</span> <span class="hljs-string">=</span> <span class="hljs-string">"$(System.CollectionUri)$(System.TeamProject)/_apis/git/repositories/$(Build.Repository.ID)/releases?api-version=7.1-preview.1"</span>
                <span class="hljs-string">Invoke-RestMethod</span> <span class="hljs-string">-Uri</span> <span class="hljs-string">$url</span> <span class="hljs-string">-Method</span> <span class="hljs-string">Post</span> <span class="hljs-string">-Headers</span> <span class="hljs-string">$headers</span> <span class="hljs-string">-Body</span> <span class="hljs-string">$body</span>
</code></pre>
<p>With this setup, your versioning and release process becomes completely hands-off. Developers just write good commit messages, and the pipeline handles the rest—calculating versions, creating tags, generating changelogs, and publishing releases.</p>
<h2 id="heading-template-granularity-and-structure">Template Granularity and Structure</h2>
<p>Deciding how to break up your templates is part art, part science. Here are some patterns that work well:</p>
<h3 id="heading-start-with-task-level-templates">Start with Task-Level Templates</h3>
<p>Task templates are the smallest reusable unit—think of them as functions in your CI/CD code. They encapsulate a single responsibility:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># templates/tasks/install-kubectl.yml</span>
<span class="hljs-attr">parameters:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">version</span>
    <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
    <span class="hljs-attr">default:</span> <span class="hljs-string">'1.28.0'</span>
<span class="hljs-string">​</span>
<span class="hljs-attr">steps:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">|
      curl -LO "https://dl.k8s.io/release/v${{ parameters.version }}/bin/linux/amd64/kubectl"
      chmod +x kubectl
      sudo mv kubectl /usr/local/bin/
</span>    <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Install kubectl $<span class="hljs-template-variable">{{ parameters.version }}</span>'</span>
</code></pre>
<p>Task templates are great for operations you need to repeat across different jobs or stages.</p>
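<p>As an illustration, here's how a consumer pipeline might reuse that installer across two jobs. The <code>@templates</code> alias follows the repository resource shown earlier; the job names and namespace are placeholders:</p>
<pre><code class="lang-yaml"># Hypothetical consumer reusing the task template in two jobs
jobs:
  - job: Deploy
    steps:
      - template: templates/tasks/install-kubectl.yml@templates
        parameters:
          version: '1.28.0'
      - script: kubectl apply -f ./k8s/ --namespace myapp-dev
        displayName: 'Deploy manifests'

  - job: SmokeTest
    dependsOn: Deploy
    steps:
      # Default kubectl version from the template
      - template: templates/tasks/install-kubectl.yml@templates
      - script: kubectl rollout status deployment/myapp --namespace myapp-dev
        displayName: 'Verify rollout'
</code></pre>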
<h3 id="heading-job-templates-for-reusable-workflows">Job Templates for Reusable Workflows</h3>
<p>Job templates bundle multiple tasks into a cohesive unit of work:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># templates/jobs/dotnet-build.yml</span>
<span class="hljs-attr">parameters:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">dotnetVersion</span>
    <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
    <span class="hljs-attr">default:</span> <span class="hljs-string">'8.x'</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">buildConfiguration</span>
    <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
    <span class="hljs-attr">default:</span> <span class="hljs-string">'Release'</span>
<span class="hljs-string">​</span>
<span class="hljs-attr">jobs:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">job:</span> <span class="hljs-string">Build</span>
    <span class="hljs-attr">pool:</span>
      <span class="hljs-attr">vmImage:</span> <span class="hljs-string">'ubuntu-latest'</span>
    <span class="hljs-attr">steps:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">task:</span> <span class="hljs-string">UseDotNet@2</span>
        <span class="hljs-attr">inputs:</span>
          <span class="hljs-attr">version:</span> <span class="hljs-string">${{</span> <span class="hljs-string">parameters.dotnetVersion</span> <span class="hljs-string">}}</span>
<span class="hljs-string">​</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">dotnet</span> <span class="hljs-string">restore</span>
        <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Restore dependencies'</span>
<span class="hljs-string">​</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">dotnet</span> <span class="hljs-string">build</span> <span class="hljs-string">--configuration</span> <span class="hljs-string">${{</span> <span class="hljs-string">parameters.buildConfiguration</span> <span class="hljs-string">}}</span>
        <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Build'</span>
<span class="hljs-string">​</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">dotnet</span> <span class="hljs-string">test</span>
        <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Run tests'</span>
<span class="hljs-string">​</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">task:</span> <span class="hljs-string">PublishTestResults@2</span>
        <span class="hljs-attr">inputs:</span>
          <span class="hljs-attr">testResultsFormat:</span> <span class="hljs-string">'VSTest'</span>
          <span class="hljs-attr">testResultsFiles:</span> <span class="hljs-string">'**/*.trx'</span>
</code></pre>
<p>Use job templates when you want to standardize the entire execution environment, including pool selection and dependency management.</p>
<h3 id="heading-stage-templates-for-multi-environment-deployments">Stage Templates for Multi-Environment Deployments</h3>
<p>Stage templates are ideal for deployment workflows that need to run across multiple environments:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># templates/stages/deploy-to-k8s.yml</span>
<span class="hljs-attr">parameters:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">environment</span>
    <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">namespace</span>
    <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">manifests</span>
    <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
    <span class="hljs-attr">default:</span> <span class="hljs-string">'./k8s/*.yaml'</span>
<span class="hljs-string">​</span>
<span class="hljs-attr">stages:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">stage:</span> <span class="hljs-string">Deploy_${{</span> <span class="hljs-string">parameters.environment</span> <span class="hljs-string">}}</span>
    <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Deploy to $<span class="hljs-template-variable">{{ parameters.environment }}</span>'</span>
    <span class="hljs-attr">jobs:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">deployment:</span> <span class="hljs-string">DeployToK8s</span>
        <span class="hljs-attr">environment:</span> <span class="hljs-string">${{</span> <span class="hljs-string">parameters.environment</span> <span class="hljs-string">}}</span>
        <span class="hljs-attr">pool:</span>
          <span class="hljs-attr">vmImage:</span> <span class="hljs-string">'ubuntu-latest'</span>
        <span class="hljs-attr">strategy:</span>
          <span class="hljs-attr">runOnce:</span>
            <span class="hljs-attr">deploy:</span>
              <span class="hljs-attr">steps:</span>
                <span class="hljs-bullet">-</span> <span class="hljs-attr">template:</span> <span class="hljs-string">../tasks/install-kubectl.yml</span>
<span class="hljs-string">​</span>
                <span class="hljs-bullet">-</span> <span class="hljs-attr">task:</span> <span class="hljs-string">KubernetesManifest@0</span>
                  <span class="hljs-attr">inputs:</span>
                    <span class="hljs-attr">action:</span> <span class="hljs-string">'deploy'</span>
                    <span class="hljs-attr">namespace:</span> <span class="hljs-string">${{</span> <span class="hljs-string">parameters.namespace</span> <span class="hljs-string">}}</span>
                    <span class="hljs-attr">manifests:</span> <span class="hljs-string">${{</span> <span class="hljs-string">parameters.manifests</span> <span class="hljs-string">}}</span>
</code></pre>
<p>Then consume it in your pipeline:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">stages:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">template:</span> <span class="hljs-string">templates/stages/deploy-to-k8s.yml@templates</span>
    <span class="hljs-attr">parameters:</span>
      <span class="hljs-attr">environment:</span> <span class="hljs-string">'dev'</span>
      <span class="hljs-attr">namespace:</span> <span class="hljs-string">'myapp-dev'</span>
<span class="hljs-string">​</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">template:</span> <span class="hljs-string">templates/stages/deploy-to-k8s.yml@templates</span>
    <span class="hljs-attr">parameters:</span>
      <span class="hljs-attr">environment:</span> <span class="hljs-string">'prod'</span>
      <span class="hljs-attr">namespace:</span> <span class="hljs-string">'myapp-prod'</span>
</code></pre>
<h3 id="heading-organizing-your-template-repository">Organizing Your Template Repository</h3>
<p>A well-structured template repository looks something like this:</p>
<pre><code class="lang-plaintext">pipeline-templates/
├── README.md
├── CHANGELOG.md
├── templates/
│   ├── tasks/
│   │   ├── install-kubectl.yml
│   │   ├── docker-build.yml
│   │   └── security-scan.yml
│   ├── jobs/
│   │   ├── dotnet-build.yml
│   │   ├── node-build.yml
│   │   └── terraform-plan.yml
│   ├── stages/
│   │   ├── deploy-to-k8s.yml
│   │   └── deploy-to-azure-app-service.yml
│   └── pipelines/
│       ├── microservice-standard.yml
│       └── infrastructure-deploy.yml
└── examples/
    ├── dotnet-api-pipeline.yml
    └── terraform-pipeline.yml
</code></pre>
<p><strong>Tasks</strong> are atomic operations. <strong>Jobs</strong> group tasks into a single unit of work on one agent. <strong>Stages</strong> orchestrate jobs into phases like build, test, and deploy. <strong>Pipelines</strong> are full end-to-end templates that teams can use with minimal customization.</p>
<p>Include examples that show how to use your templates—this dramatically reduces onboarding friction.</p>
<h2 id="heading-best-practices-and-patterns">Best Practices and Patterns</h2>
<p>Here are some patterns that'll save you headaches:</p>
<h3 id="heading-use-parameters-with-defaults">Use Parameters with Defaults</h3>
<p>Always provide sensible defaults so teams can use templates without needing to specify every parameter:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">parameters:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">buildConfiguration</span>
    <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
    <span class="hljs-attr">default:</span> <span class="hljs-string">'Release'</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">runTests</span>
    <span class="hljs-attr">type:</span> <span class="hljs-string">boolean</span>
    <span class="hljs-attr">default:</span> <span class="hljs-literal">true</span>
</code></pre>
<h3 id="heading-validate-inputs">Validate Inputs</h3>
<p>Use parameter validation to catch errors early:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">parameters:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">environment</span>
    <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
    <span class="hljs-attr">values:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">dev</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">staging</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">prod</span>
</code></pre>
<h3 id="heading-keep-templates-focused">Keep Templates Focused</h3>
<p>A template should do one thing well. If your template is hundreds of lines long and takes 20 parameters, it's probably trying to do too much. Break it into smaller, composable pieces.</p>
<h3 id="heading-use-conditional-logic-sparingly">Use Conditional Logic Sparingly</h3>
<p>Templates can include conditions, but overusing them makes templates harder to understand and maintain:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">steps:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-string">${{</span> <span class="hljs-string">if</span> <span class="hljs-string">eq(parameters.environment,</span> <span class="hljs-string">'prod'</span><span class="hljs-string">)</span> <span class="hljs-string">}}:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">echo</span> <span class="hljs-string">"Running production-only checks"</span>
      <span class="hljs-attr">displayName:</span> <span class="hljs-string">'Production validation'</span>
</code></pre>
<p>This is fine for simple cases, but if you find yourself writing complex conditional trees, consider creating separate templates instead.</p>
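<p>For example, rather than one deploy template branching internally on every environment difference, the call site can select a focused template. A minimal sketch, assuming this lives in a template that takes an <code>environment</code> parameter and that the two stage templates exist:</p>
<pre><code class="lang-yaml"># Pick a focused template at the call site instead of branching inside one template
stages:
  - ${{ if eq(parameters.environment, 'prod') }}:
      - template: templates/stages/deploy-prod.yml@templates
        parameters:
          namespace: 'myapp-prod'
  - ${{ if ne(parameters.environment, 'prod') }}:
      - template: templates/stages/deploy-nonprod.yml@templates
        parameters:
          namespace: 'myapp-${{ parameters.environment }}'
</code></pre>
<p>Each template stays small and readable, and the single condition at the boundary is the only branching left.</p>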
<h3 id="heading-document-your-templates">Document Your Templates</h3>
<p>Every template should have a header comment explaining:</p>
<ul>
<li><p>What it does</p>
</li>
<li><p>Required and optional parameters</p>
</li>
<li><p>Example usage</p>
</li>
<li><p>Any prerequisites (service connections, variable groups, etc.)</p>
</li>
</ul>
<pre><code class="lang-yaml"><span class="hljs-comment"># Docker Build and Push Template</span>
<span class="hljs-comment">#</span>
<span class="hljs-comment"># Builds a Docker image and pushes it to a container registry.</span>
<span class="hljs-comment"># Includes vulnerability scanning via docker scan.</span>
<span class="hljs-comment">#</span>
<span class="hljs-comment"># Parameters:</span>
<span class="hljs-comment">#   - imageName (required): Name of the image to build</span>
<span class="hljs-comment">#   - dockerfilePath (optional): Path to Dockerfile (default: './Dockerfile')</span>
<span class="hljs-comment">#   - tags (optional): List of tags to apply (default: ['latest'])</span>
<span class="hljs-comment">#</span>
<span class="hljs-comment"># Example:</span>
<span class="hljs-comment">#   - template: templates/docker-build.yml@templates</span>
<span class="hljs-comment">#     parameters:</span>
<span class="hljs-comment">#       imageName: 'myapp'</span>
<span class="hljs-comment">#       tags: ['latest', '$(Build.BuildId)']</span>
</code></pre>
<h3 id="heading-leverage-variable-templates">Leverage Variable Templates</h3>
<p>You can also use templates for variables, which is handy for environment-specific configuration:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># templates/variables/prod-vars.yml</span>
<span class="hljs-attr">variables:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">resourceGroup</span>
    <span class="hljs-attr">value:</span> <span class="hljs-string">'prod-rg'</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">storageAccount</span>
    <span class="hljs-attr">value:</span> <span class="hljs-string">'prodstorageacct'</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">logLevel</span>
    <span class="hljs-attr">value:</span> <span class="hljs-string">'Warning'</span>
</code></pre>
<p>Then reference it:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">variables:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">template:</span> <span class="hljs-string">templates/variables/prod-vars.yml@templates</span>
</code></pre>
<h3 id="heading-test-your-templates">Test Your Templates</h3>
<p>Treat templates like code—they need testing too. Set up a test pipeline that consumes your templates with various parameter combinations. Run it on every PR to catch regressions before they hit production.</p>
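<p>A minimal sketch of such a test pipeline (the file paths are hypothetical, and this assumes the stage template derives a unique stage name from its environment parameter):</p>
<pre><code class="lang-yaml"># test-templates.yml — runs in the template repository on every PR
trigger: none

stages:
  - template: templates/deploy-stage.yml
    parameters:
      environment: dev
  - template: templates/deploy-stage.yml
    parameters:
      environment: staging
  - template: templates/deploy-stage.yml
    parameters:
      environment: prod
</code></pre>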
<h2 id="heading-wrapping-up">WRAPPING UP</h2>
<p>Pipeline templates are one of Azure DevOps' most powerful features for scaling CI/CD practices across an organization. By centralizing common workflows, you gain consistency, reduce maintenance burden, and enforce standards without turning the platform team into a bottleneck.</p>
<p>Key takeaways:</p>
<ul>
<li><p><strong>Structure templates by scope</strong>: tasks for single operations, jobs for workflows, stages for multi-environment processes.</p>
</li>
<li><p><strong>Version templates with Git tags</strong> and use semantic versioning to communicate breaking changes.</p>
</li>
<li><p><strong>Keep templates focused and composable</strong>—smaller, single-purpose templates are easier to maintain and test.</p>
</li>
<li><p><strong>Document thoroughly</strong> and provide examples to reduce onboarding friction.</p>
</li>
</ul>
<p>A few gotchas to watch out for: Azure DevOps caches template repositories, so sometimes you'll need to manually refresh or re-trigger pipelines to pick up changes. Also, template expressions (<code>${{ }}</code>) are evaluated at queue time, which can trip you up if you're used to runtime variables.</p>
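<p>A quick illustration of the difference — the first step is resolved when the run is compiled, the second only when the job executes on an agent:</p>
<pre><code class="lang-yaml">steps:
  # Template expression: expanded at queue/compile time
  - script: echo "Deploying to ${{ parameters.environment }}"
  # Macro syntax: resolved at runtime on the agent
  - script: echo "Build number is $(Build.BuildId)"
</code></pre>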
<p>If you're managing templates across multiple teams, consider setting up a governance model—define who owns the template repo, establish a review process for changes, and create a feedback loop so teams can request new templates or improvements.</p>
<p>Done right, pipeline templates become the foundation of a self-service CI/CD platform where teams can ship fast without sacrificing consistency or security.</p>
]]></content:encoded></item><item><title><![CDATA[Powershell: Custom prompt using oh-my-posh]]></title><description><![CDATA[Install a nerd font from https://www.nerdfonts.com/font-downloads
Install oh-my-posh using winget

winget install oh-my-posh

// Close & reopen powershell window

// Check oh-my-posh version
oh-my-posh version

// Upgrade oh-my-posh using winget
wing...]]></description><link>https://blog.vivekdhami.com/ps-custom-prompt</link><guid isPermaLink="true">https://blog.vivekdhami.com/ps-custom-prompt</guid><category><![CDATA[Powershell]]></category><dc:creator><![CDATA[Vivek Dhami]]></dc:creator><pubDate>Wed, 15 Nov 2023 09:09:13 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/cckf4TsHAuw/upload/45fb5883dc7227030fccf90b0795b030.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<ol>
<li>Install a nerd font from <a target="_blank" href="https://www.nerdfonts.com/font-downloads">https://www.nerdfonts.com/font-downloads</a></li>
<li>Install oh-my-posh using winget</li>
</ol>
<pre><code class="lang-powershell">winget install oh-my-posh

<span class="hljs-comment"># Close &amp; reopen the PowerShell window</span>

<span class="hljs-comment"># Check oh-my-posh version</span>
oh-my-posh version

<span class="hljs-comment"># Upgrade oh-my-posh using winget</span>
winget upgrade oh-my-posh
</code></pre>
<ol start="3">
<li>Install Terminal-Icons and import them for displaying.</li>
</ol>
<pre><code class="lang-powershell">
<span class="hljs-comment"># Install the module once:</span>
<span class="hljs-built_in">Install-Module</span> <span class="hljs-literal">-Name</span> Terminal<span class="hljs-literal">-Icons</span> <span class="hljs-literal">-Repository</span> PSGallery <span class="hljs-literal">-Scope</span> CurrentUser

<span class="hljs-comment"># Open the current user profile:</span>
code <span class="hljs-variable">$PROFILE</span>

<span class="hljs-comment"># Add the following line to the profile file:</span>
<span class="hljs-built_in">Import-Module</span> <span class="hljs-literal">-Name</span> Terminal<span class="hljs-literal">-Icons</span>
</code></pre>
<p>This will install Terminal-Icons and import it for displaying common file and folder icons in the PowerShell terminal.</p>
<ol start="4">
<li>Initialize oh-my-posh with a theme (amro in this case).</li>
</ol>
<pre><code class="lang-powershell"><span class="hljs-built_in">oh</span><span class="hljs-literal">-my</span><span class="hljs-literal">-posh</span> init pwsh -<span class="hljs-literal">-config</span> <span class="hljs-string">"<span class="hljs-variable">$env:POSH_THEMES_PATH</span>\amro.omp.json"</span> | <span class="hljs-built_in">Invoke-Expression</span>
</code></pre>
<p>Themes are available at: <a target="_blank" href="https://ohmyposh.dev/docs/themes">https://ohmyposh.dev/docs/themes</a></p>
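<p>Note that running the init line only affects the current session. To make the custom prompt persistent, add it to your PowerShell profile as well — a sketch of the resulting <code>$PROFILE</code> file:</p>
<pre><code class="lang-powershell"># $PROFILE — persists the custom prompt across sessions
Import-Module -Name Terminal-Icons
oh-my-posh init pwsh --config "$env:POSH_THEMES_PATH\amro.omp.json" | Invoke-Expression
</code></pre>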
]]></content:encoded></item><item><title><![CDATA[Share WSL rootless Podman instance with Windows]]></title><description><![CDATA[Background
Podman Desktop is the easiest way to start using Podman with windows. However currently Podman Desktop creates a new WSL instance with podman and connects windows to it. While it is pretty convenient for most of the users, this still requi...]]></description><link>https://blog.vivekdhami.com/share-wsl-rootless-podman-instance-with-windows</link><guid isPermaLink="true">https://blog.vivekdhami.com/share-wsl-rootless-podman-instance-with-windows</guid><category><![CDATA[podman]]></category><category><![CDATA[containers]]></category><dc:creator><![CDATA[Vivek Dhami]]></dc:creator><pubDate>Sat, 15 Jul 2023 10:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/HjBOmBPbi9k/upload/08f6293d54acc6e68ab6b72f34773925.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h4 id="heading-background">Background</h4>
<p><a target="_blank" href="https://podman-desktop.io/">Podman Desktop</a> is the easiest way to start using Podman on Windows. However, Podman Desktop currently creates a new WSL instance with podman and connects Windows to it. While that is convenient for most users, it still requires another WSL instance, which may not suit you if you already have a preconfigured WSL dev environment.</p>
<p>So in this post I will share the setup I have been using to share the rootless podman running in a WSL instance with Windows.</p>
<h4 id="heading-prerequisites">Prerequisites</h4>
<p>We will be using systemd for setting up rootless podman in WSL. So the only prerequisite for this setup is that the WSL version installed on your Windows system supports systemd.</p>
<p>Systemd support was added to WSL in version <a target="_blank" href="https://devblogs.microsoft.com/commandline/systemd-support-is-now-available-in-wsl/">0.67.6</a>. You can check the WSL version via the following command:</p>
<pre><code class="lang-bash">
wsl --version <span class="hljs-comment"># Should be 0.67.6 or above.</span>
</code></pre>
<p>If your WSL version is below that, you can check for WSL updates via the <code>wsl --update</code> command.</p>
<p>Furthermore, systemd is not enabled in WSL by default and you can enable it at boot by creating a WSL configuration file <strong>/etc/wsl.conf</strong> within your WSL instance.</p>
<pre><code class="lang-yml">
<span class="hljs-comment"># /etc/wsl.conf file content</span>

[<span class="hljs-string">boot</span>]
<span class="hljs-string">systemd=true</span>
</code></pre>
<p>Restart your WSL instance and you can check systemd status after reboot using this command:</p>
<pre><code class="lang-bash">systemctl list-unit-files --<span class="hljs-built_in">type</span>=service
</code></pre>
<h4 id="heading-install-podman-on-wsl">Install Podman on WSL</h4>
<p>You can follow the podman installation steps for installing podman on your WSL instance.</p>
<p>For Ubuntu/Debian, installation via apt currently only supports v3.4.4 of podman and it is pretty outdated. However there is an updated version available via an unofficial source. You can read more about it <a target="_blank" href="https://www.reddit.com/r/podman/comments/10wvjjp/how_i_backported_podman_4_on_ubuntu_2204/">here</a> and <a target="_blank" href="https://launchpad.net/~quarckster/+archive/ubuntu/containers">here</a>.</p>
<pre><code class="lang-bash">sudo apt install podman <span class="hljs-comment"># installs podman 3.4.4</span>

<span class="hljs-comment"># For a more updated version: https://www.reddit.com/r/podman/comments/10wvjjp/how_i_backported_podman_4_on_ubuntu_2204/</span>
<span class="hljs-comment"># https://launchpad.net/~quarckster/+archive/ubuntu/containers</span>

sudo add-apt-repository ppa:quarckster/containers
sudo apt update
sudo apt install podman <span class="hljs-comment"># now installs the backported podman 4.x</span>
</code></pre>
<p>After installation, you can confirm the podman installation by running the following commands:</p>
<pre><code class="lang-bash">
podman version <span class="hljs-comment"># shows podman version</span>

podman info <span class="hljs-comment"># shows information about podman instance. Of particular interest are remoteSocket.Path and security.rootless=true.</span>
</code></pre>
<h4 id="heading-connect-to-podman-service-in-wsl-using-ssh">Connect to podman service in WSL using SSH</h4>
<p>We will be using ssh to connect to the podman service running in WSL from Windows.</p>
<p>Run the following commands to set up ssh in WSL if it is not already set up:</p>
<pre><code class="lang-bash">
<span class="hljs-comment"># Install and enable ssh in Ubuntu/Debian. Please search online for other Linux distributions. </span>
sudo apt install openssh-server
sudo systemctl <span class="hljs-built_in">enable</span> --now ssh.service <span class="hljs-comment"># --now also starts the service immediately</span>
</code></pre>
<p>Generate ssh keys and add them to the WSL authorized keys list:</p>
<pre><code class="lang-bash">
<span class="hljs-comment"># Generate and export ssh keys</span>
<span class="hljs-built_in">export</span> WINDOWS_HOME=/mnt/c/Users/&lt;username&gt;
ssh-keygen -b 2048 -t rsa -f <span class="hljs-variable">$WINDOWS_HOME</span>/.ssh/id_rsa_podman -q -N <span class="hljs-string">""</span>

<span class="hljs-comment"># Add to authorized list for ssh connection</span>
cat <span class="hljs-variable">$WINDOWS_HOME</span>/.ssh/id_rsa_podman.pub &gt;&gt; ~/.ssh/authorized_keys
</code></pre>
<p>Enable podman rootless service:</p>
<pre><code class="lang-bash">
<span class="hljs-comment"># Enable podman rootless service</span>
systemctl --user <span class="hljs-built_in">enable</span> --now podman.socket <span class="hljs-comment"># --now also starts the socket immediately</span>

<span class="hljs-comment"># Enable systemd services to continue to work even after user log offs</span>
sudo loginctl enable-linger <span class="hljs-variable">$USER</span>

<span class="hljs-comment"># Check podman remote information</span>
podman --remote info

<span class="hljs-comment"># Check podman socket fullpath. Needed while adding connection in windows.</span>
ls <span class="hljs-variable">$XDG_RUNTIME_DIR</span>/podman/podman.sock
</code></pre>
<p>Finally, from a Windows terminal, add the podman connection and set it as default. You will need the podman CLI for this step, and the easiest way to install it on Windows is using <a target="_blank" href="https://podman-desktop.io/">Podman Desktop</a>.</p>
<pre><code class="lang-bash">
<span class="hljs-comment"># Add podman connection using ssh key and podman socket path</span>
podman system connection add wsl-podman --identity C:\Users\&lt;username&gt;\.ssh\id_rsa_podman ssh://&lt;username&gt;@localhost/run/user/&lt;userid&gt;/podman/podman.sock
<span class="hljs-comment"># Replace /run/user/&lt;userid&gt;/podman/podman.sock above with the path printed by: ls $XDG_RUNTIME_DIR/podman/podman.sock</span>

<span class="hljs-comment"># set the new connection as default if not set already</span>
podman system connection default wsl-podman

<span class="hljs-comment"># Check if everything works</span>
podman info
podman images
</code></pre>
<p>This should connect Windows with the WSL podman instance. If you face connection issues, check whether the WSL instance is running.</p>
<h4 id="heading-known-issue-ubuntu-wsl">Known issue: Ubuntu WSL</h4>
<p>Due to a <a target="_blank" href="https://github.com/microsoft/WSL/issues/10205">bug</a> in WSL2 for Ubuntu (and possibly other distributions as well), after restarting the WSL Ubuntu instance the file <strong>/run/user/&lt;userid&gt;/bus</strong> is nuked, and as a result <strong>systemctl --user</strong> fails with a <strong>file not found</strong> error.</p>
<p>The fix is to run the following command every time after restarting the WSL VM:</p>
<pre><code class="lang-bash">
sudo systemctl restart user@&lt;userid&gt;
</code></pre>
<p>A better fix is to disable <strong>wslg</strong> within WSL if you are not using graphical Linux apps, as it is this particular mount that causes the issue. <strong>wslg</strong> can be disabled globally by adding a <strong>.wslconfig</strong> file at the <strong>%USERPROFILE%</strong> location in Windows:</p>
<pre><code class="lang-yaml">[<span class="hljs-string">wsl2</span>]
<span class="hljs-string">guiApplications=false</span>
</code></pre>
<p>That's all, folks. I hope you were able to make your podman instance in WSL work with Windows.</p>
]]></content:encoded></item><item><title><![CDATA[How Linux networking configuration evolved from ifconfig to systemd]]></title><description><![CDATA[If you've been working with Linux for more than a few years, you've probably noticed that networking configuration has gone through some serious changes. What started as simple text files and a handful of commands has evolved into a complex ecosystem...]]></description><link>https://blog.vivekdhami.com/how-linux-networking-configuration-evolved-from-ifconfig-to-systemd</link><guid isPermaLink="true">https://blog.vivekdhami.com/how-linux-networking-configuration-evolved-from-ifconfig-to-systemd</guid><category><![CDATA[Linux]]></category><category><![CDATA[networking]]></category><dc:creator><![CDATA[Vivek Dhami]]></dc:creator><pubDate>Tue, 16 May 2023 22:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/SwVkmowt7qA/upload/67f02f45ded9ac0c0d0570df159561ab.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you've been working with Linux for more than a few years, you've probably noticed that networking configuration has gone through some serious changes. What started as simple text files and a handful of commands has evolved into a complex ecosystem of tools, daemons, and abstractions. And if you're like me, you've probably typed <code>ifconfig</code> only to be met with "command not found" on a fresh Ubuntu install at least once.</p>
<p>Let's walk through how we got here—from the early days of net-tools to the systemd-dominated present—and understand why these changes happened in the first place.</p>
<h2 id="heading-the-old-guard-net-tools-and-friends">THE OLD GUARD: NET-TOOLS AND FRIENDS</h2>
<p>Back in the day (we're talking late 90s through early 2010s), Linux networking was straightforward. You had a small set of tools that did specific jobs:</p>
<ul>
<li><p><code>ifconfig</code> - configure network interfaces</p>
</li>
<li><p><code>route</code> - manage routing tables</p>
</li>
<li><p><code>netstat</code> - network statistics and connections</p>
</li>
<li><p><code>arp</code> - ARP cache manipulation</p>
</li>
<li><p><code>/etc/resolv.conf</code> - DNS configuration (literally just a text file)</p>
</li>
</ul>
<p>This was the <strong>net-tools</strong> package, and it worked. Sort of. The problem was that these tools were designed for a simpler time when Linux networking meant "configure eth0 with a static IP and you're done." As networking grew more complex—VLANs, bridges, tunnels, multiple routing tables, policy-based routing—net-tools started showing its age.</p>
<p>Here's what a typical interface configuration looked like:</p>
<pre><code class="lang-plaintext"># Bring up eth0 with a static IP
ifconfig eth0 192.168.1.100 netmask 255.255.255.0 up
​
# Add a default route
route add default gw 192.168.1.1
​
# Check what you just did
ifconfig eth0
route -n
</code></pre>
<p>Simple enough for basic use cases, but try doing anything advanced and you'd quickly hit walls. Want to create a VLAN? You'd need <code>vconfig</code>. Bridge interfaces? That's <code>brctl</code>. The tooling was fragmented, and the underlying kernel interfaces (ioctls) these tools used were showing their limitations.</p>
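<p>For flavor, here is roughly what that fragmentation looked like — every task had its own tool (these commands come from the legacy <code>vlan</code> and <code>bridge-utils</code> packages, shown purely as a historical illustration):</p>
<pre><code class="lang-plaintext"># VLANs needed vconfig (from the old "vlan" package)
vconfig add eth0 100
ifconfig eth0.100 10.0.100.1 netmask 255.255.255.0 up

# Bridges needed brctl (from bridge-utils)
brctl addbr br0
brctl addif br0 eth0
ifconfig br0 up
</code></pre>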
<h2 id="heading-enter-iproute2-the-modernization">ENTER IPROUTE2: THE MODERNIZATION</h2>
<p>Around 2003, the <code>iproute2</code> suite started gaining traction. The crown jewel was the <code>ip</code> command—a single unified tool that could handle interfaces, routing, tunnels, and more. It was built on top of the newer netlink socket interface, which gave it more power and flexibility than the old ioctl-based tools.</p>
<p>The same configuration from above now looks like this:</p>
<pre><code class="lang-plaintext"># Bring up eth0 with a static IP
ip addr add 192.168.1.100/24 dev eth0
ip link set eth0 up
​
# Add a default route
ip route add default via 192.168.1.1
​
# Check what you just did
ip addr show eth0
ip route show
</code></pre>
<p>Notice the CIDR notation (<code>/24</code> instead of <code>netmask 255.255.255.0</code>)—this was one of many improvements. But more importantly, <code>ip</code> could do things that net-tools simply couldn't:</p>
<pre><code class="lang-plaintext"># Create a VLAN
ip link add link eth0 name eth0.100 type vlan id 100
​
# Set up policy-based routing with multiple routing tables
ip rule add from 10.0.0.0/8 table 100
ip route add default via 192.168.2.1 table 100
​
# Configure network namespaces (crucial for containers)
ip netns add production
ip link set eth1 netns production
</code></pre>
<p>This last feature—network namespaces—became absolutely critical for containerization. Docker, Kubernetes, and basically every modern container runtime relies on network namespaces to isolate network stacks between containers.</p>
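<p>A sketch of the kind of isolation containers rely on — a namespace connected to the host through a veth pair (addresses are arbitrary examples; requires root):</p>
<pre><code class="lang-plaintext"># Create a namespace and a veth pair, one end in each stack
ip netns add demo
ip link add veth-host type veth peer name veth-demo
ip link set veth-demo netns demo

# Address both ends and bring them up
ip addr add 10.200.0.1/24 dev veth-host
ip link set veth-host up
ip netns exec demo ip addr add 10.200.0.2/24 dev veth-demo
ip netns exec demo ip link set veth-demo up
ip netns exec demo ip link set lo up

# The namespace now has its own isolated network stack
ip netns exec demo ping -c 1 10.200.0.1
</code></pre>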
<p>By the mid-2010s, major distributions started deprecating net-tools. Debian and Ubuntu stopped installing it by default. Red Hat marked it as deprecated. The message was clear: learn <code>ip</code> or get left behind.</p>
<h2 id="heading-interface-management-gets-complicated">INTERFACE MANAGEMENT GETS COMPLICATED</h2>
<p>So now we had better low-level tools, but who was actually calling them? In the early days, you'd have distribution-specific scripts:</p>
<ul>
<li><p><strong>Debian/Ubuntu</strong>: <code>/etc/network/interfaces</code> file, managed by <code>ifupdown</code></p>
</li>
<li><p><strong>Red Hat/CentOS</strong>: <code>/etc/sysconfig/network-scripts/ifcfg-*</code> files, managed by <code>network</code> service</p>
</li>
<li><p><strong>SUSE</strong>: YaST with its own configuration format</p>
</li>
</ul>
<p>Each distribution did things differently, and migrating configurations between distros was painful. Worse, these systems were designed for static configurations—they struggled with dynamic scenarios like laptops moving between networks or cloud instances with changing metadata.</p>
<p>Two major players emerged to solve this:</p>
<h3 id="heading-networkmanager">NetworkManager</h3>
<p>NetworkManager arrived to handle dynamic networking scenarios. It was perfect for desktops and laptops that needed to switch between WiFi networks, handle VPNs, and deal with frequent network changes. It introduced D-Bus APIs for managing connections and provided both GUI and CLI tools:</p>
<pre><code class="lang-plaintext"># List connections
nmcli connection show
​
# Connect to WiFi
nmcli device wifi connect "MyNetwork" password "MyPassword"
​
# Create a new static connection
nmcli connection add con-name "static-eth0" \
  ifname eth0 type ethernet \
  ip4 192.168.1.100/24 gw4 192.168.1.1
</code></pre>
<p>NetworkManager became the default on Fedora, RHEL, and Ubuntu Desktop. It's great for interactive systems but can be overkill for servers with static configurations.</p>
<h3 id="heading-systemd-networkd">systemd-networkd</h3>
<p>When systemd took over init, it brought its own networking daemon: <code>systemd-networkd</code>. Unlike NetworkManager, it's designed for servers and embedded systems with relatively static configurations. Configuration lives in <code>.network</code> files under <code>/etc/systemd/network/</code>:</p>
<pre><code class="lang-plaintext"># /etc/systemd/network/10-static-eth0.network
[Match]
Name=eth0
​
[Network]
Address=192.168.1.100/24
Gateway=192.168.1.1
DNS=8.8.8.8
DNS=8.8.4.4
</code></pre>
<p>Then enable and start it:</p>
<pre><code class="lang-plaintext">systemctl enable --now systemd-networkd
</code></pre>
<p>The systemd approach is declarative and integrates cleanly with the rest of the systemd ecosystem. Cloud images (especially for AWS, GCP, Azure) often use systemd-networkd because it's lightweight and predictable.</p>
<p>Here's where it gets interesting: on a modern Linux system, you might have multiple things trying to manage networking. You could have <code>systemd-networkd</code> managing physical interfaces while <code>docker</code> manages bridge interfaces and <code>NetworkManager</code> handles WiFi. They usually stay out of each other's way, but conflicts can happen.</p>
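<p>When managers do collide, the usual fix is to explicitly exclude an interface from one of them. Both knobs shown here exist, though the interface name is just an example:</p>
<pre><code class="lang-plaintext"># Tell NetworkManager to leave eth0 alone
nmcli device set eth0 managed no

# Or tell systemd-networkd to ignore it:
# /etc/systemd/network/05-ignore-eth0.network
[Match]
Name=eth0

[Link]
Unmanaged=yes
</code></pre>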
<h2 id="heading-dns-resolution-from-simple-to-sophisticated">DNS RESOLUTION: FROM SIMPLE TO... SOPHISTICATED?</h2>
<p>Now let's talk about DNS, because this has gotten wild.</p>
<h3 id="heading-the-old-way-etcresolvconf">The old way: /etc/resolv.conf</h3>
<p>For decades, DNS configuration was dead simple. You edited <code>/etc/resolv.conf</code>:</p>
<pre><code class="lang-plaintext">nameserver 8.8.8.8
nameserver 8.8.4.4
search example.com
</code></pre>
<p>Done. Every application used the standard resolver library (libc), which read this file and made DNS queries. Simple, predictable, and easy to debug.</p>
<h3 id="heading-the-problem-with-simple">The problem with simple</h3>
<p>But simple broke down in modern scenarios:</p>
<ul>
<li><p><strong>Split DNS</strong>: VPN connections might need different DNS servers for internal domains</p>
</li>
<li><p><strong>DNSSEC</strong>: validating DNS responses cryptographically</p>
</li>
<li><p><strong>DNS over TLS/HTTPS</strong>: encrypting DNS queries for privacy</p>
</li>
<li><p><strong>mDNS/LLMNR</strong>: local network name resolution (think <code>.local</code> domains)</p>
</li>
<li><p><strong>Per-interface DNS</strong>: different DNS servers for different network interfaces</p>
</li>
</ul>
<p>Applications started implementing their own DNS logic. Browsers added DNS-over-HTTPS. VPN clients fought over <code>/etc/resolv.conf</code>. It was chaos.</p>
<h3 id="heading-systemd-resolved-to-the-rescue-sort-of">systemd-resolved to the rescue (sort of)</h3>
<p>systemd-resolved came along to centralize all this complexity. It's a local DNS resolver that sits between applications and actual DNS servers.</p>
<p>On a systemd-resolved system, <code>/etc/resolv.conf</code> becomes a symlink to a stub file:</p>
<pre><code class="lang-plaintext">$ ls -l /etc/resolv.conf
lrwxrwxrwx 1 root root 39 Jan 15 10:23 /etc/resolv.conf -&gt; ../run/systemd/resolve/stub-resolv.conf
​
$ cat /etc/resolv.conf
nameserver 127.0.0.53  # Points to systemd-resolved
</code></pre>
<p>All DNS queries go to <code>127.0.0.53</code>, where systemd-resolved handles them intelligently. The <code>resolvectl</code> command (which replaced the older <code>systemd-resolve</code>) is your main interface for managing and debugging DNS:</p>
<pre><code class="lang-plaintext"># Check overall DNS status and per-interface DNS servers
resolvectl status
​
# Query a domain and see which server answered
resolvectl query example.com
​
# Query a specific record type and class explicitly
resolvectl query --type=A --class=IN example.com
​
# Flush DNS cache
resolvectl flush-caches
​
# Get statistics (cache hits, transaction counts)
resolvectl statistics
</code></pre>
<h4 id="heading-configuring-dns-with-systemd-networkd">Configuring DNS with systemd-networkd</h4>
<p>The most common way to configure DNS is through systemd-networkd's <code>.network</code> files. You can set DNS servers per-interface:</p>
<pre><code class="lang-plaintext"># /etc/systemd/network/10-eth0.network
[Match]
Name=eth0
​
[Network]
DHCP=no
Address=192.168.1.100/24
Gateway=192.168.1.1
​
# Primary and fallback DNS servers
DNS=1.1.1.1
DNS=1.0.0.1
​
# Domains to search (for short hostnames)
Domains=internal.company.com
​
# Or: route only specific domains through these DNS servers (split DNS)
Domains=~internal.company.com ~vpn.company.com
</code></pre>
<p>The <code>~</code> prefix on domains means "route DNS queries for this domain through this interface's DNS servers." This is crucial for VPN scenarios where you want corporate domains resolved by corporate DNS, but everything else goes to your regular DNS.</p>
<p>After editing, restart systemd-networkd:</p>
<pre><code class="lang-plaintext">systemctl restart systemd-networkd
</code></pre>
<h4 id="heading-global-dns-configuration-with-resolvedconf">Global DNS configuration with resolved.conf</h4>
<p>For system-wide DNS settings, edit <code>/etc/systemd/resolved.conf</code>:</p>
<pre><code class="lang-plaintext">[Resolve]
# Fallback DNS if no interface provides DNS servers
DNS=8.8.8.8 1.1.1.1
​
# Route all lookups to the global DNS servers above (~. matches every domain)
Domains=~.
​
# Enable DNSSEC validation (yes/allow-downgrade/no)
DNSSEC=allow-downgrade
​
# Enable DNS over TLS (yes/opportunistic/no)
DNSOverTLS=opportunistic
​
# Enable mDNS for .local domains
MulticastDNS=yes
​
# Enable LLMNR for local network name resolution
LLMNR=yes
​
# DNS stub listener address (default: 127.0.0.53)
DNSStubListener=yes
​
# Cache entries
Cache=yes
CacheFromLocalhost=no
</code></pre>
<p>After changing <code>resolved.conf</code>, restart the service:</p>
<pre><code class="lang-plaintext">systemctl restart systemd-resolved
</code></pre>
<h4 id="heading-split-dns-for-vpn-scenarios">Split DNS for VPN scenarios</h4>
<p>Here's a real-world example: you're connected to a corporate VPN via <code>tun0</code> and want corporate domains (<code>*.corp.example.com</code>) resolved through the VPN's DNS, but everything else through Cloudflare:</p>
<pre><code class="lang-plaintext"># /etc/systemd/network/10-vpn.network
[Match]
Name=tun0
​
[Network]
DNS=10.10.10.10  # Corporate DNS server
Domains=~corp.example.com ~internal.example.com
</code></pre>
<p>The routing domains (prefixed with <code>~</code>) tell systemd-resolved: "queries for these domains go to this interface's DNS servers only." Check if it's working:</p>
<pre><code class="lang-plaintext">$ resolvectl status tun0
Link 4 (tun0)
      Current Scopes: DNS
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: allow-downgrade
    DNSSEC supported: yes
         DNS Servers: 10.10.10.10
          DNS Domain: ~corp.example.com
                      ~internal.example.com
​
# Test it
$ resolvectl query jenkins.corp.example.com
jenkins.corp.example.com: 10.20.30.40       -- link: tun0
</code></pre>
<p>That <code>-- link: tun0</code> confirms the query went through the VPN interface.</p>
<h4 id="heading-enabling-dnssec">Enabling DNSSEC</h4>
<p>DNSSEC validation adds cryptographic verification to DNS responses. In <code>resolved.conf</code>:</p>
<pre><code class="lang-plaintext">[Resolve]
DNSSEC=yes  # Strict validation; queries fail if DNSSEC validation fails
# or
DNSSEC=allow-downgrade  # Validate when available, but allow unsigned responses
</code></pre>
<p>Check DNSSEC status for a domain:</p>
<pre><code class="lang-plaintext">$ resolvectl query --legend=yes cloudflare.com
cloudflare.com: 104.16.132.229
                104.16.133.229
​
-- Information acquired via protocol DNS in 23.4ms.
-- Data is authenticated: yes
</code></pre>
<p>That "authenticated: yes" means DNSSEC validation succeeded.</p>
<h4 id="heading-dns-over-tls-dot">DNS over TLS (DoT)</h4>
<p>To encrypt DNS queries in transit, enable DNS-over-TLS in <code>resolved.conf</code>:</p>
<pre><code class="lang-plaintext">[Resolve]
DNS=1.1.1.1#cloudflare-dns.com 1.0.0.1#cloudflare-dns.com
DNSOverTLS=yes
</code></pre>
<p>The <code>#cloudflare-dns.com</code> syntax specifies the TLS server name. Restart and verify:</p>
<pre><code class="lang-plaintext">systemctl restart systemd-resolved
resolvectl status | grep "DNS over TLS"
</code></pre>
<p>Note that DoT uses port 853, so your firewall needs to allow it. Also, this is different from DNS-over-HTTPS (DoH), which systemd-resolved doesn't support natively—you'd need something like dnscrypt-proxy for that.</p>
<h4 id="heading-runtime-dns-changes-with-resolvectl">Runtime DNS changes with resolvectl</h4>
<p>You can temporarily override DNS settings without editing config files:</p>
<pre><code class="lang-plaintext"># Set DNS servers for eth0 until next reboot
resolvectl dns eth0 8.8.8.8 8.8.4.4
​
# Set routing domains for eth0
resolvectl domain eth0 ~internal.example.com
​
# Revert to defaults
resolvectl revert eth0
</code></pre>
<p>These changes don't persist across reboots, which is useful for testing.</p>
<h4 id="heading-common-troubleshooting-scenarios">Common troubleshooting scenarios</h4>
<p><strong>DNS queries timing out?</strong> Check if systemd-resolved is actually running:</p>
<pre><code class="lang-plaintext">systemctl status systemd-resolved
journalctl -u systemd-resolved -f  # Follow logs in real-time
</code></pre>
<p><strong>VPN DNS not working?</strong> Verify your VPN client is setting DNS through systemd-resolved. Some VPN clients (especially older ones) still try to directly edit <code>/etc/resolv.conf</code>, which doesn't work when it's a symlink. You might need to configure the VPN to use resolvectl:</p>
<pre><code class="lang-plaintext"># In VPN up script
resolvectl dns $INTERFACE 10.10.10.10
resolvectl domain $INTERFACE ~corp.example.com
</code></pre>
<p><strong>DNS leaking when on VPN?</strong> Check which DNS servers are being used:</p>
<pre><code class="lang-plaintext">resolvectl status
# Look for your VPN interface and verify routing domains are set correctly
</code></pre>
<p><strong>Want to bypass systemd-resolved entirely?</strong> Replace the symlink with a real file:</p>
<pre><code class="lang-plaintext">rm /etc/resolv.conf
echo "nameserver 8.8.8.8" &gt; /etc/resolv.conf
echo "nameserver 8.8.4.4" &gt;&gt; /etc/resolv.conf
</code></pre>
<p>But you'll lose all the advanced features—split DNS, DNSSEC, mDNS, etc.</p>
<p><strong>Cache causing issues?</strong> Flush it:</p>
<pre><code class="lang-plaintext">resolvectl flush-caches
# Or restart the service entirely
systemctl restart systemd-resolved
</code></pre>
<p>This works great when it works. But when it breaks—say, DNS leaks after connecting to a VPN—debugging becomes harder because you now have this abstraction layer between your app and the actual DNS queries. The key is understanding <code>resolvectl status</code> output and watching the journal logs when issues occur.</p>
<h3 id="heading-the-alternatives-still-exist">The alternatives still exist</h3>
<p>Not everyone loves systemd-resolved. You'll still find systems using:</p>
<ul>
<li><p><strong>dnsmasq</strong>: lightweight caching DNS proxy, popular in network appliances</p>
</li>
<li><p><strong>resolvconf</strong>: the older dynamic <code>/etc/resolv.conf</code> updater (not to be confused with resolvectl)</p>
</li>
<li><p><strong>plain old /etc/resolv.conf</strong>: some sysadmins just want simplicity and manage it manually or via configuration management</p>
</li>
</ul>
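<p>For a sense of the contrast, here is what a minimal caching-only dnsmasq setup looks like (a sketch with illustrative upstream servers, not a production config):</p>
<pre><code class="lang-plaintext"># /etc/dnsmasq.conf: minimal caching proxy (illustrative values)
listen-address=127.0.0.1   # only answer local queries
cache-size=1000            # number of cached names
no-resolv                  # don't read /etc/resolv.conf for upstreams
server=1.1.1.1             # forward cache misses here
server=9.9.9.9
</code></pre>
<p>Point <code>/etc/resolv.conf</code> at <code>127.0.0.1</code> and dnsmasq caches and forwards everything else. Simple, but no split DNS, DNSSEC validation, or per-interface routing.</p>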
<h2 id="heading-where-we-are-today">Where we are today</h2>
<p>Modern Linux networking is powerful but complex. On a typical cloud VM running Ubuntu 22.04 or Rocky Linux 9, you might have:</p>
<ul>
<li><p><code>iproute2</code> tools (<code>ip</code>, <code>ss</code>, <code>bridge</code>) for low-level interface management</p>
</li>
<li><p><code>systemd-networkd</code> managing your primary network interface</p>
</li>
<li><p><code>systemd-resolved</code> handling DNS with DNSSEC validation</p>
</li>
<li><p>Docker creating its own bridge networks and iptables rules</p>
</li>
<li><p>NetworkManager installed but disabled</p>
</li>
</ul>
<p>The key insight is that Linux networking evolved from a monolithic model ("one tool per task, everything global") to a layered, namespaced, and delegated model ("multiple domains of control, isolation by default").</p>
<p>This makes sense given how we use Linux today. We're running containers, connecting to VPNs, managing split-horizon DNS, and dealing with cloud networking that changes dynamically. The tooling had to evolve to support these use cases.</p>
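<p>The "isolation by default" part is easy to see first-hand. Here is a quick sketch (requires root; the names and addresses are arbitrary) that builds a second, fully isolated network stack and wires it to the host:</p>
<pre><code class="lang-plaintext"># Create an isolated namespace and a veth pair bridging it to the host
ip netns add demo
ip link add veth-host type veth peer name veth-demo
ip link set veth-demo netns demo

# Address each end and bring the links up
ip addr add 10.200.0.1/24 dev veth-host
ip link set veth-host up
ip netns exec demo ip addr add 10.200.0.2/24 dev veth-demo
ip netns exec demo ip link set veth-demo up

# The namespace has its own interfaces, routes, and firewall state
ip netns exec demo ping -c 1 10.200.0.1

# Clean up
ip netns del demo
</code></pre>
<p>This is essentially the mechanism container runtimes build on, just with more plumbing around it.</p>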
<h2 id="heading-things-to-keep-in-mind">Things to keep in mind</h2>
<p><strong>If you're managing servers</strong>, learn the <code>ip</code> command inside and out. Know how to inspect routing tables, check interface statistics, and understand network namespaces. And pick either NetworkManager or systemd-networkd—running both is usually asking for trouble.</p>
<p><strong>If you're debugging DNS issues</strong>, understand whether systemd-resolved is in the picture. Check <code>resolvectl status</code> before you start editing <code>/etc/resolv.conf</code> (which might be a symlink anyway).</p>
<p><strong>If you're writing automation</strong>, use declarative configurations (systemd-networkd <code>.network</code> files or NetworkManager connection files) rather than calling <code>ip</code> commands in scripts. Your future self will thank you.</p>
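<p>As a concrete example, a declarative systemd-networkd static configuration lives in a single file (interface name and addresses here are illustrative):</p>
<pre><code class="lang-plaintext"># /etc/systemd/network/10-eth0.network (illustrative)
[Match]
Name=eth0

[Network]
Address=192.168.1.50/24
Gateway=192.168.1.1
DNS=192.168.1.1
</code></pre>
<p>Drop it in place, restart <code>systemd-networkd</code>, and the intent is captured in a version-controllable file instead of a script of imperative <code>ip</code> calls.</p>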
<p>The Linux networking stack has gotten more complex, no question. But it's also far more capable than it was 20 years ago. We're now running thousands of isolated network stacks on a single host, dynamically reconfiguring interfaces in response to cloud metadata, and validating DNS responses cryptographically. The old tools couldn't handle that—and now we have tools that can.</p>
<p>Just don't forget to update your muscle memory from <code>ifconfig</code> to <code>ip</code>. That command-not-found error gets old fast.</p>
]]></content:encoded></item><item><title><![CDATA[Git: Move files from one repo to another with history]]></title><description><![CDATA[Background
A lot of times as developers and code maintainers we need to move files/folders between code repos and most of the time we avoid exporting history because of the complexity and issues often faced during the export. However fret not anymore...]]></description><link>https://blog.vivekdhami.com/git-move-repo-files-with-history</link><guid isPermaLink="true">https://blog.vivekdhami.com/git-move-repo-files-with-history</guid><category><![CDATA[Git]]></category><dc:creator><![CDATA[Vivek Dhami]]></dc:creator><pubDate>Thu, 02 Dec 2021 11:00:00 GMT</pubDate><content:encoded><![CDATA[<p><strong>Background</strong></p>
<p>As developers and code maintainers we often need to move files or folders between repositories, and most of the time we skip exporting history because of the complexity and issues that come with it. Fret not: in this post I will show how to easily export files with their history using the git command line and the <strong>filter-repo</strong> command.</p>
<p>In the past, git <em>filter-branch</em> was one of the ways to achieve this. However, due to the issues associated with its history rewriting, it is no longer the preferred option, and the git CLI even prints a warning when you use it.</p>
<p>Some git users use patching as an alternative, but it does not provide a seamless experience either and can get complex. It may suit git experts, but I would not suggest it when a better alternative is available, i.e. <strong>filter-repo</strong>. </p>
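<p>For reference, the patch-based route mentioned above looks roughly like this (the paths are illustrative):</p>
<pre><code class="lang-bash"># In the source repo: export the history touching one directory as patches
git format-patch --root -o /tmp/patches -- path/to/dir

# In the target repo: replay those patches
git am /tmp/patches/*.patch
</code></pre>
<p>This works for simple cases, but renames, merge commits, and binary files quickly make it painful, which is why filter-repo is the better tool here.</p>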
<p><strong>Installation</strong></p>
<p>Unlike <em>filter-branch</em>, <em>filter-repo</em> is not part of the git CLI and needs to be installed separately. </p>
<p>On systems with a package manager, installation is a matter of installing the appropriate package; on Windows it can be a little trickier. The <a target="_blank" href="https://github.com/newren/git-filter-repo/blob/main/INSTALL.md">installation instructions</a> in the project's GitHub repo cover all the peculiarities of the installation.</p>
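<p>For example, one of the documented installation routes is via pip, since filter-repo is a single Python script published on PyPI:</p>
<pre><code class="lang-bash"># Install git-filter-repo from PyPI
python3 -m pip install git-filter-repo
</code></pre>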
<p>After installation, filter-repo is available as a subcommand of the git CLI.</p>
<pre><code class="lang-bash">    git filter-repo &lt;options&gt;
</code></pre>
<p>Among all the available command options, the following are noteworthy for this post:</p>
<ul>
<li><p><code>--path</code>: specifies a folder/file to include in the filter-repo operation.</p>
</li>
<li><p><code>--invert-paths</code>: inverts the selection, i.e. the files/folders specified via <code>--path</code> flags are excluded from the filter-repo operation instead.</p>
</li>
</ul>
<p><strong>Walkthrough</strong></p>
<p>To move files from one repo to another, proceed as follows:</p>
<ol>
<li><p>Clean the source repo</p>
<p> In this step we include/exclude the files/folders and their history using the filter-repo command.</p>
<p> Before starting with the cleanup step, it's good practice to <strong>perform the cleanup in a fresh local clone of the source repo</strong> so the changes cannot impact the original repo (filter-repo itself refuses to run on anything but a fresh clone unless forced).</p>
<pre><code class="lang-bash"> <span class="hljs-comment"># Include file and folder from the source repo</span>
 git filter-repo --path &lt;include-folder&gt; --path &lt;include-file&gt;

 <span class="hljs-comment"># Exclude file and folder from the source repo</span>
 git filter-repo --path &lt;folder&gt; --path &lt;file&gt; --invert-paths
</code></pre>
</li>
<li><p>Add the source repo as a remote source for the target repo</p>
<p> In this step, we move to the target repo directory and add the source repo folder as a remote origin.</p>
<pre><code class="lang-bash"> git remote add &lt;origin-name&gt; &lt;source-repo-folder-path&gt;

 <span class="hljs-comment">#e.g.</span>
 git remote add move ../repo-to-move
</code></pre>
</li>
<li><p>Fetch and merge the changes and history from source repo
 In this step, we fetch changes from the remote source and merge them into the target branch.</p>
<pre><code class="lang-bash"> git fetch &lt;origin-name&gt;

 git branch &lt;branch-name&gt; remotes/&lt;origin-name&gt;/&lt;remote-branch-name&gt;

 git merge &lt;branch-name&gt; --allow-unrelated-histories

 <span class="hljs-comment">#e.g.</span>
 git fetch move

 git branch merge-remote remotes/move/migrate-remote

 git merge merge-remote --allow-unrelated-histories
</code></pre>
</li>
<li><p>Cleanup</p>
<p> In this step, we clean up by removing the remote source and the branch.</p>
<pre><code class="lang-bash"> git remote rm &lt;origin-name&gt;

 git branch -d &lt;branch-name&gt;

 <span class="hljs-comment">#e.g.</span>
 git remote rm move

 git branch -d merge-remote
</code></pre>
</li>
</ol>
<p>That's all folks!!</p>
]]></content:encoded></item><item><title><![CDATA[Azure Bicep: Pump up your azure deployments]]></title><description><![CDATA[Azure Bicep: Pump up your azure deployments
Even though Azure Bicep was announced at Ignite 2020 I was not so keen to try it out because of the preview nature of the cli and my personal preference and past experience with beta phase tools and technol...]]></description><link>https://blog.vivekdhami.com/az-bicep-intro</link><guid isPermaLink="true">https://blog.vivekdhami.com/az-bicep-intro</guid><category><![CDATA[Azure]]></category><category><![CDATA[azure-bicep]]></category><dc:creator><![CDATA[Vivek Dhami]]></dc:creator><pubDate>Fri, 15 Jan 2021 11:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1754469648578/73732bf1-275f-4e5f-ad2a-d7dcd540d395.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h4 id="heading-azure-bicep-pump-up-your-azure-deployments">Azure Bicep: Pump up your azure deployments</h4>
<p>Even though Azure Bicep was announced at Ignite 2020, I was not keen to try it out, given the preview nature of the CLI and my past experience with beta-phase tools and technology. Setting that preference aside, I decided to start learning and using Bicep for managing Azure resource deployments. This post is an accumulation of my learnings so far with Bicep.</p>
<h4 id="heading-overview">Overview</h4>
<p>Bicep is an abstraction over ARM templates, which have been the bread and butter of Azure deployments so far. It is a DSL (Domain Specific Language) for deploying resources to Azure. It is similar to Terraform's HCL (HashiCorp Configuration Language); however, it currently supports only Azure and does not support state management. It is continuously updated to support newer ARM API versions as they become available. You can either generate ARM templates from Bicep files for deployment or deploy Bicep files directly, as Azure Resource Manager now supports deployments from Bicep files as well.</p>
<p>Bicep lets you define your Azure resources in far fewer lines of code than ARM templates, which can become humongous and unmanageable. Being a DSL, Bicep also supports language features that are difficult to accomplish with ARM templates.</p>
<h4 id="heading-installation-bicep-cli">Installation: Bicep cli</h4>
<p>Installation is quick and easy if you already have the Azure CLI installed. Otherwise, install the Azure CLI first, then proceed with the Bicep installation.</p>
<p>There are also <a target="_blank" href="https://github.com/Azure/bicep/blob/main/docs/installing.md">other</a> installation options available for bicep cli. However this post will follow the azure cli based installation procedure and use the set of commands available via azure cli for working with bicep.</p>
<pre><code class="lang-bash"># Upgrade azure cli
az upgrade

# Install bicep command group
az bicep install
</code></pre>
<p>After installing bicep command group as part of azure cli, you can use the following commands: </p>
<pre><code class="lang-bash"># View all the available bicep sub-commands
az bicep --help

Group
    az bicep : Bicep CLI command group.

Commands:
    build           : Build a Bicep file.
    decompile       : Attempt to decompile an ARM template file to a Bicep file.
    format          : Format a Bicep file.
    generate-params : Generate parameters file for a Bicep file.
    install         : Install Bicep CLI.
    list-versions   : List out all available versions of Bicep CLI.
    publish         : Publish a bicep file to a remote module registry.
    restore         : Restore external modules for a bicep file.
    uninstall       : Uninstall Bicep CLI.
    upgrade         : Upgrade Bicep CLI to the latest version.
    version         : Show the installed version of Bicep CLI.

# Show the installed version of Bicep CLI
az bicep version

# Upgrade Bicep CLI to the latest version
az bicep upgrade
</code></pre>
<h4 id="heading-example-usage-decompile-and-deploy">Example usage: Decompile and deploy</h4>
<p>The easiest way to compare bicep and ARM would be via decompiling an existing ARM template to bicep and deploying it to azure.</p>
<p>For this example we will be using the simplest ARM template i.e. <a target="_blank" href="https://github.com/Azure/azure-quickstart-templates/blob/master/quickstarts/microsoft.network/vnet-two-subnets/azuredeploy.json">vNet with two subnets</a> from azure quickstarts samples. To follow along download the file locally and use the following command to decompile and deploy to azure.</p>
<pre><code class="lang-bash"> <span class="hljs-comment"># decompile arm template to bicep</span>
 az bicep decompile -f .\azuredeploy.json

 <span class="hljs-comment"># login to azure</span>
 az login --use-device-code

 <span class="hljs-comment"># validate bicep template</span>
 az deployment group validate -g &lt;rg-name&gt; -f .\azuredeploy.bicep

 <span class="hljs-comment"># deploy bicep template</span>
 az deployment group create -g &lt;rg-name&gt; -f .\azuredeploy.bicep
</code></pre>
<p>If you open the decompiled Bicep template and compare it with the ARM original, you will find the Bicep version more readable, even for this simple example.</p>
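<p>To give a flavor, the vNet from that quickstart boils down to a Bicep resource roughly like the following (trimmed to one subnet for brevity; this is a sketch, not a verbatim copy of the decompiled output):</p>
<pre><code class="lang-plaintext">param location string = resourceGroup().location

resource vnet 'Microsoft.Network/virtualNetworks@2021-02-01' = {
  name: 'VNet1'
  location: location
  properties: {
    addressSpace: {
      addressPrefixes: [
        '10.0.0.0/16'
      ]
    }
    subnets: [
      {
        name: 'Subnet1'
        properties: {
          addressPrefix: '10.0.0.0/24'
        }
      }
    ]
  }
}
</code></pre>
<p>Compare that with the equivalent ARM JSON, with its nested quoting and boilerplate schema fields, and the readability gain is obvious.</p>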
<h4 id="heading-visual-studio-code-bicep-extension">Visual Studio Code: Bicep extension</h4>
<p>One tool that I can highly recommend for bicep is the <a target="_blank" href="https://marketplace.visualstudio.com/items?itemName=ms-azuretools.vscode-bicep">bicep extension</a> for VS Code. It will help you in writing bicep code from scratch using intellisense and autocomplete as well as in finding/highlighting issues within your bicep files.</p>
<h4 id="heading-learning-bicep">Learning bicep</h4>
<p>For learning bicep I recommend the following:</p>
<ol>
<li><a target="_blank" href="https://learn.microsoft.com/en-us/training/paths/fundamentals-bicep/">MS Learn: Bicep fundamentals course</a></li>
<li><a target="_blank" href="https://www.manning.com/books/azure-infrastructure-as-code">Azure IaC Book</a></li>
</ol>
]]></content:encoded></item><item><title><![CDATA[Setup a kubernetes cluster on raspberry pi/s using k3s]]></title><description><![CDATA[This blog post will detail the steps for setting up a kubernetes cluster on your raspberry pi/s using k3s.
k3s: A brief introduction
While it is completely possible to install, manage and run a complete kubernetes distribution (pronounced k8s) on you...]]></description><link>https://blog.vivekdhami.com/install-k3s-cluster-on-rpi</link><guid isPermaLink="true">https://blog.vivekdhami.com/install-k3s-cluster-on-rpi</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[k3s]]></category><category><![CDATA[Raspberry Pi]]></category><dc:creator><![CDATA[Vivek Dhami]]></dc:creator><pubDate>Wed, 11 Nov 2020 11:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/CkbnIzH_YNo/upload/5208b583eedf4ad52ca07b32c9d1ff2a.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This blog post will detail the steps for setting up a kubernetes cluster on your raspberry pi/s using k3s.</p>
<h3 id="heading-k3s-a-brief-introduction"><strong>k3s: A brief introduction</strong></h3>
<p>While it is completely possible to install, manage and run a complete kubernetes (k8s) distribution on your raspberry pi cluster (or on similar devices), for better performance and resource utilization on a resource-constrained device like the raspberry pi it's advisable to install a lightweight kubernetes distribution like <a target="_blank" href="https://k3s.io">k3s</a> by Rancher.</p>
<p>From k3s website:</p>
<blockquote>
<p>K3s is a highly available, certified Kubernetes distribution designed for production workloads in unattended, resource-constrained, remote locations or inside IoT appliances.</p>
<p>K3s is packaged as a single &lt;40MB binary that reduces the dependencies and steps needed to install, run and auto-update a production Kubernetes cluster.</p>
<p>Both ARM64 and ARMv7 are supported with binaries and multiarch images available for both. K3s works great from something as small as a Raspberry Pi to an AWS a1.4xlarge 32GiB server.</p>
</blockquote>
<p><img src="https://k3s.io/img/how-it-works-k3s-revised.svg" alt class="image--center mx-auto" /></p>
<h3 id="heading-prepare-raspberry-pi-for-k3s-installation"><strong>Prepare raspberry pi for k3s installation</strong></h3>
<p>Before we start installing k3s on raspberry pi we need to prepare it for the installation as follows:</p>
<ol>
<li><p><strong>Download Raspberry Pi Imager</strong></p>
<p> Download the Raspberry Pi Imager from https://www.raspberrypi.org/software/, choosing the version most appropriate for you. Raspberry Pi Imager can be installed as a desktop client or as a package for use with your linux cli.</p>
<p> For this article we will be using the desktop client version.</p>
</li>
<li><p><strong>Format SD card</strong></p>
<p> Insert the SD card in your computer and format the card using the erase option available with Raspberry Pi Imager.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754469325663/db1bba58-6982-40fc-bd72-d240420f50c0.png" alt class="image--center mx-auto" /></p>
</li>
<li><p><strong>Install Raspberry Pi OS</strong></p>
<p> With the formatted SD card still inserted in your computer, install the Raspberry Pi OS using the available option in the Raspberry Pi Imager. For k3s setup it is recommended to use the lite (headless) version of the Raspberry Pi OS instead of the full version.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754469088571/02612ca5-d324-4041-97b5-6546fa2d3db2.png" alt class="image--center mx-auto" /></p>
<p> It will take some time for the Raspberry Pi OS to be installed on your SD card.</p>
<p> After the installation is completed you can remove and reinsert SD card on your computer.</p>
<p> After reinserting the SD card you will see two partitions (or drives) in your computer: boot and rootfs (or SDHC in windows).</p>
</li>
<li><p><strong>Enable SSH</strong></p>
<p> In order to enable SSH on your raspberry pi create a file named ssh in the boot partition of your SD card.</p>
<pre><code class="lang-bash"> sudo touch &lt;path-to-boot-partition&gt;/ssh
</code></pre>
</li>
<li><p><strong>Power up and customize raspberry pi</strong></p>
<p> Insert the SD card in your raspberry pi and power it up.</p>
<p> After the raspberry pi is up and running find the IP address assigned to the raspberry pi using the admin UI on your router and SSH into it.</p>
<pre><code class="lang-bash"> ssh pi@&lt;ip-address&gt;
</code></pre>
<p> The default password for <strong><em>pi</em></strong> user is <strong><em>raspberry</em></strong>.</p>
<p> Once you are logged in, you can change the various raspberry pi settings like password, hostname, localization, time zone, and wifi using the raspi-config utility:</p>
<pre><code class="lang-bash"> sudo raspi-config
</code></pre>
<p> Set up a static IP for your pi by updating the entry in the file <strong><em>/etc/dhcpcd.conf</em></strong> as follows:</p>
<pre><code class="lang-plaintext"> interface eth0
 static ip_address=192.168.1.50/24
 static routers=192.168.1.1
 static domain_name_servers=192.168.1.1
</code></pre>
<p> Reboot and ssh into the pi using the static IP address.</p>
<p> Enable container features in the kernel by editing the file <strong><em>/boot/cmdline.txt</em></strong> and adding the following to the end of the line:</p>
<pre><code class="lang-plaintext"> cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory
</code></pre>
<p> Reboot the pi.</p>
<p> As an additional option you can use your existing ssh keys to login to rPi by copying over the ssh key using the following command:</p>
<pre><code class="lang-bash"> ssh-copy-id pi@&lt;ip-address&gt;
</code></pre>
<p> You can also generate a new ssh key using ssh-keygen utility. After copying over your ssh key you will no longer be prompted for a password while using ssh with your rPi.</p>
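<p> For instance, a dedicated key for the cluster can be generated as follows (the file name and comment are arbitrary):</p>
<pre><code class="lang-bash"> <span class="hljs-comment"># generate an ed25519 key pair for the pi cluster</span>
 ssh-keygen -t ed25519 -f ~/.ssh/rpi-cluster -C "rpi cluster key"
</code></pre>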
<p> By default the ssh-copy-id copies over the id_rsa.pub file. However you can specify a different ssh key using the -i flag as shown below:</p>
<pre><code class="lang-bash"> ssh-copy-id -i .ssh/&lt;other-key&gt;.pub pi@192.168.1.50
</code></pre>
<p> After the ssh key has been copied over to the rPi, you can login as follows using the specific key:</p>
<pre><code class="lang-bash"> ssh pi@192.168.1.50 -i .ssh/&lt;other-key&gt;
</code></pre>
</li>
</ol>
<h3 id="heading-install-k3s-on-rpi"><strong>Install k3s on rPi</strong></h3>
<p>Now that the raspberry pi has been prepared for k3s installation, we will proceed with the installation using one of the following options:</p>
<ol>
<li><p><strong>Using latest stable k3s release</strong></p>
<p> Login to your master node rPi using ssh and install the latest stable version of k3s using the following command:</p>
<pre><code class="lang-bash"> curl -sfL https://get.k3s.io | sh -
</code></pre>
<p> After the installation is complete you can check the status of k3s using the following command:</p>
<pre><code class="lang-bash"> sudo systemctl status k3s
</code></pre>
<p> If the status shows up as active (running) then k3s has been installed successfully.</p>
<p> From the master node copy the node-token from its location:</p>
<pre><code class="lang-bash"> sudo cat /var/lib/rancher/k3s/server/node-token
</code></pre>
<p> This token will be used by the other nodes to join the k3s cluster.</p>
<p> Login to the other nodes and install k3s as follows:</p>
<pre><code class="lang-bash"> <span class="hljs-built_in">export</span> K3S_URL=<span class="hljs-string">"https://&lt;MASTER-NODE-IP&gt;:6443"</span>

 <span class="hljs-built_in">export</span> K3S_TOKEN=<span class="hljs-string">"&lt;TOKEN-COPIED_FROM-MASTER-NODE&gt;"</span>

 curl -sfL https://get.k3s.io | sh -
</code></pre>
<p> You can check the status of k3s installation on the worker/agent nodes using the following command:</p>
<pre><code class="lang-bash"> sudo systemctl status k3s-agent
</code></pre>
<p> If the status shows as active (running) then k3s is up and running on the worker nodes.</p>
<p> Login to the master node and run the following command to check the status of the nodes:</p>
<pre><code class="lang-bash"> sudo kubectl get nodes
</code></pre>
<p> The command should output the master node along with all the worker nodes that have been added to the cluster.</p>
<p> If the above command returns an error or fails to run, then restart the k3s service as follows:</p>
<pre><code class="lang-bash"> sudo systemctl restart k3s
</code></pre>
<p> Also, for troubleshooting purposes it's always helpful to analyze the output of the following commands:</p>
<pre><code class="lang-bash"> sudo systemctl status k3s

 journalctl -xe
</code></pre>
<p> For uninstalling k3s use the following commands:</p>
<pre><code class="lang-bash"> # on server
 /usr/local/bin/k3s-uninstall.sh

 # on agent
 /usr/local/bin/k3s-agent-uninstall.sh
</code></pre>
</li>
<li><p><strong>Using k3sup</strong></p>
<p> k3sup (pronounced ketchup) is a utility that simplifies the k3s installation process on both server and agent nodes. Additionally, k3sup lets you export the k3s cluster configuration to your system so you can easily access your new cluster.</p>
<p> k3sup also helps with installing apps in your cluster using yaml files or helm charts; however, that is not covered in this article.</p>
<p> k3sup can be installed on your system via the following commands:</p>
<pre><code class="lang-bash"> # install the k3sup binary
 curl -sLS https://get.k3sup.dev | sh
 sudo install k3sup /usr/local/bin/

 # test the k3sup installation
 k3sup --help
</code></pre>
<p> You can install k3s on a rPi master node using the following commands:</p>
<pre><code class="lang-bash"> # check the available options
 k3sup install --help

 # install using ssh key and merge to local context at the mentioned path
 k3sup install --ip &lt;MASTER-NODE-IP&gt; --ssh-key ~/.ssh/&lt;key-name&gt; --user &lt;user&gt; --merge --local-path $HOME/.kube/config --context k3s-cluster
</code></pre>
<p> After k3s has been installed on the master node you can check the status of k3s using the following command:</p>
<pre><code class="lang-bash"> kubectl get nodes
</code></pre>
<p> The above command should show the master node as available and Ready. Before running it, ensure that your new k3s cluster is set as the current context, which you can check with the following command:</p>
<pre><code class="lang-bash"> kubectl config current-context
</code></pre>
<p> Now we can proceed to install k3s on the worker (or agent) nodes using k3sup with the following commands:</p>
<pre><code class="lang-bash"> k3sup join --ip &lt;WORKER-NODE-IP&gt; --server-ip &lt;MASTER-NODE-IP&gt; --ssh-key ~/.ssh/&lt;key-name&gt; --user &lt;user&gt;
</code></pre>
<p> After k3s has been installed on the agent nodes, you can check the installation using the following command:</p>
<pre><code class="lang-bash"> kubectl get nodes
</code></pre>
<p> It should list all the nodes (both master and agent) that are part of the cluster and show them as available and Ready.</p>
</li>
</ol>
<p><strong><em>That's all folks, k3s is installed and running on your raspberry pi and you can now install applications on your cluster.</em></strong></p>
]]></content:encoded></item><item><title><![CDATA[Upgrade WSL/WSL2 Ubuntu version to 20.04 LTS]]></title><description><![CDATA[Its the time of the year again to upgrade Ubuntu version in WSL/WSL2 since the Ubuntu 20.04 LTS came out last week.
Please follow along the following steps in your WSL console to upgrade to the new version:

Check installed Ubuntu version
  Take a no...]]></description><link>https://blog.vivekdhami.com/upgrade-wslwsl2-ubuntu-version-to-2004-lts</link><guid isPermaLink="true">https://blog.vivekdhami.com/upgrade-wslwsl2-ubuntu-version-to-2004-lts</guid><category><![CDATA[WSL]]></category><dc:creator><![CDATA[Vivek Dhami]]></dc:creator><pubDate>Fri, 01 May 2020 00:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/xbEVM6oJ1Fs/upload/24ffd88e321fb5206ac99c383fc096c3.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Its the time of the year again to upgrade Ubuntu version in WSL/WSL2 since the Ubuntu 20.04 LTS came out last week.</p>
<p>Please follow along the following steps in your WSL console to upgrade to the new version:</p>
<ul>
<li><p>Check installed Ubuntu version</p>
<p>  Take a note of your current Ubuntu version by running the following command:</p>
<pre><code class="lang-bash">  lsb_release -a
</code></pre>
<pre><code class="lang-bash">  No LSB modules are available.
  Distributor ID: Ubuntu
  Description:    Ubuntu 18.04.4 LTS
  Release:        18.04
  Codename:       bionic
</code></pre>
<p>  <em>Note: Upgrading from 18.04 to 20.04 should work seamlessly. However, if you are on an older version, the suggested path is to upgrade in steps, for example first from 16.04 to 18.04.</em></p>
</li>
<li><p>Upgrade installed packages</p>
<pre><code class="lang-bash">  sudo apt update
  sudo apt list --upgradable
  sudo apt upgrade
</code></pre>
</li>
<li><p>Remove unused packages</p>
<pre><code class="lang-bash">  sudo apt --purge autoremove
</code></pre>
</li>
<li><p>Install update-manager-core package if not already installed</p>
<pre><code class="lang-bash">  sudo apt install update-manager-core
</code></pre>
</li>
<li><p>Upgrade to 20.04</p>
<pre><code class="lang-bash">  sudo do-release-upgrade
</code></pre>
<p>  If you receive the following message:</p>
<pre><code class="lang-bash">  Checking <span class="hljs-keyword">for</span> a new Ubuntu release
  There is no development version of an LTS available.
  To upgrade to the latest non-LTS development release 
  <span class="hljs-built_in">set</span> Prompt=normal <span class="hljs-keyword">in</span> /etc/update-manager/release-upgrades.
</code></pre>
<p>  Then force the upgrade using the following command:</p>
<pre><code class="lang-bash">  sudo do-release-upgrade -d
</code></pre>
</li>
<li><p>Finally check the version after upgrade is done:</p>
<pre><code class="lang-bash">  lsb_release -a
</code></pre>
<p>  and you should see output similar to the following:</p>
<pre><code class="lang-bash">  No LSB modules are available.
  Distributor ID: Ubuntu
  Description:    Ubuntu 20.04 LTS
  Release:        20.04
  Codename:       focal
</code></pre>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Kubernetes: Working with multiple contexts]]></title><description><![CDATA[A context in kubernetes defines the current scope under which all the kubectl commands will run. In simple words, it defines the cluster against which all the kubectl commands will execute.
The contexts are easy to work with as long as you have a fix...]]></description><link>https://blog.vivekdhami.com/kubernetes-working-with-multiple-contexts</link><guid isPermaLink="true">https://blog.vivekdhami.com/kubernetes-working-with-multiple-contexts</guid><category><![CDATA[Kubernetes]]></category><dc:creator><![CDATA[Vivek Dhami]]></dc:creator><pubDate>Sat, 02 Feb 2019 00:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/jOqJbvo1P9g/upload/91fb19809da08212d54c2863f6f8e2f7.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A context in kubernetes defines the current scope under which all the kubectl commands will run. In simple words, it defines the cluster against which all the kubectl commands will execute.</p>
<p>Contexts are easy to work with as long as you have a fixed number of contexts and configuration files. However, as the number of contexts grows, they become harder to manage. In this post I will try to demystify working with and managing multiple contexts.</p>
<p>A list of all available contexts can be fetched via:</p>
<pre><code class="lang-bash">kubectl config get-contexts
</code></pre>
<p>Also the current context in use can be checked via:</p>
<pre><code class="lang-bash">kubectl config current-context
</code></pre>
<p>Similarly the current context can be switched to another one via:</p>
<pre><code class="lang-bash">kubectl config use-context &lt;context-name&gt;
</code></pre>
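<p>If you only need a different context for a single command, you can also pass it inline via kubectl's global <code>--context</code> flag instead of switching, for example:</p>
<pre><code class="lang-bash"># Run one command against another context without changing the current one
kubectl --context=&lt;context-name&gt; get pods
</code></pre>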
<p>All the commands above work as long as all contexts are present in the same configuration file, which is usually located at <strong>$HOME/.kube/config</strong>. Whether that is the case depends on how each context was downloaded to your machine. When contexts are downloaded to different locations, the following approaches are preferred for using them:</p>
<ul>
<li><p>Temporary Context Switching</p>
<p> You can temporarily switch contexts by setting the <strong>KUBECONFIG</strong> environment variable to the desired kubeconfig location. The switch lasts only for the current shell session and is lost once you close your terminal/console.</p>
<pre><code class="lang-bash"> <span class="hljs-built_in">export</span> KUBECONFIG=<span class="hljs-variable">$HOME</span>/.newkube/config
 kubectl config view
</code></pre>
</li>
</ul>
<ul>
<li><p>Merging Context Configuration Files</p>
<p> Merging context configuration files is preferred if you would like to keep all your contexts in a single configuration file. The only downside of this approach is that the file can grow very long as the number of contexts increases. When contexts are downloaded from clusters in GKE, AKS, or EKS using their respective CLIs, the context configuration is usually merged into the current kubeconfig automatically.</p>
<p> However to manually merge the context files, please use the following commands:</p>
<pre><code class="lang-bash"> <span class="hljs-comment"># Backup the current kube configuration</span>
 cp <span class="hljs-variable">$HOME</span>/.kube/config <span class="hljs-variable">$HOME</span>/.kube/config.backup.$(date +%Y-%m-%d.%H:%M:%S)

 <span class="hljs-comment"># Include all the configuration files to merge in KUBECONFIG</span>
 <span class="hljs-built_in">export</span> KUBECONFIG=<span class="hljs-variable">$HOME</span>/.kube/config:File1:File2

 <span class="hljs-comment"># Merge the configurations and move it to the default configuration location</span>
 kubectl config view --merge --flatten &gt; ~/.kube/merged_kubeconfig &amp;&amp; mv ~/.kube/merged_kubeconfig ~/.kube/config

 <span class="hljs-comment"># View all available contexts</span>
 kubectl config get-contexts
</code></pre>
</li>
</ul>
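<p>After merging, stale entries can be removed to keep the configuration file manageable. These are standard <code>kubectl config</code> subcommands; substitute your own context, cluster, and user names:</p>
<pre><code class="lang-bash"># Remove a context that is no longer needed
kubectl config delete-context &lt;context-name&gt;

# Remove the associated cluster and user entries as well
kubectl config delete-cluster &lt;cluster-name&gt;
kubectl config unset users.&lt;user-name&gt;
</code></pre>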
]]></content:encoded></item><item><title><![CDATA[Git: Sync fork with upstream changes]]></title><description><![CDATA[A forked repository can be synced with the upstream one as follows:

Clone the forked repository locally if not already done.

git clone https://github.com/OWNER/FORKED-REPOSITORY.git


List the remote repository configured for your fork:

git remote...]]></description><link>https://blog.vivekdhami.com/git-sync-fork-with-upstream-changes</link><guid isPermaLink="true">https://blog.vivekdhami.com/git-sync-fork-with-upstream-changes</guid><category><![CDATA[Git]]></category><dc:creator><![CDATA[Vivek Dhami]]></dc:creator><pubDate>Thu, 07 Mar 2013 00:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/842ofHC6MaI/upload/146b0e09bbee634e093dc093b5c5b150.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A forked repository can be synced with the upstream one as follows:</p>
<ol>
<li>Clone the forked repository locally if not already done.</li>
</ol>
<pre><code class="lang-bash">git <span class="hljs-built_in">clone</span> https://github.com/OWNER/FORKED-REPOSITORY.git
</code></pre>
<ol start="2">
<li>List the remote repository configured for your fork:</li>
</ol>
<pre><code class="lang-bash">git remote -v
</code></pre>
<ol start="3">
<li>Add a new remote upstream repository that will be synced with the local fork:</li>
</ol>
<pre><code class="lang-bash">git remote add upstream https://github.com/OWNER/REPOSITORY.git
</code></pre>
<ol start="4">
<li>Verify the configured remote repository for your fork:</li>
</ol>
<pre><code class="lang-bash">git remote -v
</code></pre>
<ol start="5">
<li>Pull the latest changes from upstream remote:</li>
</ol>
<pre><code class="lang-bash">git fetch upstream
</code></pre>
<ol start="6">
<li>Checkout to your local master branch of the forked repository:</li>
</ol>
<pre><code class="lang-bash">git checkout master
</code></pre>
<ol start="7">
<li>Merge upstream/master to local master branch:</li>
</ol>
<pre><code class="lang-bash">git merge upstream/master
</code></pre>
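<p>Alternatively, if you prefer a linear history, the merge in step 7 can be replaced with a rebase onto the upstream branch. Note that rebasing rewrites your local commits, so a force push may be required afterwards:</p>
<pre><code class="lang-bash"># Replay local commits on top of the upstream branch
git rebase upstream/master
</code></pre>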
<ol start="8">
<li>Push the merged changes to your remote fork (the merge in step 7 already creates a commit when one is needed):</li>
</ol>
<pre><code class="lang-bash">git push origin master
</code></pre>
]]></content:encoded></item><item><title><![CDATA[Git: Submodule Init and Update]]></title><description><![CDATA[For a repository with submodules all the submodules can be pulled down locally for the first time using:
git submodule update --init --recursive

Subsequently, submodules can be updated with remote changes using:
git submodule update --recursive --re...]]></description><link>https://blog.vivekdhami.com/git-submodule-init-and-update</link><guid isPermaLink="true">https://blog.vivekdhami.com/git-submodule-init-and-update</guid><category><![CDATA[Git]]></category><dc:creator><![CDATA[Vivek Dhami]]></dc:creator><pubDate>Sat, 02 Feb 2013 00:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/EUzk9BIEq6M/upload/3e582eb75099cf6d221735536c0ff0a8.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>For a repository with submodules all the submodules can be pulled down locally for the first time using:</p>
<pre><code class="lang-bash">git submodule update --init --recursive
</code></pre>
<p>Subsequently, submodules can be updated with remote changes using:</p>
<pre><code class="lang-bash">git submodule update --recursive --remote
</code></pre>
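<p>As a shortcut, a repository and all of its submodules can also be cloned in a single step:</p>
<pre><code class="lang-bash"># Clone and initialise all submodules recursively in one command
git clone --recurse-submodules https://github.com/OWNER/REPOSITORY.git
</code></pre>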
]]></content:encoded></item><item><title><![CDATA[MySQL–Restoring large database on windows]]></title><description><![CDATA[Recently while trying to restore database from a large database backup file I encountered few issues with MySQL.
So for reference I am posting the solution steps for resolving these issues on windows :

Initiate the restoration process from command p...]]></description><link>https://blog.vivekdhami.com/mysqlrestoring-large-database-on-windows</link><guid isPermaLink="true">https://blog.vivekdhami.com/mysqlrestoring-large-database-on-windows</guid><category><![CDATA[MySQL]]></category><dc:creator><![CDATA[Vivek Dhami]]></dc:creator><pubDate>Fri, 17 Aug 2012 00:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/Y9kOsyoWyaU/upload/b1a4093b1b58cb256a7029d4b3675a98.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Recently, while trying to restore a database from a large backup file, I ran into a few issues with MySQL.</p>
<p>For reference, here are the steps I used to resolve these issues on Windows:</p>
<ol>
<li>Initiate the restoration process from command prompt as follows:</li>
</ol>
<pre><code class="lang-bash">mysql -u root -p &lt; [sql file path]
</code></pre>
<p>You will then be prompted for the root account’s password, after which the server will begin restoring the database from the backup. However, if the MySQL server’s configuration has not been adjusted to handle large backups, the restoration will fail with the following message:</p>
<pre><code class="lang-bash">Error 2006: MySQL server has gone away.
</code></pre>
<ol start="2">
<li><p>Stop the MySQL service. On Windows, MySQL is generally installed as a service, so open the Services window (run <strong>services.msc</strong> from the Run dialog) and stop the relevant service.</p>
</li>
<li><p>Locate <strong>my.ini</strong> file on your system. On <strong>Windows XP</strong> the default location is:</p>
</li>
</ol>
<pre><code class="lang-bash">&lt;Drive&gt;:\Documents and Settings\All Users\Application Data\MySQL\MySQL Server &lt;version&gt;
</code></pre>
<p>For <strong>Windows 7</strong> the default location is:</p>
<pre><code class="lang-bash">&lt;Drive&gt;:\ProgramData\MySQL\MySQL Server &lt;version&gt;
</code></pre>
<ol start="4">
<li>Open <strong>my.ini</strong> file and add the following:</li>
</ol>
<pre><code class="lang-bash">max_allowed_packet = &lt;Size&gt;M
wait_timeout = &lt;time <span class="hljs-keyword">in</span> seconds&gt;
</code></pre>
<p>Use a scaled-down value for <strong>max_allowed_packet</strong>; a value around half the size of the backup file is generally a good starting point. Save the file and restart the MySQL server by starting the stopped service. Then restart the database restoration process as described in step 1. If the error still persists, reduce the max_allowed_packet size further in step 4 and repeat from step 1.</p>
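<p>For example, for a backup file of roughly 1 GB, a starting point in <strong>my.ini</strong> might look like the following (both values are illustrative, not recommendations):</p>
<pre><code class="lang-bash">max_allowed_packet = 512M
wait_timeout = 28800
</code></pre>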
<p>Following are some links for reference:</p>
<p>http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html</p>
<p>http://bogdan.org.ua/2008/12/25/how-to-fix-mysql-server-has-gone-away-error-2006.html</p>
<p>http://wpmu.org/how-to-backup-and-import-a-very-large-wordpress-mysql-database/</p>
]]></content:encoded></item></channel></rss>