Ansible Data Manipulation with Modules

Ansible loves to pretend that YAML is a programming language. It isn’t. And every engineer who has ever tried to munge data inside a playbook knows the pain. You have filters everywhere, Jinja spaghetti, and tasks that look like they were written during a period of sleep deprivation. How do I know? Guilty as charged.

Just to be clear, what I’m saying is that YAML and Jinja were never intended to be a data-processing stack.

The usual Ansible talking point is that “Ansible is declarative, not imperative”. Sure. But then I immediately need to write imperative logic in Jinja because the playbook layer simply isn’t built for data transformation. I do it, you do it, we’ve all done it. It’s usually quick, and depending on the use case, relatively painless, but at some point, you’ve taken it too far. I know I have, so I’m talking about it now. Automation should be declarative, but you need imperative to achieve declarative – stuff needs to be queried and computed to achieve a desired state. Ansible provides all the batteries needed to achieve this.

The pain points are real. You end up with:

  • Complex list/dict transformations
  • Conditional logic that becomes unreadable in YAML
  • Repeated filter chains that break the moment your data shape changes
  • Playbooks that become untestable because the logic is embedded in templates

Fundamentally, if you’re doing anything non‑trivial with data, YAML based tasks and Jinja are the wrong tool.

Ansible does have a solution: move the logic into a module.

Stop abusing filters, YAML, and Jinja. Write a module. If your playbook contains more than two chained filters, chains of set_fact tasks, or complex Jinja, you probably should have written a module.

I’ve written my fair share of modules, and they aren’t that difficult, but my mindset has always gone to the convoluted set_fact, conditional, filter, Jinja fiasco, because somehow it seems easier at the time. Perhaps it is when you’re trying to capture that initial thought process. But at some point you need to give yourself a reality check: maybe it’s simpler to start with modules than to convert later. That’s the thought experiment I’d like you to consider. A module gives you:

  • Real programming constructs
  • Real error handling
  • Real testability
  • Real maintainability
  • Real version control and reuse

Why is this a better Pattern?

Input validation – YAML doesn’t do it. Playbooks don’t either (don’t say assert to me, as I’ve abused that as well). Jinja definitely doesn’t.

Modules let you validate input before you do your thing with it.
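In a real module you would normally declare most of this through the argument_spec you hand to AnsibleModule (required, type, choices). The sketch below shows the same fail-early idea in plain Python so it runs without Ansible installed. The parameter names (api_host, window, required_tag) and the default tag are my assumptions for illustration, not something lifted from an actual role.

```python
from dataclasses import dataclass

# Hypothetical choices, mirroring a daily/weekly backup schedule.
VALID_WINDOWS = ("daily", "weekly")


@dataclass(frozen=True)
class BackupParams:
    api_host: str
    window: str
    required_tag: str


def validate_params(raw: dict) -> BackupParams:
    """Check raw task arguments and fail early with a readable message."""
    missing = [key for key in ("api_host", "window") if not raw.get(key)]
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")

    window = str(raw["window"]).lower()
    if window not in VALID_WINDOWS:
        raise ValueError(f"window must be one of {VALID_WINDOWS}, got {window!r}")

    return BackupParams(
        api_host=str(raw["api_host"]),
        window=window,
        # Assumed default tag name; normalised to lowercase for later comparisons.
        required_tag=str(raw.get("required_tag", "backup")).lower(),
    )
```

The point is that bad input stops the run before any API call is made, with one clear error instead of a failed filter three tasks later.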

Modules are testable

You can unit test a module. You cannot unit test a Jinja filter chain inside a task, and when your processing is a sequence of knitted tasks full of set_facts and recursive playbook calls, you’ve crossed the line into prayer-based testing.
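To make that concrete, here is a minimal sketch of a pure helper plus its unit tests. The helper, normalize_tags, is hypothetical, and I’m assuming tags arrive as a single string separated by semicolons (with the odd comma thrown in). Nothing here needs Ansible or a live API.

```python
import unittest


def normalize_tags(raw_tags):
    """Turn a raw tag string into a sorted list of lowercase, de-duplicated tags."""
    if not raw_tags:
        return []
    # Treat commas like semicolons so both separators are accepted.
    parts = raw_tags.replace(",", ";").split(";")
    return sorted({part.strip().lower() for part in parts if part.strip()})


class NormalizeTagsTest(unittest.TestCase):
    def test_empty_input_gives_empty_list(self):
        self.assertEqual(normalize_tags(None), [])
        self.assertEqual(normalize_tags(""), [])

    def test_mixed_case_and_whitespace(self):
        self.assertEqual(normalize_tags("Backup; weekly ;BACKUP"), ["backup", "weekly"])

    def test_comma_separated_input(self):
        self.assertEqual(normalize_tags("a,b;c"), ["a", "b", "c"])
```

Save it as a test file and run it with python -m unittest. Try doing that with a filter chain buried in a task.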

Modules are reusable across roles and playbooks

Copy-pasting filter chains, Jinja compute, or those wonderful blocks of set_facts and conditionals is how outages happen.

Modules reduce cognitive load

A 20-line Python function is easier to understand than a 20-line set_fact, conditional, Jinja monstrosity.

The summary is: Playbooks orchestrate. Modules compute. This is how Ansible should always have been used.

So, what is my example problem, and how do I fix it with modules?

My Ansible role was running Proxmox backups in my home lab. I was only backing up systems that had been powered on since the last backup, either daily or weekly. My pve_backup role was doing all of the following in YAML:

  • Multi‑node API discovery
  • Cross‑node VM enumeration
  • Tag parsing and normalization
  • Per‑VM filtering
  • Per‑VM state evaluation
  • Time‑window logic
  • Task‑history correlation
  • Backup triggering
  • UPID polling
  • Error handling

This is imperative logic. YAML + Jinja is not an imperative language. I had effectively built a Python program using a markup language. Yay me!

Based on the thought process I described above, I could identify many ‘code smells’:

Excessive set_fact

This is always a sign the playbook is doing computation, not orchestration.

Nested loops + sub elements

This is a red flag that the data model is too complex for YAML.

Repeated REST calls with identical headers

Modules handle this cleanly; playbooks do not.
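As a rough sketch of what “cleanly” can look like: build the base URL and the auth header once, then reuse them for every call. The class and method names are hypothetical. The header follows the Proxmox API token format (PVEAPIToken=USER@REALM!TOKENID=SECRET), and I’m assuming the default API port 8006. TLS verification for self-signed lab certificates is deliberately left out.

```python
import json
import urllib.request


class ProxmoxClient:
    """Tiny API wrapper that builds the base URL and auth header once."""

    def __init__(self, host, token_id, token_secret, port=8006):
        self.base_url = f"https://{host}:{port}/api2/json"
        # token_id is expected in the form USER@REALM!TOKENID
        self.headers = {"Authorization": f"PVEAPIToken={token_id}={token_secret}"}

    def build_request(self, path, method="GET"):
        """Return a prepared request; every call shares the same headers."""
        url = f"{self.base_url}/{path.lstrip('/')}"
        return urllib.request.Request(url, headers=self.headers, method=method)

    def get(self, path, timeout=10):
        """Send a GET and return the 'data' field of the JSON response."""
        with urllib.request.urlopen(self.build_request(path), timeout=timeout) as resp:
            return json.load(resp)["data"]
```

One object, one set of headers, and no more copy-pasted uri tasks that differ only in the path.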

Per‑VM include files

This is a workaround for the fact that YAML cannot express real logic.

State accumulation (vms_powered_on_last_week)

This is business logic, not orchestration.
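In Python the same accumulation collapses into one small function. This is a sketch under assumptions: powered_on_since is a hypothetical name, and I’m assuming the task history has already been reduced to (vmid, started_at) pairs with timezone-aware timestamps.

```python
from datetime import datetime, timedelta, timezone


def powered_on_since(start_events, window, now=None):
    """Return sorted VMIDs that have a start event inside the window.

    start_events: iterable of (vmid, started_at) pairs, timezone-aware datetimes.
    window: a timedelta, for example one day or one week.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - window
    # A set removes duplicates when a VM was started more than once.
    return sorted({vmid for vmid, started_at in start_events if cutoff <= started_at <= now})
```

Calling powered_on_since(events, timedelta(days=7)) gives the weekly list and timedelta(days=1) gives the daily one, with no set_fact in sight.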

UPID polling in YAML

This is the worst possible place to do it.
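For comparison, a polling loop in Python is a handful of lines. This is a minimal sketch, not my actual module: get_status is a hypothetical callable that takes a UPID and returns a dict shaped like the Proxmox task status response, which I’m assuming carries "status" ("running" or "stopped") and, once stopped, "exitstatus" ("OK" on success). The sleep and clock parameters exist only so the loop can be tested without waiting.

```python
import time


def wait_for_task(get_status, upid, timeout=3600, interval=5,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll a task until it stops; return True only if it finished with exit status OK."""
    deadline = clock() + timeout
    while True:
        status = get_status(upid)
        if status.get("status") == "stopped":
            return status.get("exitstatus") == "OK"
        if clock() >= deadline:
            raise TimeoutError(f"task {upid} still running after {timeout}s")
        # Wait before asking again so the API is not hammered.
        sleep(interval)
```

Because the status lookup is injected, the loop can be exercised with canned responses, which is exactly what you cannot do with until/retries in a task.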

Debug statements everywhere

Because debugging YAML logic is hell.

My role wasn’t bad in the sense that it did ‘work’. It was simply doing something Ansible playbooks were never designed to do.

How did I refactor this mess?

A good module model I use is:

One module = one conceptual operation

My conceptual operations are:

“Given a Proxmox cluster, return the list of VMIDs that should be backed up.”

“Given a VMID, run vzdump and wait for completion.”

That’s it. Two modules replace ~300 lines of gnarly YAML.

Why this is objectively better (oh and I simply feel better about it)

Testable

You can unit‑test the module logic without running Ansible.

Faster

Fewer tasks, leading to fewer forks, leading to fewer HTTP sessions.

Maintainable

No more Jinja filter soup.

Debuggable

You can print structured Python objects, not YAML hacks.

Reusable

Other roles can use the same modules.

Correct abstraction

Playbooks orchestrate. Modules compute.

In summary

Think of your future self now 🙂

Train yourself to spot the above code smells sooner rather than later.

N-4 support for Windows Server Upgrades – Wow?

Windows Server 2025 is well and truly here, and Microsoft is pushing hard to convince you that this is the smoothest upgrade cycle in years. They’re not wrong, but is it the whole story? If you’re running 2012 R2, 2016, 2019, or 2022, the new N-4 in-place upgrade support means you can jump straight to 2025 in one hop. That’s a big deal. But whether you really can do this is a different question.

If you’re still on 2012 R2, this is your last lifeline.

Media‑Based Upgrade: The Official Story vs Reality

Microsoft claims the ISO‑based upgrade takes “under an hour per server.” Sure, in a lab. With clean images. And no vendor agents. And no ancient NIC drivers. And no weird backup software from 2014. I was able to easily do it under these ideal conditions.

In the real world, which is often quite messy (today’s understatement), expect:

  • Several hours for typical servers – some of these old physical servers take 15 minutes to boot.
  • Longer for anything with third‑party agents, monitoring hooks, or legacy storage drivers
  • Rollback time if something breaks (and something always breaks). Do you have a rollback plan for your in-place upgrade? Does your backup actually work?

Still, the process is quite straightforward: mount ISO -> run setup -> choose “Keep files, settings, and apps” -> OK, some form of conversation with a deity may help here.

Virtual Machines: The Easy Path (Mostly)

If you’re on Hyper‑V, Microsoft is right: upgrades are usually painless. Hyper‑V integration components update automatically during setup.

If you’re on VMware, Nutanix, Proxmox, or anything else, Microsoft’s advice is blunt but correct:

Update your guest drivers first. Outdated virtualization drivers are the #1 cause of upgrade failures.

Ignore this and you’ll be staring at a recovery console and re-considering your life choices.

Physical Servers: Where Things Get Messy

Microsoft politely hints that your 2012 R2 hardware is “about 15 years old.” Translation: Your server is a fossil. Don’t expect miracles.

You must validate:

  • NIC drivers
  • HBAs
  • RAID/storage controllers
  • PCIe cards
  • Out‑of‑band management firmware

If any of these lack 2025‑compatible drivers, you’re choosing between:

  • Buying new hardware
  • Virtualizing the workload
  • Moving it to the cloud
  • Or hoping

Planning: The Part Everyone Skips Until It’s Too Late

Microsoft’s checklist is great, but incomplete. Put some real thought into these as well:

  • Uninstall old antivirus/EDR agents as they break upgrades constantly
  • Disable third‑party disk encryption
  • Remove legacy monitoring agents (SolarWinds, SCOM, etc.)
  • Disconnect from SANs you don’t need during upgrade
  • Expect to reboot multiple times even after the upgrade completes

The main missing step: Fix whatever broke. Something always breaks.

You cannot in-place upgrade domain controllers. Don’t even bother trying. Let me help you here: no, stop, don’t. You can thank me later.

The process for Active Directory upgrades remains the same as it has been for 20 years: Deploy new DC, Replicate, Demote / Promote, Raise Functional Levels.

Licensing: The Part Everyone Forgets – I say everyone, but it was me – I forgot.

If you have Software Assurance, you’re fine. If you rely on KMS, make sure it’s updated, because 2025 requires new keys.

If the Upgrade Fails

Microsoft recommends:

  • Checking logs in C:\Windows\Panther
  • Running SetupDiag
  • Restoring from backup
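If you end up reading the Panther logs by hand, a few lines of Python can pull out the error entries first. This is a sketch under assumptions: I’m assuming the usual setup log layout of date and time, a comma, then a severity word such as Info, Warning, or Error, as seen in setupact.log and setuperr.log. The function name is hypothetical.

```python
from pathlib import Path


def setup_errors(log_path, levels=("Error",)):
    """Return log lines whose severity field starts with one of the given levels.

    Assumes lines look like '<date> <time>, <Level>  <component>  <message>'.
    levels must be a tuple of severity words.
    """
    matches = []
    # errors="replace" keeps the scan going if the file contains odd bytes.
    with Path(log_path).open(encoding="utf-8", errors="replace") as handle:
        for line in handle:
            # Everything after the first ", " begins with the severity word.
            _, _, rest = line.partition(", ")
            if rest.lstrip().startswith(levels):
                matches.append(line.rstrip())
    return matches
```

Point it at a copy of the log, for example setup_errors(r"C:\Windows\Panther\setupact.log"), and start with whatever it returns before scrolling through the rest.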

Final Thoughts

If you didn’t test the upgrade on a clone first, you’re already headed for disaster – think of your future self.

You can find more details in “Upgrading to Windows Server 2025 from Windows Server 2012 R2, 2016, 2019, or 2022 using Media (ISO)” on the Microsoft Community Hub.
