Is it a configuration management tool, like Ansible?
Is it meant for running one-off commands across the infrastructure, like Salt?
It says it integrates with Terraform, so it's not a provisioning tool...
What does it do differently (and presumably better) than other tools?
The Getting Started guide doesn't cover this. The FAQ doesn't cover this, and the Docs don't have an introductory section to cover this.
It's disheartening to find a potentially interesting project but not really know what it does or how it might fit into your workflow.
That's what I discovered by reading the homepage.
And that's cool; Ansible is a bit of an oddball system, but then I'm still left wondering: why is this better, or at least why is it better for the author?
I've used cfengine, Puppet, Chef, bcfg2 (briefly) and ansible. I want to know what makes this tool different and better. :)
It can also run one-off commands across the infrastructure (like Tentakel: https://pypi.org/project/tentakel/ ).
I've been using Pyinfra for some time. It's good enough for me.
Reasonable people can 100% disagree about whether YAML is the correct packaging for those operations, and Ansible is a bit too imperative for my liking, but as far as "I have one hammer..." goes, it does all the things.
My main gripe with Ansible is the YAML specification. Ansible chooses to separate the task specification and task execution. Pyinfra chooses to directly expose the Python layer, instead of using slightly ugly magic functions/variables. I like this approach more since it allows standard Pythonic control flow instead of using a new (arguably ugly and more hassle to maintain) grammar.
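To make the contrast concrete, here is a hypothetical sketch (not pyinfra's actual API, and `plan_packages` and its arguments are made up for illustration): where Ansible needs `when:` clauses and Jinja filters, plain Python control flow handles branching and list-building directly.

```python
# Hypothetical sketch: plain Python control flow instead of YAML
# `when:` clauses and Jinja filters. plan_packages() and its arguments
# are illustrative, not part of any real tool's API.
def plan_packages(os_family: str, want_docs: bool) -> list[str]:
    packages = ["curl", "git"]
    if os_family == "debian":     # a YAML `when:` clause becomes an if
        packages.append("apt-transport-https")
    elif os_family == "redhat":
        packages.append("yum-utils")
    if want_docs:                 # a Jinja filter becomes a comprehension
        packages += [f"{p}-doc" for p in ("curl", "git")]
    return packages
```

Everything is ordinary Python, so it can be unit-tested and debugged with standard tooling rather than a templating layer.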
Excited for Pyinfra!
A couple years ago I inherited about 100 Mac Pros that are part of $dayjob's CI infrastructure. They had been managed over the years using a combination of shell scripts, Chef, and manually via VNC. No two machines were alike. The Chef recipes had all bit-rotted and weren't usable and due to $reasons were based on an old version of Chef that $company was stuck on.
So I looked around for alternatives, and being most comfortable in Python, I explored Ansible, Salt, and Pyinfra.
Ansible seemed like the obvious choice, but it has very few playbooks/actions for macOS systems. I was going to have to write my own. As I dug into its documentation, I found it was taking me a long time to wrap my head around all that I needed to do and started to sour on its complexity. This is a matter of taste, but I just didn't find Ansible very welcoming. I wanted something simpler.
I had previously used Fabric, so considered using it again. But Fabric offers too little (it's really not much more than parallel ssh; if you want idempotent operations you have to write that yourself), and I don't agree with the direction it took with version 2.x.
Then I found Pyinfra. It took me less than 30 minutes to understand it in its entirety. It's conceptually simple: you have an inventory of machines that it connects to in parallel over ssh. You provide it with a deploy script that combines facts and operations. Pyinfra uses the deploy script to gather facts about each machine, then you use those facts to decide whether you need to perform any operations. It then performs those operations on each machine as needed. The inventory file, deploy script, facts, and operations are trivial to write for someone comfortable with Python. It's all Python with the facts and operations being decorated functions. There is no DSL to learn. (It comes with a bunch of pre-written facts and operations, but they are mostly for Linux systems. I mostly had to write my own for macOS, but I found them really easy to write.)
I had it operational the same day I found it. I used it to successfully get all of the Mac Pros into a consistent state: things like system settings, installing Xcode, automating installs of brew packages all at the same version, installing JVMs, updating and upgrading macOS, installing Sentinel One, etc.
I've been very happy with it, even contributing a few PRs to fix small bugs and contribute minor functionality.
I also hang out on the Matrix room: https://matrix.to/#/#pyinfra:matrix.org
Another thing: the GH repo currently points at v3, which is in beta, and the docs for it are here: https://docs.pyinfra.com/en/next (I highly recommend starting with v3; I just haven't had time recently to wrap up the release, but it's stable).
It's all pretty messy but useful.
I also think that the facts/manifest/apply separation is conducive to nicely testable infra code, and useful dry-run output.
I'm always surprised that Puppet isn't still more popular. My theory is that it's passed over because of its age/cruftiness/bad vibes in some cases, and that a couple of technical flaws mess it up for some key userbases:
For folks who just want a quick-to-start management tool for a small set of config, Puppet's ugly and clunky client/server model and the hyper-YAML-ification of its best practices (which is pursued to a fault by the community, and not helped by the Hiera pitch that the Puppet stack can also be sort of an asset tracking/catalog system) make small-scale usage and prototyping hard. Puppet doesn't have to be used that way (it can be used just like pyinfra/Ansible with a local-apply or via Bolt, hitting a nice sweet spot between ad-hoc/non-idempotent commands and nice declarative/idempotent Puppet code), but I think the puppetmaster/hiera-all-the-things legacy in the community does Puppet and potential new users a disservice.
From the other side, I think a lot of more cloud-oriented users looking for a "better Terraform for server state" end up annoyed by the quality of modules on the Puppet forge and Puppet's lack of a statefile equivalent (meaning that it doesn't support deletes or infrastructure state snapshots in the same way TF does).
You can specify your config in user-data when launching pretty generic AMIs. https://cloudinit.readthedocs.io/en/latest/index.html
Yes
> Is it meant for running one-off commands across the infrastructure, like Salt?
Also yes.
> It says it integrates with Terraform, so it's not a provisioning tool...
The TF integration is specifically to use TF as an inventory source - i.e. TF creates the resources and pyinfra then configures them.
> What does it do differently (and presumably better) than other tools?
The homepage covers the highlights. I originally created pyinfra because debugging Ansible was complicated (no plain stderr, since it's not "just" commands on the remote side) and slow, but things have evolved significantly since then.
> The Getting Started guide doesn't cover this. The FAQ doesn't cover this, and the Docs don't have an introductory section to cover this.
Hugely appreciate this feedback, this is super helpful and something I will attempt to make clearer.
---
Quick attempt at a better explanation: You write Python code that defines operations (either state "this apt package should be installed" or stateless "run this command"), provide an inventory of targets (SSH, local machine) and pyinfra executes it.
Roughly sits where Ansible does for configuring servers, but also solves the case of "how do I run this command across my server fleet" (which I believe Ansible can also do).
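For readers skimming: a minimal sketch of what a deploy looks like. The operation calls below match the documented pyinfra API, but the host names are made up, and the script does real work only when run by the `pyinfra` CLI rather than executed directly.

```python
# deploy.py - run with: pyinfra inventory.py deploy.py
# (inventory.py can be as simple as: targets = ["web1.example.com"])
from pyinfra.operations import apt, server

apt.packages(          # state definition: acts only if nginx is missing
    name="Ensure nginx is installed",
    packages=["nginx"],
    update=True,
)
server.shell(          # stateless operation: always runs
    name="Check uptime",
    commands=["uptime"],
)
```

pyinfra first gathers facts over SSH, shows the proposed changes per host, then executes them in parallel.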
A few years ago, I found a library that lets you utilize Ansible's tasks in raw Python, without the huge hassle of using the Ansible Python API. I cannot find it again however. But PyInfra looks great.
Also, CDKTF is in this space for imperative infrastructure-as-code definitions.
Also, can I just say that config management is extremely frustrating? Not sure this is the fix, but hopefully the story isn't over. In my experience the maintenance of CM codebases never, ever stops. At first I thought it was a matter of expertise, but experts typically agree and just call it the cost of doing business.
Shelve something for three months and it will break on the next run, on the same OS/host where it used to work. Blame the package manager, blame the OS choice, or the CM tool. But it's embarrassing and insulting for DevOps teams after putting in the effort to do things right, and evangelizing to everyone else about repeatability. I'd rather just see tighter integrations with containers moving forward and never think about it again. Not everyone is using k8s, but in the 2020s everyone should probably default to using Docker before doing things of even marginal complexity directly on hosts.
https://news.ycombinator.com/item?id=18717422
Interesting to see all the Ansible comments here. I'll have to check this out asap.
There's something about YAML that just sucks the joy out of programming. It seems like a giant step backwards when we have plenty of amazing programming languages in existence.
Even when infrastructure YAML like CloudFormation is wrapped by some SDK, it can still be a pain because you end up with stuff like...
do_something("___((!-prickly_config_string_::might as well use yaml _blah-blah:blah))")
Back in the days of Java and XML, there used to be a distant promise of "binding" the XML to code (remember JAXB?) so that you could then just manipulate it fluently as code and then "marshall" it back out to XML when you were done. Those days and that promise are gone, right?

If so, nice, shout about it more: it's my number one requirement of such a tool, why I think Terraform (or OpenTofu) is great and mostly everything else sucks, and I think it should be everyone's. It's just obviously (at least, once someone makes it available!) the correct paradigm for managing stateful resources and coping with drift.
- state definitions: "ensure this apt package is installed" (apt.packages: https://docs.pyinfra.com/en/next/operations/apt.html#operati...)
- stateless: "run these shell commands" (server.shell: https://docs.pyinfra.com/en/next/operations/server.html#oper...)
Most operations are state definitions and are much preferred; the stateless ones exist to satisfy edge cases where the stateful version either isn't implemented or simply isn't possible.
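The distinction can be sketched abstractly (this is illustrative, not pyinfra's internals): a stateful operation diffs desired against current state and emits commands only for the difference, while a stateless one always emits its command.

```python
# Illustrative sketch of stateful vs stateless operations; not
# pyinfra's internals, just the concept. All names are made up.
def stateful_packages(installed: set[str], desired: set[str]) -> list[str]:
    missing = sorted(desired - installed)
    # no-op when the state already matches (this is what makes it idempotent)
    return [f"apt-get install -y {' '.join(missing)}"] if missing else []

def stateless_shell(command: str) -> list[str]:
    return [command]  # always runs, every deploy
```

Running the stateful version twice in a row changes nothing the second time; the stateless version runs its command both times.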
Maybe I need an explanation "like I'm just a programmer/sysadmin and need boring, years-old terms" of what is what; every explanation so far (when I bothered to look for it last) was too invested in this theatrical terminology, so I gave up and stuck to what worked after a command or two.
Same with Chef and its galore of cooking words, but thankfully I don’t have to use Chef.
Config management that uses SSH is generally good enough.
Configuration Management tools (that's what this, and Ansible, are) are a nice idea, but get very complicated very quickly. The tools themselves get complicated, the configuration gets complicated, you're constantly finding ways that the state gets broken that you need to re-incorporate into your script, it has to work in a variety of states, and you have to keep re-running and re-running and re-running it, monitoring for problems, investigating, fixing. Very complex, lots of maintenance, lots of potential problems. The "Pets" model from the phrase "Cattle, not Pets." I strongly recommend you do not raise Pets.
Instead, use Immutable Infrastructure: build an immutable image one time that works one way. Deploy that image. If you need to change it, change the build script, build a new image (with a new version), deploy a new instance with the new image, take the old one out back and shoot it. (The "Cattle" of "Cattle, not Pets") If the state gets out of whack or there are problems, just shoot it and deploy a new one that you know works.
This is the single most revolutionary concept I've seen in over 20 years of doing this job. It is an absolute game-changer. I would not go back to Configuration Management for all the tea in China.
Because people wanted to reuse their code, the abstraction of roles was created. A role is something like "set up basic OS stuff", "create this list of users", "set up a web server", "set up a database". The association of which roles to run on which machine still happens in the playbook.
Not trying to be rude ofc, I'm sure you considered it and have a good reason – just curious what it is. It's an incredible project you've put together, nonetheless :)
It's unfortunately not widely known that Puppet can be run just like you describe, over SSH (or, for e.g. running in a Docker container, can be invoked as a one-shot "puppet apply" against a local configuration file like pyinfra's "local" transport): https://www.puppet.com/docs/bolt/latest/bolt.html. Doing that requires no background daemons, puppetmasters, cert-signing hell, inventory management PuppetDB/Foreman stacks, or any of that stuff: you run a command which SSHes to a remote/local machine and applies changes based on instructions written in Puppet-lang or one-off scripts. The remote end is entirely self-hosting; it doesn't rely on anything being running on the remote host (Bolt will install the "puppet-agent" package to bootstrap itself, but in this context that package is inert and is used equivalently to a library when you run tasks).
I'm with you that the agent-based approach is far from the best way to go these days. I'm just bummed that we're throwing the baby out with the bathwater: I wish Puppet-the-language and Puppet-the-server-management-tool weren't so often dismissed along with the Puppet-as-inventory-system or Puppet-as-daemonized-continuous-compliance-engine.
The value add would be unifying provisioning and configuration management in a Python-y experience? The lifecycle of each is distinct and that's traditionally where the headaches of using a single tool for both has come in
In Terraform CDK, you use a language like python to compute the set of resources and such you want to have, and then hand that over to the terraform core, which does the usual terraform song and dance to make it happen.
This is actually interesting to me, because we struggle so much with even the simplest data transformations in Ansible. Like, as soon as you start thinking about doing a simple list comprehension in Python inside Jinja templating, lots and lots of pain starts. From there, we're really starting to think about templating the inventories in some way because it would be less painful.
On a side note, one of the most hacky things I came up with to get Ansible working on Fedora CoreOS was to bind mount a container rootfs that had python 3 and then symlink it into the right spots. You can of course add Python in with rpm-ostree if you want but I wanted to avoid layering packages at the time. I wasn't proud of it. But it worked.
https://github.com/forem/selfhost/blob/main/playbooks/templa...
Ruby is far, far preferable to shell for ease of idempotence and implicit convergence.
For hour+ runtimes I really do think that's pretty much always user error. I know that's a clichéd and grouchy comment, but (as, I'll admit, a Puppet fan with some personal defensiveness for a favored tool) I do think it's true in this case.
- Python is not easy to build into portable binaries
- The package ecosystem is very hard to use in a reproducible way
- The language is not truly typed - types add massive value for infrastructure and scripts because they are less likely to be unit-tested
- The lack of a "let" or "var" keyword makes simple programming errors more likely (again, this code is less likely to be unit-tested)
Maybe I'm missing something? I don't know why I would want to introduce Python in this domain.
Totally agree on templating, which is why inventories have always been Python code, just like operations, giving maximum flexibility (with some complexity/type-confusion drawbacks).
But there are always edge cases and situations where that doesn't work, which is why pyinfra supports both, and they can be combined any way you like.
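Since an inventory is just a Python module, hosts can be computed rather than hard-coded. A hypothetical example (the host names are made up; pyinfra treats list-valued names in the inventory file as host groups):

```python
# inventory.py - a pyinfra inventory is plain Python; list-valued
# names become host groups. Host names here are hypothetical.
web_servers = [f"web{i:02d}.example.com" for i in range(1, 4)]
db_servers = ["db01.example.com"]
```

The same file could equally call a cloud API or read a CSV to build those lists, which is what makes templated static inventories unnecessary.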
It's very hard to be confident about python code.
If you have a good code review feedback loop and so on then it can be OK but proper types enable lots of good things when dealing with configuration and state.
Having inherited a big Puppet mess from some people who used the flexibility of Ruby to automate 5 datacenters but then left the company was also an interesting experience..
I chose Python because it’s what I was writing all day back in 2015. Which makes me realise pyinfra is almost 10!
Edit: I mostly write Go or YAML (k8s) these days but Python still makes an appearance from time to time (outside of pyinfra dev).
Declarative scripts make it easy to manage a fleet.
It was before ansible 2, so probably things are better now.
Then we started using Python fabric. Wow it was so freeing. Any helper methods were easily extracted and writing conditions felt natural.
Now I am using Python invoke to maintain my local setup.
    for i in range(100):
        ip = cidrhost(subnet, i)
        if get_server(ip):
            continue
        create_server(ip=ip)
and so on. I don't like it, but because it's procedural/imperative, not because it's particularly more 'petty' than the Terraform (or equivalent) would be.

For me it's more about what I'm doing, conceptually. I want a server to exist, it to have access to this S3 bucket, etc. - the logic of how to interface with the APIs to make that happen, to manage their lifecycle and check current state etc. isn't what I'm thinking about. (In Terraform terms, that belongs in the provider.) When I write the above I'm just thinking I want 100 servers, so:
    resource "cloud_server" "my_servers" {
      count = 100
      ip    = cidrhost(subnet, count.index)
      # and so on
    }
comes much more naturally.

I think there is not a lot of overlap between people who need to automate infrastructure and people who don't know how to install Python on their development machine.
As for your other comments regarding Python as a language: I mostly agree. I have stepped away from Python as a language to develop production software. In Python I miss the confidence I get from static typing. Having said that, for automating infrastructure, you're effectively comparing Pyinfra and Python to bash scripts and YAML (for things like Ansible), which are both orders of magnitude worse if you like static typing or any form of being able to verify what you wrote.
This is nowhere near the level of readiness needed to be reliably used in a production environment.
Verbose logging is not a reason to introduce a non-standard tool into your stack.
1. Use PyInfra to set up Docker and Tailscale on remote hosts and any other setup. Open the Docker port to your Tailnet.
2. Use the Docker provider for Terraform to set up and manage containers on those hosts from your dev machine or from a CI/CD tool. Tailscale allows containers on different machines to communicate privately, or you can open a port to the web.
This makes for such an easy-to-use and bulletproof setup. In the past I would have used Kubernetes but I've come to realize that's overkill for anything I do and way harder to debug.
The difference between just using some Python vs Terraform is idempotency. TF isn't going to touch the nodes the script succeeded on; if you have to restart your for-loop script, it will, which may not be desirable.
Frankly these days configuration management is a bit dated…
You’re much better off in most cases using a tool like Packer with whatever system you want to bake an image, then use a simple user-data script for customization.
It’s very hard to scale continuous config management to thousands of servers.
But practically, these tools have their areas of speciality.
So a portable binary is not a requirement. Other points like let or types are not an impediment either, there are many quality tools available if you need them (ruff, pyflakes, mypy), and python has been doing this kind of work productively for thirty years now.
It's very interesting though, I think I'm gonna try it myself.
https://docs.ansible.com/ansible/latest/reference_appendices...
One of them is handling "if-this-then-that-else-that". Being purely declarative, Ansible is horrible at that.
Pyinfra can be used in imperative mode, am I right? This would make the use of if-else a breeze, which would be a really good reason for me to switch.
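A deploy script is ordinary Python, so branching on gathered facts is plain `if`/`else`. A sketch (this fragment does real work only under the pyinfra CLI; the fact and operation names match the documented API, but exact fact values vary by distro):

```python
# Deploy-script fragment (run with the pyinfra CLI, not standalone):
# plain if/else on a gathered fact replaces Ansible's `when:` clauses.
from pyinfra import host
from pyinfra.facts.server import LinuxName
from pyinfra.operations import apt, dnf

if host.get_fact(LinuxName) in ("Debian", "Ubuntu"):
    apt.packages(name="Install nginx", packages=["nginx"])
else:
    dnf.packages(name="Install nginx", packages=["nginx"])
```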
Declarative structure is at the heart of functional programming. Declarative is not the right choice everywhere, but when it makes sense, it can significantly raise the quality of the code.
It's one of those tools that get out of your way and let you get the work done. The tool works for you instead of you fighting with the tool.
Pain points of Ansible, like storing state and checking it later or coordinating state between servers, are all a breeze with Pyinfra because you write the Python code to perform those checks.
The system is very well thought out. No need to hack around a hosts file; the inventory is just a Python script that exports resource definitions.
No more static, ad-hoc host vars: you get a real Python script to define and return your variables.
Using pyinfra I was able to focus more on the "compute"; state such as credentials and inventory can be managed and stored elsewhere, for example in SSM, or by calling the Python EC2 API to filter instances by tag.
I feel as though we're splitting hairs here, given there is, to the best of my knowledge, no `resource remote_file make_sshd_config { inventory_host = "whatever" dest = "/etc/sshd_config" src = "./sshd_config.tmpl" vars = {...} }` in TF. There is template, and there is local_exec and the rest is a Simple Matter Of Programming :-/
I'm waiting patiently for someone to chime in "well, just spawn ansible in local_exec" as if they're missing the point
As you craft your "Why this and not Ansible" content, you might actually state clearly what you already noted on the Performance page, namely: "One of the reasons pyinfra was started was performance of agent-less tools at the time." If I read that, it'd instantly make me want to stick around, read some more, play with pyinfra, etc. BTW, I will be playing with it anyway, but just wanted to point out that you likely won't need to start from scratch for copy (on a comparison or "Why this and not Ansible" content). Cheers!
No type checking = no serious job. I have learned enough from Ansible to not ever touch that kind of stuff again.
There has been a time that Python was a fringe language, only known by some hardcore nerds. I thought Joel Spolsky had once mentioned that having Python on your resume was a signal of a quality developer, someone who went off the beaten path.
Times have changed. Python is now the MS Excel for developers. It shines for quick and dirty data mangling. Unfortunately, that is how a sizable portion of people approach software engineering. My theory is that for some, having to do abstract thinking and perform a dry analysis beforehand is an impediment. They can only discover what they want while banging out something. They fix the runtime errors they can catch, and slap some more features on top.
Types imply a kind of foresight, and that is what some people really have difficulty with.
EDIT: Might sound negative, so I admit that the quick feedback cycle you can get from an interpreter language like php/python is a feature in itself.
But a fully type-hinted Python codebase is extremely expressive, the times where you have to opt-out of the type system is much much rarer and the types you end up writing are much more specific so you get stronger guarantees. It's not without downsides but I don't think it's "because they're hints you can't trust them" since lots of languages erase their types on compilation.
This looks like infinity times better than Ansible in some cases and somewhat worse in others (python.call every time I'd need to access a previous operation's result feels clunky, though I certainly understand why it works that way).
Do you think it would be possible to use Ansible modules as pyinfra operations? As in, for example:
    - name: install foo
      apt:
        pkg: foo
        state: present

could be available as:

    from pyinfra import ansible
    ansible(name='install foo').apt(pkg='foo', state='present')

where the `ansible` function itself would know nothing about apt, just forward everything to the Ansible module.

Note 1: I know pyinfra has a way to interface with apt, this is just an example :)
Note 2: It's just my curiosity, my sysadmin days are long gone now.
Pyinfra automates infrastructure super fast at scale - https://news.ycombinator.com/item?id=33286972 - Oct 2022 (37 comments)
Show HN: pyinfra v2 - https://news.ycombinator.com/item?id=30999030 - April 2022 (2 comments)
Pyinfra v2.0 Released - https://news.ycombinator.com/item?id=30973976 - April 2022 (3 comments)
Show HN: Pyinfra v1.4 - https://news.ycombinator.com/item?id=26983266 - April 2021 (3 comments)
Pyinfra – automate infrastructure super fast at scale - https://news.ycombinator.com/item?id=23487178 - June 2020 (64 comments)
Pyinfra v0.3 - https://news.ycombinator.com/item?id=13862942 - March 2017 (1 comment)
Pyinfra v0.2 - https://news.ycombinator.com/item?id=12956784 - Nov 2016 (2 comments)
Packer and Terraform do different jobs (they're both by Hashicorp!) - you can bake an immutable image all you like, you still need to get a server, put the image on it, give it that S3 bucket it needs, IAM, etc.
Maybe I am overlooking something because I am not a pythonista, but when looking at this code [1] I see only some superficial hints. Looking at `_make_command`, I need to look inside the body to see that the first argument is expected (?) to be callable (it is just ignored otherwise).
____
1. https://github.com/pyinfra-dev/pyinfra/blob/3.x/pyinfra/api/...
Alternatively you could just yield ansible cli and execute from the local machine using the @local connector.
In puppet and saltstack you can declare that a folder is empty and declare a specific file in this folder. The system's smart enough to delete all the files except the one.
To achieve such a feat in Ansible is hard. The easiest way is to have two tasks: one deletes everything and the second recreates your file. Doesn't feel very declarative.
Unrelated thing: they don't even try to be declarative in Ansible. E.g. you can have a file with state "touch". It's not a state if it updates on each playbook run!
I've gone to the trouble of googling these articles for you (it took me a whole 30 seconds!). Please read any of them.
https://webcache.googleusercontent.com/search?q=cache:https:...
https://devopscube.com/immutable-infrastructure/
https://thenewstack.io/a-brief-look-at-immutable-infrastruct...
https://www.digitalocean.com/community/tutorials/what-is-imm...
https://www.hashicorp.com/resources/what-is-mutable-vs-immut...
https://www.techtarget.com/searchitoperations/definition/imm...
https://www.oreilly.com/radar/an-introduction-to-immutable-i...
https://www.terraformpilot.com/articles/mutable-vs-immutable...
https://www.bmc.com/blogs/immutable-infrastructure/
https://www.linode.com/docs/guides/what-is-immutable-infrast...
https://devops.com/immutable-infrastructure-the-next-step-fo...
https://openupthecloud.com/what-is-immutable-infrastructure/
https://www.opsramp.com/guides/why-kubernetes/infrastructure...
https://www.cloudbees.com/blog/immutable-infrastructure
https://www.daily-devops.com/devops/immutable/architecture-p...
http://radar.oreilly.com/2015/06/an-introduction-to-immutabl...
https://highops.com/insights/immutable-infrastructure-what-i...
https://docs.aws.amazon.com/wellarchitected/latest/financial...
Is there a reason this isn't an option for you?
That said I think there are precious few good alternatives. I've been using Deno a fair bit for "scripting" and it works pretty well, but I wish there were more options.
Also I have to say if you are using a tool like this to manage thousands of machines you're absolutely doing it wrong. I don't even work in ops/infra but even I know that manually running commands on multiple machines via SSH is asking for trouble.
Despite its many great ideas, I never particularly liked the agent or need for a master server. And I've always managed to avoid learning Ruby so I couldn't easily hack on it myself. The company I'm with now uses it extensively so I'm having to re-learn it and so far my impression is that it went from "cool new open source thing" to "your average enterprise-grade bloatware thing".
Same as in programming, over-adherence to DRY leads to spaghetti code.
I definitely think it'd be easier to explain a python-like declarative language to someone who asks what programming is than actual python. 'It's just describing the way things should be' vs. 'it's like a series of instructions for how to compute ...'
Certainly not more clever IMO, if anything the opposite. Like I said above or elsewhere in this thread, when I'm managing infrastructure with Terraform I don't want to (and don't have to) be thinking about how to interface with the API, check whether things exist already, their current state, how to move from that to what I want, etc. I just know the way I want it being, I declare that, and the procedure for figuring it out and making it so is the provider's job. That's not smarter! The smart's in the provider! (But ok if you're going to make me flex, I've written and contributed to providers too... But that's Go; not declarative.)
DISCOVERING the available ansible actions is the JFC since, like all good things python, it depends on what's currently on the PYTHONPATH and what makes writing or using any such language-server some onoz
And this wasn't what you asked, but ansible has a dedicated library for exec, since the normal `ansible` and `ansible-playbook` CLIs are really, really oriented toward interactive use: https://github.com/ansible/ansible-runner#readme
This is baseless FUD.
Pyinfra is 8 years old, just 2 years younger than Terraform. It's well maintained, stable, and used by many teams in production. Just because it's not as widely known or adopted as other tools, doesn't mean it should be avoided. In fact, as you can see from testimonials here, users often prefer it over Ansible.
> Should probably just stick with the Terraform CDK or Chef if you need this level of expressibility.
Terraform is used for provisioning infrastructure. Pyinfra is a configuration management tool. They're not equivalent.
Chef is closer, but it's an older tool that has largely been superseded by Ansible. It shouldn't be anyone's first choice, unless they really need some obscure feature it does better than Ansible, or Puppet for that matter.
> Verbose logging is not a reason to introduce a non-standard tool into your stack.
Why would that be the only reason to use this? That's not even one of its prominent features, and surely all tools in this space support verbose logging...
What a confused comment.
In Ansible, it's fairly arduous to try to reshape data from command outputs into structures that can be used in loops in other tasks--especially if you want to merge output from multiple commands. Main usecase is more dynamic playbooks where you combine state from multiple systems to create a new piece of infrastructure.
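In Python the same reshaping is a comprehension. A generic sketch with made-up data (no Ansible or pyinfra API involved; `merge_by_host` and the field names are invented for illustration):

```python
# Illustrative sketch: merging the output of two "commands" (here just
# dicts keyed by host) into one loopable structure - awkward to do with
# Jinja filters, trivial in Python. All names and data are made up.
def merge_by_host(disk_free: dict[str, int], load: dict[str, float]) -> list[dict]:
    return [
        {"host": h, "disk_free_mb": disk_free[h], "load": load.get(h, 0.0)}
        for h in sorted(disk_free)
    ]
```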
I think templating yaml or templates inside yaml is a bit of an anti pattern.
Puppet doesn't work well for that. I've seen it come up in auditing scenarios since the agent can effectively report if the instance is still in the correct config state.
Admittedly some languages like Go do a better job integrating all this into the core of the language. However, Go doesn't tend to have as powerful of a stdlib so it tends to be a lot more verbose to achieve the same thing.
You can also throw in systemd units for Docker or Podman. I usually create a small shell script that pulls, removes any old container, then runs a new container with correct args in the foreground and toss that in a simple systemd unit
For this type of use case AWS has managed services like Batch, ECS, or even auto scaling groups that can make this easier depending on what you're trying to achieve.
ECS with Fargate executors is fairly easy to run arbitrary things inside a VPC
Seems they still kinda discourage it but do have examples at least.
https://docs.ansible.com/ansible/latest/dev_guide/developing...
The problem is that to track deletions you either have to constantly have a view of global state (i.e. do you want to put `linux-kernel` in your package list?) or you need to store specific state about that machine (i.e. `redis` was installed by playbook redis-server.yml, task "install redis") - because the package's absence from that list doesn't necessarily mean "uninstall it" if something else in another playbook or task will later declare it should be present.
As soon as you're trying to do deletions, you're making assumptions that the view of the state you have is complete and total and that is usually not the case - and even if it is within the scope of your system, is it the case on the system you're interacting with? Do you know every package that should be installed because it comes out of the box in the distro? Do you want to (aka: do you have the time, resourcing and effort to do this for the almost zero gain it will get you in the short term unless you can point to business outcomes which are fulfilled by the activity?)
Pyinfra doesn't even need that. Just needs a shell.
Subjective opinion, but it is a heavily under-recognized piece of software.
Ansible is really great, but you soon end up writing Python in YAML strings.
So why not straight up Python?
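To illustrate the point, here's a hedged sketch (plain Python, not pyinfra's actual API): ordinary loops and conditionals replace Jinja `{% for %}` blocks embedded in YAML strings.

```python
# Build a task list with standard Python control flow. The Ansible
# equivalent would be `loop:` plus Jinja conditionals inside YAML.
sites = ["blog", "shop", "api"]
use_tls = True

tasks = []
for site in sites:
    tasks.append({"op": "template", "dest": f"/etc/nginx/sites/{site}.conf"})
    if use_tls:
        tasks.append({"op": "cert", "domain": f"{site}.example.com"})

print(len(tasks))  # 6
```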
I've been using my own homegrown project that does just this - Python roles, server/client, Mako templates: https://github.com/mattbillenstein/salty
It's very very fast to do deploys on long-lived infrastructure, but it hasn't been optimized for large clusters yet; I expect the server process will be a bottleneck with many clients, but still probably faster than Ansible for most setups.
I'm not familiar with Terraform CDK, but I don't see what Chef does/has that this doesn't?
> This is no where near the level of readiness needed to be reliably used in a production environment.
Why?
I might have more YOE than you do, for example, and might have worked at bigger companies/infras than you have - what does it matter to the opinion at hand?
If a SQL compiler or `terraform plan` command can convert "the current state of the system" + "desired end-state" to a series of steps that constitute "how to get there from here", then I can usually just move forward to declaring more desired states after that, or debugging something else, etc. Let the computer do the routine calculations.
When using a path-finding / route-finding tool, having the map and some basic pathfinding algorithms already programmed in means we no longer need to "pop a candidate route-segment off the list of candidates and evaluate the new route cost"... I simply observe that I am "probably here" and I wish to get to "there"; propose a route and if it's good enough I'll instruct the machine to do that.
If I can declare that I want the final system to contain only the folder "/stuff/config.yaml" with permissions 700 -- I don't care what the contents of stuff were previously, and if it had a million temp files in it from an install going sideways or the wrong permissions or a thousand nested folders in it, well, it would be great if the silly computer had a branching workflow that detected and fixed that for me, rather than me having to write yet another one-off script to clean up yet another silly mis-configured system that Bob left as a dumping ground that I have to write yet more brittle bizarre-situation-handling code for.
Same for SQL and data. "Look, Mr. Database, I don't actually know what's in the table today, and I don't know why the previous user dumped a million unrelated rows in the table.... Can you answer my query about if my package has shipped, or not?"
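The "let the computer do the routine calculations" idea can be sketched in a few lines of Python - a toy planner over a flat current/desired state mapping, not any real tool's algorithm:

```python
def plan(current: dict, desired: dict) -> list:
    """Diff two flat state mappings into an ordered list of steps."""
    steps = []
    # Anything present now but absent from the desired state gets removed.
    for key in current.keys() - desired.keys():
        steps.append(("delete", key))
    # Anything missing or mismatched gets created or updated.
    for key, value in desired.items():
        if key not in current:
            steps.append(("create", key, value))
        elif current[key] != value:
            steps.append(("update", key, value))
    return steps

# Bob's dumping ground: a stray temp file and wrong permissions.
current = {"/stuff/tmp1": "0644", "/stuff/config.yaml": "0644"}
desired = {"/stuff/config.yaml": "0700"}
print(plan(current, desired))
# [('delete', '/stuff/tmp1'), ('update', '/stuff/config.yaml', '0700')]
```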
> pyinfra now executes operations at runtime, rather than pre-generating commands. Although the change isn't noticeable this fixes an entire class of bugs and confusion. See the limitations section in the v2 docs. All of those issues are now a thing of the past.
https://github.com/pyinfra-dev/pyinfra/blob/3.x/CHANGELOG.md
Terraform does this, which is why it tracks its own representation of the prior global state. So when you remove a declared resource, the diff against the prior state is interpreted as a delete. Note this does introduce the problem of "drift" when you have resources that are not captured in the scope of the state.
> i.e. do you want to put `linux-kernel` in your package list?
Yes. At least I want to put something like "core-packages" or "default" or similar as part of setting my explicit intent.
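Terraform's approach can be sketched as: remember what the tool itself declared last run, and treat anything that disappears from the declaration as a delete. This is a toy model, not Terraform's actual state format:

```python
# Toy model of a state file: what the tool created on the last run.
prior_state = {"redis", "nginx", "postgres"}

# What the current declaration asks for.
declared = {"redis", "nginx"}

# Previously managed but no longer declared => delete. Without
# prior_state, mere absence from `declared` would be ambiguous.
to_delete = prior_state - declared
print(to_delete)  # {'postgres'}

# "Drift": something that exists in reality but was never captured in
# the state, so the tool can neither see nor delete it.
reality = {"redis", "nginx", "postgres", "manually-installed-thing"}
drift = reality - prior_state
print(drift)  # {'manually-installed-thing'}
```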
With Ansible you could deploy a small playbook that did just one thing. A lot easier to get started with and keep under control.
As others have mentioned, Puppet was also a lot less useful once server images could be pre-configured and were often short-lived. It was more designed to take a bare OS install and turn it into a long-lived server.
Pulumi for Terraform and Dagger for Docker are two examples I use
I like CUE as a language to replace my YAML: it has some of the typical language constructs but maintains the declarative approach.
The way I see roles vs playbooks is whether I’m going to reuse it or not.
Roles are more generic playbooks, in the sense that I can share them with others or across deployments (for example, setting up a reverse proxy, or installing a piece of software with sane, overridable defaults).
I can then use roles within playbooks to tweak the piece of software's configuration. If it's a one-off config/setup then I'll use a playbook.
I don’t know if it’s the right paradigm (I don’t think it’s explained well and clearly in the docs), but using this rule of thumb has helped me deal with it.
Of course, any role can be a playbook and vice versa since they do the same thing functionally, it’s all about reusability and sharing.
Kinda how you have libraries in software: role = library, playbook = the software you actually want to write.
> - Python is not easy to build into portable binaries
> - The package ecosystem is very hard to use in a reproducible way
People have been using OS packages for four decades.
> - The language is not truly typed
The language IS strongly typed.
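To be precise, Python is strongly but dynamically typed: values carry types and are never silently coerced across unrelated types, while names can be rebound. A quick demonstration:

```python
# Strong typing: mixing unrelated types raises rather than coercing.
try:
    "1" + 1
except TypeError as e:
    print("refused:", e)

# Contrast with e.g. JavaScript, where '1' + 1 quietly yields '11'.
# Dynamic, though: a name can be rebound to a value of another type.
x = 1
x = "one"  # legal; the *value* is typed, the name is not
print(type(x).__name__)  # str
```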
> - types add massive value for infrastructure and scripts because they are less likely to be unit-tested
99% of errors in deployment are not solved by typing.
> - The lack of a "let" or "var" keyword makes simple programming errors more likely (again, this code is less likely to be unit-tested)
If your logic is so complex that let/var makes a difference, you should not be touching infra.
The only place where I have accepted that DRY is not worth it is unit tests. I used to extract any common behavior into a shared test, but each object will eventually evolve in its own way, to the point that the effort to keep it DRY is wasted.
And organizing the modules was straight forward as we already knew/did that in the project.
Perhaps it comes from my programming background, but it's true.
It will have to be executed on many different developer machines (or even your own machine several years in the future) so a simple, reproducible build process, including fetching pip dependencies, is critical.
This is not meant to scale to more than a handful of machines, but you get the idea how nice straight Ruby is for a machine specification DSL.
Given that Ansible has neither, it can't be much better than what it is. I disagree that that is the right choice, though. As it is, I see not much more value in Ansible than in some sort of SSH-over-xargs contraption combined with a list of servers. The guarantees they give are the same.
> Do you know every package that should be installed because it comes out of the box in the distro? Do you want to [...]?
No, I don't want to. Thankfully, with NixOS I don't need to, since the pre-installed packages are automatically part of the declared state of my NixOS systems (i.e. I declare the wanted state in the same way in which the defaults are also declared, which makes it easy to merge both).
Can this or any other tool do that?
The issue is, Ansible was written for sysadmins who aren't programmers. There is no good explanation, other than it's a historically grown, syntactic and semantic mess that should've been barebones python from the get go.
It is not idempotent. For example, how can I revert a task/play when it fails, so that my infra doesn't end up in an unknown state? How do I deal with inevitable side effects that come from setting up infra?
People will now refer you to Terraform but that is imo a cop out from tool developers that would much rather sell you expensive solutions to get error handling in Ansible (namely RedHat's Ansible Automation platform) than have it be part of a language.
But to give you a proper explanation: Plays define arrays of tasks, tasks being calls to small python code snippets called modules (such as ansible.builtin.file or ansible.builtin.copy). To facilitate reuse, common "flows" (beware, flow is not part of their terminology) of tasks are encapsulated into reusable roles, although reusability depends on the skill of the author of the role.
It will probably work for some simple and common cases, but they barely need any automation anyways...
The problem isn't even the tool itself, it's the lack of standards. Every large enough system is too unique to be easily managed by cookie-cutter tools like this one. Some people will bite the bullet anyways, and try to adapt general-purpose infra tools to their case. I've seen that too. This is a very miserable experience. Frustrating in that very obviously simple and necessary things are sometimes described as "impossible" due to how the chosen framework works. To contrast that, the home-brewed systems usually suffer from the lack of generality, worse user experience in general, quickly start lagging behind the underlying technology updates...
Also, out of popular languages, Python would be somewhere towards the bottom of the hierarchy if I had to choose a language to manage infrastructure. The only redeeming quality of Python is its popularity. On engineering merits alone it's unremarkable at best.
----
PS.
import click
If I see this in the project source code, I blacklist it and never look at it again. This is a red flag, a sure sign that the person writing it is clueless.

There will likely be a security fix in it or a dep at a later point, so you wouldn't want to use the exact same version anyway.
You can't be declarative all the way down because reality is not declarative.
You can have all modules being declarative but if you need orchestration, it's not declarative anymore unless you create a new abstraction on top of it.
So people keep arguing about declarative vs imperative and fail to specify at which abstraction level they want things to be either.
Since apparently (I try to avoid Ansible, so I might be missing something) playbooks are the go-to approach to using Ansible, this means that most uses of Ansible are imperative (in the context of configuring a system), unless you only ever give a system a singular role, and then you are probably defining your role in imperative steps.
A system like NixOS on the other hand presents the entirety of a system configuration in a single declarative interface that is applied in one go, while applying such a configuration to a system can be thought of as an imperative step (although it is usually a singular, unconditional step). So it is declarative at a higher abstraction level.
The way TF and Pulumi traditionally think about this problem would be to use cloud-init/ignition/Cloudformation Hooks to cause the machine to execute scripts upon itself. Ansible also has an approach to this via "ansible-pull", which one would use in a circumstance where the machine has no sshd nor SSM agent on it but you still want some complex configuration management applied post-boot (or, actually, even if they do have sshd/ssm but there are literally a hundred of them, since the machines doing the same operation to themselves is going to be much less error-prone than trying to connect to each one of them and executing the same operations, regardless of the concurrency of any such CM tool).
Ansible plugins can be written in any language, shell, compiled binaries, whatever, and communicate with the control plane via stdin/stdout
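The contract the comment describes can be sketched as: read JSON arguments in, print a JSON result out. (The exact way Ansible hands arguments to non-Python modules varies by module type, so treat this as an illustration of the shape, not Ansible's precise protocol.)

```python
import io
import json

def run_module(stdin: io.TextIOBase) -> dict:
    """Toy module: read JSON args, 'do' the work, return a JSON-able result."""
    args = json.load(stdin)
    return {"changed": True, "msg": f"ensured {args['path']} exists"}

# Simulate the control plane handing args over a pipe.
fake_stdin = io.StringIO(json.dumps({"path": "/etc/motd"}))
result = run_module(fake_stdin)
print(json.dumps(result))
```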
I suspect you are thinking of Jinja2 when you are writing python in yaml strings, which ... kind of, I guess, but also confusingly not Python, or at least the hacked up copy of Jinja2 that ansible uses can't do all the fun things normal Jinja2 can
In all seriousness, I would guess this requirement has a hidden 80/20 in it because it is very unlikely that one wishes every machine to be a perfect copy of each other, unless the config files have been very, very disciplined about the hard-coded strings and assumptions made
So even in my glib "create-image" response, even then there's almost certainly going to be some cloud-init that subsequently stamps the booted instance with its actual identity
Contrast that with https://docs.pyinfra.com/en/next/examples/client_side_assets... where any sane setup will show completions after both the `from` and the `local.` typing
I'm fairly sure that the way to make Puppet do what you suggest is the same in Puppet and Ansible. The difference is that Puppet is smart enough to not actually remove your file during every run (I think). On the other hand, Ansible will normally not be configured to run every 30 minutes like Puppet, so it's much less of an issue.
Both tools are great but they work somewhat differently. How you think about using them is much the same, though: you need to tell the computer what to do. In Puppet this is often talked about as if you describe the state, and I suppose that's partly true, but in the end you have a series of actions the computer will need to take to achieve this state.
> Python is not easy to build into portable binaries
https://pex.readthedocs.io/en/v2.1.40/buildingpex.html
> - The package ecosystem is very hard to use in a reproducible way
pip, virtualenv, and requirements.in/txt is extremely reproducible. I will offer that it's not exactly idiot-proof yet and there are tons of stale tutorials out there
> The language is not truly typed - types add massive value for infrastructure and scripts because they are less likely to be unit-tested
Yes it is, if you want it to be. There's nothing stopping someone from using mypy, pyright, or another type tool on the strictest setting, and not passing builds unless you have 100% type coverage.
> The lack of a "let" or "var" keyword makes simple programming errors more likely (again, this code is less likely to be unit-tested)
No, but you get ~95% of the safety guarantees by using immutable-esque objects like @dataclass(frozen=True), pydantic models with the same, or attrs/cattrs with similar setting.
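The immutable-esque objects mentioned above look like this - a minimal sketch using only the stdlib:

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Server:
    host: str
    port: int = 22

s = Server("web1.example.com")
try:
    s.port = 2222  # accidental rebinding is caught at runtime
except FrozenInstanceError:
    print("refused: Server is frozen")

print(s)  # Server(host='web1.example.com', port=22)
```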
Terraform cannot deploy such a configuration in a single config, since its planning stage requires that all containers already exist. Terraform crashes when planning the user and role changes, saying that the database doesn't exist. This is a large pain-point when using Terraform. How does Pyinfra handle such deployments?
I agree that completion would be nice to have, and probably relatively hard to implement for koch.
However, I prefer the cleanliness, dare I say beauty, of the config file and Ruby.
To your point though, the `_make_command` method here is not setting hints in its arguments. I'm not super familiar if this is considered "fine" in a pydantic world, as I found for my usage, native python type hints were more than fine to make my code more usable and safer. Based on the code though, it seems like there are cases where the `command_attribute` is not a callable. What I don't understand is why this isn't hinted as a Union of Callable and whatever other types it could receive. I'd have to spend more than 3 minutes looking at the code base to understand how it's used to get a stronger idea here.
I use ansible for creating machine images or initial provisioning. (I don't run the ansible; someone racks the host, sets its build state to install, and boots the host, and it joins the appropriate cluster and people do container things.) I don't necessarily know when my ansible runs against a host.
I also have a pretty good stack of ansible playbooks that I use manually day to day for hardware validation for new server models and one off type stuff. But again, I never really know what I'm running against or have pet servers.
A good chunk of hardware validation runs automatically if the boot target is set to hw-validate, but the whole point is that you are gonna find stuff that doesn't work with your standard process and either pass on it or adjust.
I do run tf to provision cloud infra so it's transparent to the devs, and, honestly, I'm not sure how ansible is dated and tf is not; they are pretty much the same thing in a different coat.
And honestly, generating thousands of lines of conflicting generic yaml isn't really much of an improvement over writing it once and running it automatically on 1000s of boxes.