Goog is even heavily subsidising this. Anthropic is likely doing it with their top tiers as well. Even the small ones at $20/month most likely did in the beginning.
Specifically, I'm really proud of "spec driven development", which is based on the internal processes that software development teams at Amazon use to build very large technical projects. Kiro can take your basic "vibe coding" prompt and expand it into deep technical requirements, a design document (with diagrams), and a task list that breaks large projects down into smaller, more realistic chunks of work.
I've had a ton of fun not just working on Kiro, but also coding with Kiro. I've also published a sample project I built while working on Kiro. It's a fairly extensive codebase for an infinite crafting game, almost 95% AI coded, thanks to the power of Kiro: https://github.com/kirodotdev/spirit-of-kiro
Unless it's literally a Cursor clone, I'd request to change it to describe the product category.
Cursor by no means defines the whole category. Not even close.
In a space that moves as quickly as "AI" does, it is inevitable that a better and cheaper solution will pop up at some point. We kinda already see it with Cursor and Windsurf. I guess Claude Code is all the rage now and I personally think CLI/TUI is the way to go for anyone that has a similar view.
That said, I'm sure there's a very big user base (probably bigger than terminal group) that will enjoy using this and other GUI apps.
>overage charges for agentic interactions will be $0.04 per interaction, and if enabled, will begin consuming overages once your included amounts are used (1,000 interactions for Pro tier, 3,000 for Pro+ tier). Limits are applied at the user level. For example, if you are a Pro tier customer who uses 1,200 requests, your bill would show an overage charge of $8 (200 × $0.04). Overages for agentic interactions must be enabled prior to use.
What is defined as an interaction?
EDIT: RTFM
>Whenever you ask Kiro something, it consumes an agentic interaction. This includes chat, a single spec execution, and/or every time an agent hook executes. However, the work Kiro does to complete your request—such as calling other tools, or taking multiple attempts—does not count towards your interactions.
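For concreteness, the overage arithmetic from the pricing quote works out like this (just an illustration of the stated formula, not Kiro's actual billing code):

```python
def overage_charge(interactions_used, included=1000, rate=0.04):
    """Only interactions beyond the included amount are billed,
    at $0.04 per agentic interaction (Pro tier includes 1,000)."""
    over = max(0, interactions_used - included)
    return over * rate

print(overage_charge(1200))                  # Pro tier example from the docs: $8.00
print(overage_charge(3200, included=3000))   # Pro+ tier: also $8.00
```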
You're basically advocating for GNU Emacs: https://github.com/karthink/gptel
How does Kiro deal with changes to the requirements? Are all the specs updated?
Is Claude Code good for the "ask" flow? No, right?
The old flow before agent mode got added. Select some code, ask questions about it or give an instruction on editing it and then choose to accept the change.
As I understand (I could be wrong), with agent mode, it edits the file for you, no way for you to accept before it does, so you have to manually check the diff, roll back parts you don't want, etc.
Am I right?
Realistically, I don't think that Harper's statement of "I get to play cookie clicker" is achievable, at least not for nontrivial tasks. Current LLMs still need a skilled human SDE in the loop. But Kiro does help that loop run a lot smoother, and on much larger tasks than a traditional AI agent can tackle.
AI seems to be a way to engage happy users to try new things. Kiro joins a growing list of projects:
- Kiro (AWS)
- VSCode + Copilot (Microsoft)
- Windsurf (OpenAI tried to get it)
- Cursor
- Trae (Alibaba)
- Zed
- etc.
I put Zed in a separate category in the past. Now with assistants / agents, it's playing in the same space.
The market is a bit saturated and tools like Claude Code gave some flexibility and an alternative for users. I tried Cursor in the past, and now I'm back to Helix / VSCode + Claude Code.
Overall I do believe this has accelerated our development and I'm interested to see where it goes. I don't think it's a direct comparison to Claude Code or Cursor; it's a different approach with some overlap.
As a customer I have no incentive to try it.
I think that reputation is 100% Amazon’s fault. When all you do is ship half-baked rushed products your customers will assume your next big thing sucks because that’s the reputation you built for yourself.
And yes, Kiro is agentic, so it can (and often does) execute a long-running, multi-turn workflow in response to your interactions. However, the billing model is based on the manual interaction that kicks off the workflow (via chat, spec, or hook), even if that workflow takes many turns for Kiro to complete.
Companies would benefit a lot by creating better onboarding flows that migrate users from other applications. It should either bring in the rules 1:1 or have an llm agent transform them into a format that works better for the agent.
In reality these tools would be best if they took a more Socratic, interactive pair-programming approach. So instead of giving you a blanket diff to accept or refuse, or "No, and here's some changes", it should be more dialog oriented.
Of all of them so far, though, I think Claude Code is closest to this. If you prompt it right you can have a much more interactive workflow, and I find that most productive.
You’re sort of technically correct but I wouldn’t really describe it this way exactly. You have to proactively accept or reject all changes to your files in some way.
It is almost impossible to accidentally commit code you don’t want.
It’s not really an edit in the same sense as an unstaged change. It doesn’t even really do that until you accept the result.
It’s basically saving you a UI step compared to ask mode with basically no downside.
But at the same time, it's my biggest worry that they will continue on the AI and pollute the project with too much junk. I gotta trust the smart minds behind it will find a way to balance this trend.
I guess you lose tab-completion suggestions, but I am not a fan of those compared to 'normal' tab-complete (if backed by a language server). If I want AI, I'll write a short comment and invoke the tool explicitly.
EDIT: Of course, it really depends on your use case. I maintain/upgrade C code libs and utils; I really cannot speak to what works best for your env! Webdev is truly a different world.
EDIT2: Can't leave this alone for some reason, but the backend thing is a big deal. Switching between Claude/Gemini/DeepSeek and even rando models like Qwen or Kimi is awesome; they can fill in each other's holes or unblock a model which is 'stuck'.
How is one fork different from cursor or kiro or something else?
Aren't these all, I assume, just skinned Chromium, or something more?
I find the best way to use specs is to progressively commit them into the repo as an append only "history" showing the gradual change of the project over time. You can use Kiro to modify an existing spec and update it to match the new intended state of the project, but this somehow feels a bit less valuable compared to having a historical record of all the design choices that led from where you started to where you now are.
I think in the long run Kiro will be able to serve both types of use: keeping a single authoritative library of specs for each feature, and keeping a historical record of mutations over time.
There was a paper recently where they had an LLM evolve tool harnesses and got ~20% more than w/ aider on the benchmark they used, so it's pretty clear that the models + tools (+better harness) are better than just aider.
Also these steering rules are just markdown files, so you can just drop your other rules files from other tools into the `.kiro/steering` directory, and they work as is.
Just this morning, Cursor was giving me a ton of incorrect tab completions. When I use prompts, it tends to break more than it fixes. It's still a lot faster to write by hand. Lots of libraries that take *arguments in Python also cannot be grokked by AI.
Lately, I've started putting it in the same category, due to all the AI excitement. People try it because of the AI capabilities, looking for an IDE that works for them.
I hope Zed continues providing new amazing features in all the areas.
It's also interesting that the pricing is in terms of "interactions" rather than tokens. I don't believe I've seen that before.
> evolve tool harnesses
Claude code & Gemini cli etc. don't do this either
I doubt these tools will ever convince every last person on every single use case, so the existence of those people isn't exactly an indictment.
It’s also not free, nor unlimited-but-throttled like Cursor and Claude Code on the Max plan.
For example, you can use Kiro without having any AWS account at all. Kiro has social login through Google and GitHub. Basically, Kiro is backed by AWS, but it is its own standalone product, and we hope to see it grow and appeal to a broader audience than just AWS customers.
I think it is entirely possible to build a fantastic CLI tool for coding, and the CLI tools for coding already work well enough, but there is just more context info available inside of an IDE, therefore the ceiling is higher when working with an agent that runs inside of the IDE. Context is king for LLM results, and IDEs just have more context.
Over time I'm sure we'll see tools like Claude Code support everything that an IDE can do, but for now if you want to reach the same ceiling you still have to glue together a very custom setup with MCP tool use, and that has the downside of introducing additional tool use latency, compared to an IDE that is able to source context directly from the IDE's internal API, and provide that to the LLM nearly instantly.
As for 1), I agree, but you force the model to work within aider's constraints. Claude 4, for example, excels at the agentic flow and is better at that than at providing the diffs that aider expects.
As for the last sentence, I disagree. They are evolving the stack, and more importantly they are evolving both at the same time, stack + LLM. That's the main reason they all subsidise use atm, they are gathering data to improve both. If I were to place a bet right now, I'd say that provider_tool + provider_LLM > 3rd party tool + same model in the short, medium and long term.
In my experience using Kiro you are still going to be hands on with the code. Personally I choose to turn AI powered autocomplete off because when I do touch the code manually it's usually for the purposes of working on something tricky that AI would likely not get right in autocomplete either.
However, the Kiro autocomplete is average in capability in my experience, and you can absolutely use it to write code by hand as well.
When you make a tool that is "model agnostic" you also make a tool that is unable to play to the individual strengths of each model, or you set yourself up for a massive, multiplicative effort of trying to tune your tool to every single popular model out there, even though some of the models are drastically less capable than others.
It’s not just the IDE but the ML model you are selling yourself to. I see my colleagues atrophy before me. I see their tools melt in their hands. I am rapidly becoming the only person functionally capable of reason on my own. It’s very very weird.
When the model money dries up what’s going to happen?
Redirect back to localhost:3128 is normal, that's where Kiro is watching for a callback, but the missing state is not normal. Something may have stripped the info out of the callback before it occurred, which is why I suspect an extension in your browser.
Will keep an eye on this though!
And Google killed it.
Going to take a while before I trust any AWS AI related tooling won't just be abandoned / mis-managed after my prior experience.
I too am old enough to have seen a lot of unnecessary tech change cycles, and one thing I've noticed about this industry is no matter how foolish a trend was, we almost never unwind it.
In the short term though, I think CLI-based tools like Claude Code are taking off because hardcore developers see them as the last "vestige" they have in separating themselves from the "noobs." They know there's still a good portion of the public who don't know how to use the terminal, install packages, or even know what Linux is.
Edit: I know there’s manual VSIX route.
That said, thanks for being willing to demo what kinds of things it can do!
What else does Kiro do differently?
Edit: The hooks feature looks nifty. How is the memory management handled? Any codebase indexing etc? Support to add external MCP servers like context7 etc?
I've published a sample project that is medium sized, about 20k lines encompassing a game client, game server, and background service: https://github.com/kirodotdev/spirit-of-kiro This has been all developed by Kiro. The way Kiro is able to work in these larger projects is thanks to steering files like these:
- Structure, helps Kiro navigate the large project: https://github.com/kirodotdev/spirit-of-kiro/blob/main/.kiro...
- Tech, helps Kiro stay consistent with the tech it uses in a large project: https://github.com/kirodotdev/spirit-of-kiro/blob/main/.kiro...
And yes, the specs do help a lot. They help Kiro spend more time gathering context before getting to work, which helps the new features integrate into the existing codebase better, with less duplication, and more accuracy.
Basically, the AI fast forwards through the easy stuff and I just spend all day jumping directly from hard problem to hard problem.
1) It's normal for Kiro (and almost every AI editor) to use a lot more CPU when you first start it up, because it is indexing your codebase in the background, for faster and more accurate results when you prompt. That indexing should complete at some point
2) On initial setup of Kiro it will import and install your plugins from VS Code. If you have a large number of plugins this continues in the background, and can be quite CPU heavy as it extracts and runs the installs for each plugin. This is a one time performance hit though.
3) If your computer is truly idle, most modern CPU's get throttled back to save power. When the CPU is throttled, even a tiny amount of CPU utilization can show up as a large percentage of the CPU, but that's just because the CPU has been throttled back to a very slow clock speed.
In my setup (minimal plugins, medium sized codebase, computer set to never idle the processor clock) I rarely see the Kiro helper go above 0.4% CPU utilization, so if you are seeing high CPU it is likely for one of the above reasons.
Why are they shipping them with different key bindings? Seems like the opposite of what you do to encourage product adoption.
An agent running in the IDE can make use of all this context to provide better results. So, for example, you will see Kiro automatically notice and attempt to resolve problems from the "Problems" tab in the IDE. Kiro will look at what files you have open and attempt to use that info to jump to the right context faster.
The way I describe it is that the ceiling for an IDE agent is a lot higher than a CLI agent, just because the IDE agent has more context info to work with. CLI agents are great too, but I think the IDE can go a lot further because it has more tools available, and more info about what you are doing, where you are working, etc
The Q Developer CLI, Q Developer IDE plugins, and now Kiro are pretty much just wrappers around Claude Sonnet 3.7/4, and work just as well as them.
This recent OpenAI presentation might resonate too then:
Prompt Engineering is dead (everything is a spec)
In an era where AI transforms software development, the most valuable skill isn't writing code - it's communicating intent with precision. This talk reveals how specifications, not prompts or code, are becoming the fundamental unit of programming, and why spec-writing is the new superpower.
Drawing from production experience, we demonstrate how rigorous, versioned specifications serve as the source of truth that compiles to documentation, evaluations, model behaviors, and maybe even code.
Just as the US Constitution acts as a versioned spec with judicial review as its grader, AI systems need executable specifications that align both human teams and machine intelligence. We'll look at OpenAI's Model Spec as a real-world example.
For people wanting to get up and running with vanilla Emacs (instead of a distribution) so that they can try out gptel sometime this week, I recommend emacs-bedrock: https://codeberg.org/ashton314/emacs-bedrock
And for a gptel backend Gemini is the fastest route (excluding something local) from generating an API key to using a LLM in Emacs (for free).
Bonus points because Emacs is useful for things other than coding: you can use gptel on your notes, or any buffer really, to ask/talk about stuff.
There is an AWS IAM Identity Center option for login as well: https://kiro.dev/docs/reference/auth-methods/#aws-iam-identi...
We really need to add some more step by step docs for setting this up, but it's very similar to the Amazon Q Developer integration with AWS IAM Identity Center if you are familiar with that: https://docs.aws.amazon.com/amazonq/latest/qdeveloper-ug/sec...
Then I add it as a git submodule to my projects and tell whatever agents to look at @llm-shared/ and update its own rule file(s) accordingly
> Am I right?
With cursor you get reasonably flexible control at many levels. You can have it only suggest changes that you have to apply manually or you can have it make automatic changes with various ways to review, change, reject or accept. I usually have the changes made automatically but don’t accept the changes automatically. Cursor has a UI that lets you review each edit individually, for the whole file or all files. Depending on the situation I will use whichever level is appropriate. The UI also allows you to revert changes or you can ask the AI to undo or rework a change that you just approved so there’s plenty of ways to do large changes without giving up control. There’s also a stop button you can use to interrupt mid-stream if the work it’s doing isn’t what you want. It isn’t flawless but I haven’t found myself in a corner where I couldn’t get back to a happy path.
Is there any way to control this? I have my files.watcherExclude setting, does it respect that?
As for a hypothetical new "context setup" protocol like you posit, I suspect it'd benefit from the "cognitive tools" ideas in this awesome paper / project: <https://github.com/davidkimai/Context-Engineering>
^ inspiring stuff
I always keep the readme and some basic architecture docs (using markdown/mermaid) updated as I go, and I often just work on those rather than on code with Claude, because I find the value it offers is less in code generation and more in helping me document the rubber ducking process into useful schematics and architecture.
What can Kiro offer that's meaningfully better than what I'm already doing? I can take my system anywhere Claude Code and my repos can go, using whatever editor I like. Does Kiro have some special sauce for making this approach work better? Maybe some DSL it uses for more succinct and actionable diagrams and plans?
As much as I like the idea, I find it so hard to abandon a process I've been working on for months, using tools I'm already productive with.
Also, will pricing essentially be bedrock pricing, or will there be a value-add margin tacked on?
However with a large project, it seems that it indexed, then dropped CPU, then I started opening up files and working with them, then the CPU spiked again.
It actually has a pretty decent free tier, and maybe the subscription is better value than Claude Code, but hard to tell.
In my opinion, CLIs have a higher ceiling, and they are also easy to integrate into CI/CD, run in parallel, etc.
But in the meantime I'm also the author of the "Learn by Playing" guide in the Kiro docs. It goes step by step through using Kiro on this codebase, in the `challenge` branch. You can see how Kiro performs on a series of tasks starting with light things like basic vibe coding to update an HTML page, then slightly deeper things like fixing some bugs that I deliberately left in the code, then even deeper to a full fledged project to add email verification and password reset across client, server, and infrastructure as code. There is also an intro to using hooks, MCP, and steering files to completely customize the behavior of Kiro.
Guide link here: https://kiro.dev/docs/guides/learn-by-playing/
Also, I don't mean to be rude to Cursor, but the fact that they are still, to this day, literally just a VS Code wrapper makes me a little crazy when I think about how high the value of an AI editor could be.
I think it was really the lack of competition. Cursor (IMO) always felt like the biggest player; I think there was continue.dev before that, but that's all I know of before Cursor.
After Cursor became a hit, there are a lot more options now, like Void editor, etc.
Also, if you find VS Code slow, try Zed. But as my brother said to me when I was shilling Zed, VS Code is just for manipulating text and using LSP; he personally didn't feel there was any meaningful slowness to VS Code, even though he had tried Zed. Zed has AI stuff too, IIRC.
Now sure, they could've created a CLI, but there are already a lot of really decent CLIs like SST/opencode and even Gemini CLI, though I have heard good things about Claude Code too.
Honestly, I just think that any effort in this space is cool. I just like it when there are a lot of options, so things stay a little competitive, I guess.
It has a pretty decent free tier, and maybe the subscription is better value than Claude Code, but hard to tell.
It supports MCP as well.
Amazon Q also has VS Code and IntelliJ IDEA plugins, but Kiro goes beyond what you can do as a plugin in VS Code, which is similar to why Cursor had to fork VS Code.
I read (I think in one of the comments) that there is a model picker that currently lets you switch between Claude Sonnet 4 and Claude Sonnet 3.7.
So is this just using Claude?
I really thought that the advantage of using Kiro might be the leverage that Amazon's GPU infrastructure could provide, maybe even some discounts to lure people to Kiro.
I am pretty sure that a lot of people will ask you the same question, but I would really appreciate it if you could answer this one in preferably simple terms: "Why Kiro? Why not all the other stuff that has come before it, and the stuff that will come after it?"
Also, I am having some déjà vu while writing this comment: has the title of this post changed? I swear I saw something with Amazon in the header, and now I don't see it. Honestly, I am being really off-topic, but after seeing this name change I wish there were a website that tracked all the title changes of HN posts, because I was completely baffled by it. Or maybe I am being totally paranoid.
Not as polished as Claude Code, but there's also a bit of a price difference.
Claude 4 can do it all already.
The advantage of something more purpose built for gathering context from the IDE is that you can skip a lot of roundtrips. Knowing the user's intent upfront, the IDE can gather all the necessary context data preemptively, filter it down to a token efficient representation of just the relevant stuff, add it in the context preemptively along with the user's prompt, and there is a single trip to the LLM before the LLM gets to work.
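A toy sketch of that single-shot context packing (the function and fields are hypothetical, not Kiro's actual API): everything the editor already knows gets bundled into one prompt, so the model needs no extra discovery round trips.

```python
# Hypothetical IDE-side context assembly: pack what the editor already
# knows (open files, problems tab, cursor position) into a single prompt.
def build_prompt(user_prompt, open_files, problems, cursor):
    context = [f"Cursor at: {cursor}"]
    context += [f"Open file: {f}" for f in open_files]
    context += [f"Problem: {p}" for p in problems]
    return "\n".join(context) + "\n\nUser request: " + user_prompt

prompt = build_prompt(
    "fix the failing import",
    open_files=["server/game.ts"],
    problems=["server/game.ts:3 cannot find module 'ws'"],
    cursor="server/game.ts:3",
)
```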
But yeah I agree with your point about CLI capabilities for running in parallel, integrating in other places. There is totally room for both, I just think that when it comes to authoring code in the flow, the IDE approach feels a bit smoother to me.
- Created by an AWS team, but the AWS logo is barely visible at the bottom.
- Actually cute logo and branding.
- Focuses on the lead devs front and center (which HN loves). Makes it seem less like a corporation and more like 2 devs working on their project / or an actual startup.
- The comment tone of "hey, I've been working on this for a year" also makes it seem as if there weren't 10 six-pagers written to make it happen (maybe there weren't?).
- flashy landing page
Props to the team. Wish there were more projects like this to branch out of AWS. E.g. Lightsail should've been launched like this.
In some sense, we are starting with a very high-level idea and gradually refining it to lower and lower levels of detail. It is structured, hierarchical thinking. Right now we are at 3 levels: requirement -> spec -> code. Exposing each of these layers as structured text documents (mostly Markdown right now, it seems) is powerful, since each level can be independently reviewed. You can review the spec before the code is written, then review the code before it gets checked in.
My intuition is that this pattern will be highly effective for coding. And if we prove that out at scale, we should start asking: how does this pattern translate to other activities? How will this affect law, medicine, insurance, etc. Software is the tip of the iceberg and if this works then there are many possible avenues to expand this approach, and many potential startups to serve a growing market.
The key will be managing all of the documents, the levels of abstraction and the review processes. This is a totally tractable problem.
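As a toy illustration, the three layers for a single feature might look like this (file names and contents are hypothetical):

```
requirements.md  "Users can reset a forgotten password via email."
design.md        "POST /password-reset issues a one-time token (15 min TTL);
                  the email contains a signed link (see sequence diagram)."
tasks.md         "1. token table  2. endpoint  3. email template  4. tests"
```

Each document can be reviewed and signed off on before the next, lower level is generated.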
What people do to avoid what you discussed is multi-agents. The main agent can build up context, plan, then delegate execution to other agents, etc.
In my opinion, the benefit of the IDE is really just in the possibility of an improved UI/UX over a TUI.
I'd like to think so, but you'd have to compare the results to what you are currently doing to see how you feel about it. I personally love the format that it uses to define requirements, and the details of the software design docs that it writes (including mermaid diagrams)
> will pricing essentially be bedrock pricing, or will there be a value-add margin tacked on?
The pricing is a flat rate, with a cap on number of interactions per month. Each human driven "push" for Kiro to do something is an interaction toward your limit, but Kiro may work autonomously for many turns based on an interaction, and will produce significant amounts of code from a single interaction.
More details here: https://kiro.dev/pricing/
This, along with the "CHALLENGE.md" and "ROADMAP.md" document, is an incredibly cool way to show off your project and to give people a playground to use to try it out. The game idea itself is pretty interesting too.
It would be awesome if I ... didn't have to deal with AWS to use it. I guess maybe that might be a good use case for agentic coding: "Hey, Kiro - can you make this thing just use a local database and my Anthropic API key?"
Complaining aside though, I think that's just such a cool framework for a demo. Nice idea.
At $39/month, is 3000 interactions a high limit? I use Claude Code on the $30 plan (I think), and routinely hit limits. I'm not ready to jump to the next tier, though. I think it's $200/month, and the NGO I work for isn't prepared to throw that kind of cash at developers (I'm second-rate here; the science comes first)
Neither VSCode nor Cursor do this, so even if it's an extension triggering it somehow, the behaviour in Kiro is different to those other two.
I have found it extremely useful for spinning up personal projects though.
My wife bought us Claude subscriptions and she's been straight-up vibe coding an educational game for our son, with impressive results (she is a UX designer, so a lot more attuned to vibes than gen-pop). I'm picking up some computational physics research threads I dropped in grad school, and Claude Code has been incredible at everything besides the physics and HPC. Define and parse an input file format, integrate I/O libraries, turn my slapdash notes into LaTeX with nice TikZ diagrams, etc.
Hoping I can transfer over some insights to make it more helpful at work.
I wrote more about Spec Driven AI development here: https://lukebechtel.com/blog/vibe-speccing
I think auth can be a bit of a mess, but yes, it's still absolutely great that I can just log in with GitHub and it works. I am trying out Kiro right as we speak!
> Starting August 1, 2025, we’re introducing a new pricing plan for Amazon Q Developer designed to make things simpler and more valuable for developers.
> Pro Tier: Expanded limits $19/mo. per user
> 1,000 agentic requests per month included (starting 8/1/2025)
- https://aws.amazon.com/q/developer/pricing/
Previously agentic use was apparently "free", but with a set deadline in June, so it seems like this was just for a testing phase?
It is a huge hassle to match my existing settings, which I've spent countless hours tweaking over the years, with a new editor that can't import them. :(
It is my new daily driver.
IDEs don't just feel slow, they ARE slow, because they're written in HTML and JavaScript.
Go and try Delphi from 2005; it's blazing fast (and more functional...).
For reference: 3000 interactions, at an assumed 3 minutes of AI work per interaction, divided by 60 minutes per hour and 8 working hours per day, equals 18.75 working days of nonstop, back-to-back AI coding. A typical month has 20-23 working days. But realistically you likely won't be using Kiro nonstop all day, so 3000 interactions per month should more than cover your work month.
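That back-of-envelope estimate, spelled out (the 3 minutes per interaction is an assumption):

```python
interactions = 3000          # Pro+ tier monthly allowance
minutes_per_interaction = 3  # assumed average autonomous work per "push"
hours_per_workday = 8

workdays = interactions * minutes_per_interaction / 60 / hours_per_workday
print(workdays)  # 18.75
```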
> For users who access Kiro with Pro or Pro+ tiers once they are available, your content is not used to train any underlying foundation models (FMs). AWS might collect and use client-side telemetry and usage metrics for service improvement purposes. You can opt out of this data collection by adjusting your settings in the IDE. For the Kiro Free tier and during preview, your content, including code snippets, conversations, and file contents open in the IDE, unless explicitly opted out, may be used to enhance and improve the quality of FMs. Your content will not be used if you use the opt-out mechanism described in the documentation. If you have an Amazon Q Developer Pro subscription and access Kiro through your AWS account with the Amazon Q Developer Pro subscription, then Kiro will not use your content for service improvement. For more information, see Service Improvement.
Or, you know, stop chasing the latest trends, and use whatever you're most comfortable with.
Kiro will do it for you automatically.
If you were referring to the prompts inside of the game, you might find those fun and interesting. This one in particular is the heart of the game: https://github.com/kirodotdev/spirit-of-kiro/blob/main/serve...
Then in Kiro I see "There was an error signing you in. Please try again.".
FWIW, I've tried GitHub & Google, in different browsers, on different networks.
In all seriousness, I'm sure this will become more standardized over time, in the same way that MCP has standardized tool use.
I've long been interested in something that can gather lightweight rules files from all your subdirectories as well, like a grandparent rule file that inherits and absorbs the rules of children modules that you have imported. Something kind of like this: https://github.com/ash-project/usage_rules
I think over time there will be more and more sources and entities that want to preemptively provide lightweight, instructive steering content to guide their own use. But in the meantime we just have to deal with the standards proliferation until someone creates something amazing enough to suck everyone else in.
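A minimal sketch of that "gather the children's rules" idea (the `RULES.md` file name is just an assumption, and real tools would handle precedence and deduplication):

```python
import os

def gather_rules(root, filename="RULES.md"):
    """Walk subdirectories and concatenate any lightweight rules files
    into a single steering document, tagged with their source path."""
    sections = []
    for dirpath, _dirs, files in os.walk(root):
        if filename in files:
            path = os.path.join(dirpath, filename)
            with open(path) as f:
                sections.append(f"## From {path}\n\n{f.read().strip()}")
    return "\n\n".join(sections)
```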
I'm also thinking of creating a fork of the project that is designed to run entirely locally using your GPU. I believe with current quantized models, and a decent GPU, you can have an adequate enough fully local experience with this game, even the dynamic image generation part.
ActiveX, Java Web Start, etc. all tried to do this, and all of them ended up deprecated and out of favor, displaced by native web solutions.
Java IDEs did a lot of this for many years (Eclipse, IntelliJ, NetBeans, JDeveloper, etc) and they worked reasonably well on the desktop, but had no path to offering a web hosted solution (like gitpod or codespaces)
There are not a lot of options here. Compiling a native solution down to WASM and running it in the browser would work, but I'm not sure the performance would be substantially better or more consistent across all OSes and the web, unfortunately.
So we are where we are :)
The docs state at https://kiro.dev/docs/reference/privacy-and-security/#servic... that "Kiro is an AWS application that works as a standalone agentic IDE."
But nowhere on the landing page or other pages does it state that this is an Amazon product.
What is going on?
Edit: I see that @nathanpeck is the "author" and he works for Amazon, why are they trying to hide that fact?
If these re-skinned vscode IDEs have any good ideas I'm sure Microsoft will steal them anyway.
I integrated[1] the recently released Apple Container (instead of shell) to run code generated by Kiro. It works great!
1. CodeRunner: https://github.com/BandarLabs/coderunner
1. Open Settings in Kiro.
2. Switch to the User sub-tab.
3. Choose Application, and from the drop-down choose Telemetry and Content.
4. In the Telemetry and Content drop-down field, select Disabled to disable all product telemetry and user data collection.
source: https://kiro.dev/docs/reference/privacy-and-security/#opt-ou...
- cmd-t fuzzy-finding files or cmd-p finding symbols to open the various files that are relevant
- selecting a few lines in each file using fast IDE shortcuts to move and add
- dragging and dropping an image or other JSON files into the prompt
- not leaving the editor I'm already working in
Not to mention:
- viewing the agents edits as a diff in the editor and all the benefits of easily switching between tabs and one click rejecting parts etc
- seeing the sidebar of the agents thoughts and progress async alongside the code as I keep looking at things
- pausing the agent and reversing back steps visually with the sidebar
- not having to reconfig or setup my entire dev environment for some CLI - for example the biome v2 lsp just works since it’s already working in code which has the best support for these things
And really the list of reasons an editor is far better just never ends. Claude is ok, but I’m way way faster with Cursor when I do need AI.
Looking at the brief for this, it likely involves painstaking work to review and refine the specs produced. Not that there is anything wrong with that; as I said before in a comment on another story, coding assistants may reduce quite a bit of drudgery, but to get the best out of them you still need to do lots of work.
The more I use agentic coding tools, the more I come to the realization that speccing is where you add value as an experienced, skilled engineer. And I think this bodes well for software engineering, as a bifurcation emerges between the vibe coders (who will probably merge with the mythical end-user programmers) and serious engineers who are skilled at using LLMs via high quality specs to create software of higher quality and maintainability.
So the vibe coders would probably not take to this tool that much, but that's fine.
From the about page.
> Kiro is built and operated by a small, opinionated team within AWS.
Disclaimer: I work at AWS, different org though.
2. At every task it tried to compile the code but failed for dependency errors
3. It still marked the task being complete and passed the onus of failures on the downstream tasks
4. Kept moving with the tasks where the original error were still not fixed but the tasks were being marked as done
5. After some point I got tired to the degree that I stopped reading the exact commands being executed; the fatigue of supervising something that you are not involved in is real.
6. I made a naive assumption that I could sandbox it by giving permissions to the project folder only. It executed some CLI commands for Java that looked simple enough in the beginning.
7. Turns out my environment variables got messed up, and other simple things related to git and gradle stopped working.
Ended my experiment, reverted the code changes, fixed my environment
Key takeaways:
1. It's giving a sense of work being executed, but the quality and concreteness of the work is hard to measure unless you have already done it in the past. It's creating classes and tests which are not needed instead of focusing on the actual use case.
2. Sandboxes are a MUST; there is a real risk of corruption, and environment commands are not just simple file changes that can be easily reverted.
Have you considered a fourth file for Implemented such that Spec = Implemented + Design?
It would serve both as a check that nothing is missing from Design, and can also be an index for where to find things in the code, what architecture / patterns exist that should be reused where possible.
And what about coding standards / style guide? Where does that go?
All the people I know in the US with those skills make huge amounts of money- at least, the ones who haven't already retired rich.
Clearly, companies view the context fed to these tools as valuable. And it certainly has value in the abstract, as information about how they're being used or could be improved.
But is it really useful as training data? Sure, some new codebases might be fed in... but after that, the way context works and the way people are "vibe coding", 95% of the novelty being input is just the output of previous LLMs.
While the utility of synthetic data proves that context collapse is not inevitable, it does seem to be a real concern... and I can say definitively based on my own experience that the _median_ quality of LLM-generated code is much worse than the _median_ quality of human-generated code. Especially since this would include all the code that was rejected during the development process.
Without substantial post-processing to filter out the bad input code, I question how valuable the context from coding agents is for training data. Again, it's probably quite useful for other things.
What if I want to set preferences for the underlying LLM for different usage scenarios? For example, for a quick and snappy understanding of a single file id want to use a fast model that doesn't cost me an arm and a leg. Recent research on preference-aligned LLM routing here: https://arxiv.org/abs/2506.16655
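As a sketch of what that kind of preference-aligned routing could look like in practice (the model names, tiers, and cost figures below are invented for illustration, not anything Kiro actually exposes):

```python
# Hypothetical model tiers; names and per-token costs are made up.
MODELS = {
    "fast":   {"name": "small-model",    "cost_per_1k_tokens": 0.0002},
    "strong": {"name": "frontier-model", "cost_per_1k_tokens": 0.0150},
}

# Task kinds the user considers cheap/snappy work.
CHEAP_TASKS = {"explain-file", "rename", "summarize"}

def route(task_kind, budget_sensitive=True):
    """Pick a model tier based on the kind of request: quick
    comprehension tasks go to the cheap model, everything else
    to the strong one."""
    if budget_sensitive and task_kind in CHEAP_TASKS:
        return MODELS["fast"]
    return MODELS["strong"]
```

Today you'd have to approximate this by manually switching models per request; letting users declare such preferences once is the gap the linked paper addresses.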
- Uses ripgrep under the hood
- VSCode fork (thus suffers from the https://ghuntley.com/fracture problem)
- There are 14 different ways defined to edit a file due to its multi-modal design. Tuning this is going to be a constant source of headaches for the team.
- Kiro utilises a https://ghuntley.com/specs based workflow.
They can try to market grab with low %, but will find themselves in the same boat as Cursor and eventually be forced to raise their prices. Except their market grab will be significantly less effective because they're not a stand-out product. Cursor was.
To be clear, we have no intent to hide that Kiro is from Amazon / AWS, that's why you'll see Matt Garman, for example, posting about Kiro: https://www.linkedin.com/feed/update/urn:li:activity:7350558...
However, the long term goal is for Kiro to have its own unique identity outside of AWS, backed by Amazon / AWS, but more friendly to folks who aren't all in on AWS. I'll admit that AWS hasn't been known in recent years for having the best new user or best developer experience. Kiro is making a fresh start from an outsider perspective of what's possible, not just what's the AWS tradition. So, for example, you can use Kiro without ever having an AWS account. That makes it somewhat unique, and we aim to keep it that way for now.
If we take it far enough, we could end up with a well structured syntax with a defined vocabulary for specifying what the computer should do that is rigorously followed in the implemented code. You could think of it as some kind of a ... language for .... programming the computer. Mind blowing.
It’s a bit more than I pay in CAD, but I’d pay quite a bit more just to stop hitting the limits I have with Claude, even if the rest of the service was identical. It’s a pain. My usage is also very bursty so I spend several days getting no usage, then repeatedly hit limits while I brainstorm and spec things out.
I’m thinking out loud here in case it’s useful feedback. It seems like a great pricing scheme for my use case.
I thought AI was ushering in the age of innovation so why is the only innovation anyone seems capable of copying something that already exists...?
In all actuality, AI seems to be ushering out the age of innovation since people now consider it foolish to spend their time trying to innovate instead of clone
This is all a fuzzy memory, I could have multiple details wrong.
This is nice for documentation, but having a design document after the fact doesn't really help much. Designing is a decision-making process that happens before the code is written.
It gets really frustrating reviewing people's designs at times, when it's crystal clear they're a) working backwards and b) haven't really considered the customer experience at all.
One of my favourite telltale signs of a) is when the chosen option 100% fits the specifications, doubly so if there are no cons associated with the pros. Sometimes it's genuine, but very rarely.
- Machine code
- Assembly code
- LLVM IR
- C code (high level)
- VM IR (byte code)
- VHLL (e.g. Python/Javascript/etc)
So, we already have hierarchical stacks of structured text. The fact that we are extending this to higher tiers is in some sense inevitable. Instead of snark, we could genuinely explore this phenomenon.
LLMs are allowing us to extend this pattern to domains other than specifying instructions to processors.
Just because you use AI does not mean that you need to be careless about quality, nor is AI an excuse to turn off your brain and just hit accept on the first result.
There is still a skill and craft to coding with AI, it's just that you will find yourself discarding, regenerating, and rebuilding things much faster than you did before.
In this project I deliberately avoided manual typing as much as possible, and instead found ways to prompt Kiro to get the results I wanted, and that's why 95% of it has been written by Kiro, rather than by hand. In the process, I got better at prompting, faster at it, and reached a much higher success rate at approving the initial pass. Early on I often regenerated a segment of code with more precise instructions three or four times, but this was also early in Kiro's development, with a dumber model, and with myself having less prompting skill.
If there was such a thing you would just check in your prompts into your repo and CI would build your final application from prompts and deploy it.
So it follows that if you are accepting 95% of whatever output is being given to you, you are either doing something really mundane and straightforward, or you don't care much about the shape of the output (not to be confused with quality).
Like in this case you were also the Product Owner who had the final say about what's acceptable.
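To make the earlier thought experiment concrete, here's a hypothetical sketch of "prompts as the checked-in source, CI as the compiler." The `llm_generate` parameter is a stand-in for whatever model call a pipeline would make, not a real API, and the `.prompt` convention is invented:

```python
import pathlib

def build_from_prompts(prompt_dir, out_dir, llm_generate):
    """Hypothetical 'compiler': each checked-in prompt file becomes
    one generated source file. llm_generate is a placeholder for the
    model call a CI pipeline would make."""
    out = pathlib.Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for prompt in sorted(pathlib.Path(prompt_dir).glob("*.prompt")):
        code = llm_generate(prompt.read_text())
        (out / (prompt.stem + ".py")).write_text(code)
```

The reason this doesn't exist today is the point of the comment above: generation isn't deterministic or reliable enough for the prompts alone to be the source of truth.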
I am nowhere near being a lawyer, but I believe the promise would be more legally binding, and more likely to be adhered to, if money was exchanged. Maybe?
The "Amazon Q Developer Pro" sub they mention appears to be very inexpensive. https://aws.amazon.com/q/pricing/
Qt is pretty good at this actually. I don’t have a Mac, but building the same codebase for windows, linux, and a wasm target was pretty neat the first time I did it.
Otherwise the spec may cover requirements that are already met in the existing code, and it needs to understand the integration points it should include.
Having used Kiro myself I think it does what you expect.
If I've atrophied in certain aspects of my thinking, I honestly think I've more than made up for it in learning how to engineer the context and requirements for Claude Code more effectively and to quickly dive in to fix things without taking my hands off the keyboard and leaving the terminal.
Coding standards / style guide are both part of the "steering" files: https://kiro.dev/docs/steering/index
The good: It's great to see they've baked in the concept of setup -> plan -> act into the tool with the use of specs. If you're someone who currently only has Copilot / Q Dev, this is a good step in the right direction - if you don't mind changing your IDE. I love that it has a command / task queuing system. Hooks = good.
Goes either way: Even though it uses Q under the bonnet, it does seem somewhat better than Q although I think most of that is down to the use of plan -> act workflows
The not good: There's no reason at all for it to be a VSCode fork, and running multiple IDEs for every vendor that wants me to use their products is a PITA. It seems to massively over-complicate solutions; for things that could be quite simple, even if the tasks are well defined, it likes to create many files and go down very extensive and complex implementation patterns. This has to be something to do with the app itself, as Sonnet 4 does not do this with Cline / Roo Code, so hopefully it can be fixed (but maybe it suits the kind of folks that write big Java apps!). It doesn't seem to have any integrated web browser capabilities to allow the model to run up and explore the app while inspecting the browser-side JS console like Cline / Roo have. My installation has mysteriously become 'corrupted' and needed reinstalling several times. There are no global rules; you have to keep a note of them somewhere and copy-paste them into each project (GH Issue #25).
The bad: It's slower than Cline / Roo Code; it just takes a lot longer to get things done. It's very easy to hit the rate limits and be blocked for an undefined amount of time before you can continue working. There are lots of MCP bugs, mainly relating to it still not supporting the standard streamableHTTP or SSE modes, and it breaks all MCPs without any warning or logs if you try to add one (GH Issue #23). The version of VSCode it's built from is quite out of date already, which rings alarm bells for how well such a large, complex application will be maintained and rolled out at speed over time.
If they are saying the code in this project was in line with what they would have written, I lean towards trusting their assessment.
Within the next couple of years there's going to be a 4-for-1 discount on software engineers. Welcome to The Matrix. You'd best find Morpheus.
Check out the comments on https://news.ycombinator.com/item?id=44567857 and tell me what the alternative future is. Best wishes and good luck.
The better question is why is there this horrible monoculture in SW startups around raising money through VCs? We need more regular businesses who build something useful and charge a fair price. Period.
How do we know if random internet service sells our email / password pair? They probably store the hashed password because it's easier (libraries) than writing their own code, but they get it as cleartext every time we type it in.
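For context, a typical server-side scheme looks something like this sketch (PBKDF2 via Python's stdlib; the iteration count is illustrative, not a recommendation). The key point the comment makes: hashing happens only after the server has already received the cleartext.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    # Derive a slow hash (PBKDF2-HMAC-SHA256) so a database leak
    # doesn't reveal cleartext passwords. The server still *receives*
    # the cleartext over TLS before this step ever runs.
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def verify(password, salt, digest):
    # Recompute and compare in constant time to avoid timing leaks.
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, digest)
```

So hashing protects you against a stolen database, not against a malicious or compromised service that logs what it receives, which is exactly the trust gap being described.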
Not the ones maintaining frontend web apps or "vibe coding".
Essentially, the user labels (accept/edit) data (design documents) for the agent (amazon)
For that, we can just use a unique password per service. That's not really a thing for code.
Natural language is trying to be a new programming language, one of many, but it's the least precise one imho.
This implies your "content" may be used for anything else, including training non-foundation LLMs. Frankly, even if their disclaimer were broader, I'd still probably not trust them.
I don't mind that everyone is all-in with VSCode now, but I already paid $500 for the big-boy version and I've got 20k hours on it.
Click about, says its made by a team within AWS.
Click on any of the legal links in footer, get sent to AWS
Look at the footer, has AWS logo
Look at the license, clearly says "Amazon.com, Inc. or its affiliates"
On the download page "By downloading and using Kiro, you agree to the AWS Customer Agreement"
So, I can use the tools I use anyway and have AIs adapt to me instead of me having to adapt to new AI powered tools. I'm using a proper IDE (intellij). Me switching to cursor, kiro, or whatever would be an enormously massive downgrade for me. These tools don't come close to the utility and features of what I am used to and depend on. And those new AI tools trying to catch up with Intellij is not their focus or roadmap. I'm not going to wait for that to happen. I need stuff that works now. Not some years after they figure it out. And that includes AI features.
There's a difference between vibe coding where you are sitting on your hands and admiring all the crazy clever stuff the AI does for you that you wouldn't be able to do yourself and working on a system that you've spent years building from scratch with AI to assist you. I do the latter. I'm constantly intervening, dismissing poor results from AI, getting frustrated with LLMs misunderstanding things, ignoring my directions, not getting the full context, etc. But I'm also getting a lot of value out of AIs in dealing with tedious/repetitive stuff, figuring out weird bugs, pointing out my mistakes, or generating perfectly usable solutions for TODOs and FIXMEs I leave in my code. About 50-60% of the PRs codex creates for me are pretty usable.
I use ChatGPT for the small stuff (it can look at intellij and apply diffs) and codex for the bigger stuff "implement foo, add some tests, and tell me when I can look at the PR". And maybe I'll check out the branch and fix a few things myself. That's something my IDE supports very well. It's not a big deal. It doesn't need to be fixed.
I find that increasingly, model quality is not the main blocker for this stuff but the developer/user experience is. Claude might be better. But chat gpt has the far better UX. And I don't even use o3 most of the time. I prefer the more rapid responses other faster models give me. It's not a cost thing but a speed thing. I only escalate to slower models when I don't like the response I'm getting. Codex is nice but slooooooow. But at least I can work on other stuff while it is doing its thing. ChatGPT gives me instant gratification. Select line, Option+shift+1, "Fix this", "....", "apply fix". That's so nice and I do that a lot. And I didn't have to replace my tools. In the same way, Claude code might be marginally better at some stuff. But the Codex developer experience is superior.
So, Kiro sounds like a nice tool for people who don't need or use IDEs. But it's not for me.
Kiro has the same problem as many LLM coding tools have: either it's not economically sustainable for the company producing the tool (the bubble will burst at some point), or it's not worth it for the developer.
I disagree that natural language is trying to be a programming language. I disagree that being less precise is a flaw.
Consider:
- https://www.ietf.org/rfc/rfc793.txt
- https://datatracker.ietf.org/doc/html/rfc2616
I think we can agree these are both documents written in natural language. They underpin the very technology we are using to have this discussion. It doesn't matter to either of us what platform we are on, or what programming language was used to implement them. That is not a flaw.
Biological evolution shows us how far you can get with "good enough". Perfection and precision are highly overrated.
Let's imagine a wild future, one where you copy-and-paste the HTML spec (a natural language doc) into a coding agent and it writes a complete implementation of an HTML agent. Can you say with 100% certainty that this will not happen within your own lifetime?
In such a world, I would prefer to be an expert in writing specs rather than to be an expert in implementing them in a particular programming language.
At best an LLM is a new UI model for data. The push to get them writing code is bizarre.
Who is accountable?
* requirements doc
* design doc
* plan doc
These alone make an interesting product. Give me a thing that asks me the right, thought provoking questions (ideally let me answer by just choosing from a list of relevant options) and produces these docs in a structured and highly detailed way, and I’m set.
I think the code is just a distraction. There are plenty of tools for that.
I feel like this might be nicer as an MCP and I bring my own AI assistant to it.
> in line with what they would have written,
The point I am making is that they didn't know what they would've written. They had a rough overall idea, but details were being accepted on the fly. They were trying out a bunch of things to see what looked good based on a rough idea of what the output should be.
In a real world project you are not both product owner and coder.
We really need a new (and cheaper!) SOTA for agentic models.
Will be trying Kiro, excited to see how you approached implementing a similar idea!
At what point do you actually do engineering? This was a great demo for a project manager. Lead your “team” through feature development. But without proper architecture, development really does become spaghetti.
There’s vibe coding - and then there’s this, where I feel like I’m a PM, not an engineer.
For any real-world application that is even slightly more complex than these demos it will start to fail.
At least that has been my experience
not sure why you brought up YC, not relevant at all
HN comments aren't an ad network...
What I can tell you is that the last time I checked: laws are written in natural language, they are argued for/against and interpreted in natural language. I'm pretty confident that there is applicable precedent and the court system is well equipped to deal with autonomous systems already.
I generally like the integration, but in some cases it gets in the way of other AI tools that are running: I type q to quit and all of a sudden I'm in Q. I renamed it to amazonq, removed it from my zshrc, and added it back as a command to run when I want the integration, amazonqinit.
I also need to individually approve each command for some reason, and then if it fails due to high service load I need to manually restart the same task all over again.
1. Autopilot should be off by default; this is the norm for Claude Code and Cline (plan mode).
2. My sidebar is on the right. That was imported correctly from VS Code, but the Kiro window is still on the left. Why can't it be on the same side as my sidebar like a usual extension (e.g. Cline)?
3. The textbox is super busy; for some reason the "Hold Shift to drop image" hint is stuck there.
Agentic tools of the future will be rich notebook/chat interface that's available in all form factors, which is to say, most likely web/cross platform apps.
Some agents have multiple prompts that are used for different modes; I’ve typically seen this stored as JSON that is agent specific and wouldn’t necessarily apply to different agents.
The only agent specific thing I’ve ever included in a context file is referring to a specific tool. I probably could have abstracted that by describing it as “the tool that does X” or by just telling it to do the function that the tool does.
If anything, TUIs are the awkward in-between of "human in the loop, but with poor tools" where one side is fully automatic, agents suggesting fixes on issue tracker, and the other is holding-AI's-hand where you review every step one at a time.
I hate trying to copy paste in/out of Claude Code's unnecessarily-cute boxed text input.
Zed's implementation of the agent feedback loop isn't yet as good as Claude Code, but there's nothing inherently IDE-related in the parts that are lacking.
There was never a point in which it successfully did anything.
1. Tried to use macOS’s toolchain (m2 laptop so this is not going to work to build x86 binary)
2. It tried to fix that and failed multiple times.
3. Eventually I gave up on it trying to fix itself and told it to just use a container with Fedora Linux on it to work around the issue.
4. It created a Dockerfile for Fedora 39 (current is 42)
5. It still fails to recognize that we are on aarch64 and need x86, so the container it built was not correct anyway lol
I imagine “minimal bootloader + printing hello” is quite represented in the training set as there’s thousands of projects like this on GitHub.
If it cannot deal with basic things like this, I legitimately don’t get all the comments here praising it
If AI can handle things like that, then let AI do it: it's not really engineering work anyway; it's copy-and-paste from a previous design, just change the handler logic and the names of things. If 90% of incoming features are like that, then that gives you a lot more time to work on the 10% that are more complex.
Eventually, you'll end up with spaghetti code no matter how well you plan out the architecture, whether human or AI is doing designs. But it'll move that direction even faster with AI, and eventually AI won't be able to understand it well enough to reliably design things anymore. That's where the real engineering will come in. As the system evolves, how do we re-architect things so that AI (and humans) can understand the patterns again and make future changes more reliably?
Right now, it seems like services go through major refactors/rewrites like that every five years or so. And those rewrites tend to be slow and often unsuccessful: even though the existing system is complex, engineers are used to it and it's easier to add one more bandaid than to wait for the full rewrite. Then such rewrites can get stuck in navel-gazing as there's no "perfect" way to do them, and it's lower effort just to go back to the system you already know.
As AI creates more churn though, the architecture will need to be rethought much more frequently. Additionally there will be more urgency to deliver the cleanup because AI will be completely blocked by the existing spaghetti, which brings all product dev to a halt, and you don't have time for navel-gazing because there's no fallback option.
So I think the engineering work post-AI is really going to be this kind of infrastructural planning and rearchitecting, such that AI can deliver features on top of it without friction. And in a way, as an engineer, that's what I want to be doing anyway. We've always had this ideal of continuous refactoring and continuous improvement, that always gets pushed to the backburner when compared to feature development. "Sure this refactor will help future velocity, but we need to make our quarterly goals!" But now, AI will compress those timelines so that maintaining clean architectures has a direct effect on the deliverables of the current quarter.
I personally think this is great. If, in the future, PMs can launch whole features without engineers writing a line of code, that's awesome. It's our job to maintain a system where such an ideal is possible. Which sounds like the job I wanted when I originally signed up to be an engineer.
Not really. Electron is basically a web browser.
The issue is that a cornerstone of modern development is basically "don't rewrite what already has been written", however the problem is that you always get optimization creep because of this - people just build shit on top of other shit continuously and never go back and optimize.
Yuk. That’s pure project management.
> If, in the future, PMs can launch whole features without engineers writing a line of code, that's awesome.
No it’s not! Because then, there is no incentive to engineer anything anymore. Just let the AI do it, Yey! Need new features? Let the AI do it! Need to fix your infrastructure, AI has your back! Has your product gotten so unwieldy that you have context rot and AI can’t do it anymore? Pivot, throw it away, rebuild with AI, who needs engineers.
I am not, and don't expect to be able to do that for many years yet. The models aren't that good yet.
I would estimate that I accepted perhaps 25% of the initial code output from the LLM. The other 75% of output I wasn't satisfied with I just unapplied and retried with a different prompt, or I refactored or mutated it using a followup prompt.
In the final project, 95% of the committed lines of code in the published version were written by AI; however, there was probably 4x as much AI-generated code discarded along the way. Often the first take wasn't good enough, so I modified or refactored it, also using AI. Over the course of the project I got better at providing more precise prompts that generated good code the first time; still, I rarely accepted the first draft of code back from Kiro without making followup prompts.
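A quick sanity check on those numbers: if roughly 4x as much AI output was discarded as was kept, the survival rate of generated lines works out to about 20%, which is consistent with the ~25% first-pass acceptance mentioned above.

```python
committed = 1.0   # committed AI-written code (normalized units)
discarded = 4.0   # roughly 4x as much AI output was thrown away
survival = committed / (committed + discarded)
print(f"{survival:.0%}")  # → 20%
```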
A lot of people have a misguided thought that using AI means you just accept the first draft that AI returns. That's not the case. You absolutely should be reading the code, and iterating on it using followup prompts.
Or can you join the Discord and message me directly @swaminator
VSCode has some popular paid plugins like LSPs or some for git.
I don't see why it wouldn't be possible to monetize a VSCode plugin.
So while we're in the middle of those two pivot points, I think most of our work will be on the architecture side. Continuously clean up the platform so the LLM agents can keep humming along on it.
Eventually we'll perhaps get to the point where AI can automate 100% of this as well, and I have no clue what will become of engineers then. But I don't see that happening in the next ten years, and even when it does happen, I'm sure the changes will create whole new industries, workflows, and sets of problems for human engineers to solve. (SciFi me expects a whole new field of extracting the most value out of AI without letting it run amock. As fast as it goes, and as intelligent as it will be, we won't be able to just let it take over. We'll need to design guardrails for it so that it does the things you want, and doesn't make decisions that you don't want. This, by definition, has to be a human driven process. So I think there'll be work for human engineers to do for a long long time.)
Also if you're interested our plugin is https://sweep.dev!
I found this approach to be quite poor: Claude seems to treat user instructions very differently than other agentic code assistants, and I think it's because their system prompt is so long. As a result, I'm getting some pretty poor adherence to my claude.md file, and I've noticed that very rarely, if ever, have I seen it traverse to any of the nested paths.
So... I'm going to try to refactor my files to keep it all in the same file, with some heavy “LOOK AT ME” prompt engineering.
How about you?
And the way I see the future of coding, you should be able to code from anywhere: mobile, web, your computer. You already have your code in the cloud (most of the time). Neither TUI nor IDE currently works well for that.