(lyons-den.com)

ceejayoz ◴[15 Jul 25 02:42 UTC] No.44567373[source]▶

>>44566996 (OP) #

I saw this come up on /r/aws a few days back.

This response seemed illuminating:

https://www.reddit.com/r/aws/comments/1lxfblk/comment/n2qww9...

> Looking at (this section)[https://will-o.co/gf4St6hrhY.png], it seems like you're trying to queue up an asynchronous task and then return a response. But when a Lambda handler returns a response, that's the end of execution. You can't return an HTTP response and then do more work after that in the same execution; it's just not a capability of the platform. This is documented behavior: "Your function runs until the handler returns a response, exits, or times out". After you return the object with the message, execution will immediately stop even if other tasks had been queued up.
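
A minimal sketch of the point in the quoted reply, written in Python rather than the poster's Node.js. `send_welcome_email` and `SENT` are hypothetical stand-ins, and the thread only simulates a queued task. The assumption being illustrated is the one stated in the quote: once the handler returns, unfinished background work is not guaranteed to complete.

```python
import threading
import time

SENT = []  # hypothetical stand-in for an external side effect (email, queue, DB write)


def send_welcome_email(address):
    """Hypothetical slow task; the sleep stands in for network latency."""
    time.sleep(0.05)
    SENT.append(address)


def handler_fire_and_forget(event, context=None):
    # Anti-pattern: start the work, then return immediately.
    # On Lambda the environment may be frozen before this thread finishes.
    threading.Thread(
        target=send_welcome_email, args=(event["email"],), daemon=True
    ).start()
    return {"statusCode": 200, "body": "queued"}


def handler_wait_then_return(event, context=None):
    # Safer: finish the work (or hand it to a durable queue) before returning.
    worker = threading.Thread(target=send_welcome_email, args=(event["email"],))
    worker.start()
    worker.join()  # block until the work has actually completed
    return {"statusCode": 200, "body": "sent"}
```

The second handler makes the caller wait longer, which is the trade-off: the response is only sent once the side effect has happened.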

replies(2): >>44567415 #>>44567675 #

1. mcflubbins ◴[15 Jul 25 02:49 UTC] No.44567415[source]▶

>>44567373 #

If this is the case (it might very well be) I do at least find it odd that no one on AWS' side was able to explain this to them.

replies(6): >>44567446 #>>44567457 #>>44567464 #>>44567509 #>>44567669 #>>44569284 #

2. ceejayoz ◴[15 Jul 25 02:54 UTC] No.44567446[source]▶

>>44567415 (TP) #

Some people have a lot of certainty coupled with a lack of accuracy.

3. semiquaver ◴[15 Jul 25 02:55 UTC] No.44567457[source]▶

>>44567415 (TP) #

Based on the way the document is written, it seems very likely that several people did realize exactly what the misunderstanding was and try to explain it to them.

4. somethingAlex ◴[15 Jul 25 02:56 UTC] No.44567464[source]▶

>>44567415 (TP) #

It may just be such a basic tenet of the platform that no one thought to. You stop getting billed after Lambda returns a response, so why would you expect computation to continue? This guy expected a free lunch.

replies(1): >>44567701 #

5. m3sta ◴[15 Jul 25 03:06 UTC] No.44567509[source]▶

>>44567415 (TP) #

AWS has a lot of employees. Many of them don't know AWS very well.

replies(2): >>44567698 #>>44567748 #

6. eddythompson80 ◴[15 Jul 25 03:47 UTC] No.44567669[source]▶

>>44567415 (TP) #

They did (in a typical non-fault-admitting way). They didn't escalate to the Lambda engineering team, then said that this is a code issue and that they should move to EC2 or Fargate, which is the polite way of saying "you can't do that on Lambda. It's your issue. No refunds. Try Fargate."

OP seems to be fixated on wanting MicroVM logs from AWS to help them correlate their "crash", but there are likely no logs support can share with them. The microVM is suspended in a way you can't really replicate or test locally. "Just don't assume you can do background processing". Also, to be clear, AWS used to allow the microVM to run for a bit after the response is completed to make sure anything small like that has been done.

It's a nondeterministic part of the platform. You usually don't run into it until some library starts misbehaving for one reason or another. To be clear, it does break applications and it's a common support topic. The main support response is to "move to EC2 or Fargate" if it doesn't work for you. Trying to debug and diagnose the Lambda code with the customer is out of support scope.

replies(1): >>44569159 #

7. scarface_74 ◴[15 Jul 25 03:57 UTC] No.44567698[source]▶

>>44567509 #

No AWS employee knows “AWS well”. AWS has a huge surface area. If you work on one of the service teams (i.e. the teams that maintain the different AWS services), you are very unlikely to know the entire AWS surface area; you will be focused on your own team's service and the ones around it.

It used to be the case that if you were interviewing for an SDE position, you were specifically told not to mention specific AWS services in the system design rounds and to speak of generic technologies instead.

8. charcircuit ◴[15 Jul 25 03:58 UTC] No.44567701[source]▶

>>44567464 #

I don't see that on the main AWS Lambda pages. It just says that you pay for what you use. It would make sense that the time billed would be until there is no more code to execute.

replies(3): >>44567740 #>>44567843 #>>44573010 #

9. meepmorp ◴[15 Jul 25 04:07 UTC] No.44567740{3}[source]▶

>>44567701 #

Yeah, but if you actually read the developer documentation, they explain the execution model pretty extensively.

replies(1): >>44568686 #

10. jamesfinlayson ◴[15 Jul 25 04:09 UTC] No.44567748[source]▶

>>44567509 #

I remember interviewing a guy who worked at Amazon - he commented that he started trusting AWS much less after getting to know some of the developers who worked on AWS.

replies(1): >>44568638 #

11. somethingAlex ◴[15 Jul 25 04:32 UTC] No.44567843{3}[source]▶

>>44567701 #

Fair enough, I guess this just seems like a bold assumption to make since an explicit handler function is a cornerstone of lambda, rather than being able to run module level code and having the end self detected.

12. crinkly ◴[15 Jul 25 06:58 UTC] No.44568638{3}[source]▶

>>44567748 #

Yep that. We are a fairly high spend customer which gives us direct access to engineering managers there. Some of the non-core products are run by what looks like two guys in a trailer protected by a layer of NDAs and finger crossing. Had some pretty crazy problems even with their core stuff over the years which absolutely knocked my confidence.

At the same time, I've had two separate Apple SREs in the last 5 years tell me that I should never trust their cloud services.

If you can't see it and can't control it yourself, then you accept these things silently.

My distrust goes so far that the only things I actually subcontract out are Fastmail, DNS and domains, and the mail is backed up hourly.

13. everfrustrated ◴[15 Jul 25 07:05 UTC] No.44568686{4}[source]▶

>>44567740 #

Developers will do anything except read the documentation.

replies(1): >>44570264 #

14. jiggawatts ◴[15 Jul 25 08:33 UTC] No.44569159[source]▶

>>44567669 #

I had a similar issue with Microsoft's own Application Insights dropping logs on the floor when used inside Azure Functions (equiv. of Lambda).

Same technical cause, it uses background tasks to upload logs. If the Function exits, the logs aren't sent.

This has been fixed since, but for a while generated a lot of repeated support tickets and blog articles.
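
A rough sketch of the failure mode described here, assuming a telemetry client that buffers records and uploads them later. `BufferedTelemetry` and `make_handler` are hypothetical and are not the Application Insights API. The idea is to flush synchronously before the function returns instead of relying on a background upload.

```python
class BufferedTelemetry:
    """Hypothetical client that batches log records and uploads them later."""

    def __init__(self, upload):
        self._upload = upload  # callable that ships a list of records
        self._buffer = []

    def track(self, message):
        self._buffer.append(message)  # cheap: nothing is sent yet

    def flush(self):
        # Ship whatever is buffered, then empty the buffer.
        if self._buffer:
            self._upload(list(self._buffer))
            self._buffer.clear()


def make_handler(telemetry):
    def handler(event, context=None):
        try:
            telemetry.track(f"processing {event['id']}")
            return {"ok": True}
        finally:
            # Flush before returning so no record depends on work
            # that would run after the function has exited.
            telemetry.flush()

    return handler
```

Putting the flush in `finally` means records are also sent when the handler raises.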

replies(1): >>44570796 #

15. arpinum ◴[15 Jul 25 09:00 UTC] No.44569284[source]▶

>>44567415 (TP) #

They did. He has written elsewhere that AWS told him it was an application code problem. The subtitle of this screed also says it.

This person should not be trusted to accurately recount a story; I would be sceptical of any claim in the doc.

16. meepmorp ◴[15 Jul 25 12:02 UTC] No.44570264{5}[source]▶

>>44568686 #

So many times I've caught myself thinking "I don't want to understand this shit, I just wanna fix it," as I've grudgingly opened up whatever docs I've avoided reading; almost as many times as I've wondered "what fucking moron wrote this code" before immediately `git blame`ing myself.

17. eddythompson80 ◴[15 Jul 25 13:14 UTC] No.44570796{3}[source]▶

>>44569159 #

I know for a fact that’s not true. You must have misunderstood the issue.

This is one of the main technical differences between Azure Functions and Google Cloud Run vs Lambda. Azure and GCP offer serverless as a billing model rather than an execution model precisely to avoid this issue. (Among many other benefits*)

In both Azure and GCP you listen for and handle SIGTERM (on Linux at least; Azure has a Windows offering, and you use a different Windows mechanism there that I’m forgetting), so you can control and handle shutdown. There is no “suspend”. “Suspending nodejs” is not a thing. This is a super AWS Lambda specific behavior that is not replicable outside AWS (not easily at least).
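
The SIGTERM handling described above could look roughly like this in Python. This is a sketch only: `flush_pending_work` and `state` are hypothetical, and real services usually also stop accepting new requests before cleaning up.

```python
import signal

state = {"shutting_down": False, "flushed": False}


def flush_pending_work():
    """Hypothetical cleanup: flush logs, finish in-flight jobs, close connections."""
    state["flushed"] = True


def on_sigterm(signum, frame):
    # The platform asks the process to stop; finish up before exiting.
    state["shutting_down"] = True
    flush_pending_work()


def install_sigterm_handler():
    # signal.signal must be called from the main thread.
    signal.signal(signal.SIGTERM, on_sigterm)
```

Call `install_sigterm_handler()` once at startup; after that, the process gets a chance to clean up whenever the platform sends SIGTERM.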

The main thing I do is review cloud issues that mid-size companies run into. Most of them were startups that grew into mid-size companies and now need to get off the “Spend Spend Spend” mindset and rein in cloud costs and services. The first thing we almost always have to untangle is their Lambdas/SF, and it’s always the worst thing to untangle or split apart because you will forever find your code behaving differently outside of AWS. Maybe if you have the most excellent engineers working with the most excellent processes and most excellent code. But in reality all Lambda code takes a complete dependency on the fact that Lambda will kill your process after a timeout. Lambda will run only 1 request through your process at any given time. Lambda will “suspend” and “resume” your process. 99% of the Lambdas I have helped move out of AWS have had a year+ trail of bugs where those services couldn’t run for any length of time before corrupting their state. They rely on the process restarting every few requests to clear up a ton of security-impacting cross contamination.

* I might be biased, but I much, much prefer GCR or AZF to Lambda in terms of running a service. Lambda shines if you resell it. Reselling GCR or AZF as a feature in your application is not straightforward. Reselling a Lambda in your application is very, very easy.
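
One hypothetical example of the kind of cross contamination described above: module-level state that looks harmless while the process is short-lived, but leaks between requests once the process stays up. All names here are made up for illustration.

```python
_leaky_cache = {}  # module-level: survives for as long as the process does
_keyed_cache = {}


def render(doc, user):
    """Hypothetical per-user view of a document."""
    return f"{doc} rendered for {user}"


def handler_shared(event):
    # Bug: the cache key ignores the user, so the first caller's view is
    # served to everyone until the process restarts.
    key = event["doc"]
    if key not in _leaky_cache:
        _leaky_cache[key] = render(event["doc"], event["user"])
    return _leaky_cache[key]


def handler_scoped(event):
    # Fix: include every input that affects the result in the cache key.
    key = (event["doc"], event["user"])
    if key not in _keyed_cache:
        _keyed_cache[key] = render(event["doc"], event["user"])
    return _keyed_cache[key]
```

With frequent process restarts the first handler's bug rarely shows up; in a long-running process the second caller gets the first caller's data.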

18. nemothekid ◴[15 Jul 25 16:36 UTC] No.44573010{3}[source]▶

>>44567701 #

>until there is no more code to execute

What does "no more code to execute" mean to you? How would you define that?

replies(2): >>44573250 #>>44576270 #

19. Hikikomori ◴[15 Jul 25 16:54 UTC] No.44573250{4}[source]▶

>>44573010 #

In the context of lambda, when you return.

20. charcircuit ◴[15 Jul 25 22:00 UTC] No.44576270{4}[source]▶

>>44573010 #

Essentially !uv_loop_alive, when the event loop will exit.
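
A loose Python analogue of the two definitions in this subthread: "when you return" versus "when the event loop has nothing left". Nothing here calls libuv's `uv_loop_alive`. The sketch relies on the fact that `asyncio.run` cancels leftover tasks once the main coroutine returns, which is only roughly comparable to Lambda freezing the environment. `background` and `done` are hypothetical.

```python
import asyncio

done = []


async def background(name, delay):
    await asyncio.sleep(delay)
    done.append(name)


async def handler_returns_early():
    # "When you return": the task is still pending when the handler finishes,
    # and asyncio.run cancels it during shutdown.
    asyncio.create_task(background("early", 0.05))
    return "response"


async def handler_then_drain():
    # "When the loop has nothing left": wait for every other task first.
    asyncio.create_task(background("drained", 0.05))
    pending = [t for t in asyncio.all_tasks() if t is not asyncio.current_task()]
    await asyncio.gather(*pending)
    return "response"
```

Running the first handler with `asyncio.run` leaves `done` empty; running the second appends `"drained"` before the response is returned.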

AWS Lambda Silent Crash – A Platform Failure, Not an Application Bug [pdf]