The Keras training loop for a simple model is indeed easier and faster to set up than PyTorch-oriented helpers (e.g. Lightning AI, Hugging Face Accelerate), but much, much less flexible.
source: PyTorch Foundation news
- LLM support in PyTorch is better (both at the tooling level and the CUDA level). Hugging Face Transformers does support both TensorFlow and PyTorch variants of LLMs, but...
- Almost all new LLMs land in PyTorch first and may or may not be ported to TensorFlow. This most notably includes embedding models, which are the most important area in my work.
- Keras's training loop assumes you can fit all the data in memory and that the data is fully preprocessed, which in the world of LLMs and big data is infeasible. PyTorch has a DataLoader that can handle CPU/GPU data movement and streaming preprocessing (see the sketch after this list).
- PyTorch has better implementations of modern ML training improvements such as fp16, multi-GPU support, and better native learning rate schedulers. PyTorch also lets you override the training loop for very specific needs (e.g. custom loss functions); implementing those in TensorFlow/Keras is a buggy pain.
- PyTorch models were faster to train than TensorFlow models on the same hardware with the same architecture.
- Keras's serialization for model deployment is a pain in the butt (e.g. SavedModel), while PyTorch has both better tooling in torch.jit and native ONNX export (sketch below).
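To make the streaming/fp16/custom-loss points concrete, a minimal sketch; the file format, model, and shapes here are made up for illustration, worker sharding is elided, and a CUDA device is assumed:

    import torch
    from torch import nn
    from torch.utils.data import DataLoader, IterableDataset

    class StreamingDataset(IterableDataset):
        # Yields examples lazily, so the corpus never has to fit in RAM.
        def __init__(self, path):
            self.path = path
        def __iter__(self):
            with open(self.path) as f:
                for line in f:
                    # Assumed format: 128 floats then an integer label per line.
                    *feats, label = line.split()
                    yield torch.tensor([float(v) for v in feats]), int(label)

    loader = DataLoader(StreamingDataset("train.txt"), batch_size=32)

    model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2)).cuda()
    opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
    scaler = torch.cuda.amp.GradScaler()  # fp16 mixed precision

    for x, y in loader:
        x, y = x.cuda(non_blocking=True), y.cuda(non_blocking=True)
        opt.zero_grad()
        with torch.cuda.amp.autocast():
            logits = model(x)
            # Any custom loss can be dropped in here.
            loss = nn.functional.cross_entropy(logits, y)
        scaler.scale(loss).backward()
        scaler.step(opt)
        scaler.update()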
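And for the deployment point, roughly (a toy stand-in model, not a real one):

    import torch
    from torch import nn

    model = nn.Sequential(nn.Linear(128, 2)).eval()  # stand-in for a trained model
    dummy = torch.randn(1, 128)

    traced = torch.jit.trace(model, dummy)  # TorchScript artifact
    traced.save("model.pt")

    torch.onnx.export(model, dummy, "model.onnx",
                      input_names=["features"], output_names=["logits"],
                      dynamic_axes={"features": {0: "batch"}})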
https://keras.io/examples/keras_recipes/trainer_pattern/
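That recipe boils down to overriding train_step on a keras.Model subclass that wraps the real model; a rough sketch of the pattern (TF backend, Keras 2-style compiled_loss):

    import tensorflow as tf
    from tensorflow import keras

    class CustomTrainer(keras.Model):
        # Wraps an inner model so the update step can be customized.
        def __init__(self, model, **kwargs):
            super().__init__(**kwargs)
            self.inner = model

        def call(self, inputs, training=False):
            return self.inner(inputs, training=training)

        def train_step(self, data):
            x, y = data
            with tf.GradientTape() as tape:
                y_pred = self(x, training=True)
                loss = self.compiled_loss(y, y_pred)  # a custom loss could go here
            grads = tape.gradient(loss, self.trainable_variables)
            self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
            self.compiled_metrics.update_state(y, y_pred)
            return {m.name: m.result() for m in self.metrics}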
> - Keras's training loop assumes you can fit all the data in memory and that the data is fully preprocessed, which in the world of LLMs and big data is infeasible.
The TensorFlow backend has the excellent tf.data.Dataset API, which allows for out-of-core data loading and preprocessing in a streaming fashion.
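For example, with nothing materialized in memory at once (the paths and feature spec below are placeholders):

    import tensorflow as tf

    def parse_fn(record):
        # Decode one serialized example; this feature spec is illustrative.
        feats = tf.io.parse_single_example(
            record,
            {"x": tf.io.FixedLenFeature([128], tf.float32),
             "y": tf.io.FixedLenFeature([], tf.int64)})
        return feats["x"], feats["y"]

    files = tf.data.Dataset.list_files("data/*.tfrecord")
    dataset = (tf.data.TFRecordDataset(files)
               .map(parse_fn, num_parallel_calls=tf.data.AUTOTUNE)
               .shuffle(10_000)
               .batch(32)
               .prefetch(tf.data.AUTOTUNE))
    # model.fit(dataset) then consumes batches lazily, epoch after epoch.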
https://huggingface.co/docs/transformers/main/en/trainer#cus...
I don’t need a custom loss function, so Keras is just fine.
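(For anyone who does need one: the linked approach is subclassing Trainer and overriding compute_loss. A rough sketch, with made-up class weights:)

    import torch
    from transformers import Trainer

    class WeightedLossTrainer(Trainer):
        def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
            labels = inputs.pop("labels")
            outputs = model(**inputs)
            logits = outputs.get("logits")
            weight = torch.tensor([1.0, 3.0], device=logits.device)  # illustrative
            loss = torch.nn.functional.cross_entropy(
                logits.view(-1, model.config.num_labels), labels.view(-1),
                weight=weight)
            return (loss, outputs) if return_outputs else loss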
From the article it sounds like Waymo runs on Keras. Last I checked, Waymo was doing better than the PyTorch-powered Uber effort.
In our case, we built an ensemble of several small models using Keras. Our secret sauce at the time was the specificity of our data and the labeling.
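The shape of it was roughly this; the sizes and member architecture here are stand-ins, not our actual models:

    from tensorflow import keras

    def build_member():
        # Stand-in for one small member model; the real ones differed.
        return keras.Sequential([
            keras.layers.Dense(64, activation="relu"),
            keras.layers.Dense(1, activation="sigmoid")])

    inputs = keras.Input(shape=(128,))  # made-up feature size
    members = [build_member() for _ in range(5)]
    outputs = keras.layers.Average()([m(inputs) for m in members])
    ensemble = keras.Model(inputs, outputs)
    ensemble.compile(optimizer="adam", loss="binary_crossentropy")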