
229 points by modinfo | 1 comment
simonw No.40835549
If this is the API that Google are going with here:

    const model = await window.ai.createTextSession();
    const result = await model.prompt("3 names for a pet pelican");
There's a VERY obvious flaw: is there really no way to specify the model to use?

Are we expecting that Gemini Nano will be the one true model, forever supported by this API baked into the world's most popular browser?

Given the rate at which models are improving that would be ludicrous. But... if the browser model is being invisibly upgraded, how are we supposed to test out prompts and expect them to continue working without modifications against whatever future versions of the bundled model show up?
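
Absent a version string, the best a developer can do is a loose regression check against whatever model happens to be installed. A sketch, reusing the API above (the invariant being asserted is obviously invented):

    const model = await window.ai.createTextSession();
    const result = await model.prompt("3 names for a pet pelican");
    // With no way to pin the model, all we can assert are loose
    // invariants, and hope a silent model update doesn't break them.
    const lines = result.split("\n").filter((l) => l.trim());
    if (lines.length < 3) {
        console.warn("Bundled model output drifted:", result);
    }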

Something like this would at least give us a fighting chance:

    const supportedModels = await window.ai.getSupportedModels();
    if (supportedModels.includes("gemini-nano:0.4")) {
        const model = await window.ai.createTextSession("gemini-nano:0.4");
        // ...
    }
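
And when the pinned checkpoint isn't available, a page could at least degrade gracefully instead of shipping untested prompts. A rough sketch; `getSupportedModels`, the version string, and the fallback helper are all hypothetical:

    const pinned = "gemini-nano:0.4";
    const supported = await window.ai.getSupportedModels(); // hypothetical
    if (supported.includes(pinned)) {
        const model = await window.ai.createTextSession(pinned);
        const result = await model.prompt("3 names for a pet pelican");
    } else {
        // Fall back to a server-side model, or hide the feature,
        // rather than run prompts against an untested checkpoint.
        useServerSideFallback(); // hypothetical page-side helper
    }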
1. pmg0 No.40835757
> But... if the browser model is being invisibly upgraded, how are we supposed to test out prompts and expect them to continue working without modifications against whatever future versions of the bundled model show up?

Pinning a language-model task to a checkpoint with known behavior is critical for building cool, consistent features on top of it.

However, the alternative to an invisibly evolving model is deploying innumerable base models and versions for web pages to select from. That would rapidly explode the long tail of models users would need to fetch and store locally just to use those pages (think of HF's long tail of LoRA fine-tunes across every combination of dataset and foundation model). How many foundation models plus LoRAs can people realistically store and run locally?
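
To put rough numbers on it (the parameter counts and 4-bit quantization are my assumptions, not published figures):

    // Back-of-envelope storage math for on-device models.
    const bytesPerParam = 0.5;    // int4 weights
    const baseParams = 3.25e9;    // assumed Gemini-Nano-class size
    const baseGB = (baseParams * bytesPerParam) / 1e9;  // ~1.6 GB per model
    const loraParams = 20e6;      // assumed adapter size
    const loraGB = (loraParams * bytesPerParam) / 1e9;  // ~0.01 GB per adapter
    // Ten sites pinning ten different base models: ~16 GB of weights.
    // Ten LoRAs sharing one base model: under 2 GB total.
    console.log({ baseGB, loraGB, tenBasesGB: 10 * baseGB });

The long tail only stays tractable if everyone shares a foundation model and ships small adapters.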

So it makes some sense for Google to deploy a single model they believe strikes the right balance of size, latency, and quality. They are likely looking for developers to build on their platform, bringing features to their browser first and directing usage towards their models. The most useful fuel for steering the training of these models is knowing what clients use them for.