HTTP/1.1 PATCH /user
Content-Type: application/json-patch+json
[
{
"op": "replace",
"path": "/username",
"value": "bob"
}
]
why not HTTP/1.1 PATCH /user/username
Content-Type: application/json
"bob"
I feel like you could pretty sensibly implement replace/delete/append with that.
Edit: "test" and "copy" from the JSON Patch spec are unique! So there are those, as well as doing multiple edits all at once in a sort of "transaction".
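To make the "test" plus "transaction" point concrete, here is a minimal sketch of an applier for a small subset of JSON Patch (`replace` and `test` only). The names `apply_patch` and `_resolve` are made up for illustration and do not come from any library; the sketch assumes non-root paths and skips `~0`/`~1` unescaping.

```python
import copy


def _resolve(doc, path):
    """Return (parent, key) for a non-root JSON Pointer such as "/a/0/b"."""
    tokens = path.split("/")[1:]
    parent = doc
    for tok in tokens[:-1]:
        parent = parent[int(tok)] if isinstance(parent, list) else parent[tok]
    last = tokens[-1]
    return parent, (int(last) if isinstance(parent, list) else last)


def apply_patch(doc, ops):
    """Apply every op to a deep copy; the original is untouched if any op fails."""
    work = copy.deepcopy(doc)
    for op in ops:
        parent, key = _resolve(work, op["path"])
        if op["op"] == "test":
            if parent[key] != op["value"]:
                raise ValueError(f"test failed at {op['path']}")
        elif op["op"] == "replace":
            parent[key]  # raises KeyError/IndexError if the target is missing
            parent[key] = op["value"]
        else:
            raise ValueError(f"unsupported op: {op['op']}")
    return work
```

Because all work happens on a copy, a failing `test` halfway through leaves the stored document exactly as it was, which is the all-or-nothing behaviour a single-key PATCH endpoint cannot give you across several keys.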
The biggest thing on my wishlist for a system like this is a standardized syntax for choosing an item in a list by an identifying (set of?) key-value pairs. E.g. for
{
"name": "Clark St Garden",
"plants": [
{ "latin": "Phytelephas aequatorialis", "year": 2009 },
{ "latin": "Ceiba speciosa", "year": 2009 },
{ "latin": "Dillenia indica", "year": 2021 }
]
}
I'd like to be able to specify that I want to update Ceiba speciosa regardless of its index. This gets especially important if we're adding items or trying to analyze diffs of previous versions of a JSON item.
But I just find that the 1 by 1 approach is easier to reason about if you're opening this up to the internet. I'd personally feel more comfortable with the security model of 1 URL + Session => 1 JSON key.
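For the "select a list item by identifying keys" wish, one workaround that stays within standard JSON Patch is to resolve the index on the client and guard it with "test" operations, so the patch fails instead of hitting the wrong item if the list has shifted. A rough sketch follows; `patch_for_item` is a hypothetical helper, and it assumes the list lives under a top-level key whose name needs no escaping.

```python
def patch_for_item(doc, list_key, match, field, value):
    """Build a JSON Patch that updates `field` on the list item matching `match`."""
    for index, item in enumerate(doc[list_key]):
        if all(item.get(k) == v for k, v in match.items()):
            base = f"/{list_key}/{index}"
            # Guard: the server rejects the patch if this index no longer
            # holds the item we identified locally.
            ops = [
                {"op": "test", "path": f"{base}/{k}", "value": v}
                for k, v in match.items()
            ]
            ops.append({"op": "replace", "path": f"{base}/{field}", "value": value})
            return ops
    raise LookupError(f"no item matches {match}")
```

This does not remove the index from the patch; it only makes a stale index fail loudly rather than silently updating a different plant.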
Turns out there is https://www.ietf.org/archive/id/draft-goessner-dispatch-json...
For JSON Patch to be useful and efficient, it requires both sides to use JSON for representation. As in, if you have some native structure that you maintain, JSON Patch either must be converted into operations on your native structure, or you have to serialize to JSON, patch, and deserialize back to the native structure. Which is not efficient, and so either you don't use JSON Patch or you have to switch to JSON as your internal representation, which is problematic in many situations. (The same arguments apply to the sending side.)
Not only that, but you become dependent on the patch to produce exactly the same result on both sides, or things can drift apart and later patches will start failing to apply, requiring resynchronization. I would probably want some sort of checksum to be communicated alongside the patch, but it would be tough to generate a checksum without materializing the full JSON structure. (If you try to update it incrementally, then you're back to the same problem of potential divergence.)
I mean, I can see the usefulness of this: it's like logical vs physical WAL. Or state- vs operation-based CRDTs. Or deltas vs snapshots. But the impact on the representation on both sides bothers me, as does the fact that you're kind of reimplementing some database functionality in the application layer.
If I were faced with this problem and didn't already represent everything as giant JSON documents (and it's unlikely that I would do that), I think I'd be tempted to use some binary format that I could apply the rsync algorithm to in order to guarantee a bit-for-bit identical copy on the other side. Basically, hand the problem off to a fully generic library that I already trust. (You don't have to pay for lots of round-trip latencies; rsync has batch updates if you already know the recipient has an exact matching copy of the previous state.) You still have to match representations on the sending and receiving side, but you can use any (pointer-free) representation you want.
Why can’t your protocol just be valid JavaScript too? `this.name = "string";` instead of mixing so many metaphors?
Anyway, a few resources to help you learn:
FTFY
Since JSON is a subset of JS, I would have expected `.` to be the delimiter. That jibes with how people think of JSON structures in code. (Python does require bracket syntax for traversing JSON, but even pandas uses dots when you generate a dataframe from JSON.)
When I see `/`, I think:
- "This spec must have been written by backend people," and
- "I wonder if there's some relative/absolute path ambiguity they're trying to solve by making all the paths URLs."
```
{
  "checksum": {
    "algorithm": "sha1",
    "normalization": "minify",
    "root-checksum": "d930e659007308ac8090182fe664c7f64e898ed9"
  },
  "patch": [
    {
      "op": "replace",
      "path": "/id",
      "node-checksum": "b11ee5e59dc833a22b5f0802deb99c29fb50fdd0",
      "value": { "foo": "bar", "nullptr": 0 }
    },
    {
      "op": "replace",
      "path": "/cat",
      "original-value": "foo",
      "value": "bar"
    }
  ]
}
```
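A checksum like the "root-checksum" above only helps if both sides serialize the document identically before hashing. Here is a small sketch of one possible normalization (sorted keys, no insignificant whitespace); it is an assumption rather than anything the format above defines, and it uses SHA-256 where the example says "sha1". `canonical_checksum` is an illustrative name.

```python
import hashlib
import json


def canonical_checksum(doc):
    """Hash a JSON-compatible value after a simple, assumed normalization."""
    text = json.dumps(
        doc,
        sort_keys=True,          # object key order must not affect the hash
        separators=(",", ":"),   # "minify": drop insignificant whitespace
        ensure_ascii=False,
    )
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```

Key order is neutralized, but number spelling is not: `1` and `1.0` serialize differently here, so the two sides would still need to agree on how numbers are written.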
[1] https://user-images.githubusercontent.com/50021387/184360079...
It's not really hard to protect yourself against that.
Any (competent) security guy can give you like 4 ways to implement it properly.
Also, if someone is using this in production: any gotchas?
> I think I'd be tempted to use some binary format
And now you require both sites to use a binary format for representation. And then you have the same list of challenges.
As for working at Mozilla: oh, it's worse than that, I'm the one who did the most recent round of optimization to JSON.stringify(). So if you're using Firefox, you're even more vulnerable to my incompetence than you realized.
Furthermore, I'll note that for Mozilla, I do 90% of my work in C++, 10% in Python, and a rounding error in JS and Rust. So although I work on a JS engine, I'm dangerously clueless about real-world JS. I mostly use it for fun, and for side projects (and sometimes those coincide). If you expect JS engine developers to be experts in the proper way to use JS, then you're going to be sorely disappointed.
That said, I'd be interested to hear a counterargument. Your argument so far agreed with me.
[1] https://www.rfc-editor.org/info/rfc6902
You probably want to have some constraints on the kinds of patch objects you apply to avoid simple attacks (e.g. a really large patch, or overly complex patches). But you can probably come up with a set of rules to apply generally to the patch without trying to validate that the patch value for the address meets some business logic. Just do that at the end.
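As a sketch of what such generic rules might look like, the check below rejects patches that are too long, too large, too deep, or that use unknown operations, before any business validation runs. The function name `check_patch` and all three limits are arbitrary assumptions for illustration.

```python
import json

ALLOWED_OPS = {"add", "remove", "replace", "move", "copy", "test"}


def check_patch(ops, max_ops=50, max_depth=8, max_bytes=16_384):
    """Raise ValueError if the patch breaks a generic structural limit."""
    if not isinstance(ops, list) or len(ops) > max_ops:
        raise ValueError("patch must be a list of at most max_ops operations")
    if len(json.dumps(ops).encode("utf-8")) > max_bytes:
        raise ValueError("patch is too large")
    for op in ops:
        if not isinstance(op, dict) or op.get("op") not in ALLOWED_OPS:
            raise ValueError("unknown or malformed operation")
        path = op.get("path")
        # A JSON Pointer is either empty (the root) or starts with "/".
        if not isinstance(path, str) or (path and not path.startswith("/")):
            raise ValueError("path must be a JSON Pointer")
        if path.count("/") > max_depth:
            raise ValueError("path is too deep")
```

Validation of the resulting document against business rules would then happen once, after the patch has been applied.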
{ "name": "bob", "phone": null }
This would set the name to bob, null out the phone, but leave all other fields untouched. No need for a DSL-over-json.
Only trouble is static type people love their type serializers, which are ever at a mismatch with the json they work with.
What about just stringifying a JS function?
The alternative is to modify the large array on the client side and send the whole modified array every time.
Fair point. I probably overstated the weaknesses of the JSON model; it's fine for many uses.
But I like Map, and Set, and occasionally even WeakMap. I especially like JS objects that maintain their ordering. I'm even picky enough to like BigInts to stay BigInts, RegExes to stay RegExes, and graphs to have direct links between nodes and be capable of representing cycles. So perhaps it's just the nature of problems I work on, but I find JSON to be a pretty impoverished representation within an application -- even with no OOP involved.
It's great for interchange, though.
>> I think I'd be tempted to use some binary format
> And now you require both sites to use a binary format for representation. And then you have the same list of challenges.
Requiring the same format on both sides is an important limitation, but it's less of a limitation than additionally requiring that format to be JSON. It's not the same list of challenges, it's a subset.
Honestly, I'm fond of textual representations, and avoid binary representations in general. I like debuggability. But I'm imagining an application where JSON Patch is useful, as in something where you are mutating small pieces of a large structure, and that's where I'd be more likely to reach for binary and a robust mirroring mechanism.
JSON is not JavaScript (despite the "J"), and `undefined` is not a part of JSON specification.
However, I think every language out there that has a dictionary-like type can distinguish between presence of a key and absence of one, so your argument still applies. At the very least, for simple documents that don't require anything fancy.
I believe this simple merge-based approach is exactly what people are using when they don't need JSON Patch. If you operate on large arrays or need transactional updates, JSON Patch is probably a better choice, though.
> Only trouble is static type people love their type serializers, which are ever at a mismatch with the json they work with.
I don't think it's a type system problem, unless the language doesn't have some type that is present in JSON and has to improvise. Typically, it's rather a misunderstanding that a patch document (even if it's a merge patch like your example) has its own distinct type from the entity it's supposed to patch - at the very least, in terms of nullability. A lot of people (myself included) made that blunder only to realize how it's broken later. JSON Patch avoids that because it's very explicit that patch document has its own distinct structure and types, but simple merge patches may confuse some.
For example, Firebase doesn't let you store null values. Instead, for Firebase, setting something to null means the same as deleting it. With a single simple restriction like that, you can implement PATCH simply by accepting a (recursive) partial object of whatever that endpoint returns. Eg if /books/1 has
{ title: "Dune", score: 9 }
you can add a PATCH /books/1 that takes eg { score: null, author: "Frank Herbert" }
and the result will be { title: "Dune", author: "Frank Herbert" }
This is way simpler than JSON Patch - there's nothing new to learn, except "null means delete". IMO "nothing new to learn" is a fantastic feature for an API to have.
Of course, if you can't reserve a magic value to mean "delete" then you can't do this. Also, appending things to arrays etc can't be done elegantly (but partially mutating arrays in PATCH is, I'd wager, often bad API design anyway). But it solves a very large % of the use cases JSON Patch is designed for in a, in my humble opinion, much more elegant way.
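The "null means delete" rule is the same one JSON Merge Patch (RFC 7396) uses, and it fits in a few lines. A minimal sketch, with `merge_patch` as an illustrative name; it returns a new object rather than mutating the stored one.

```python
def merge_patch(target, patch):
    """Recursively merge `patch` into `target`; a null (None) value deletes the key."""
    if not isinstance(patch, dict):
        # Scalars and arrays replace the target wholesale.
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result
```

Applied to the book above, `merge_patch({"title": "Dune", "score": 9}, {"score": None, "author": "Frank Herbert"})` gives `{"title": "Dune", "author": "Frank Herbert"}`. Note the array limitation mentioned above: a list in the patch always replaces the whole list.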
* Agreed by both parties to be the protocol they'll use
* Used for "representation" (assuming we mean the same by that, if not, please clarify)
>So if you're using Firefox
I jumped ship ten years ago; but I've heard you guys are doing quite well?
>I'm dangerously clueless about real-world JS.
Agree.
Disclosure: I'm @moralestapia but my laptop ran out of battery and this is my backup account, lol.
You want them to all fail or not,
One-by-one is a bit of a weird suggestion tbh. You shouldn't be reasoning that way about code.
If you are going to get a 4xx response to one of the 4 property updates you want them all to fail at once.
Just like anything else we use like SQL.
{
"makes": {
"toyota": {
"models": [ ... ]
}
}
}
"I need all the models of Toyota cars."
Or
"Toyota came out with a new Camry, I need to update the Camry object within Toyota's models."
That's the format that people tend to naturally use. The main problem is that arrays can only be replaced.
[1] https://zuplo.com/blog/2024/10/10/unlocking-the-power-of-jso...
[1]: https://github.com/tktech/py_yyjson [2]: https://tkte.ch/py_yyjson/api.html#yyjson.Document.patch
The point is that this is a data structure and not a web server. It's using a convention from one domain in a different one. Relatively minor in the scope of possible quibbles, but it's just one more thing to remember/one more trivial mistake to make. I'm sure much collective time will be lost to many people using dots where the spec says to use slashes, and not realizing it because it looks right in every other context with dots. Dots also makes copy/pasting easier, because you wouldn't have to migrate the format after pasting.
What if the client supplied actual code to do the update? I'm thinking something sort of like ebpf in the kernel - very heavily restricted on what can be done, but still very flexible.
{ "op": "merge", "path": "/", "value": { "score": null, "author": "Frank Herbert" } }
It's kind of nice to retain the terse and intuitive format while also gaining features like "test" and explicit nulls. It's of course not spec compliant anymore but for standard JSON Patch APIs the client could implement a simple Merge Patch->Patch compiler.
The most glaring issue is JSON number type versus JavaScript float. This causes issues in both directions whereby people consistently believe JSON can't represent numbers outside the float range and in addition JSON has no way to represent NaN.
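Both halves of the number issue are easy to see from Python's standard `json` module, used here purely as an illustration of a non-JavaScript JSON implementation. `strict_dumps` is a made-up wrapper name.

```python
import json

# JSON text can carry integers that a 64-bit float cannot hold exactly.
big = json.loads("9007199254740993")   # 2**53 + 1, parsed as an exact int
as_float = float(big)                  # rounds to 2**53, like a JS Number would

# NaN has no JSON representation; by default Python still emits the
# non-standard token NaN, which strict parsers reject.
nan_text = json.dumps(float("nan"))


def strict_dumps(value):
    """Serialize, refusing NaN and Infinity instead of emitting invalid JSON."""
    return json.dumps(value, allow_nan=False)
```

Whether a big integer survives a round trip therefore depends on the parsers at each end, not on JSON itself.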
https://docs.aws.amazon.com/cloudcontrolapi/latest/APIRefere...
So, wait, you can't add an item to an array with this (except at a predefined position)? I.e. "add" with path "/.../myarray/~" (if I've understood their notation right) isn't allowed?
I'm not sure if that's good or bad, but it's certainly surprising and could do with saying a bit more explicitly.
I had built a quick and dirty web interface so that a handful of people we contracted overseas can annotate some text data at the word level.
Originally, the plan was that the data was being annotated in small chunks (a sentence or two of text) but apparently the person managing the annotation team started assigning whole documents and we got a complaint that suddenly the annotations weren't being saved.
It turned out that the annotators had been using a dial up connection the entire time (in 2018!) and so the upload was timing out for them.
We panicked a bit until I discovered JSON Patch and I rewrote the upload code to only use the patch.
If I have json like
[{"name": "Bob", "age": 22}, {"name": "Sally", "age": 40}]
and then I delete the "Sally" element and add "Jenny" with the same age, I end up with
[{"name": "Bob", "age": 22}, {"name": "Jenny", "age": 40}]
However, the patch would potentially look like I changed the name of "Sally" to "Jenny", when in reality I deleted and added a new element. If I'm submitting this to a server and need to reconcile it with a database, the server cares about the specifics of how the JSON was manipulated.
I'd end up needing some sort of container object (like Yjs provides) that can track the individual changes and generate a correct patch.
[{"name": "Bob", "age": 22}, {"name": "Sally", "age": 40, "deleted": "2024-10-18 12:34:56"}]
I think it also needs a “replace” option at the individual object update level. Merge is a good default, but the semantics of the data or a particular update could differ.
You’re almost surely doing something wrong if replace doesn’t work for arrays. I think the missing thing is a collection that is both ordered and keyed (often not by the same value). JSON by itself just doesn’t do that.
So maybe what’s missing is a general facility for specifying metadata on an update, which can be used to specify the magical delete value, and the key/ordering field for keyed, ordered collections.
{
"delete":["123-234","567-700"],
"insert":[123,456],
"substrs":[{"foo":"bar"},[]]
}
Then delete the ranges from the string, convert the substrs to strings and insert them at the offset.
Yeah, you could assign an identity value to each element of the array and then use a subresource to manipulate those elements by identity value. Then you could PUT using the same JSON merge mechanism to clear individual fields, and you could DELETE to remove items from the array by subresource.
This just seems like a reinvention of a crufty piece of XML.
I actually think an array would be better, ["foo","bar"] for "foo.bar". How many bugs are introduced by people not escaping properly? It's more verbose, but judging by the rest of the standard, they don't seem to be emphasizing brevity.
JSON Patch uses JSON Pointer (RFC 6901) to address elements, but another method from (very) roughly the same time is JSON Path [0] (RFC 9535) and here's one of my favorite mnemonics:
- JSON Path uses "points" between elements
- JSON Pointer uses "path separators" between elements
It seems like this is mainly a problem if you're implementing this _ad hoc_ on the client or server side -- is that right?
I mean: presumably most of the time that you want to do either of these, you already have both the old and new object, right? Is it not straightforward to write a function (or library) that takes two plain objects and generates the JSON Patch from one to the other, and then use that everywhere and not think about this (but retain the advantage of "being able to modify every possible JSON document under the sun")?
If there are cases where you're making a delta without the original object (i.e., I know I always want to remove one field and add some other, whatever the original state was), it seems like you could have nice helpers like `JsonPatch::new().remove_field('field1').add_field('field2', value)`.
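A helper along those lines is short to write. The sketch below is a hypothetical `PatchBuilder` in Python (the method names mirror the pseudocode above and are not from any real library); it only handles top-level fields and escapes names per JSON Pointer rules.

```python
class PatchBuilder:
    """Fluent builder that collects JSON Patch operations on top-level fields."""

    def __init__(self):
        self._ops = []

    @staticmethod
    def _pointer(name):
        # JSON Pointer escaping: "~" becomes "~0", then "/" becomes "~1".
        return "/" + name.replace("~", "~0").replace("/", "~1")

    def remove_field(self, name):
        self._ops.append({"op": "remove", "path": self._pointer(name)})
        return self

    def add_field(self, name, value):
        self._ops.append({"op": "add", "path": self._pointer(name), "value": value})
        return self

    def build(self):
        return list(self._ops)
```

One caveat for the "whatever the original state was" case: in JSON Patch a "remove" fails if the target does not exist, so a blind `remove_field` is not quite unconditional.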
I haven't actually done this so maybe I'm missing something about how you want to use these things in practice?
edit to add my motivation: I'd much rather have something robust and predictable, even if it means writing tooling to make it convenient, than something that _seems_ easy but then can't handle some cases or does something different than expected ("I wanted this to be null, not gone!").
I think I've seen this in other libraries too but I forget which.
Some time ago (9 years, apparently - time sure does fly), I made a thing to describe patches to a binary using JSON (https://github.com/zahlman/json_bpatch). It was meant primarily for hacking content into someone else's existing binary file, and I spent way too much time on fancy algorithms for tracking "free" (safely modifiable, based on the user's initial assessment and previous patches) space and fitting new content into the remaining space. Overall, I consider it a failure - but a fair amount of this project DNA is likely to survive in future projects.
I also had the idea at some point to make some kind of JSON diffing tool that works at a JSON-structure level instead of a textual level. I guess I don't need to reinvent that wheel.
Or here's one built into Symfony: https://symfony.com/doc/current/components/expression_langua...
I suppose a subset of this is idempotent though.
Broadly this seems fraught with peril. It sounds like edge case upon edge case, and only would work in the narrow case where you are 100% sure exactly what the remote document looks like such that you can calculate the patch. If anything gets out of sync, or serialization differs between local and remote, etc you're going to get subtle bugs...
Both JSONPath and JMESPath support query expressions, whereas JSON Pointer does not.
I was trying to extend Python's dataclasses.replace() function to be able to replace deeply nested elements. In vanilla Python this function is used as obj = replace(obj, element1=newval, element2=newval2). Meaning that to replace nested readonly dataclasses this must be called recursively.
Implementing a sequence of objects (var1.var2.var3) was mostly easy, but playing around the [] operator for sequence or mappings or slices was filled with edge cases. And implementing edge cases is a hornet nest just in itself.
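The "mostly easy" attribute-chain case can be sketched with a short recursive wrapper around `dataclasses.replace`. `replace_path`, `Address`, and `User` are made-up names for illustration; the sketch deliberately ignores the `[]` cases (sequences, mappings, slices) where the edge cases live.

```python
from dataclasses import dataclass, replace


def replace_path(obj, path, value):
    """Return a copy of `obj` with the attribute at dotted `path` set to `value`."""
    head, _, rest = path.partition(".")
    if not rest:
        return replace(obj, **{head: value})
    # Rebuild the nested dataclass first, then swap it into this level.
    return replace(obj, **{head: replace_path(getattr(obj, head), rest, value)})


@dataclass(frozen=True)
class Address:
    city: str


@dataclass(frozen=True)
class User:
    name: str
    address: Address
```

Usage: `replace_path(User("bob", Address("Oslo")), "address.city", "Bergen")` returns a new `User` and leaves the original untouched.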
Functions as first-class citizens are one of the most useful concepts for making dynamic code that remains readable. The next step would be structured accessors as first-class citizens.
z = ['a', 'b', 'c', 'd', 'e', 'f']
z.splice(2, 3, 'w', 'x', 'y', 'z')
Array(7) [ "a", "b", "w", "x", "y", "z", "f" ]
Unfortunately it's not on strings in javascript, but the verb has basically that meaning even outside of programming and could work here too.
I don't see how the values in your example would work though.
Such a document may not be wise, but how would you update something like:
{ "charCounts": { "/": 1234, "a": 100, ".": 99 } }
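With JSON Pointer (RFC 6901), keys like these are addressed by escaping: "~" is written as "~0" and "/" as "~1", while "." needs no escaping at all. A tiny sketch, with `escape_token` and `pointer` as illustrative names:

```python
def escape_token(key):
    """Escape one object key for use as a JSON Pointer reference token."""
    # Order matters: escape "~" first so the "~" in "~1" is not re-escaped.
    return key.replace("~", "~0").replace("/", "~1")


def pointer(*keys):
    """Join already-split keys into a JSON Pointer string."""
    return "".join("/" + escape_token(str(key)) for key in keys)
```

So the "/" counter above would be addressed as `/charCounts/~1`, and the "." counter as `/charCounts/.`.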
You compare the old string with the new one. What should be deleted is marked for deletion. Say char 10 to 30. Then the insert part points at spots where new stuff should be inserted.
Commas are left as an exercise for the backend.
Weaknesses
Maintenance Costs: As APIs evolve, the paths specified in JSON Patches might
become obsolete, leading to maintenance overhead.
Then again, if a path becomes obsolete, it means that I would be about to update with a wrongly formatted document anyway.
My plan was to dig up an operational transform implementation on json patches to implement collaborative editing when we decided we need that feature (we don't need fine grained editing of text, but we do have arrays of objects).
I'm now evaluating automerge - it seems to be performant enough and not likely to disappear.
(I can implement OT, but I won't be on this project forever and need to leave something that less senior devs can work with and maintain.)
Hm, maybe? I'm thinking about this from the perspective of API design, eg for REST APIs, JavaScript modules etc. How to let users change a single field in an object, and leave the rest alone? Like set the subject of a conversation? JSON Patch lets you do that, and so does this. And whether you have the target object already is kind of.. maybe? Sometimes? I wouldn't make that assumption tbh.
Does anyone know if it’s possible to use that update language on local files, without having a full mongodb running?
https://www.mongodb.com/docs/manual/reference/operator/updat...
An array of keys alleviates this issue entirely because any string element of the array can be used as a JSON object property name too.
URIs are not only used on web servers though, they're all over the place, probably most notable your filesystem is using `/` as a path separator, so it wouldn't be completely out of place to use it as a path separator elsewhere.
Are you referring to the possibility to point to the end of the array? If so, a single minus sign might solve it: "/path/to/the/array/-"
RFC 6901 JavaScript Object Notation (JSON) Pointer > exactly the single character "-", making the new referenced value the (nonexistent) member after the last array element
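To show how the "-" token behaves in practice, here is a small sketch of just the "add" operation, mutating the document in place. `add_op` is a hypothetical helper, not a library function, and it skips `~0`/`~1` unescaping and the root path.

```python
def add_op(doc, path, value):
    """Apply a JSON Patch "add" to `doc`; "-" means "after the last array element"."""
    *parents, last = path.split("/")[1:]
    target = doc
    for tok in parents:
        target = target[int(tok)] if isinstance(target, list) else target[tok]
    if isinstance(target, list):
        index = len(target) if last == "-" else int(last)
        if not 0 <= index <= len(target):
            raise IndexError(f"index {index} out of range at {path}")
        target.insert(index, value)   # existing elements shift right
    else:
        target[last] = value          # on objects, "add" sets or overwrites the member
    return doc
```

So appending does not require knowing the array's length: `add_op(doc, "/myarray/-", x)` always lands at the end.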
If you're concerned about concurrency for array updates, you'll usually need a form of concurrency control whether the database supports partial updates or not.
REST is great but I wouldn't exactly call it declarative, it relies on HTTP verbs, plus there are times when you simply can't express things with pure REST so usually you break the textbook pattern with something like POST /resource/doSomething.
As for RPCs, you have to predefine the operations you want to expose as part of your schema, which is even less flexible than what I'm proposing. Imagine for example that instead of passing a JSON Patch string you'll pass a real JS arrow function (with some limitations, e.g. no closure state). It allows for max flexibility, the user doesn't have to learn any APIs, plus the implementation becomes trivial. It's kind of like SQL without inventing a new language.
Yet, I would still like a checksum option because of its constant size impact.
I didn't consider before though that checksumming would differ from json's understanding of equality if items are reordered (objects, but also some other cases like integers with explicit sign +0, 0, -0). Those cases could (optionally) be considered in a normalization step.
Off the top of my head, an optional header like "MergeMetadataObjectPropertyName: @mergeMetadata"
Which would cause objects in the merge containing the property "@mergeMetadata" to be treated specially.
The merge meta data could (optionally) specify an alternative to null for the special delete value. Or (optionally) specify the key value for an array representing an ordered, keyed collection. (Or, possibly, to specify the order value for an object used to represent a keyed, ordered collection.)
I guess you could just do without the header and specify the metadata using magic values (in the same way null is used as a special value meaning delete), but it seems better to opt in to things like that.
(IMO, json merge patch would have been slightly better if it had no special values by default, but it's not bad. "null means delete" is a small thing, you probably need delete regardless, and, anyway, the ship has sailed on that one.)
It's really bad at creating deeply nested schemas though, (eg from CSV -> Notion you'll get trouble). It's also limited in things like renaming keys, splitting values into multiple keys (eg a csv string into an array) and those sorts of things
I remember testing out various libraries (not sure if a proper JSON Patch library was already around back then, looking at the spec I think it should...) and picking it over all the others because it handled complex objects and arrays way better than all the others.
Would also love to see how it compares.
This is most obvious in Kubernetes, where you have a list property like:
"conditions": [{"type":"a", "status":"b"}, {"type":"c", "status":"d"}, ...]
and you can't really create a patch like "change the conditions[type="a"].status to foo". As a result, you usually end up doing a whole field patch.