Hence the former can be used in contexts like "if x := 10: pass", which is the whole point of the PEP.
if ((int test = my_func_call()) == BLAH) { do_foo(); }
For example, import mod is NOT defined as
mod = eval(open("mod.py").read())
but involves an abstract load-module operation, which is dependent on the environment.

That's why := is just syntactic sugar; there are no new semantics.
The sugar is sprinkled on top of syntax, the stuff the parser deals with. Typing a += 1 instead of a = a + 1 is sugar because it parses the same. This assignment syntax seems different. IMHO.
I don't think that's right; what expression/statement is `x := y` equivalent to? I'm thinking in particular about using mutable collections to emulate assignment in a lambda, e.g.
>>> counter = (lambda c: lambda: (c.append(c.pop() + 1), c[0])[1])([0])
>>> counter()
1
>>> counter()
2
>>> counter()
3
It looks like this could now be done as:

>>> counter = (lambda c: lambda: (c := c + 1))(0)
Yet the semantics here are very different: one is pushing and popping the contents of a list, without changing any variable bindings (`c` always points to the same list, but that list's contents change); the other has no list, no pushing/popping, and does change the variable bindings (`c` keeps pointing to different integers).

Maybe it's equivalent to using a `=` statement, but statements are forbidden inside lambdas. Maybe the lambdas are equivalent to `def ...` functions, but what would their names be? Even if we made the outer one `def counter(c): ...` the resulting value would have a different `func_name` (`counter` versus `<lambda>`).
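As it happens, the `:=` one-liner above wouldn't even run: an assignment-expression target inside a lambda is local to that lambda (there's no way to declare it `nonlocal` there), so `c + 1` reads an unbound local. A quick check, assuming Python 3.8+:

```python
counter = (lambda c: lambda: (c := c + 1))(0)

err = None
try:
    counter()
except UnboundLocalError as e:
    # inside the inner lambda, `c` is an assignment target, hence a
    # local variable there, and it is read (`c + 1`) before assignment
    err = type(e).__name__
print(err)
```

which further supports the point that the two counters are not equivalent.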
Even the `if` examples that are scattered around this page don't seem to have an equivalent. For example:
if (x := foo()) is not None:
    do_something()
We can't "desugar" this, e.g. to something like the following:

x = foo()
if x is not None:
    do_something()
The reason is that we're changing the point at which the binding takes place. For example, Python guarantees to evaluate the elements of a tuple in left-to-right order (which we exploited in the above push/pop example). That means we could write:

if (sys.stdout.write(x), (x := foo()) is not None)[1]:
    do_something()
This will print the current value of `x`, then update `x` to the return value of `foo()`. I can't think of a way to desugar this which preserves the semantics. For example, using the incorrect method from above:

x = foo()
if (sys.stdout.write(x), x is not None)[1]:
    do_something()
This isn't equivalent, since it will print the new value of `x`. Maybe we could float the `write` call out of the condition too, but what about something like:

if foo(x) and (x := bar()):
    do_something()
We would have to perform `foo(x)` with the old value of `x`, store the result somewhere (a fresh temporary variable?), perform the `x = bar()` assignment, reconstruct the condition using the temporary variable and the new value of `x`, then `del` the temporary variable (in case `do_something` makes use of `locals()`).

PS: I think this `:=` is a good thing, and writing the above examples just reminded me how infuriating it is when high-level languages distinguish between statements and expressions, rather than having everything be an expression!
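The bookkeeping described above can be sketched as runnable code; `foo`, `bar` and the flag standing in for `do_something()` are all hypothetical:

```python
# Hypothetical stand-ins for foo, bar and do_something:
def foo(v):
    return v > 0

def bar():
    return 42

x = 1
ran = False

# Manual desugaring of `if foo(x) and (x := bar()): do_something()`:
_tmp = foo(x)      # evaluate foo() with the OLD value of x
if _tmp:
    x = bar()      # `and` short-circuits, so assign only if foo(x) was truthy
if _tmp and x:
    ran = True     # stands in for do_something()
del _tmp           # don't leak the temporary into locals()

print(ran, x)
```

Even this sketch leans on `_tmp` being fresh, which is exactly the problem discussed below.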
if (match := re.match(r1, s)):
    o = match.group(1)
elif (match := re.match(r2, s)):
    o = match.group(1)
You could, but that would turn "syntactic sugar" into a useless phrase with arbitrary meaning.
The phrase "syntactic sugar" is usually reserved for language constructs which can always be rewritten, in-place, to some other construct in the same language, such that the semantics is identical (i.e. we can't tell which construct was used, unless we parse the contents of the file).
Python has examples like `foo += bar` being sugar for `foo = foo + bar`.
As an aside, your mention of "machine language" implies the use of operational semantics. That's where we say the "meaning" of a program depends on what it does to the machine when executed. That's fine, but it's not the only approach to semantics. In particular denotational semantics defines the meaning of a program by giving a meaning to each syntactic element of the language and their combinations, usually by rewriting them into some other, well-defined language (e.g. set theory). I much prefer denotational semantics, since it lets me 'think in the language', rather than making me 'think like the machine'.
That doesn't seem possible (see my sibling comments). You might be able to write a different program, which might be similar (e.g. same return value, most of the time), but I don't think there's anything that's equivalent.
This is an important distinction! For example, let's say you're given a program that uses a lot of `x := y` expressions. You're asked to back-port this to an older Python version, which doesn't have `x := y`. What do you do? If there's an equivalent expression, you can just swap them out; you could even automate it with a keyboard macro, since there's no need to think about it.
If, on the other hand, you only know how to write similar code, you can't be as confident. Some examples of where "similar" programs can end up behaving differently are:
- The application makes heavy use of threading
- There are lots of magic methods defined, like `__getattribute__`, which can alter the meaning of common Python expressions (e.g. `foo.bar`)
- Those magic methods cause global side effects which the program relies on, so that they have to get triggered in the correct order
- The program manipulates implementation features, like `locals()`, `func_globals`, `__class__`, etc.
- The software is a library, which must accept arbitrary values/objects given by users
- It makes use of hashes, e.g. to check for data in an existing database, and those hashes may depend on things like the order of insertion into internal properties
Whilst it's perfectly reasonable to curse whoever wrote such monstrous code, that doesn't help us backport it. We would have to tread very carefully, and write lots of tests.
> I'd prefer more lines for readability reasons
Verbosity and readability are not the same thing. Overly verbose code might have easier-to-understand parts, whilst obscuring the big picture of what it's actually doing. A classic example is assembly: each instruction is pretty easy, e.g. "add the value of register A to register B", "jump to the location stored in register C if register B is non-positive", etc. Yet we can pluck a page of disassembled machine code from, say, the middle of LibreOffice and have no idea what problem it's meant to be solving. (I posted a rant about this at https://news.ycombinator.com/item?id=16223583 ).
Nitpick, but I don't think that's true – AFAIK they translate into different method calls.
`foo + bar` → `foo.__add__(bar)`
`foo += bar` → `foo.__iadd__(bar)`
(note the `i` in the second one)

match = re.match(r1, s)
if match:
    o = match.group(1)
else:
    match = re.match(r2, s)
    if match:
        o = match.group(1)
or a bit shorter:

match = re.match(r1, s)
if not match:
    match = re.match(r2, s)
if match:
    o = match.group(1)
You could also just loop:

for pattern in (r1, r2, ...):
    match = re.match(pattern, s)
    if match:
        o = match.group(1)
        break
else:
    do_failure_handling()
But this goes a bit beyond the original question.

Just for fun, this seems to work:
(locals().pop('x', None), locals().setdefault('x', y))[1]
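For completeness, here is the for/else loop above as a self-contained sketch; the patterns, the input string, and the failure handler are hypothetical stand-ins:

```python
import re

r1, r2 = r'(\d+)', r'([a-z]+)'   # hypothetical patterns
s = "hello"                      # hypothetical input

for pattern in (r1, r2):
    match = re.match(pattern, s)
    if match:
        o = match.group(1)
        break
else:
    o = None                     # stands in for do_failure_handling()

print(o)
```

The `else` clause runs only if the loop finishes without hitting `break`.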
Does not. One is addition, the other is in-place addition; they're different things and can behave differently. E.g. in "a += b" and "a = a + b", the former might not construct an intermediate object, but mutate the existing a.
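The list case makes the difference observable, since other references to the same object see the in-place mutation:

```python
b = [3]

a = [1, 2]
alias = a              # second reference to the same list object
a += b                 # in-place: list.__iadd__ mutates that object
after_iadd = alias[:]  # the alias saw the change

a = [1, 2]
alias = a
a = a + b              # list.__add__ builds a NEW list; `a` is rebound
after_add = alias[:]   # the alias still holds the old contents

print(after_iadd, after_add)   # [1, 2, 3] [1, 2]
```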
>>> class A(object):
...     def __getattribute__(self, attr):
...         if attr == "__add__":
...             return lambda *_: "hello world"
...         return None
...
>>> a = A()
>>> a.__add__(A())
'hello world'
>>> a + A()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'A' and 'A'
if (match := re.match(r1, s)):
    o = match.group(1)
    # plus some code here
elif (match := re.match(r2, s)):
    o = match.group(2)
    # plus some other code here
In this case only your first solution works, I think. Leaving aside that having those deeply nested ifs is incredibly ugly, I find it hard to accept that something which completely changes the possible structure of the code is just "syntactic sugar".

Take a more familiar example:
x, y = (y, x)
Let's pretend that this is "just sugar" for using a temporary variable. What would the desugared version look like? As a first guess, how about:

z = (y, x)
x = z[0]
y = z[1]
del(z)
This seems fine, but it's wrong. For example, it would break the following code (since `z` would get clobbered):

z = "hello world"
x, y = (y, x)
print(z)
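Running the naive desugaring makes the clobbering visible (the values here are arbitrary):

```python
x, y = 1, 2
z = "hello world"

# the naive "desugaring" of `x, y = (y, x)`:
z = (y, x)
x = z[0]
y = z[1]
del z

print(x, y)             # the swap itself worked
try:
    z
    clobbered = False
except NameError:       # but the original `z` binding is gone
    clobbered = True
print(clobbered)
```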
A temporary variable would need to be "fresh" (i.e. not clobber any existing variable). As far as I'm aware, there's no syntax for that in Python. What we can do is create a fresh scope, so that the temporary variable would merely shadow an existing binding rather than overwrite it. We can do that with a lambda and the new `:=` syntax:

(lambda z: (x := z[0], y := z[1]))((y, x))
However, this alters the semantics because the stack will be different. For example, we might have a class which forbids some attribute from being altered:

class A(object):
    def __init__(self, x):
        super(A, self).__setattr__('x', x)
    def __setattr__(self, name, value):
        if name == "x":
            raise Exception("Don't override 'x'")
        return super(A, self).__setattr__(name, value)
This will raise an exception if we try to swap two attributes:

>>> a = A('foo')
>>> a.y = 'bar'
>>> print(repr({'x': a.x, 'y': a.y}))
{'y': 'bar', 'x': 'foo'}
>>> a.x, a.y = (a.y, a.x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 6, in __setattr__
Exception: Don't override 'x'
If we replace this with the lambda version above, the exception will have a different stack trace, which we can catch and process in arbitrary ways. For example, maybe we know that the `foo` function will trigger these exceptions when given `A` objects, but it's a recoverable error. So we "ask for forgiveness instead of permission" by catching these exceptions somewhere, checking the stack trace to see if the Nth stack frame is `foo`, and aborting if it wasn't. If we "desugared" using the above lambda, the Nth stack frame of the exception would be a different function (`<lambda>` instead of `foo`) and hence such a program would abort.

On the one hand, that's a pretty crappy program. But on the other it demonstrates that "use a temporary variable" is not "easy" in the general case (which is what language implementations must handle).
> You need to use a temporary variable but then your example is easy.
Yes this example, of `if foo(x) and (x := bar()):`, would be easy with a temporary variable. But there are infinite variations we can make:
if foo(x) and (x := bar()):
if foo(x) or (x := bar()):
if (x := baz()) and foo(x) and (x := bar()):
if foo(x, y) and (x := bar()) and baz(x) and (y := quux()):
...
I fail to see how something is "just sugar" when desugaring it seems to require implementing a general-purpose compiler from "Python" to "Python without ':='".

But the overall question is: when is the sugar just syntactical, and at what point does it become a complete new taste?
"__iadd__" and "__add__" can do whatever they want.
There seem to be new semantics in the interaction with comprehensions, which is one of the main sources of controversy in the discussions linked as the OP.
I would suggest that if you can express the exact same semantics with a "few" more lines then it's just sugar.
In the case of x := y, it's always possible to rewrite the program with a "few" extra lines where it means the same thing. It's just combining the assignment and expose operations.
Unless you can provide an example where that isn't true, it's just sugar, i.e. unneeded, but maybe desired, syntax.
if '__add__' in a.__dict__:
    result = a.__add__(b)
    if result is not NotImplemented:
        return result
    if '__radd__' in b.__dict__:
        return b.__radd__(a)
elif '__radd__' in b.__dict__:
    return b.__radd__(a)
raise TypeError("unsupported operand type(s) for +: '{}' and '{}'".format(type(a), type(b)))
In particular, the runtime seems to directly look for '__add__' in the object's __dict__, rather than just trying to invoke `__add__`, so your `__getattribute__` method isn't enough to make it work. If you add an actual `__add__` method to A your example will work.

I agree. The important question is what we mean by "the exact same semantics". I would say that observational equivalence is the most appropriate; i.e. that no other code can tell that there's a difference (without performing unpredictable side-effects like parsing the contents of the source file). Python is a really difficult language for this, since it provides so many hooks for redefining behaviour. For example in many languages we could say that 'x + x' and 'x * 2' and 'x << 1' are semantically the same (they double 'x'), but in Python those are very different expressions, which can each invoke distinct, arbitrary code (a `__mul__` method, an `__add__` method, etc.). The fact they often do the same thing is purely a coincidence (engineered by developers who wish to remain sane).
It's fine if we only care about the 'black box' input/output behaviour, but at that point it no longer matters which language we're using; we could have something more akin to a compiler rather than desugaring into expressions from the same language.
> it's always possible to rewrite the program
There's an important distinction here too. Are we saying that "a semantically equivalent program exists"? That's a trivial consequence of Turing completeness (e.g. there's always an equivalent Turing machine; and an equivalent lambda calculus expression; and an equivalent Java program; etc.)
Are we saying that an algorithm exists to perform this rewriting? That would be more useful, since it tells us that Rice's theorem doesn't apply for this case (otherwise it might be impossible to tell if two programs are equivalent or not, due to the halting problem).
Are we saying that we know an algorithm which will perform this rewriting? This is the only answer which lets us actually run something (whether we call that an "elaborator", a "compiler", etc.). Yet in this case I don't know of any algorithm which is capable of rewriting Python involving `:=` into Python which avoids it. I think such an algorithm might exist, but I wouldn't be surprised if Python's dynamic 'hooks' actually make such rewriting impossible in general.
I certainly don't think that a local rewrite is possible, i.e. where we can swap out any expression of the form `x := y` without changing any other code, and keep the same semantics. If it is possible, I would say that such a local, observational equivalence preserving rewrite rule would qualify for the name "syntactic sugar".
> It's just combining the assignment and expose operations.
I'm not sure what you mean by "expose", and a search for "python expose" didn't come up with anything. It would be nice to know if I've missed out on some Python functionality!
What makes you say that? I would say it's crucial. Syntactic sugar is anything where we can say "Code of the form 'foo x y z...' is defined as 'bar x y z...'" where both forms are valid in the same language. Such a definition, by its very nature, gives us an automatic translation (look for anything of the first form, replace it with the second).
> It just means that in all cases a human can rewrite it without the new syntax and get the same semantics.
Yet that's so general as to be worthless. I'm a human and I've rewritten Java programs in PHP, but that doesn't make Java "syntactic sugar" for PHP.
I'm reminded of PHP, where (at least in version 5.*) we could write:
$myObject->foo = function() { return "hello world"; };
$x = $myObject->foo;
$x(); // Hello world
$myObject->foo(); // Error: no such method 'foo'
(Taken from an old comment https://news.ycombinator.com/item?id=8119419 )

How about integer arithmetic? That's the programming language Goedel used for his incompleteness theorems (specifically, he showed that the semantics of any formal logical system can be implemented in Peano arithmetic, using Goedel numbering as an example).
I wouldn't call that a useful definition though. There are reasons why we don't treat RAM as one giant binary integer.
https://docs.python.org/3/reference/datamodel.html#special-l...
Apparently the runtime is even more picky than I showed. The method has to be defined on the object's type, not in the object's instance dictionary. So, really the lookup is something like:
if hasattr(type(a), '__add__'):
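A quick way to see this: defining `__add__` on the instance has no effect on `+`, while defining it on the class does (a minimal sketch):

```python
class A(object):
    pass

a = A()
a.__add__ = lambda other: "instance add"     # instance attribute: `+` ignores it
try:
    a + a
    via_instance = "worked"
except TypeError:
    via_instance = "ignored"

A.__add__ = lambda self, other: "class add"  # attribute on the type: `+` uses it
via_class = a + a

print(via_instance, via_class)   # ignored class add
```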
The link I provided explains the rationale for bypassing the instance dictionary and the `__getattribute__` method.

re.match shouldn't return None at all. I often write helper functions like:
import re

matcher = lambda r, s: getattr(re.match(r, s), 'group', lambda i: '')
o = matcher(r1, s)(1) or matcher(r2, s)(3)

Here `matcher` has a fixed, static return type: string.
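A self-contained check of the helper; the patterns and input string are hypothetical:

```python
import re

# if re.match returns None, getattr falls back to a function that
# returns '' for any group index
matcher = lambda r, s: getattr(re.match(r, s), 'group', lambda i: '')

s = "abc 123"
hit = matcher(r'(\w+) (\d+)', s)(2)   # pattern matches: "123"
miss = matcher(r'(\d+) only', s)(1)   # no match: ""
print(repr(hit), repr(miss))
```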