Here's the most basic error handling to implement, which I learned from studying FParsec:
1. Keep a collection of errors in the stream. It represents errors at the current position.
2. Clear all errors when input is consumed.
3. When a token-like parser fails without consuming input, push an "expected '<literal>'" error.
4. When a labeled parser fails without consuming input, restore the error set as it was before running it, and push an "expected <label>" error instead.
This will naturally accumulate errors from choice and repetition combinators, as long as you don't have needless backtracking before the choice points.
So this already gives you reasonable errors like "foo:17:4: expected one of expression, assignment or 'end'". And it's easy to extend to nested errors, backtracking information or recoverable nodes.