I still think the best way to conceptualize a monad is as a "wrapper" around some data, where the wrapper also carries additional state. The critical part of this definition is the second half, which often gets skipped over for simplicity but is really what ties together all the disparate uses of monads.
The most convincing description of this idea imo is this video by Tsoding: https://www.youtube.com/watch?v=fCoQb-zqYDI
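
To make the "wrapper plus additional state" framing concrete, here's a rough sketch in Python (my own illustration, not taken from the video). The names `Logged`, `unit`, and `bind` are hypothetical; the extra state here is just a log that rides along with the value, and `bind` is the piece that threads that state through each step so the functions themselves never have to touch it.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class Logged(Generic[A]):
    """Hypothetical wrapper: a value plus extra state (a log) carried with it."""
    value: A
    log: Tuple[str, ...] = ()


def unit(value: A) -> Logged[A]:
    # Wrap a plain value with "empty" extra state.
    return Logged(value)


def bind(m: Logged[A], f: Callable[[A], Logged[B]]) -> Logged[B]:
    # Apply f to the inner value, then combine the old state with the
    # state f produced. This combining step is the "second half" of the
    # definition: the wrapper, not the caller, manages the state.
    result = f(m.value)
    return Logged(result.value, m.log + result.log)


def double(x: int) -> Logged[int]:
    return Logged(x * 2, (f"doubled {x}",))


def increment(x: int) -> Logged[int]:
    return Logged(x + 1, (f"incremented {x}",))


result = bind(bind(unit(5), double), increment)
print(result.value)  # 11
print(result.log)    # ('doubled 5', 'incremented 10')
```

Swap the log for "did this fail?" and you get something like Maybe; swap it for "the rest of the input" and you get a parser. The shape of `bind` stays the same, only the extra state changes.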