If I'm going to store in my database a user-entered string that will eventually make it into a webpage, should I html-escape it first?
I think the answer is no. Databases have no trouble storing arbitrary strings of data. Security problems occur only when we try to put that arbitrary string into an HTML page, so that is where the risk should be mitigated.
However, for some reason, I feel like the "best practice" is to escape everything *before* putting it into a database.
@email@example.com hard no. You don't know when you capture it what context it will be rendered in, so you don't know what the right rules are for serialising it in that context
Do you want to be sending emails/SMS to Mr O&lquo;Reilly?
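To make that concrete, here's a rough Python sketch of the same stored value going out to three different contexts. The function names and the `/users?name=` URL are made up for illustration; only `html.escape` and `urllib.parse.quote` are real standard-library calls.

```python
import html
from urllib.parse import quote

# Raw value, stored in the database exactly as the user typed it.
stored_name = "O'Reilly"

def render_html(name: str) -> str:
    # HTML context: escape with HTML rules at output time.
    return f"<p>Hello, Mr {html.escape(name)}</p>"

def render_sms(name: str) -> str:
    # Plain-text context (SMS, plain-text email): no HTML escaping wanted.
    return f"Hello, Mr {name}"

def render_url(name: str) -> str:
    # URL query context: percent-encoding, not HTML entities.
    return f"/users?name={quote(name)}"
```

Each context needs its own rules, which is why the stored value has to stay raw: if it had been HTML-escaped on the way in, the SMS and the URL would both carry HTML entities.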
@firstname.lastname@example.org @email@example.com if you're interested in Haskell you should definitely read https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/
@philipwhite I guess that's because you're trying to "sanitise user input", and doing that at user input time gives good locality.
Yes. At input time, sanitize (if the field is supposed to be only alphanumeric, make it alphanumeric). At output time, escape.
If the users want to enter raw html into the field, they should just know that it's going to get escaped.
Escaping before putting it into the database is just going to leave you with a bunch of double-escaping problems. (Double escaping is not best practice.)
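A quick sketch of how the double escaping happens, assuming the template layer also escapes on output (as most do by default). `what_the_reader_sees` is a hypothetical stand-in for the browser decoding entities once.

```python
import html

raw = "Tom & Jerry"

# Escaped on the way into the database...
escaped_at_input = html.escape(raw)

# ...and then escaped again by the template on the way out.
escaped_twice = html.escape(escaped_at_input)

def what_the_reader_sees(markup: str) -> str:
    # A browser decodes entities exactly once when rendering text.
    return html.unescape(markup)
```

With the raw string stored and a single escape at output, `what_the_reader_sees(html.escape(raw))` gives back `Tom & Jerry`. With the escaped string stored, the reader gets a literal `&amp;` on the page.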