What each number actually measures
- Words — runs of letters and digits, segmented in a locale-aware way via Intl.Segmenter.
- Characters — grapheme clusters, so emoji and combining marks count as one.
- No-space characters — useful for typography and for fitting ad copy.
- Sentences — split on ".", "!", or "?" followed by whitespace.
- Paragraphs — separated by one or more blank lines.
- Bytes (UTF-8) — the encoded size, which is what many API and database limits are measured in.
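The definitions above can be sketched in a few lines of JavaScript. This is an illustrative approximation, not the tool's actual code: the function name `countText`, its `locale` parameter, and the returned field names are hypothetical, and the regular expressions are one plausible reading of the sentence and paragraph rules.

```javascript
// Illustrative sketch; countText and its field names are assumptions.
function countText(text, locale = "en") {
  // Words: word-granularity segments that the segmenter flags as word-like,
  // which skips punctuation and whitespace segments.
  const wordSegmenter = new Intl.Segmenter(locale, { granularity: "word" });
  let words = 0;
  for (const segment of wordSegmenter.segment(text)) {
    if (segment.isWordLike) words += 1;
  }

  // Characters: one per grapheme cluster, so an emoji with a skin-tone
  // modifier or a letter plus combining mark counts once.
  const graphemeSegmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  let characters = 0;
  let noSpaceCharacters = 0;
  for (const { segment } of graphemeSegmenter.segment(text)) {
    characters += 1;
    // Assumption: "no-space" excludes every whitespace character, not only U+0020.
    if (!/^\s+$/u.test(segment)) noSpaceCharacters += 1;
  }

  // Sentences: split wherever ".", "!", or "?" is followed by whitespace.
  const sentences = text
    .split(/(?<=[.!?])\s+/u)
    .filter((part) => part.trim() !== "").length;

  // Paragraphs: separated by one or more blank lines.
  const paragraphs = text
    .split(/\n\s*\n/u)
    .filter((part) => part.trim() !== "").length;

  // Bytes: length of the UTF-8 encoding.
  const bytes = new TextEncoder().encode(text).length;

  return { words, characters, noSpaceCharacters, sentences, paragraphs, bytes };
}
```

Calling `countText("Hi there! How are you?\n\nFine.")` returns all six counts in one object. For text containing emoji or accented letters, `bytes` will generally be larger than `characters`, which is why the two are reported separately.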