Which hash function should you use?
The naive answer is “SHA-256.” It’s right about 70% of the time. The other 30% it’s wrong in one of two ways: needless overkill or dangerously weak, depending on what you’re hashing and why.
This is the practical decision tree, with the why behind each recommendation.
Step 1: are you hashing for security?
Two very different categories. Treating them as one is a recurring source of production incidents.
Non-security hashing: ETags, deduplication, content addressing, Bloom filters, fingerprints, cache keys, sharding. The goal is “two identical inputs produce identical hashes; two different inputs almost always produce different hashes.” An attacker is not in your threat model.
Security hashing: signatures, integrity verification against a malicious actor, password storage, message authentication. An attacker is actively trying to find collisions or recover the input.
If you’re not sure: when an attacker finds two inputs with the same hash, does that hurt you? If yes, you’re in security mode.
Non-security: pick by performance and ergonomics
For non-security work, MD5 is genuinely fine. So is SHA-1. So is the xxhash family. Pick by speed and convenience, not strength.
| Use case | Best pick | Why |
|---|---|---|
| ETag / cache key | MD5 or xxhash | Fast, fixed-width, low collision rate on real data |
| Deduplication of files | MD5 or SHA-1 | Existing tools (rsync, git) speak these |
| Content addressing in a trusted network | SHA-256 | Future-proof against scope expansion |
| Bloom filter / sharding hash | xxhash or murmur | Designed for speed, not crypto |
| Database row hash for change detection | xxhash or MD5 | Speed matters when scanning rows |
The case for “always SHA-256 even non-security” is conservatism: if your code’s threat model expands later, you don’t have to migrate. That’s a fine choice. Just don’t think faster algorithms are wrong — they’re optimised for the actual non-security workload.
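As a rough illustration of the non-security case, here is a minimal sketch of an ETag derived from a response body using Node's built-in `crypto` module. The helper name `etagFor` is made up for this example, and MD5 is used purely as a fast fingerprint, not as a security control.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: derive an ETag value from a response body.
// MD5 acts as a fingerprint here; nothing relies on it resisting an attacker.
function etagFor(body) {
  const digest = crypto.createHash("md5").update(body).digest("hex");
  return `"${digest}"`; // ETag values are quoted strings in HTTP
}

console.log(etagFor("hello world")); // "5eb63bbbe01eeed093cb22bb8f5acdc3"
```

Identical bodies always yield the same ETag, which is all a cache validator needs. Swapping `"md5"` for `"sha256"` is a one-word change if the threat model grows later.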
Security: pick by what’s at stake
General-purpose security hashing
SHA-256 is the right default. It is used by HTTPS certificates, JWT HS256/RS256, Git’s SHA-256 transition, Bitcoin, and most modern code signing. No practical collision or preimage attack is known, and it is broadly supported.
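A typical use is checking downloaded bytes against a digest you obtained over a trusted channel. The sketch below assumes Node's built-in `crypto` module; the function names `sha256Hex` and `matchesPublishedDigest` are illustrative, not part of any library.

```javascript
const crypto = require("node:crypto");

// Hex-encoded SHA-256 digest of a string or Buffer.
function sha256Hex(data) {
  return crypto.createHash("sha256").update(data).digest("hex");
}

// Hypothetical check: does the data match a digest published by the author?
// The expected digest is public, so a plain string comparison is enough here.
function matchesPublishedDigest(data, expectedHex) {
  return sha256Hex(data) === expectedHex.toLowerCase();
}

console.log(sha256Hex("abc"));
// ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```

This only protects you if the expected digest itself arrives through a channel the attacker cannot modify.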
Reach for SHA-512 when:
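- You’re on 64-bit hardware and hashing is hot (in pure software SHA-512 is typically up to about 1.5× faster than SHA-256; CPUs with dedicated SHA-256 instructions usually reverse that, so measure first).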
- You want extra security margin (e.g., 30-year archival).
- The output size doesn’t matter (machine-to-machine, not human-readable).
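Whether SHA-512 actually wins depends on your CPU and runtime, so it is worth measuring before switching. The following is a rough micro-benchmark sketch using Node's built-in `crypto` module; the payload size and round count are arbitrary choices, and `timeHash` is a name invented for this example.

```javascript
const crypto = require("node:crypto");

// Time `rounds` hashes of `buf` with the named algorithm; returns elapsed milliseconds.
function timeHash(algorithm, buf, rounds) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < rounds; i++) {
    crypto.createHash(algorithm).update(buf).digest();
  }
  return Number(process.hrtime.bigint() - start) / 1e6;
}

const payload = crypto.randomBytes(1 << 20); // 1 MiB of random data (arbitrary size)
for (const algorithm of ["sha256", "sha512"]) {
  console.log(algorithm, timeHash(algorithm, payload, 50).toFixed(1), "ms");
}
```

Treat the numbers as a hint only: results vary between machines, and a single run like this is noisy.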
Reach for BLAKE3 when:
- You need maximum throughput on parallelisable inputs (multi-GB files).
- You can ship a non-stdlib dependency (most stdlibs don’t include BLAKE3 yet).
- Tree hashing or streaming verification is needed.
Avoid for new security work:
- MD5 — collision attacks since 2004; can produce two distinct files with the same MD5 in seconds.
- SHA-1 — broken in 2017 (SHAttered: two distinct PDFs, same SHA-1).
- SHA-3 — not insecure, but no compelling reason over SHA-256 unless mandated.
Password storage — completely different problem
Do not use any of the above for passwords.
The reason: SHA-256 is fast — that’s what makes it good for verification. But for password storage, fast is bad. An attacker who exfiltrates your hash database can try billions of passwords per second on a GPU, breaking weak passwords in hours.
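To put rough numbers on that, here is a back-of-envelope calculation. Both guess rates are illustrative assumptions rather than measurements: 10^10 guesses per second for a fast hash on a GPU, and 10^4 per second for a deliberately slow password hash. The search space is every 8-character password drawn from lowercase letters and digits.

```javascript
// Seconds needed to try every password of the given length and alphabet.
function secondsToExhaust(alphabetSize, length, guessesPerSecond) {
  return Math.pow(alphabetSize, length) / guessesPerSecond;
}

const fast = secondsToExhaust(36, 8, 1e10); // assumed rate for a fast hash on a GPU
const slow = secondsToExhaust(36, 8, 1e4);  // assumed rate for a slow password hash
console.log((fast / 60).toFixed(1), "minutes vs", (slow / 86400 / 365).toFixed(1), "years");
// 4.7 minutes vs 8.9 years
```

The password is the same in both cases. Only the cost per guess changed.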
Password-hashing functions are intentionally slow and have built-in salt:
| Function | Standard | Notes |
|---|---|---|
| Argon2id | OWASP 2023 default | Best choice for new code; memory-hard, side-channel resistant |
| scrypt | RFC 7914 | Strong; widely supported, simpler tuning |
| bcrypt | 1999, still solid | Battle-tested; ubiquitous library support |
Pick Argon2id if your platform has it. Pick bcrypt if it doesn’t and you need a sure thing. Avoid PBKDF2-SHA256 unless you’re pinned to a compliance regime that requires it; it is not memory-hard, which makes it the weakest of the modern options against GPU cracking.
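As a sketch of what this looks like in practice, the example below uses scrypt, chosen here only because Node's built-in `crypto` module ships it; an Argon2id or bcrypt library would follow the same hash-then-verify shape. The `salt:key` storage format and the function names are assumptions made for illustration, and the cost parameters are Node's defaults, which you should tune for your own hardware.

```javascript
const crypto = require("node:crypto");

// Hypothetical storage format: "<salt hex>:<derived key hex>".
function hashPassword(password) {
  const salt = crypto.randomBytes(16);                   // fresh random salt per password
  const derived = crypto.scryptSync(password, salt, 32); // Node defaults: N=16384, r=8, p=1
  return `${salt.toString("hex")}:${derived.toString("hex")}`;
}

function verifyPassword(password, stored) {
  const [saltHex, keyHex] = stored.split(":");
  const expected = Buffer.from(keyHex, "hex");
  const actual = crypto.scryptSync(password, Buffer.from(saltHex, "hex"), expected.length);
  return crypto.timingSafeEqual(actual, expected);       // constant-time comparison
}

const record = hashPassword("correct horse battery staple");
console.log(verifyPassword("correct horse battery staple", record)); // true
console.log(verifyPassword("wrong guess", record));                  // false
```

Because the salt is random, hashing the same password twice produces two different records, which is what defeats precomputed tables.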
Message authentication — use HMAC, not plain hash
If you’re proving “this message wasn’t tampered with and was sent by someone with the secret,” use HMAC-SHA256, not plain SHA-256.
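Why? Plain hash + secret, as in SHA-256(secret + message), is open to length-extension attacks for Merkle–Damgård hashes, and that family includes MD5, SHA-1, SHA-256 and SHA-512. An attacker who sees one valid tag can append data and compute a valid tag for the longer message without knowing the secret. SHA-3 and BLAKE3 are not affected, but HMAC is the right primitive for keyed authentication whichever hash sits underneath.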
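```javascript
// Web Crypto API: import the shared secret as a non-extractable HMAC key.
// secretBytes and messageBytes are Uint8Arrays supplied by the caller.
const key = await crypto.subtle.importKey(
  "raw", secretBytes,
  { name: "HMAC", hash: "SHA-256" },
  false, ["sign", "verify"]
);
// sig is an ArrayBuffer holding the 32-byte HMAC-SHA256 tag.
const sig = await crypto.subtle.sign("HMAC", key, messageBytes);
```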
This is the same operation that JWT HS256 uses under the hood. See jwt.tooljo.com for the JWT-specific UI.
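The snippet above only signs. On the receiving side you recompute the tag and compare it in constant time. Here is a minimal server-side sketch using Node's built-in `crypto` module instead of Web Crypto; the names `signMessage` and `verifyMessage` are invented for this example.

```javascript
const crypto = require("node:crypto");

// Compute the HMAC-SHA256 tag of a message as a Buffer.
function signMessage(secret, message) {
  return crypto.createHmac("sha256", secret).update(message).digest();
}

// Recompute the tag and compare without leaking timing information.
function verifyMessage(secret, message, tag) {
  const expected = signMessage(secret, message);
  // timingSafeEqual throws on length mismatch, so check the length first.
  return tag.length === expected.length && crypto.timingSafeEqual(tag, expected);
}

const tag = signMessage("key", "The quick brown fox jumps over the lazy dog");
console.log(tag.toString("hex"));
// f7bc83f430538424b13298e6aa6fb143ef4d59a14946175997479dbc2d1a3cd8
console.log(verifyMessage("key", "The quick brown fox jumps over the lazy dog", tag)); // true
```

Comparing tags with `===` on hex strings would work functionally, but it can leak how many leading characters matched, so the constant-time comparison is the safer habit.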
The decision tree, condensed
Are you hashing a password or password-like secret?
├── YES → Argon2id (preferred), scrypt, or bcrypt. Stop reading.
└── NO ↓
Is an attacker in your threat model?
├── NO → MD5 / SHA-1 / xxhash all fine. Pick by speed.
└── YES ↓
Does the hash need to authenticate (prove a sender)?
├── YES → HMAC-SHA256. Don't use plain hash + secret.
└── NO ↓
Default to SHA-256.
Reach for SHA-512 if 64-bit perf matters.
Reach for BLAKE3 if extreme throughput matters.
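The tree above can also be written as a small function, which is handy in a code-review checklist or a lint rule. This is only a sketch: the function name and option names are hypothetical, and the SHA-512 and BLAKE3 refinements are left out to keep it short.

```javascript
// Hypothetical helper mirroring the decision tree; option names are illustrative.
function recommendHash({
  isPassword = false,
  attackerInThreatModel = false,
  needsAuthentication = false,
} = {}) {
  if (isPassword) return "Argon2id (or scrypt / bcrypt)";
  if (!attackerInThreatModel) return "MD5 / SHA-1 / xxhash";
  if (needsAuthentication) return "HMAC-SHA256";
  return "SHA-256";
}

console.log(recommendHash({ attackerInThreatModel: true, needsAuthentication: true }));
// HMAC-SHA256
```

The order of the checks matters: the password question comes first because it overrides everything else.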
Common mistakes I keep seeing
- “We use SHA-256(password)” — vulnerable to GPU brute force. Use a password-hashing function.
- “We use MD5 to verify the upload was complete” — fine, but call it a checksum, not security. This avoids the next dev assuming it’s tamper-proof.
- “We use SHA-1 because Git does” — Git is migrating away. Don’t tie new code to the laggard.
- “We use SHA-256(salt + password)” instead of bcrypt — still vulnerable to GPU brute force. The salt prevents rainbow tables but not parallel guessing. The slowness of bcrypt/argon2 is doing the work.
- “We use HMAC-MD5” — actually still considered safe (HMAC’s construction blunts the underlying hash’s weaknesses), but you should still upgrade to HMAC-SHA256 for forward-compatibility.
Try the tools
- hash.tooljo.com — hash text or files with all common algorithms.
- jwt.tooljo.com/verify — for HMAC-SHA256 in the JWT context.
- base64.tooljo.com — for transporting raw hash bytes.
The math behind hashing is fascinating. The decision of which hash to use is mostly about correctly classifying the threat model. Get that right and the choice is usually obvious.