Base58 Encoding in Bitcoin

Base58 is a binary-to-text encoding scheme used in Bitcoin to represent data (e.g., addresses) with fewer symbols, avoiding visually ambiguous characters. It is derived from Base64 but removes the following characters:
0 (zero), O (uppercase o), l (lowercase L), I (uppercase i), +, and /.


Key Steps in Base58 Encoding

1. Add a Prefix

  • A prefix (version byte) is prepended to the data to identify its type.
    Common prefixes include:
    • 0x00 for Pay-to-Public-Key-Hash (P2PKH) addresses.
    • 0x05 for Pay-to-Script-Hash (P2SH) addresses.

2. Compute Checksum

  1. Hash the prefix + payload twice using SHA256: hash = SHA256(SHA256(prefix + payload))

  2. Take the first 4 bytes of the resulting 32-byte hash. This serves as the checksum.

3. Concatenate Components

Combine the prefix, payload, and checksum into a single byte sequence: final_data = prefix + payload + checksum

4. Base58 Encode

Encode the final_data using Base58 encoding.

  • Base58 alphabet: 123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz

Example Structure

For a P2PKH address: [0x00 (1 byte)] + [Public Key Hash (20 bytes)] + [Checksum (4 bytes)]

Why Use Base58?

  • User-friendly: Removes ambiguous characters to prevent errors in manual entry.
  • Error detection: The checksum allows validation of data integrity.
  • Compact representation: Encodes data in fewer characters than hexadecimal.

Summary

Bitcoin uses Base58Check encoding (Base58 with checksum) for addresses:

  1. Add a prefix to identify data type.
  2. Compute a checksum via double SHA256 hashing.
  3. Combine and encode in Base58.

Challenge

1.Given a base58 tprv, extract the private key and chaincode

ECDSA (Elliptic Curve Digital Signature Algorithm)

ECDSA is a cryptographic algorithm used for digital signatures. It relies on the mathematics of elliptic curves to provide security.

Key Generation

  1. A point G on the elliptic curve is chosen and agreed upon.
  2. To generate a private key, a random number k is chosen such that 1 <= k <= n-1, where n is the order of the base point G. k is typically 32 bytes long.
  3. The public key K is computed by multiplying G by k: [ K = G \cdot k ] Since K is a point on the curve, it has x and y coordinates, making it 64 bytes long (32 bytes for x and 32 bytes for y).

Discrete Logarithm Problem

You cannot determine k from G and K because division is not defined on elliptic curves. This is known as the Discrete Logarithm Problem, which forms the foundation of elliptic curve cryptography's security.


Compressed Public Keys

The public key is typically 64 bytes long. An additional prefix byte is added, making the total size 65 bytes. However, the public key can be compressed to reduce its size.

How Compression Works

  • A public key is a point (x, y) on the elliptic curve.
  • If we know x, we can calculate y using the curve's equation. For example, in Bitcoin, the curve equation is: [ y^2 \mod p = (x^3 + 7) \mod p ]
  • Instead of storing both x and y, we can store only the x coordinate (32 bytes) and a prefix byte, making the public key 33 bytes long. This reduces the size by almost 50%.

Prefix Bytes

  • Uncompressed Public Keys: Have a prefix of 04. This indicates that both x and y coordinates are included.
  • Compressed Public Keys: Have a prefix of either 02 or 03. The prefix depends on whether the y coordinate is positive (02) or negative (03).

Mermaid Diagrams

Key Generation Process

graph TD
    A[Choose Base Point G] --> B[Generate Private Key k]
    B --> C[Compute Public Key K = G * k]
    C --> D[Public Key K is a point on the curve with x and y coordinates]

Compressed vs Uncompressed Public Keys

graph TD
    A[Public Key] --> B{Compressed?}
    B -->|Yes| C[Store x , Prefix 02 or 03]
    B -->|No| D[Store x, y, and Prefix 04]

Summary

  • Private Key: A random number k (32 bytes).
  • Public Key: A point K = G * k on the elliptic curve (64 bytes uncompressed, 33 bytes compressed).
  • Compression: Reduces public key size by storing only the x coordinate and a prefix byte.
  • Prefixes:
    • 04: Uncompressed public key.
    • 02 or 03: Compressed public key (depending on the sign of y).

SegWit (Segregated Witness)

🔹 What is SegWit?

  • SegWit = Segregated Witness
  • It was a Bitcoin upgrade in 2017.
  • Separates signature (witness data) from transaction data.
  • Helps fix transaction malleability and improves Bitcoin scalability.

🔹 Why Was SegWit Needed?

✅ Transaction Malleability Fix

  • Before SegWit, txid (transaction ID) included the signature.
  • Attackers could modify the signatureChange txid without changing the transaction.
  • This caused issues for Lightning Network and Bitcoin apps.
  • SegWit removes signatures from the txid calculation, fixing this issue.

✅ More Transactions per Block

  • Transactions become smallerMore transactions fit in a block.
  • Lower fees and faster confirmations.

🔹 How SegWit Works

  • Moves the signature (witness) data outside the main transaction.
  • New txid = Only inputs + outputs (excludes witness data).
  • Prevents txid changes even if the signature is modified.

🔹 Address Formats After SegWit

1️⃣ P2SH (Pay-to-Script-Hash)

  • Starts with 3 (e.g., 3QJmV3qfvL9SuYo34YihAf3sRCW3qSinyC).
  • Legacy way of using SegWit inside a standard script hash.

2️⃣ P2WSH (Pay-to-Witness-Script-Hash)

  • Starts with bc1q (e.g., bc1qxy2kgdygjrsqtzq2n0yrf2493p83kkfjhx0wlh).
  • Fully native SegWit, lower fees.

🔹 Benefits of SegWit

✅ Fixes transaction malleability (prevents txid changes).
More transactions per block (improves scalability).
Lower transaction fees.
✅ Enables Lightning Network (for faster, cheaper payments).


🔹 SegWit Adoption

  • SegWit is optional → Not all wallets/exchanges use it.
  • Most wallets today support SegWit (e.g., Ledger, Trezor, Electrum).
  • Some users still use Legacy (non-SegWit) addresses, leading to higher fees.

Bech32m Encoding

An address is an encoded output script.


Structure of a Bech32m Address

Components:

  1. Human-Readable Part (HRP):
    • Prefix agreed by the network (e.g., bc for Bitcoin mainnet, tb for testnet).
  2. Separator: 1
  3. Data Part:
    • Witness Version
    • Witness Program
    • Checksum
flowchart TD
    A["HRP (e.g., bc)"] --> B["Separator 1"]
    B --> C["Witness Version (e.g., q/p)"]
    C --> D["Witness Program (20-40 bytes)"]
    D --> E["Checksum (6 chars)"]

Data Part Details

1. Witness Version

  • Encoded as 1 character (q = SegWit v0, p = SegWit v1).
  • Values: 0–16 (17 possible versions).

2. Witness Program

  • Length: 2–40 bytes.
    • SegWit v0: 20 or 32 bytes (P2WPKH/P2WSH).
    • SegWit v1 (Taproot): 32 bytes.

3. Checksum

  • 6 characters generated using a BCH code (error detection only).

Address Construction

P2WPKH (SegWit v0)

  1. HRP: bc (mainnet) or tb (testnet).
  2. Witness Version: q (version 0).
  3. Witness Program: RIPEMD160(SHA256(public_key)) (20 bytes).
  4. Checksum: BCH-generated.
flowchart LR
    A[Public Key] --> B[SHA256] --> C[RIPEMD160] --> D[Witness Program]
    D --> E[bc1 + q + Witness Program + Checksum]

Pay to Witness Script Hash (P2WSH) Output

For the pay to witness script hash (P2WSH) output, we don’t use the P2SH algorithm. Instead, we take the script, pass it into a SHA256 hash function, and use the 32-byte digest of that function in the witness program.

For P2SH, the SHA256 digest was hashed again with RIPEMD-160, but that may not be secure in some cases; for details, see P2SH Collision Attacks.

A result of using SHA256 without RIPEMD-160 is that P2WSH commitments are 32 bytes (256 bits) instead of 20 bytes (160 bits).


Pay to Taproot (P2TR) Output

For the pay-to-taproot (P2TR) output, the witness program is a point on the secp256k1 curve.

It may be a simple public key, but in most cases, it should be a public key that commits to some additional data.

Private Key Formats

A private key can be represented in different formats, all corresponding to the same 256-bit number.

Common Private Key Formats

TypePrefixDescription
HexNone64 hexadecimal digits
WIF5Base58check encoding (base58 with version prefix of 128 and 32-bit checksum)
WIF-compressedK or LSame as WIF but with added suffix 0x01 before encoding

Modern Relevancy of Private Key Formats

  • Early Bitcoin wallets generated independent private keys and required frequent backups.
  • Modern wallets use deterministic wallets, generating all private keys from a single seed value.
  • Deterministic wallets only need a single backup.
  • Security concern: If an attacker acquires a single exported key plus some wallet metadata, they may derive all keys in the wallet.
  • No modern wallets support individual key import/export due to security risks.

Example: Same Key in Different Formats

FormatPrivate Key
Hex1e99423a4ed27608a15a2616a2b0e9e52ced330ac530edcc32c8ffc6a526aedd
WIF5J3mBbAH58CpQ3Y5RNJpUKPE62SQ5tfcvU2JpbnkeyhfsYB1Jcn
WIF-compressedKxFC1jmwwCoACiCAWZ3eXa96mBM6tb3TYzGmf6YwgdGWZgawvrtJ

Each format represents the same key in different ways and can be converted into another format easily.


Compressed Private Keys

  • Misnomer: A "compressed private key" is actually longer than an uncompressed private key.
  • The extra byte (0x01) signals that only compressed public keys should be derived from it.
  • Modern wallets only export WIF-compressed private keys (K or L prefix).
  • Old wallets export WIF (uncompressed) keys (5 prefix).

Example: WIF vs WIF-Compressed

FormatPrivate Key
Hex1E99423A4ED27608A15A2616A2B0E9E52CED330AC530EDCC32C8FFC6A526AEDD
WIF5J3mBbAH58CpQ3Y5RNJpUKPE62SQ5tfcvU2JpbnkeyhfsYB1Jcn
Hex-compressed1E99423A4ED27608A15A2616A2B0E9E52CED330AC530EDCC32C8FFC6A526AEDD01
WIF-compressedKxFC1jmwwCoACiCAWZ3eXa96mBM6tb3TYzGmf6YwgdGWZgawvrtJ

Vanity Addresses

  • Vanity addresses are Bitcoin addresses with custom patterns (e.g., 1LoveBPzzD...).
  • They require generating and testing billions of private keys until a match is found.
  • Security: They are as secure as regular addresses but require brute-force computation.
  • Privacy Risk: They encourage address reuse, which can reduce financial privacy.

Example: Vanity Address Search Times

LengthPatternProbabilityAvg. Search Time (Desktop)
11K1 in 58<1 ms
41Kids1 in 11M1 min
71KidsCha1 in 2.2T3-4 months
101KidsCharit1 in 400Q46,000 years
  • Specialized GPUs are required for patterns longer than 7 characters.
  • Vanity pools exist, where miners generate vanity addresses for a fee.

Paper Wallets (OBSOLETE!)

Warning: Paper wallets are obsolete and dangerous due to security risks.

  • Private keys printed on paper for offline storage.
  • Often include the corresponding Bitcoin address.
  • Risks:
    • Many were compromised by malicious generators.
    • Easily lost, stolen, or destroyed.
    • Should not be used for storing Bitcoin.
    • Better alternative: Hardware wallets & seed phrase backups.

Summary

  • Private keys can be stored in different formats, with WIF and WIF-compressed being common.
  • Modern wallets use deterministic seeds and do not allow individual key import/export.
  • Compressed keys refer to public keys, not private keys.
  • Vanity addresses are impractical for long patterns but were popular in Bitcoin’s early days.
  • Paper wallets are insecure and should be avoided.

For safe Bitcoin storage, use deterministic wallets and hardware devices instead of exporting/importing private keys.

Deterministic Key Generation and HD Wallets

Hash Functions and Deterministic Key Generation

  • A hash function always produces the same output for the same input, but even a slight change in input results in a completely different output.
  • A cryptographically secure hash function prevents prediction of the output even if the new input is known.
  • A seed (random value) can be used to deterministically generate a sequence of derived values.

Example of Deterministic Key Generation:

# Generate entropy (random value)
$ dd if=/dev/random count=1 status=none | sha256sum
f1cc3bc03ef51cb43ee7844460fa5049e779e7425a6349c8e89dfbb0fd97bb73  -

# Set the seed
$ seed=f1cc3bc03ef51cb43ee7844460fa5049e779e7425a6349c8e89dfbb0fd97bb73

# Generate deterministic values
$ for i in {0..2} ; do echo "$seed + $i" | sha256sum ; done

Deterministic Wallets

If these derived values are used as private keys, they can always be regenerated using the seed.
A deterministic wallet can be backed up by storing just the seed and the key generation algorithm.

Public Child Key Derivation

Elliptic Curve Cryptography (ECC) allows deriving a public key (K) from a private key (k) using a generator point (G):

[ K = k \times G ]

A child key pair can be created by adding the same value to both sides:

[ K + (123 \times G) = (k + 123) \times G ]

Key tweaks allow generating child public keys without knowledge of the private key.
Bob can generates multiple public keys from one public key by adding a constant c to the public key. If I know the constant c he added, I'll be able to derive the corresponding private key.

This is useful for separating frontend wallet applications from signing devices (e.g., hardware wallets).

Hierarchical Deterministic (HD) Wallets (BIP32)

HD wallets use a tree structure instead of a linear sequence of keys.
Each key can be a parent of multiple child keys, enabling:

  • Separation of receiving and change addresses.
  • Organizational structure (e.g., departments, subsidiaries).

There is no limit on the depth of the key tree.

Seeds and Recovery Codes

HD wallets derive all private keys from a single seed.
Losing access to the seed means losing access to all associated private keys.

Recovery codes use human-readable words for easy backup (e.g., BIP-39 mnemonics).

Example of a Seed Encoded in Hex and Words:

  • Hex-encoded:
    0C1E 24E5 9177 79D2 97E1 4D45 F14E 1A1A

  • Word-encoded:
    army van defense carry jealous true garbage claim echo media make crunch

    We create a bunch of random bytes, use an algorthim to turn them into human readable words and use another algorithm to turn that into a seed which using a seed algorithm can be used to create a bunch of keys.

Risks of Memorizing Recovery Codes:

  • Memory loss results in permanent loss of funds.
  • Physical coercion can force disclosure of the code.
  • Writing down the recovery code is highly recommended.

Conclusion

  • Deterministic key generation ensures that private keys can always be recreated from a seed.
  • Public key derivation allows wallets to distribute public keys securely.
  • HD wallets (BIP32) enhance security and scalability by structuring keys hierarchically.
  • Recovery codes simplify backups but must be stored securely to prevent loss or theft.

BIP-32: Hierarchical Deterministic Wallets

Overview

BIP-32 (Bitcoin Improvement Proposal 32) defines Hierarchical Deterministic (HD) wallets, allowing the generation of an entire tree of keys from a single master seed.

Key Benefits of BIP-32

  • A single seed can derive many keys.
  • Public keys can be derived without exposing private keys.
  • Supports hierarchical organization of keys.
  • Enables watch-only wallets.

BIP-32 Key Derivation Process

graph TD;
    A[Random Entropy] -->|BIP-39| B[Seed]
    B -->|HMAC-SHA512| C[Master Private Key & Master Chain Code]
    C --> D[Master Public Key]
    
    subgraph Child Key Derivation
        E[Parent Private Key & Parent Chain Code & Parent Public Key] -->|Index + HMAC-SHA512| F[New Child Private Key & New Child Chain Code]
        F --> G[New Child Public Key]
        F & G -->|Used as Parent for Next Level| E
    end
    
    C -->|Master becomes Parent| E
    D -->|Master Public Key Used| E

Master Key Generation

  1. Start with a BIP-39 Seed.
  2. Compute HMAC-SHA512 with the key "Bitcoin seed" and the seed as input.
  3. The output is 512 bits:
    • Left 256 bitsMaster Private Key
    • Right 256 bitsMaster Chain Code
  4. The Master Public Key is derived using the secp256k1 curve.

Child Key Derivation

BIP-32 allows two types of child key derivation:

  1. Normal (Non-Hardened) Child Keys (Index 0 to 2³¹ - 1)
  2. Hardened Child Keys (Index 2³¹ to 2³² - 1)

Normal Child Key Derivation

graph TD;
    P[Parent Public Key] -->|Index + Chain Code| HMAC[HMAC-SHA512]
    HMAC -->|Left 256 bits IL| ADD[IL + Parent Private Key mod n]
    ADD --> C[Child Private Key]
    HMAC -->|Right 256 bits IR| CC[New Chain Code]
    C --> CP[Child Public Key]

  • Uses parent public key, index, and parent chain code.
  • Formula:
    HMAC-SHA512(parent chain code, parent public key || index)
    
  • Left 256 bits (IL) modifies the private key:
    child_private_key = (IL + parent_private_key) mod n
    
  • Right 256 bits (IR) becomes the new child chain code.
  • Allows public key derivation without needing the private key.

Hardened Child Key Derivation

graph TD;
    PP[Parent Private Key] -->|Index + Chain Code| HMAC_H[HMAC-SHA512]
    HMAC_H -->|Left 256 bits IL| ADD_H[IL + Parent Private Key mod n]
    ADD_H --> C_H[Child Private Key]
    HMAC_H -->|Right 256 bits IR| CC_H[New Chain Code]
    C_H --> CP_H[Child Public Key]
  • Uses parent private key, index, and parent chain code.
  • Formula:
    HMAC-SHA512(parent chain code, parent private key || index)
    
  • The left 256 bits (IL) tweak the private key:
    child_private_key = (IL + parent_private_key) mod n
    
  • This prevents public key-based attacks that could leak the parent private key.

Summary

FeatureNormal DerivationHardened Derivation
Uses Public Key✅ Yes❌ No (Uses Private Key)
Private Key Needed?❌ No✅ Yes
Risk of Parent Key Leak⚠️ Possible✅ Secure
Index Range0 to 2³¹ - 12³¹ to 2³² - 1

Extended private keys(xprv) and Extended public keys (xpub)

xprv- parent private key + parent chaincode

xpub- parent public key + parent chain code. This can only create public keys.

It is perfect for watch only wallets.