Database Schema

The database uses a long (melted) format also known as EAV (Entity-Attribute-Value). This design was chosen because different companies report different metrics, making a wide-column approach impractical at scale.

Schema: finance

All tables live in the finance schema on Neon PostgreSQL.

active_tickers

Tracks which of the 106K tickers are valid/active on Yahoo Finance.

Column

Type

Description

ticker (PK)

VARCHAR(20)

Stock ticker symbol (e.g., “AAPL”, “MSFT”)

name

VARCHAR

Company name

exchange

VARCHAR

Exchange code (e.g., “NYQ”, “NMS”)

category_name

VARCHAR

Industry category

country

VARCHAR

Country of origin

is_active

BOOLEAN

Whether Yahoo API returns data for this ticker

upsert_datetime

TIMESTAMP

When this record was last updated

Financial Statement Tables

All four tables (income_stmt, cash_flow, balance_sheet, financials) share the same schema:

Column

Type

Description

ticker

VARCHAR

Stock ticker symbol

frequency

VARCHAR

“annual” or “quarterly”

report_date

DATE

Financial report date

metric

VARCHAR

Metric name (e.g., “annualTotalRevenue”)

value

DOUBLE PRECISION

Numeric value (parsed from B/M/K suffixes)

insert_datetime

TIMESTAMP

When this row was inserted

ER Diagram

┌──────────────────────┐
│   active_tickers     │
├──────────────────────┤        ┌──────────────────────┐
│ ticker (PK)          │───────▶│   income_stmt        │
│ name                 │        ├──────────────────────┤
│ exchange             │        │ ticker               │
│ category_name        │        │ frequency            │
│ country              │        │ report_date          │
│ is_active            │        │ metric               │
│ upsert_datetime      │        │ value                │
└──────────────────────┘        │ insert_datetime      │
          │                     └──────────────────────┘
          │
          ├────────────────────▶ cash_flow (same schema)
          │
          ├────────────────────▶ balance_sheet (same schema)
          │
          └────────────────────▶ financials (same schema)

Example Queries

Get latest revenue for all tickers:

SELECT ticker, report_date, value
FROM finance.income_stmt
WHERE metric = 'annualTotalRevenue'
ORDER BY ticker, report_date DESC;

Find top 10 companies by net income:

SELECT DISTINCT ON (ticker) ticker, value
FROM finance.income_stmt
WHERE metric = 'annualNetIncome'
  AND frequency = 'annual'
ORDER BY ticker, report_date DESC, value DESC
LIMIT 10;

Count tickers per exchange:

SELECT exchange, COUNT(*) as count
FROM finance.active_tickers
WHERE is_active = true
GROUP BY exchange
ORDER BY count DESC;