Open hierarchical industry classification for digital assets, validated against daily returns.
Live demo — interactive sunburst, search, asset cards
A community-maintained, hierarchical industry classification for digital assets, designed for cross-sectional risk decomposition, sector-neutral portfolio construction, and peer-group analysis. The hierarchy is informed by established institutional classification methodologies; see methodology.md for citations.
| File | Description |
|---|---|
| `taxonomy.yaml` | Source of truth: class / sector / sub-sector definitions and codes |
| `classification/snapshot.csv` | Current classification of every covered asset |
| `classification/wide/<field>.parquet` | (date × asset) matrices, drop-in for `group_neut` style operations |
| `classification/long/panel.parquet` | (date, asset_id, codes) long form for SQL warehouses |
| `decisions/` | One file per non-trivial classification decision (the audit trail) |
| `methodology.md` | The rulebook: how classifications are made |
| `validation.md` | Empirical evidence the classification co-moves on daily returns |
| `UNIVERSE.md` | Coverage universe: selection criteria, exclusions, and graduation rules |
| `GOVERNANCE.md` | Maintainer council, PR merge thresholds, conflict-of-interest policy, appeals |
| `SCHEMA.md` | Dtype contract for all published fields; Int64 numpy-safety warning |
| `CHANGELOG.md` | Version history; v1.1 reclassifications.csv forward contract |
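
The long and wide layouts carry the same information in different shapes. As a rough sketch of how they relate, the snippet below pivots a small synthetic long panel into a (date × asset) matrix and melts it back. The column names `date`, `asset_id`, and `sector_code`, the asset symbols, and the code values are assumptions made for illustration; SCHEMA.md defines the actual field contract.

```python
import pandas as pd

# Synthetic long panel; column names and codes are illustrative assumptions,
# not the published schema.
long_panel = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2025-01-14", "2025-01-14", "2025-01-15", "2025-01-15"]
        ),
        "asset_id": ["AAA", "BBB", "AAA", "BBB"],
        "sector_code": [10, 20, 10, 30],
    }
)

# Long -> wide: one row per date, one column per asset, cells = sector code.
wide = long_panel.pivot(index="date", columns="asset_id", values="sector_code")

# Wide -> long: melt the matrix back into (date, asset_id, sector_code) rows.
back = wide.reset_index().melt(
    id_vars="date", var_name="asset_id", value_name="sector_code"
)
```

The wide form suits vectorised cross-sectional operations; the long form is easier to join against other tables in a warehouse.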
```python
import pandas as pd

# Wide matrix: dates × asset_ids, cells = integer sector code
sector = pd.read_parquet(
    "https://raw.githubusercontent.com/quantbai/crypto-sectors/main/classification/wide/sector_code.parquet"
)

# Or just the latest snapshot for human reading
snapshot = pd.read_csv(
    "https://raw.githubusercontent.com/quantbai/crypto-sectors/main/classification/snapshot.csv"
)
print(snapshot.head())
```

For a sector-neutralization example:
```python
import datetime

import pandas as pd

# `alpha` is assumed to be your own (date × asset) DataFrame of signal values,
# with asset ids as columns; it is not provided by this repository.
date = datetime.date(2025, 1, 15)

# Note: cells before effective_from (2024-05-23) are NaN; a sane backtest
# starts on or after that date.
sector = pd.read_parquet(
    "https://raw.githubusercontent.com/quantbai/crypto-sectors/main/classification/wide/sector_code.parquet"
)

# Normalize the index to a DatetimeIndex so the lookup below works whether the
# file stores datetime.date or Timestamp labels
sector.index = pd.to_datetime(sector.index)

# Align sector codes to alpha column order; prevents silent NaN from column mismatch
sector_row = sector.loc[pd.Timestamp(date)].reindex(alpha.columns)

# Cross-sectional demean within sector, a standard alpha-research operation
# (.T.groupby().T replaces the deprecated groupby(axis=1))
# Note: assets with NA sector_row (e.g. pre-effective_from, no-returns) are
# silently set to NaN in alpha_demeaned. Filter or impute upstream.
alpha_demeaned = alpha.sub(alpha.T.groupby(sector_row).transform("mean").T)
```

- Universe: 158 actively classified digital assets. See UNIVERSE.md for selection criteria and exclusions.
- Hierarchy: 4 classes → 14 sectors → ~35 sub-sectors (community-maintained, with extensions in the 90–99 slot of each sector)
- Orthogonal tag: `chain_ecosystem` (BTC, ETH, SOL, BNB, …), a categorical `FILTER_ONLY` tag; see SCHEMA.md for usage guidance. Do not use as a direct numeric alpha factor.
- Update cadence: quarterly snapshot tags (`v2026.Q2`, `v2026.Q3`, …), continuous PR review
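
One plausible reading of "filter only" for `chain_ecosystem` is that the tag selects a sub-universe through a boolean mask and never enters the signal as a number. The sketch below illustrates that reading on a made-up one-day cross-section: it restricts to a single ecosystem, then demeans within sector. The asset symbols, sector codes, and alpha values are hypothetical, and SCHEMA.md remains the authority on intended usage.

```python
import pandas as pd

# Hypothetical one-day cross-section; symbols, codes and values are made up.
assets = ["A1", "A2", "A3", "A4", "A5"]
alpha_row = pd.Series([0.5, 0.1, -0.2, 0.4, 0.0], index=assets)
sector_code = pd.Series([10, 10, 20, 20, 20], index=assets)
chain_ecosystem = pd.Series(["ETH", "ETH", "SOL", "ETH", "ETH"], index=assets)

# Filter use: restrict the universe to one ecosystem with a boolean mask.
in_universe = chain_ecosystem.eq("ETH")
alpha_eth = alpha_row[in_universe]

# Neutralize by sector inside the filtered universe: subtract each asset's
# sector mean, computed only over assets that passed the filter.
sector_mean = alpha_eth.groupby(sector_code[in_universe]).transform("mean")
demeaned = alpha_eth - sector_mean
```

The tag never appears in arithmetic here; it only decides which assets take part in the sector demeaning.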
Same-sector daily returns co-move significantly more than cross-sector returns (bootstrap-CI spread well above zero), and the classification recovers the same cluster structure that an unsupervised Ward-linkage clustering of correlations would find. See validation.md.
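
To make the co-movement claim concrete, here is a minimal sketch of one way to measure a same-sector versus cross-sector correlation spread with a bootstrap confidence interval. It runs on synthetic returns with two made-up sectors and is not the repository's validation script; the helper `spread`, the simple resampling of days, and the 200 bootstrap draws are all assumptions chosen for brevity. validation.md describes the actual procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily returns: 6 assets in 2 made-up sectors. Each asset is its
# sector's common factor plus idiosyncratic noise.
n_days = 500
labels = np.array([0, 0, 0, 1, 1, 1])
factors = rng.normal(size=(n_days, 2))
returns = factors[:, labels] + rng.normal(size=(n_days, labels.size))


def spread(r, labels):
    """Mean same-sector pairwise correlation minus mean cross-sector one."""
    corr = np.corrcoef(r, rowvar=False)
    upper = np.triu(np.ones_like(corr, dtype=bool), k=1)  # count each pair once
    same = labels[:, None] == labels[None, :]
    return corr[upper & same].mean() - corr[upper & ~same].mean()


point = spread(returns, labels)

# Bootstrap: resample days (rows) with replacement, recompute the spread,
# and take a percentile interval.
boot = np.array(
    [
        spread(returns[rng.integers(0, n_days, size=n_days)], labels)
        for _ in range(200)
    ]
)
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])
```

A lower bound `ci_low` above zero is the kind of evidence the sentence above refers to. Resampling individual days ignores serial dependence in returns, so a block bootstrap may be more appropriate on real data.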
| Existing source | Limitation |
|---|---|
| Commercial institutional classifications | Methodology often public, but asset-level mappings are paid products |
| CoinGecko / CMC categories | Marketing tags — not mutually exclusive, no formal methodology, no empirical validation |
| Internal fund taxonomies | Each fund reinvents the wheel; nothing comparable across teams |
This repository: open methodology, open mappings, empirically validated, community-curated.
Add a new token, propose a reclassification, or open a sub-sector discussion — see CONTRIBUTING.md. Most PRs are one line in classification/snapshot.csv plus a short decisions/<symbol>.md.
Code (scripts/) — MIT. Classification data (taxonomy.yaml, classification/, decisions/) — CC BY 4.0. Attribute as:
crypto-sectors contributors (2026). crypto-sectors: an open industry classification for digital assets. https://github.com/quantbai/crypto-sectors
This is an independent open-source project. It is not affiliated with, endorsed by, or sponsored by MSCI Inc., S&P Global, FTSE Russell, Coin Metrics, Goldman Sachs, WorldQuant LLC, or any other commercial index, classification, or analytics provider. References to third-party methodologies in methodology.md are academic citations and do not imply any business relationship. All trademarks are the property of their respective owners.