Dataset Catalog
Unified dataset catalog for discovery plus the dataset access available to your account. The catalog call itself does not deduct Wallet Credits.
GET /v2/datasets/catalog
Returns public discovery rows, your generated dataset rows, and the dataset IDs you can download right now.
cURL
curl -G https://api.omtx.ai/v2/datasets/catalog \
  -H "x-api-key: YOUR_API_KEY"
Response
{
  "catalog": {
    "items": [
      {
        "dataset_id": "dataset-public-v5",
        "protein_uuid": "0d64fb6a-8a66-50ad-82b6-fabee8bb1516",
        "protein_name": "SOX2",
        "uniprot_id": "P48431",
        "sequence": null,
        "vintage": "20260201_om1",
        "vintage_id": "9512ba4b-8a67-4310-9100-fcb6fb06cf70",
        "is_public": true,
        "num_data_points": 83143571,
        "num_actives": 1018756,
        "num_inactives": 82124815
      }
    ],
    "count": 1
  },
  "data_generated": {
    "items": [
      {
        "protein_uuid": "43d75238-ae2a-5935-8ade-30115565034e",
        "protein_name": "Generated Protein",
        "generation_status": "ready",
        "license_kind": "exclusive",
        "requires_subscription": false,
        "subscription_entitled": true,
        "dataset_id": "dataset-private-v2",
        "latest_vintage_id": "9512ba4b-8a67-4310-9100-fcb6fb06cf70",
        "latest_vintage": "20260201_om1",
        "is_public": false,
        "num_data_points": 83143571,
        "num_actives": 450612,
        "num_inactives": 82692959
      }
    ],
    "count": 1
  },
  "accessible_generated_protein_uuids": [
    "43d75238-ae2a-5935-8ade-30115565034e"
  ],
  "accessible_dataset_ids": [
    "dataset-public-v5",
    "dataset-private-v2"
  ]
}
- No Wallet Credits charge for catalog discovery.
- catalog contains only is_public=true rows.
- data_generated contains generated rows you can currently use.
- accessible_generated_protein_uuids is the deduplicated list of generated protein UUIDs available to your account right now.
- accessible_dataset_ids lists the datasets your account can download right now.
- Public dataset downloads require an active subscription or qualifying buyout.
- Private dataset downloads require a qualifying owned data-generation order for that protein_uuid.
- Use protein_uuid with client.load_binders(...) / client.load_nonbinders(...) (recommended) or /v2/data-access/shards for direct signed URL access.
- This response uses the public SOX2 dataset as a representative example.
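The access rules above can be checked on the client by comparing each row's dataset_id against accessible_dataset_ids. The following is a minimal sketch, assuming the response has already been parsed into a Python dict shaped like the example; the helper name downloadable_rows is hypothetical and not part of the SDK.

```python
def downloadable_rows(payload):
    """Return public and generated rows whose dataset_id is downloadable right now."""
    accessible = set(payload.get("accessible_dataset_ids", []))
    rows = (
        payload.get("catalog", {}).get("items", [])
        + payload.get("data_generated", {}).get("items", [])
    )
    return [row for row in rows if row.get("dataset_id") in accessible]


# Trimmed copy of the example response (only the fields used here).
example = {
    "catalog": {
        "items": [{"dataset_id": "dataset-public-v5", "protein_name": "SOX2"}],
        "count": 1,
    },
    "data_generated": {
        "items": [{"dataset_id": "dataset-private-v2", "protein_name": "Generated Protein"}],
        "count": 1,
    },
    "accessible_dataset_ids": ["dataset-public-v5", "dataset-private-v2"],
}

for row in downloadable_rows(example):
    print(row["dataset_id"], row["protein_name"])
```

A public row can appear in catalog without being in accessible_dataset_ids, for example when the account has no active subscription, so filtering before a download attempt avoids a predictable failure.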
Response Fields
- catalog.items[]: Public discovery rows (is_public=true) with dataset metadata from datasets
- catalog.count: Number of public discovery rows
- data_generated.items[]: Generated dataset rows currently available to your account
- data_generated.count: Number of generated dataset rows
- accessible_generated_protein_uuids[]: Deduplicated list of generated protein UUIDs available to your account
- accessible_dataset_ids[]: Dataset IDs you can download with /v2/data-access/shards
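The field descriptions above suggest a few invariants that a client can verify after parsing a response. The sketch below is illustrative only: check_payload is a hypothetical helper, and the rule that num_actives plus num_inactives equals num_data_points is an assumption drawn from the example values, not a documented guarantee.

```python
def check_payload(payload):
    """Return a list of human-readable problems found in a parsed catalog response."""
    problems = []
    for section in ("catalog", "data_generated"):
        block = payload.get(section, {})
        items = block.get("items", [])
        # Each count field is described as the number of rows in its items list.
        if block.get("count") != len(items):
            problems.append(f"{section}.count does not match the number of items")
        for row in items:
            total = row.get("num_data_points")
            actives = row.get("num_actives")
            inactives = row.get("num_inactives")
            # Assumption: actives and inactives partition the data points.
            if None not in (total, actives, inactives) and actives + inactives != total:
                problems.append(f"{row.get('dataset_id')}: actives + inactives != num_data_points")
    uuids = payload.get("accessible_generated_protein_uuids", [])
    # This list is described as deduplicated.
    if len(uuids) != len(set(uuids)):
        problems.append("accessible_generated_protein_uuids contains duplicates")
    return problems


# Values copied from the example response.
example = {
    "catalog": {
        "items": [{
            "dataset_id": "dataset-public-v5",
            "num_data_points": 83143571,
            "num_actives": 1018756,
            "num_inactives": 82124815,
        }],
        "count": 1,
    },
    "data_generated": {
        "items": [{
            "dataset_id": "dataset-private-v2",
            "num_data_points": 83143571,
            "num_actives": 450612,
            "num_inactives": 82692959,
        }],
        "count": 1,
    },
    "accessible_generated_protein_uuids": ["43d75238-ae2a-5935-8ade-30115565034e"],
}

print(check_payload(example))
```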
Python SDK
Python
from omtx import OmClient

client = OmClient(api_key="YOUR_API_KEY")

# Get unified catalog payload
catalog = client.datasets.catalog()
print(f"Public rows: {catalog['catalog']['count']}")
print(f"Generated rows: {catalog['data_generated']['count']}")
print(f"Generated protein UUIDs: {len(catalog['accessible_generated_protein_uuids'])}")
print(f"Accessible dataset IDs: {len(catalog['accessible_dataset_ids'])}")

# Recommended next step: load binder/non-binder pools from a catalog protein UUID
items = catalog["catalog"]["items"]
binders = client.load_binders(
    protein_uuid=items[0]["protein_uuid"],
    n=1000,
    sample_seed=42,
)
nonbinders = client.load_nonbinders(
    protein_uuid=items[0]["protein_uuid"],
    n=10000,
    sample_seed=42,
)
print("Loaded shapes:", binders.shape, nonbinders.shape)
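Where the SDK is not available, the cURL request shown earlier can be reproduced with the Python standard library. This is a rough sketch: the URL and x-api-key header come from the cURL example, while build_catalog_request and fetch_catalog are hypothetical helper names, and the 30-second timeout is an arbitrary choice.

```python
import json
import urllib.request

CATALOG_URL = "https://api.omtx.ai/v2/datasets/catalog"


def build_catalog_request(api_key):
    """Build the GET request from the cURL example (same URL and header)."""
    return urllib.request.Request(
        CATALOG_URL,
        headers={"x-api-key": api_key},
        method="GET",
    )


def fetch_catalog(api_key, timeout=30):
    """Send the request and decode the JSON body.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(build_catalog_request(api_key), timeout=timeout) as resp:
        return json.load(resp)


# Building the request does not touch the network; call fetch_catalog to send it.
request = build_catalog_request("YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

The returned dict has the same shape as the example response, so it can be indexed the same way as the SDK result above.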