PCD-hub: Protein Condensation Diseases resource

We performed a systematic analysis of more than 20,000 human diseases and ranked them by their association with genes (DisGeNet database) that encode known droplet-forming proteins and with disease-associated missense mutations (HuVarBase) in known components of membraneless organelles. By comparing the contributions of genes encoding condensate-forming proteins with those of genes whose missense mutations affect native structures, we identified over 5000 human disorders in which protein condensation could be expected to play a causative role. This list includes about 2000 orphan disorders linked with the dysfunction of multiple pathways.

We expect that identifying the protein condensation diseases reported here, and understanding their nature, will promote the development of effective strategies for their screening and treatment.

Ranking of human diseases based on their links with genes encoding droplet-forming proteins.

Download

Description

The 9277 diseases from the curated resources (Sheet 1) and the 21552 diseases from all resources (Sheet 2) in the DisGeNet database were ranked by the fraction of disease-associated genes that encode droplet-forming proteins. The ranking evaluates the contribution of experimental (MLO) and predicted (PC) droplet-forming proteins to each disease by comparing it with that of non-condensate-forming proteins. We also list the orphan diseases (from the OrphaNet database) that are associated with condensate-forming proteins (Sheet 3), which may help identify the potential disease mechanism.

Curated data (Sheet 1): The 5803 diseases in the table are taken from the 9277 diseases in the curated resources of the DisGeNet database and are ranked by the number of genes encoding droplet-forming proteins. The sheet lists the disease-associated genes and the encoded proteins that are components of membraneless organelles (MLO) or are predicted to form droplets (PC).

All data (Sheet 2): The 16393 diseases in the table are taken from the 21552 diseases in all resources in the DisGeNet database and ranked by the fraction of genes encoding droplet-forming proteins.
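
The ranking by fraction described above can be sketched as follows. This is a minimal illustration, not the code used to build the sheets: the function name, the input layout (a mapping from disease to its set of associated genes), the placeholder gene and disease labels, and the choice to drop diseases with no droplet-forming genes are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple


def rank_by_droplet_fraction(
    disease_genes: Dict[str, Set[str]],
    droplet_genes: Set[str],
) -> List[Tuple[str, float]]:
    """Rank diseases by the fraction of associated genes that encode
    droplet-forming proteins (highest fraction first)."""
    ranked = []
    for disease, genes in disease_genes.items():
        if not genes:
            continue  # no associated genes, fraction undefined
        n_droplet = len(genes & droplet_genes)
        if n_droplet == 0:
            continue  # assumption: diseases without droplet-forming genes are left out
        ranked.append((disease, n_droplet / len(genes)))
    # Sort by descending fraction; ties are broken alphabetically by disease name.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Toy input with placeholder names (not taken from DisGeNet).
toy_disease_genes = {
    "disease_1": {"GENE_A", "GENE_B"},
    "disease_2": {"GENE_A", "GENE_C", "GENE_D", "GENE_E"},
    "disease_3": {"GENE_C"},
}
toy_droplet_genes = {"GENE_A", "GENE_B"}

toy_ranking = rank_by_droplet_fraction(toy_disease_genes, toy_droplet_genes)
for disease, fraction in toy_ranking:
    print(f"{disease}\t{fraction:.2f}")
```

In this toy input, disease_1 ranks first because both of its genes are in the droplet-forming set, and disease_3 is omitted because none of its genes are.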

Orphan diseases (Sheet 3): A list of rare (orphan) diseases with at least one third of the contributing genes associated with protein condensation.
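
The one-third criterion for orphan diseases can be expressed as a simple filter. The sketch below is illustrative only: the function name, the input layout, and the placeholder labels are assumptions, and the comparison uses integer arithmetic so that a fraction of exactly one third passes the "at least" test without floating-point rounding.

```python
from typing import Dict, List, Set


def condensation_linked_orphans(
    orphan_disease_genes: Dict[str, Set[str]],
    condensation_genes: Set[str],
) -> List[str]:
    """Return orphan diseases for which at least one third of the
    contributing genes are associated with protein condensation."""
    selected = []
    for disease, genes in orphan_disease_genes.items():
        if not genes:
            continue  # no contributing genes, criterion cannot be evaluated
        n_linked = len(genes & condensation_genes)
        # n_linked / len(genes) >= 1/3, written without division
        if 3 * n_linked >= len(genes):
            selected.append(disease)
    return sorted(selected)


# Toy input with placeholder names (not taken from OrphaNet).
toy_orphans = {
    "orphan_1": {"GENE_A", "GENE_C", "GENE_D"},            # 1 of 3 linked
    "orphan_2": {"GENE_C", "GENE_D", "GENE_E", "GENE_A"},  # 1 of 4 linked
    "orphan_3": {"GENE_B"},                                # 1 of 1 linked
}
toy_selected = condensation_linked_orphans(toy_orphans, {"GENE_A", "GENE_B"})
print(toy_selected)
```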


Disease-associated missense mutations in droplet-promoting regions of experimentally identified droplet-forming proteins and diseases ranked by the contribution of missense mutations in droplet-promoting regions.

Download

Description

The causality of condensate perturbations in protein condensation diseases is supported by disease-associated missense mutations that affect droplet-promoting regions (DPRs). Droplet-promoting regions are prone to form disordered interactions and facilitate the partitioning of proteins into condensates [1]. We therefore analyzed 644,521 disease-associated missense mutations in 17,450 human proteins from the Human Variants Database [3] for their position in the protein sequence. We computed the fraction of missense mutations in droplet-promoting regions, Nmut(DPR)/Nmut(tot), and identified proteins in which > 70% of the missense mutations are located in droplet-promoting regions of experimentally identified condensate-forming proteins (Sheet 1). We then ranked the diseases associated with the proteins in Sheet 1 by the fraction of disease-associated missense mutations in droplet-forming proteins (Sheet 2). This ranking evaluates the contribution of missense mutations in experimental (MLO) and predicted (PC) droplet-forming proteins to each disease by comparing it with that of mutations in non-condensate-forming proteins.
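
The per-protein fraction of missense mutations in droplet-promoting regions and the 70% cutoff can be sketched as below. This is a hedged illustration rather than the pipeline behind the sheets: the function names, the representation of mutations as 1-based residue positions, the representation of droplet-promoting regions as inclusive (start, end) intervals, and the placeholder protein labels are all assumptions.

```python
from typing import Dict, List, Sequence, Tuple

Region = Tuple[int, int]  # inclusive (start, end) residue positions, 1-based


def dpr_mutation_fraction(positions: Sequence[int], dpr_regions: Sequence[Region]) -> float:
    """Fraction of missense mutations located in droplet-promoting regions:
    N_mut(DPR) / N_mut(tot)."""
    if not positions:
        return 0.0  # assumption: proteins without mutations get fraction 0
    n_in_dpr = sum(
        1
        for pos in positions
        if any(start <= pos <= end for start, end in dpr_regions)
    )
    return n_in_dpr / len(positions)


def select_dpr_enriched(
    mutations: Dict[str, List[int]],
    regions: Dict[str, List[Region]],
    threshold: float = 0.7,
) -> Dict[str, float]:
    """Keep proteins whose fraction of mutations in DPRs exceeds the threshold."""
    selected = {}
    for protein, positions in mutations.items():
        fraction = dpr_mutation_fraction(positions, regions.get(protein, []))
        if fraction > threshold:
            selected[protein] = fraction
    return selected


# Toy input with placeholder names and positions (not taken from HuVarBase).
toy_mutations = {"PROT_X": [5, 12, 18, 40], "PROT_Y": [3, 50, 90, 120]}
toy_regions = {"PROT_X": [(1, 20)], "PROT_Y": [(1, 10)]}

toy_enriched = select_dpr_enriched(toy_mutations, toy_regions)
print(toy_enriched)
```

Here PROT_X has three of its four mutations inside its droplet-promoting region and passes the cutoff, whereas PROT_Y has only one of four and is excluded.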

DPR mutations (Sheet 1): Proteins forming membraneless organelles (MLOs) in which over 70% of the disease-associated missense mutations are located in droplet-promoting regions. We also list the diseases associated with these proteins.

HuVarBase (Sheet 2): Diseases associated with proteins in which > 70% of the missense mutations are in droplet-promoting regions. Diseases were ranked by the fraction of disease-associated missense mutations in droplet-forming proteins as compared with those in non-condensate-forming proteins.