Activity
-
Team OccamzRazor at the ESACT conference in Edinburgh. It was wonderful to see the team, our customers, and our partners! #ESACT2024
Liked by Katharina Sophia Volz, PhD
-
Team OccamzRazor at the ESACT conference in Edinburgh. It was wonderful to see the team, our customers, and our partners! #ESACT2024
Shared by Katharina Sophia Volz, PhD
-
Sign up for the best therapy experience ever ✨
Shared by Katharina Sophia Volz, PhD
Experience & Education
Publications
-
Biomedical Information Extraction for Disease Gene Prioritization
NeurIPS 2020
We introduce a biomedical information extraction (IE) pipeline that extracts biological relationships from text and demonstrate that its components, such as named entity recognition (NER) and relation extraction (RE), outperform the state of the art in BioNLP. We apply it to tens of millions of PubMed abstracts to extract protein-protein interactions (PPIs) and add these extractions to a biomedical knowledge graph that already contains PPIs extracted from STRING, the leading structured PPI database. We show that, despite the graph already containing PPIs from an established structured source, adding our own IE-based extractions to it allows us to predict novel disease-gene associations with a 20% relative increase in hit@30, an important step towards developing drug targets for uncured diseases.
-
Relation-weighted Link Prediction for Disease Gene Identification
NeurIPS 2020
Identification of disease genes, which are a set of genes associated with a disease, plays an important role in understanding and curing diseases. In this paper, we present a biomedical knowledge graph designed specifically for this problem, propose a novel machine learning method that identifies disease genes on such graphs by leveraging recent advances in network biology and graph representation learning, study the effects of various relation types on prediction performance, and empirically demonstrate that our algorithm outperforms its closest state-of-the-art competitor in disease gene identification by 24.1%. We also show that we achieve higher precision than Open Targets, the leading initiative for target identification, with respect to predicting drug targets in clinical trials for Parkinson's disease.
-
Chess2vec: Learning Vector Representations for Chess
NIPS
We conduct the first study of its kind to generate and evaluate vector representations for chess pieces. In particular, we uncover the latent structure of chess pieces and moves, as well as predict chess moves from chess positions. We share preliminary results which anticipate our ongoing work on a neural network architecture that learns these embeddings directly from supervised feedback.
-
Training Classifiers with Natural Language Explanations
Association for Computational Linguistics
Training accurate classifiers requires many labels, but each label provides only limited information (one bit for binary classification). In this work, we propose BabbleLabble, a framework for training classifiers in which an annotator provides a natural language explanation for each labeling decision. A semantic parser converts these explanations into programmatic labeling functions that generate noisy labels for an arbitrary amount of unlabeled data, which is used to train a classifier. On three relation extraction tasks, we find that users are able to train classifiers with comparable F1 scores 5-100× faster by providing explanations instead of just labels. Furthermore, given the inherent imperfection of labeling functions, we find that a simple rule-based semantic parser suffices.
Courses
-
Startup Garage Program
-
Languages
-
German
Native or bilingual proficiency
-
English
Full professional proficiency
-
Portuguese
Elementary proficiency
-
Spanish
Elementary proficiency