Med7
This repository is dedicated to the first release of Med7, a transferable clinical natural language processing model for electronic health records. It is compatible with spaCy and targets clinical named-entity recognition (NER) tasks. The en_core_med7_lg model is trained on MIMIC-III free-text electronic health records and recognises seven categories.
The trained model comprises three components in its pipeline:
tagger
parser
clinical NER with seven categories.
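A minimal sketch of how the pipeline might be loaded and its NER output inspected, assuming the en_core_med7_lg package is already installed alongside a compatible spaCy version. The helper name `group_entities` and the example sentence are hypothetical and not part of Med7. The component names printed by `nlp.pipe_names` depend on the installed release.

```python
from collections import defaultdict


def group_entities(doc):
    """Group entity texts by label for any spaCy-like Doc exposing `.ents`.

    Hypothetical helper for illustration; it is not part of Med7 or spaCy.
    """
    grouped = defaultdict(list)
    for ent in doc.ents:
        grouped[ent.label_].append(ent.text)
    return dict(grouped)


def main():
    try:
        # Imported here so the helper above can be used without spaCy.
        import spacy

        # Assumes the en_core_med7_lg package has been installed beforehand;
        # spacy.load raises OSError when the model cannot be found.
        nlp = spacy.load("en_core_med7_lg")
    except (ImportError, OSError) as err:
        print(f"Med7 is not available in this environment: {err}")
        return

    # Names of the components in the loaded pipeline.
    print(nlp.pipe_names)

    # Made-up example sentence, not taken from MIMIC-III.
    text = "Aspirin 75 mg tablets, one tablet orally once daily for 2 weeks."
    doc = nlp(text)

    # Print each recognised category with the text spans assigned to it.
    for label, mentions in group_entities(doc).items():
        print(label, mentions)


if __name__ == "__main__":
    main()
```

Grouping by label keeps the output readable when a note mentions several drugs or dosages. The same helper works on any object that exposes `.ents` with `.text` and `.label_` attributes.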
Self-supervised pre-training has proven effective, achieving good results even with a small amount of gold-annotated training data. We experimented with the spacy pretrain approach and trained a number of weights for model initialisation for various parameters of the width a