Eriksson, Robert (1); Jensen, Peter Bjødstrup (4); Pletscher-Frankild, Sune (5); Jensen, Lars Juhl (6); Brunak, Søren (1)
(1) Department of Systems Biology, Technical University of Denmark
(2) Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark
(3) Integrative Systems Biology, Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark
(4) Technical University of Denmark
(5) Center for Biological Sequence Analysis, Technical University of Denmark
(6) Department of Biotechnology, Technical University of Denmark
Objective: Drugs have tremendous potential to cure and relieve disease, but the risk of unintended effects is always present. Healthcare providers increasingly record data in electronic patient records (EPRs), in which we aim to identify possible adverse events (AEs) and, specifically, possible adverse drug events (ADEs).

Materials and methods: Based on the undesirable effects section from the summary of product characteristics (SPC) of 7446 drugs, we have built a Danish ADE dictionary. Starting from this dictionary, we have developed a pipeline for identifying possible ADEs in unstructured clinical narrative text. We use a named entity recognition (NER) tagger to identify dictionary matches in the text and post-coordination rules to construct ADE compound terms. Finally, we apply post-processing rules and filters to handle, for example, negations and sentences about subjects other than the patient. Moreover, the method identifies synonyms and merges anatomical location descriptions, so that effects in the same location can be grouped appropriately.

Results: The method identified 1 970 731 (35 477 unique) possible ADEs in a large corpus of 6011 psychiatric hospital patient records. Validation was performed through manual inspection of possible ADEs, yielding a precision of 89% and a recall of 75%.

Discussion: The presented dictionary-building method could be used to construct other ADE dictionaries. The method addresses the complication of compound words in Germanic languages. Additionally, collapsing synonyms and anatomical locations improves the method.

Conclusions: The developed dictionary and method can be used to identify possible ADEs in Danish clinical narratives.
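As a rough illustration of the kind of pipeline the abstract describes (dictionary matching, synonym collapse, and filters for negations and for sentences about people other than the patient), a minimal Python sketch might look as follows. It is not the authors' implementation: the English toy dictionary, the cue lists, the three-token negation window, and the function name `find_possible_ades` are all assumptions made for the example, and the post-coordination of compound terms is omitted.

```python
import re

# Hypothetical toy dictionary: surface form -> canonical ADE term.
# Mapping several surface forms to one term stands in for synonym collapse.
ADE_DICTIONARY = {
    "headache": "headache",
    "cephalalgia": "headache",
    "nausea": "nausea",
    "rash": "rash",
}

# Hypothetical negation cues; a real system would need a Danish cue list.
NEGATION_CUES = {"no", "not", "denies", "without"}

# Hypothetical cues for sentences about someone other than the patient.
OTHER_SUBJECT_CUES = {"mother", "father", "brother", "sister"}


def find_possible_ades(text):
    """Return sorted canonical terms for dictionary matches that pass the filters."""
    found = set()
    for sentence in re.split(r"[.!?]+", text.lower()):
        tokens = re.findall(r"\w+", sentence)
        if OTHER_SUBJECT_CUES & set(tokens):
            continue  # skip sentences that mention a relative
        for i, token in enumerate(tokens):
            canonical = ADE_DICTIONARY.get(token)
            if canonical is None:
                continue
            # Assumed rule: a cue within the 3 preceding tokens negates the match.
            window = tokens[max(0, i - 3):i]
            if NEGATION_CUES & set(window):
                continue
            found.add(canonical)
    return sorted(found)


note = "Patient reports cephalalgia and nausea. No rash. Her mother has headache."
print(find_possible_ades(note))
```

In this toy note, "cephalalgia" is collapsed to the canonical term "headache", the negated "rash" is dropped, and the sentence about the patient's mother is ignored. A fixed token window is a crude stand-in for the post-processing rules mentioned in the abstract.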
American Medical Informatics Association. Journal, 2013, Vol 20, Issue 5, p. 947-953