Healthcare Data Activation technology leverages the power of Artificial Intelligence to unlock healthcare data from hospitals' data sources. We activate data from both structured and unstructured sources, including human-written records, thanks to our Natural Language Processing System. All data is standardized into the OMOP Common Data Model and never leaves hospitals' in-house systems, thanks to our Federated Data Model, enabling comprehensive understanding of healthcare information while maintaining data protection and security.
Mapping and standardizing structured data.
Our technology automates the mapping of structured hospital data and standardizes it into the OMOP Common Data Model by leveraging vector representations of source code descriptions.
Structuring and standardizing human-written text inputs through NLP.
Through our Natural Language Processing Systems, we can structure entities from free-text clinical notes written by humans and extract relevant information contextually, standardizing it into the OMOP CDM. It's not just about keyword extraction; it's about a deep understanding of
Ensuring security and protection of Healthcare Data.
Our Federated Data Model allows healthcare organizations to leverage the potential of Real-World Data while maintaining the highest level of data protection and security.
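One way to picture the federated idea is a query that runs inside each hospital and returns only an aggregate. The following is a minimal sketch under that assumption, not a description of the actual Federated Data Model; the class `HospitalNode` and the function `federated_count` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class HospitalNode:
    """Hypothetical stand-in for one hospital's in-house system."""
    name: str
    records: list  # patient-level rows; they are only read inside run_local

    def run_local(self, predicate: Callable[[dict], bool]) -> int:
        # The query executes locally and only a count is returned.
        return sum(1 for record in self.records if predicate(record))


def federated_count(nodes: Iterable[HospitalNode],
                    predicate: Callable[[dict], bool]) -> int:
    """Combine per-hospital counts without collecting patient-level rows."""
    return sum(node.run_local(predicate) for node in nodes)


if __name__ == "__main__":
    nodes = [
        HospitalNode("A", [{"dx": "psoriasis"}, {"dx": "asthma"}]),
        HospitalNode("B", [{"dx": "psoriasis"}]),
    ]
    print(federated_count(nodes, lambda r: r["dx"] == "psoriasis"))
```

In this sketch the coordinator only ever sees integers; a real deployment would also need authentication, auditing, and safeguards for very small counts, none of which are shown here.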
The DERMACLEAR study: Verification results of a natural language processing system in dermatology.
Results from the DERMACLEAR study will add to the real-world evidence on clinical practice by providing a large amount of information on patients with the studied diseases. The NLP system is precise in identifying patients diagnosed with HS, PsO, CU and/or AD, as well as other medical variables, from EHRs, which supports its validity for use in the DERMACLEAR study.
An open source corpus and automatic tool for section identification in Spanish health records.
This work shows that it is possible to build competitive automatic systems when both data and the right evaluation metrics are available. The annotated data, the implemented evaluation scripts, and the section identification Language Model are open-sourced in the hope that this contribution will foster the building of more and better systems.
NATI (NATural language in ThyroId cancer).
A total of 5137 medical records of patients diagnosed with thyroid cancer between 2015 and 2022 were included. The median follow-up (interquartile range) was 29.7 months (8.8-55.8). The mean age at the time of diagnosis was 55 years (SD 18), and 67% were women. The stage could be classified in a subgroup of 520 patients, of which 60% (n=313) had advanced stages. Metastasis was observed in 2177 patients (42%) during the follow-up, mainly in lymph nodes (44%). It was also identified that the majority of patients (71%; n=3629) had some comorbidity.
Extending the OMOP CDM to store the output of natural language processing pipelines.
Although the OMOP CDM provides a NOTE_NLP table to store the outputs of NLP algorithms, queries to this table can become clumsy and slow. We therefore designed an extension of the OMOP CDM, with its own NLP schema, that stores the results generated in the NLP annotation process while integrating with the vocabulary normalization process of the OMOP CDM.
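To make the idea of extending NOTE_NLP concrete, here is a minimal sketch using an in-memory SQLite database. The `note_nlp` table keeps a small subset of the standard OMOP columns; the `nlp_modifier` table, its columns, the index, and the concept id 9001 are hypothetical and are not the schema described in the work above.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a toy database with a NOTE_NLP subset and a hypothetical side table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE note_nlp (              -- subset of the standard OMOP columns
            note_nlp_id         INTEGER PRIMARY KEY,
            note_id             INTEGER NOT NULL,
            lexical_variant     TEXT NOT NULL,
            note_nlp_concept_id INTEGER
        );
        CREATE TABLE nlp_modifier (          -- hypothetical extension table
            note_nlp_id INTEGER NOT NULL REFERENCES note_nlp(note_nlp_id),
            modifier    TEXT NOT NULL,       -- e.g. 'negated', 'historical'
            value       INTEGER NOT NULL     -- 1 = true, 0 = false
        );
        CREATE INDEX idx_nlp_modifier ON nlp_modifier (note_nlp_id, modifier);
    """)
    conn.executemany(
        "INSERT INTO note_nlp VALUES (?, ?, ?, ?)",
        [(1, 10, "psoriasis", 9001),   # 9001 is a placeholder concept id
         (2, 11, "psoriasis", 9001)],
    )
    # The mention in note 11 was flagged as negated by the NLP pipeline.
    conn.execute("INSERT INTO nlp_modifier VALUES (?, ?, ?)", (2, "negated", 1))
    return conn


def affirmed_note_ids(conn: sqlite3.Connection, concept_id: int) -> list:
    """Return notes that mention the concept without a 'negated' modifier."""
    rows = conn.execute(
        """
        SELECT n.note_id
        FROM note_nlp AS n
        WHERE n.note_nlp_concept_id = ?
          AND NOT EXISTS (
              SELECT 1 FROM nlp_modifier AS m
              WHERE m.note_nlp_id = n.note_nlp_id
                AND m.modifier = 'negated'
                AND m.value = 1
          )
        ORDER BY n.note_id
        """,
        (concept_id,),
    )
    return [row[0] for row in rows]
```

Keeping modifiers in typed, indexed columns lets a filter like "affirmed mentions only" be expressed as a plain join or subquery; whether this matches the design choices of the actual extension is an assumption.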
A Framework for False Negative Detection in NER/NEL.
Finding the false negatives of a NER/NEL system is fundamental to improving it, and is usually done by manually annotating texts. However, in an environment with a huge volume of unannotated texts (e.g., a hospital) and a low frequency of positives (e.g., mentions of a particular disease in clinical notes), the task becomes very inefficient.
Efficient automated mapping of internal source codes to OMOP CDM concepts.
Our automated concept mapping system provides an efficient way of mapping source codes to OMOP concepts. By utilizing text-based vector representations and knowledge transfer, our system can find equivalent mappings from other hospitals, thereby reducing the time and effort required for manual mapping.
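The idea of reusing mappings through text-based vectors can be sketched in a few lines. This is a simplified illustration, not the system described above: it assumes character-trigram count vectors and cosine similarity in place of whatever representation the real system uses, and the `KNOWN_MAPPINGS` table, its concept ids, and the 0.5 threshold are hypothetical.

```python
from collections import Counter
from math import sqrt
from typing import Optional, Tuple


def trigram_vector(text: str) -> Counter:
    """Represent a source code description as a bag of character trigrams."""
    padded = f"  {text.lower().strip()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical mappings already validated at another hospital:
# source description -> OMOP concept id (placeholder ids, not real concepts).
KNOWN_MAPPINGS = {
    "glucose in blood": 1001,
    "hemoglobin a1c": 1002,
    "chest x-ray": 1003,
}


def suggest_mapping(description: str,
                    known: dict = KNOWN_MAPPINGS,
                    threshold: float = 0.5) -> Optional[Tuple[int, float]]:
    """Return (concept_id, score) of the most similar known description,
    or None when nothing is similar enough to suggest."""
    query = trigram_vector(description)
    best_desc, best_score = None, 0.0
    for candidate in known:
        score = cosine(query, trigram_vector(candidate))
        if score > best_score:
            best_desc, best_score = candidate, score
    if best_desc is None or best_score < threshold:
        return None
    return known[best_desc], best_score
```

Calling `suggest_mapping("Blood glucose")` proposes the concept previously mapped for "glucose in blood", while an unrelated description returns `None` and would fall back to manual mapping. A threshold keeps low-confidence suggestions away from reviewers.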
ContextMEL: Classifying Contextual Modifiers in Clinical Text.
Taking advantage of electronic health records in clinical research requires the development of natural language processing tools to extract data from unstructured text in different languages. A key task is the detection of contextual modifiers, for example determining whether a concept is negated or belongs to the past. We present ContextMEL, a method to build classifiers for contextual modifiers that is independent of the specific task and the language, allowing for a fast model development cycle.
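To illustrate what a contextual modifier is, the sketch below labels a concept mention as negated or past using fixed trigger words in a short window before the mention. This is a deliberately simple rule-based baseline, not ContextMEL, which builds classifiers instead; the trigger lists, the window size, and the function name `context_modifiers` are assumptions chosen for the example, and they cover English only.

```python
import re

# Hypothetical trigger lists; a real system would not rely on fixed English cues.
NEGATION_TRIGGERS = ("no", "not", "without", "denies")
PAST_TRIGGERS = ("history of", "previous", "prior")


def context_modifiers(sentence: str, concept: str, window: int = 5) -> dict:
    """Flag a concept mention as negated and/or past based on preceding tokens."""
    tokens = re.findall(r"\w+", sentence.lower())
    concept_tokens = re.findall(r"\w+", concept.lower())
    n = len(concept_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == concept_tokens:
            # Look only at the few tokens right before the first mention.
            preceding = " ".join(tokens[max(0, i - window):i])
            padded = f" {preceding} "
            return {
                "negated": any(f" {t} " in padded for t in NEGATION_TRIGGERS),
                "past": any(f" {t} " in padded for t in PAST_TRIGGERS),
            }
    raise ValueError("concept not found in sentence")
```

For "Patient denies chest pain." the mention "chest pain" is flagged as negated. Rules like these break quickly across languages and writing styles, which is the kind of limitation that motivates a task- and language-independent way of building classifiers.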