Getting started with BioMedICUS
- Python >=3.7,<3.11. Python 3.11 is not supported yet.
- Java JDK 8.0+. Note that the “java” command must be on your “$PATH”.
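The two requirements above can be checked before installing anything. The following is a minimal sketch, not part of BioMedICUS: the function name check_prerequisites is hypothetical, and it only verifies that a “java” command resolves on the PATH, not that the JDK is version 8.0 or newer.

```python
import shutil
import sys


def check_prerequisites(version_info=None, java_path=None):
    """Return a list of problems; an empty list means both checks passed."""
    # Default to the running interpreter and whatever "java" resolves to on PATH.
    if version_info is None:
        version_info = sys.version_info
    if java_path is None:
        java_path = shutil.which("java")

    problems = []
    # Supported range from the list above: >=3.7,<3.11.
    if not ((3, 7) <= tuple(version_info[:2]) < (3, 11)):
        problems.append(
            "Python %d.%d is outside the supported range >=3.7,<3.11"
            % (version_info[0], version_info[1])
        )
    # shutil.which returns None when the command cannot be found.
    if not java_path:
        problems.append('the "java" command was not found on PATH')
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Problem:", problem)
```

Running the script with the interpreter you intend to use prints nothing when both checks pass.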
Create a Virtual Environment
We recommend that you use a Python 3 virtual environment, a local environment of installed packages, to avoid any dependency conflicts.
Linux / MacOS
pip3 install virtualenv
python3 -m virtualenv biomedicus_venv
source biomedicus_venv/bin/activate
Windows
pip3 install virtualenv
python3 -m virtualenv biomedicus_venv
biomedicus_venv\Scripts\activate
Install PyTorch Libraries
BioMedICUS requires PyTorch, a machine learning framework. Installation instructions for PyTorch can be found on the PyTorch website. Select your platform and “Pip”, and select “None” for CUDA unless you have an NVIDIA graphics card and have installed the CUDA toolkit.
Install BioMedICUS
pip3 install biomedicus
This will install two packages, biomedicus and biomedicus_client, each with its own command line program (b9client for biomedicus_client). The main biomedicus package contains all of the BioMedICUS processor servers, and the biomedicus_client package contains functionality for connecting to the servers and processing documents.
Deploy the default BioMedICUS processors
The following command runs a script that will start up all of the BioMedICUS services for processing clinical notes:
It will ask you to download the BioMedICUS model files if you have not already.
Process a directory of text files using BioMedICUS
After deploying BioMedICUS, you can process a directory of documents using the following command:
b9client run --include-label-text /path/to/input_dir -o /path/to/output_dir
This will process the documents in the input directory using BioMedICUS and save the results as JSON-serialized MTAP Events in the output directory.
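If you want to launch the same command from a script, one option is to call it through Python's subprocess module. This is a minimal sketch that assumes b9client is on the PATH of the active virtual environment and that the BioMedICUS services are already deployed; build_run_command and run_pipeline are hypothetical helper names, and only the arguments shown in the command above are used.

```python
import subprocess


def build_run_command(input_dir, output_dir):
    """Build the argument list for the b9client command shown above."""
    return [
        "b9client", "run", "--include-label-text",
        str(input_dir), "-o", str(output_dir),
    ]


def run_pipeline(input_dir, output_dir):
    """Run b9client; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_run_command(input_dir, output_dir), check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with directory names that contain spaces.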
The default BioMedICUS pipeline and run command serialize the documents as JSON. By default the files are not pretty-printed, but you can print a formatted copy of a file to standard output by running the following:
python -m json.tool /path/to/output_file.json
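To format every file in the output directory rather than one at a time, a short script using the standard json module is enough. This is a minimal sketch that assumes the output files sit directly in the output directory with a .json extension; prettify_json_files is a hypothetical helper name, and unlike json.tool it rewrites the files in place.

```python
import json
from pathlib import Path


def prettify_json_files(output_dir, indent=2):
    """Rewrite every *.json file in output_dir with indentation; return the count."""
    count = 0
    for path in sorted(Path(output_dir).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        # ensure_ascii=False keeps non-ASCII characters from the notes readable.
        path.write_text(
            json.dumps(data, indent=indent, ensure_ascii=False) + "\n",
            encoding="utf-8",
        )
        count += 1
    return count
```

For example, prettify_json_files("/path/to/output_dir") returns the number of files it rewrote.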