    import chlib

    NRD = chlib.data.Data.get_from_config('../config.json', 'HCUPNRD')

    # Iterate over patients & visits; stop after the first patient
    for p_key, patient in NRD.iter_patients():
        break
    print p_key, patient
    for v in patient.visits:
        print v  # print visits

    # Retrieve patients / visits by a specific diagnosis/procedure
    print len(list(NRD.iter_patients_by_code('D486')))

    # Compute aggregate statistics
    policy = chlib.entity.aggregate.Policy(min_count=20, min_hospital=5)
    aggregate = chlib.entity.aggregate.Aggregate()
    aggregate.init_compute("temp", "temp", policy)
    aggregate.add(v)  # add a visit (here, the last one printed above)
    aggregate.end_compute()
Protocol Buffers are used to represent patients, visits, and aggregate statistics. Data is stored in LevelDB, so the database directory can be kept on a removable encrypted drive for at-rest security.
A local Django server allows quick inspection of patient/visit-level data and visualization of aggregate statistics using predefined charts and tables.
We currently support HCUP and state databases. However, the protocols can easily be extended in the future to support other formats such as OHDSI CDM.
We provide a Docker container image with a docker-compose file. The compose file also defines the Postgres & RabbitMQ services used by the local Django server.
To run Computational Healthcare, clone the repo and start the containers using "docker-compose up".
    # clone the repo
    git clone https://github.com/AKSHAYUBHAT/ComputationalHealthcare
    cd ComputationalHealthcare/docker
    # launch containers
    docker-compose up -d
    # make sure the containers are running by going to localhost:8111
    open localhost:8111
    # To load NRD data, edit prepare_nrd.sh with the path to NRD 2013 data files
    ./prepare_nrd.sh
    # To load Texas data, edit prepare_tx.sh with the path to Texas data files
    ./prepare_tx.sh
    # You can load either one or both datasets
    # The image contains a Jupyter notebook server
    # Launch ipython/jupyter notebooks
    ./jupyter.sh
    open localhost:8188
    # To stop and remove containers
    docker-compose down
    # The data volumes are named, retained and automatically attached when started again
    docker volume ls
    # The volume chdata contains processed data; the others are used by Postgres & RabbitMQ
Iterate over all patients or quickly retrieve patients who underwent a specific procedure
LevelDB & Protocol Buffers allow the data to be accessed from any programming language that has bindings for both.
    import chlib

    # Get dataset object
    NRD = chlib.data.Data.get_from_config('../config.json', 'HCUPNRD')
    # Path to the LevelDB directory with serialized patient objects as values
    print NRD.db
    # Iterate over all patients & visits
    for p_key, patient in NRD.iter_patients():
        break  # stop after the first patient
    # Retrieve all patients by a specific diagnosis or procedure code
    patients = list(NRD.iter_patients_by_code('D486'))
Codes are prefixed with a unique character per code type. For example:
ICD-9 procedure codes are prefixed with 'P'.
ICD-9 diagnosis codes are prefixed with 'D'.
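The prefix convention can be illustrated with a small helper. This is a minimal sketch and not part of chlib: the `split_code` function and the `CODE_TYPES` mapping are hypothetical names introduced here, and only the two prefixes listed above are covered.

```python
# Hypothetical helper, not part of chlib: splits a prefixed code into its
# code type and the bare code. Only the 'D' and 'P' prefixes named above
# are assumed; any other prefix raises an error.
CODE_TYPES = {
    'D': 'ICD-9 diagnosis',
    'P': 'ICD-9 procedure',
}


def split_code(prefixed):
    """Return (code_type, bare_code) for a prefixed code such as 'D486'."""
    if len(prefixed) < 2:
        raise ValueError('expected a prefix character followed by a code')
    prefix, bare = prefixed[0], prefixed[1:]
    if prefix not in CODE_TYPES:
        raise ValueError('unknown code prefix: %r' % prefix)
    return CODE_TYPES[prefix], bare


print(split_code('D486'))   # ('ICD-9 diagnosis', '486')
print(split_code('P9971'))  # ('ICD-9 procedure', '9971')
```

Keeping the type in the first character means a single string can be used as a lookup key for both diagnoses and procedures without ambiguity.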
You can also print the string representation of enums:
    import chlib

    coder = chlib.codes.Coder()
    print 'D486', coder['D486']    # output: D486 Pneumonia, organism unspecified
    print 'P9971', coder['P9971']  # output: P9971 Therapeutic plasmapheresis
    print coder[chlib.entity.enums.D_AMA]  # output: Against medical advice
Built-in primitives for computing aggregate statistics on a "bag of visits/patients".
Aggregation policy for specifying parameters such as the minimum number of visits.
Statistics and policies are also represented using Protocol Buffers.
Computed aggregate statistics can be quickly examined using a local Django server.
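To illustrate one plausible use of a minimum-count parameter, the sketch below counts visits per group and withholds any group smaller than `min_count`. It is a simplified assumption about the idea, not chlib's implementation: `count_with_policy` and its arguments are hypothetical, and the library's own `Policy` and `Aggregate` classes may apply their parameters differently.

```python
from collections import Counter


def count_with_policy(values, min_count=20):
    """Count occurrences of each value, withholding small groups.

    Hypothetical illustration only (not chlib code): groups with fewer
    than `min_count` members are reported as None instead of a count.
    """
    counts = Counter(values)
    return {
        value: (n if n >= min_count else None)
        for value, n in counts.items()
    }


# Toy data: 25 visits with one disposition, 3 with another.
dispositions = ['routine'] * 25 + ['transfer'] * 3
result = count_with_policy(dispositions, min_count=20)
print(result['routine'])   # 25
print(result['transfer'])  # None
```

Reporting `None` rather than dropping the group keeps it visible that a category exists while hiding its exact size.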
    import webbrowser
    import chlib

    # Aggregate statistics for all inpatient visits in the Texas
    # dataset where the patient underwent therapeutic plasmapheresis
    code = 'P9971'
    TX = chlib.data.Data.get_from_config('../config.json', 'TX')
    aggregate = chlib.entity.aggregate.Aggregate()
    policy = chlib.entity.aggregate.Policy(min_count=20)
    aggregate.init_compute('Test key', "Test dataset", policy)
    for _, p in TX.iter_patients_by_code(code):
        for v in p.visits:
            aggregate.add(v)
    aggregate.end_compute()
    visualizer_url = aggregate.visualize(host='127.0.0.1')
    print visualizer_url  # you can also copy and paste it manually
    webbrowser.open(visualizer_url)
    import webbrowser
    import chlib
    from chlib.entity.aggregate import PatientAggregate, Policy

    # Aggregate statistics for all patients in the HCUP Nationwide
    # Readmission Database who had complications due to a transplanted kidney
    HCUPNRD = chlib.data.Data.get_from_config('config.json', 'HCUPNRD')
    pagg = PatientAggregate()
    pagg.init_compute("temp", "temp", Policy())
    for _, p in HCUPNRD.iter_patients_by_code('D99681'):
        pagg.add_patient(p)
    pagg.end_compute()
    url = pagg.visualize(host="127.0.0.1", port=8000, prefix='local/')
    print url
    webbrowser.open(url)
To minimize the chance of visit/patient-level information leaking via error messages or tracebacks, we have not enabled issues on the GitHub repo. If you find any bugs, make sure that your bug report or question does not contain any visit- or patient-level information. To file bugs or comments, or if you plan on citing the Computational Healthcare library, please contact Akshay Bhat at the email below.
© 2017 Akshay U Bhat, Peter M. Fleischut & Ramin Zabih, Cornell University.
All rights reserved. At this time, we are pursuing the patent process to protect this software.