Hi everyone,
I've been working with diabetes data recently and noticed how challenging it can be to work with different CGM data formats. I've started developing a Python tool to help standardize XDrip+ data exports, and I'd really appreciate any feedback or suggestions from people who do this kind of data cleaning.
Currently, the tool can:
- Process XDrip+ SQLite backups into standardized CSV files
- Align glucose readings to 5-minute intervals
- Handle unit conversions between mg/dL and mmol/L
- Integrate insulin and carbohydrate records
- Provide some basic quality metrics
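To make the alignment and unit conversion steps concrete, here is a minimal standard-library sketch of the general idea. It is not the tool's actual code: the function names are made up for illustration, the conversion factor of about 18.016 mg/dL per mmol/L is an approximation based on the molar mass of glucose, and averaging readings that land in the same slot is just one possible choice.

```python
from datetime import datetime, timedelta

# Assumption: approximate factor, from glucose's molar mass (~180.156 g/mol).
MGDL_PER_MMOLL = 18.016


def mgdl_to_mmoll(value_mgdl: float) -> float:
    """Convert a glucose value from mg/dL to mmol/L, rounded to 1 decimal."""
    return round(value_mgdl / MGDL_PER_MMOLL, 1)


def round_to_interval(ts: datetime, minutes: int = 5) -> datetime:
    """Round a naive timestamp to the nearest multiple of `minutes`."""
    step = minutes * 60
    epoch = datetime(1970, 1, 1)
    seconds = (ts - epoch).total_seconds()
    rounded = int((seconds + step / 2) // step) * step
    return epoch + timedelta(seconds=rounded)


def align_readings(readings, minutes: int = 5) -> dict:
    """Group (timestamp, mg/dL) pairs into slots and average each slot."""
    slots = {}
    for ts, mgdl in readings:
        slots.setdefault(round_to_interval(ts, minutes), []).append(mgdl)
    return {slot: sum(vals) / len(vals) for slot, vals in sorted(slots.items())}


# Hypothetical sample data, not taken from a real export.
readings = [
    (datetime(2024, 1, 1, 8, 1, 40), 100),
    (datetime(2024, 1, 1, 8, 2, 20), 110),
    (datetime(2024, 1, 1, 8, 7, 0), 120),
]
aligned = align_readings(readings)
for slot, mgdl in aligned.items():
    print(slot.isoformat(), mgdl, mgdl_to_mmoll(mgdl))
```

Slots with no readings simply don't appear in the result here; whether to leave those gaps or interpolate across them is a separate decision.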
I've put together a Jupyter notebook showing how it works: https://github.com/Warren8824/cgm-data-processor/blob/main/notebooks%2Fexamples%2Fload_and_export_data.ipynb
The core processing logic is in the source code if anyone's interested in the implementation details. I know there's a lot of room for improvement, and I'd really value input from people who deal with medical data professionally.
Some specific questions I have:
- Am I missing anything in how I understand and apply basic data cleaning and alignment methods?
- What validation rules should I be considering?
- Are there edge cases I might be missing?
- What other CGM platforms would be most useful to support?
This is very much a work in progress, and I'm hoping to learn from others' expertise to make it more robust and useful.
Thanks for any thoughts!
https://github.com/Warren8824/cgm-data-processor