SPIE Prism Award 2023 Winner
Physical data models in machine learning imaging pipelines
Once the raw data is collected, it is processed through a complex image signal processing (ISP) pipeline to produce an image compatible with human perception. However, this processing is rarely considered in machine learning modelling because available benchmark data sets are generally not in raw format. This study shows how to embed the forward acquisition process into the machine learning model.
machine learning, image signal processing, ISP, physical data model
Machine Learning and the Physical Sciences workshop, NeurIPS 2022
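To illustrate what "embedding the forward acquisition process" can mean in practice, here is a minimal sketch of a toy ISP written as an explicit, parameterised function. It is not the pipeline from the study. The steps (black-level subtraction, a single gain standing in for white balance, clipping, gamma encoding), the names `ISPParams` and `toy_isp`, and all default values are assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ISPParams:
    # All values below are hypothetical defaults, not taken from the study.
    black_level: float = 64.0     # sensor offset in digital numbers
    white_level: float = 1023.0   # saturation level of a 10-bit sensor
    gain: float = 1.0             # single gain standing in for white balance
    gamma: float = 2.2            # display gamma


def toy_isp(raw, p):
    """Map raw sensor counts (list of rows) to display values in [0, 1]."""
    out = []
    for row in raw:
        new_row = []
        for dn in row:
            # Normalise counts to [0, 1] using black and white levels.
            x = (dn - p.black_level) / (p.white_level - p.black_level)
            # Apply gain, then clip to the valid range.
            x = min(max(x * p.gain, 0.0), 1.0)
            # Gamma-encode for display.
            new_row.append(x ** (1.0 / p.gamma))
        out.append(new_row)
    return out


raw_patch = [[64, 300], [700, 1023]]
processed = toy_isp(raw_patch, ISPParams())
```

Because the processing is an explicit function of `ISPParams`, a model can in principle be trained or tested on raw data under different processing settings rather than on a single fixed rendering.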
Data models for dataset drift controls in machine learning with images
While there are methods to prospectively validate the robustness of machine learning models to such dataset drifts, existing approaches do not account for explicit models of the primary object of interest: the data. This makes it difficult to create physically faithful drift test cases or to provide specifications of data models that should be avoided when deploying a machine learning model. In this study, we demonstrate how these shortcomings can be overcome…
Machine Learning, Artificial Intelligence, Computer Vision, Pattern Recognition, Data Drift
Preprint available (2022)
Data-centric AI workflow based on compressed raw images
Jetraw images and functions may be used in end-to-end models to generate synthetic data with statistics matching those of genuine raw images, and play an important role in data-centric AI methodologies. Here we show how these features are used for a machine-learning task: the segmentation of cars in urban, suburban and rural environments. Starting from a drone and airship image dataset in the Jetraw format (with calibrated sensor and optics), we use an end-to-end model to emulate realistic satellite raw images with on-demand parameters.
synthetic data, machine learning, AI, data-centric AI, satellite, drones, compression
8th International Workshop on On-Board Payload Data Compression, Athens, 26 September 2022
Statistical distortion of supervised learning predictions in optical microscopy induced by image compression
Interestingly, a recent metrologically accurate algorithm, offering a compression ratio of up to 10:1, provides a prediction spread equivalent to that stemming from raw noise. The method described here makes it possible to set a lower bound on the predictive uncertainty of an SL task and can be generalized to determine the statistical distortions originating from a variety of processing pipelines in AI-assisted fields.
Artificial Intelligence (AI), Supervised Learning (SL) models, Deep Learning (DL) algorithms
Scientific Reports (2022) 12:3464
ML4H auditing: from paper to practice
In this work, we target the paper-to-practice gap by applying an ML4H audit framework proposed by the ITU/WHO Focus Group on Artificial Intelligence for Health (FG-AI4H) to three use cases: diagnostic prediction of diabetic retinopathy, diagnostic prediction of Alzheimer’s disease, and cytomorphologic classification for leukemia diagnostics.
Machine Learning, Health, Testing
Proceedings of the Machine Learning for Health, PMLR 136:280-317, 2020
Unchaining hyperspectral imaging with quantum-inspired compression (UHIQIC)
The current movement towards increased use of lossy compression is highly risky, because even careful and tedious parameter tuning cannot guarantee that no applications are compromised. We implemented and validated a compression method that simultaneously provides a strong data reduction and preserves analysis results for all possible applications.
hyperspectral imaging, machine learning, Earth Observation, satellites, compression
Proceedings of ATTRACT Online Conference "Igniting the Deep Tech Revolution", 22 September 2020, online
Jetraw: validated image compression for quantitative and AI applications
In this paper, we discuss requirements for compression tuned for machine vision and demonstrate an implementation achieving a compression ratio in the range 5:1–10:1 at a rate of 200 MB/s/core in software and 400 MB/s on a VHDL FPGA simulation with a 5k-LUT footprint. We also show that adding a machine-learning component to our compressor increases the compression ratio by 10% and allows for easy portability of an otherwise complex algorithm to heterogeneous architectures.
compression, satellites, machine learning, AI, Earth Observation, ESA
7th International Workshop on On-Board Payload Data Compression by ESA and CNES, virtual online workshop, 2020