Publications by Dotphoton's team

Data Models for Dataset Drift Controls in Machine Learning With Images

While there are methods to prospectively validate the robustness of machine learning models to such dataset drifts, existing approaches do not account for explicit models of the primary object of interest: the data. This makes it difficult to create physically faithful drift test cases or to provide specifications of data models that should be avoided when deploying a machine learning model. In this study, we demonstrate how these shortcomings can be overcome…

Authors
Luis Oala, Marco Aversa, Gabriel Nobis, Kurt Willis, Yoan Neuenschwander, Michèle Buck, Christian Matek, Jérôme Extermann, Enrico Pomarico, Wojciech Samek, Roderick Murray-Smith, Christoph Clausen, Bruno Sanguinetti
Keywords
Machine Learning, Artificial Intelligence, Computer Vision, Pattern Recognition, Data Drift
Published
To be announced

Data-Centric AI workflow based on compressed raw images

Jetraw images and functions may be used in end-to-end models to generate synthetic data with statistics matching those of genuine raw images, and they play an important role in data-centric AI methodologies. Here we show how these features are used for a machine-learning task: the segmentation of cars in urban, suburban and rural environments. Starting from a drone and airship image dataset in the Jetraw format (with calibrated sensor and optics), we use an end-to-end model to emulate realistic satellite raw images with on-demand parameters.

Authors
Marco Aversa, Ziad Malik, Phillip Geier, Fabien Droz, Andres Upegui, Roderick Murray-Smith, Christoph Clausen, Bruno Sanguinetti
Keywords
synthetic data, machine learning, AI, data-centric AI, satellite, drones, compression
Published
8th International Workshop on On-Board Payload Data Compression, Athens, 26 September 2022

Statistical distortion of supervised learning predictions in optical microscopy induced by image compression

Interestingly, a recent metrologically accurate algorithm, offering up to a 10:1 compression ratio, provides a prediction spread equivalent to that stemming from raw noise. The method described here makes it possible to set a lower bound on the predictive uncertainty of an SL task and can be generalized to determine the statistical distortions originating from a variety of processing pipelines in AI-assisted fields.

Authors
Enrico Pomarico, Cédric Schmidt, Florian Chays, David Nguyen, Arielle Planchette, Audrey Tissot, Adrien Roux, Stéphane Pagès, Laura Batti, Christoph Clausen, Theo Lasser, Aleksandra Radenovic, Bruno Sanguinetti & Jérôme Extermann
Keywords
Artificial Intelligence (AI), Supervised Learning (SL) models, Deep Learning (DL) algorithms
Published
Scientific Reports (2022) 12:3464

ML4H Auditing: From Paper to Practice

In this work, we target the paper-to-practice gap by applying an ML4H audit framework proposed by the ITU/WHO Focus Group on Artificial Intelligence for Health (FG-AI4H) to three use cases: diagnostic prediction of diabetic retinopathy, diagnostic prediction of Alzheimer’s disease, and cytomorphologic classification for leukemia diagnostics.

Authors
Luis Oala, Jana Fehr, Luca Gilli, Pradeep Balachandran, Alixandro Werneck Leite, Saul Calderon-Ramirez, Danny Xie Li, Gabriel Nobis, Erick Alejandro Muñoz Alvarado, Giovanna Jaramillo-Gutierrez, Christian Matek, Arun Shroff, Ferath Kherif, Bruno Sanguinetti, Thomas Wiegand
Keywords
Machine Learning, Health, Testing
Published
Proceedings of the Machine Learning for Health NeurIPS Workshop, PMLR 136:280-317, 2020

Unchaining Hyperspectral Imaging with Quantum-Inspired Compression (UHIQIC)

The current movement towards increased use of lossy compression is highly risky, because even careful and tedious parameter tuning cannot guarantee that no applications are compromised. We implemented and validated a compression method that simultaneously provides a strong data reduction and preserves analysis results for all possible applications.

Authors
Christoph Clausen, Bruno Sanguinetti, Yosef Akhtman, Enrico Pomarico, Jérôme Extermann
Keywords
hyperspectral imaging, machine learning, Earth Observation, satellites, compression
Published
Proceedings of ATTRACT Online Conference "Igniting the Deep Tech Revolution", 22 September 2020, online

Validated efficient image compression for quantitative and AI applications

In this paper, we discuss requirements for compression tuned for machine vision and demonstrate an implementation achieving a compression ratio in the range 5:1–10:1 at a rate of 200 MB/s/core in software and 400 MB/s on a VHDL FPGA simulation with a 5k-LUT footprint. We also show that adding a machine-learning component to our compressor increases the compression ratio by 10% and allows for easy portability of an otherwise complex algorithm to heterogeneous architectures.

Authors
Bruno Sanguinetti, Christoph Clausen, Michael Desert, and Evgeniya Balysheva
Keywords
compression, satellites, machine learning, AI, Earth Observation, ESA
Published
7th International Workshop on On-Board Payload Data Compression by ESA and CNES, virtual online workshop, 2020