
AI-Assisted Quality Control of CTD Data

Revision as of 10:30, 22 December 2022 by Lee.croft (talk | contribs)

As part of the suite of Conductivity, Temperature, Depth (CTD) AI tools being produced by the Office of the Chief Data Steward (OCDS), we are developing a model that helps identify and delete poor-quality scans during the CTD quality control process. The model combines a Gaussian Mixture Model (GMM), which clusters CTD scans into groups with similar physical properties, with Multi-Layer Perceptrons (MLPs), which classify the scans within each group. Together they automatically flag the poor-quality scans to be deleted with a high degree of accuracy. By deploying the model as a real-time online endpoint and handling communication with it through a client-side program, we have integrated an experimental model into the client's business process in a field-testing environment. The next phase of this work is to bring the model into a production environment for regular use in the quality control process.

Use Case Objectives

  • Machine Learning Task: Flag in advance the scans to be deleted during CTD quality control
  • Business Value: Flagged scans allow the analyst to quickly focus attention on crucial areas, reducing the time and effort required to delete scans
  • Measures of Success:
    • Accuracy of model predictions
    • Client feedback on quality control speed-ups
  • Aspirational Goals:
    • Mitigation of uncertainty in human decisions
    • Semi- or full automation of scan deletions


Flow diagram for the model integration into the business process.

Machine Learning Pipeline

Three-step process used in the machine learning pipeline.
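
The combination of GMM clustering and per-cluster MLP classification described in the introduction can be sketched with scikit-learn. This is a minimal illustration, not the OCDS implementation. The class name `ClusteredScanClassifier`, the scaling step, the number of clusters, the MLP size, and the synthetic three-column data (standing in for scan measurements) are all assumptions made for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler


class ClusteredScanClassifier:
    """Hypothetical sketch: cluster scans with a GMM, then train one MLP per cluster."""

    def __init__(self, n_clusters=2, random_state=0):
        self.scaler = StandardScaler()
        self.gmm = GaussianMixture(n_components=n_clusters, random_state=random_state)
        self.random_state = random_state
        self.models = {}  # cluster id -> fitted MLP, or a constant label

    def fit(self, X, y):
        Xs = self.scaler.fit_transform(X)
        clusters = self.gmm.fit_predict(Xs)
        for c in np.unique(clusters):
            mask = clusters == c
            labels = y[mask]
            if np.unique(labels).size < 2:
                # Only one class present in this cluster: remember the constant label.
                self.models[int(c)] = int(labels[0])
            else:
                mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                                    random_state=self.random_state)
                self.models[int(c)] = mlp.fit(Xs[mask], labels)
        return self

    def predict(self, X):
        Xs = self.scaler.transform(X)
        clusters = self.gmm.predict(Xs)
        flags = np.zeros(len(Xs), dtype=int)  # 1 = flagged for deletion
        for c in np.unique(clusters):
            model = self.models.get(int(c))
            if model is None:
                continue  # cluster had no training scans: leave unflagged
            mask = clusters == c
            flags[mask] = model if isinstance(model, int) else model.predict(Xs[mask])
        return flags


# Synthetic stand-in data (assumption): two well-separated groups of scans,
# where a scan is labelled "poor" if its first feature is high for its group.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 3)), rng.normal(6.0, 1.0, (200, 3))])
centres = np.repeat([0.0, 6.0], 200)
y = (X[:, 0] > centres + 0.5).astype(int)

model = ClusteredScanClassifier(n_clusters=2).fit(X, y)
flags = model.predict(X)
```

Training a separate classifier per cluster lets each MLP specialise on scans with similar physical properties, which is the motivation the introduction gives for clustering first.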


Experimental Model Performance

Model performance and dataset histogram over the depth range from which CTD scans are collected.
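
A view of accuracy alongside the number of scans per depth interval, as the caption above describes, could be computed along these lines. This is a sketch under assumptions: the function name `accuracy_by_depth`, the bin edges, and the toy labels are hypothetical and are not taken from the project's results.

```python
from bisect import bisect_right


def accuracy_by_depth(depths, y_true, y_pred, bin_edges):
    """Return (counts, accuracy) per depth bin; accuracy is None for empty bins.

    Bins are half-open: [edge_i, edge_{i+1}). Scans outside the edges are skipped.
    """
    n_bins = len(bin_edges) - 1
    counts = [0] * n_bins
    hits = [0] * n_bins
    for depth, truth, pred in zip(depths, y_true, y_pred):
        if depth < bin_edges[0] or depth >= bin_edges[-1]:
            continue  # outside the binned depth range
        i = bisect_right(bin_edges, depth) - 1
        counts[i] += 1
        hits[i] += int(truth == pred)
    accuracy = [h / c if c else None for h, c in zip(hits, counts)]
    return counts, accuracy


# Toy example (made-up values): six scans, three 20 m depth bins.
depths = [5, 15, 25, 35, 45, 55]
y_true = [0, 1, 0, 0, 1, 0]   # 1 = scan deleted by the analyst
y_pred = [0, 1, 1, 0, 1, 0]   # 1 = scan flagged by the model
counts, accuracy = accuracy_by_depth(depths, y_true, y_pred, [0, 20, 40, 60])
# counts   -> [2, 2, 2]
# accuracy -> [1.0, 0.5, 1.0]
```

Reporting the per-bin counts next to the accuracy makes it easier to see where a low score simply reflects a depth range with few scans.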


Model Deployment and Integration

Information flow in the integration of the model deployment into the business process.
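
The client-side communication with the real-time endpoint might look roughly like the sketch below, which uses only the Python standard library. Everything specific here is an assumption for illustration: the URL, the bearer-token header, the `{"scans": [...]}` request body, and the `{"flags": [...]}` response body are hypothetical and do not describe the actual service.

```python
import json
import urllib.request

# Hypothetical placeholder; the real endpoint address is not given in this page.
ENDPOINT_URL = "https://example.invalid/score"


def build_request(scans, url=ENDPOINT_URL, api_key="<api-key>"):
    """Package a list of scan records as a JSON POST request (assumed schema)."""
    body = json.dumps({"scans": scans}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


def parse_flags(response_body, n_scans):
    """Read one boolean flag per scan from the (assumed) response format."""
    flags = json.loads(response_body)["flags"]
    if len(flags) != n_scans:
        raise ValueError(f"expected {n_scans} flags, got {len(flags)}")
    return [bool(f) for f in flags]


def flag_scans(scans, url=ENDPOINT_URL, api_key="<api-key>", timeout=30):
    """Send scans to the endpoint and return the flags to show the analyst."""
    request = build_request(scans, url=url, api_key=api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_flags(response.read(), len(scans))
```

Keeping request building and response parsing separate from the network call lets the client logic be checked offline, for example `parse_flags(b'{"flags": [0, 1]}', 2)` returns `[False, True]`.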


Next Steps