The performance of a deep learning model is determined jointly by model design and dataset quality. Pebblous focuses on the data side with its <Data Clinic> solution, which consists of three key modules: data imaging, data diagnosis, and data enhancement. The <Data Imaging> module transforms a large, high-dimensional dataset into a form with a more manageable and observable set of dimensions. Once a data image is generated, we perform <Data Diagnosis> to assess data quality using various mathematical and semantic measures. We then generate a <Data Quality Report> to give customers insight into their data and to guide further enhancement actions. As a final step, we apply the <Data Enhancement> module based on the quality report. <Data Synthesis> is one such enhancement action: it fills in missing data or anonymizes private data on the data manifold defined in the data imaging stage.
Quality assessment and synthetic data generation for AI datasets based on the <Data Clinic> solution.
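
To make the imaging, diagnosis, and synthesis flow concrete, here is a minimal sketch in Python with NumPy. It is not the Data Clinic implementation: it assumes PCA as a stand-in for data imaging, a simple occupancy grid as a stand-in for diagnosis, and uniform sampling inside empty grid cells as a stand-in for synthesis. The function names `data_imaging`, `data_diagnosis`, and `data_synthesis` are hypothetical and chosen only to mirror the module names.

```python
import numpy as np


def data_imaging(X, n_components=2):
    """Project X onto its top principal components.

    Assumption: PCA stands in for the imaging step; the real module may
    use a different (possibly nonlinear) embedding.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    Z = Xc @ components.T
    return Z, mean, components


def data_diagnosis(Z, bins=4):
    """Count samples per grid cell of the 2-D data image.

    Assumption: empty cells are treated as a crude signal of coverage gaps.
    """
    counts, x_edges, y_edges = np.histogram2d(Z[:, 0], Z[:, 1], bins=bins)
    return counts, x_edges, y_edges


def data_synthesis(counts, x_edges, y_edges, mean, components, rng):
    """Draw one synthetic point in each empty cell and map it back to the
    original feature space through the linear PCA subspace."""
    empty = np.argwhere(counts == 0)  # rows of (x_bin, y_bin) indices
    if len(empty) == 0:
        return np.empty((0, mean.shape[0]))
    low = np.column_stack([x_edges[empty[:, 0]], y_edges[empty[:, 1]]])
    high = np.column_stack([x_edges[empty[:, 0] + 1], y_edges[empty[:, 1] + 1]])
    Z_new = rng.uniform(low, high)  # one 2-D point per empty cell
    return Z_new @ components + mean


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))  # toy dataset, 10 features
    Z, mean, components = data_imaging(X)
    counts, x_edges, y_edges = data_diagnosis(Z)
    X_new = data_synthesis(counts, x_edges, y_edges, mean, components, rng)
    print("empty cells:", int((counts == 0).sum()), "synthetic rows:", len(X_new))
```

In this sketch the "data manifold" is only the linear subspace spanned by the principal components, and a rectangular grid can mark cells as empty that real data would never occupy, so a practical system would need a stricter notion of where synthesis is valid.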