SynthDa: Exploiting Existing Real-World Data for Usable and Accessible Synthetic Data Generation
Acquiring real-world data for computer vision presents challenges such as data scarcity, high costs, and privacy concerns. We introduce SynthDa, an automated approach to usable synthetic data generation (SDG) that empowers users with varying levels of expertise to create diverse synthetic data from existing real-world datasets. It combines pose estimation, synthetic scene creation, and domain randomization to offer data variants. The ease of SDG with SynthDa lets users generate different permutations and combinations of synthetic data and explore the efficacy of various data configurations for their specific AI tasks. Our experiments across multiple existing datasets and models demonstrate the utility of SynthDa in challenging assumptions such as the “more data, the better” paradigm, revealing that excessive synthetic data may degrade performance and vice versa. In a pilot user study with 24 participants, we show the perceived usefulness of SynthDa as a promising SDG tool for overcoming challenges related to real-world data acquisition.
Journal/Conference/Book title: SA '23: SIGGRAPH Asia 2023 Technical Communications