Computer vision (CV) is a field of artificial intelligence that develops techniques to help computers "see" and understand the content of digital images and videos from cameras. According to the McGraw-Hill Dictionary of Scientific and Technical Terms, synthetic data is "any production data applicable to a given situation that are not obtained by direct measurement". While deep learning has brought rapid progress on many computer vision problems, the approach requires large training datasets with annotated ground truth. Concurrently, progress in computer graphics has enabled realistic rendering of synthetic scenes. Synthetic data generated with computer graphics can be acquired and labelled quickly, and thus provides an alternative means to fuel computer vision approaches that require copious training data.
There are many benefits to using synthetic images for CV-based systems. Real-world training datasets are often low in quality, insufficient in quantity, poorly tagged, and may not represent rare conditions. Several systems trained on synthetic data have reportedly outperformed their conventional counterparts trained on real-world data. However, synthetic data is not suited to every CV-based problem, especially when the data are too complex to "fake" correctly. Care should be taken to ensure that datasets correctly represent real-world scenarios and encompass all conditions required by the CV-based system.
CV-based object detection relies on algorithms that identify the object categories present in an image and output bounding boxes indicating the detected position and scale of each object. Examples of CV-based object detection models trained with deep learning include YOLO, SSD, and Faster R-CNN. These models are often data hungry and require a substantial amount of data to achieve acceptable accuracy.
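To make the idea of a labelled detection concrete, the following minimal Python sketch represents one bounding box and converts it from pixel corner coordinates into a normalized centre format, a convention used by many YOLO-style label files. The class name `Detection`, its fields, and the helper `to_normalized_center_format` are hypothetical names chosen for illustration and do not come from any particular library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected (or annotated) object. Hypothetical structure for illustration."""
    label: str      # object category, e.g. "car"
    x_min: float    # left edge of the box, in pixels
    y_min: float    # top edge of the box, in pixels
    x_max: float    # right edge of the box, in pixels
    y_max: float    # bottom edge of the box, in pixels
    score: float    # confidence in [0, 1]; 1.0 for ground-truth labels


def to_normalized_center_format(det, image_width, image_height):
    """Return (x_center, y_center, width, height) as fractions of the image size."""
    box_w = det.x_max - det.x_min
    box_h = det.y_max - det.y_min
    x_center = (det.x_min + box_w / 2) / image_width
    y_center = (det.y_min + box_h / 2) / image_height
    return (x_center, y_center, box_w / image_width, box_h / image_height)


if __name__ == "__main__":
    # A 200x100 pixel box centred in a 400x200 pixel image.
    det = Detection("car", 100, 50, 300, 150, 0.9)
    print(to_normalized_center_format(det, 400, 200))  # (0.5, 0.5, 0.5, 0.5)
```

Normalized coordinates keep the labels valid when images are resized, which is convenient when synthetic images are rendered at several resolutions.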
To generate a synthetic training dataset, 3D environments are created. These should include 3D models that are good representations of the target objects to be identified by the CV-based object detection model. The following are some websites where 3D models can be obtained.
This section describes an approach to generating a synthetic training dataset using Unity3D. The idea is to render the 3D object of interest against varied background images, using different camera settings and lighting conditions. The resulting dataset can then be used to train object detection models.
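The variation described above (backgrounds, camera settings, lighting) can be sketched as a randomized render plan. The Python sketch below only samples scene parameters; the rendering itself would happen in Unity3D and is not shown. All names (`SceneConfig`, `sample_scene`, `build_render_plan`) and all numeric ranges are assumptions made for illustration, not values taken from the approach itself.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneConfig:
    """Parameters for one rendered image. Fields and units are hypothetical."""
    background: str            # background image file to place behind the object
    camera_distance_m: float   # distance from camera to object
    camera_yaw_deg: float      # horizontal angle of the camera around the object
    camera_pitch_deg: float    # vertical angle of the camera
    field_of_view_deg: float   # camera field of view
    light_intensity: float     # relative brightness of the main light
    light_yaw_deg: float       # horizontal direction of the main light


def sample_scene(backgrounds, rng):
    """Draw one random scene configuration. The ranges are illustrative assumptions."""
    return SceneConfig(
        background=rng.choice(backgrounds),
        camera_distance_m=rng.uniform(1.0, 5.0),
        camera_yaw_deg=rng.uniform(0.0, 360.0),
        camera_pitch_deg=rng.uniform(-10.0, 60.0),
        field_of_view_deg=rng.uniform(40.0, 90.0),
        light_intensity=rng.uniform(0.3, 1.5),
        light_yaw_deg=rng.uniform(0.0, 360.0),
    )


def build_render_plan(backgrounds, n_images, seed=0):
    """Return n_images scene configurations; a fixed seed makes the plan repeatable."""
    rng = random.Random(seed)
    return [sample_scene(backgrounds, rng) for _ in range(n_images)]


if __name__ == "__main__":
    plan = build_render_plan(["street.jpg", "office.jpg", "forest.jpg"], n_images=3)
    for config in plan:
        print(config)
```

Seeding the generator makes a dataset reproducible, and widening or narrowing the ranges is one way to check that the rendered images cover the conditions the detection model is expected to handle.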
View a 3-minute YouTube tutorial.
Email at lynn@scry3D.com.