Towards Contextualized Synthetic Dataset for Construction Site
Datasets are the cornerstone of AI[29], and their absence has contributed to the slow adoption of AI technologies in the construction industry[10]. The hazardous nature of construction sites, the costly and labor-intensive labeling process, and other challenges make collecting construction datasets difficult. To overcome this issue, this research explores an approach to synthetic data generation and presents and validates its benefits compared to real datasets. To narrow the scope, the problem of on-site brick recycling from construction waste is chosen for this study, as it represents the messy nature of the construction context well. This research uses Unity Perception to build a synthetic data generation scene that simulates the construction context. A series of experiments with different levels of realism is proposed, generating thousands of images for the brick recycling scenario at considerably lower cost than the real dataset. To validate the benefits of the synthetic dataset in terms of performance, real datasets are collected, and object detection models are trained on real data, synthetic data, or real data combined with synthetic data. A benchmarking test shows that synthetic data mixed with real data can improve model performance in the construction context. However, the synthetic data still needs improvement if it is to fully replace real data.
Funding
CD research support microgrant
History
Date
2023-05-12
Degree Type
- Master's Thesis
Department
- Architecture
Degree Name
- Master of Science in Computational Design (MSCD)