<p dir="ltr">This research presents an iterative, precision-oriented pipeline for generating 3D high-rise building masses from natural-language input using Retrieval-Augmented Generation (RAG). To address limitations of existing generative models, it introduces an interactive framework that aligns linguistic descriptions with the latent space of 3D models, enabling precise customization. The system is built on a high-rise dataset with detailed point-cloud representations and employs a 3D Adversarial Autoencoder (3DAAE) to produce diverse, high-accuracy outputs.</p><p dir="ltr">A key component is an interactive platform that integrates semantic retrieval with generative modeling: users enter a description, retrieve a base model, and iteratively refine the design with real-time feedback. Fine-grained control comes from mapping language embeddings to the model's latent space.</p><p dir="ltr">Contributions include a semantically rich high-rise dataset, a robust retrieval mechanism, and an interactive workflow for optimizing 3D forms. This research advances AI-driven generative architecture by improving usability, semantic mapping, and user-driven design processes.</p>