With the expansion of open government data policies and the rapid deployment of generative AI technologies, the use of open government works released under the Korea Open Government License (KOGL) has increased significantly. However, copyright infringement continues to arise from misunderstood license conditions, inaccurate labeling, and ambiguous third-party ownership, resulting in violations of the attribution (BY), non-commercial use (NC), and no-derivatives (ND) conditions, as well as false registration (CF). Traditional manual verification cannot keep pace with the growing volume and complexity of shared content, and it does not detect contextual and multimodal infringement patterns automatically.
To address these challenges, this study proposes a multimodal AI-based copyright infringement risk prediction model that integrates textual descriptions and image content to evaluate KOGL compliance. The model consists of a six-stage pipeline (Input, Pre-processing, CUR Feature Extraction, GRU-based Sequence Modeling, CRU Rule Engine, and Final Decision) and learns contextual and temporal patterns to classify infringement types. Across 1,000 repeated experiments, the proposed model significantly outperformed single-modal baselines in accuracy, F1-score, and AUC, and it provides evidence-based explanatory outputs for practical decision support.
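As a rough illustration of how the six stages might be chained, the following is a minimal sketch and not the model proposed in this study. Everything in it is an assumption made for illustration: the `Work` container, the toy token-length features, the scalar GRU with fixed, untrained weights, the trigger-word table `RULES`, and the 0.9 decision threshold are all hypothetical. Only the NC and ND conditions are checked; BY and CF are omitted for brevity.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Work:
    """Stage 1 (Input): a text description plus a stand-in image embedding."""
    description: str
    image_embedding: list
    license_conditions: set = field(default_factory=set)  # e.g. {"BY", "NC"}


def preprocess(work):
    """Stage 2 (Pre-processing): lowercase and tokenize the description."""
    return work.description.lower().split()


def extract_features(tokens, image_embedding):
    """Stage 3 (Feature Extraction): one toy vector per token, fused with the image mean."""
    image_mean = sum(image_embedding) / len(image_embedding) if image_embedding else 0.0
    return [[len(tok) / 10.0, image_mean] for tok in tokens]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_encode(sequence, w=0.5, u=0.5):
    """Stage 4 (Sequence Modeling): scalar-state GRU with fixed, untrained weights.

    The gates share one weight pair purely to keep the sketch short.
    """
    h = 0.0
    for x in sequence:
        s = w * sum(x)
        z = sigmoid(s + u * h)              # update gate
        r = sigmoid(s + u * h)              # reset gate
        h_tilde = math.tanh(s + u * r * h)  # candidate state
        h = (1.0 - z) * h + z * h_tilde
    return h


# Hypothetical trigger words per license condition (illustrative only).
RULES = {
    "NC": ("commercial", "sale", "sell"),
    "ND": ("edited", "modified", "remix"),
}


def rule_engine(tokens, conditions):
    """Stage 5 (Rule Engine): flag a condition the license carries when a trigger word appears."""
    return [
        cond
        for cond in sorted(RULES)
        if cond in conditions and any(tok in RULES[cond] for tok in tokens)
    ]


def decide(work, threshold=0.9):
    """Stage 6 (Final Decision): combine the sequence score with the rule flags."""
    tokens = preprocess(work)
    score = sigmoid(gru_encode(extract_features(tokens, work.image_embedding)))
    flags = rule_engine(tokens, work.license_conditions)
    return {"risk_score": score, "flags": flags, "at_risk": bool(flags) or score > threshold}


example = Work("Poster edited for commercial sale", [0.2, 0.4], {"BY", "NC", "ND"})
print(decide(example))
```

Because the weights are untrained, the `risk_score` here carries no meaning; the sketch only shows how a learned sequence score and explicit rule flags could be combined so that the final decision can report which license condition triggered it.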