Topic Modeling of Social Media X Users’ Perceptions on The Kampus Merdeka Internship Program Using BERTopic
Date
2025
Author
Azizah, Kamilah Nurul
Notodiputro, Khairil Anwar
Mualifah, Laily Nissa Atul
Abstract
The Merdeka Belajar Kampus Merdeka (MBKM) program, initiated by the Ministry of Education, Culture, Research, and Technology, aims to bridge the gap between academic theory and industry needs. One of its flagship programs is the Kampus Merdeka Internship, designed to equip students with practical work experience. As a large-scale national program with widespread impact, its implementation has generated extensive public discussion and diverse responses on social media platforms, particularly X. This study applies BERTopic, a transformer-based topic modeling method, to identify the main topics in the discourse surrounding the program. The analysis was performed on 16,943 records collected from the X platform, covering the period from May 21, 2021, to February 28, 2025. The BERTopic pipeline uses IndoSBERT for text embedding, UMAP for dimensionality reduction, HDBSCAN for clustering, and c-TF-IDF for topic representation. The model was optimized using Optuna together with Maximal Marginal Relevance, yielding eight topics with a coherence score of 0.48 and a diversity score of 0.96. The identified topics cover benefits, such as professional development and financial support, as well as challenges ranging from administrative hurdles to ideological debates. These findings provide a basis for recommendations aimed at improving the program's implementation, governance, and sustainability.
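The abstract names c-TF-IDF as the topic representation step and reports a diversity score. As a rough illustration only, the sketch below computes both in plain Python. It assumes the c-TF-IDF weighting given in the BERTopic paper (term frequency within a topic times log(1 + A / f_t), where A is the average number of words per topic and f_t is the term's frequency across all topics) and assumes that diversity means the share of unique words among the top words of all topics. The record does not state which definitions the authors used. The function names and the toy tokens are hypothetical and are not taken from the study's data.

```python
import math
from collections import Counter


def c_tf_idf(class_tokens):
    """Weight each term per topic: tf_{t,c} * log(1 + A / f_t).

    class_tokens maps a topic label to the tokens of all documents
    assigned to that topic, concatenated into one list.
    """
    tf = {topic: Counter(tokens) for topic, tokens in class_tokens.items()}

    # f_t: frequency of each term summed over all topics.
    total = Counter()
    for counts in tf.values():
        total.update(counts)

    # A: average number of words per topic.
    avg_words = sum(total.values()) / len(class_tokens)

    return {
        topic: {
            term: freq * math.log(1 + avg_words / total[term])
            for term, freq in counts.items()
        }
        for topic, counts in tf.items()
    }


def top_words(weights, k):
    """Return the k highest-weighted terms per topic (ties broken alphabetically)."""
    return {
        topic: [
            term
            for term, _ in sorted(w.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
        ]
        for topic, w in weights.items()
    }


def topic_diversity(topics):
    """Share of unique words among the top words of all topics (assumed definition)."""
    words = [word for top in topics.values() for word in top]
    return len(set(words)) / len(words)


# Hypothetical toy input: two invented topics with pre-tokenized text.
docs = {
    "benefits": "magang gaji mentor pengalaman magang gaji".split(),
    "admin": "magang konversi sks kampus konversi sks".split(),
}

tops = top_words(c_tf_idf(docs), k=2)
print(tops)
print(topic_diversity(tops))
```

In this toy input the word "magang" occurs in both topics, so its weight is lower than that of words concentrated in a single topic. This is the behavior c-TF-IDF is meant to produce. A diversity value near 1, such as the 0.96 reported above, would indicate under this definition that the topics share very few top words.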
