dc.contributor.advisor | Kurnia, Anang | |
dc.contributor.advisor | Rizki, Akbar | |
dc.contributor.author | Azzahra, Aulia | |
dc.date.accessioned | 2022-09-25T23:52:29Z | |
dc.date.available | 2022-09-25T23:52:29Z | |
dc.date.issued | 2022 | |
dc.identifier.uri | http://repository.ipb.ac.id/handle/123456789/114650 | |
dc.description.abstract | Text mining merupakan teknik dalam pengolahan data teks untuk
mengekstrak informasi tertentu. Salah satu contoh text mining adalah mengukur
nilai kemiripan antara dua string. Algoritme Levenshtein Distance dapat
digunakan untuk mengukur kemiripan teks dengan menghitung jarak perbedaan
antara dua string. Algoritme lain yang digunakan adalah algoritme Cosine
Similarity. Algoritme Cosine Similarity melakukan pengukuran dengan
membandingkan jarak antara dua string yang dinyatakan dalam dua buah vektor.
Penerapan algoritme ini juga dilakukan dengan berbagai teknik text pre-processing
yang dapat memberikan hasil yang lebih maksimal. Kedua algoritme tersebut
digunakan dalam penelitian ini untuk mengukur kemiripan teks antara dua dataset.
Data yang digunakan yaitu, data hasil survei dengan master data pada database
Perusahaan X. Data hasil survei berjumlah 24.659 respon sedangkan pada database
Perusahaan X berjumlah 725 daftar nama program kelas online. Objek yang diukur
kemiripannya ialah nama program kelas online pada data hasil survei dengan
database Perusahaan X. Nilai kemiripan yang dihasilkan digunakan untuk
mengidentifikasi daftar nama program kelas online yang diikuti peserta pada hasil
survei. Hasil identifikasi dari 24.659 respon hasil survei diperoleh 322 daftar nama
program kelas online dengan algoritme Levenshtein Distance dan 327 daftar nama
program kelas online dengan algoritme Cosine Similarity. | id |
dc.description.abstract | Text mining is a technique for processing text data to extract specific
information. One example of text mining is measuring the similarity between two
strings. The Levenshtein Distance algorithm measures text similarity by computing
the edit distance between two strings. Another algorithm used is Cosine Similarity,
which measures similarity by comparing two strings represented as two vectors.
The algorithm is also applied with various text pre-processing techniques that can
improve the results. This study uses both algorithms to measure text similarity
between two datasets: survey results and the master data in the Company X
database. The survey data comprise 24,659 responses, while the Company X database
contains 725 online class program names. The object compared is the online class
program name in the survey data against the names in the Company X database. The
resulting similarity values are used to identify the online class programs that
survey participants attended. From the 24,659 survey responses, 322 online class
program names were identified with the Levenshtein Distance algorithm and 327
with the Cosine Similarity algorithm. | id |
dc.language.iso | id | id |
dc.publisher | IPB University | id |
dc.title | Analisis Kemiripan Teks Terhadap Goal Standard Menggunakan Algoritme Levenshtein Distance dan Cosine Similarity (Studi Kasus : Data Hasil Survei Perusahaan X) | id |
dc.title.alternative | Analysis of Text Similarity to Goal Standard Using Levenshtein Distance and Cosine Similarity Algorithm (Case Study: Data Result of Company X Survey) | id |
dc.type | Undergraduate Thesis | id |
dc.subject.keyword | cosine similarity | id |
dc.subject.keyword | kemiripan teks | id |
dc.subject.keyword | levenshtein distance | id |
dc.subject.keyword | text mining | id |
dc.subject.keyword | text similarity | id |
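
The abstract describes matching free-text survey responses against a master list using two similarity measures. The following is a minimal sketch of that idea, not the thesis's implementation. All function names and the sample program names are hypothetical, and several details are assumptions: the pre-processing shown (lowercasing and whitespace collapsing), the normalisation of edit distance into a 0-1 similarity, and the use of word-count vectors for cosine similarity.

```python
import math
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Assumed normalisation: 1 minus distance divided by the longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between word-count vectors (an assumed vectorisation)."""
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def preprocess(text: str) -> str:
    """Assumed pre-processing: lowercase and collapse whitespace."""
    return " ".join(text.lower().split())


def best_match(response: str, master: list[str], score) -> str:
    """Return the master-list name most similar to one survey response."""
    cleaned = preprocess(response)
    return max(master, key=lambda name: score(cleaned, preprocess(name)))


# Hypothetical master list and survey response, for illustration only.
master_names = ["Data Science Bootcamp", "Digital Marketing Basics"]
print(best_match("data sciense  bootcamp", master_names, levenshtein_similarity))
print(best_match("data sciense  bootcamp", master_names, cosine_similarity))
```

In practice a minimum similarity threshold would probably be needed so that responses with no plausible match are left unidentified; the abstract does not state which threshold, if any, was used.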