Volume 9, Issue 6, June 2023
DETECTION OF PRIVATE DATA IN MOODLE'S TEXT-BASED FIELDS
BEATA GANCEVSKA, PAULIUS NOMGAUDAS
Abstract
Moodle is the most popular and widely used learning management system, but it collects and stores a large amount of personal information. Using such data for analysis, development, or system testing raises significant privacy concerns. This paper investigates a combination of approaches to anonymizing specific attributes in the Moodle database. The proposed method uses named entity recognition (NER) to find personally identifiable information in text-based fields, while more traditional methods handle other private data (IP addresses, e-mail addresses, etc.). The results show that NER accuracy on Lithuanian language-specific data is still limited: recall for personal names and company names reaches up to 94%, but recall for the various address elements reaches only up to 81%. Data in common formats, by contrast, is easily recognized even without machine learning.
Keywords
DATA ANONYMIZATION, NATURAL LANGUAGE PROCESSING, NAMED ENTITY RECOGNITION, MOODLE.
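The abstract's observation that common-format data can be recognized without machine learning can be illustrated with a minimal sketch. It assumes that regular expressions plus Python's standard `ipaddress` module are an acceptable stand-in for the "traditional methods" mentioned; the patterns, the placeholder tokens `[EMAIL]` and `[IP]`, and the function names are hypothetical and are not taken from the paper.

```python
import ipaddress
import re

# Hypothetical patterns for illustration only; not the paper's implementation.
# Deliberately simple e-mail pattern: local part, "@", domain, top-level domain.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Four dot-separated groups of 1-3 digits that are not part of a longer
# dotted number; a trailing sentence period is tolerated.
IPV4_CANDIDATE_RE = re.compile(r"(?<![\d.])\d{1,3}(?:\.\d{1,3}){3}(?!\.?\d)")


def is_ipv4(candidate: str) -> bool:
    """Return True if the candidate parses as a valid IPv4 address."""
    try:
        ipaddress.IPv4Address(candidate)
        return True
    except ValueError:
        return False


def mask_common_formats(text: str) -> str:
    """Replace e-mail addresses and valid IPv4 addresses with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    # Mask only candidates that validate, so strings such as 999.1.1.1 survive.
    return IPV4_CANDIDATE_RE.sub(
        lambda m: "[IP]" if is_ipv4(m.group()) else m.group(), text
    )


if __name__ == "__main__":
    print(mask_common_formats("Contact jonas@example.com from 192.168.1.10."))
```

Names, company names, and address elements have no fixed surface format, which is why the paper turns to named entity recognition for those fields rather than to patterns of this kind.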
Article No : 3
