Journal of Computer Science Research, 2020, Vol. 2, Iss. 1

Research and Application on Spark Clustering Algorithm in Campus Big Data Analysis

DOI: https://doi.org/10.30564/jcsr.v2i1.1808

Abstract:

Big data analysis has spread to nearly every field of society and brought profound changes. However, relatively little research has examined how the big data held by colleges and universities can support student management. This paper takes student card records as its research sample and analyzes them with Spark-based data mining and the K-Means clustering algorithm, using scholarship evaluation as a case study. The analysis covers students' daily behavior along multiple dimensions and can help prevent unreasonable scholarship decisions caused by unfair factors such as plagiarism or the votes of teachers and students. It can also help predict absenteeism, physical health problems, and psychological status in advance, making student management more proactive, accurate, and effective.
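As a rough illustration of the clustering step the abstract names, the following is a minimal sketch of K-Means (Lloyd's algorithm) in plain Python. It is not the paper's implementation. The feature names (`canteen_spend`, `library_visits`), the toy records, and the first-k initialization are all assumptions made for illustration, and plain Python stands in for Spark only so that the sketch runs without a cluster. On Spark, one would more likely use MLlib's `KMeans` estimator.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, iterations=20):
    """Plain Lloyd's-algorithm K-Means over a list of equal-length tuples."""
    # Deterministic initialization (an assumption): first k points as centers.
    centers = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return centers, labels


# Synthetic toy records, NOT data from the paper:
# (canteen_spend, library_visits) per student, both hypothetical features.
students = [
    (120.0, 2.0), (450.0, 25.0), (130.0, 3.0),
    (470.0, 22.0), (110.0, 1.0), (440.0, 28.0),
]

centers, labels = kmeans(students, k=2)
print(labels)
print(centers)
```

The features here are left unscaled, so the larger-valued column dominates the distance. With real card data, the columns would normally be standardized before clustering.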
