This source code detects anomalies in the KDD Cup 99 dataset. The K-Means algorithm is applied in a semi-supervised learning setting.
Apache Spark and Apache Zeppelin are required to run the code. You should upload "KDDCupFull.json" to the Zeppelin repository.
More details about the project are available at this link: https://goo.gl/kvCGZZ
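
As a rough illustration of the idea behind K-Means anomaly detection in a semi-supervised setting, here is a minimal sketch in plain Python. It is not the project's Spark code. It assumes one common approach: cluster only records known to be normal, take the largest distance from a normal record to its nearest centroid as a threshold, and flag any new record that lies farther away. The function names, the toy 2-D data, and the max-distance threshold are all assumptions made for illustration.

```python
import math


def nearest(point, centroids):
    """Return (index, distance) of the centroid closest to point."""
    dists = [math.dist(point, c) for c in centroids]
    idx = min(range(len(centroids)), key=dists.__getitem__)
    return idx, dists[idx]


def kmeans(points, k, iterations=20):
    """Plain Lloyd's K-Means. The first k points seed the centroids,
    which keeps this toy example deterministic."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx, _ = nearest(p, centroids)
            clusters[idx].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def fit_detector(normal_points, k):
    """Cluster normal-only data and derive a distance threshold from it
    (assumed rule: the largest distance seen among the normal records)."""
    centroids = kmeans(normal_points, k)
    threshold = max(nearest(p, centroids)[1] for p in normal_points)
    return centroids, threshold


def is_anomaly(point, centroids, threshold):
    """A point is anomalous if it is farther from every centroid than
    any normal training record was from its own centroid."""
    return nearest(point, centroids)[1] > threshold


# Hypothetical toy data: two tight groups of "normal" records.
normal = [(0.0, 0.0), (5.0, 5.0), (0.1, 0.2), (0.2, 0.1), (5.1, 4.9), (4.9, 5.2)]
centroids, threshold = fit_detector(normal, k=2)

print(is_anomaly((0.1, 0.1), centroids, threshold))    # close to a cluster
print(is_anomaly((10.0, -3.0), centroids, threshold))  # far from both clusters
```

On real KDD Cup 99 records, the categorical fields would need to be encoded and the numeric fields scaled before distances are meaningful, and a percentile-based threshold is often less sensitive to outliers in the training data than the maximum used here.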