A Survey on Privacy Preserving Clustering Analysis in Big Data Environment

  • Manjunath T N, Amogh Pramod Kulkarni, Ravindra S Hegad, Prabhuram


Big data computing has gained wide acceptance for its ability to mine knowledge from large volumes of data. It has been applied to knowledge mining tasks in domains such as medicine, finance, and social network analysis. Clustering is one of the most common and important methods for extracting knowledge from large volumes of data. Mining data in domains such as medicine, finance, and social networks can compromise individual privacy and often leaks sensitive information. Such leaks can be direct or can occur through inference. Many methods have been proposed in the literature for preserving privacy during data mining. This work surveys those methods and identifies their weaknesses when applied to big data analytics.