<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/resources/xsl/jats-html.xsl"?>
<article article-type="research-article" dtd-version="1.1" xml:lang="en" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
	<journal-meta>
		<journal-id journal-id-type="publisher-id">jkits</journal-id>
		<journal-title-group>
		<journal-title xml:lang="ko">한국지식정보기술학회 논문지</journal-title>
		<journal-title>Journal of Knowledge Information Technology and Systems</journal-title>
		</journal-title-group>
		<issn pub-type="ppub">1975-7700</issn>
		<publisher>
		<publisher-name xml:lang="ko">한국지식정보기술학회</publisher-name>
		<publisher-name>Korea Knowledge Information Technology Society</publisher-name>
		</publisher>
	</journal-meta>
	<article-meta>
		<article-id pub-id-type="publisher-id">jkits_2020_15_06_955</article-id>
		<article-id pub-id-type="doi">10.34163/jkits.2020.15.6.004</article-id>
		<article-categories>
			<subj-group>
				<subject>Research Article</subject>
			</subj-group>
		</article-categories>
		<title-group>
			<article-title>A Study on Big Data-based GraphX Model for Social Network Service</article-title>
			<trans-title-group xml:lang="ko">
				<trans-title>소셜 네트워크 서비스를 위한 빅데이터 기반 GraphX 모델에 관한 연구</trans-title>
			</trans-title-group>
		</title-group>
		<contrib-group>
			<contrib contrib-type="author" xlink:type="simple">
				<name-alternatives>
					<name name-style="western">
						<surname>Cho</surname>
						<given-names>Leesang</given-names>
					</name>
					<name name-style="eastern" xml:lang="ko">
						<surname>조</surname>
						<given-names>이상</given-names>
					</name>
				</name-alternatives>
					<xref ref-type="aff" rid="A1"><sup>1</sup></xref>
			</contrib>
			<contrib contrib-type="author" xlink:type="simple">
				<name-alternatives>
					<name name-style="western">
						<surname>Kim</surname>
						<given-names>Jinhong</given-names>
					</name>
					<name name-style="eastern" xml:lang="ko">
						<surname>김</surname>
						<given-names>진홍</given-names>
					</name>
				</name-alternatives>
				<xref ref-type="fn" rid="fn001">*</xref>
				<xref ref-type="aff" rid="A2"><sup>2</sup></xref>
			</contrib>
					</contrib-group>
		<aff-alternatives id="A1">
			<aff><sup>1</sup><italic>Division of Mechanical and Electronics Engineering, Hansung University</italic></aff>
			<aff xml:lang="ko"><italic>한성대학교 기계전자공학부 조교수</italic></aff>
		</aff-alternatives>
		<aff-alternatives id="A2">
			<aff><sup>2</sup><italic>Department of Computer Engineering, Paichai University</italic></aff>
			<aff xml:lang="ko"><italic>배재대학교 컴퓨터공학과 조교수</italic></aff>
		</aff-alternatives>
		<author-notes>
			<fn id="fn001"><label>*</label><p>Corresponding author is with the Department of Computer Engineering, Paichai University, 155-40 Baejae-ro, Seo-Gu Daejeon, 35345, KOREA.</p><p><italic>E-mail address</italic>: <email>jinhkm@pcu.ac.kr</email></p></fn>
		</author-notes>
		<pub-date pub-type="ppub">
			<month>12</month>
			<year>2020</year>
		</pub-date>
		<volume>15</volume>
		<issue>6</issue>
		<fpage>955</fpage>
		<lpage>962</lpage>
		<history>
			<date date-type="received">
				<day>27</day>
				<month>11</month>
				<year>2020</year>
			</date>
			<date date-type="rev-recd">
				<day>03</day>
				<month>12</month>
				<year>2020</year>
			</date>
			<date date-type="accepted">
				<day>11</day>
				<month>12</month>
				<year>2020</year>
			</date>
		</history>
		<permissions>
			<copyright-statement>&#x00A9; 2020 KKITS All rights reserved</copyright-statement>
			<copyright-year>2020</copyright-year>
		</permissions>
		<abstract>
		<title>ABSTRACT</title>
<p>Nowadays, the adoption of big data processing systems has increased, and they are commonly seen in every aspect of life. In this context, the problem of finding connected components in undirected graphs has been well studied: it is an essential pre-processing step for many graph computations and a fundamental task in graph analytics applications. Recently, it has been a main area of interest in large graph processing. However, much of the research has focused on solving the problem using High Performance Computers. In large distributed systems, the MapReduce framework dominates the processing of big data and has been used for finding connected components in big graphs, although iterative processing is not directly supported in MapReduce. Current big data processing systems have evolved to support iterative processing and to provide additional features beyond MapReduce. This research investigates how to enhance the performance of the connected components algorithm for large graphs in a distributed processing system. It takes the approach of considering the graph degree property when choosing the component identifier, and reviews how this can affect the efficiency of the algorithm. The design of our proposed algorithm integrates features provided by current processing systems, such as moving the computation closer to the data partition in the Spark framework model.</p>
		</abstract>
		<trans-abstract xml:lang="ko">
		<title>요약</title>
		<p>오늘날 빅데이터 처리 시스템의 증가로 인해 삶의 모든 측면에서 일상적으로 볼 수 있다. 이를 위해, 무방향 그래프에서 연결된 구성 요소를 찾는 문제에 대해 잘 연구되고 있으며, 많은 그래프 계산에 필수적인 전처리 단계로서 그래프 분석 애플리케이션의 기본 작업이다. 최근 큰 그래프 처리에서의 주요 관심 분야이기도 하다. 하지만, 대부분의 연구는 고성능 컴퓨터를 사용하여 문제를 해결하는 데 중점을 두고 있다. 대규모 분산 시스템에서 MapReduce 프레임워크는 빅데이터 처리를 이용하며, 반복 처리가 MapReduce에서 직접 지원되지는 않지만 큰 그래프에서 연결된 구성요소를 찾는 데 사용되고 있다. 현재 빅데이터 처리 시스템은 반복 처리를 지원하고, MapReduce 이외의 추가 기능을 제공한다. 따라서, 본 연구는 분산 처리 시스템에서 큰 그래프에 대한 연결 성분 찾기 알고리즘의 성능을 향상시키는 방법을 제안하였다. 구성 요소 식별자를 선택할 때 그래프 정도 속성을 고려하는 접근 방식을 사용하여 알고리즘의 효율성이 어떤 영향을 미칠 수 있는지 검토하였다. Spark의 프레임워크모델로 더 많은 계산을 이동하는 것과 같이 현재 새로운 처리 시스템에서 제공되도록 제안하였다.</p>
		</trans-abstract>
		<kwd-group kwd-group-type="author" xml:lang="en">
<title>KEYWORDS</title>
			<kwd>Graph analytics applications</kwd>
			<kwd>Large graph processing</kwd>
			<kwd>Distributed systems</kwd>
			<kwd>Big data processing systems</kwd>
			<kwd>Spark framework model</kwd>
		</kwd-group>
	</article-meta>
</front>
<body>
<sec id="sec001" sec-type="intro">
	<title>1. Introduction</title>
<p>In recent years, the trend towards adopting big data processing systems has increased, and such systems are commonly seen in every aspect of life. This is largely because the cost of storage is decreasing and the ability to capture different kinds of data is growing [<xref ref-type="bibr" rid="B001">1</xref>]. In view of the diversity of data acquired nowadays and the massive amount of data stored, new methods are needed to deal with data beyond the traditional database, such as handling data that does not fit in a single machine's memory or disk. One approach is to look at data as a network, or a graph, with edges connecting things together; those edges can take different forms of relationships [<xref ref-type="bibr" rid="B002">2</xref>]. This metaphor of a graph is currently used in many areas: computer science, economics, sociology, biology, and many more. Almost anything can be represented as a graph. Graphs are a very flexible data model: they capture local and global characteristics of a system and support analysis of different features of complex networks. Extensive research has been carried out on graphs and graph processing, and graphs have been used extensively to process data efficiently and extract knowledge [<xref ref-type="bibr" rid="B003">3</xref>]. However, recent graphs are beyond the ability of traditional systems to handle, either because current graphs are very big and usually do not fit in a machine's memory, or because current algorithms cannot process such graphs efficiently, particularly on current distributed systems. Our focus in this research is on the problem of finding connected components efficiently in an undirected graph [<xref ref-type="bibr" rid="B004">4</xref>]. A component is a graph (or subgraph) in which any two vertices are connected via paths, and no edge connects any of its vertices to a vertex outside the component [<xref ref-type="bibr" rid="B005">5</xref>]. This problem has been well studied, as it is an essential pre-processing step for many graph computations and a building block in complex graph analysis such as clustering. In large graph processing, much of the research so far has focused on solving the problem using High Performance Computers [<xref ref-type="bibr" rid="B006">6</xref>] with high computation power and very large memory capacity. Large-scale graphs (or big graphs) are usually stored using a distributed file system, like Hadoop, either in the cloud or locally [<xref ref-type="bibr" rid="B007">7</xref>]. Hadoop provides an open source software framework for clusters of commodity computers, and the MapReduce framework dominates the processing of large-scale data on Hadoop and is commonly used for mining big graphs. However, iterative processing is not directly supported in MapReduce. Our research builds on the observation that current big data processing systems have become more advanced, with features beyond MapReduce. For example, a processing system like Spark supports iterative processing and provides additional features such as data partitioning and caching [<xref ref-type="bibr" rid="B008">8</xref>]. Spark also supports graph processing through GraphX, an open source Spark API for graph-parallel computation. Moreover, current connected component algorithms in large distributed processing systems only use the traditional approach of choosing the component identifier for each connected component based on the lexical ordering of the node ID value [<xref ref-type="bibr" rid="B009">9</xref>].</p>
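<p>The traditional lexical-ordering approach can be illustrated with a minimal, self-contained sketch (plain Python rather than Spark/GraphX code; all names here are illustrative): each vertex repeatedly adopts the smallest identifier among itself and its neighbours, so every component ends up labelled with its smallest node ID. The degree-aware identifier choice investigated in this research would replace this minimum-ID selection with a degree-based one.</p>

```python
def connected_components(edges):
    """Label-propagation sketch: every vertex repeatedly adopts the
    smallest label among itself and its neighbours until a fixed point,
    so each component is identified by its smallest vertex ID."""
    # Build an undirected adjacency list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    label = {v: v for v in adj}          # initial identifier = own ID
    changed = True
    while changed:                       # iterate until no label changes
        changed = False
        for v, neighbours in adj.items():
            best = min([label[v]] + [label[n] for n in neighbours])
            if best != label[v]:         # min() only ever lowers the label
                label[v] = best
                changed = True
    return label

# Two components: {1, 2, 3} and {7, 8}
print(connected_components([(2, 1), (2, 3), (7, 8)]))
# {2: 1, 1: 1, 3: 1, 7: 7, 8: 7}
```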
<p>This paper is organized as follows. Section 2 describes big data. Graphs for system modeling are presented in Section 3, and the big data-based GraphX model in Section 4. Finally, Section 5 concludes the paper.</p>
</sec>
<sec id="sec002">
	<title>2. Big Data</title>
<p>Big data is broadly defined as data that is too big, too fast, and too hard to deal with using conventional database tools. A more technical view defines big data as data that requires new technologies and architectures, because traditional database management tools and data processing applications are unable to process it in a timely, cost-effective way: it is too large to be stored and processed, and too complex and varied to be analysed and visualized. Big data is commonly explained through three concepts: 1) Volume is the word associated with “BIG” in big data [<xref ref-type="bibr" rid="B010">10</xref>]. It refers to the massive and increasing amount of data collected and produced, which goes beyond the ability to hold and process it easily. 2) Variety: data comes from many sources, including, for example, web logs, sensor data, social media data, emails, images, documents and audio. Data generally comes in three types: structured, semi-structured and unstructured. Variety is probably the hardest aspect to manage when processing a large amount of data. 3) Velocity concerns the speed at which data arrives from various sources, for example streaming data and sensor data, or data that must be handled in real time.</p>
<p>Hadoop is an open source software framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models [<xref ref-type="bibr" rid="B011">11</xref>]. It is based on MapReduce. HDFS was developed for reliable, scalable, distributed computing. It allows working with thousands of computers and dealing with petabytes of data, operating on commodity machines to store data across hundreds of computers [<xref ref-type="bibr" rid="B012">12</xref>]. Data nodes host files; files are divided into chunks (usually 64 megabytes in size), which are replicated on different disks (usually three times, with one replica on a different rack). A master node maintains a directory that records where each file is stored and replicated. MapReduce gives the programmer the advantage of not needing to consider the details of data distribution [<xref ref-type="bibr" rid="B013">13</xref>], parallel execution, replication and load balancing. Its programming concept is familiar and allows parallelised and distributed execution of jobs across clusters of computers [<xref ref-type="bibr" rid="B014">14</xref>]. It requires two functions: 1) the Map function, defined by the programmer to process key-value data. Each chunk (or more) of the input is processed by the map function, which outputs key-value pairs. These pairs are then divided among reducers so that every group with the same key goes to the same reducer. 2) The Reduce function takes the key-value pairs, combines all the values associated with the same key, and carries out any computation defined by the programmer. It then outputs the new value. The reducer output can itself be key-value pairs that feed another mapper in an iterative way [<xref ref-type="bibr" rid="B015">15</xref>].</p>
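<p>The two functions and the shuffle step between them can be sketched in plain Python (a hypothetical in-memory simulation of the MapReduce flow, not Hadoop code; the word-count task is the usual illustrative example):</p>

```python
from collections import defaultdict

def map_fn(chunk):
    # Programmer-defined map: emit (word, 1) for each word in the chunk.
    for word in chunk.split():
        yield word, 1

def reduce_fn(key, values):
    # Programmer-defined reduce: combine all values for one key.
    return key, sum(values)

def map_reduce(chunks):
    # Shuffle: group all emitted pairs so that every group with the
    # same key reaches the same reducer.
    groups = defaultdict(list)
    for chunk in chunks:
        for key, value in map_fn(chunk):
            groups[key].append(value)
    return dict(reduce_fn(k, vs) for k, vs in groups.items())

print(map_reduce(["big data big graphs", "big graphs"]))
# {'big': 3, 'data': 1, 'graphs': 2}
```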
<p>In short, a Hadoop cluster is at least one machine running the Hadoop software. In each cluster there is a single master node and a varying number of slave nodes. Slave nodes can act both as computing nodes for MapReduce and as data nodes for HDFS.</p>
</sec>
<sec id="sec003" sec-type="methods">
	<title>3. Graph for System Modeling</title>
<p>With the huge amount of data collected, there is an urgent need to process it efficiently and extract knowledge that no one has discovered before. One approach is to look at this data as a network with links connecting things together; those links can take different forms of relationships. This metaphor of networks is currently used in many areas: computer science, economics, sociology, biology, and many more. It can effectively address many of the challenges in each area by capturing the “connectedness” of these complex systems. Modelling the relationships in a network as a graph supports a natural human interpretation and simple mechanical analysis, and mathematicians have extensively studied graphs and their properties. Graphs have been used to understand complex human and natural phenomena. In general, a graph is used in any domain where a network representation of logical or physical links between entities is needed. Its applications span a wide variety of domains, such as linguistics, economics, sociology, biology, chemistry and pharmacology (e.g., graphs model the complicated structure of chemical compounds and protein structures), and computer science (e.g., the World Wide Web, workflows, XML documents, computer networks, physical connections, computer vision, video indexing, text retrieval and social networks), where graph algorithms have been developed to solve different kinds of problems. &#x003C;<xref ref-type="fig" rid="f001">Figure 1</xref>&#x003E; shows the detailed graph for the Spark architecture.</p>
	<fig id="f001" orientation="portrait" position="float">
		<label>Figure 1.</label>
		<caption>
			<title>Graph for Spark Architecture</title>
		</caption>
		<graphic xlink:href="../ingestImageView?artiId=ART002663655&amp;imageName=jkits_2020_15_06_955_f001.jpg" position="float" orientation="portrait" xlink:type="simple"></graphic>
	</fig>
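<p>As a concrete illustration of the graph-as-network metaphor, a property graph can be written down as a list of vertices and a list of edges, each carrying an attribute; this mirrors the vertex/edge representation that GraphX also uses. The data and the degree helper below are purely illustrative (a minimal plain-Python sketch, not Spark code); the degree is the property our proposed algorithm considers when choosing component identifiers.</p>

```python
# Vertices carry (id, attribute); edges carry (src, dst, attribute).
# This mirrors the property-graph representation used by systems like
# GraphX; the social-network data here is purely illustrative.
vertices = [(1, "Alice"), (2, "Bob"), (3, "Carol")]
edges = [(1, 2, "follows"), (2, 3, "follows"), (3, 1, "friend")]

def degree(vertex_id):
    """Number of edges touching a vertex, ignoring direction."""
    return sum(1 for s, d, _ in edges if vertex_id in (s, d))

print([(v, name, degree(v)) for v, name in vertices])
# [(1, 'Alice', 2), (2, 'Bob', 2), (3, 'Carol', 2)]
```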
</sec>
<sec id="sec004" sec-type="Results">
	<title>4. Big Data-based GraphX</title>
<p>There is an urgent need to deal with structured, unstructured and heterogeneous data instead of only the traditional kind. Graphs are now widely used because of their expressive power and their ability to connect objects in different ways. However, graph mining is hard to implement because of the structure and the size of the data: real-world graphs are very big and usually do not fit in a machine's memory. In addition, no single model efficiently fits all types of graph algorithms and applications; instead, many models have been developed to solve specific problems or to serve special classes of applications. There is a rising trend to capture and store any data available, especially as the cost of storage decreases and the ability to capture different kinds of data increases. This is pushed by the world getting more connected: more connected devices, embedded sensors and expanding networks have all contributed to founding the area of the Internet of Things (IoT). In addition, people's lives are more digital nowadays than ever before, with an increasing presence of social media (Facebook, Twitter, Snapchat, Instagram, and many more). Real-world datasets are growing enormously; they reflect different kinds of relationships and can generally be represented efficiently using graph structures. However, as graphs grow larger, their size and complexity go beyond the ability of a single processing machine, and even processing them with HPC systems, which are not always suitable, becomes a challenging task. Accordingly, several extensions were made to Spark so that it can process relational data efficiently and run SQL for GraphX, as shown in &#x003C;<xref ref-type="fig" rid="f002">Figure 2</xref>&#x003E;.</p>
	<fig id="f002" orientation="portrait" position="float">
		<label>Figure 2.</label>
		<caption>
			<title>Spark Interface SQL for GraphX</title>
		</caption>
		<graphic xlink:href="../ingestImageView?artiId=ART002663655&amp;imageName=jkits_2020_15_06_955_f002.jpg" position="float" orientation="portrait" xlink:type="simple"></graphic>
	</fig>
<p>GraphX is a graph processing library on top of Spark, implemented on top of its dataset API. It supports the implementation of different graph processing models, such as vertex-centric, gather-sum-apply, and scatter-gather. In GraphX a graph is represented by a dataset of vertices and a dataset of edges. &#x003C;<xref ref-type="fig" rid="f003">Figure 3</xref>&#x003E; shows a detailed description of each data processing stage of the Spark framework model.</p>
	<fig id="f003" orientation="portrait" position="float">
		<label>Figure 3.</label>
		<caption>
			<title>Spark Framework Model</title>
		</caption>
		<graphic xlink:href="../ingestImageView?artiId=ART002663655&amp;imageName=jkits_2020_15_06_955_f003.jpg" position="float" orientation="portrait" xlink:type="simple"></graphic>
	</fig>
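<p>The vertex-centric model mentioned above can be sketched as a minimal Pregel-style superstep loop (a plain-Python illustration of the model, not the GraphX Pregel API; the function names and structure are assumptions). In each superstep a vertex folds its incoming messages into its value and, if the value changed, messages its neighbours; the loop stops when no messages remain.</p>

```python
def pregel(vertices, edges, initial, compute, max_supersteps=20):
    """Minimal vertex-centric (Pregel-style) loop."""
    adj = {v: [] for v in vertices}
    for u, w in edges:                  # undirected: messages flow both ways
        adj[u].append(w)
        adj[w].append(u)
    value = {v: initial(v) for v in vertices}
    # Superstep 0: every vertex announces its value to its neighbours.
    inbox = {v: [] for v in vertices}
    for v in vertices:
        for n in adj[v]:
            inbox[n].append(value[v])
    for _ in range(max_supersteps):
        outbox = {v: [] for v in vertices}
        for v, msgs in inbox.items():
            if not msgs:
                continue
            new = compute(value[v], msgs)
            if new != value[v]:         # only changed vertices keep messaging
                value[v] = new
                for n in adj[v]:
                    outbox[n].append(new)
        if not any(outbox.values()):    # no messages: fixed point reached
            break
        inbox = outbox
    return value

# Connected components as a vertex program: keep the minimum label seen.
labels = pregel([1, 2, 3, 7, 8], [(1, 2), (2, 3), (7, 8)],
                initial=lambda v: v,
                compute=lambda val, msgs: min([val] + msgs))
print(labels)
# {1: 1, 2: 1, 3: 1, 7: 7, 8: 7}
```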
<p>As mentioned above, our simulation uses a message_Identification class for the messages exchanged in the max identification &#x26; pruning steps. The message_Tree class is used to generate update messages for the seed propagation tree T. The message_Propagation class is used in the seed propagation phase to hold the updates that are generated from the root to the child nodes, where each node notifies its child nodes of its component identifier.</p>
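<p>A hypothetical Python rendering of these three message classes follows (the class roles come from the description above, but the field names are illustrative assumptions, not taken from the implementation):</p>

```python
from dataclasses import dataclass

@dataclass
class message_Identification:
    """Message exchanged in the max identification and pruning steps."""
    sender_id: int
    candidate_id: int       # identifier proposed for the component

@dataclass
class message_Tree:
    """Update message for the seed propagation tree T."""
    parent_id: int
    child_id: int

@dataclass
class message_Propagation:
    """Seed propagation phase: a node notifies a child node of its
    component identifier."""
    node_id: int
    component_id: int

msg = message_Propagation(node_id=5, component_id=1)
print(msg.component_id)     # the identifier pushed down the tree
```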
	<p>&#x003C;<xref ref-type="fig" rid="f004">Figure 4</xref>&#x003E; presents the class diagram of the classes used for exchanging messages between nodes.</p>
	<fig id="f004" orientation="portrait" position="float">
		<label>Figure 4.</label>
		<caption>
			<title>Class Diagram</title>
		</caption>
		<graphic xlink:href="../ingestImageView?artiId=ART002663655&amp;imageName=jkits_2020_15_06_955_f004.jpg" position="float" orientation="portrait" xlink:type="simple"></graphic>
	</fig>
</sec>
<sec id="sec005" sec-type="Conclusion">
	<title>5. Conclusions</title>
<p>The aim of this research has been to examine the processing of large-scale graphs and, more specifically, to enhance the performance of algorithms for finding connected components in large graphs. Finding connected components is an essential pre-processing step for extracting knowledge about a graph, and it is a fundamental operation in several graph computations. The MapReduce framework dominates the processing of large-scale data on Hadoop, and it is commonly used for mining big graphs. However, iterative processing is not directly supported in MapReduce. Nonetheless, some recent works show that it is possible to outperform other models for finding connected components using MapReduce. Accordingly, further experiments with GraphX in big data environments are needed.</p>
</sec>
</body>
<back>
<ref-list>
<title>References</title>
<!--[1] S. T. Hwang, A study on big data platform architecture-based conceptual measurement model using comparative analysis for social commerce, Journal of Knowledge Information Technology and Systems(JKITS), Vol. 15, No. 5, pp. 623-630, Oct. 2020.-->
<ref id="B001">
<label>[1]</label>
<element-citation publication-type="journal">
<person-group>
<name><surname>Hwang</surname><given-names>S. T.</given-names></name>
</person-group>
<year>2020</year>
<month>Oct.</month>
<article-title>A study on big data platform architecture-based conceptual measurement model using comparative analysis for social commerce</article-title>
<source>Journal of Knowledge Information Technology and Systems(JKITS)</source>
<volume>15</volume><issue>5</issue>
<fpage>623</fpage><lpage>630</lpage>
<pub-id pub-id-type="doi">10.34163/jkits.2020.15.5.005</pub-id>
</element-citation>
</ref>
<!--[2] J. S. Kim, and J. H. Kim, A study on adaptive smart platform for intelligent software in big data environment, Journal of Knowledge Information Technology and Systems(JKITS), Vol. 15, No. 3, pp. 347-355, Jun. 2020.-->
<ref id="B002">
<label>[2]</label>
<element-citation publication-type="journal">
<person-group>
<name><surname>Kim</surname><given-names>J. S.</given-names></name>
<name><surname>Kim</surname><given-names>J. H.</given-names></name>
</person-group>
<year>2020</year>
<month>Jun.</month>
<article-title>A study on adaptive smart platform for intelligent software in big data environment</article-title>
<source>Journal of Knowledge Information Technology and Systems(JKITS)</source>
<volume>15</volume><issue>3</issue>
<fpage>347</fpage><lpage>355</lpage>
<pub-id pub-id-type="doi">10.34163/jkits.2020.15.3.004</pub-id>
</element-citation>
</ref>
<!--[3] H. K. Chang, Context recognition method for personalization Service in bigdata environment, Journal of Knowledge Information Technology and Systems(JKITS), Vol. 15, No. 5, pp. 631-638, Oct. 2020.-->
<ref id="B003">
<label>[3]</label>
<element-citation publication-type="journal">
<person-group>
<name><surname>Chang</surname><given-names>H. K.</given-names></name>
</person-group>
<year>2020</year>
<month>Oct.</month>
<article-title>Context recognition method for personalization Service in bigdata environment</article-title>
<source>Journal of Knowledge Information Technology and Systems(JKITS)</source>
<volume>15</volume><issue>5</issue>
<fpage>631</fpage><lpage>638</lpage>
<pub-id pub-id-type="doi">10.34163/jkits.2020.15.5.006</pub-id>
</element-citation>
</ref>
<!--[4] T. Akidau, R. Bradshaw, C. Chambers, S. Chernyak, R. Lax, S. McVeety, D. Mills, F. Perry, E. Schmidt, and S. Whittle, The dataflow model: A practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing. Proceeding VLDB Endow, Vol. 8 No. 12, pp. 1792-1803, Aug. 2015.-->
<ref id="B004">
<label>[4]</label>
<element-citation publication-type="journal">
<person-group>
<name><surname>Akidau</surname><given-names>T.</given-names></name>
<name><surname>Bradshaw</surname><given-names>R.</given-names></name>
<name><surname>Chambers</surname><given-names>C.</given-names></name>
<name><surname>Chernyak</surname><given-names>S.</given-names></name>
<name><surname>Lax</surname><given-names>R.</given-names></name>
<name><surname>McVeety</surname><given-names>S.</given-names></name>
<name><surname>Mills</surname><given-names>D.</given-names></name>
<name><surname>Perry</surname><given-names>F.</given-names></name>
<name><surname>Schmidt</surname><given-names>E.</given-names></name>
<name><surname>Whittle</surname><given-names>S.</given-names></name>
</person-group>
<year>2015</year>
<month>Aug.</month>
<article-title>The dataflow model: A practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing</article-title>
<source>Proceedings of the VLDB Endowment</source>
<volume>8</volume><issue>12</issue>
<fpage>1792</fpage><lpage>1803</lpage>
<pub-id pub-id-type="doi">10.14778/2824032.2824076</pub-id>
</element-citation>
</ref>
<!--[5] P. Alvaro, N. Conway, J. M. Hellerstein, and R. Marczak, Consistency analysis in bloom: A calm and collected approach. In Proceedings 5th Biennial Conference on Innovative Data Systems Research, pp. 249-260, 2011.-->
<ref id="B005">
<label>[5]</label>
<element-citation publication-type="paper">
<person-group>
<name><surname>Alvaro</surname><given-names>P.</given-names></name>
<name><surname>Conway</surname><given-names>N.</given-names></name>
<name><surname>Hellerstein</surname><given-names>J. M.</given-names></name>
<name><surname>Marczak</surname><given-names>R.</given-names></name>
</person-group>
<year>2011</year>
<article-title>Consistency analysis in bloom: A calm and collected approach</article-title>
<conf-name>Proceedings 5th Biennial Conference on Innovative Data Systems Research</conf-name>
<fpage>249</fpage><lpage>260</lpage>
</element-citation>
</ref>
<!--[6] B. Schilit, N. Adams, and R. Want, Context-aware computing applications, First Work shop on Mobile Computing Systems and Applications, pp. 85-90, 1994.-->
<ref id="B006">
<label>[6]</label>
<element-citation publication-type="paper">
<person-group>
<name><surname>Schilit</surname><given-names>B.</given-names></name>
<name><surname>Adams</surname><given-names>N.</given-names></name>
<name><surname>Want</surname><given-names>R.</given-names></name>
</person-group>
<year>1994</year>
<article-title>Context-aware computing applications</article-title>
<conf-name>First Workshop on Mobile Computing Systems and Applications</conf-name>
<fpage>85</fpage><lpage>90</lpage>
<pub-id pub-id-type="doi">10.1109/WMCSA.1994.16</pub-id>
</element-citation>
</ref>
<!--[7] D. Salber, A. K. Dey, and G. D. Abowd, The context toolkit: aiding the development of context Aware applications, In the Workshop on Software Engineering for Wearable and Pervasive Computing, Limerick Ireland, Jun. 2000.-->
<ref id="B007">
<label>[7]</label>
<element-citation publication-type="paper">
<person-group>
<name><surname>Salber</surname><given-names>D.</given-names></name>
<name><surname>Dey</surname><given-names>A. K.</given-names></name>
<name><surname>Abowd</surname><given-names>G. D.</given-names></name>
</person-group>
<year>2000</year>
<month>Jun.</month>
<article-title>The context toolkit: aiding the development of context Aware applications</article-title>
<conf-name>the Workshop on Software Engineering for Wearable and Pervasive Computing</conf-name>
<conf-loc>Limerick Ireland</conf-loc>
</element-citation>
</ref>
<!--[8] N. Pandeeswari, and G. Kumar. Anomaly detection system in cloud environment using fuzzy clustering based ANN. In: Mob. Netw. Appl. 21.3, pp. 494-505, 2016.-->
<ref id="B008">
<label>[8]</label>
<element-citation publication-type="journal">
<person-group>
<name><surname>Pandeeswari</surname><given-names>N.</given-names></name>
<name><surname>Kumar</surname><given-names>G.</given-names></name>
</person-group>
<year>2016</year>
<article-title>Anomaly detection system in cloud environment using fuzzy clustering based ANN</article-title>
<source>Mob. Netw. Appl.</source>
<volume>21</volume><issue>3</issue>
<fpage>494</fpage><lpage>505</lpage>
<pub-id pub-id-type="doi">10.1007/s11036-015-0644-x</pub-id>
</element-citation>
</ref>
<!--[9] R. J. Hyndman, E. Wang, and N. Laptev, Large-scale unusual time series detection. In: 2015 IEEE International Conference on Data Mining Workshop (ICDMW). IEEE, pp. 1616-1619, 2015.-->
<ref id="B009">
<label>[9]</label>
<element-citation publication-type="paper">
<person-group>
<name><surname>Hyndman</surname><given-names>R. J.</given-names></name>
<name><surname>Wang</surname><given-names>E.</given-names></name>
<name><surname>Laptev</surname><given-names>N.</given-names></name>
</person-group>
<year>2015</year>
<article-title>Large-scale unusual time series detection</article-title>
<conf-name>2015 IEEE International Conference on Data Mining Workshop (ICDMW)</conf-name>
<publisher-name>IEEE</publisher-name>
<fpage>1616</fpage><lpage>1619</lpage>
<pub-id pub-id-type="doi">10.1109/ICDMW.2015.104</pub-id>
</element-citation>
</ref>
<!--[10] B. Agrawal, A. Chakravorty, C. Rong, and T. W. Wlodarczyk. R2Time: A framework to analyse open TSDB time-series data in HBase. In: Cloud Computing Technology and Science (CloudCom), 2014 IEEE 6th International Conference, pp. 970-975, 2014.-->
<ref id="B010">
<label>[10]</label>
<element-citation publication-type="paper">
<person-group>
<name><surname>Agrawal</surname><given-names>B.</given-names></name>
<name><surname>Chakravorty</surname><given-names>A.</given-names></name>
<name><surname>Rong</surname><given-names>C.</given-names></name>
<name><surname>Wlodarczyk</surname><given-names>T. W.</given-names></name>
</person-group>
<year>2014</year>
<article-title>R2Time: A framework to analyse open TSDB time-series data in HBase</article-title>
<conf-name>Cloud Computing Technology and Science (CloudCom), 2014 IEEE 6th International Conference</conf-name>
<fpage>970</fpage><lpage>975</lpage>
<pub-id pub-id-type="doi">10.1109/CloudCom.2014.84</pub-id>
</element-citation>
</ref>
<!--[11] C. Wang, V. Talwar, K. Schwan, and P. Ranganathan, Online detection of utility cloud anomalies using metric distributions. In: Network Operations and Management Symposium (NOMS), 2010 IEEE, pp. 96-103, 2010.-->
<ref id="B011">
<label>[11]</label>
<element-citation publication-type="paper">
<person-group>
<name><surname>Wang</surname><given-names>C.</given-names></name>
<name><surname>Talwar</surname><given-names>V.</given-names></name>
<name><surname>Schwan</surname><given-names>K.</given-names></name>
<name><surname>Ranganathan</surname><given-names>P.</given-names></name>
</person-group>
<year>2010</year>
<article-title>Online detection of utility cloud anomalies using metric distributions</article-title>
<conf-name>Network Operations and Management Symposium (NOMS), 2010 IEEE</conf-name>
<fpage>96</fpage><lpage>103</lpage>
<pub-id pub-id-type="doi">10.1109/NOMS.2010.5488443</pub-id>
</element-citation>
</ref>
<!--[12] M. Jahrer, A. Toscher, J. Y. Lee, J Deng, H. Zhang, and J. Spoelstra. Ensemble of collaborative filtering and feature engineered models for click through rate prediction. In: KDDCup Workshop. 2012.-->
<ref id="B012">
<label>[12]</label>
<element-citation publication-type="paper">
<person-group>
<name><surname>Jahrer</surname><given-names>M.</given-names></name>
<name><surname>Toscher</surname><given-names>A.</given-names></name>
<name><surname>Lee</surname><given-names>J. Y.</given-names></name>
<name><surname>Deng</surname><given-names>J.</given-names></name>
<name><surname>Zhang</surname><given-names>H.</given-names></name>
<name><surname>Spoelstra</surname><given-names>J.</given-names></name>
</person-group>
<year>2012</year>
<article-title>Ensemble of collaborative filtering and feature engineered models for click through rate prediction</article-title>
<conf-name>KDDCup Workshop</conf-name>
</element-citation>
</ref>
<!--[13] P. Gaikwad, A. Mandal, P. Ruth, G. Juve, D. Krol, and E. Deelman. Anomaly detection for scientific workflow applications on networked clouds. In: High Performance Computing ＆ Simulation (HPCS), 2016 International Conference on. IEEE, pp. 645-652, 2016.-->
<ref id="B013">
<label>[13]</label>
<element-citation publication-type="paper">
<person-group>
<name><surname>Gaikwad</surname><given-names>P.</given-names></name>
<name><surname>Mandal</surname><given-names>A.</given-names></name>
<name><surname>Ruth</surname><given-names>P.</given-names></name>
<name><surname>Juve</surname><given-names>G.</given-names></name>
<name><surname>Krol</surname><given-names>D.</given-names></name>
<name><surname>Deelman</surname><given-names>E.</given-names></name>
</person-group>
<year>2016</year>
<article-title>Anomaly detection for scientific workflow applications on networked clouds</article-title>
<conf-name>2016 International Conference on High Performance Computing &amp; Simulation (HPCS)</conf-name>
<fpage>645</fpage><lpage>652</lpage>
<pub-id pub-id-type="doi">10.1109/HPCSim.2016.7568396</pub-id>
</element-citation>
</ref>
<!--[14] U. Kang, D. H. Chau, and C. Faloutsos, Mining large graphs: Algorithms, inference, and discoveries, in 2011 IEEE 27th International Conference on Data Engineering, pp. 243-254, 2011.-->
<ref id="B014">
<label>[14]</label>
<element-citation publication-type="paper">
<person-group>
<name><surname>Kang</surname><given-names>U.</given-names></name>
<name><surname>Chau</surname><given-names>D. H.</given-names></name>
<name><surname>Faloutsos</surname><given-names>C.</given-names></name>
</person-group>
<year>2011</year>
<article-title>Mining large graphs: Algorithms, inference, and discoveries</article-title>
<conf-name>2011 IEEE 27th International Conference on Data Engineering</conf-name>
<fpage>243</fpage><lpage>254</lpage>
<pub-id pub-id-type="doi">10.1109/ICDE.2011.5767883</pub-id>
</element-citation>
</ref>
<!--[15] T. Rabl, N. Raghunath, M. Poess, M. Bhandarkar, H.-A. Jacobsen, and C. Baru, Eds., Advancing Big Data Benchmarks, Vol. 8585. Cham: Springer International Publishing, 2014.-->
<ref id="B015">
<label>[15]</label>
<element-citation publication-type="book">
<person-group person-group-type="editor">
<name><surname>Rabl</surname><given-names>T.</given-names></name>
<name><surname>Raghunath</surname><given-names>N.</given-names></name>
<name><surname>Poess</surname><given-names>M.</given-names></name>
<name><surname>Bhandarkar</surname><given-names>M.</given-names></name>
<name><surname>Jacobsen</surname><given-names>H.-A.</given-names></name>
<name><surname>Baru</surname><given-names>C.</given-names></name>
</person-group>
<year>2014</year>
<source>Advancing Big Data Benchmarks</source>
<publisher-loc>Cham</publisher-loc>
<publisher-name>Springer International Publishing</publisher-name>
<volume>8585</volume>
</element-citation>
</ref>
</ref-list>
<ack>
<title>Acknowledgments</title>
<p>This research was financially supported by Hansung University.</p>
</ack>
<bio>
	<p><graphic xlink:href="../ingestImageView?artiId=ART002663655&amp;imageName=jkits_2020_15_06_955_f005.jpg"></graphic><bold>Leesang Cho</bold> is an Assistant Professor in the Division of Mechanical and Electronics Engineering at Hansung University, Seoul, Rep. of Korea. He received his Ph.D. degree in Mechanical Engineering from Hanyang University, Rep. of Korea, in 2008. He has served, and currently serves, as a reviewer and Technical Program Committee member for many important journals, conferences, and research projects in the aircraft and drone field. His research interests include artificial intelligence and big data for aircraft and drones.</p>
	<p><italic>E-mail address</italic>: <email>ppome815@hansung.ac.kr</email></p>
	<p><graphic xlink:href="../ingestImageView?artiId=ART002663655&amp;imageName=jkits_2020_15_06_955_f006.jpg"></graphic><bold>Jinhong Kim</bold> is an Assistant Professor in the Department of Information Computer Engineering at Pai Chai University, Daejeon, Korea. He received his Ph.D. degree in Electronic, Electrical and Computer Engineering from Sungkyunkwan University, Korea, in 2006. He has served, and currently serves, as a reviewer and Technical Program Committee member for many important journals, conferences, symposiums, and workshops in the big data area. His research interests include smart vehicular networks, smart platforms, machine learning, artificial intelligence, intelligent software, intelligent agent systems, and big data. He is a life member of the KKITS.</p>
	<p><italic>E-mail address</italic>: <email>jinhkm@pcu.ac.kr</email></p>
</bio>
</back>
</article>
