
Analysis and Mapping Design of Central Java Agro-Industry MSME with K-Means Cluster Method for Export Improvement

M Luqmanul Hakim¹, Taufik Djatna²


¹ ² Post-Graduate Program, Agricultural Industry Technology, Bogor Agricultural Institute, Bogor, Indonesia
¹ luqmanul_hakim@apps.ipb.ac.id

Abstract. Agro-industry products have been Indonesia's leading exports to date. With the emergence of global marketplaces such as Alibaba.com, exports of Indonesian agro-industrial products are expected to increase. Through Alibaba.com or globalbuyersonline.com, exporters can receive tens to hundreds of import requests from all over the world every day. Indonesian exporters therefore have more trouble finding and preparing goods than finding buyers. Central Java has Agro-industry MSME products that are well known nationally, such as Candied Carica from Dieng, Semut Sugar from Purworejo, Ropes from Bumiayu, and Milk and Cheese from Boyolali. However, Central Java's contribution to exports is still insignificant compared with other provinces in Java. In this study, a system for clustering Agro-industry MSME in Central Java is built with the k-means clustering method to evaluate the MSME input data, so that deeper information can be read easily (data mining). The Central Java Agro-industry MSME data are clustered by production capacity because export transactions involve large quantities of goods. Three clusters are formed. The data are then processed with the K-Means Clustering algorithm by calculating the Euclidean distance and assigning each record to the cluster center at the smallest distance. The final clusters are obtained at the 4th iteration, and the best cluster for increasing exports is the first cluster, with a capacity range of 4-18.3.
Keywords: Central Java Agro-industry MSME, Clustering, K-Means Algorithm

1. Introduction
Agro-industrial products make up the largest share of Indonesia's exports. Indonesia's leading export products are dominated by agro-industrial products, namely palm oil, coffee, and tea. With the emergence of global marketplaces such as Alibaba.com, exporting to other countries becomes easier because communication between exporters and importers takes place efficiently and in real time. Through Alibaba.com or globalbuyersonline.com, exporters can receive tens to hundreds of import requests from all over the world every day. Indonesian exporters therefore have more trouble finding and preparing goods than finding buyers.
The Province of Central Java has Agro-industry MSME products that are well known nationally, such as Candied Carica from Dieng, Semut Sugar from Purworejo, Ropes from Bumiayu, and Milk and Cheese from Boyolali. Still, Central Java's contribution to exports is insignificant compared with other provinces in Java. Based on BPS data, West Java and East Java are among the top five provinces by contribution to national exports [1].
Based on the description above, this study builds a system for clustering Agro-industry MSME in Central Java by evaluating the input data with a data mining technique. Data mining is the process of finding relationships in data that are not known to users and presenting them in an understandable way, so that the relationships can serve as a basis for decision-making [2]. The data mining technique used in this research is clustering. With clustering, we can identify dense regions, find overall distribution patterns, and find interesting links between data attributes. In data mining, effort focuses on methods that discover clusters in large databases effectively and efficiently [3].
Much previous research has addressed Agro-industry MSME, but most of it uses the GIS method. In addition, the existing research is not intended to support export information systems [4][5][6]. The K-Means Clustering method itself has been widely used, but it is still rarely applied to clustering MSME [7][8].

2. Theoretical Framework
MSME Definition

In accordance with Law Number 20 of 2008 concerning Micro, Small and Medium Enterprises (MSMEs), a Micro Business is a productive business owned by an individual and/or an individual business entity that meets the criteria for Micro Enterprises as stipulated in the Act.

A Small Business is an independent productive economic enterprise, run by an individual or business entity that is not a subsidiary or branch of a company owned by, controlled by, or directly or indirectly part of a medium or large business, and that meets the Small Business criteria as referred to in the Act.

A Medium Business is an independent productive economic enterprise, run by an individual or business entity that is not a subsidiary or branch of a company owned by, controlled by, or directly or indirectly part of a small or large business, with net wealth or annual sales as stipulated in the Act.

The criteria for MSME according to the law (in rupiah) are:
1. Micro Enterprises: maximum assets of 50 million, with a maximum turnover of 300 million
2. Small Business: total assets of more than 50 million up to 500 million, with a turnover of more than 300 million up to 2.5 billion
3. Medium Business: total assets of more than 500 million up to 10 billion, with a turnover of more than 2.5 billion up to 50 billion [9]

Export Definition
According to Minister of Trade Regulation No. 13/20 concerning "General Provisions in the Export Sector", export is the activity of taking goods out of the customs area. The Customs Area is the territory of the Republic of Indonesia, which covers the land, waters, and air space above them, as well as certain places in the exclusive economic zone and continental shelf in which the Customs Law applies [10].
Exporters are people or business entities that carry out export activities. According to a 2009 survey by the Association of the German Trade Fair Industry (AUMA), exporters reach buyers effectively through Trade Shows (81%), Personal Sales (74%), Direct Mailing (53%), Advertising in Trade Journals (49%), Events (43%), Public Relations (41%), and Internet Sales (36%) [11].
Clustering
Clustering is the process of grouping objects or data into classes that have similar characteristics [12]. Clustering is part of the data mining analysis method. Data mining is an activity that includes collecting and using historical data to find order, patterns, or relationships in large data sets [13]. The classes in clustering are called clusters. The purpose of data clustering is to minimize the objective function set in the clustering process, which generally attempts to minimize variation within a cluster and maximize variation between clusters [14].
In general, there are two clustering approaches: the partitioning approach and the hierarchical approach. Clustering with a partitioning approach divides one large group of data into several smaller groups. An example of a clustering method with the partitioning approach is K-Means Clustering. Clustering with a hierarchical approach, often referred to as Hierarchical Clustering, groups data by combining individual records in the data into clusters. An example of a clustering method with the hierarchical approach is Agglomerative Hierarchical Clustering [15].

K-Means Algorithm
The K-Means algorithm groups objects that have the same characteristics into one cluster, so that similarity within a cluster is high while similarity between clusters is low [16]. The steps to cluster using the K-Means method are as follows [17].
1. Determine the number of clusters k.
2. Initialize the k cluster centers randomly.
3. Allocate all of the data to the nearest cluster. The proximity of a data point to a cluster is determined by the distance between the data point and the center of the cluster. The closest distance determines which cluster a data point belongs to.
4. Recalculate the cluster centers based on the new membership. The new cluster center is the average of the data on each attribute within one cluster.
5. Recalculate the distance of the data to the new cluster centers. If no cluster member moves from one cluster to another, the clustering process is complete. If any cluster member moves from one cluster to another, return to step 3 and repeat until no cluster member moves.
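
The five steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function names (`euclidean`, `nearest`, `k_means`), the fixed random seed, the iteration cap, and the sample values are all hypothetical.

```python
import random

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def nearest(point, centers):
    """Index of the center closest to point (ties go to the lowest index)."""
    return min(range(len(centers)), key=lambda j: euclidean(point, centers[j]))

def k_means(data, k, seed=0, max_iter=100):
    """Cluster data (a list of equal-length tuples) into k groups."""
    rng = random.Random(seed)
    # Steps 1-2: k is given; pick k distinct records as the initial centers.
    centers = [tuple(p) for p in rng.sample(data, k)]
    labels = None
    for _ in range(max_iter):
        # Step 3: allocate every record to its nearest center.
        new_labels = [nearest(p, centers) for p in data]
        # Step 5: stop when no record changed cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: recompute each center as the per-attribute mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Hypothetical one-attribute records, not the study's data.
sample = [(1.0,), (1.2,), (0.8,), (5.0,), (5.2,), (9.0,), (9.1,)]
labels, centers = k_means(sample, k=3)
print(labels, centers)
```

Because the initial centers are random, different seeds can give different final clusters; the sketch only guarantees that, once it stops, every record sits in the cluster whose center is nearest.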

3. Research Method
Research Data
The data used were assumed data on Central Java Agro-industry MSME.

Data Processing
Data processing was carried out in three stages: data selection, data cleaning, and data transformation.

Grouping Using K-Means Algorithms


Grouping with K-Means started by initializing the number of clusters k. Then, the k cluster centers were initialized, either randomly or by partitioning. The next step was to calculate the distance of each data point to the initial cluster centers using Euclidean Distance. Each data point was then grouped based on its closest distance to a cluster center. The cluster center values were recalculated based on the average values in each cluster. Clustering was completed when no cluster member moved from one cluster to another. Otherwise, the process returned to the distance calculation stage (stage 3).

Cluster Result Analysis


The analysis was carried out by describing the characteristics of the groups formed by the K-Means grouping.

4. Results and Discussions


Input data:
The data entered from the general data of Agro-industry MSME in Central Java are: Name of MSME, Type of Product, Address, Production Capacity, Turnover, and Contact Person. The data are assumptions and estimates based on observations of the condition of regional MSME in Central Java province. The data are assumed to use a scale from 1 to 20 to show the size of the MSME. The form of the data is shown in Table 1:

No | MSME Product | Address / Contact Person | Capacity | Turnover
1. | Catfish Chips | Doro, Pekalongan / 0858 xxx xxx xx | 2.5 | 5
2. | Apem Kesesi | Kesesi, Pekalongan / 0813 xxx xxx xx | 3 | 4
3. | Kulit Batang | Batang / 0813 xxx xxx xx | 3.5 | 7
4. | Tambang Bumiayu | Bumiayu / 0818 xxx xxx xx | 5 | 9
5. | Batik Rizieq | Medono, Pekalongan City / 0815 xxx xxx xx | 13.3 | 7.5
6. | Ogel-ogel | Pemalang / 0819 xxxx xx xx | 4.5 | 5.4
... | ... | ... | ... | ...
50. | Wood | Kendal / 0813 xxx xxx xx | 9 | 8

Table 1. Central Java Agro-industry MSME database

Data selection:
Records are deleted if they lack a contact person, production capacity, or business location. Production capacity matters most because it is the clustering variable.
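
The selection rule can be sketched as a simple filter. This sketch assumes that a record missing any one of the three fields is dropped; the field names and the example records are hypothetical and only mirror the columns of Table 1.

```python
# Hypothetical records; field names are assumptions based on the Table 1 columns.
records = [
    {"name": "Catfish Chips", "address": "Doro, Pekalongan",
     "contact": "0858 xxx xxx xx", "capacity": 2.5},
    {"name": "Example A", "address": "",
     "contact": "0813 xxx xxx xx", "capacity": 3.0},
    {"name": "Example B", "address": "Kendal",
     "contact": "0813 xxx xxx xx", "capacity": None},
]

REQUIRED = ("contact", "capacity", "address")

def is_complete(record):
    """True when every required field is present and non-empty."""
    return all(record.get(field) not in (None, "") for field in REQUIRED)

# Keep only records that carry all three required fields.
selected = [r for r in records if is_complete(r)]
print([r["name"] for r in selected])
```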

Data transformation:
The Agro-industry MSME data that have been selected and cleaned are then aggregated by type based on the criteria needed for the analysis. Data transformation can be done with aggregation and normalization. Aggregation is applied when cumulative calculations are performed on the data. Normalization changes attribute values into a smaller, predetermined value range. One normalization method is Z-Score Normalization, shown in Equation 1.

$$ d' = \frac{d - \mathrm{mean}(A)}{\mathrm{std}(A)} \qquad (1) $$

Description: A is the attribute to be normalized; d' is the new value resulting from normalization; d is the initial value of the attribute to be normalized; mean(A) is the average value of the attribute to be normalized; std(A) is the standard deviation of the attribute to be normalized.
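
Equation 1 can be illustrated with a short Python sketch. The function name `z_score` is hypothetical, and the population standard deviation is an assumption, since the text does not say which form of std(A) is used. The input is the capacity column of the six rows listed in Table 1.

```python
from statistics import mean, pstdev

def z_score(values):
    """Return (d - mean(A)) / std(A) for every d in values."""
    mu = mean(values)
    sigma = pstdev(values)  # population std; an assumption, the source does not say which
    if sigma == 0:
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [(d - mu) / sigma for d in values]

# Capacity values of rows 1-6 in Table 1.
capacity = [2.5, 3.0, 3.5, 5.0, 13.3, 4.5]
normalized = z_score(capacity)
print([round(z, 3) for z in normalized])
```

After this transformation the values have mean 0 and standard deviation 1, so attributes measured on different scales contribute comparably to the distance calculation.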

Grouping Using K-Means Algorithms


1. Initialize the number of clusters: in this study, the number of clusters examined in the manual calculation is 3.
2. Cluster center initialization: in this study, the initial cluster centers can be obtained in two ways, namely taking objects at random or partitioning objects into initial groups. In the manual calculation example, the researchers take random objects as the initial cluster centers. The randomly chosen initial centers are shown in Table 2.

Source record | Cluster | Center value
23rd data as the center of Cluster 1 | C1 | 1
47th data as the center of Cluster 2 | C2 | 0.21
38th data as the center of Cluster 3 | C3 | 0
Table 2. Randomly selected initial cluster centers

3. Calculate the distance of each data point to the initial cluster centers using Euclidean Distance: the distance of a data point to each cluster center is calculated with the Euclidean Distance formula shown in Equation 2.

$$ d(i,j) = \sqrt{\sum_{l=1}^{p} \left( x_{il} - x_{jl} \right)^2} \qquad (2) $$

Description: d(i,j) is the distance from data i to cluster center j; p is the number of attributes; l is 1, 2, 3, ..., p; x_il is data i on attribute l; x_jl is cluster center j on attribute l.
4. Group each data based on the closest distance to the center of the cluster: in the first
iteration, the data is grouped into each cluster based on the closest distance value
between the data object and the cluster center.
5. Recalculate cluster center values based on average values in one cluster: after each object has been assigned to a cluster in the first iteration, the next step is to recalculate the cluster centers. The formula for calculating the new cluster center is shown in Equation 3.

$$ v_{jl} = \frac{1}{N_j} \sum_{i=1}^{N_j} x_{il} \qquad (3) $$

Description: v_jl is the new center of cluster j on attribute l; N_j is the number of data that are members of cluster j; i is 1, 2, 3, ..., N_j; p is the number of attributes; l is 1, 2, 3, ..., p; x_il is data i (a member of cluster j) on attribute l.
The new cluster centers are used as the cluster centers for the next iteration. The cluster centers change whenever the membership of the clusters changes.
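
As a worked illustration of the distance and center formulas (Equations 2 and 3) with a single attribute (p = 1), take the initial centers of Table 2 (C1 = 1, C2 = 0.21, C3 = 0) and a hypothetical normalized capacity x = 0.10. This value is an assumption for illustration and is not taken from the study's data. With one attribute, the Euclidean distance reduces to an absolute difference:

$$
\begin{aligned}
d(x, C_1) &= \sqrt{(0.10 - 1)^2} = 0.90 \\
d(x, C_2) &= \sqrt{(0.10 - 0.21)^2} = 0.11 \\
d(x, C_3) &= \sqrt{(0.10 - 0)^2} = 0.10
\end{aligned}
$$

The smallest distance is to C3, so this record would be assigned to cluster 3. If cluster 3 then held the hypothetical members 0.00, 0.05, and 0.10, its new center would be

$$ v_3 = \frac{0.00 + 0.05 + 0.10}{3} = 0.05 $$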

Clustering is completed when no cluster member moves from one cluster to another: the second iteration begins with the new cluster centers. If any object moves, the process returns to stage 3 using the new cluster centers. With the number of clusters set to 3, grouping stops at the fourth iteration, in which no object changes cluster. The results of the fourth iteration can be seen in Table 3.

Cluster | MSME Names
C1 | Kulit Batang, Rizieq Batik, Ogel Ogel, Jepara Furniture, Indrakila Cheese, Carica, Dieng Vegetables, Guci Vegetables, Tawangmangu Vegetables, Semarang Temulawak, Ants Sugar, Paper Prima, Sindoro Coffee, Jolotigo Tea, Batang Tea, Boyolali Milk, Bandeng Presto, Ubi Presto, Briquettes, Charcoal, Choir, Kecap Primasari, Brebes Salted Eggs, Solo Batik, Batik, Pati Rice, Pekalongan Fish, Temanggung Goats, Rengginang Cilacap, Jenang Purbalingga, Gudeg Jogja, Bakpia Patok, Pekalongan Durian, Semarang Durian, Magelang Dragon Fruit, Salak Pondoh, Magelang Chili, Brebes Onion, Semarang Shrimp, Pellet Grejen, Tawang Briquette, Wood
C2 | Catfish Chips, Corn Chips
C3 | Apem Kesesi, Tambang Bumiayu, Lapis Legit, Opak Magelang, Red Ginger, Petung Coffee

Table 3. Clustering result using K-Means Clustering at the 4th iteration

The final cluster centers formed are shown in Table 4.

Cluster | Center value
C1 | 0.207741
C2 | 0.018182
C3 | 0.040909
Table 4. Cluster centers at the 4th iteration

The Analysis of Cluster Result


[Chart 1: bar chart titled "The Number of MSME Members in Each Cluster"; x-axis: Cluster (C1, C2, C3)]
Chart 1. Graph of the number of members in each cluster

Based on Chart 1, the membership of the clusters differs sharply. Cluster 1 (C1) has 42 MSME members, cluster 2 (C2) has the fewest with 2 MSME, and cluster 3 (C3) has 6 MSME members.
[Chart 2: chart titled "Distribution Chart of Capacity in Each Cluster"; x-axis: Cluster (C1, C2, C3)]
Chart 2. Capacity distribution chart for each cluster

Chart 2 shows that the range of member capacities also differs between clusters. In cluster 1 (C1), the range is 14.3, with values between 4 and 18.3. In cluster 2 (C2), the range is 0 because both members have a value of 2.5. In cluster 3 (C3), the range is 0.5, with values between 3 and 3.5.
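
The per-cluster range is simply the maximum capacity minus the minimum. A minimal sketch follows, assuming hypothetical member lists in which only the end points follow the values reported above; the interior values and the helper name `summarize` are illustrative.

```python
# Hypothetical member capacities; only the end points follow the values reported in the text.
clusters = {
    "C1": [4.0, 9.0, 13.3, 18.3],
    "C2": [2.5, 2.5],
    "C3": [3.0, 3.5],
}

def summarize(values):
    """Return (count, minimum, maximum, range) for one cluster."""
    low, high = min(values), max(values)
    return len(values), low, high, high - low

for name, values in clusters.items():
    count, low, high, spread = summarize(values)
    print(f"{name}: n={count}, min={low}, max={high}, range={spread:.1f}")
```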
Ordered from largest to smallest capacity, the clusters are C1 (largest), C3 (middle), and C2 (smallest). Thus, the MSME with the most potential to be targeted for export activities are those in the first cluster, because they have a very large production capacity. Strategies by the various parties involved to increase exports can start with the MSME in the first cluster (C1). The strategies can be:
1. Require the MSME to create websites so that importers can easily access further information
2. Connect the MSME with exporters to build stronger networks and adjust to exporters' standards
3. Provide insight and training on exports
4. Provide support and assistance to attend International Trade Expos in various countries
5. Invite the MSME to business matchmaking events with importers

5. Conclusions and Suggestions

The Central Java Agro-industry MSME data are clustered by production capacity because export transactions involve large quantities of goods. On the assumed capacity data, the final cluster result is reached at the 4th iteration, and the cluster with the most potential is the first cluster (C1), which has a capacity scale range of 4-18.3.
Further research would benefit from using more clusters. Research is also needed on clustering Central Java Agro-industry MSME with real data, with the results displayed on a website by cluster so that they are more easily accessed by exporters. Research with this method is very helpful when dealing with very large amounts of data (Big Data).

References
[1] BPS, https://databoks.katadata.co.id/datapublish/2017/03/20/inilah-10-provinsi-penopang-ekspor-indonesia, March 2017.
[2] McLeod, Jr., R. and G. P. Schell. Management Information System. 10th ed. Pearson Education, Inc. Translated by Ali Akbar Yulianto and Afia R. Fitriati as Sistem Informasi Manajemen, Edisi 10, edited by Nina Setyaningsih. Jakarta: Salemba Empat, 2008.
[3] Defiyanti, Sofi. Integrasi Metode Clustering dan Klasifikasi untuk Data Numerik.
CITEE. ISSN: 2085-6350, July 2017.
[4] Oktavia, Harma. Perancangan Aplikasi Pemetaan Lokasi Usaha Kecil Menengah (UMKM) Di Kota Lubuklinggau Berbasis Geographic Information System (GIS) Dan Location Based Service (LBS). Jatisi, Vol. 3 No. 2, March 2017.
[5] Sri W, Noveandini R, Sutarno. Digitalisasi Pemetaan UMKM Tenun Garut Berbasis Sistem Informasi Geografis Sebagai Media Komunikasi dan Pemasaran Produk Lokal. Prosiding Seminar Nasional Multi Disiplin Ilmu & Call For Paper UNISBANK (SENDI_U). ISBN: 978-979-3649-81-8.
[6] Aji S, Basukianto, Jeffry A. Klasterisasi UMKM dan Potensi Wilayah Berbasis Peta Sebagai Strategi Pengembangan Ekonomi Daerah. Jurnal Pekommas, Vol. 2 No. 2, October 2017: 143-150, 2017.
[7][17] Ong, J. O. Implementasi Algoritma K-Means Clustering Untuk Menentukan Strategi Marketing President University. Jurnal Ilmiah Teknik Industri, pp. 10-20, June 2013.
[8] Evy D, Dyah H, Eto W. Implementasi K-Means Clustering Untuk Pemetaan Desa
dan Kelurahan di Kabupaten Bangkalan Berdasarkan Contraceptive Prevalence Rate
dan Tingkat Pendidikan. Seminar Nasional Matematika dan Aplikasinya, Universitas
Airlangga Surabaya, 21st October 2017.
[9] Kemendag, Undang-Undang Nomor 20 Tahun 2008 tentang Usaha Mikro, Kecil dan Menengah (UMKM).
[10] Peraturan Kementrian Perdagangan No. 13/20 mengenai "Ketentuan Umum Dibidang ekspor".
[11] Marpaung S, 2016. http://ipbtraining.com/wp-content/uploads/2016/03/Finding-Intl-Buyers-Export-Promotion-Strategy-Delivered.pdf
[12][16] Han, J., & Kamber, M. Data Mining. San Francisco: Morgan Kaufmann Publisher, 2006.
[13] Santosa, B. Data Mining: Teknik Pemanfaatan Data untuk Keperluan Bisnis. Yogyakarta: Graha Ilmu, 2007.
[14] Agusta, Y. K-means - Penerapan, Permasalahan dan Metode Terkait. Jurnal Sistem dan Informatika Vol. 3: 47-60, February 2007.
[15] A. Yusuf and H. Tjandrasa, “Prediksi Nilai Dengan Metode Spectral Clustering
Dan Clusterwise Regression,” vol. VIII, no. 1, pp. 39–45, 2013.
