Noriyuki Fujimoto
tsutsui@hannan-u.ac.jp
fujimoto@mi.s.osakafu-u.ac.jp
ABSTRACT
! "
"
#
$
%&'('
!
)*+,
-
&./
0
1'2
0 $
.
"
3
4* "
'
5 67,
Categories and Subject Descriptors
' * + 89
'
:; <
" =
/
"
<
>?
/ @ ( 4 3 8
:; =
>(
"
General Terms
Keywords
" (
"
1. INTRODUCTION
"
"
$
"
'
$"
=(
D
; !"
$
<
$ E
$ <
"
F
"
3 4
0
$"
"
3 4* "
'
= 5 67,
"
$
'
"
$
<
*
"
<
3
'
<
G" C
$
!
$" <
,
2.
0
%&'('
46*
0
*
0
8*3:
&./
$
7GH2 8*3:
$
0
$
+ H2
8*3:
2.2 Applications of GPU Computation to Evolutionary Computation
84+" 43" **" 45:"
[Figure: GPU architecture. Several multiprocessors, each containing processors (proc.) and 16KB of shared memory (shared mem.), connected to device memory (VRAM).]
! 4; =(
$
84,: . *BB+
F
$
8*7:
D
K 2
*BB+
$
833:
!0 *BB5 $
4 *, ,
8,:
"
I
1
(
) 3( D
" / 1 K D
"
*BB7
$
?
83G: ? C
$
=
$
$
" =$
*BB+ (
("
$
8G:
'
"
" O
E
E
F
$
C
/
"
/ C
<'/(
!"
F
$ $
/ 47H2
!
)*+," C
[Figure: CUDA program execution. On the CPU, int main() generates and executes a grid on the GPU. The grid consists of block 0 to block m-1, each containing thread 0 to thread n-1. Blocks are allocated to multiprocessor 1 to multiprocessor p of the GPU, which share VRAM.]
location 1
5
$
$
$
location 2
4
0
'
$"
$
$
$
! 3 C
G
C
O C
'
" $ $
9 $ $
$
9"
PQ*" 4" G" 3R
$ 4
*"
$ *
4"
$ 3
G"
$
[Figure: Example QAP instance. Facilities 1 to 4 are assigned to locations 1 to 4; the figure shows the distances between locations and the flow matrix f_ij between facilities.]
3.
$$\mathrm{cost}(\phi) = \sum_{i=1}^{n} \sum_{j=1}^{n} f_{ij} \, d_{\phi(i)\phi(j)}$$
1524
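The cost function above, the sum over all facility pairs of flow multiplied by the distance between their assigned locations, can be sketched in plain C as follows. This is a minimal host-side illustration and not the GPU kernel used in this study. The function name `qap_cost`, the row-major matrix layout, and the 0-based permutation array `phi` are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the paper): evaluates
 *   cost(phi) = sum_i sum_j f[i][j] * d[phi[i]][phi[j]]
 * f and d are n-by-n matrices stored in row-major order.
 * phi[i] is the 0-based location assigned to facility i. */
long qap_cost(size_t n, const int *f, const int *d, const unsigned char *phi)
{
    long cost = 0;
    for (size_t i = 0; i < n; i++) {
        assert(phi[i] < n);              /* each entry must be a valid location */
        for (size_t j = 0; j < n; j++) {
            /* flow between facilities i and j times the distance
             * between the locations they are assigned to */
            cost += (long)f[i * n + j] * d[(size_t)phi[i] * n + phi[j]];
        }
    }
    return cost;
}
```

Storing the permutation as `unsigned char` is sufficient for any instance with at most 256 locations and keeps each individual small.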
G
3"
$ !
"
# 4 4,*G
8*5:
<
4
0
1'2
0 $ 84:
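[Figure: GA model. The population consists of individuals I1, I2, ..., Ii, ..., IN. Pairwise selection keeps the better individual, using W, to form the new individuals I'1, I'2, ..., I'N.]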
D
=
'
"
$
$
=I 8*6" *4" 34:
#( 8*,: ?"
$
$
! G
1
D
0
"
0
0
$
E
$
" ;
< 4 <
< * #
< 3 !
"
$
$
"
< G !
" $
$
< , #
< 7 !
"
'
"
< 5 '
< + '
"
I" < 3
$"
E
?"
$
C
$ 0
$ / 846: !
"
$
$
$
0
#%' I. $ D$ 83*:
!
"
$
0
" "
I) 8*G:
$
/) 8+: $
=
/) 0
I)
"
C
<
G"
/) !
"
$
C
<
$
$
$
$ 8*6" *4" 34:
<
*I
?" $
F
$"
$
! C
"
83B: $
0
$
F
"
E
"
$
$
$
4
4
8 individuals
4 4
$
"
0
O
$
* /C
/
=(
$ &./
0 847: $ E
$" C
$
3 =
[Figure annotation: each individual is an array of L elements of type unsigned char; L is a multiple of 4 and at most 56.]
! ,; =
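The figure annotation describes each individual as an array of L elements of type unsigned char, where L is a multiple of 4 and at most 56. A minimal sketch of such a layout is given below. It assumes that L is obtained by rounding the problem size n up to the next multiple of 4 and that unused trailing elements are padding; the source states neither point explicitly. The names `MAX_L`, `padded_length`, `individual_t`, and `init_identity` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_L 56  /* upper bound on L stated in the figure annotation */

/* Assumption: L is the smallest multiple of 4 that is >= n.
 * Returns 0 when n is 0 or when the result would exceed MAX_L. */
size_t padded_length(size_t n)
{
    size_t L = (n + 3) / 4 * 4;
    return (L <= MAX_L) ? L : 0;
}

/* One individual: a permutation held in an unsigned char array. */
typedef struct {
    unsigned char gene[MAX_L];
} individual_t;

/* Fills the first n genes with the identity permutation 0..n-1
 * and sets the padding elements (if any) to 0. */
void init_identity(individual_t *ind, size_t n)
{
    size_t L = padded_length(n);
    assert(n > 0 && L != 0);  /* n must fit within MAX_L elements */
    for (size_t i = 0; i < L; i++)
        ind->gene[i] = (i < n) ? (unsigned char)i : 0;
}
```

Under this assumption, an instance of size 25 would occupy 28 elements per individual, and an instance of size 40 would occupy exactly 40.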
4. EXPERIMENTS
4.1 Experimental Conditions
9
C C
4,
*B" 3B
*,"
4*B
03B" 03B" 3B" 3,"
37"
$
3 * *
<
4.2 Results
5. CONCLUSIONS
"
$
'
"
0 $"
"
3
4* "
'
= 5 67,
"
$
0;
<
" *BB+
83: # =
A #F
H
" *BBB
8G: =$
" 1 " 1
" /$"
' 1
$ .
'
4B
$
" *BB+
8,: H !0" D
"
/ D
#
$
'### '
<$" ***" *BB5
=
QAP       GPU computation                    GA-1                               GA-2                               Speedup ratio to CPU computation
instances Total      #OPT  T avg    std      Total      #OPT  T avg    std      Total      #OPT  T avg    std      GA-1   GA-2
          population       (sec)             population       (sec)             population       (sec)
tai20b    128×30×1   10    0.064    0.005    128×30×1   10    0.428    0.039    128×30×1   10    0.422    0.042    6.7    6.6
tai25b    128×30×1   10    0.169    0.015    128×30×1   10    1.386    0.135    128×30×1   10    1.286    0.145    8.2    7.6
kra30a    128×30×5   10    2.002    1.741    128×30×2         9.651    4.541    128×30×4         11.870   3.115    4.8    5.9
kra30b    128×30×5         1.332    0.732    128×30×5         23.399   11.492   128×30×4         16.745   11.164   17.6   12.6
tai30b    128×30×3   10    0.947    0.576    128×30×3   10    22.649   6.830    128×30×1   10    7.203    6.274    23.9   7.6
tai35b    128×30×4   10    2.510    0.740    128×30×3   10    22.649   6.830    128×30×1   10    7.203    6.274    9.0    2.9
ste36b    128×30×4   10    3.337    1.056    128×30×4   10    33.274   13.062   128×30×2   10    14.675   3.836    10.0   4.4
tai40b    128×30×1   10    1.088    0.087    128×30×1         6.016    0.486    128×30×1   10    5.811    0.482    5.5    5.3

#OPT : the number of runs in which the algorithm succeeded in finding the optimal solution
T avg : the average time to find optimal solutions in successful runs, in seconds
std : standard deviation of T avg
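The speedup columns of the table are consistent with dividing the CPU T avg by the GPU T avg for the same instance. For tai20b with GA-1, for example, 0.428 / 0.064 is approximately 6.7. The short sketch below reproduces this arithmetic; the function name `speedup_ratio` is hypothetical, and reading the speedup as this ratio is an inference from the tabulated values rather than a definition given in the surviving text.

```c
#include <assert.h>

/* Hypothetical helper: speedup of the GPU run relative to a CPU run,
 * taken as the ratio of average times to reach the optimum (seconds). */
double speedup_ratio(double t_cpu_sec, double t_gpu_sec)
{
    assert(t_gpu_sec > 0.0);  /* guard against division by zero */
    return t_cpu_sec / t_gpu_sec;
}
```

Applied to the tai30b row, `speedup_ratio(22.649, 0.947)` gives a value that rounds to the tabulated 23.9.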
'
**
<
" *BB+
84*: H
/ 2
0
#
" *," 46,5
843: < 1
%$
<
'
'### '
K (
<$ '(<" " *BB6
84G: D 1
D 2
'
44 #
=
<
" *BB+
84,: D 1
?
< =
!
!
"
/
" 4*4*" *BB+
847: # 1
" T %
0" < I
"
T /
$ % ;
9
'### /
" *+*" *BB+
845: U 1" # V
"
) <
0
'
'### '
K
(
<$ '(<
" *BB6
84+: /
H 2
9
;
C
'
'
/
= <
'
$" *BB+
846: < /
'
<C
'
=
/
H
" 466,
8*B: < /
0 =
F
$$ '
'### '
=
<
=
'=<= *BB5" *BB5
8*4: & /
=
$
'###
H
( #
"
44," 4666
8**: < /0
$
'
+ '
/
?
=
=
<
&#=." 1%=< ,337" *BB+
8*3: %&'('" *BB6
;JJ
JJ
8*G: ' I" ( <"
T ?
$
'
*
'
=
1
#
'
" 46+5
8*,: / 0
" < "
. H (
$
"
"
'
8*7:
8*5:
8*+:
8*6:
83B:
M CC 37B '
'### =
#
$
=
*BB+ =#=MB+" *BB+
83G: / D
D
$
'
'### =
#
$
=
*BB7 =#=MB7" *BB7