Documente Academic
Documente Profesional
Documente Cultură
M. Gastineau
IMCCE - Observatoire de Paris - CNRS UMR8028
77, avenue Denfert Rochereau
75014 PARIS
FRANCE
gastineau@imcce.fr
Full
• All terms are computed
Naive method
• efficient for low degree or for sparse polynomials
Karatsuba’s algorithm
• efficient for intermediate degree and dense polynomials
• reduce the number of multiplications
FFT method
• efficient only for large degree
Burst tries
• trie node = dense container
• leaf node = sparse container
8 3
5
7
11
9
13
1 0 0 1 1 1 0 8
1 3 4 5 1 2 9
2 3 6 7
dc
!max
with
dcmin = damin + dbmin
dcmax = damax + dbmax
!
BHδ (c) = BHi (a) × BHj (b)
i+j=δ
build the tables of exponents for homogeneous blocks in 10 variables up to the degree 20
0.4 20
0.35
0.3
19
0.25
time (s)
0.2
18
0.15
17
0.1
16
0.05 15
14
13
12
0
0 2 4 6 8 10 12
number of terms (in Millions)
Advanced School on Specific Algebraic Manipulators, 2007, © M. Gastineau, ASD/IMCCE/CNRS
Execution time to build the addressing tables
initialization
after initialization
10
20+20
20+19
6
time (s)
20+18
20+17
4
20+16
20+15
2 20+14
20+13
20+12
20+11
20+10
0
0 20 40 60 80 100 120
muliplication operations (in Millions)
Advanced School on Specific Algebraic Manipulators, 2007, © M. Gastineau, ASD/IMCCE/CNRS
Overhead to load the addressing tables from disk
30
25
execution time overhead (%)
20
15
10
0
20 22 24 26 28 30 32 34 36 38 40
total degree of the addressing table
BH(X,Y,Z)
BH0 BH1 BH2 BH3 BH4
3 5 0 0 0
11 9 0 0
0 0 0 0
0 0 0
0 0 0
0 13
0 0
0 0
0 0
0 0
0 8
0
0
0
0
9
BHC(X,Y,Z)
BHC0 BHC1 BHC2 BHC3 BHC4
3 5 1 9 2 13 6 8 10
11 2 9 15
BH!(a) BH!(a)
BH!(b)
BH!(b)
normal flow flow with cache blocking
Factor of the reduction of the execution time for the product of two
homogeneous blocks in 8 variables of degree 7
G5 processor Core2 duo processor
1000 1.1 1000 1.1
900 900
1.05 1.05
800 800
1 1
700 700
0.95 0.95
600 600
400 400
0.85 0.85
300 300
0.8 0.8
200 200
0.75 0.75
100 100
0 0.7 0 0.7
0 100 200 300 400 500 600 700 800 900 1000 0 100 200 300 400 500 600 700 800 900 1000
900 900
1.05 1.05
800 800
1 1
700 700
0.95 0.95
600 600
400 400
0.85 0.85
300 300
0.8 0.8
200 200
0.75 0.75
100 100
0 0.7 0 0.7
0 100 200 300 400 500 600 700 800 900 1000 0 100 200 300 400 500 600 700 800 900 1000
s × (s + 1) with s = (1 + x + y + z + t + u)14
0.6
0.5
time (s)
0.4
0.3
0.2
0.1
0
0 500 1000 1500 2000 2500 3000
number of terms
Advanced School on Specific Algebraic Manipulators, 2007, © M. Gastineau, ASD/IMCCE/CNRS
full product VxV with different representations
The serie V has 3052 terms and the result has 227453 terms
! d2 d4 d6 d8 !
V (λ, λ , X, X, Y, Y , X , X ! , Y , Y ! ) =
! ! !
X d1 X Y d3 Y X !d5 X ! Y !d7 Y ! eı(k1 λ+k2 λ )
d1 + d2 + d3 + d4 + d5 + d6 + d7 + d8 = 7
200
time (s)
150
100
50
0
4 5 6 7 8 9 10 11 12
degree
C(X) = a0 b0 + (a0 b1 + a1 b0 )X 1 + a1 b1 X 2
a0 b1 + a1 b0 = (a0 + a1 )(b0 + b1 ) − a0 b0 − a1 b1
the multiplication of the coefficients, the recursive call could be stopped before. A
generalization of the karatsuba algorithm to any degree is available in [33].
Karatsuba’s multiplication could be computed at most in 9nlog2 3 or O(n1.59 ) opera-
tions [12].
Algorithm 11: Compute the full product of two polynomials A and B using the
Karatsuba’s multiplication algorithm
Input: A : polynomial of degree at most n − 1 with n = 2k for k ∈ N
Input: B : polynomial of degree at most n − 1
Output: C : polynomial
if n = 1 then return C ← AB
C1 ← A(0) B (0) by a recursive call
C2 ← A(1) B (1) by a recursive call
C3 ← A(0) + A(1)
C4 ← B (0) + B (1)
C5 ← C3 C4 by a recursive call
C6 ← C5 − C1 − C2
C ← C1 + C6 X n/2 + C2 X n
return C
Univariate polynomials
!
(a0 + a1 X + ... + an X + O(X )) (b0 + b1 X + ... + bn X n + O(X n ))
n n
=
(c0 + c1 X + ... + cn X n + O(X n ))
Multivariate polynomials
! d!1 d!2 d!n
• keep the term ai X1d1 X2d2 ...Xndn bj X1 X2 ...Xn
Let ! !
A= BHδ (a) , B= BHδ (b)
600
recursive vector
recursive list
flat vector
BH
500
BHC
BHDAL
BHDALC
400
300
200
100
0
6 8 10 12 14 16 18 20
degree
p
Let a variable !, and a small parameter !!0 such that !!0 = !0 with p ∈ N.
Each coefficient ai of the serie A(x) could be written as
log |ai | ai
ai = a!i xi !k with k = " # and a!
=
log !!0 i
!!0 k
! !
A (!, x) =
!
( a!j xj )!k
k j
A(x) = A! (!!0 , x)
The truncated product A(x) ! ⊗ B(x) on the amplitude of the coefficient is transformed to a
truncated product A! (!, x) B ! (!, x) on the variable !.
The degree of truncation is p.
source code
s 1 =(1+0.05∗ x ) ˆ 3 ;
s 2 =(1+0.04∗ x ) ˆ 4 ;
/∗ i n t r o d u c e t h e v a r i a b l e e p s i n s 1 and s 2 ∗/
s 1 e=s e r e p s ( s1 , eps , 0 . 1 ) ;
s 2 e=s e r e p s ( s2 , eps , 0 . 1 ) ;
/∗ d e f i n e t h e t r u n c a t u r e on a m pl i t u d e t o ( 0 . 1 ) ˆ 2 ∗/
t r =({ eps , 2 } ) ;
usetronc ( tr ) ;
/∗ perform t h e p r o d u c t ∗/
s 3 e=s 1 e ∗ s 2 e ;
/∗ remove t h e v a r i a b l e e p s from s 3 e ∗/
s 3=i n v s e r e p s ( s3e , eps , 0 . 1 ) ;
tr = ( { eps, 2 } )