Compression of Ultrasonic Files Ir-Sb-Ex-0501

Compression of Ultrasonic Files
SVERKER NYSTROM
Master of Science Thesis Stockholm, Sweden 2004-01-03
IR-SB-EX-0501
Master of Science thesis work by Sverker Nyström, December 2004
Westinghouse/WesDyne TRC
Department of Signals, Sensors and Systems, Royal Institute of Technology (KTH)
Examiner: Björn Völcker
Contents
1 Abstract
2 Introduction
  2.1 Background
3 Ultrasonic Equipment and Techniques
  3.1 Ultrasonic Instruments
  3.2 Ultrasonic Theory
  3.3 Inspection Sequences
4 Compression Methods
  4.1 Lossless Compression
    4.1.1 Huffman Coding
    4.1.2 Lempel-Ziv Coding
  4.2 Lossy Compression
5 Transform Theory
  5.1 Fourier Transform
  5.2 Short Time Fourier Transform (STFT)
  5.3 Wavelet Transforms
    5.3.1 Scale and translation
  5.4 Standard Wavelets vs. Wavelet Packets
  5.5 Two Dimensional Transforms
  5.6 Transform Compression
  5.7 Mathematical Error
6 File Compression
  6.1 Hardware compression
  6.2 Compression Scheme
  6.3 Lossy File Compression
  6.4 Noise
7 Experimental Result and Evaluation
  7.1 Signal Pre Processing
  7.2 Error Analysis
    7.2.1 Visual Error
    7.2.2 Visual File Evaluation
  7.3 Lossless Compression Results
  7.4 Lossy Compression Evaluation Results
  7.5 Reality Compression Tests
8 Conclusions and Future Work
9 References

Appendices
A1 Defect 4
  A1.1 Mathematical Error Calculation Results
  A1.2 Visual Pulse-echo Evaluation Results
  A1.3 Pulse-echo Plots
  A1.4 Visual TOFD Evaluation Results
  A1.5 TOFD Plots
A2 Defect 5
  A2.1 Mathematical Error Calculation Results
  A2.2 Visual Pulse-echo Evaluation Results
  A2.3 Pulse-echo Plots
  A2.4 Visual TOFD Evaluation Results
  A2.5 TOFD Plots
A3 Defect 8
  A3.1 Mathematical Error Calculation Results
  A3.2 Visual Pulse-echo Evaluation Results
  A3.3 Pulse-echo Plots
  A3.4 Visual TOFD Evaluation Results
  A3.5 TOFD Plots
1 Abstract
This thesis was carried out at WesDyne International in Pittsburgh, USA and at WesDyne TRC in Täby, Sweden. WesDyne International performs non-destructive testing using inspection techniques such as ultrasonics and eddy current. The inspection objects are mainly welds in nuclear power plants. Normally both data collection and data analysis are done on site, but there has been a wish to have some of the analysis done at the headquarters instead. The task for this thesis is to investigate the possibilities of transferring inspection data from the inspection site to the headquarters via a slow transmission line. Because many jobs are of short duration it is not always cost effective to set up a high-speed transmission line, and the only available transmission channel is then a telephone modem. The relatively large file sizes, around 200 MB, demand a high compression ratio, and the suggested solution is to use a lossy (destructive) transform compression technique based on wavelets in combination with WinZip. The results show that a compression ratio of 1:13 is achievable if a wavelet from the Daubechies family is used, which would reduce a 200 MB file to around 15 MB.
2 Introduction
2.1 Background
WesDyne International is a company that performs non-destructive testing of welds in nuclear power plants worldwide. The inspection techniques used are the ultrasonic technique (UT) and the eddy current technique (ET). An inspection team consists of around 20 persons, each with his or her own field of expertise. A crew can roughly be divided into three groups: equipment personnel, acquisition personnel and analyst personnel. Inspections are often performed at different sites simultaneously, which sometimes causes a shortage of qualified personnel. This shortage could be reduced if some of the inspection work could be done off site, e.g., from the headquarters. Only one of the three groups is suited for remote work, namely the analysts of the inspection data. If the collected data could be transferred from the inspection site to the headquarters, the analyst personnel could evaluate data from more than one inspection at the same time and thereby reduce the shortage to some extent. Because many jobs are of short duration it is not always cost effective, or even possible, to set up a high-speed transmission line, and the remaining alternative is to use the standard telephone network. The problem is that if a 200 MB file is transferred over the telephone network via a standard telephone modem with a maximum bit rate of 56600 bits/s, the total transfer time would theoretically be:
$$\frac{200\,000\,000 \cdot 8}{56\,600}\ \text{s} \approx 28\,269\ \text{s} \approx 7\ \text{hours}\ 51\ \text{minutes} \tag{1}$$
In reality the transmission time will most likely be twice as long due to network overhead, congestion and shared use. This means that to get a reasonable transmission time of around 60 minutes the file has to be compressed by a factor of at least 15. A natural first step is to try one of the many commercial compression programs on the market and see how effective it is. As seen later in the report, this simple approach only yields a compression ratio of around 1:2, which is not high enough to meet the original criterion. The idea in this thesis is to investigate whether this compression ratio can be improved by pre-processing the raw ultrasonic data and then, as a final step, using the well known compression program WinZip.
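The arithmetic behind equation (1) and the required compression factor can be checked with a short sketch. The function names and the `overhead_factor` parameter are illustrative only; the factor of 2 is the rough estimate stated in the text, not a measured value.

```python
def transfer_time_seconds(file_size_bytes, bit_rate_bps, overhead_factor=1.0):
    """Transfer time in seconds: bits to send divided by the line rate."""
    return file_size_bytes * 8 / bit_rate_bps * overhead_factor


def format_hours_minutes(seconds):
    """Format a duration as whole hours and whole minutes (seconds dropped)."""
    total_minutes = int(seconds // 60)
    return f"{total_minutes // 60} h {total_minutes % 60} min"


FILE_SIZE_BYTES = 200_000_000   # 200 MB, as in the text
MODEM_BIT_RATE = 56_600         # bits/s, as in the text

# Theoretical transfer time, equation (1).
ideal_seconds = transfer_time_seconds(FILE_SIZE_BYTES, MODEM_BIT_RATE)

# Assumed doubling for overhead, congestion and shared use.
realistic_seconds = transfer_time_seconds(FILE_SIZE_BYTES, MODEM_BIT_RATE, 2.0)

# Factor by which the file must shrink to be sent in about 60 minutes.
required_factor = realistic_seconds / 3600

print(format_hours_minutes(ideal_seconds))
print(round(required_factor, 1))
```

With these numbers the required factor lands between 15 and 16, which is where the "at least 15 times" requirement comes from.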
3 Ultrasonic Equipment and Techniques
3.1 Ultrasonic Instruments

The instruments used to generate the signals and collect the data are manufactured by R/D Tech (Tomoscan) and WesDyne (Intraspect). As far as data collection is concerned, the basic functions of the two models are the same. What differs is, for example, the signal processing and filtering functions. The data is also stored differently. The Tomoscan uses a TIFF-based file format (RDTIFF) [Benoit95], see figure 3.1, whereas the Intraspect always stores the data the same way, see figure 3.2. To extract the data from a TIFF file, a special reading program that decodes the file structure has to be used. The data analysis is, however, done the same way on both systems.
FILE HEADER
..
FILE CONTENTS GROUP
..
ACQUISITION DATA GROUP
ACQUISITION DATA
Figure 3.1: RDTIFF file data structure.
FILE HEADER
ACQUISITION DATA
Figure 3.2: Intraspect file data structure.
3.2 Ultrasonic Theory

The transducer is made of a piece of piezoelectric material. The asymmetric atomic structure of the material gives it certain electromechanical properties that are used to generate sound impulses. When an electric pulse is applied to such a material it contracts or expands, depending on the polarity of the pulse. This emits a sound wave with a wavelength proportional to the wavelength of the electric pulse. The opposite happens when a sound wave hits the surface of the material: as it contracts and expands, an electric voltage proportional to the amplitude of the incoming wave is generated. The crystal is normally attached to a piece of Plexiglas that works as a lens. This makes it possible to focus the sound wave at different depths.
To investigate if a weld or a piece of metal has any defects, such as internal cracks or material defects, a sound wave is sent into the material. The sound travels with different speed in different materials and therefore it is possible to calculate the time
$$t = \frac{s}{v}, \qquad s = \text{distance}, \quad v = \text{velocity} \tag{2}$$
it should take for the sound wave to travel through the material, reflect on the back side and return to the transducer. If the transmitted wave returns after t' < t time units, something else has reflected the pulse. This could for example be an air cavity or a crack, both of which have a lower density than metal. The frequency of the signal is selected depending on the material to be inspected, since the attenuation in the material is frequency dependent. The pulse generated by the UT instrument has the form of a negative square wave, half a period long, see figure 3.3. The pulse length t can be varied from 25 to 500 ns. The pulse amplitude can be set from 0 to 200 volts. Although the pulse from the instrument is a square wave, the actual sound wave coming out of the transducer has a sinusoidal shape due to the stiffness of the piezoelectric material, see figure 3.4. The knowledge of the pulse shape will be used later in this work.
Figure 3.3 [Plot of the instrument pulse; vertical axis: U.]
Figure 3.4 [Plot of the transducer output; vertical axis: U.]
The ultrasonic instruments also have a number of signal processing features. Different filters can be combined so that band pass, low pass and high pass filtering can be applied. Another form of filtering is the averaging function, which takes the average of a number of samples. If the built-in compression function is activated, a choice can be made to store only the highest peak of 2, 4, 8 or 16 consecutive samples. This can be seen as a form of down-sampling, which changes the relative bandwidth of the signal. The effect can be seen on the stored pulse, whose shape becomes more sawtooth-like due to the removal of samples, see figures 3.5 and 3.6. There are two different types of ultrasonic techniques that are used when inspecting a weld.
Pulse-echo: A sound pulse is transmitted and received by either one or two transducers. The most common penetration angles used are 0 degrees, 45 degrees and 70 degrees. The aim is to detect any potential indications (cracks, air pockets etc.). When evaluating a file the amplitude of the signal is important, i.e., signal amplitudes higher than a certain level are considered to be indications and need further evaluation. Figure 3.5 shows the pulse-echo principle together with a plot of a so-called A-scan. An A-scan is a picture that shows the echo signal from a transmitted pulse.
Figure 3.5: Pulse-echo principle (top), A-scan (bottom). [Diagram labels: UT probe(s), weld, crack, wave. A-scan axes: amplitude versus samples.]
Time of Flight Diffraction (TOFD): This method is used only if any indications have been found with the pulse-echo method and it is used to measure them with respect to depth, length and width. The important information in this method is the phase of the received signal. The amplitude is of less interest. It is important that the above mentioned signal properties are preserved as much as possible by the compression method. Figure 3.6 shows the TOFD principle together with a plot of an A-scan.
Figure 3.6: TOFD principle (top), A-scan (bottom). [Diagram labels: transmitting probe, receiving probe, weld, crack. A-scan axes: amplitude versus samples.]
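The built-in instrument compression described earlier in this section, which stores only the highest peak of 2, 4, 8 or 16 consecutive samples, can be sketched as follows. This is a minimal illustration, not the instrument's actual implementation: the function name is hypothetical, and "highest peak" is assumed here to mean the sample with the largest magnitude in each block (the instrument may instead keep the largest signed value).

```python
def peak_hold_decimate(samples, factor):
    """Keep one sample per block of `factor` consecutive samples.

    Assumption: the kept sample is the one with the largest magnitude.
    """
    if factor not in (2, 4, 8, 16):
        raise ValueError("factor must be 2, 4, 8 or 16")
    kept = []
    for start in range(0, len(samples), factor):
        block = samples[start:start + factor]
        kept.append(max(block, key=abs))  # largest magnitude in the block
    return kept


a_scan = [1, -3, 2, 0, 5, -4, -7, 6]
half_rate = peak_hold_decimate(a_scan, 2)
quarter_rate = peak_hold_decimate(a_scan, 4)
print(half_rate, quarter_rate)
```

Because only one sample per block survives, the stored A-scan has fewer points per period, which is consistent with the more sawtooth-like pulse shape mentioned in the text.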
3.3 Inspection Sequences

The received echo signal is normally sampled at 60 MHz, and depending on the thickness of the inspected object a different number of samples will be stored. One sampled echo signal is called an A-scan, see figures 3.5 and 3.6. The transducer is then moved a certain distance in the scan direction and another A-scan is collected. When a sufficient length has been scanned, a short step motion follows and the scanning continues in the opposite direction until the surface has been covered, see figure 3.7. Another way of presenting the data is to make a plot of all A-scans in one scan direction. Such a plot is called a B-scan. A B-scan can be seen as a slice that has been cut out of the scanned object. The amplitude levels in each A-scan are either colour coded or grey scale coded, which makes the B-scans look like those in figure 3.8. This makes it possible to see any irregularities in the material, which can be caused by, e.g., a crack. The B-scans in figure 3.8 come from a pulse-echo file and a TOFD file. Pulse-echo signals are normally colour coded, see figure 3.7, and TOFD signals grey scale coded. The bow-shaped section in the middle of the TOFD picture is what an indication typically looks like.
Figure 3.7: Scan sequence (top), colour coded A-scan (bottom). [Diagram labels: step direction, scan direction, transducer, A-scan, B-scan view. Plot axes: amplitude versus time.]
Figure 3.8: Pulse-echo B-scan (left), TOFD B-scan (right). [Image labels: step direction, scan direction, A-scan direction.]
4 Compression Methods
The purpose of compressing a file is to reduce its size, and this can be done in a number of different ways. Generally, compression is achieved by representing the original set of data more efficiently. This can be seen as reducing the amount of redundancy. The compression ratio is defined as:
$$\text{Compression Ratio} = \frac{\text{File Size After Compression}}{\text{File Size Before Compression}} \tag{3}$$
The compression techniques can be separated into two groups, lossy and lossless. In the case of lossless compression the file is completely reproducible and does not lose any information. This compression is achieved by representing the compressed data with fewer bits than the original data. Two lossless compression algorithms will be described in the section below, Huffman and Lempel-Ziv. They are both used in combination in the WinZip program.
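As a rough illustration of equation (3), the sketch below compresses a highly redundant byte sequence with Python's standard `zlib` module and computes the ratio. `zlib` implements DEFLATE, which combines an LZ77-type dictionary coder with Huffman coding, so it is used here as a stand-in for WinZip. The test data is made up for the example and says nothing about the ratios achievable on real ultrasonic files.

```python
import zlib


def compression_ratio(original: bytes, compressed: bytes) -> float:
    """Equation (3): size after compression divided by size before."""
    return len(compressed) / len(original)


# Made-up, highly redundant data: the pattern 0, 1, 2, 3 repeated 2500 times.
redundant = bytes([0, 1, 2, 3] * 2500)

packed = zlib.compress(redundant, level=9)

# Lossless: decompression reproduces the input exactly.
restored = zlib.decompress(packed)

ratio = compression_ratio(redundant, packed)
print(len(redundant), len(packed), ratio)
```

A ratio well below 1 here reflects only the artificial redundancy of the input; noisy measurement data typically compresses far less.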
4.1 Lossless Compression

4.1.1 Huffman Coding

The basic idea is borrowed from an older and slightly less efficient method called Shannon-Fano coding. Huffman coding belongs to the group of statistical compression algorithms. This means that the probabilities of all the symbols in a message have to be known in advance. The most probable symbols are assigned the shortest codewords, and so on. The encoding algorithm is as follows [Proakis, Salehi 94]:

1. Sort source outputs in decreasing order of their probabilities.
2. Merge the two least-probable outputs into a single output whose probability is the sum of the corresponding probabilities.
3. If the number of remaining outputs is 2 then go to the next step, otherwise go to step 1.
4. Arbitrarily assign 0 and 1 as code words for the two remaining outputs.
5. If an output is the result of the merger of two outputs in a preceding step, append the current codeword with a 0 and a 1 to obtain the codewords for the preceding outputs and repeat step 5. If no output is preceded by another output in a preceding step then stop.
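The steps above can be sketched in a few lines of Python. This is an illustrative variant rather than a transcription of the listed procedure: a priority queue replaces the repeated sorting, and the codeword bits are prepended while merging instead of being assigned in a separate backward pass. The symbol probabilities are invented for the example.

```python
import heapq
from itertools import count


def huffman_code(probabilities):
    """Return {symbol: codeword} for a dict {symbol: probability}."""
    if len(probabilities) == 1:
        return {symbol: "0" for symbol in probabilities}
    tie_breaker = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tie_breaker), {symbol: ""})
            for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least probable outputs (step 2).
        p0, _, group0 = heapq.heappop(heap)
        p1, _, group1 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in group0.items()}
        merged.update({s: "1" + code for s, code in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie_breaker), merged))
    return heap[0][2]


# Invented example probabilities.
source = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(source)
average_length = sum(source[s] * len(code[s]) for s in source)
print(code, average_length)
```

The most probable symbol receives the shortest codeword, and no codeword is a prefix of another, so the bit stream can be decoded without separators.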
4.1.2 Lempel-Ziv Coding

Lempel-Ziv is sometimes referred to as a substitutional or dictionary based encoding algorithm. The algorithm builds a dictionary of the data in an uncompressed data stream. Patterns of data (substrings) are identified in the data stream and are matched to entries in the dictionary. If the substring is not present in the dictionary, a code phrase is created based on the data content of the substring and stored in the dictionary. The phrase is then written to the compressed output stream. When a matching substring occurs in the data, the phrase in the dictionary is written to the output instead. Because the phrase value has a smaller physical size than the substring it represents, data compression is achieved. The example in figure 4.1 [Proakis, Salehi 94] shows a data sequence encoded with the Lempel-Ziv algorithm.
Data sequence: 0100001100001010000010100000110000010100001001001

Dictionary location    Dictionary contents    Codeword
 1   0001              0                      0000 0
 2   0010              1                      0000 1
 3   0011              00                     0001 0
 4   0100              001                    0011 1
 5   0101              10                     0010 0
 6   0110              000                    0011 0
 7   0111              101                    0101 1
 8   1000              0000                   0110 0
 9   1001              01                     0001 1
10   1010              010                    1001 0
11   1011              00001                  1000 1
12   1100              100                    0101 0
13   1101              0001                   0110 1
14   1110              0100                   1010 0
15   1111              0010                   0100 0
16                                            1110 1

Encoded sequence: 0000 0, 0000 1, 0001 0, 0011 1, 0010 0, 0011 0, 0101 1, 0110 0, 0001 1, 1001 0, 1000 1, 0101 0, 0110 1, 1010 0, 0100 0, 1110 1

Figure 4.1
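The encoding in figure 4.1 can be reproduced with a small sketch. Each codeword consists of the 4-bit dictionary location of the longest already known prefix (0000 when there is none) followed by the one new bit. The function name and the `index_width` parameter are illustrative; the scheme shown is the dictionary variant used in the figure, not necessarily the exact variant implemented in WinZip.

```python
def lempel_ziv_encode(bits, index_width=4):
    """Parse `bits` into new phrases and emit 'prefix-location new-bit' codewords."""
    dictionary = {}   # phrase -> dictionary location (1-based)
    codewords = []
    phrase = ""
    for bit in bits:
        if phrase + bit in dictionary:
            phrase += bit          # keep extending until the phrase is new
            continue
        prefix_location = dictionary.get(phrase, 0)   # 0 means empty prefix
        codewords.append(f"{prefix_location:0{index_width}b} {bit}")
        dictionary[phrase + bit] = len(dictionary) + 1
        phrase = ""
    if phrase:
        # Trailing bits that already form a known phrase.
        prefix_location = dictionary.get(phrase[:-1], 0)
        codewords.append(f"{prefix_location:0{index_width}b} {phrase[-1]}")
    return codewords, dictionary


data = "0100001100001010000010100000110000010100001001001"
codewords, dictionary = lempel_ziv_encode(data)
print(", ".join(codewords))
```

Note that 49 input bits become 16 codewords of 5 bits each, i.e. 80 bits: on such a short sequence the dictionary overhead dominates, and the gain only appears on longer inputs with recurring patterns.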
4.2 Lossy Compression

A much higher compression ratio is achievable if a lossy compression method is used. The disadvantage is that information is lost in the process. This loss of information can be seen as a filtering operation where unwanted data is removed. The filtering can be done either in the time domain or in some other domain, e.g., the frequency domain. Fourier, cosine and wavelet transforms are often used in this type of compression, where a signal is transformed to the frequency domain. They all use the same principle, i.e., to represent a signal in the time domain with a set of basis functions in the frequency domain. The type of basis functions used is what differs between the three methods. Filtering a signal in the frequency domain can be done by setting a number of coefficients to zero and then inverse transforming the signal back to the time domain. The principle of lossy compression in the frequency domain is to set as many coefficients as possible to zero without distorting the signal too much after it has been inverse transformed. The zeroed coefficients are then easily compressed further using any of the methods described in section 4.1.

In this report I have chosen to focus on lossy compression, since the compression ratio needs to be as high as possible due to the size of the files. A comparison will however be made with a lossless technique to show the difference. All calculations and tests will be done on six data files containing pulse-echo and TOFD data. The compression ratio will be calculated after the files have been pre-processed and compressed. The idea of this thesis is to investigate how different preparations of the data in the UT files affect the final compression result using WinZip. With a lossy compression algorithm it is impossible to reconstruct the original signal exactly. Sometimes, however, losing some information can be an advantage. Removing the noise from a noisy signal is an example of a positive loss of information.

The signals in this thesis will have varying amounts of noise added. Noise is normally a bigger problem when evaluating TOFD data. This is due to a preamplifier that has to be used to amplify the signal. The noise is modelled as additive white Gaussian noise. Another noise source at nuclear power plants is TIG welding equipment, which generates high frequency noise when in use. This will not be taken into account in this report. The inspected object itself also contributes a certain amount of material noise. The material noise depends on the size of the material grains.
5 Transform Theory
A useful definition in transform theory is the scalar product of two continuous functions defined as:
$$\langle f, g \rangle = \int_a^b f(x)\, g(x)\, dx \tag{4}$$
This can also be seen as the projection of the function f(x) onto the basis function g(x). Another interpretation is that it measures how similar f(x) and g(x) are. If $\langle f, g \rangle = 0$, the two functions are called orthogonal. Figure 5.1 shows a simple projection example in a Cartesian coordinate system.
A = (0,3), B = (4,0), V = (3,1)
V · A = (3 · 0) + (1 · 3) = 3
V · B = (3 · 4) + (1 · 0) = 12
A · B = (0 · 4) + (3 · 0) = 0 (orthogonal)
Figure 5.1: Vector projection example
5.1 Fourier Transform

A well known and very useful transform is the Fourier transform, developed by the French baron Jean-Baptiste Joseph Fourier in the 19th century. His idea was to represent a periodic signal f(t) with period T as a sum of weighted sine and cosine functions as
$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \bigl( a_n \cos(n \omega_T t) + b_n \sin(n \omega_T t) \bigr), \qquad \omega_T = \frac{2\pi}{T} \tag{5}$$
This sum is called the Fourier series of a signal [Petersson 97]. The basis functions used in the Fourier series are sine and cosine and have the following properties
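$$\langle \cos(nt), \cos(mt) \rangle = 0, \quad n \neq m \tag{6}$$
$$\langle \sin(nt), \sin(mt) \rangle = 0, \quad n \neq m \tag{7}$$
$$\langle \cos(nt), \sin(mt) \rangle = 0 \quad \text{for all } n, m \text{ (including } n = m) \tag{8}$$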
The coefficients $a_0$, $a_n$, $b_n$ can easily be calculated thanks to the orthogonality of the basis functions. A continuous aperiodic signal cannot be written as a Fourier series, but it can be written as a Fourier integral [Petersson 97].
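$$F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt \tag{9}$$
Fourier transform (FT).

$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{j\omega t}\, d\omega \tag{10}$$
Inverse Fourier transform (IFT).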
The function $F(\omega)$ is called the Fourier transform (FT) of $f(t)$, and conversely $f(t)$ is called the inverse Fourier transform (IFT) of $F(\omega)$. It should be noted that a Fourier transformed signal does not contain any information about where in time the different frequencies occur. It only gives the overall spectral content of the signal. This is due to the assumption that the signal to be transformed is stationary. One criterion of stationarity is that the frequency content does not change over time. In the discrete time domain the discrete Fourier transform (DFT) has to be used. The formulas for the DFT and the inverse DFT (IDFT) are [Proakis, Manolakis 96]
DFT:
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \qquad k = 0, 1, 2, \ldots, N-1 \tag{11}$$

IDFT:
$$x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, e^{j 2\pi k n / N}, \qquad n = 0, 1, 2, \ldots, N-1 \tag{12}$$
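Equations (11) and (12) translate directly into code. The sketch below is a deliberately naive O(N²) implementation, written only to mirror the formulas; practical software would use an FFT. The four-sample test signal is arbitrary.

```python
import cmath


def dft(x):
    """Equation (11): X(k) = sum over n of x(n) exp(-j 2 pi k n / N)."""
    n_points = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_points)
                for n in range(n_points))
            for k in range(n_points)]


def idft(spectrum):
    """Equation (12): x(n) = (1/N) sum over k of X(k) exp(+j 2 pi k n / N)."""
    n_points = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * n / n_points)
                for k in range(n_points)) / n_points
            for n in range(n_points)]


signal = [1.0, 2.0, 0.0, -1.0]      # arbitrary test samples
spectrum = dft(signal)
restored = idft(spectrum)
print(spectrum[0], spectrum[1])
```

Applying the IDFT to the DFT output returns the original samples up to rounding errors, which illustrates that the transform pair itself loses no information.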
5.2 Short Time Fourier Transform (STFT)

To overcome the time resolution problem of the Fourier transform, the signal is cut into small slices, each of which is then Fourier transformed. This can be seen as moving a rectangular window along the signal τ time units at a time, as shown in figure 5.2. At each position the window function w(t) is multiplied with the signal f(t) and the product is then Fourier transformed.

Figure 5.2: Windowing principle. [Plot labels: f(t), w(t), f(t)w(t); time axis t.]

To avoid large Fourier coefficients due to the sharp edges of the window function in figure 5.2, smoother window functions are normally used. Hanning, Hamming and Bartlett are examples of commonly used windows. Figure 5.3 shows a smoother window function.

Figure 5.3: Smoother window function, w(t). [Plot labels: f(t), w(t), f(t)w(t); time axis t.]
The resulting local time-frequency analysis procedure is called the Short Time Fourier Transform (STFT) or windowed Fourier transform. The STFT is defined as

$$\mathrm{STFT}(\tau, \omega) = \int_{-\infty}^{\infty} f(t)\, w^{*}(t - \tau)\, e^{-j\omega t}\, dt \tag{13}$$

where * denotes the complex conjugate. The window function thus controls the time-frequency resolution according to

Narrow window: good time resolution, poor frequency resolution.
Wide window: good frequency resolution, poor time resolution.
This time-frequency resolution compromise has its roots in the Heisenberg uncertainty principle. It states that one cannot know the exact time-frequency representation of a signal, i.e., one cannot know exactly which spectral components exist at which instants of time.
5.3 Wavelet Transforms

Wavelet transforms have become very popular within signal processing during the past decade, especially in the field of image compression (JPEG 2000). The wavelet transform is very similar to the Short Time Fourier Transform, i.e., it uses a window function to solve the time resolution problem. The analysis is done in a similar way to the STFT. There is however one main difference between the STFT and the wavelet transform: the width of the window function is changed for every single spectral component. This means that the time-frequency resolution is not fixed as it is in the STFT.
The variable time-frequency resolution of the wavelet transform (WT) compared to the FT and the STFT is shown in figure 5.4. The pictures show that, in contrast to the FT, the STFT and the wavelet transform have a time resolution as well as a frequency resolution. It can be seen how the wavelet transform has a varying time-frequency resolution depending on the frequency range of the signal. The high frequency content of a signal gets a lower frequency resolution but a better time resolution than the low frequency content.
Figure 5.4: Time-frequency resolutions for a) FT, b) STFT and c) WT. [Each panel: frequency versus time.]
The formula for the continuous wavelet transform is
$$\gamma(s, \tau) = \int f(t)\, \psi^{*}_{s,\tau}(t)\, dt \tag{14}$$

where * denotes the complex conjugate.
This shows how a signal f(t) is decomposed into a set of basis functions $\psi_{s,\tau}(t)$, where the two variables s and τ represent scale and translation. The function $\psi_{1,0}(t)$ is called the mother wavelet, and the wavelets are generated from this single function according to
$$\psi_{s,\tau}(t) = \frac{1}{\sqrt{s}}\, \psi\!\left( \frac{t - \tau}{s} \right) \tag{15}$$

where $\frac{1}{\sqrt{s}}$ is a normalization factor for the energy at different scales.
The mother wavelet is similar to the window function in the STFT. A difference between wavelet transforms and other transforms, e.g., the Fourier transform, is that the basis function $\psi(t)$ can be chosen and designed by the user as long as it satisfies certain conditions. Each wavelet family has a number of subclasses distinguished by the number of coefficients. Within each family the wavelets are often classified by the number of vanishing moments. This is an extra set of mathematical relationships that the coefficients must satisfy, and it is directly related to the number of coefficients. A higher number gives a smoother wavelet due to the increased number of coefficients. Figure 5.5 shows the Daubechies wavelet with 3 different vanishing moments and the Haar wavelet. The wavelets used in this report are the Daubechies wavelet with 8 vanishing moments and the Haar wavelet.
Figure 5.5: Daubechies and Haar wavelets. [Panel titles recovered: "Daubechies, 2 vanishing moments" and "Haar transform".]
From a formal point of view the mother wavelet $\psi(t)$ has to satisfy a number of conditions, the two most important being the admissibility and the regularity conditions [Valens 99]. The admissibility condition is defined as

$$\int \frac{|\Psi(\omega)|^2}{|\omega|}\, d\omega < +\infty, \qquad \Psi(\omega) = \text{Fourier transform of } \psi(t) \tag{16}$$

This implies that the function $\Psi(\omega)$ vanishes at zero frequency, i.e.,

$$|\Psi(\omega)|^2 \,\Big|_{\omega = 0} = 0 \tag{17}$$

This means that the average value in the time domain must be zero as well, i.e., the positive and negative areas under the curve must cancel out

$$\int \psi(t)\, dt = 0 \tag{18}$$

A function with these properties is an oscillating function, which means that $\psi(t)$ is a wave. This is where the word wavelet comes from. The regularity condition has to do with the approximation order of the wavelet transform and the decay of the coefficients $\gamma(s, \tau)$: if the wavelet has N vanishing moments, then the approximation order of the wavelet is also N.
5.3.1 Scale and translation

The two parameters s (scale) and τ (translation) are used when a signal f(t) is transformed. The s parameter can be seen as a zooming tool which dilates and compresses the mother wavelet and thus changes the frequency resolution. A small s value compresses the mother wavelet, whereas a high value dilates or stretches it out. The parameter τ is used to slide the wavelet over the signal at the different scales. For every translation τ the signal f(t) is multiplied with the wavelet and the product is integrated over all times. This procedure is repeated until the end of the signal is reached. Then the scale is changed and the whole procedure is repeated. When the scale has a value that makes the wavelet curve similar to the signal f(t) at a certain time instant, the multiplication and integration give a large value compared to when they do not match very well. Figures 5.6, 5.7 and 5.8 illustrate how an input signal is affected by a wavelet with 3 different scale values. In this example the input signal f(t) (figure 5.6) is a wavelet called the Mexican hat. Figure 5.7 shows 3 plots of the Daubechies wavelet with 8 vanishing moments and 3 different scale values. The last plot, figure 5.8, shows how the output signal varies in amplitude depending on how well the input signal f(t) matches the wavelet. From the output plot it can be seen that the middle wavelet matches the input signal best, as it gives the highest output of all 3 wavelets.
Figure 5.6: Mexican hat wavelet as input signal f(t). [Axes: amplitude versus time.]
Figure 5.7: Daubechies wavelets with 3 different scale values. [Axes: amplitude versus time.]
Figure 5.8: The output signals from the 3 differently scaled wavelets with the Mexican hat as input signal. [Vertical axis: amplitude.]
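The matching effect described in section 5.3.1 can be imitated numerically. The sketch below is a simplified variant of the experiment in figures 5.6 to 5.8: the Mexican hat is used both as the input signal and as the analysing wavelet (the thesis uses a Daubechies wavelet, which has no closed-form expression), and the integral in equation (14) is approximated by a plain Riemann sum. The scale values, integration limits and step size are arbitrary choices.

```python
import math


def mexican_hat(t):
    """Unnormalized Mexican hat wavelet, (1 - t^2) exp(-t^2 / 2)."""
    return (1.0 - t * t) * math.exp(-t * t / 2.0)


def scaled_wavelet(t, s, tau):
    """Equation (15) applied to the Mexican hat as mother wavelet."""
    return mexican_hat((t - tau) / s) / math.sqrt(s)


def cwt_coefficient(signal, s, tau, t_min=-20.0, t_max=20.0, dt=0.01):
    """Riemann-sum approximation of equation (14) for a real-valued wavelet."""
    steps = int(round((t_max - t_min) / dt))
    total = 0.0
    for i in range(steps + 1):
        t = t_min + i * dt
        total += signal(t) * scaled_wavelet(t, s, tau)
    return total * dt


scales = [0.5, 1.0, 2.0]            # arbitrary example scales
responses = {s: cwt_coefficient(mexican_hat, s, tau=0.0) for s in scales}
best_scale = max(responses, key=lambda s: abs(responses[s]))
print(responses, best_scale)
```

Since the input here is the mother wavelet at scale 1, the largest coefficient is obtained for s = 1, in line with the observation that the best matching wavelet gives the highest output.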
5.4 Standard Wavelets vs. Wavelet Packets

Wavelet packets are outside the scope of this report but will be briefly described in this section due to their relation to standard wavelets. A description of the DWT (Discrete Wavelet Transform) makes wavelet packets easier to understand. A time-scale representation of a digital signal is obtained using digital filtering techniques. The CWT (Continuous Wavelet Transform) was computed by changing the scale of the analysis window, shifting the window in time, multiplying it by the input signal and integrating over all times. In the discrete case, filters with different cut-off frequencies are used to analyze the input signal at different scales. The DWT analyzes the input signal in different frequency bands with different resolutions by decomposing it into a coarse approximation and detail information with the use of two sets of functions, called the scaling function and the wavelet function. The scaling function is associated with low pass filtering and the wavelet function with high pass filtering. The decomposition of the signal is then obtained by successive high pass and low pass filtering of the time domain input signal, see figure 5.9. The filtering operation corresponds to convolution of the signal x[n] with the impulse response of the filter h[n] as:
$$y[n] = x[n] * h[n] = \sum_{k=-\infty}^{\infty} x[k]\, h[n-k] \tag{19}$$
Figure 5.9: Low and high pass filtering of the signal x[n]. [Diagram: x[n] is fed to l[n] (LP filter), giving ylow[n], and to h[n] (HP filter), giving yhigh[n].]
If the original signal x[n] has a frequency content of $(0, \omega_s/2)$ and the HP and LP filters used are half band filters, the filtered outputs ylow[n] and yhigh[n] will have the frequency bands $(0, \omega_s/4)$ and $(\omega_s/4, \omega_s/2)$ respectively. After the filtering operation, half of the samples can be eliminated according to Nyquist's rule, and the signal is therefore decimated by 2 by discarding every other sample. Figure 5.10 shows the filtering and decimation process.
Figure 5.10 [Diagram labels: x[n], l[n], h[n], ylow[n], yhigh[n]; frequency axes f.]
A signal that is passed through such a filter is said to be decomposed one level. The level of decomposition is varied by repeatedly filtering the LP part of the signal until only two samples remain, which means that the signal has been fully decomposed. A three level DWT decomposition of the signal x[n] is shown in figure 5.11 as an example.
Figure 5.11: Three level DWT decomposition of signal x[n]. [Diagram: x[n] passes through three successive LP/HP filter pairs, each output followed by decimation by 2.]
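The filter-and-decimate scheme can be sketched with the Haar wavelet, whose low pass and high pass filters have only two taps each, so that filtering followed by decimation by 2 reduces to scaled pairwise sums and differences. The function names and the eight-sample input are illustrative; the Daubechies filters used later in the thesis are longer but follow the same structure.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_step(x):
    """One decomposition level: LP and HP filtering followed by decimation by 2."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_dwt(x, levels):
    """Repeatedly decompose the LP part; len(x) must be divisible by 2**levels."""
    details = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def haar_idwt(approx, details):
    """Invert haar_dwt by undoing the levels in reverse order."""
    x = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(x, detail):
            rebuilt.append((a + d) / SQRT2)
            rebuilt.append((a - d) / SQRT2)
        x = rebuilt
    return x


samples = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]   # arbitrary example
approx, details = haar_dwt(samples, levels=3)
restored = haar_idwt(approx, details)
print(approx, [len(d) for d in details])
```

For eight input samples, three levels leave a single approximation coefficient plus detail bands of 4, 2 and 1 coefficients, so the total number of coefficients equals the number of input samples and the decomposition is exactly invertible.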
As can be seen only the LP part of the signal is passed on to the next level of filtering and this is what differs between the wavelet packet and the wavelet decomposition. A wavelet packet decomposition uses both the LP and the HP part of the filtered signal, see figure 5.12, which will increase the possibilities of an efficient representation of the signal.
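Figure 5.12: Wavelet packet decomposition scheme. [Diagram: x[n] passes through an LP/HP filter pair with decimation by 2, after which both the LP and the HP outputs are filtered and decimated again.]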
Without going too deep into the theory of wavelet packets, one advantage is the possibility to choose the best basis for a given application. This means that the optimal representation of a signal is calculated with the help of a so-called cost function [Jensen, la Cour-Harbo 01]. After the signal has been fully decomposed, the cost function is applied to the elements in each level of the decomposition. The best representation of the signal is then found by taking the elements which correspond to the lowest cost values. This means that the user can design the cost function in such a way that it chooses the best basis for a specific application. One example is the picture format JPEG 2000, where the cost function has been designed according to the sensitivity of the human eye. The use of different cost functions changes the time-frequency resolution of a signal, and it will no longer always look like it does in figure 5.4c. Figure 5.13 shows four example plots of different time-frequency resolutions obtained with different cost functions. In this report we will however stick with the description in figure 5.4c.
Figure 5.13: Examples of time-frequency representations obtained with different cost functions. [Four panels, each frequency versus time.]
5.5 Two Dimensional Transforms

The transforms dealt with so far have all been applied to one dimensional data sequences. As described in section 3.3, the scan pattern forms a two dimensional picture (or three dimensional if the amplitude is taken into account). This makes it possible to apply a two dimensional (2D) transform to it. If the scanned data is viewed as an M × N matrix, then the 2D discrete Fourier transform is defined as:
$$F(u, v) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-j 2\pi \left( \frac{ux}{M} + \frac{vy}{N} \right)}, \qquad M = \text{rows}, \quad N = \text{columns} \tag{20}$$
When a 2D transform is applied to a data matrix it involves a number of 1D transformations. More precisely, it is achieved by first transforming each row, replacing each row with its transform and then transforming each column replacing it with its transform. The same theory is applicable on the 2D wavelet transforms.
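The row-then-column procedure can be checked against a direct evaluation of equation (20). The sketch below uses a naive 1D DFT and a made-up 2 × 3 matrix; the helper names are illustrative.

```python
import cmath


def dft(x):
    """Naive 1D DFT without normalization."""
    n_points = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_points)
                for n in range(n_points))
            for k in range(n_points)]


def dft2_separable(matrix):
    """2D DFT of equation (20): transform every row, then every column."""
    m_rows, n_cols = len(matrix), len(matrix[0])
    rows_done = [dft(row) for row in matrix]
    # columns[v][u] holds the transform of column v of the row-transformed data
    columns = [dft([rows_done[x][v] for x in range(m_rows)])
               for v in range(n_cols)]
    return [[columns[v][u] / (m_rows * n_cols) for v in range(n_cols)]
            for u in range(m_rows)]


def dft2_direct(matrix, u, v):
    """Direct double sum of equation (20) for one frequency pair (u, v)."""
    m_rows, n_cols = len(matrix), len(matrix[0])
    total = sum(matrix[x][y]
                * cmath.exp(-2j * cmath.pi * (u * x / m_rows + v * y / n_cols))
                for x in range(m_rows) for y in range(n_cols))
    return total / (m_rows * n_cols)


data = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0]]            # made-up 2 x 3 example
separable = dft2_separable(data)
print(separable[0][0])
```

Both routes give the same coefficients, which is what makes the 2D transform computable as a sequence of 1D transforms.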
5.6 Transform Compression

The transform operation itself is not lossy. It is described in this section because it can be used as a filter, and it then becomes lossy. By definition, however, a transform moves a signal or a function between different domains, e.g. from the time domain to the frequency domain, without any loss of information. The following example illustrates the principle of using a Fourier transform to compress a signal by setting the smallest coefficients to zero. As a comparison the same operation is performed on the signal in the time domain. We wish to get a 1:2 compression ratio by setting the 50% smallest values equal to zero. The signal f(t) is 3000 samples long and consists of two sinusoids with different frequency and amplitude, see figure 5.14. The notation F(f) is used to denote the signal f(t) in the frequency domain.
$$f(t) = \sin(\omega_0 t) - \theta(t - 1500)\left(\sin(\omega_0 t) - 0.05 \sin(50\,\omega_0 t)\right), \qquad \omega_0 = 2\pi f_0,\ \theta = \text{Heaviside step function} \tag{21}$$
Figure 5.14: The original sampled signal f(nT).
In the time domain the zeroing operation would result in the signal g(t) which is plotted in figure 5.15.
Figure 5.15: Filtered sampled signal g(nT ).
This signal is completely different from the original signal and it contains only one frequency. If the original signal f(t) is Fourier transformed it will have two peaks with different heights corresponding to the two sinusoid signals, see figure 5.16. The height differences represent different energies and are due to the different amplitudes of the two signals.
Figure 5.16: F(f) with normalized energy. (Axes: Energy versus Frequency, from -fs/2 to fs/2.)
If the same operation is done on F(f), i.e. the 50% smallest coefficients are set equal to zero to give E(f), and the result is then inverse transformed, the signal will look like figure 5.17.
Figure 5.17: Inverse transform of filtered signal E(f). (Axes: Amplitude versus Samples.)
This signal is almost identical to the original and it has been reconstructed with only half the number of coefficients. This is the basic principle of lossy signal compression with transforms: many coefficients are discarded and a good result is still achieved. The result does however depend on what kind of signal it is. In this example the signal contained two sinusoids only, and since the Fourier transform has sinusoids as its basis functions a good result should be expected. A downside with the Fourier transform is that it has no time resolution. It only shows the different frequencies in a signal but not where in time they occur. This is no problem if the signal is stationary, i.e. all frequencies exist at all times. The signal f(t) is however nonstationary. To visualize this effect a stationary signal, s(t), containing the same two frequencies is Fourier transformed. The signal is:
$$s(t) = \sin(\omega_0 t) + 0.05 \sin(50\,\omega_0 t), \qquad \omega_0 = 2\pi f_0 \tag{22}$$
In this signal the two frequencies occur at the same time and have the same amplitude and frequency as the previous signal f(t). Figure 5.18 shows a plot of s(t).
Figure 5.18: The sampled signal s(nT). (Axes: Amplitude versus Samples.)
To show the similarities between F(f) and S(f) two comparative plots have been made. Figure 5.19 shows F(f) again and figure 5.20 shows S(f).
Figure 5.19: F(f) with normalized energy. (Axes: Energy versus Frequency, from -fs/2 to fs/2.)
Figure 5.20: S(f) with normalized energy. (Axes: Energy versus Frequency, from -fs/2 to fs/2.)
As can be seen, it is not possible to determine which of the two frequency plots comes from which signal.
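The zeroing experiment of this section can be imitated with a short script. The sketch below is not the Matlab code used for the thesis. It is written in Python, uses a naive DFT, and shortens the signal to 512 samples (instead of 3000) with an assumed base frequency of three cycles per record, purely to keep the run time short. The helper names are hypothetical. It zeroes the 50% smallest values once in the time domain and once in the frequency domain and compares the outcome.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive DFT with a precomputed twiddle table; the inverse divides by N."""
    n = len(x)
    sign = 1 if inverse else -1
    w = [cmath.exp(sign * 2j * cmath.pi * i / n) for i in range(n)]
    out = [sum(x[t] * w[(k * t) % n] for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out


def keep_largest(values, fraction):
    """Keep the given fraction of values with the largest magnitude, zero the rest."""
    keep = int(len(values) * fraction)
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    kept = set(order[:keep])
    return [v if i in kept else 0 for i, v in enumerate(values)]


def energy(x):
    return sum(abs(v) ** 2 for v in x)


N = 512                          # assumed length, shorter than the 3000 samples in the text
W0 = 2 * math.pi * 3 / N         # assumed base frequency: 3 cycles per record
half = N // 2

# Same structure as equation (21): strong sinusoid first, weak fast sinusoid afterwards.
f = [math.sin(W0 * n) if n < half else 0.05 * math.sin(50 * W0 * n) for n in range(N)]

g = keep_largest(f, 0.5)                                          # zeroing in the time domain
e = [v.real for v in dft(keep_largest(dft(f), 0.5), inverse=True)]  # zeroing in the frequency domain

err_time = energy([a - b for a, b in zip(f, g)]) / energy(f)
err_freq = energy([a - b for a, b in zip(f, e)]) / energy(f)

# Energy left in the second half of the record, where the weak sinusoid lives.
tail_orig = energy(f[half:])
tail_time = energy(g[half:])
tail_freq = energy(e[half:])
```

In this setting the time-domain zeroing removes almost all of the weak sinusoid, while the frequency-domain zeroing keeps most of its energy, which is the behaviour described for figures 5.15 and 5.17.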
5.7 Mathematical Error

The error can be defined in many different ways. The error measure used here is defined as:
$$\text{Error} = \frac{\sum_{i=1}^{m} (x_i - \hat{x}_i)^2}{\sum_{i=1}^{m} x_i^2}, \qquad m = \text{number of samples} \tag{23}$$
where $x$ is the original signal and $\hat{x}$ is the reconstructed signal after compression. The squared error from each compression method is thus divided by the sum of the squares of the original samples, and the result is plotted in a diagram. This way of measuring the error is motivated by the compression technique used, i.e. setting small coefficient values to zero. Plots of the calculated errors are found in section 7.2. The error tables are sorted defect wise in the appendices.
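As a small illustration of this error measure, the following Python sketch computes the squared error normalised by the energy of the original samples. The function name is hypothetical and the code is not taken from the thesis software.

```python
def relative_squared_error(original, reconstructed):
    """Sum of squared differences divided by the sum of squared original samples."""
    if len(original) != len(reconstructed):
        raise ValueError("signals must have the same number of samples")
    numerator = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    denominator = sum(x * x for x in original)
    return numerator / denominator


same = relative_squared_error([3.0, 4.0], [3.0, 4.0])        # perfect reconstruction
all_zero = relative_squared_error([3.0, 4.0], [0.0, 0.0])    # everything set to zero
partial = relative_squared_error([1.0, 2.0, 2.0], [1.0, 2.0, 1.0])
```

A perfect reconstruction gives 0, and a reconstruction that is zero everywhere gives exactly 1, so values close to 1 mean that almost none of the signal energy has been recovered.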
6 File Compression
The examined files contain three cracks in total and each crack has been scanned with one pulse-echo and one TOFD probe. This gives a total of six files to be compressed. To make sure no false indications are created in the compression process one of the examined cracks is below detection level which means that it is a crack but not big enough to be reportable. To save computational time the files have been modified so they only contain the defect areas. This has saved a lot of time as each file has been compressed and decompressed four times using three different compression algorithms. The software used is Matlab Student Version 6.0 R12 with the Wavelet toolbox and Borland C++ 5.02. All compression work has been done with Matlab and the data extraction and file modification has been done with Borland C++. To be able to read the headers in the UT files a special RDTIFF reading program, RDTV from R/D Tech has been used. This program makes it possible to determine the size of the raw UT data and where it is stored in the file.
6.1 Hardware compression

The hardware compression function built into the Tomoscan is used on the pulse-echo files in this report. It has been set to save the largest of 8 consecutive samples. This compression/down sampling operation distorts the original sinusoidal shaped pulse and makes it more saw tooth like. The cutting of the signal by the down sampling process adds a small contribution of high frequency components to its frequency spectrum. A broader frequency spectrum means fewer small-valued coefficients in the frequency domain, so fewer coefficients can be set to zero before the signal shape is affected. Hardware compressed pulse-echo files are therefore more sensitive to high compression ratios than TOFD files. Figures 6.1 and 6.2 show an A-scan from a TOFD file and from a pulse-echo file, both taken from one of the defect files examined in the report. It is clearly visible how the down sampling operation affects the signal. An advantage of this down sampling technique compared to just taking every 8th value is that the amplitude resolution gets higher.
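The difference between saving the largest of 8 consecutive samples and taking every 8th sample can be sketched as follows. This is only an illustration in Python, not the instrument's actual implementation. It assumes that "largest" means the largest signed value in each block, and the function names are hypothetical.

```python
def peak_hold(samples, factor=8):
    """Keep the largest sample from each block of `factor` consecutive samples."""
    return [max(samples[i:i + factor]) for i in range(0, len(samples), factor)]


def decimate(samples, factor=8):
    """Plain down sampling: keep every `factor`-th sample."""
    return samples[::factor]


# A narrow echo whose peak does not fall on a multiple of 8.
ascan = [0] * 16
ascan[3] = 9

held = peak_hold(ascan)      # the peak survives in the first block
thinned = decimate(ascan)    # the peak is missed entirely
```

In this toy A-scan the peak-hold version still contains the echo amplitude, while plain decimation loses it.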
Figure 6.1: A-scan from a TOFD file, no hardware compression. (Axes: Amplitude versus Samples.)
Figure 6.2: A-scan from a pulse-echo file with hardware compression. (Axes: Amplitude versus Samples.)
6.2 Compression Scheme

As mentioned in the preface the objective of this report is to investigate if a high enough compression ratio is achievable if the UT-files are pre processed in some way before the final WinZip compression. A flow chart of the complete compression and decompression process is shown in figure 6.3. The compression described here does not have anything to do with the hardware compression in section 6.1. These files have been compressed after the data has been collected and saved.
Figure 6.3: File compression scheme and file decompression scheme.
File compression scheme: UT-file, Extract UT-data, Split File into UT-data and File Header; UT-data: Transform, Lossy Compression, WinZip Compression; File Header: WinZip Compression; Rebuild Compressed File, Send File.
File decompression scheme: Receive Compressed File; UT-data: WinZip Decompression, Inverse Transform; File Header: WinZip Decompression; Rebuild Original File.
6.3 Lossy File Compression

The pre processing of the data in this report is of lossy character only, due to the wish of getting as high a compression ratio as possible. Three slightly different transforms have been studied:
The Fourier transform. This method was selected because it is such a common transform and because it can be implemented using fast algorithms (the Fast Fourier Transform). Its basis functions are also very similar to the UT transmitting pulse, which is an advantage.
The Haar wavelet transform. This is the simpler of the two wavelets used and thus the simplest to implement on a computer. It was chosen out of interest to see how the shape of the basis functions affects the final result.
The Daubechies wavelet transform. A widely used transform known for its good results in e.g. image compression. A transform with eight vanishing moments is used in this report, in the hope that it will give a good result. The downside with this transform is that it requires more calculations than the Haar transform.
The files have been compressed with four different compression ratios and only with the 2D compression method. A comparison with the 1D transform would be interesting but would be too time consuming due to the extra evaluation work required. The compression ratios are 1:5, 1:6.7, 1:20 and 1:100. A compression ratio of 1:5 means that one fifth of the transformed coefficients have been saved and that four fifths have been set to zero.
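To make the meaning of a 1:5 ratio concrete, the sketch below transforms a signal, keeps the fifth of the coefficients with the largest magnitude and transforms back. It is a simplified illustration only. It is written in Python, uses a one dimensional orthonormal Haar transform instead of the 2D transforms used in the report, and all function names are hypothetical.

```python
import math


def haar_forward(x):
    """Full orthonormal Haar decomposition; len(x) must be a power of two."""
    out = list(x)
    n = len(out)
    s = math.sqrt(2)
    while n > 1:
        half = n // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / s for i in range(half)]
        dif = [(out[2 * i] - out[2 * i + 1]) / s for i in range(half)]
        out[:n] = avg + dif          # averages first, details after them
        n = half
    return out


def haar_inverse(c):
    """Inverse of haar_forward."""
    out = list(c)
    n = 1
    s = math.sqrt(2)
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [(a + d) / s, (a - d) / s]
        out[:2 * n] = merged
        n *= 2
    return out


def keep_fraction(coeffs, ratio):
    """Compression ratio 1:ratio, i.e. keep len(coeffs)/ratio largest coefficients."""
    keep = round(len(coeffs) / ratio)
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:keep])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]


signal = [5.0] * 32 + [-3.0] * 32            # a step, which suits the Haar basis
coeffs = haar_forward(signal)
compressed = keep_fraction(coeffs, 5)        # ratio 1:5
restored = haar_inverse(compressed)
max_err = max(abs(a - b) for a, b in zip(signal, restored))
```

The step signal is chosen because it matches the square shaped Haar basis functions, so it survives the 1:5 compression essentially unchanged. A sinusoidal pulse would not be represented as compactly in this basis.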
6.4 Noise
The addition of unwanted noise is always a problem because it distorts the signal and thus makes the data more difficult to analyse. The noise has various sources. The most common is thermal noise, which is present all the time. The two methods used (pulse-echo and TOFD) are not equally sensitive to noise because they are evaluated differently. As the purpose of the pulse-echo method is to detect cracks, the only interest is the amplitude of the received signal and normally the noise is not a problem. The TOFD method on the other hand requires the use of a preamplifier to boost the received signal, which means that the noise is amplified as well. When a TOFD signal is evaluated the phase of the received signal is what is important, and the added noise often makes it hard to determine the phase. The UT instruments have a built-in noise reducing feature called averaging, which means that the average of 2, 4, 8 or 16 consecutive A-scans is taken. This reduces the noise to some extent. One downside with this function is that the scanning speed has to be reduced when higher averaging is used, e.g. 8 and 16.
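The averaging feature can be pictured with the following minimal Python sketch. It is not the instrument's implementation. It simply assumes that each group of consecutive A-scans is averaged sample by sample, and the function name is hypothetical.

```python
def average_ascans(ascans, count):
    """Average each group of `count` consecutive A-scans sample by sample."""
    averaged = []
    for start in range(0, len(ascans) - count + 1, count):
        group = ascans[start:start + count]
        averaged.append([sum(column) / count for column in zip(*group)])
    return averaged


# Two noisy recordings of the same echo: the noise has opposite sign and cancels.
scans = [[1.0, 9.0, -1.0], [-1.0, 11.0, 1.0]]
clean = average_ascans(scans, 2)
```

Each averaged A-scan replaces `count` recorded ones, which is consistent with the need to lower the scanning speed when 8 or 16 A-scans are averaged.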
7 Experimental Result and Evaluation
7.1 Signal Pre Processing

As mentioned in sections 2.1 and 6.2 the UT files have to be pre processed before they are compressed. This is done with a small C program that reads the UT data from the original file, which is a form of TIFF file called RDTIFF. The program extracts all the UT data by reading the address pointers in the file and then stores the data as one long vector instead of being spread out as it is in an RDTIFF file. The raw UT data is then transformed to the frequency domain and compressed with different ratios. In real use it would then be WinZipped. During the analysis the file is instead inverse transformed directly after the compression and then evaluated. Before the evaluation the UT data has to be put back into the original UT file by another small C program.
7.2 Error Analysis

The error analysis is divided into two parts. Part one is a mathematical analysis where the error is calculated for different compression ratios and compression techniques. Part two is a visual evaluation of the compressed files based on criteria used in reality. The mathematical error is defined in section 5.7. The result of the error calculations is presented in graphical form below (figures 7.1-7.6). The graphs show the error as a function of the percentage of saved elements for the different transforms. As can be seen the TOFD files are less affected by high compression ratios than the pulse-echo files, and the Daubechies transform seems to give the smallest error overall.
Figure 7.1: Error plots from the defect 4 pulse-echo file.
Figure 7.2: Error plots from the defect 4 TOFD file.
Figure 7.3: Error plots from the defect 5 pulse-echo file.
Figure 7.4: Error plots from the defect 5 TOFD file.
Figure 7.5: Error plots from the defect 8 pulse-echo file.
Figure 7.6: Error plots from the defect 8 TOFD file.
(All six plots show Error on a logarithmic axis versus % saved elements.)
To show the importance of combining a visual evaluation with a mathematical error calculation the following example can be studied. It shows an uncompressed A-scan compared with the same A-scan compressed with the FFT and the Haar transform respectively. The A-scan comes from defect 8 and is scanned with a pulse-echo probe. The chosen compression ratio is 1:20, which means that 95% of the coefficients are set to zero. According to table AT3 in appendix A3.1 this compression gives approximately the same mathematical error for both methods, i.e. FFT = 0.435510 and Haar = 0.440680. The difference between the two compressed scans, figures 7.8 and 7.9, and the original scan, figure 7.7, is nevertheless clearly visible.
Figure 7.7: Original A-scan. (Axes: Amplitude versus Samples.)
Figure 7.8: FFT compressed A-scan. (Axes: Amplitude versus Samples.)
Figure 7.9: Haar compressed A-scan. (Axes: Amplitude versus Samples.)
7.2.1 Visual Error

As mentioned earlier there are two different types of UT-files, pulse-echo and TOFD. The evaluation is done using different criteria depending on the type. As the pulse-echo method is used to detect cracks, the received signal amplitude is of most interest. When such a file is analyzed the signal is often rectified, and if the peak amplitude at a certain distance exceeds a predefined level it is considered to be an indication. From a compression point of view it is therefore important that the compression affects the signal amplitude as little as possible. The way the compression is achieved in this case, i.e. by setting a number of small coefficients in the frequency domain to zero, should not affect a strong echo signal too much as it would be represented by a large coefficient in the frequency domain. When a TOFD-file is analyzed the phase of the signal is most important. When there is a crack or air pocket in the material the signal is reflected, due to the different reflection index, when it hits the material/air interface. This reflection causes it to change 180 degrees in phase. So in this case it is important that the compression does not change the phase of the signal. Again, zeroing a number of small coefficients only reduces the frequency content of the original signal and not the phase, so the assumption is that there will not be any significant changes of the signal phase. Even if the compression does not change the phase notably it must not change the shape of the ultrasonic signal, in this case the A-scan. It is however inevitable that the signal shape is lost if the signal has been transformed with basis functions that do not match it in the first place. This phenomenon was seen on the Haar compressed files when they were to be evaluated.
The original pulse shape was so distorted by the Haar wavelet basis function, see figure 5.5, that the method was dismissed from the beginning and thus never evaluated. This conclusion would not have been drawn with the mathematical error calculations as the only quality measurement, as the Haar and the FFT methods give roughly similar error results. The signal distortion is most apparent on the Haar compressed TOFD files as they are recorded without the hardware compression and are thus not as saw tooth shaped as the pulse-echo files. Figure 7.10 shows a TOFD A-scan which has been compressed and reconstructed with the Daubechies and the Haar wavelets, figures 7.11 and 7.12. The square wave look of the Haar signal originates from its basis function and it was considered to be too distorted to evaluate.
Figure 7.10: Uncompressed TOFD A-scan. (Axes: Amplitude versus Samples.)
Figure 7.11: Daubechies compressed TOFD A-scan. (Axes: Amplitude versus Samples.)
Figure 7.12: Haar compressed TOFD A-scan. (Axes: Amplitude versus Samples.)
7.2.2 Visual File Evaluation

The compressed files have been evaluated by two persons with a level 2 ultrasonic certificate. This means that they are qualified to collect and evaluate ultrasonic data at nuclear power plants in Sweden and in other countries around the world. The persons were given the original files together with the compressed ones without knowing the compression level of each file. The compressed files are labelled as the example below shows.
pe4_2d_db8_1 = FileType_CompDimension_Transform_FileNumber
FileType: pe = pulse-echo or to = TOFD + defect number.
CompDimension: Dimension of compression method, always 2D.
Transform: Used transform. FFT, Db8 or Haar.
FileNumber: Random number between 1 and 4.
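The labelling convention above is simple enough to decode mechanically. The following Python sketch splits a label into its four parts. The function name and the dictionary keys are hypothetical and only serve the illustration.

```python
def parse_label(label):
    """Split a label such as 'pe4_2d_db8_1' into the fields described above."""
    file_type, dimension, transform, number = label.split("_")
    probes = {"pe": "pulse-echo", "to": "TOFD"}
    return {
        "probe": probes[file_type[:2].lower()],
        "defect": int(file_type[2:]),
        "dimension": dimension.upper(),
        "transform": transform,
        "file_number": int(number),
    }


example = parse_label("pe4_2d_db8_1")
```

Note that the file number carries no information about the compression ratio, which is exactly why the evaluators could not infer the ratio from the name.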
The random numbering of the files is done to remove the compression ratio information that could affect the evaluation result. The need to have the original file as a reference is however making the evaluation a bit biased but could not be avoided. The three defects are called defect 4, defect 5 and defect 8. The three files are all a bit different from each other.
Defect 4: This file contains an indication that is below the reportable limit. In this case reportable limit means that the received signal amplitude must reach over a specific level in three consecutive scans. A reportable defect has to be scanned with a TOFD probe to determine its size. This defect was chosen to see if any of the compression methods and ratios made it reportable, i.e. made the signal level reach over the critical level. The TOFD file is evaluated for this report anyway. The signal does not contain a lot of noise.
Defect 5: This file has one reportable indication which is measured with a TOFD probe. The TOFD file is however very noisy. This makes it interesting to see if the compression reduces the noise and thus has a positive effect on the signal.
Defect 8: This file also has one reportable indication but the TOFD file is much less noisy than that of defect 5.
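The reportable-limit rule mentioned for defect 4 can be written down as a small check. This is only a sketch in Python under stated assumptions: the peak of each scan is taken from the rectified (absolute) signal, the level is a hypothetical number, and the function names are not taken from any evaluation software.

```python
def peak_amplitude(ascan):
    """Peak of the rectified A-scan."""
    return max(abs(v) for v in ascan)


def is_reportable(peaks, level, run_length=3):
    """True if the peak exceeds `level` in `run_length` consecutive scans."""
    run = 0
    for peak in peaks:
        run = run + 1 if peak > level else 0
        if run >= run_length:
            return True
    return False


below_limit = is_reportable([1, 5, 6, 2, 7, 8], level=4)   # never three scans in a row
above_limit = is_reportable([1, 5, 6, 7, 2], level=4)      # three scans in a row
```

A compression method that raised a single peak over the level could in this way turn a non reportable indication into a reportable one, which is what the defect 4 file is meant to reveal.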
7.3 Lossless Compression Results

Table 7.1 shows the compression ratios when the 6 UT files are compressed with WinZip without any pre processing, i.e. in original format. It should be noted that this compression is lossless and the files do not lose any information. The letter m at the end of each file name means that the file is modified and only contains the defect area as mentioned in section 6.
File Name   Original Size (KBytes)   WinZipped Size (KBytes)   Compression Ratio
pe4m        10972                    4497                      1:2.4
to4m        6406                     3894                      1:1.6
pe5m        13855                    5490                      1:2.5
to5m        54414                    38432                     1:1.4
pe8m        17450                    7905                      1:2.2
to8m        103724                   69054                     1:1.5
Table 7.1
The compression ratios achieved with this straightforward method are not sufficient, but they give a good indication of the efficiency of lossless compression applied to ultrasonic files.
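The ratios in Table 7.1 follow directly from the two size columns. The short Python sketch below recomputes them, and the helper name is hypothetical.

```python
# (original size, WinZipped size) in KBytes, copied from Table 7.1
SIZES_KB = {
    "pe4m": (10972, 4497),
    "to4m": (6406, 3894),
    "pe5m": (13855, 5490),
    "to5m": (54414, 38432),
    "pe8m": (17450, 7905),
    "to8m": (103724, 69054),
}


def ratio_label(original, compressed):
    """Format a compression ratio as '1:x' with one decimal."""
    return f"1:{original / compressed:.1f}"


ratios = {name: ratio_label(*sizes) for name, sizes in SIZES_KB.items()}
```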
7.4 Lossy Compression Evaluation Results

This section gives an overview of the data presented in the appendices. The tables below show conclusions drawn from the visual evaluations and the mathematical calculations in the appendices. As mentioned in section 7.2.1 a visual evaluation is needed as a complement to the mathematical error calculations in order to determine if a compressed signal is good or bad. Based on the data from those two evaluation methods the Daubechies wavelet compression has proven to be better than the FFT compression. It preserves the signal shape better at higher compression ratios and gives the smallest mathematical error. The tables below therefore only show the results from compressions made with the Daubechies wavelet. The results from the FFT compressions are however presented in the appendices. The results show that TOFD files can be compressed more than the pulse-echo files studied in this report. This is due to the hardware compression, which damages the original sinusoid pulse shape and makes it more broadband. This means that more coefficients have to be saved to be able to reconstruct the original pulse. The pulse-echo files are on the other hand already compressed to 1/8 of the original size.
Defect 4 Daubechies compressed One non reportable indication, original signal not noisy % saved elements Pulse-echo TOFD 20 No visual change in A-scan 15 shape. No visual change in A-scan Minor change in A-scan shape. 5 shape. 1 Change in A-scan shape. Table 7.2 A-scan not too good.
From table 7.2 it is seen that saving 15% of the pulse-echo and TOFD samples can be done without losing any significant information. From the Daubechies curve in figure AP1.2 in appendix A1.1 it seems likely that saving only 8-10% of the TOFD samples would be enough to avoid significant changes in pulse shape, as the error only increases marginally.
Defect 5, Daubechies compressed
One reportable indication, original signal noisy
% saved elements   Pulse-echo                                   TOFD
20 and 15          No visual change in A-scan shape and ampl.   No visual change in A-scan shape.
5                  Small change in A-scan shape and ampl.       Minor change in A-scan shape, most noise gone.
1                  Change in A-scan shape.                      A-scan shape not good.
Table 7.3
From table 7.3 it is seen that saving 5% of the TOFD samples and 15% of the pulse-echo samples can be done without losing any significant information. From the Daubechies curve in figure AP2.1 in appendix A2.1 it seems likely that saving only 10% of the pulse-echo samples would be enough to avoid significant changes in pulse shape, as the error only increases marginally.
Defect 8, Daubechies compressed
One reportable indication, original signal not noisy
% saved elements   Pulse-echo                                   TOFD
20 and 15          No visual change in A-scan shape and ampl.   No visual change in A-scan shape.
5                  Small change in A-scan shape and ampl.       Minor change in A-scan shape.
1                  Change in A-scan shape and ampl.             A-scan shape not good.
Table 7.4
From table 7.4 it is seen that saving 15% of the TOFD and the pulse-echo samples can be done without losing any significant information. From the Daubechies curve in figure AP2.1 in appendix A2.1 it seems likely that saving only 5-10% of the pulse-echo and TOFD samples would be enough to avoid significant changes in pulse shape, as the error only increases marginally. All background data to these tables can be found in the appendices. Error calculations, error plots and pictures from the visual evaluations are listed defect wise. The Haar transform is only represented in the error calculation tables and in the error plots but not in any of the evaluation tables due to its poor visual compression results.
7.5 Reality Compression Tests

So far all compression calculations have been on a theoretical level. In this section a complete compression test is made with one of the files to see if the theoretical calculations are accurate. When a practical test is to be done a number of problems occur that do not exist on the theoretical level. The list below describes two of them.
1. The amplitude resolution is different between the TOFD and the pulse-echo files in the time domain. Each TOFD sample is 12 bits long and each pulse-echo sample is 8 bits long in the examined files. After transformation to the frequency domain the coefficients may be larger than 8 or 12 bits.
2. This problem follows from nr 1. When a UT file is transformed to the frequency domain the magnitude of the coefficients varies a lot. This means that some coefficients may need a 32 bit representation while the major part only needs 16 bits. The number of bits has to be defined when data is saved with Matlab, which implies that the file size gets unnecessarily big if it has to be saved with a 32 bit representation.
The problems described above could in the worst case make the compressed file even bigger than the original one, for example if a file with 8 bit data values is saved with a 32 bit representation. Fortunately WinZip removes most of the effect of this problem, but not all of it, which means that there is still room for some improvement when it comes to effective data storage. The file in this test is to4, i.e. the TOFD file from defect 4. The whole file is now compressed, as opposed to the modified versions previously used where the defect area was cut out. A TOFD file was chosen because it only contains data from one transducer whereas the pulse-echo files have data from two different transducers. The coefficient vector has been saved in two parts to overcome the storage problem described above (nr 2). Part one has ~7% of the coefficients and was saved with 32 bits, and part two with the remaining 93% was saved with 16 bits.
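The idea of saving the coefficient vector in two parts with different word lengths can be sketched as follows. This is an assumption-laden illustration in Python, not the Matlab procedure used in the test. It assumes that the coefficients have been rounded to integers, that the split is made at a fixed position in the vector, and it uses zlib (Deflate) as a stand-in for WinZip. The function names and the small header layout are hypothetical.

```python
import struct
import zlib


def pack_two_parts(coeffs, split):
    """Store coeffs[:split] as 32 bit and coeffs[split:] as 16 bit integers, then deflate."""
    head, tail = coeffs[:split], coeffs[split:]
    if any(not -32768 <= c <= 32767 for c in tail):
        raise ValueError("a coefficient in the second part does not fit in 16 bits")
    payload = struct.pack(
        f"<II{len(head)}i{len(tail)}h", len(head), len(tail), *head, *tail
    )
    return zlib.compress(payload, 9)


def unpack_two_parts(blob):
    """Inverse of pack_two_parts."""
    payload = zlib.decompress(blob)
    n_head, n_tail = struct.unpack_from("<II", payload, 0)
    return list(struct.unpack_from(f"<{n_head}i{n_tail}h", payload, 8))


# Two large coefficients followed by small and zeroed ones (made-up values).
coeffs = [100000, -70000, 12, 0, 0, -5, 0, 0, 0, 3]
blob = pack_two_parts(coeffs, split=2)
restored = unpack_two_parts(blob)
raw_size = len(zlib.decompress(blob))   # 8 byte header + 2*4 + 8*2 bytes
```

Before deflating, the two-part layout needs 32 bytes for these ten values, compared with 40 bytes if every value were stored with 32 bits. The long runs of zeros are then what the final lossless step compresses well.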
Three different compression ratios were WinZipped and compared with the original TOFD file. Table 7.5 shows the result.
to4, Size = 6406 KB
Pre processed compression ratio   WinZipped pre processed file size (KB)   WinZipped original file size (KB)   Reality compression ratio
1:6.7 (85% zeros)                 1181                                     3903                                1:5.4
1:10 (90% zeros)                  854                                      3903                                1:7.5
1:20 (95% zeros)                  486                                      3903                                1:13.2
Table 7.5
As can be seen the calculated compression ratio does not exactly match the compression achieved in reality when the file has been WinZipped. A possible explanation could be the storage problem mentioned before.
8 Conclusions and Future Work

The aim of this report is to investigate if it is possible to compress large ultrasonic data files, around 200 MB, enough to transfer them over a slow transmission line, e.g. a telephone modem, in a reasonable period of time. According to the results in section 7.4 a satisfactory quality of the compressed files is achieved at a compression ratio of 1:10 when the two dimensional Daubechies wavelet with 8 vanishing moments is used as a transform. This would reduce the data size to 1/10 of the original size. The 200 MB file above would then be compressed to 20 MB, giving a theoretical transfer time over a 56.6 kb/s modem of:
$$\frac{20\,000\,000 \cdot 8}{56\,600}\ \text{s} \approx 47\ \text{minutes} \tag{24}$$
as opposed to the original 7 hours and 51 minutes. Normally the files are smaller than that and the transfer time decreases accordingly. The results also show the importance of choosing the right transform depending on the signal shape. In this case the Daubechies wavelet matched the transmitting pulse best, which could be seen in both the visual and the mathematical plots and tables. A factor that has not been taken into account is the time the compression and decompression process takes, which can be substantial if the files are large. The compression/decompression time can probably be reduced if the algorithms are implemented in C instead of using the wavelet toolbox in Matlab. The practical test results in section 7.5 show that in order to get a final compression ratio of 1:10 a slightly higher pre processing compression ratio must be used. This result is however based on the way the file was saved with the 32 and 16 bit word lengths. This file splitting was far from optimal and a more effective method would probably result in a better final compression ratio. Interesting work for the future would be to use wavelet packets together with the best basis concept and see what the improvements in compression ratio and signal quality are like.
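The transfer times quoted around equation (24) can be checked with a few lines of Python. The sketch assumes, as the equation does, that 1 MB is 10^6 bytes and that the modem delivers 56 600 bits per second without protocol overhead. The function names are hypothetical.

```python
def transfer_time_seconds(size_bytes, bits_per_second):
    """Ideal transfer time, ignoring protocol overhead."""
    return size_bytes * 8 / bits_per_second


def as_hours_minutes(seconds):
    """Round to whole minutes and split into (hours, minutes)."""
    return divmod(round(seconds / 60), 60)


MODEM_BPS = 56_600
compressed = as_hours_minutes(transfer_time_seconds(20_000_000, MODEM_BPS))
original = as_hours_minutes(transfer_time_seconds(200_000_000, MODEM_BPS))
```

The two results correspond to the 47 minutes of equation (24) and to the 7 hours and 51 minutes stated for the uncompressed 200 MB file.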
9 References
[Proakis, Salehi 94] John G. Proakis, Masoud Salehi, Communication Systems Engineering, Prentice Hall Int. Editions 1994.
[Proakis, Manolakis 96] John G. Proakis, Dimitris G. Manolakis, Digital Signal Processing, Prentice Hall Int. Editions 1996.
[Petersson 97] Jan Petersson, Fourieranalys, Rex Offsettryck 1997.
[Salomon 97] David Salomon, Data Compression, Springer-Verlag 1998.
[Valens 99] Clemens Valens, http://perso.wanadoo.fr/polyvalens/clemens/clemens.html
[Jensen, la Cour-Harbo 01] Arne Jensen, Anders la Cour-Harbo, Ripples in Mathematics, Springer-Verlag 2001.
Appendices
Appendix A1.1 Mathematical Error Calculation Results Defect 4
Error
% saved elements   Pulse-echo FFT   Pulse-echo Haar   Pulse-echo Db8   TOFD FFT   TOFD Haar   TOFD Db8
20                 0.317940         0.233810          0.175670         0.004870   0.017057    0.001660
15                 0.406600         0.301030          0.248250         0.010118   0.026142    0.004193
5                  0.687840         0.622270          0.568250         0.086005   0.069284    0.025588
1                  0.811960         1.000000          0.881240         0.152450   0.220020    0.091936
Table AT.1: Compression error calculations
Appendix A1.2 Visual Pulse-echo Evaluation Results Defect 4
FFT
File name          pe4m       pe4_2d_fft_3   pe4_2d_fft_1   pe4_2d_fft_2   pe4_2d_fft_4
% saved elem.      Original   20             15             5              1
Max ampl. dB       -3,5       -3,4           -3,3           -4,4           -
Max scan deg       84,2       84,6           84,6           84,6           -
Max index mm       149        149            149            149            -
Soundpath mm/2     14,7       13,9           13,9           13,9           -
Scan 1 6dB deg     79         79             83,4           83,4           -
Scan 2 6dB deg     87,8       86,2           86,2           86,6           -
Index 1 -6dB mm    149        149            149            149            -
Index 2 -6dB mm    161        161            149            153            -
Scan length deg.   8,8        7,2            2,8            3,2            -
Length index mm    12         12             0              4              -
Comments: pe4m: Uncompressed file. pe4_2d_fft_3: Distorted A-scan. pe4_2d_fft_1: Distorted A-scan. pe4_2d_fft_2: Distorted A-scan. pe4_2d_fft_4: Signal amplitude too low.

Db8
File name          pe4_2d_db8_3   pe4_2d_db8_2   pe4_2d_db8_1   pe4_2d_db8_4
% saved elem.      20             15             5              1
Max ampl. dB       -3,5           -3,5           -3,5           -3,1
Max scan deg       84,2           84,2           84,6           84,6
Max index mm       149            149            149            149
Soundpath mm/2     14,7           14,7           13,9           13,9
Scan 1 6dB deg     83             83             79             83
Scan 2 6dB deg     86,2           86,2           87,4           85,8
Index 1 -6dB mm    149            149            149            149
Index 2 -6dB mm    149            149            161            149
Scan length deg.   3,2            3,2            8,4            2,8
Length index mm    0              0              12             0
Appendix A1.3 Pulse-echo Plots Defect 4
Plots, in order: pe4m (original), pe4_2d_fft_3 (20% saved elements), pe4_2d_fft_1 (15% saved elements), pe4_2d_fft_2 (5% saved elements), pe4_2d_fft_4 (1% saved elements), pe4_2d_db8_3 (20% saved elements), pe4_2d_db8_2 (15% saved elements), pe4_2d_db8_1 (5% saved elements), pe4_2d_db8_4 (1% saved elements).
Appendix A1.4 Visual TOFD Evaluation Results
Defect 4
FFT
File name            to4m       to4_2d_fft_1   to4_2d_fft_2   to4_2d_fft_4   to4_2d_fft_3
% saved elements     Original   20             15             5              1
Max ampl. dB         24,4       24,8           24,9           21,7           17,6
Lath                 7,894      7,894          7,894          7,894          7,894
Tip 1                9,877      9,877          9,877          9,877          9,877
time                 1,983      1,983          1,983          1,983          1,983
Depth (mm)           12,63      12,63          12,63          12,63          12,63
Num. of scan lines   11         11             11             11             11
Comment: Signal shape not too good.

Db8
File name            to4_2d_db8_3   to4_2d_db8_1   to4_2d_db8_2   to4_2d_db8_4
% saved elements     20             15             5              1
Max ampl. dB         24,1           24,7           24,5           25,0
Lath                 7,894          7,894          7,894          7,894
Tip 1                9,877          9,877          9,877          9,877
time                 1,983          1,983          1,983          1,983
Depth (mm)           12,63          12,63          12,63          12,63
Num. of scan lines   11             11             11             11
Comment: Signal shape not too good.
Appendix A1.5 TOFD Plots. Defect 4
to4_2d_fft_1 20% saved elements
to4_2d_db8_3 20% saved elements
Appendix A2.1 Mathematical Error Calculation Results
Defect 5
Error
% saved elements   Pulse-echo FFT   Pulse-echo Haar   Pulse-echo Db8   TOFD FFT   TOFD Haar   TOFD Db8
20                 0.198140         0.203560          0.167630         0.040112   0.059895    0.020100
15                 0.287320         0.256490          0.249140         0.089521   0.089051    0.036558
5                  0.627750         0.549540          0.438320         0.343180   0.227780    0.147360
1                  0.745690         0.817920          0.821220         0.580810   0.561480    0.374410
Table AT2: Compression Error Calculations.
Appendix A2.2 Visual Pulse-echo Evaluation Results Defect 5
FFT
File name          pe5m       pe5_2d_fft_3   pe5_2d_fft_2   pe5_2d_fft_1   pe5_2d_fft_4
% saved elem.      Original   20             15             5              1
Max ampl. dB       -2,6       -2             -2             -3,4           -
Max scan deg       169,2      169,2          169,2          171,2          -
Max index mm       27         27             27             31             -
Soundpath mm/2     25,6       25,6           25,6           29,9           -
Scan 1 6dB deg     165,7      165,7          164,2          163,7          -
Scan 2 6dB deg     173,7      173,7          178,7          182,8          -
Index 1 -6dB mm    21         21             18             18             -
Index 2 -6dB mm    47         47             49             49             -
Scan length deg.   8          8              14,5           19,1           -
Length index mm    26         26             31             31             -
Comments: pe5m: Uncompressed file. pe5_2d_fft_4: No detectable indication.

Db8
File name          pe5_2d_db8_2   pe5_2d_db8_1   pe5_2d_db8_4   pe5_2d_db8_3
% saved elem.      20             15             5              1
Max ampl. dB       -2,6           -2,2           -1,4           -1,9
Max scan deg       169,2          169,2          169,2          169,2
Max index mm       27             27             27             31
Soundpath mm/2     25,6           25,6           25,6           26,4
Scan 1 6dB deg     165,7          167,2          167,2          167,2
Scan 2 6dB deg     173,7          174,2          173,7          172,7
Index 1 -6dB mm    18             23             23             23
Index 2 -6dB mm    49             47             39             39
Scan length deg.   8              7              6,5            5,5
Length index mm    31             24             16             16
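Appendix A2.3 Pulse-echo Plots Defect 5
Plots, in order: pe5m (original), pe5_2d_fft_3 (20% saved elements), pe5_2d_fft_2 (15% saved elements), pe5_2d_fft_1 (5% saved elements), pe5_2d_fft_4 (1% saved elements), pe5_2d_db8_2 (20% saved elements), pe5_2d_db8_1 (15% saved elements), pe5_2d_db8_4 (5% saved elements), pe5_2d_db8_3 (1% saved elements).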
Appendix A2.4 Visual TOFD Evaluation Results Defect 5
FFT
File name            to5m       to5_2d_fft_3   to5_2d_fft_1   to5_2d_fft_2   to5_2d_fft_4
% saved elements     Original   20             15             5              1
Max ampl. dB         26,4       17,6           24,8           24,9           21,7
Lath                 13,694     13,694         13,694         13,710         13,710
Tip 1                15,360     15,344         15,360         15,360         15,377
time                 1,666      1,650          1,666          1,650          1,667
Depth (mm)           14,90      14,81          14,90          14,81          14,90
Num. of scan lines   11         11             11             11             11
Comments: to5m: Noisy signal. to5_2d_fft_3: Noisy signal. to5_2d_fft_1: Noisy signal. to5_2d_fft_2: Noisy signal. to5_2d_fft_4: Noisy signal, signal shape not ok.

Db8
File name            to5_2d_db8_1   to5_2d_db8_2   to5_2d_db8_3   to5_2d_db8_4
% saved elements     20             15             5              1
Max ampl. dB         24,7           24,5           24,1           25,0
Lath                 13,694         13,710         13,694         13,694
Tip 1                15,344         15,344         15,360         15,344
time                 1,650          1,634          1,666          1,650
Depth (mm)           14,81          14,73          14,90          14,81
Num. of scan lines   11             11             11             11
Comments: to5_2d_db8_1: Noisy signal. to5_2d_db8_2: Noisy signal. to5_2d_db8_3: Most noise gone. to5_2d_db8_4: Signal shape not ok.
Appendix A3.1 Mathematical Error Calculation Results
Defect 8
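Error, by % saved elements:

| Technique | Transform | 20 | 15 | 5 | 1 |
|---|---|---|---|---|---|
| Pulse-echo | FFT | 0.143830 | 0.204240 | 0.435510 | 0.689470 |
| Pulse-echo | Haar | 0.091941 | 0.144500 | 0.440680 | 0.795590 |
| Pulse-echo | Db8 | 0.069915 | 0.096112 | 0.272820 | 0.865840 |
| TOFD | FFT | 0.005039 | 0.021851 | 0.206570 | 0.383470 |
| TOFD | Haar | 0.007032 | 0.012428 | 0.051234 | 0.117070 |
| TOFD | Db8 | 0.000722 | 0.001901 | 0.015114 | 0.069977 |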
Table AT.3: Compression error calculations.
FFT

| | Original | 20 % | 15 % | 5 % | 1 % |
|---|---|---|---|---|---|
| File name | pe8m | pe8_2d_fft_2 | pe8_2d_fft_1 | pe8_2d_fft_4 | pe8_2d_fft_3 |
| Max ampl. (dB) | -1,1 | -1,1 | -1,1 | -1,6 | -3,2 |
| Max scan (deg) | 295,9 | 295,9 | 295,9 | 295,9 | 296,2 |
| Max index (mm) | 140 | 140 | 140 | 140 | 140 |
| Soundpath (mm/2) | 26,3 | 25,5 | 25,5 | 25,5 | 25,5 |
| Scan 1 6dB (deg) | 294,1 | 293,8 | 294,1 | 294,1 | 294,1 |
| Scan 2 6dB (deg) | 298,9 | 298,9 | 298,9 | 298,9 | 298,9 |
| Index 1 -6dB (mm) | 120 | 120 | 120 | 120 | 120 |
| Index 2 -6dB (mm) | 140 | 140 | 140 | 140 | 140 |
| Scan length (deg) | 4,8 | 5,1 | 4,8 | 4,8 | 4,8 |
| Length index (mm) | 20 | 20 | 20 | 20 | 20 |
| Comment | Uncompressed file | | | | |
Db8

| | 20 % | 15 % | 5 % | 1 % |
|---|---|---|---|---|
| File name | pe8_2d_db8_2 | pe8_2d_db8_1 | pe8_2d_db8_4 | pe8_2d_db8_3 |
| Max ampl. (dB) | -1,4 | -1,4 | -1,4 | -1,3 |
| Max scan (deg) | 295,9 | 295,9 | 295,9 | 295,6 |
| Max index (mm) | 140 | 140 | 140 | 140 |
| Soundpath (mm/2) | 26,3 | 26,3 | 26,3 | 27,1 |
| Scan 1 6dB (deg) | 293,8 | 293,8 | 294,1 | 294,1 |
| Scan 2 6dB (deg) | 298,9 | 298,9 | 298,9 | 298,9 |
| Index 1 -6dB (mm) | 120 | 120 | 120 | 120 |
| Index 2 -6dB (mm) | 140 | 140 | 140 | 140 |
| Scan length (deg) | 5,1 | 5,1 | 4,8 | 4,8 |
| Length index (mm) | 20 | 20 | 20 | 20 |
| Comment | | | | |
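[Pulse-echo plots, Defect 8: pe8m (original); pe8_2d_fft_2 (20% saved elements); pe8_2d_fft_1 (15% saved elements); pe8_2d_fft_4 (5% saved elements); pe8_2d_fft_3 (1% saved elements); pe8_2d_db8_2 (20% saved elements); pe8_2d_db8_1 (15% saved elements); pe8_2d_db8_4 (5% saved elements); pe8_2d_db8_3 (1% saved elements).]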
Defect 8

FFT

| | Original | 20 % | 15 % | 5 % | 1 % |
|---|---|---|---|---|---|
| File name | to8m | to8_2d_fft_3 | to8_2d_fft_1 | to8_2d_fft_2 | to8_2d_fft_4 |
| Max ampl. (dB) | 24,4 | 24,6 | 25,6 | 17,8 | |
| Lateral wave (Lath) | 7,277 | 7,277 | 7,277 | 7,277 | |
| Tip 1 | 8,444 | 8,444 | 8,444 | 8,460 | |
| Time | 1,167 | 1,167 | 1,167 | 1,183 | |
| Depth (mm) | 8,29 | 8,29 | 8,29 | 8,35 | |
| Num. of scan lines | 18 | 18 | 18 | 18 | |
| Noise comment | Some noise before lateral wave | Some noise before lateral wave | Signal noisier | Signal noisier | No measurable signal |
Db8

| | 20 % | 15 % | 5 % | 1 % |
|---|---|---|---|---|
| File name | to8_2d_db8_3 | to8_2d_db8_1 | to8_2d_db8_2 | to8_2d_db8_4 |
| Max ampl. (dB) | 24,9 | 24,4 | 20,7 | 12,5 |
| Lateral wave (Lath) | 7,277 | 7,277 | 7,244 | 7,277 |
| Tip 1 | 8,444 | 8,444 | 8,427 | 8,427 |
| Time | 1,167 | 1,167 | 1,183 | 1,150 |
| Depth (mm) | 8,29 | 8,29 | 8,35 | 8,21 |
| Num. of scan lines | 18 | 18 | 18 | 18 |
| Noise comment | No noise | No noise | No noise | Bad signal shape |
