Use of Influence Diagrams and Neural Networks in Modeling Semiconductor Manufacturing Processes

IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING, VOL. 4, NO. 1, FEBRUARY 1991
Fariborz Nadi, Alice M. Agogino, and David A. Hodges, Fellow, IEEE
Abstract-An adaptive learning architecture has been developed for modeling manufacturing processes involving several controlling variables. This paper describes the application of the new architecture to process modeling and recipe synthesis for deposition rate, stress, and film thickness in low pressure chemical vapor deposition (LPCVD) of undoped polysilicon. In this architecture the model for a process is generated by combining the qualitative knowledge of human experts, captured in the form of influence diagrams, and the learning abilities of neural networks for extracting the quantitative knowledge that relates parameters of a process. To evaluate the merits of this methodology, we have compared the accuracy of these new models to that of more conventional models generated by the use of first principles and/or statistical regression analysis. Accuracy of the different models is compared using the same empirical data sets from realistic experiments. The models generated by the integration of influence diagrams and neural networks are shown to have half the error or less, even though given only half as much information in creating the models. Furthermore, it is shown that by employing the generalization ability of neural networks in the synthesis algorithm new recipes can be produced for the process. Two such recipes are generated for the LPCVD process. One is a zero-stress polysilicon film recipe; the second, a uniform deposition rate recipe which is based on use of a non-uniform temperature distribution during deposition.
Manuscript received December 19, 1989; revised March 6, 1990. F. Nadi is with the Department of Mechanical Engineering, Sharif University of Technology, Tehran, Iran. A. M. Agogino is with the Department of Mechanical Engineering, University of California at Berkeley, Berkeley, CA 94720. IEEE Log Number 9040872.
INTRODUCTION
ACCURATE models of manufacturing processes are needed for automating the process of recipe generation (setting of the process parameters). Creation of accurate models requires the use of architectures that possess learning abilities so that they can capture the complex and dynamic behavior of a manufacturing process. Nadi's work [1] has introduced an adaptive learning architecture that requires minimal human knowledge in modeling and synthesizing behavior of a process. In this approach the model for a process is created by observing and thereby learning the behavior of the process over time, and continually adapting to its changes. Once a model is created, the architecture employs a stochastic optimization routine to synthesize the model. The synthesis algorithm is very flexible, and allows for arbitrary extraction of information from a model.
This architecture captures both qualitative and quantitative aspects of the knowledge of the manufacturing process. The qualitative knowledge (the knowledge of the human expert) is captured by making use of the relational level of influence diagrams [2], [3]. Two critical tasks are performed at this level, both of which can greatly reduce the complexity and the amount of data required for creation of a model. First, important process parameters are identified. Each parameter is then represented by a single input or output node. Second, directed, acyclic arcs are established between some of the nodes. Each arc represents existence of a probable relationship between two connected nodes. The exact nature of the relationships is not needed at this level, although such knowledge can be incorporated into the model. These diagrams are an abstraction of the process, and they can be used to break down a single, multi-input multi-output process into several smaller, independent, multi-input single-output subprocesses. Influence diagrams can be created with the help of human experts, or they can be induced by observing sampled behavior of a process over time [4]. Unlike some other tools for knowledge acquisition and transfer, influence diagrams at the relational level are easily created by human experts.
Once a process is divided into several subprocesses, the unknown quantitative relationships among groups of related parameters in each of the subprocesses are extracted using the learning abilities of neural networks. These networks have proven very effective in complex learning tasks [5]. Neural networks crudely resemble the architecture of the brain. In these networks many simple processors are interconnected and the knowledge is stored, in a distributed manner, in the weights of the connections between processors [6]. The speed and accuracy of neural networks in complex pattern recognition tasks result from several features. Information is acquired and stored in a distributed manner. Given an appropriate parallel computing configuration, computation can proceed in a highly parallel way. Finally, each node processes the adaptively weighted sum of its inputs with a nonlinear function. This last feature gives neural networks an important degree of freedom not usually provided in other modeling techniques such as statistical regression. Useful neural networks can be implemented in software in a general purpose computing environment (the approach we used), or with special-purpose VLSI hardware [7].
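The node computation described above (a nonlinear function applied to an adaptively weighted sum of inputs) can be sketched as a forward pass through a small three-layer network. This is only an illustration: the sigmoid nonlinearity, the random weights, the layer sizes, and the names `sigmoid` and `forward` are assumptions and are not taken from the paper.

```python
import math
import random

def sigmoid(z):
    """Assumed nonlinearity; the text only says 'a nonlinear function'."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights):
    """Each row holds one node's input weights followed by its bias."""
    return [sigmoid(sum(w * v for w, v in zip(row[:-1], inputs)) + row[-1])
            for row in weights]

def forward(inputs, w_hidden, w_out):
    """Input layer -> hidden layer -> output layer."""
    return layer(layer(inputs, w_hidden), w_out)

# Illustrative sizes only: 4 inputs, 5 hidden nodes, 1 output.
rng = random.Random(0)
n_in, n_hidden, n_out = 4, 5, 1
w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
w_out = [[rng.uniform(-1, 1) for _ in range(n_hidden + 1)] for _ in range(n_out)]

y = forward([0.2, 0.5, 0.1, 0.9], w_hidden, w_out)
print(y)
```

In a trained network the weights would come from error backpropagation rather than from a random generator; the sketch shows only how stored weights turn inputs into an output.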
We have employed the above architecture to model and synthesize a complex semiconductor manufacturing process, LPCVD of polysilicon. The model relates deposition rate, stress, and film thickness to process time, process temperature, silane flow rate, wafer position during deposition, and ambient pressure. To demonstrate the usefulness of this architecture, we have compared the predictive capability of the models it creates to the corresponding capabilities of models based on first principles and/or statistical regression analysis, using exactly the same data. In our study, the model generated by the combination of influence diagrams and neural networks is, on the average, more than three times as accurate in predicting deposition rate, and more than twice as accurate in predicting stress (as measured by the average regression error), while given half as much information in creating the model. We have also used this architecture to synthesize process recipes for two specific practical objectives. The new process recipes are created by generalizing the information that is used in training the networks. The first recipe is for a zero stress polysilicon film; the second, for a uniform deposition rate over a production lot in the presence of significant source gas depletion.
OVERVIEW OF THE ARCHITECTURE
In order to solve a complicated problem, such as modeling a complex manufacturing process, a standard approach is to break down the problem into several smaller independent problems, solve them separately, and integrate them back together. This is the approach chosen by Nadi [1] for modeling complex manufacturing processes. Influence diagrams, by capturing the qualitative knowledge of the human experts, identify important process parameters and decompose the problem of modeling a multi-input multi-output process into modeling several smaller, independent, multi-input single-output subprocesses. This is done by establishing conditional independence among some of the process parameters. Decomposing a process into several subprocesses gives us the ability to gradually build a more complex model in a modular fashion. The quantitative relationships that relate the parameters in each subprocess are learned using feed-forward error-backpropagation (FFEBP) networks. Each FFEBP network is responsible for learning the local and causal relationships that exist among the parameters of a subprocess. Next, in order to learn how the subprocesses interact, or in other words, to re-integrate the subprocesses into a complete process, the common and optimal behavioral space of all of the subprocesses is learned by a single associative memory (AM) network which simultaneously looks at all the parameters in a process. The AM network, by being trained on the set of choices that an expert makes in order to get an overall optimum process output, extracts the quantitative and/or qualitative optimality criteria that the expert uses for generating a process recipe. Therefore, the AM network can be used to provide good initial estimates for the unknown parameters in the general synthesis of a model. The AM network also uses the error-backpropagation learning procedure. This constitutes the modeling or knowledge acquisition phase.
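The decomposition step can be illustrated with a small sketch that stores a relational-level influence diagram as a list of directed arcs and groups them by child node, so that each child together with its parents forms one multi-input single-output subprocess. The arc list below is an assumption chosen for illustration using the LPCVD parameter names from the introduction; it is not a transcription of the paper's diagram, and `decompose` is a hypothetical helper.

```python
# Hypothetical arcs (parent, child) of a relational-level influence diagram.
ARCS = [
    ("pressure", "deposition_rate"),
    ("temperature", "deposition_rate"),
    ("flow_rate", "deposition_rate"),
    ("position", "deposition_rate"),
    ("pressure", "stress"),
    ("temperature", "stress"),
    ("flow_rate", "stress"),
    ("position", "stress"),
    ("time", "stress"),
    ("deposition_rate", "thickness"),
    ("time", "thickness"),
]

def decompose(arcs):
    """Group arcs by child: each child and its parents is one
    multi-input single-output subprocess."""
    parents = {}
    for parent, child in arcs:
        parents.setdefault(child, []).append(parent)
    return parents

subprocesses = decompose(ARCS)
for output, inputs in subprocesses.items():
    print(output, "<-", ", ".join(inputs))
```

Each entry of `subprocesses` would then be handed to its own learner, or to a known closed-form relationship where one exists.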
For the synthesis or knowledge extraction phase, both types of networks are synthesized by employing a stochastic optimization routine called ALOPEX [8] that is very similar to simulated annealing [9]. The synthesis algorithm is very flexible and allows for arbitrary extraction of knowledge from single neural nets or groups of networks simultaneously. Synthesis is the situation in which, given some known parameters for a process, the networks will produce values of the unknown parameters according to the optimal behavior stored in the AM network, and the allowed range of behavior (causal relationships or local information) stored in the FFEBP networks. Recipe generation involves synthesizing all of the networks in the following manner. First, the AM network, constrained by the optimal behavior of the process and the value of the known parameters, is synthesized in order to generate good initial estimates for the unknown parameters. Next, all of the FFEBP networks are simultaneously synthesized to fine tune the initial estimates to the behavior of the process. One abstraction of the synthesis process is that the AM network provides the ideal solutions, and the FFEBP networks provide a match between the ideal solutions and the real behavior of the process. The schematic of the synthesis process is shown in Fig. 1.
Fig. 1. Schematic of synthesis process.
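The idea of synthesis, searching for unknown inputs that make a trained model produce a requested output, can be sketched with a very simple stochastic search. The code below is not ALOPEX [8] and not simulated annealing [9]: it is a plain accept-if-better random perturbation search swapped in for illustration. The function `toy_model`, the starting values, the bounds, and the target are all hypothetical stand-ins for a trained network and its operating ranges.

```python
import random

def toy_model(temperature, flow_rate):
    """Hypothetical stand-in for a trained network with two unknown inputs."""
    return 0.5 * (temperature - 600.0) + 0.2 * flow_rate

def synthesize(model, target, start, bounds, steps=5000, seed=0):
    """Perturb the unknown inputs at random and keep a candidate only
    when it brings the model output closer to the target."""
    rng = random.Random(seed)
    best = list(start)
    best_cost = (model(*best) - target) ** 2
    for _ in range(steps):
        candidate = [
            min(hi, max(lo, v + rng.gauss(0.0, 0.02 * (hi - lo))))
            for v, (lo, hi) in zip(best, bounds)
        ]
        cost = (model(*candidate) - target) ** 2
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost

start = [621.0, 177.0]                      # illustrative initial estimates
bounds = [(605.0, 650.0), (100.0, 250.0)]   # illustrative allowed ranges
target = 40.0                               # illustrative requested output
recipe, cost = synthesize(toy_model, target, start, bounds)
print(recipe, cost)
```

In the architecture described above, the starting point would come from the AM network and the model being inverted would be the set of FFEBP networks; the sketch only shows the search loop.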

Modeling Using Integrated Networks
In the modeling phase, an integrated network of influence diagram(s) and neural networks is created for a specific process. The integrated network is then trained on a subset of the experimental data that spans the parameter ranges of interest, and is tested on the remaining data. We then compared the performance of the integrated network in modeling LPCVD of polysilicon to the performance of other models, which are generated by the use of first principles and/or statistical regression analysis, employing all of the same experimental data. The influence diagram and the integrated network for modeling LPCVD of polysilicon films are shown in Figs. 2 and 3 respectively. The model consists of five input parameters (ambient pressure, process temperature, process time, flow rate of silane gas, and position of the silicon wafer in the processing tube), and three output parameters (deposition rate, film thickness, and stress). There are only two FFEBP networks used in the model, because there is an exact relationship between deposition rate, deposition time, and the resultant film thickness. The AM network does not cover the film thickness variable, for the same reason. The data set was provided by Kuang-Kuo Lin and Costas Spanos of the Electronics Research Laboratory at the University of California at Berkeley. The data set was the result of an optimally-designed experimental sequence. The ranges of process parameters are shown next:

Variable                    Range           Type
pressure (mtorr)            300-550         continuous
temperature (°C)            605-650         continuous
flow rate (sccm)            100-250         continuous
time (min)                  60-150          continuous
position (cm)               3.6-28.8        continuous
thickness (Å)               9100-18700      continuous
stress (10^9 dyne/cm^2)     -5.9 to 7.6     continuous
dep. rate (Å/min)           90-300          continuous
There were 12 experiments performed, with six wafers (within a lot of 50 wafers) positioned along the tube for each experiment. The values of input parameters in the experiments were chosen in such a way as to cover the desired range of the process behavior with a minimum number of experiments. The use of first principle models for such purposes is of great importance. The reader can refer to [10] for parameter selection criteria. The film thickness was measured for each sample wafer, resulting in 72 data points. Stress data could not be obtained for all wafers, so there are only 54 data points for stress. For deposition rate modeling, the data set was broken into two parts: a training set and a testing set. The data gathered for every other wafer in the tube, for all 12 experiments, were used for the training set, and the rest for the testing set, resulting in 36 data points for the training set and 36 data points for the testing set. For the stress modeling, however, 34 data points were used for the training set and 20 data points were used for the testing set. This was due to the lack of stress data for all the wafers. Conventional techniques were used to configure the networks. Three layers were used for each. The structure of the networks, determined by the structural algorithm (an iterative algorithm) [1], for each of the three networks is shown next:

FFEBP network for deposition rate    4-5-1
FFEBP network for stress             5-5-1
AM network                           7-7-7

In the notation used above, the first number represents the number of nodes in the first or input layer, the second number represents the number of hidden layer nodes, and the last number represents the number of nodes in the output layer. The optimality criterion used for the selection of the training set of the AM network was arbitrarily chosen to be data points with ambient pressures equal to or less than 520 mtorr. There are 27 such data points.
Fig. 2. Influence diagram for LPCVD of undoped polysilicon.
Fig. 3. Integrated network for LPCVD of undoped polysilicon.
Comparing Performance of Neural Networks to Other Models
To answer the question "Why use neural networks and not any other statistical and/or regression techniques?", the accuracy of the neural networks in creating the models for deposition rate and stress is compared to the accuracy of models generated in [10] by means of statistical techniques. The models in [10] were developed using the same experimental data. These models are presented in Appendices I and II. The model for deposition rate was developed by combining statistical regression analysis and knowledge about the physics of the process. The model for stress was developed purely based on statistical regression analysis. This is due to the lack of a first principle model for stress. A point to be mentioned is that the neural networks were trained on only part of the data, whereas the models in [10] used all the data points in creating the process models.
In order to visually compare the accuracy of different models in predicting deposition rate and stress of the polysilicon film, the values predicted by the neural networks and the regression models [10] are plotted versus the actual data points for the different experimental conditions of their corresponding testing sets. Figs. 4 and 5 are the results of this test for deposition rate and stress respectively. The amount of scatter around the 45 degree line is a measure of error in predicting the empirical data. One can see that neural networks, given the disadvantage of smaller training sets, have generated more accurate models of the process. Comparing Figs. 4 and 5 it can be seen that the values of stress are not predicted as accurately as the values of deposition rate. Although many different FFEBP network configurations were tried in attempts to achieve better learning, most of the networks produced the same amount of total error in learning the behavior of stress. This could be due to one or both of the following reasons.
1) Absence of an important variable, affecting the stress, in the influence diagram.
2) Presence of excessive noise in measuring the values of the stress.
Notice that the stress model in [10] does not include silane flow rate and position of the wafer as part of the parameters affecting stress. These parameters were found to be statistically insignificant in determining the amount of stress [10]. However, in order to test the ability of neural networks in filtering statistically insignificant parameters, the model generated by the neural network includes all five of the input parameters (Fig. 3). Looking at Fig. 5, it is evident that the neural network generated model is still more accurate in predicting the amount of stress. A neural network with only three input parameters, the same ones as the stress model in [10] (temperature, pressure and time), was also constructed, and it produced similar results (in terms of average regression error) to the one in Fig. 5. The quantitative comparison of the performance of the different models, in terms of maximum and average error in predicting the experimental values, is given in Table I. The important conclusions from this experiment are as follows.
1) Although the neural networks were given half as much information in creating the models, they were able to create a more accurate model of the process (refer to Table I).
2) The inclusion of uncertain influences in modeling a process does not degrade the performance of neural networks in creating an accurate model. Although silane flow rate and position of the wafer were found to be statistically insignificant in modeling the behavior of the stress [10],
their inclusion in the model did not degrade its performance.
Two points must be mentioned here. First, the conclusions here are based on the 12 experimental runs. Comparison of the two modeling techniques should also be made with a larger set of data. Second, although we have shown that in this experiment neural networks were capable of generating more accurate models for the process, they are not meant to replace the statistical models. On the contrary, the statistical models are needed in creating bounds on the amount of variability of process parameters and in giving an understanding of the degree of significance of each of the process parameters, features which are of great importance in terms of planning exploratory experiments, or setting up a process initially.
Fig. 4. Experimental deposition rates compared with predicted rates for both regression model and the network model. (Axes: experimental versus predicted deposition rate, in angstroms/minute.)
Fig. 5. Measured film stress compared with predicted stress for both regression model and network model. (Axes: measured versus predicted stress in film, in 10^9 dyne/cm^2.)

TABLE I
COMPARISON OF THE ACCURACY OF TWO METHODS IN PREDICTING STRESS AND DEPOSITION RATE

             Integrated Networks      Regression Models
Stress       max. error = 21.2%       max. error = 131.1%
             avg. error = 13.2%       avg. error = 28.2%
Dep. rate    max. error = 11.0%       max. error = 17.0%
             avg. error = 1.4%        avg. error = 5.1%

Fig. 6. Film deposition rate versus sample position for operating at uniform temperature. (Axes: location in tube, cm, versus deposition rate, angstroms/minute.)
SYNTHESIS
In this part we have done two experiments, both of which test the generalization abilities of the integrated network to produce novel ideas.
Generating a Recipe for a Zero Stress Polysilicon Film
In this experiment the integrated network is supplied with information about the desired film thickness, stress, and some of the input conditions such as process time and ambient pressure. The integrated network is then asked to produce values for process temperature and silane flow rate that would result in the requested outputs. The position of wafer #3 was chosen for the position variable because it is in the middle of the processing tube and, therefore, would yield a better approximation for the rest of the wafers. The novel point about this experiment is that the required stress value is zero, and the FFEBP network for the stress has not seen any such values for stress in its training set. All of the values of the stress in the training set were either compressive or tensile. Therefore, the integrated network is required to generalize its knowledge in order to solve the problem. Table II shows the result of the experiment. The input row is the given information. In the first output row the value of the deposition rate is calculated by using the exact relationship between time, thickness, and deposition rate. The second output shows the initial estimates for the temperature and silane flow rate produced by the AM network. The last output shows the final values produced by synthesizing the two FFEBP networks simultaneously. For the purpose of comparison, the resultant values were plugged into the models developed in [10] using the same experimental data (Appendices I and II). The result of the comparison is presented in Table III. It is evident that the results are fairly consistent with each other.
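The exact relationship used for the first output row can be written out as a one-line calculation, assuming it is simply thickness divided by deposition time. The function name `required_deposition_rate` is hypothetical; the numbers are the requested thickness (10000 Å) and process time (70 min) of this experiment.

```python
def required_deposition_rate(thickness_angstrom, time_min):
    """Average rate (angstroms/min) needed to reach a thickness in a given time."""
    return thickness_angstrom / time_min

rate = required_deposition_rate(10000.0, 70.0)
# About 142.86 angstroms/min; the paper's table lists this value as 142.8.
print(rate)
```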
TABLE II
RESULT OF THE ZERO STRESS SYNTHESIS

              Pressure   Temperature   Flow Rate   Time   Location   DepRate   Stress   Thick.
input         450        ?             ?           70     13.2       ?         0        10000
1st output    450        ?             ?           70     13.2       142.8     0        10000
2nd output    450        621           177         70     13.2       142.8     0        10000
3rd output    450        615           190         70     13.2       142.8     0        10000

TABLE III
COMPARISON OF RESULTS FOR THE ZERO STRESS SYNTHESIS

                                    Integrated Networks   Regression Models
Stress (x 10^9 dyne/cm^2)           0.0                   0.89
Deposition rate (angstroms/min)     142.8                 151.9

Generating a Non-Uniform Temperature Distribution to Obtain a Uniform Deposition Rate
In this experiment only the deposition rate FFEBP network is used to calculate a non-uniform temperature distribution that results in a uniform deposition rate along the processing tube. Fig. 6 shows a typical deposition rate versus wafer position for constant temperature across the tube. The drop in the deposition rate from the inlet to the outlet is because of the depletion of the active gas (SiH4): as it passes over the wafers, its concentration is gradually decreased. To compensate for the depletion effect, a non-uniform temperature distribution along the processing tube (lower at the inlet and higher at the outlet) would result in a uniform deposition rate across the tube. This is due to the fact that the deposition rate is an exponential function of the temperature. To produce the temperature distribution, the deposition rate FFEBP network was synthesized once for each of the six wafer positions, specifying the deposition rate, silane flow rate, and ambient pressure, and then calculating the process temperature that would result in such a deposition rate across the tube. Fig. 7 shows the resultant temperature distribution. In actual practice, however, the process tube is heated up in three heating zones, which can be controlled independently. Therefore, the result is a step-wise temperature distribution that approximates the distribution in Fig. 7. The novel point about this synthesis is that all of the data in the training set of the deposition rate FFEBP network were for uniform temperature distributions across the tube, which resulted in exponentially dropping deposition rates (refer to Fig. 6). This again shows how these networks can effectively generalize their knowledge to produce novel ideas.
Fig. 7. Temperature profile needed to produce uniform deposition rate for all locations. (Axes: location in tube, cm, versus temperature, degrees C.)
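The reasoning behind the compensating profile can be illustrated with a small Arrhenius calculation. This is a sketch only: it does not use the FFEBP network, and it treats the temperature dependence as a pure factor exp(-ΔE/RT), ignoring how a temperature change upstream alters depletion downstream. The activation energy and gas constant are the values listed in Appendix I; the relative rates and the inlet temperature are hypothetical numbers chosen for illustration.

```python
import math

# Constants as listed in Appendix I of this paper.
ACTIVATION_ENERGY = 30183.8  # cal/mol
GAS_CONSTANT = 1.98719       # cal/(mol K)

def arrhenius_ratio(t_k, t_ref_k):
    """exp(-dE/RT) at t_k divided by its value at t_ref_k."""
    return math.exp(-ACTIVATION_ENERGY / GAS_CONSTANT * (1.0 / t_k - 1.0 / t_ref_k))

def compensating_temperature(t_ref_k, rate_ratio):
    """Temperature (K) at which the Arrhenius factor is rate_ratio times
    its value at t_ref_k."""
    inv_t = 1.0 / t_ref_k - (GAS_CONSTANT / ACTIVATION_ENERGY) * math.log(rate_ratio)
    return 1.0 / inv_t

# Hypothetical rates at six wafer positions, relative to the inlet,
# for a uniform-temperature run.
relative_rate = [1.00, 0.92, 0.85, 0.78, 0.72, 0.67]
t_inlet_k = 898.0  # hypothetical inlet temperature (about 625 degrees C)

profile = [compensating_temperature(t_inlet_k, 1.0 / r) for r in relative_rate]
for r, t in zip(relative_rate, profile):
    print(f"relative rate {r:.2f} -> {t - 273.15:.1f} degrees C")
```

Because the rate is exponential in temperature, a modest rise of a few tens of kelvin toward the outlet is enough to offset a sizable drop in rate, which is consistent with the lower-at-inlet, higher-at-outlet profile described above.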
CONCLUSION
The following is a summary of conclusions from this work.
1) Integration of influence diagrams and neural networks results in a flexible modeling architecture that can capture both the qualitative and the quantitative aspects of a process model.
2) FFEBP networks can learn and effectively generalize the quantitative relationships between the process parameters just by observing discrete experimental data points (inherently polluted with some statistical noise), without the need for any quantitative knowledge about the physics of the process. The ability to generalize can be used to produce novel ideas for the process.
3) For this experiment, the models generated by the FFEBP networks were more accurate in predicting the behavior of the process than models generated by the statistical regression analysis. A similar study with a larger amount of experimental data would be useful for extending this conclusion, although experimental results in other areas of learning that deal with large amounts of data, such as vision and speech processing, tend not to disprove this result.
4) The synthesis procedure developed in [1] can be used to extract information about the relationships between the process parameters in any arbitrary manner. This capability is useful for recipe generation and diagnostics.
THE FUTURE DIRECTION OF RESEARCH
This work can be extended in two different directions: theory and application. In terms of theory, the interested reader is referred to [1] for a list of possible research directions. In terms of application, this method can be applied to model different domains which require learning. Two such areas are diagnostics and real-time process monitoring. If an influence diagram can be constructed for a problem domain, and enough experimental data can be gathered for the variables in the model, our integrated architecture could be used to develop a mechanistic model of the process or improve the value of the numerical parameters in the current influence diagram model. Real time influence diagrams have been successfully tested in monitoring, diagnosing and controlling complex systems linked to sensory input [11]. Their major flaw is their inability to learn and improve the numerical values of the probabilities over time as more data are collected. The integrated architecture outlined in this paper forms the basis for recent extensions in the integration of influence diagrams and probabilistic neural networks for use in real time sensor validation. The qualitative knowledge captured in the influence diagram is used to structure the neural learning model. The probabilistic neural network operates off-line and is used to periodically update the numerical parameters in the real time influence diagram.
Another useful application in diagnostics would be in the area of preventive maintenance [12]. The up-time of processing equipment plays a very definite role in the efficiency of a manufacturing environment, and as such it would be very valuable to detect and correct equipment related problems in their early stages, before they become catastrophic. Most major problems, in their early stages, manifest themselves in minor drifts in some of the process variables. One can create a preventive maintenance and diagnostic model for the processing equipment by creating an integrated network that relates trends in the process and equipment related parameters to an upcoming maintenance problem. The neural networks of the integrated network would each be trained on the history of a set of parameters that relates them to specific equipment problems. Once a model is generated, it can be used as a tool for scheduling preventive maintenance.
APPENDIX I
POLYSILICON DEPOSITION RATE MODEL FOR TYLAN 16 [10]
$R(T, P, Q, X) = A P^{\alpha} e^{-\Delta E / RT} \cdots$
[The remainder of the expression, including the term for the mole fraction of depleted SiH4, is illegible in the source.]
where
R_0    R at X = 0.
T      deposition temperature in K. Range 878-923 K.
P      deposition pressure in mtorr. Range 250-550 mtorr.
Q      silane flow rate in sccm. Range 100-250 sccm.
X      wafer position in cm. Wafers are 1.2 cm apart. First wafer is at X = 0 cm; second at X = 1.2 cm; third at X = 2.4 cm, etc.
A      the Arrhenius frequency factor = 9.2935 x 10^[exponent illegible] (Å/min)/mtorr^α
α      0.2910
ΔE     the activation energy = 30183.8 cal/mol
k_1    23.9845 sccm^-1
R      the universal gas constant = 1.98719 cal/mol-K
C_gs   the gas solid conversion factor = 1.85 x 10^[exponent illegible] cm/Å
S      surface area of the furnace, 12740.8 cm^2
L      length of the furnace, 80 cm.

APPENDIX II
EMPIRICAL AVERAGE* RESIDUAL STRESS MODEL FOR TYLAN 16
[The model equation is illegible in the source; the surviving fragments are "898 - T", "120", and "-t".]
T      deposition temperature in K. Range 878-923 K.
P      deposition pressure in mtorr. Range 250-550 mtorr.
t      deposition time in min. Range 60-150 min.
a1 = 3.856, a2 = 3.445, a3 = -2.521, a4 = -1.764, a5 = 3.376, a6 = -5.140
*Averaging over the wafers in 2 boats.

ACKNOWLEDGEMENT
The authors would like to thank Kuang-Kuo Lin and Costas Spanos for providing the very valuable and hard to gather experimental data, and Semiconductor Research Corporation, MICRO, Harris Semiconductor, IBM, Intel, National Semiconductor, Philips/Signetics, Rockwell International, Sandia National Laboratories, Siemens AG, and Texas Instruments, for providing the funds for this research.

REFERENCES
[1] F. Nadi, "Modeling complex manufacturing processes via integration of influence diagrams and neural networks," Ph.D. dissertation, University of California at Berkeley, Department of Mechanical Engineering, Nov. 1989; also Electronics Research Laboratory Memorandum M89/123, University of California at Berkeley, Nov. 1989.
[2] A. C. Miller, M. M. Merkhofer, R. A. Howard, J. E. Matheson, and T. R. Rice, "Development of automated aids for decision analysis," Final Technical Report, DARPA #2742, SRI International, Menlo Park, CA.
[3] A. Rege and A. M. Agogino, "Topological framework for representing and solving probabilistic inference problems in expert systems," IEEE Trans. Syst. Man Cybern., vol. 18, no. 3, May/June 1988.
[4] S. Russell, S. Srinivas, and A. M. Agogino, "Inducing influence diagrams from examples," Berkeley Expert Systems Technology Laboratory, Department of Mechanical Engineering, 5136 Etcheverry Hall, U.C. Berkeley, Berkeley, CA 94720. Working paper #88-0202-1.
[5] D. Rumelhart and J. McClelland, Parallel Distributed Processing, vol. 1. Cambridge, MA: MIT Press, 1986.
[6] S. Rangwala and D. Dornfeld, "Learning and optimization of machining operations using computing abilities of neural networks," IEEE Trans. Syst. Man Cybern., March/April 1989.
[7] H. Graf, L. Jackel, R. Howard, B. Straughn, J. Denker, W. Hubbard, D. Tennant, and D. Schwartz, "VLSI implementation of a neural network memory with several hundreds of neurons," Proc. American Institute of Physics Conf. Neural Networks for Computing, New York, NY, 1986, pp. 182-187.
[8] E. Harth and A. S. Pandya, "Dynamics of the ALOPEX process: Application to optimization problems," Biomathematics and Related Computational Problems, pp. 459-471, 1988.
[9] S. Kirkpatrick, C. D. Gelatt, Jr., and M. P. Vecchi, "Optimization by simulated annealing," Science, vol. 220, pp. 671-680, 1983.
[10] K. K. Lin and C. Spanos, "Statistical equipment modeling for VLSI manufacturing," presented at the 176th Electrochemical Society Meeting, Automated Manufacturing Session, Oct. 1989.
[11] C. Spanos, personal communication.
[12] A. M. Agogino, S. Srinivas, and K. Schneider, "Multiple sensor expert system for diagnostic reasoning, monitoring and control of mechanical systems," Mechanical Systems and Signal Processing, vol. 2, no. 2, pp. 165-185, 1988.
Fariborz Nadi was born in Tehran, Iran. He received the B.S. degree in mechanical engineering from George Washington University, Washington, DC, and the M.S. and Ph.D. degrees in mechanical engineering from the University of California at Berkeley, in 1981, 1983, and 1989, respectively. After receiving the M.S. degree, he worked for two years as an Optical Thin-Film Engineer in the laser interferometer products division of Hewlett-Packard Company, Santa Clara, CA, then returned to the University of California at Berkeley for his Ph.D. degree. He is presently with the Department of Mechanical Engineering at Sharif University of Technology, Tehran, Iran. His research interests are in the area of machine learning, with the use of parallel distributed processing algorithms and influence diagrams, applied to real-time manufacturing problems.

Alice M. Agogino received the M.S. degree in mechanical engineering from the University of California at Berkeley and the Ph.D. degree from the Department of Engineering-Economic Systems at Stanford University, Stanford, CA, in 1978 and 1984, respectively. She is currently on the faculty in the Department of Mechanical Engineering at UC Berkeley and directs research in the Berkeley Expert Systems Technology Laboratory. She is a Registered Professional Mechanical Engineer in California and is actively involved in consulting with industry. Her research interests include multiobjective and strategic product design, nonlinear optimization, probabilistic modeling, graphics and computer-aided design, artificial intelligence, and decision and expert systems. She is a member of AAAI, ASME, ORSA/TIMS, SWE, SME, SAE, and ACM. She received an NSF Presidential Young Investigator Award in 1985; IBM Faculty Development Award, 1985-86; Pi Tau Sigma Award for Excellence in Teaching, 1986; Ralph R. Teeter Educational Award, 1987; and SME Young Manufacturing Engineer of the Year Award, 1978-88.

David A. Hodges (S'59-M'65-SM'71-F'77) received the B.E.E. degree from Cornell University, Ithaca, NY, and the M.S. and Ph.D. degrees in electrical engineering from the University of California at Berkeley. From 1966 to 1970 he worked at Bell Telephone Laboratories, first in the components area, Murray Hill, NJ, then as Head of the Systems Elements Research Department, Holmdel, NJ. He is currently Professor of Electrical Engineering and Computer Sciences (EECS) at the University of California at Berkeley, where he has been a member of the faculty since 1970. He became Chairman of the EECS department in July 1989. Since 1970 he has been active in teaching and research on microelectronics technology and design. Since 1984 his research has centered on integrated information systems for semiconductor manufacturing. Dr. Hodges is founding Editor of the IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING, and with Robert Brodersen and Paul R. Gray, he received the 1983 IEEE Morris N. Liebmann Award for pioneering work on switched-capacitor circuits. He is also a member of the National Academy of Engineering.

