Microcontrollers
Programming the Freescale HCS12
Preface
Dear Reader,
This is a book about microcontrollers. Microcontrollers rule the world when we aren't
looking. They run our cars, trains and planes, medical devices, our kitchen appliances, our
telephones, printers, robots, and about a zillion other things that we're not aware of. There's even
a microcontroller inside the battery in the laptop I'm typing this on, and probably yours, too.
There are billions of them in use today and they do their job ceaselessly and without complaint
for years at a time (at least for the most part).
Applications such as those just mentioned are examples of what are known as embedded
systems. These are systems designed to turn on when you push the power button, do their job
without intervention from you, and then turn off when you push the button again. The
microcontroller is buried deep inside these systems and you never see it. Microcontrollers are
also used around the world by artists and hobbyists to make animated art, toys, robots, or just
flashing lights for the fun of it.
A microcontroller is similar in some ways to a microprocessor (in fact, it has a
microprocessor contained inside it) and dissimilar in others. The biggest difference is that
microcontrollers are systems-on-a-chip; that is, they usually have everything needed to sense
and control the outside world on a single integrated circuit. Also, they usually run a lot slower
than the microprocessor in your laptop or desktop. This keeps the power and cost down, and many of
them cost less than a dollar. Power is important because they are often used in battery-powered
autonomous operations.
In this course you will learn what's inside a microcontroller and how it does its job. You
will also learn how to program microcontrollers to do all sorts of useful and fun things.
A note about the text: this is the first edition, and it may have many errors, both small and
big. If you spot any, don't hesitate to let me know; you may make it into the Acknowledgements
if there is ever a second edition.
Table of Contents
Chapter 1. Introduction
    Number Systems
    Introduction to Microcontrollers
    What Goes on Inside a Microcontroller
    What You Need to Know to Program a Microcontroller
    Instant Quiz Answers
    Homework
    References
Chapter 2. Programming the Microcontroller
    Programming languages
    The CodeWarrior IDE
    Some Useful Instructions
    Some Useful Assembler Directives
    Instant Quiz Answers
    Homework
Chapter 3. I/O Ports
    Parallel I/O Ports
    An Example: Process Control Automation
    Instant Quiz Answers
    Homework
Chapter 4. Indexed Addressing
    Instant Quiz Answers
    Homework
Chapter 5. Assembly Directives
    Shift Operations
    Some Useful Assembly Directives
    Assembly Expressions
    Instant Quiz Answers
    Homework
Chapter 6. Timing and Pulse Width Modulation
    The HCS12 Timing Module
    The HCS12 Pulse-Width Modulation Module
    Instant Quiz Answers
    Homework
Chapter 7. Interrupts
    What are Interrupts and how do They Work?
    What happens when an interrupt occurs?
    The WAI Instruction
    Instant Quiz Answers
Chapter 1. Introduction
In this chapter we'll discuss three topics. The first is the number systems used by
microcontrollers and the development software used to program them. Next, we'll describe what
a microcontroller is, what it is used for, and the differences between a microcontroller and a
microprocessor. We'll conclude with what is called the microcontroller organization. This
covers how the internal components (central processing unit, memory, etc.) are
organized (what is connected to what), the special registers inside the
microcontroller used for things such as doing arithmetic, the instruction set (a description of
what instructions the microcontroller can perform, such as adding two numbers together), and the
memory map, which is a graphical description of the memory showing which addresses
are used for which special functions (for example, where the program or the
data is stored).
Number Systems
Microcontrollers, like microprocessors, operate at the physical level using binary numbers
represented by physical voltage levels, so we have to understand binary numbers in detail. We
can understand how binary numbers work by remembering how ordinary decimal numbers work.
For example, in Figure 1-1, for the number 124.375 (base 10), the 4 to the left of the decimal point
represents the number of ones (10^0), the 2 represents the number of tens (10^1), and the leftmost 1 the
number of hundreds (10^2). Note that 10 is the base, which means the digits run from 0 to the base
minus 1 (9).
    0011            (carry bits)
     00111010       (= 58_10)
    +00010001       (= 17_10)
     01001011       (= 75_10)
The numbers in the top row are the carry bits. For example, in the fourth column from the left, the
sum is 1 + 1 = 0, carry the 1 (the rightmost carry bit).
The most significant carry bit (the zero on the extreme left) is important. The reason is that,
while on a piece of paper you can add numbers of as many bits as you like, the processor only
has storage space for a finite number of bits. For example, suppose you wanted to add the two
8-bit numbers below using a processor that stores results as 8-bit bytes.
    1011
     10111010
    +10010001
    101001011
On paper you can just write down the leftmost carry bit next to the zero in the answer, as
shown, but in the processor the answer would be truncated to 8 bits.
To deal with this problem the processor keeps track of special bits like the carry (or C) bit in
a special register called the Condition Code Register, or CCR. We'll talk about some of the other
bits in the CCR in a while.
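To make the truncation concrete, here is a minimal sketch in Python (used here only as a convenient calculator, not as HCS12 code). The function name add8 is made up for this illustration. It adds two 8-bit numbers, keeps only the low 8 bits as the stored result, and returns the ninth bit separately, which is the value the C bit would hold.

```python
def add8(a, b):
    """Add two 8-bit values the way an 8-bit processor would.

    Returns (result, carry): the sum truncated to 8 bits and the
    carry out of the most significant bit (what the C bit would hold).
    """
    total = (a & 0xFF) + (b & 0xFF)   # full sum, may need 9 bits
    result = total & 0xFF             # only the low 8 bits are stored
    carry = total >> 8                # the 9th bit becomes the carry
    return result, carry


# The 8-bit example from the text: 10111010 + 10010001
result, carry = add8(0b10111010, 0b10010001)
print(format(result, "08b"), carry)
```

The stored result is the 8-bit pattern without the leading 1 that you could write down on paper; that leading 1 only survives as the carry.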
Subtraction
Doing subtraction in a processor is an interesting problem. On paper when you want to do
subtraction, you just put a little minus sign in front of the number you want to subtract. This
converts it to a negative number. In a processor there's no place to put a minus sign, just ones
and zeros. To deal with this problem, most processors represent negative numbers using what is
known as Two's Complement Arithmetic.
To get the two's complement of a number, you first complement it (change all the 1s into 0s
and all the 0s into 1s) and add 1 to the result. (The complement, before adding 1, is called the
one's complement.)
As an example, let's find the two's complement of the number 05_10. In binary
    05_10 = 00000101,
so the one's complement is
    NOT(05) = 11111010.
To get the two's complement, add 1 to this in the normal way you do addition:
          11111010
         +00000001
    -05 = 11111011.
If you don't believe this is -5, just add +5 to it to see what you get:
         1
    -05 = 11111011
    +05 = 00000101
          00000000.
So we get all zeros, plus a carry out. The carry bit isn't part of the answer; it's just stored in
the CCR in case we need to pay attention to it.
By the way, this is how a computer does subtraction using two's complement: it just
converts the second number to its two's complement and adds the result to the first number.
Note that in the two's complement representation for -5 the most significant bit is a 1. This is
a common feature of two's complement negative numbers: all negative numbers begin with a 1.
Because the most significant bit acts as the sign, only the remaining 7 bits are left for the value.
This, in turn, means that an 8-bit two's complement number can represent the numbers -128 (10000000)
through +127 (01111111).
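The recipe above (complement, add 1, then add to do a subtraction) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how you would program the HCS12, and the function names are made up for this example.

```python
def twos_complement8(n):
    """Return the 8-bit two's complement of n: invert all bits, then add 1."""
    ones = ~n & 0xFF          # one's complement, kept to 8 bits
    return (ones + 1) & 0xFF  # add 1 and discard any carry out


def to_signed8(bits):
    """Interpret an 8-bit pattern as a signed two's complement value."""
    return bits - 256 if bits & 0x80 else bits


def sub8(a, b):
    """Compute a - b by adding the two's complement of b, as described in the text."""
    return (a + twos_complement8(b)) & 0xFF


minus_five = twos_complement8(5)
print(format(minus_five, "08b"))    # bit pattern for -5
print(to_signed8(minus_five))       # the same pattern read as a signed number
print(format(sub8(20, 5), "08b"))   # 20 - 5 done by adding
```

Note that to_signed8 reads any pattern whose most significant bit is 1 as negative, which is where the -128 through +127 range comes from.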
Instant Quiz (the answers are at the end of the chapter, but try doing them yourself first, before
looking)
1. Convert the following decimal numbers to 8-bit binary:
a) 33
b) -33
2. Add the following two binary numbers: 00010011 + 11101101.
3. Perform the following binary subtraction: 00111111 - 01001100. (Just convert the
second number to its two's complement and add.) Is the result positive or negative?
suppose your microcontroller is taking and analyzing data from some sensors but you also have a
smoke detector attached to an interrupt pin. You can set the microcontroller up so that it runs the
normal data acquisition program but if it gets an interrupt signal it stops and turns on a sprinkler
system and alarm bell. We'll talk more about the I and X bits in a later chapter.
The half carry bit, H, does the same thing as the C bit but just for addition and subtraction of
nibbles (4-bit symbols).
Finally, the S bit enables or disables the Stop instruction. This instruction, as the name
implies, stops the processor from running.
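As a rough illustration of how the arithmetic-related bits relate to one another, here is a minimal Python sketch that adds two 8-bit numbers and works out H, N, Z, V, and C. The function name adda_flags is made up, and the rule used for V (both operands have the same sign but the result has the opposite sign) is the usual two's complement overflow test, assumed here rather than taken from this chapter. Check the CPU reference manual for the exact behavior of any particular instruction.

```python
def adda_flags(a, b):
    """Add two 8-bit values and report the resulting condition code bits."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "H": int((a & 0x0F) + (b & 0x0F) > 0x0F),  # carry out of the low nibble
        "N": result >> 7,                           # most significant bit of the result
        "Z": int(result == 0),                      # result is all zeros
        # assumed overflow rule: operands share a sign, result has the other sign
        "V": int(((a ^ result) & (b ^ result) & 0x80) != 0),
        "C": total >> 8,                            # carry out of bit 7
    }
    return result, flags


print(adda_flags(0x0F, 0x01))   # carry out of the low nibble only
print(adda_flags(0x7F, 0x01))   # +127 + 1 does not fit in a signed byte
```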
Before moving on, when we talked about the Z and N bits, we said that they indicate whether
the last operation resulted in a zero or negative number. This isn't completely true; some
operations don't affect these bits. What we really meant was that they are determined by the last
relevant operation. If, for example, we add numbers together, or load or store them somewhere,
the bits can change, but there are some operations that don't affect them. We have to be a little
careful. In a few chapters we'll see how to find out if a particular operation affects the Z and N
bits. In the meanwhile, it's time for a quick quiz.
Instant Quiz
5. What are the values of the Z and N bits for the example above (15 - 5)?
Hexadecimal Numbers
As you can see, it's difficult for us to read and calculate with long strings of binary digits. As
a result, instructions and data are often represented in hexadecimal (base 16). Of course, the
processor still uses binary 1s and 0s; hexadecimal is just an easy way for us to deal with them.
Just as in decimal the digits run from 0 to 9 and in binary they run from 0 to 1, in base 16
they run from 0 to 15. The first 10 digits are just 0 to 9, but then 10_10 is represented by A_16, 11_10
by B_16, 12_10 by C_16, and so on, up to 15_10, represented by F_16.
For example, for the hexadecimal number 7C.6_16:
    7C.6_16 = 7 x 16^1 + 12 x 16^0 + 6 x 16^-1,
which adds up to 124.375_10, the same number as in Figure 1-1 and Figure 1-2.
We should note that what happens to the right of the decimal point doesn't always convert so
easily. For example, if we wanted to convert 124.1_10 to binary we would get
    124.1_10 = 01111100.00011001100110011001100110011001..._2,
and in hex
    124.1_10 = 7C.19999999999999999..._16,
so we would need an infinite series to represent the 0.1.
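Both points can be checked with a short Python sketch (an illustration only; the function names are made up). The first function evaluates a digit string by weighting each digit with a power of the base, exactly as in the sum above. The second generates the digits to the right of the point one at a time by repeated multiplication, which shows the repeating pattern for 0.1.

```python
DIGITS = "0123456789ABCDEF"


def positional_value(text, base):
    """Evaluate a digit string such as '7C.6' in the given base."""
    whole, _, frac = text.upper().partition(".")
    value = 0.0
    for power, digit in enumerate(reversed(whole)):
        value += DIGITS.index(digit) * base ** power    # base^0, base^1, ...
    for power, digit in enumerate(frac, start=1):
        value += DIGITS.index(digit) * base ** -power   # base^-1, base^-2, ...
    return value


def fraction_digits(numerator, denominator, base, count):
    """Return the first `count` digits of numerator/denominator (less than 1) in `base`."""
    digits = []
    for _ in range(count):
        numerator *= base
        digits.append(DIGITS[numerator // denominator])  # next digit after the point
        numerator %= denominator                          # carry the remainder forward
    return "".join(digits)


print(positional_value("7C.6", 16))
print(positional_value("01111100.011", 2))
print(fraction_digits(1, 10, 2, 16))   # 0.1 in binary, first 16 digits
print(fraction_digits(1, 10, 16, 8))   # 0.1 in hex, first 8 digits
```

Because the remainder in fraction_digits never reaches zero for 1/10 in base 2 or base 16, the digits repeat forever, which is why 0.1 has no exact finite representation in either base.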
It's easy to convert back and forth between binary and hexadecimal. To go from binary to
hex, just divide the binary number into groups of 4-bit nibbles, starting on the right and working
left, and then just convert each nibble into its corresponding hexadecimal digit. For example,
    1011 1010_2 = BA_16.
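In code, the same nibble-by-nibble grouping might look like the following Python sketch. The function names are made up for this illustration, and the padding step is an assumption about how to handle bit strings whose length is not a multiple of 4.

```python
def binary_to_hex(bits):
    """Convert a string of binary digits to hex, one 4-bit nibble at a time."""
    bits = bits.replace(" ", "")
    # pad on the left so the length is a multiple of 4 (grouping starts on the right)
    bits = bits.zfill((len(bits) + 3) // 4 * 4)
    nibbles = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join("0123456789ABCDEF"[int(nibble, 2)] for nibble in nibbles)


def hex_to_binary(digits):
    """Convert a string of hex digits to binary, four bits per digit."""
    return " ".join(format(int(digit, 16), "04b") for digit in digits)


print(binary_to_hex("10111010"))
print(binary_to_hex("101001011"))   # 9 bits: padded to 12 before grouping
print(hex_to_binary("BA"))
```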
When you write your program you have to let your software know if the number you're
typing is in decimal, binary, hexadecimal, or ASCII. Writing subscripts like 10, 2, or 16 is a little
cumbersome, so the developers of the assemblers give you some shortcuts.
Table 1. ASCII Character Set.
Unfortunately the shortcuts aren't the same for every manufacturer of microcontrollers. For
Freescale microcontrollers (i.e., when you're using CodeWarrior), the notation for the shortcuts
is as follows:
Binary numbers are indicated by a prepended %
Hexadecimal numbers are indicated by a prepended $
Decimal numbers have no prefix
ASCII characters are enclosed in single quotes
For example, the decimal number 72 can be written as any of the following:
72 = %01001000 = $48 = 'H'.
Writing any of these in the assembler will have the same effect.
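To see that these four spellings really are the same number, here is a small Python sketch that reads each notation and returns the integer it stands for. The function parse_literal is hypothetical and only mimics the prefix rules listed above; it is not part of CodeWarrior.

```python
def parse_literal(text):
    """Convert a literal written in the notation above to an integer."""
    text = text.strip()
    if text.startswith("%"):
        return int(text[1:], 2)       # binary
    if text.startswith("$"):
        return int(text[1:], 16)      # hexadecimal
    if len(text) == 3 and text[0] == text[-1] == "'":
        return ord(text[1])           # single ASCII character
    return int(text, 10)              # plain decimal, no prefix


for literal in ["72", "%01001000", "$48", "'H'"]:
    print(literal, "->", parse_literal(literal))
```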
Introduction to Microcontrollers
Now we're ready to start looking at actual microcontrollers, starting with what they are. A
microcontroller is a system-on-a-chip typically intended for embedded applications such as
telephones, automobile engine control systems, remote controls, office machines, appliances,
toys, etc. By system-on-a-chip we mean a single integrated circuit that has everything we need
to sense and control the outside world. Some of the things we need are on-board memory,
provision for on-board clocking, lots of I/O, on-board ROM for storing programs, and so on.
In contrast, in a desktop or laptop computer many of these functions would be provided by
external integrated circuits, boards, or modules. This provides a lot of flexibility (for example, if
your desktop computer doesn't have the latest version of USB, you can add a small board to
provide it) but usually at much higher cost.
Some of the characteristics of microcontrollers include
Most of these are related to the application space. Microcontrollers are intended to be used in
embedded applications where quantities of thousands to millions are typical, so low cost is
important. They are also often used in battery-powered applications, so low power operation is
essential. Most control applications require response times in the range of a few hundred
microseconds to milliseconds, so high-speed processing is usually not required. Moreover, there
is a tradeoff between speed and power consumption, which is another reason you don't often
need or want high-speed devices.
Microcontrollers usually offer a high degree of integration of on-board functions. These can
include, for example, a relatively large amount of general-purpose on-board memory.
Microprocessors also have on-board memory, but this tends to be high-power, high-speed
memory that is used to store the next bunch of instructions the processor expects to execute, so
it's used to speed up the processing. The memory in a microcontroller is used to store the entire
program to be run plus any needed constants or data. Also, microcontrollers typically have
on-board analog-to-digital conversion circuitry, pulse width modulation (PWM) modules, clock(s)
and timer(s), and support for chip-to-chip communications protocols. A high degree of
integration reduces the cost and also makes the microcontroller more reliable, since there are
fewer individual parts that can fail.
Microcontrollers typically have lots of I/O to sense, control, and communicate with the
outside world. These can include simple, digital I/O pins, inputs for the analog-to-digital
converters, serial pins for communications protocols, pins used for external interrupts, and many
others.
We talked about interrupts a little earlier. One of the things that distinguishes
microcontrollers from microprocessors is that the former emphasize low interrupt latency,
which is a measure of how quickly the microcontroller can service an interrupt.
Finally, microcontrollers are typically programmed at a fairly low level, usually using
assembly or C/C++.
A typical home in the US is likely to have between one and two dozen microcontrollers,
compared to just a few microprocessors (desktop, laptop computers, etc.). A typical mid-range car
can have over 50 microcontrollers, and, as we mentioned in the Preface, the battery in your
laptop has a microcontroller in it for power management.
There are two ways in which the memory can be organized to store instructions and data:
either they can be stored in separate parts of the memory, each with its
own set of buses, or they can all be lumped together in a common memory. The latter scheme is
called a von Neumann architecture, named after John von Neumann, who was a very famous
scientist, mathematician, and a whole lot of other things. The former is known as a Harvard
architecture because it was developed at Harvard. Some people like to call the von Neumann
architecture the Princeton architecture because von Neumann was at Princeton and that way
they could both be named after universities. Either way, as you can see, they were both named
before the computer scientists lifted the name "architecture." Both schemes have advantages and
disadvantages, but, as a practical matter, you don't pick a microcontroller for how its memory is
organized.
Figure 1-4 shows the internal components of the CPU. There are three main parts. The
Arithmetic and Logic Unit (ALU) contains adders, subtractors, multipliers, dividers, and logic
functions such as AND, OR, etc. Next is a set of special registers. These include one or more
Accumulators, which are used to temporarily store numbers for subsequent arithmetic or logic
functions, one or more Index Registers, used for something called indexed addressing (more
about that later), a Program Counter (PC), which keeps track of what part of the program the
CPU is going to execute next, and a memory register, which stores the number to be placed on
the address bus. Finally, there is a Control Unit, comprising an Instruction Decoder and a
Sequence Controller. The Instruction Decoder looks at each instruction that arrives on the data
bus and figures out what to do with it (e.g., add a number to the number in an accumulator, store
a number in an accumulator, or whatever). The Sequence Controller makes sure that everything
runs in the proper order at the proper time (addresses put their data on the data bus, or read the
data on the data bus, etc.).
1. The program counter places the address containing the next instruction to be executed
on the address bus
2. The instruction is read from memory and decoded by the Instruction Decoder (some
examples are ADD, logical AND, shift contents of a register or memory right or left,
etc.; there are lots of them)
3. If data is required (for example, to be added to the number in the accumulator) the
address of the data is read and the data fetched
4. The instruction is executed
5. Any results are placed in the appropriate register or memory
6. The Program Counter is advanced to the location of the next instruction
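The six steps above can be sketched as a toy loop in Python. This is a deliberately simplified model, not a simulation of the HCS12. The opcode $86 (load Accumulator A with the next byte) is the one used in the Chapter 2 example. The value $8B for an add instruction is an assumption that should be checked against the CPU reference manual, and the HALT opcode is invented for this sketch so that the loop has a way to stop.

```python
LDAA_IMM = 0x86   # load Accumulator A with the next byte (from the Chapter 2 example)
ADDA_IMM = 0x8B   # assumed opcode: add the next byte to Accumulator A
HALT = 0x00       # made-up opcode for this sketch only: stop the loop


def run(memory, pc=0):
    """Run a toy fetch-decode-execute loop and return the final Accumulator A."""
    acc_a = 0
    while True:
        opcode = memory[pc]            # steps 1-2: fetch the instruction at PC
        if opcode == HALT:
            return acc_a
        operand = memory[pc + 1]       # step 3: fetch the data it needs
        if opcode == LDAA_IMM:         # steps 4-5: execute and store the result
            acc_a = operand
        elif opcode == ADDA_IMM:
            acc_a = (acc_a + operand) & 0xFF
        else:
            raise ValueError(f"unknown opcode ${opcode:02X}")
        pc += 2                        # step 6: advance to the next instruction


program = [LDAA_IMM, 22, ADDA_IMM, 3, HALT]
print(run(program))
```

Here the program and its data share one list, and the only register modeled is Accumulator A; a real CPU has more registers and many more instructions, but the loop has the same shape.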
The great physicist Richard Feynman named this the "file clerk" model of computing [1]. The
file clerk fetches numbers on pieces of paper, does something with them, and then files the
results somewhere.
As we mentioned, this is a simplified, generic model of a microcontroller. An actual device is
a lot more complicated (although it does the same things). Figure 1-5 shows a real
microcontroller; in fact, it's the one you'll be using for this class, the Freescale MC9S12DG256.
It's part of the HCS12 family, so you may see it referred to in both ways. It doesn't look very
much like the generic version. Most of what you're looking at is I/O. Starting near the middle of
the left side you'll see a box marked PTE, and a little below that you'll see two boxes marked
PTA and PTB. These are parallel I/O ports E, A, and B, respectively. They're just memory
addresses brought out to physical pins on the integrated circuit and they just represent digital
voltage levels (5 V for logical 1 and 0 V for logical 0 for this device). The boxes marked DDRE,
DDRA, and DDRB are control registers that you use to tell the processor whether the pins should
be inputs or outputs. There are actually more of these parallel ports, such as Ports H and J. Port J
has two pins reserved for external interrupts.
Now look at the upper right. There are two analog-to-digital converters, ATD0 and ATD1.
Each has 8 analog inputs that can be multiplexed into the converter, so you can actually
have 16 analog inputs that you can digitize.
Below the ATD section you see a box marked PPAGE. This is the section of ROM that can
be brought out externally at parallel I/O Port K.
Below this is the Timer module, attached to Port T. This module provides all kinds of useful
functions related to timing and counting.
Below that are (except for a block in the middle labeled PWM) a bunch of modules that
implement various communications protocols that can be used to connect the chip to sensors,
other microcontrollers, or even computers. SCI stands for Serial Communications Interface,
BDLC stands for Byte Data Link Controller, CAN (CAN bus) stands for Controller Area
Network, IIC (aka I2C) for Inter-Integrated Circuit, and SPI for Serial Peripheral Interface.
They're all just different communications protocols that people have come up with over time.
The point of having so many different kinds is that it allows you to talk to a wide variety of
different devices that may only have one of them.
The PWM (Pulse Width Modulation) module in the middle of all this provides up to 8
pulse-width-modulated outputs. If you haven't heard of this term before, it refers to the generation of
square waves with an on time that you can vary, i.e., the pulse width. This is really useful for a
number of applications ranging from light dimmers to controlling robot servomotors.
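One way to see why a variable on time is useful is to look at the average voltage of the output. The Python sketch below is a back-of-the-envelope calculation, not HCS12 code. It assumes an ideal wave that sits at the 5 V logic level while it is on and at 0 V otherwise, and the function name is made up.

```python
def pwm_average_voltage(on_time, period, high_voltage=5.0):
    """Average voltage of an ideal pulse train that is high for on_time out of each period."""
    if period <= 0 or not 0 <= on_time <= period:
        raise ValueError("need period > 0 and 0 <= on_time <= period")
    duty_cycle = on_time / period      # fraction of each period spent high
    return duty_cycle * high_voltage


print(pwm_average_voltage(1, 4))   # on for a quarter of each period
print(pwm_average_voltage(2, 4))   # on for half of each period
```

Lengthening the on time raises the average, which is roughly how a PWM output can dim a light without any analog circuitry.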
Next is the Program Counter. This points to the address containing the next instruction to be
executed. As each instruction is executed, the Program Counter figures out where the next
instruction after that will be.
Finally, there is the Condition Code Register, which stores the condition code bits, V, C, N, Z,
etc. that we talked about earlier.
The Memory Map
Figure 1-7 shows the memory map for the MC9S12DG256 microcontroller. The column you
want to focus on is the second from the left. This is the memory map the device uses in normal
operation. The HCS12 has a 16-bit-wide address bus, which means it can address 2^16 = 65,536
individual memory locations. These addresses are indicated in the figure as $0000 through
$FFFF. It's important to note that not all of these are implemented in a given device in a
particular family. In the HCS12 family you can buy small versions that don't have a lot of
on-board memory, but are cheaper and have a smaller footprint, or large versions with lots of
on-board memory for big programs. By the way, the same is true for how many of the ports are
actually brought out to physical pins on a given family member. This again is so that you can
buy smaller, cheaper versions if you don't need lots of I/O.
There are three types of memory present in the HCS12. The first is RAM (Random Access
Memory). This type of memory can be written to or read from as the program runs; however, the
information in it is lost when the microcontroller is powered down. You would use this memory
to store variables, such as data obtained from external sensors or results of calculations that you
can afford to lose when you shut off the device. In the MC9S12DG256 version of the HCS12,
the RAM is located at addresses $0000 through $03FF and $1000 through $3FFF. There are
about 12,000 bytes of RAM on the chip.
The second type of memory is Flash PROM (Programmable Read Only Memory). This type
of memory retains its contents when the chip is shut off. It can be erased and reprogrammed in
large blocks but only when the program is downloaded; once the program is running it can't be
written to by the microcontroller. Typically the program code and any data constants are stored
in ROM, so they're available when you restart your microcontroller. The reason that you can't write
to ROM when the program is running is that you don't want to inadvertently write some data
over part of your stored program. The memory in your thumb drive is flash, which is why thumb
drives are often called flash drives.
The third kind of memory is EEPROM, which stands for Electrically Erasable
Programmable Read Only Memory. It's another kind of ROM that you can use. (There's also
Flash EEPROM on the chip that you can use in the same way as the other kinds of PROM.)
You might take a minute to locate the various types of ROM in your device. Again, you just
need to look at the second column from the left.
Some Popular Microcontrollers
To finish up this chapter let's look at some popular microcontrollers and microcontroller
families that you may come across.
Atmel: Atmel makes a variety of different microcontrollers using several different
architectures, including the AT89 (their version of the Intel 8051 described below) and the
ATmega series used in the increasingly popular Arduino microcontroller boards.
Intel 8051: this is the second generation of Intel's microcontrollers. It's been around a long
time and it dominates the microcontroller market. It's powerful, easy to program, and uses a
Modified Harvard architecture.
Microchip Technology, Inc. PIC: these are very popular among hobbyists, with over 5
billion sold. They were the first RISC (Reduced Instruction Set Computer) microcontrollers. 8-, 16-,
and 32-bit versions are available. PICs use a Harvard architecture.
Motorola/Freescale: these are very popular for industrial applications. They use a von
Neumann architecture.
Instant Quiz Answers
2. 0 (note that the second number is just the two's complement of the first).
3. 11110011, negative.
4. V = 0.
5. Z = N = 0.
6. FF_16 = 11111111_2; 01111100_2 = 7C_16. (By the way, in decimal, these numbers are 255
and 124, respectively.)
7. 1 1 = 10100101.
Homework
0. Read the article http://www.nytimes.com/2010/02/05/technology/05electronics.html
1. Convert the hexadecimal numbers on the left side of the memory map (Figure 1-7) to
decimal and binary
2. For the following two operations give the result and indicate the status of the C, V, N,
and Z bits in the Condition Code register.
a) $2A
+$52
b) $AC
+$8A
3. Do the following subtractions and indicate the value of the C, V, N, and Z bits
a) $7A
-$5C
b) $8A
-$5C
c) $5C
-$8A
d) $2C
-$72
4. Write the sequence of hexadecimal numbers that represents the character string "HELLO
WORLD!" followed by a carriage return.
5. How many individual memory locations could you access with a 32-bit address bus?
How many with a 64-bit address bus?
6. Search the web for an MC9S08QG8 microcontroller (this is a member of the HCS08
family). Compare the programming model to that of the HCS12 you will be using in class
with respect to number, sizes, and types of registers (accumulators, index registers, etc.)
7. Compare the memory map of the MC9S08QG8 with that of your HCS12.
References
1. Feynman, Richard P., Anthony Hey, and Robin W. Allen. 2000. Feynman Lectures on
Computation. Boulder, Colorado: Westview Press. pp. 58.
2. Dragon12-Plus-USB Trainer For Freescale HCS12 microcontroller family, User's
Manual for Rev. G board Revision 1.10.
http://www.evbplus.com/download_hcs12/dragon12_plus_usb_9s12_manual.pdf.
Accessed 4 April 2013.
Programming languages
At the most fundamental level, all processors are programmed in machine code, using
instructions and data represented by voltage levels corresponding to binary 1s and 0s. As an
example, the machine code sequence to load Accumulator A with the number 22_10 is
    10000110
    00010110
These two 8-bit numbers would be stored sequentially in the processor's memory. The first
number (in hexadecimal, $86) is the machine code instructing the processor to fetch the next
number in memory (in this case, 00010110_2 = 22_10) and load it into Accumulator A. To write a
complete program you have to look up the machine code for each instruction you want to
execute and load it into the appropriate place in memory.
This is obviously a cumbersome process. To speed the process, assembly languages were
developed. In assembly, mnemonics are used to represent the machine code. For example, the
assembly code for the two lines above is
    LDAA #22
LDAA stands for "Load Accumulator A" and the # sign means "the next number in
memory" (i.e., right after the address containing the load instruction). The # is used to
distinguish between loading the number 22 itself and loading the number stored at address 22.
A piece of software known as an Assembler takes what you have written and converts it
into the machine code above, and then loads it into the microcontrollers memory. The
CodeWarrior IDE that we mentioned above is the tool we will be using in this course.
The trend is toward programming using higher-level languages such as C or C++. This
makes programming a lot easier since you're writing straightforward programming instructions,
such as
x = y + z;
which are pretty easy to understand. In this case, CodeWarrior will take your C or C++ code and
compile it to machine code for you.
The downside is that you have very little control over how the compiler converts your
program to machine code, and there is no guarantee that the resulting machine code will be the
most efficient or run the fastest. This isn't a problem unless you're doing something that is time
critical, such as developing video games. In that case, programmers often end up writing the
time-critical part of the code in assembly and incorporating it into the C code as a function call.
a few simple categories that are easy to remember. Here are a few that are useful for us when
starting out:
instructions that move data around
instructions that do arithmetic
instructions that perform Boolean algebra
instructions that test or manipulate data
instructions that control the flow of the program
Let's look at some examples of each.
Instructions that Move Data
Here are three instructions that move data from one place to another:
LDAA   you've already seen this one
STAA   Store contents of Accumulator A somewhere in memory
MOVB   moves a byte from one location to another (but doesn't change the contents
       of the source)
For the load and store instructions there are also versions for many of the special registers.
For example, LDAB loads Accumulator B, LDD loads the double Accumulator D, LDX and
LDY load Index Registers X and Y, and there are more. Similarly, for the Store instruction,
there are STAB, STX, STY, etc.
You can see how you quickly get to 1,000 instructions, but you can also see how they group
together so you can more or less remember them. For example, if you were to guess that there
would be an STS instruction to store the value in the Stack Pointer, you'd be correct. Another
reason that there are so many instructions is that there are a lot of ways to specify what address
the instruction is referring to. Remember that LDAA #22 loads the accumulator with the number
22, while LDAA 22 loads it with the number in address 22; they're actually two different
instructions.
The LDAA and STAA instructions move one byte; the equivalent loads and stores for the 16-bit
index registers, Accumulator D, and the Stack Pointer move two contiguous bytes at a time.
In the same way, there is a MOVW instruction that moves two contiguous bytes (a word) from
one place to another.
Here's some sample code:
; remember, everything to the right of the semicolon is a comment
LDAA $00
STAA $2000
This code moves the number in address $00 to address $2000 by first loading it into
Accumulator A and then storing the contents of A in address $2000. Notice how the comments
after the ; help you to see what's going on and, more importantly, remind you of what the code
does if you come back to look at it weeks or months later.
You might be wondering, why do it one way rather than the other? Well, the second way is
quicker to write, but the first, it turns out, will run faster. It depends on which is more important
to you at the time. Also, there are limits to how you can use an instruction. For example, you
might think that there would be an instruction
MOVB A, $2000 ; move the byte in Accumulator A to $2000,
but there isn't. What instructions are included in the instruction set is a decision made by the
system architect, based on what he or she thinks is useful or can fit on the chip.
One more point: instructions are not case sensitive, so you could have written, e.g., ldaa $00.
The cool kids do it this way since it's faster to type, and I'll start doing it most of the time from
now on.
Instructions that do Arithmetic
Here are some instructions that do arithmetic. There are many more.
ADDA   Add a number to the contents of A (there's also an ADDB, ADDD, etc.)
ABA    add contents of B to A and store the results in A
SUBA   Subtract a number from the contents of A
MUL    multiplies contents of A and B and stores the result in D (also several divides)
Here's an example of something you might do with them:
ldaa $00
adda $01
staa $2000
What it does is to load Accumulator A with the number in address $00, then add the number
in address $01 to the contents of A, then store the resulting sum in address $2000.
As we said above, there are a lot more. One example is Add with Carry to A (ADCA),
which adds a number to the contents of Accumulator A and then adds the value of the Carry bit.
You would use this to add two 16-bit (or longer) numbers together. You first add the two low-order
bytes, which leaves a 1 or 0 in the Carry bit, then you add with carry the two high-order bytes, so the
Carry bit from the low-order bytes gets added in. There are also Add Accumulator B to
Accumulator A and store the sum in A (ABA), and the ABX and ABY instructions, which add the
contents of Accumulator B to the X and Y Index Registers, respectively.
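The add-with-carry idea can be tried out on a PC before writing any assembly. The following is a rough Python model, not HCS12 code: the helper names add8 and add16 are invented for this sketch, and it assumes the low-order bytes are added first so that their carry can feed into the high-order bytes.

```python
def add8(a, b, carry_in=0):
    """Add two bytes plus an optional carry; return (8-bit sum, carry out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8

def add16(x, y):
    """Add two 16-bit numbers one byte at a time, low bytes first."""
    low, carry = add8(x & 0xFF, y & 0xFF)        # plays the role of ADDA
    high, carry = add8(x >> 8, y >> 8, carry)    # plays the role of ADCA
    return (high << 8) | low, carry

result, carry = add16(0x12FF, 0x0001)
print(f"${result:04X}, C = {carry}")
```

In this example the low bytes $FF + $01 overflow to $00 with a carry of 1, and that carry is what turns the high byte from $12 into $13.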
Now, let's see what you've learned.
Instant Quiz
1. How would you modify the code above so that it added the number 1 to the number in the
accumulator, rather than the number in address 1? (Hint: use a # sign in the adda
instruction in the same way we used it in the ldaa instruction, i.e., to indicate the number
rather than the address.)
2. Try to guess the mnemonic for an Add with Carry to B instruction.
rotates, which rotate the bits around, as well as all these for shifts
to the left.)
A note about nomenclature: forcing a bit to be 1 is called "setting" the bit, and forcing it to be
0 is called "clearing" the bit. We'll start using this notation from now on.
Note that BSET and BCLR only operate on memory. That is, you can't use them to set or clear
a bit in an accumulator. If you want to do this, you still can, simply by ORing the number in the
accumulator with a byte that has 1s where you want to set bits, or ANDing the number with a
byte that has 0s where you want to clear bits.
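Here is a small Python sketch of the OR-to-set and AND-to-clear idea, using made-up values so you can see the masks at work. The function names are hypothetical; on the HCS12 itself you would do the same thing with OR and AND instructions acting on the accumulator.

```python
def set_bits(value, mask):
    """OR with a mask that has 1s where bits should be set."""
    return (value | mask) & 0xFF

def clear_bits(value, mask):
    """AND with a mask that has 0s where bits should be cleared."""
    return value & mask & 0xFF

a = 0b01010000                  # pretend this is the accumulator
a = set_bits(a, 0b00000011)     # set bits 1 and 0, leave the rest alone
a = clear_bits(a, 0b10111111)   # clear bit 6, leave the rest alone
print(f"{a:08b}")
```

Notice that a 0 in the OR mask and a 1 in the AND mask both mean "leave this bit as it is."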
Instant Quiz
3. Consider the byte, 00001111. Suppose you want to set the first two bits and clear the last
two, and leave the other bits alone. Try doing this by, first, ORing the byte with
11000000 and then ANDing the result with 11111100. What do you get? Does it
accomplish what you want? What happens to the bits that you don't want to change?
One more note: when we say "the last instruction executed," we mean the last one that
affected the condition codes. Not all instructions can change the CCR bits.
Addressing
Almost all HCS12 instructions operate on one or more memory locations. The address in
which the data an instruction operates on is found is called the effective address (EA), and the
way in which the effective address is specified is called the addressing mode. Each instruction
has information in it that tells the HCS12 its addressing mode, and therefore how to figure out
the effective address of the data it operates on.
To make this clearer, remember the ldaa instruction to load Accumulator A with a number.
Also remember that there were two ways to say where the number to be loaded was located: the
instruction ldaa #22 meant load Accumulator A with the number 22, but the instruction ldaa 22
(without the # sign) meant load the accumulator with the number in address 22. These are two
different addressing modes. The effective address of the first addressing mode is just the memory
location right after the one in which the ldaa instruction was stored. For the second mode, the
effective address is just memory location 22.
The ldaa instruction has four different addressing modes, and each has a different machine
code, so essentially each is a different instruction. When you write down the 188 different types
of instructions such as ldaa and add up all the addressing modes for each, you get about 1,000
possible distinct instructions; this is the scary number that we mentioned earlier.
For now, we just need to look at four fairly basic addressing modes, and they happen to be
the simplest. They are
Immediate (IMM)  the number in the address immediately following the instruction is
                 the number to be used by the instruction. Immediate addressing is
                 indicated by a # sign before the number. This is the first kind of
                 addressing you've seen.
Extended (EXT)   (indicated by no # sign) the number following the instruction is the
                 address of the location in memory that contains the number to be used
Inherent (INH)   the affected address or register is implicit in the instruction, so no EA
                 (example: INCA)
Extended addressing requires two bytes to specify the 16-bit address in memory, so it takes
two fetches to get the full address. There is a special case of extended addressing that refers to
the first 256 locations in memory. It's called Direct Addressing (DIR). The nice thing about
Direct Addressing is that it only needs 1 byte to specify the effective address, so instructions
using Direct Addressing can be performed faster. Because it's faster, many of the addresses in the
first 256 locations are used for possibly time-critical I/O operations.
When you begin writing your code you don't have to worry about the difference between
extended and direct addressing because CodeWarrior automatically figures out which one you
are doing. For example, if you write
ldaa $00FF
CodeWarrior will figure out that the effective address is in the first 256 memory locations
($00 through $FF) and it will know that it should use the Direct Addressing form of the instruction to run
more efficiently. It's as if you had written
ldaa $FF
For practice, here are some typical instructions with the addressing mode and effective
address as comments:
ldab $1000   ; EXT, EA = $1000
ldaa $01     ; DIR, EA = $01
ldaa #255    ; IMM, EA = address right after the instruction
aba          ; INH, no EA
staa $2000   ; EXT, EA = $2000
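If you want to quiz yourself on addressing modes, the rules for the four basic modes are simple enough to write down as a short program. The Python sketch below is only an illustration: addressing_mode is a hypothetical helper, it understands only # for immediate values, $ for hexadecimal addresses, and plain decimal addresses, and it ignores the fact that not every instruction supports every mode.

```python
def addressing_mode(operand):
    """Classify an operand string as IMM, DIR, EXT, or INH."""
    operand = operand.strip()
    if operand == "":
        return "INH"                     # nothing follows the mnemonic, e.g. aba
    if operand.startswith("#"):
        return "IMM"                     # the number itself follows the opcode
    if operand.startswith("$"):
        address = int(operand[1:], 16)   # $ marks a hexadecimal address
    else:
        address = int(operand)           # otherwise assume a decimal address
    # Addresses in the first 256 locations fit in one byte, so they can be direct.
    return "DIR" if address <= 0xFF else "EXT"

for line in ["ldab $1000", "ldaa $01", "ldaa #255", "aba", "staa $2000"]:
    mnemonic, _, operand = line.partition(" ")
    print(f"{line:12s} {addressing_mode(operand)}")
```

The last line of the function is the same decision CodeWarrior is described as making when it sees an address such as $00FF.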
memory. The top shows a few addresses in RAM in which some data is stored ($2000, etc.).
Then, beginning at address $4000 is the program above. Remember that each address only stores
one byte. The first address ($4000) contains the first instruction, ldaa (IMM). Of course, what
actually appears there is the machine code, 10000110 ($86 in hexadecimal). Next is the number
to be loaded, $22. Since the number to be loaded is in address $4001, this is the Effective Address
for the load instruction.
The next instruction, in $4002, is adda (DIR). The address containing the number to be added
is address $01, so $01 is the Effective Address of the add instruction (not $4003). Because this is
direct addressing, only one byte is needed to tell the microcontroller where the number is.
The next instruction, in $4004, is staa (EXT). The address where the contents of A will be
stored is $2000, so that is the Effective Address. Note that $2000 is 16 bits long, so two bytes are
needed to store the Effective Address, in $4005 and $4006. Also notice that the high byte of the
address, $20, is stored first, then the low byte. This scheme of high byte first goes by the
charming appellation "big-endian." Intel processors use a "little-endian" scheme, in which the
low byte goes first.
Next is an ldaa (IMM) instruction. The EA for this is $4008. Finally, there is an inca
instruction, which has the same effect as adda #$01, but takes less time to run because it doesn't
have to fetch the number to add. The addressing mode of the inca instruction is Inherent (INH),
so there is no Effective Address.
Figure 2-8. Memory map for the sample code shown above. The numbers in $2000-
$2001 are random data.
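The difference between the two byte orders is easy to see by splitting a 16-bit address by hand. This is a small Python sketch with invented function names; it only illustrates the ordering and says nothing about how any particular assembler works internally.

```python
def big_endian(address):
    """Split a 16-bit address into [high byte, low byte], high byte first."""
    return [(address >> 8) & 0xFF, address & 0xFF]

def little_endian(address):
    """The same two bytes in the opposite order, low byte first."""
    return [address & 0xFF, (address >> 8) & 0xFF]

print([f"${b:02X}" for b in big_endian(0x2000)])
print([f"${b:02X}" for b in little_endian(0x2000)])
```

For the address $2000, the big-endian order puts $20 in the lower memory location and $00 in the next one, which matches the layout described for the staa instruction.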
        movb #10,$1000
        ldaa #0
loop:                   ; loop is a label
        adda $1000
        dec  $1000
        bne  loop
The first instruction moves the decimal number 10 to address $1000. We're going to use this
as a counter to do something ten times. Next, we load A with the number 0. We have to do this
because when the microcontroller first turns on there's liable to be anything in the accumulator,
and, for this example, we want to make sure we start with 0. The next statement, loop:, is a label.
We know this because it's followed by a colon. We could have just started it in the first
column to indicate that it's a label, but since you can't see where the first column is on the page
we'll guarantee that it's interpreted as a label by using the colon.
Let's skip the next two instructions for now and look at the bne instruction. This says,
"branch to where the label loop is if the previous instruction didn't result in a 0." That previous
instruction is the dec $1000 instruction; it subtracts 1 from the contents of address $1000, which
was originally loaded with the number 10. After it executes the dec instruction, the number in
$1000 is 9. Since this isn't 0, the program branches to the loop label. It then runs through the
instructions again and decrements $1000 again, resulting in the number 8 being in $1000. Since
this isn't 0 it branches again, and again, until it decrements $1000 from 1 down to 0. Now the
bne instruction sees that the last operation did result in a 0, so it doesn't branch to loop. Instead,
it just goes on to the next instruction in memory, whatever that is.
What the program is doing is just looping around ten times, but each time it goes through the
loop it adds the number in $1000 to the number in the accumulator. For the first loop it adds 10,
the next time around it adds 9, then 8, 7, ..., 1. It just adds the numbers from 1 to 10 to get 55, but
it does this by adding in the reverse order. This may seem like a rather mundane thing to do, but
it's the heart of a lot of useful activities. For example, you might want to take 10 readings from a
sensor and add them all together. Now you see why we had to initialize the accumulator by
loading it with 0. This is such an important task that it gets its own instruction, Clear A (CLRA).
(Of course there's a Clear B, Clear Interrupt Mask, Clear V Bit, and lots more.)
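If you would like to watch the loop run without a microcontroller, here is a rough Python model of it. It is a sketch under simplifying assumptions: memory is just a dictionary, only the Z bit is modeled, and run_loop is a name invented for this example.

```python
def run_loop(count):
    """Model the countdown loop; returns the final value of Accumulator A."""
    memory = {0x1000: count & 0xFF}                    # movb #count,$1000
    a = 0                                              # ldaa #0
    while True:
        a = (a + memory[0x1000]) & 0xFF                # adda $1000 (8-bit result)
        memory[0x1000] = (memory[0x1000] - 1) & 0xFF   # dec $1000
        z_bit = 1 if memory[0x1000] == 0 else 0        # dec sets Z when the result is 0
        if z_bit == 1:                                 # bne loop branches only while Z = 0
            return a

print(run_loop(10))
```

With a count of 10 the model adds 10, 9, 8, and so on down to 1. Because the accumulator is only 8 bits wide, a count large enough to push the sum past 255 would wrap around, which is why the model masks the sum with 0xFF.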
A few notes about the program:
1. Nothing has to be justified; the rules are that labels must either start in the first column
or be followed by a colon, and instructions must have at least one space (<ws>)
preceding them.
2. You can have blank lines between lines of code to make it more readable.
3. The instructions and comments are not case sensitive, although the labels are.
4. It's usually better to count down to zero than to count up to a number, because the CCR
keeps track of zeros for you and there are lots of branch instructions that use the Z bit.
Before moving on, there's a handy convention for describing the instructions in your
programs. Here are the rules:
A register name (e.g., A for Accumulator A) indicates both the register itself and
its contents
An arrow (→) indicates a transfer
( ) indicates the contents of a memory location
(( )) indicates an address whose contents are the actual address of the data (this
is used in something called Indirect Addressing, which you don't need to think about
for now)
Here are some examples:
$100 → A
($1000) → A
B → ($2000)
A + ($2000) → A
A + B → A
org ROMStart
ldaa #$22
staa $200
This tells CodeWarrior that every time you write ROMStart, it should replace it with the
number $4000 when converting your program to machine code.
Here's another example:
roomTemp: equ  20          ; roomTemp = 20 C
          ldaa #roomTemp
          ldab roomTemp
The equ directive says to CodeWarrior, "every time I write roomTemp, you use the number
20" (typical room temperature in Celsius). The second line loads Accumulator A with the number
20, and the third line loads Accumulator B with the number in address 20.
The big advantage in using this is that it makes the program self-documenting. What we
mean by this is that if you're looking at some long code and there are a bunch of 20s in it, you
don't know their significance. If you write the code as above, wherever you write roomTemp
you will know at some later time that you were talking about normal room temperature, and
every time you write 20 you're not; you're just writing the number 20.
Putting all this together, your code might look like this:
ROMStart: equ  $4000
counter:  equ  $1000        ; counter = $1000
          org  ROMStart
          movb #10,counter
          ldaa #0
loop:     adda counter
          dec  counter
          bne  loop
Notice that we used the label counter in three places, the movb, adda, and dec instructions.
In each it will be replaced with the number $1000 by CodeWarrior.
We should note two more points. The first is that when you open up CodeWarrior for the first
time you will notice that it already adds the first and third lines to the code for you, as a
convenience. If, for some reason, you wanted to start your code at, e.g., $5000, you would need
to add a separate org directive.
The second point is about the bne instruction. Branches use something called Relative
Addressing (REL). You don't need to do anything about this since CodeWarrior does it for you.
You do have to think about the Effective Address, though. Actually there are two EAs, one for
when it takes the branch and one for when it doesn't. If the microcontroller takes the branch in
the example above it jumps to the adda counter instruction. (The label doesn't appear anywhere in
memory; it's just used by CodeWarrior to figure out where to jump to, i.e., the Effective
Address.) For this example, the EA is the address containing the adda instruction. If it doesn't
take the branch the EA is just the next address after the one containing the bne instruction.
Now it's your turn:
Instant Quiz
6. Rewrite the code above to use Accumulator B as the counter. (Hint: you will need an aba
instruction.)
7. For the code you write in question 6, show how it is loaded into memory (like the one in
Figure 2-8). Indicate the addressing type and Effective Address (if any) for each
instruction.
2. ADCB
3. After the OR you get 11001111; after the AND you get 11001100. Yep. Nothing.
4. a) EXT, EA = $2000, adds 1 to the number in $2000
b) INH, no EA, adds 1 to the number in Accumulator A
c) DIR, EA = $00, ANDs the number in Accumulator A with the number in address $00
d) IMM, EA = address right after the instruction, ORs the number in Accumulator B with
the number %00000011
5.
        movb #10,$1000   ; 10 → ($1000)
        ldaa #0          ; 0 → A
loop:                    ; loop is a label
        adda $1000       ; ($1000) + A → A
        dec  $1000       ; ($1000) - 1 → ($1000)
        bne  loop
Address   Instruction     EA
$4000     ldab (IMM)      $4001
$4001     10 ($0A)
$4002     ldaa (IMM)      $4003
$4003     0
$4004     aba (INH)       N/A
$4005     decb (INH)      N/A
$4006     bne (REL)       $4004 or $4007
$4007
Homework
0. Read Understanding the Microprocessor Part 1 at
http://arstechnica.com/paedia/c/cpu/part-1/cpu1-1.html (follow the links at the bottom of
each page for the entire article).
1. The HCS12 microcontroller instruction set evolved from the original Motorola 6800
microprocessor. This device reputedly had a number of undocumented instructions, the
most important of which was the HCF instruction. Search the web and briefly report what
this instruction does.
2. Write a code fragment that adds the even numbers from 0 to 10 and stores the result in
memory address $2000.
3. Modify your code from problem 2 to increment Index Register X each time a number is
added. (You have to take some care to do everything in the right order or the branch
instruction won't work.)
4. Write a code fragment to load the numbers in addresses $2001 through $2004
successively into Accumulator A. Each time you load a number, increment the contents
of Index Register X if the number was zero. (Note that when you do a load, the Z-bit is
set or cleared depending on whether the number loaded is 0 or not.)
5. For the following code, explain which statements are instructions, which are assembly
directives, which are labels, and which are comments:
; this program adds 3 to address $2000 and increments Accumulator B
org $4000
ldaa $2000
adda #3      ; add 3 to (A)
staa $2000
incb         ; increment B
6. For each instruction in problem 5, indicate the addressing mode and effective address, if
any.
ANDA #%00000100
The result in the accumulator will be %00000100 if there was a 1 in bit 2 of the number
originally in A, but it will be %00000000 if the number originally in bit 2 was a 0. In the first
case the Z bit will be 0 (cleared); in the second the Z bit will be 1 (set). You just have to look at
the Z bit to see if the fire is on the second floor.
The only problem with this is that now if you want to test to see if the fire is on, e.g., the third
floor, you've destroyed the information in A. You would have had to store it somewhere and
then reload the accumulator to do this next test.
As an aside, if you wanted to test if the fire is on either the second or the fourth floor (or
both), you can test both floors at the same time with the instruction
ANDA #%00010100
Now the Z bit will be 1 only if both bits 2 and 4 are 0; if either one or both are 1 the Z bit will
be cleared. If the Z bit is 0 you have a fire on one or the other of these floors, or both, but you
still have the problem that the information in A has been destroyed, so you can't easily test other
floors.
Instant Quiz
3. How would you test if there is a fire on any of the floors?
The solution to the destruction of the data is to use a Bit Test instruction, BITA (or BITB for
Accumulator B). Here's the syntax:
BITA #%00010100
Now, the Z bit will be set or cleared as if you had done an ANDA, but the data is left
unchanged.
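The difference between the two instructions can be seen in a small Python model. This is only a sketch: the functions anda and bita are stand-ins that return the new accumulator value and the Z bit, and the sensor byte is a made-up reading in which bit 2 means a fire on the second floor and bit 3 a fire on the third.

```python
def anda(a, mask):
    """ANDA: A is replaced by A AND mask; Z is 1 when the result is zero."""
    result = a & mask
    return result, int(result == 0)

def bita(a, mask):
    """BITA: the same test, but A keeps its original value."""
    return a, int((a & mask) == 0)

sensors = 0b00001100                     # hypothetical reading: floors 2 and 3

a, z = anda(sensors, 0b00000100)         # test floor 2 with ANDA
print(f"after ANDA:  A = {a:08b}, Z = {z}")
a, z = anda(a, 0b00001000)               # the floor 3 bit is already gone
print(f"second ANDA: A = {a:08b}, Z = {z}")

a, z2 = bita(sensors, 0b00000100)        # test floor 2 with BITA
a, z3 = bita(a, 0b00001000)              # floor 3 can still be tested
print(f"after BITA:  A = {a:08b}, Z = {z2} then {z3}")
```

With ANDA, the second test wrongly reports no fire on the third floor because the first test wiped out that bit. With BITA, both tests see the original reading.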
The number used for the test is called a "mask." You can also use a mask stored somewhere
in memory if you like, as in
BITA $2000
which uses the number in address $2000 as the mask (you're using Extended addressing if you
do).
One thing to keep in mind is that you can't do bit tests on numbers in memory, just the
accumulators.
This has the same effect as ORing the bit with 1, but with just one instruction, and you don't
have to use the accumulator, so it's much more efficient.
Note the syntax, BSET address, mask. You list the target address first, then the mask with the
bits to be set, separated by a comma. A few remaining points:
1. You can set any number of bits you want with a single instruction, for example
BSET $1000, %01010101
2. You can only use BSET with a memory address; you can't use it with an accumulator,
so it's the opposite of a BITA instruction
3. You can't use a number in memory as a mask, as you can with a BITA instruction; you
can only use an Immediate mask. That is, the instruction above is equivalent to
BSET $1000, #%01010101
In fact, you can write it either way and CodeWarrior will understand it.
Figure 3-2. Functional representation of Port A. (DDR stands for "Data Direction
Register.")
To recapitulate, you can make any bit in Port A an output simply by writing a 1 to the
corresponding bit in the DDR. Then you can write either a 1 or a 0 to the bit in address $0000 to
make the actual output pin either 5V or 0V, respectively. You can make any bit in Port A
($0000) an input by writing a 0 to the corresponding bit in DDRA, but now the value of the
corresponding bit in address $0000 will be determined by whatever signal you present to the pin;
for example, connecting 5V to the pin will result in a 1 appearing in that bit in $0000.
You can, if you wish, change any bit or bits in Port A on the fly. That is, any time you want
in your program you can change a bit in the DDR from a 1 to a 0 (output to an input) or a 0 to a 1
(input to an output). Of course, you might want to be careful about changing a bit that was an
input to an output, and then sending 5V to it, particularly if it was connected to an external
voltage source.
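One way to keep the roles of the port and its DDR straight is to model them in a few lines of Python. This is a deliberately simplified sketch, not a description of the actual port hardware: read_port and its three arguments are invented for this illustration, and a pin level of 1 stands for 5V and 0 for 0V.

```python
def read_port(ddr, written, pins):
    """Return the byte seen at the port address in this simplified model.

    ddr     -- direction bits: 1 = output, 0 = input
    written -- the last value the program wrote to the port
    pins    -- logic levels applied to the pins from outside
    """
    outputs = written & ddr          # output bits reflect what was written
    inputs = pins & ~ddr & 0xFF      # input bits reflect the external signals
    return outputs | inputs

ddra = 0b11110000    # upper four bits are outputs, lower four are inputs
print(f"{read_port(ddra, 0b10100000, 0b00000101):08b}")
```

In the example, the upper four bits come from what the program wrote and the lower four come from whatever is connected to the pins.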
You might be wondering why Port A gets such a special place at the very bottom of the
memory map, in address $0000. You'll see as we go along that a lot of the I/O is placed in the
first 256 memory addresses ($0000 through $00FF). Remember that in Direct Addressing you
only address the first 256 addresses, and, because of that, you only need one byte to specify the
Effective Address. This makes accessing these memory locations faster, which might be useful if
you're trying to control or sense something quickly. It's actually pretty clever. If, in fact, you
want to use those addresses for something else, it turns out that you can reassign the addresses
associated with Port A by simply writing to another control register. We're not going to spend
any time thinking about that, but you can if you like.
By the way, how do you actually go about writing to the port and its DDR? Well, if you're
just going to change a few bits you can use BSET and BCLR instructions. If you want to change
the whole byte you could use a movb or load/store; it's up to you.
If you'd like to use Port B, it's attached to address $0001 and its DDR (DDRB) is at $0003.
Here's what the first few addresses in the memory map look like:
(1)
O_1 = I_1 \cdot \overline{I_2}, \qquad (2)
2 = 1 + 2 3 ,
(2)
3 = 0.
(4)
If you are familiar with Programmable Logic Controllers (PLCs), which are devices used for
industrial automation, the program to do this (called "ladder logic") is shown in Figure 3-4. If
you're not familiar with these devices, no matter; the important thing to know is that they're
expensive.
Figure 3-4. PLC ladder logic for the industrial control problem.
The question we would like to address is, how do we do this with a 50-cent microcontroller? You
start by connecting the inputs and outputs to pins on the parallel ports. Suppose you make Port A
all input bits and Port B all outputs. Then connect the inputs from your industrial system (i.e., I0
through I3) to bits 0 through 3 of Port A, and the outputs (O0 through O3) to bits 0 through 3 of
Port B. Now we have to write a program to sense the bits at Port A, figure out what the bits of
Port B should be, and then write those bits to the port.
Let's start with one of the outputs, O1 (bit 1 of Port B, which we'll start calling B1). Equation
(2) tells us that B1 will be 0 if either A1 is 0 OR A2 is 1. (Take a second and convince yourself
that this is correct.) Here's the prescription for figuring out B1:
1. Set up your DDRs (forgetting to do this is a major failure mode for beginners)
2. Initialize B1, usually by making it 0 (you usually don't want it turned on when you first
start up your program since it might turn whatever it controls on, possibly with
unpleasant consequences)
3. Read Port A into Accumulator A
4. Test bit A1; if it's 0 then B1 should be 0, so clear the bit and go back to step 3; if it's 1,
go on to test A2 in step 5
5. Test bit A2; if it's 1, make B1 = 0 and go back to step 3; if it's 0, make B1 = 1 (since you
know that A1 is already 1 or you wouldn't have gotten to this step)
6. Go to step 3 and start the whole process again so that you are continually sampling the
inputs and updating the outputs
If you like flowcharts, here's the one for this process:
Instant Quiz
5. Develop a flowchart for the problem 1 = 1 2. If you prefer, just write out the
prescription, as we did above.
Now, just take each piece of the flow chart and convert it to assembly code. Here's what you
get:
         bclr $02,%00001111   ; set up DDRs for Ports A, B
         bset $03,%00001111
         bclr 1,%00000010     ; Initialize Bit 1
loop:    ldaa $0              ; load Port A into Accumulator A
         bita #%00000010      ; Test if A1=1, if it isn't, branch
         beq  clearB1         ; to clearB1. If it's 1, continue to next test
         bita #%00000100      ; Test if A2=0, if it isn't,
         bne  clearB1         ; branch to clearB1. If it is 0, proceed to set B1.
         bset 1,%00000010     ; then go back to loop to
         bra  loop            ; start again
clearB1: bclr 1,%00000010     ; Clear Bit 1, then
         bra  loop            ; branch to "loop" to start again
Let's go through it a step at a time. The first three lines set up the DDRs for Ports A and B
and clear bit B1. Next, the loop label is where we want to branch to each time after we figure out
what B1 should be, so that we are constantly looking to see if the inputs change, and updating the
outputs when they do.
Inside the loop is where we do all the testing. First, we load the accumulator with the number
in Port A (address $0000). Then, we bit test bit A1. If it's equal to 0 we branch to the label
clearB1, which clears the bit and then branches to loop to continue testing. If A1 equals 1, then
we have the possibility that B1 should be 1, but only if bit A2 is equal to 0. So, we test A2 and
branch to clearB1 if it is not equal to 0. If it is equal to 0 then we have arrived at a point where
both A1 is 1 AND A2 is 0. This means that B1 should be 1. We set B1 and then branch to loop,
skipping the last lines that clear the bit.
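A quick way to convince yourself that the branches do what we want is to model one pass through the loop on a PC and try every input. The Python sketch below mirrors the two bit tests and branches; next_b1 is a hypothetical name, and the function simply returns the value B1 would end up with for a given Port A reading.

```python
def next_b1(port_a):
    """Follow the same branches as the assembly loop for one pass."""
    if (port_a & 0b00000010) == 0:    # bita #%00000010 then beq clearB1
        return 0                      # A1 is 0, so B1 is cleared
    if (port_a & 0b00000100) != 0:    # bita #%00000100 then bne clearB1
        return 0                      # A2 is 1, so B1 is cleared
    return 1                          # A1 is 1 and A2 is 0, so B1 is set

print("A2 A1 -> B1")
for a2 in (0, 1):
    for a1 in (0, 1):
        port_a = (a2 << 2) | (a1 << 1)
        print(f" {a2}  {a1} ->  {next_b1(port_a)}")
```

Only the row with A1 = 1 and A2 = 0 should give B1 = 1, which is the behavior the walkthrough describes.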
There are lots of different ways to do this, and this is a source of confusion for beginners.
You may be thinking to yourself, "how did he know to start by testing if A1 is equal to 1 rather
than testing if A1 was equal to 0?" Well, you can do that if you like, but you just have to switch
the positions of the "No" and "Yes" answers, and then you have to switch the beq and the bne.
It's a little like doing a double negative. This is one of the reasons why there are so many
possibilities. The thing to do as a beginner is to just dive in, write some code, then test your code
with some possible combinations of A1 and A2. If it works with all possible variations of A1 and
A2, you're good. If not, just back out a little to see where it went wrong. If you practice enough
you will get some confidence that you can get to the right answer.
Once you get good at it the trick is to find the way that is efficient enough to satisfy your
application requirements (fast enough, smallest memory usage, or whatever) while not taking
forever to figure it out.
Some (possibly) Helpful Hints
Before leaving this topic, here are some helpful hints that can speed up your programming by
helping you test several bits at a time in some circumstances. Feel free to ignore them if they
seem overly complicated. You can always do things in a "brute force" way by testing each bit
one at a time.
The first is to use ORs or NORs as much as possible. Here's an example of why this is so:
Suppose you want to do B1 = A1 + A2 + A3 (essentially this is a 3-input OR gate). You
could do this by the following prescription:
1. Read Port A by loading the result into Accumulator A
2. Test A1. If it's 1, make B1 = 1 and go to step 1 to see if the number at Port A has changed
3. If A1 ≠ 1, test A2. If A2 = 1, make B1 = 1 and go to step 1 again
4. If A2 ≠ 1, test A3. If A3 = 1, make B1 = 1. If A3 ≠ 1, the number at Port A has failed all
three tests, so make B1 = 0. Then go back to step 1 to continue the process.
You can see that testing the number (A3, A2, A1) one bit at a time can be a long process,
both in coding time and in execution time when the program is running in the microcontroller.
Here's an easier way:
loop:    ldaa 0
         bita #%00001110
         beq  clearB1
         bset $1,%00000010
         bra  loop
clearB1: bclr $1,%00000010
         bra  loop
What is happening is that the bit test at line 2 tests all three bits at once. If the result is zero, it
means that none of the three bits was a 1, so branch to clearB1 to clear the bit. If one or more of
the bits were 1, the result of the bit test would be nonzero, so the beq test fails and the next instruction
executed is the bset, followed by the branch to loop.
Notice that we put the label clearB1 on the same line as the bclr instruction. You can do it
either way, but doing it this way cleans up your code a bit. It doesn't affect how CodeWarrior
converts the assembly code into machine code. We did the same thing with the loop label and the
ldaa instruction.
This works just as well for a 3-input NOR gate. For an OR the answer (B1) was 1 if any of
the inputs was 1. Now, you want B1 = 0 if any of the inputs is 1. All you have to do is switch the
beq for a bne.
The second hint follows from the first: try to convert your problem into ORs or NORs if
possible. Here's an example of how it might work:
Suppose you want to make a 3-input AND gate,
B1 = A1 · A2 · A3.
Now, B1 = 1 only if all three bits are 1, so you can't use the trick of testing all three bits at
once because you will get a false positive if any of the bits are 1 while either or both of the other
two are 0.
Well, you can still use the trick if first you apply DeMorgan's theorem. (You remember
DeMorgan's theorem from your digital electronics course, right? Right???). DeMorgan's
theorem says
\[ \overline{B1} = \overline{A1 \cdot A2 \cdot A3} = \overline{A1} + \overline{A2} + \overline{A3}. \]
Now, if any of the complemented A's is 1, B1 should be zero. Take a second and convince yourself
that this is true.
This means that you just need to load the Accumulator with the contents of Port A, then
complement this number, and then do the test (being sure to choose the right kind of branch
instruction). You can do the complement with a coma instruction. Here's the full code:
loop:     ldaa  0
          coma
          bita  #%00001110
          bne   clearB1
          bset  $1, %00000010
          bra   loop
clearB1:  bclr  $1, %00000010
          bra   loop
Notice the coma instruction right after the ldaa. Also notice that we switched the beq
instruction in the original code for a bne to make the logic work out correctly. If you don't
believe that it works, try all eight possible combinations of (A3, A2, A1) and step through the
code for each.
Now suppose you have something a little more complicated, such as this
\[ B1 = A1 \cdot \overline{A2} \cdot A3. \]
You can't just apply DeMorgan's theorem by complementing both sides because it would
give you this
\[ \overline{B1} = \overline{A1 \cdot \overline{A2} \cdot A3} = \overline{A1} + A2 + \overline{A3}, \]   (5)
which you can't test for with a simple bit test due to the middle term on the right.
The problem is that you can't flip individual bits with the coma instruction. You can solve
this problem using a simple trick that you might have learned in digital electronics. To see how it
works, first note that for any bit, A,
\[ A \oplus 0 = A, \]
\[ A \oplus 1 = \overline{A}. \]
This is how you make something called an optional inverter. You can optionally flip a bit by
Exclusive-ORing it with 1, or not flip it by Exclusive-ORing it with 0. Now you just apply it to
all the bits of a byte. For example, suppose you Exclusive-OR the byte
A = (A7, A6, A5, A4, A3, A2, A1, A0)
with the Mask
M = (0, 0, 0, 0, 0, 1, 0, 0).
Only bit 2 is flipped: the result is (A7, A6, A5, A4, A3, \overline{A2}, A1, A0).
This means you can flip any bit(s) you want just by putting 1s in the appropriate position(s)
in the mask.
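As a small illustration of the optional inverter, the fragment below flips selected bits of a byte read from Port A. It is only a sketch; the address 0 for Port A follows the earlier examples, and the particular masks are arbitrary choices.

```asm
          ldaa  0                ; A = (A7, ..., A0), read from Port A
          eora  #%00000100       ; flip bit 2 only; every other bit is unchanged
          eora  #%10000001       ; now flip bits 7 and 0 as well
          eora  #%00000100       ; flipping bit 2 a second time restores it
```

Because Exclusive-ORing with the same mask twice undoes the flip, the same instruction serves as both the "invert" and the "un-invert" step.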
Instant Quiz
6. Use an Exclusive-OR instruction to do the same thing as a coma instruction. (The syntax
for an Exclusive-OR with Accumulator A using Immediate Addressing is EORA #mask,
where the mask is an 8-bit number.)
Now let's apply this to our original problem (Eq. 5). We just complement Accumulator A as
we did before, but before we do the bit test, we flip bit A2. Here's the code:
loop:     ldaa  0
          coma
          EORA  #%00000100       ; flip bit A2
          bita  #%00001110       ; do the rest of the code as before
          bne   clearB1
          bset  $1, %00000010
          bra   loop
clearB1:  bclr  $1, %00000010
          bra   loop
Again, you may want to work through the code with some sample bit patterns for A1, A2,
and A3 to see how it works.
6. eora #%11111111 ; (or you could just use eora #$FF or eora #255).
Homework
0. Go to http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=CWHCS12X&fpsp=1&tab=Design_Tools_Tab and download the Special version of
CodeWarrior HCS12(X). It's free. Load it onto an available PC and try running some of
the code from the first laboratory exercise on it.
Also, go to
http://www.freescale.com/files/microcontrollers/doc/ref_manual/CPU12RM.pdf and
download the HCS12 Reference Manual. It's 414 pages long. Do not be afraid.
1. Write a code fragment to make the even-numbered bits of Port A all inputs and the
odd-numbered bits all outputs.
2. Make a three-input AND gate: Write the code to make PORT A bits 0-2 all inputs and
PORT B bit 1 an output, then add the code to set PORT B bit 1 if bits 0 AND 1 AND 2 of
PORT A are all set, and clear the bit otherwise. Don't forget to include the appropriate
ORG statement, and loop your code so that it repeatedly tests the inputs.
3. Repeat problem 2 for a 3-input NOR gate.
4. Write the code to realize an Exclusive-OR gate (recall that
\( A \oplus B = A \cdot \overline{B} + \overline{A} \cdot B \)). (Hint:
start by writing the flow chart.)
N.B. If you would like to test your code with your newly installed CodeWarrior, use some
memory location (e.g., $1000) as your input instead of Port A. Then start by moving 0 to $1000
and put your code in a continuous loop that increments $1000 starting from 0.
Indexed Addressing
Indexed Addressing uses the contents of the X or Y Index Registers as a base, to which an
offset is added to get the effective address. (Indexed Addressing can also use the Stack Pointer or
Program Counter, but for now we just need X and Y.)
There are three types of Indexed Addressing that we need to know about:
Constant offset indexed addressing
Auto pre/post decrement/increment indexed addressing
Accumulator offset indexed addressing
Constant Offset Indexed Addressing
In Constant Offset Indexed Addressing a constant is added to the number in the base (the X
or Y registers) to get the EA. The number in the base is not changed; it's just used for the
calculation of the EA.
The syntax is
          instruction  offset, base register
Here's an example of how you would use it:
          ldaa  3, X
Suppose the current number in the X Register is $2000. The effect of this instruction is to
load Accumulator A with the contents of address $2003 (the offset, 3, plus the contents of the X
Register). After the instruction is executed, the number in X is still $2000. (You'll see in a bit
that with some indexed addressing modes the content of the base changes.)
Actually there are three different kinds of Constant Offset Indexed Addressing, characterized
by how many bits are needed to specify the value of the offset, which can be 5, 9, or 16 bits. One
of the reasons for having three different kinds is efficiency. If you just need a 5-bit offset you
don't need to fetch two bytes to execute the instruction. You do need two bytes for a 16-bit offset.
The offset is a 2's complement signed number. For example, with a 5-bit signed number you can
represent an offset ranging from -16 to +15.
The size of the offset only affects the size and execution time of the code, so you don't have
to worry about it for now. We'll revisit the issue of execution time later.
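To give a feel for the three offset sizes, here is a short sketch. The offsets are arbitrary examples, and the instruction lengths in the comments are what the ldaa entry of the CPU12 Reference Manual lists for the IDX, IDX1, and IDX2 modes. The assembler is assumed to pick the shortest form that fits.

```asm
          ldx   #$2000
          ldaa  3, X             ; fits in 5 bits, 2-byte instruction, EA = $2003
          ldaa  -16, X           ; most negative 5-bit offset, EA = $1FF0
          ldaa  100, X           ; needs a 9-bit offset, 3-byte instruction, EA = $2064
          ldaa  $1000, X         ; needs a 16-bit offset, 4-byte instruction, EA = $3000
```

In every case X still holds $2000 afterwards; only the amount of machine code differs.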
So (you are probably asking), what good is it? Well, suppose you've just taken some data
from a sensor and the data resides in addresses $2001 through $2100. Now you would like to add
all these numbers (for example, to take an average). The only way to do this with the tools you
have so far is to just do brute force addition:
          clra
          adda  $2001
          adda  $2002
          adda  $2003
          :
          adda  $2100
This can make for a lot of typing and it uses up a lot of memory. With Constant Offset
Indexed Addressing you can do it this way:
          clra                   ; clear A
          ldx   #$100            ; $100 -> X
continue: adda  $2000, X
          dex                    ; decrement X
          bne   continue         ; continue adding until (X) = 0
The first two lines initialize A and load the X Register with $100. The rest is a loop,
between the label continue and the bne instruction. This loop does three things: first, it adds the
contents of the address $2000 + X to whatever is in A. Next, it decrements X. Finally, it tests to
see if the decrement resulted in a 0, and branches to continue if it didn't.
Let's go through a few cycles of this loop:
The first time around the number in X is $0100, so the adda instruction adds the
contents of address $2000 + $0100 = $2100 to A, then it decrements X, so that now
the number in X is $00FF (remember, we're subtracting in hexadecimal). Then it
tests to see if $00FF is zero, which it isn't, so the program branches to continue,
and does the next loop.
The second time around the number in X is $00FF, so the adda instruction adds
the contents of address $2000 + $00FF = $20FF to A, then it decrements X, so that
now the number in X is $00FE. Then it tests to see if $00FE is zero, which it isn't, so
the program branches to continue, and does the next loop.
The third time around the number in X is $00FE, so the adda instruction adds
the contents of address $2000 + $00FE = $20FE to A, then it decrements X, so that
now the number in X is $00FD. Then it tests to see if $00FD is zero, which it isn't, so
the program branches to continue, and does the next loop.
The program will continue until the X register contains the number $0001, and the number in
$2001 has just been added. Now when it does the dex, the number in X is 0, so the program
doesn't take the branch to continue; instead it continues on with the next instruction.
Instant Quiz
4. Suppose Port A is connected to a sensor that produces a constant stream of 8-bit numbers
that represent the analog signal it is sensing. Write a program that takes $100 readings
from Port A and stores them in addresses $2001 through $2100. You can store the
numbers in any order you like; that is, you don't have to put the first one in $2001, the
second in $2002, etc. You could, if you wish, put the first number in $2100, the
second in $20FF, and so on.
Auto Pre/Post Decrement/Increment Indexed Addressing
In the example above, it would be useful and faster if the dex instruction were done
automatically. This is the motivation behind the next addressing mode, which goes by the
odd-sounding title, Auto Pre/Post Decrement/Increment Indexed Addressing.
The reason it sounds odd is that it's really four versions of the same idea. The four are
Auto Pre-decrement Indexed Addressing
Auto Pre-increment Indexed Addressing
Auto Post-increment Indexed Addressing
Auto Post-decrement Indexed Addressing
The way it works is that the offset is added (increment) or subtracted (decrement) to or from
the contents of the base register (e.g., X) but now the contents of the base register are changed.
You indicate an increment by putting a + sign next to the base register and a decrement by
putting a - sign next to the base register.
The pre- or post- refers to whether the Effective Address is calculated before or after the
instruction is executed. If you want to increment or decrement the base register before the
instruction is executed you put the + or - sign to the left side of the base register (e.g., +X). If
you want to do it after the instruction is executed you put the +/- sign on the right side (e.g., X+).
Here are some examples (suppose before each instruction is executed, the contents of X and
Y are $2000 and $2100, respectively). Note that the CPU12 only accepts adjustments of 1 through 8
in this addressing mode:
          ldaa  8, +X            ; $2000+8 -> X; ($2008) -> A
          staa  1, -Y            ; $2100-1 -> Y; A -> ($20FF)
          dec   3, X+            ; ($2000)-1 -> ($2000); $2003 -> X
          adda  1, Y-            ; ($2100) + A -> A; $20FF -> Y
The first is an example of auto pre-increment. The number 8 is added to the number in X
and the sum ($2008) stored in X. This number is the Effective Address for the load instruction.
In the second, the number 1 is subtracted from the number in Y and the result is both the new
number in Y and the address where the contents of A are stored.
In the next two instructions, the offset is added to or subtracted from the base register after
the instruction is executed. In the first of these the effective address is $2000, since the increment
isn't performed until after the dec instruction is executed. That means the number in address
$2000 is decremented by 1, then the offset, 3, is added to X. In the last instruction, the number in
$2100 is added to A and then 1 is subtracted from the number in Y.
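To see how the base register moves from one instruction to the next, here is a short trace written as code. It is only a sketch: the starting value $2000 and the particular offsets are arbitrary, and each comment gives the Effective Address and the value left in X.

```asm
          ldx   #$2000
          ldaa  1, X+            ; EA = $2000, then X = $2001 (post-increment)
          ldaa  1, X+            ; EA = $2001, then X = $2002 (post-increment)
          ldaa  2, -X            ; X = $2000 first, then EA = $2000 (pre-decrement)
          staa  1, X-            ; EA = $2000, then X = $1FFF (post-decrement)
```

Unlike the examples above, these four run one after another, so each starts from the X left by the one before.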
Instant Quiz
2. For each of the examples below, find the addressing mode, the Effective Address, and the
content of the X Register when the instruction is completed. Assume the number in the X
Register is $3000 before the instruction is executed.
a) ldaa  3, X+
b) staa  3, -X
c) adda  5, X+
d) inc   5, X-
Now we're going to do an example of how you might use this kind of addressing. It uses a
new instruction that you haven't seen, cpx. What this does is to compare the number in the X
Register with another number by calculating the difference. If they're the same it sets the Z bit
because the difference is zero. If not, it clears it. Note that it doesn't change the number in X.
We're using it here to count up to a number rather than counting down to zero.
Here's the code:
          ldx   #$2001
          ldy   #$2101
continue: ldaa  1, X+
          suba  #20
          staa  1, Y+
          cpx   #$2101
          bne   continue
What does it do? The first two instructions load $2001 and $2101 into X and Y, respectively.
They will be our counters. Here's what happens in the first few cycles of the loop:
root of 2. We can't save it as 1.4 because we can't store decimal fractions, only integers. To get
around this we'll store it as the integer 14.
Calculating square roots is a pretty complicated thing to do in a microcontroller, or even a
microprocessor with floating point arithmetic for that matter. A much easier way to do it is to use
a lookup table, which stores the answer for the range of numbers we're interested in.
For simplicity, suppose the sensor just sends 4 bits, so that we need a table with 16 possible
entries. Here's the table we need to store:
What you see are the 16 possible 4-bit numbers we can get from the sensor. The middle
column is the decimal square root of each, and the right column is the integer representation of
each.
Now let's set up the lookup table in memory with a bunch of movb instructions. (We'll see in
the next chapter a much easier way to do this.)
          movb  #00, $2000
          movb  #10, $2001
          movb  #14, $2002
          movb  #17, $2003
          :
          movb  #39, $200f
and now add the code to use Accumulator Offset Indexed Addressing to pick the number in
memory ($2000, etc.) that corresponds to the square root of the number at Port A:
          ldx   #$2000
          ldab  $00
          ldaa  B, X
The ldx instruction makes the base equal to $2000. The next instruction loads Accumulator B
with the number at Port A (address $00). Finally, the last instruction loads A with the number in
the address pointed to by the sum of the numbers in B and X.
Here's how it works. Suppose the number at Port A is 3. This is loaded into B in the second
instruction. The third instruction loads A with the number in address 3 + $2000 = $2003. This is
the number 17, the integer representation of the square root of 3.
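If only the low four bits of Port A are wired to the sensor, the upper bits could hold anything, and the lookup would then read past the end of the 16-entry table. Here is a minimal sketch of one way to guard against that. It assumes the sensor occupies bits 0 through 3 and that Port B (address $01) is set up as an output to show the result; neither assumption comes from the example above.

```asm
          ldx   #$2000           ; base of the 16-entry table
          ldab  $00              ; read Port A
          andb  #%00001111       ; keep only the four sensor bits (0 through 15)
          ldaa  B, X             ; A <- table entry at $2000 + B
          staa  $01              ; put the result on Port B (assumed output)
```

The andb instruction forces the offset into the range 0 through 15, so the Effective Address always lands inside $2000 through $200F.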
Instant Quiz
3. Find the effective address of the ldaa instruction in the following code:
          ldab  #$A
          ldy   #$5000
          ldaa  B, Y
A Last Note
Finally, you can do some pretty sophisticated stuff, like this
movb 3, X-, 5, Y
CodeWarrior will figure out from the locations of the commas and the minus sign the unique
addresses of the source and target for the move instruction. In this example, 3, X- can only
indicate Auto Post-decrement Indexed Addressing for the source. The Effective Address for the
source is the number in X, and 3 will be subtracted from the number in X as the instruction is
completed. For the target of the move, 5, Y can only indicate Constant Offset Indexed
Addressing; the address of the target of the move is 5 plus the number in Y, and Y doesn't
change. You would characterize the addressing mode of this instruction as
Auto Post-decrement IDX/Constant Offset IDX.
This can clean up your code a bit. Suppose you want to take a succession of readings from
Port A and store each result sequentially, starting at $2100. Here's how you might do it:
          ldx   #$2000
          ldy   #$2100
loop:     ldab  $00
          movb  B, X, 1, Y+
          bra   loop
          movb  #$0, $2          ; make all bits of Port A inputs
          ldx   #$100            ; $100 -> X
continue: ldaa  $00
          staa  $2000, X
          dex
          bne   continue
Homework
0. Read Chapter 3, Sections 3.1-3.7 and 3.9-3.10 of the CPU12 Reference Manual that you
downloaded previously.
1. For the following code, indicate the addressing mode, the effective address for each
instruction, and the value in the Index Register, X, after the instruction is completed:
          org   $4000
          ldab  #3
          ldx   #$2015           ; load X immediate with $2015
          staa  2, X
          staa  6, -X
          staa  2, X+
          staa  B, X
2. Using whatever form of indexed addressing you like, write the code to load the numbers
1 through $F into the addresses $2001 through $200F, respectively.
3. Write a code fragment to do the following: suppose the number at Port A is N. Load the
number in Port A into Accumulator A, then increment the number in address N + $1000.
For example, if the number is 4, increment the number in address $1004. Repeat 100
times.
From Figure 5-5 we see that oprx16 is "Any label or expression that evaluates to a 16-bit
value." Also, xysp means any of the 16-bit registers, X, Y, S (Stack Pointer), or the Program
Counter.
Figure 5-3. The LDAA page from the HCS12 Reference Manual.
Figure 5-4. Section 6.3 of the Reference Manual showing the symbols indicating effects
of instructions on the CCR bits.
The addressing mode of this instruction is just Constant Offset, using oprx16 as a 16-bit
offset and the number in the X, Y, SP, or PC register as the base. Since the offset is 16 bits, the
addressing mode is IDX2. If you couldn't figure this out, the next column, Address Mode, tells
you what the addressing mode is. You may sometimes need to look at this a little carefully to see
what addressing modes are available. For example, a movb instruction does not have a Direct
(DIR) addressing mode. It does have an Extended mode, but it's a little less efficient due to the
extra address byte fetch.
Figure 5-5. Partial list of source forms from Section 6.5 of the Reference Manual.
Instant Quiz
1. The LSR instruction (Logical Shift Right) shifts all the bits in a memory location to the
right, moving the least significant bit (LSB) into the C bit in the CCR and loading a 0
into the most significant bit (MSB). It's useful for, among other things, converting a
parallel byte to a serial bit stream for transmission over a serial data link. Search the
HCS12 Reference Manual that you downloaded and find
a) What addressing modes are supported by the instruction?
b) How are the CCR bits affected?
The next column to the right of the Address Mode is the actual machine code (Object Code)
of the instruction in hexadecimal. For an immediate load (the top row), you see "86 ii". You might
remember that $86 (= %10000110) is the machine code for Load Immediate. To see what "ii"
means, you look in another section in Section 6. In Section 6.4, Object Code Notation, you will
find a listing (reproduced in Figure 5-6) of all the possible symbols you can have for the
argument of the instruction. In the figure you will see that "ii" means "8-bit immediate data
value." This is just the 8-bit number to be loaded into A. If, instead of Immediate Addressing,
you were doing extended, you would have
          B6 hh ll
in which B6 is the machine code for Load Extended, "hh" is the high byte of the 16-bit
address containing the number to be loaded into A, and "ll" is the low byte. (The last symbol is two
lower case els, not two upper case i's.)
ldaa #$3 takes one bus cycle (P), whereas ldaa $3000 takes three (rPf). If your bus is running at
1 MHz the first will take 1 µs; the second will take 3 µs.
Shift Operations
In Instant Quiz Question 1 we saw an example of a shift operation. There are actually 21
different kinds of shift operations, summarized in Figure 5-8, which is taken from Table 5-12 of
the Reference Manual. These 21 operations break down into three categories: Logical Shifts,
Arithmetic Shifts, and Rotations. Each of these breaks down into two types: shifts to the left
and shifts to the right. Finally, each type can be applied to a number in a memory location or one
of the 8-bit accumulators, A or B, and the first two, logical and arithmetic, can be applied to the
Double Accumulator, D.
Figure 5-8. HCS12 shift operations. (Table 5-12 of the HCS12 Reference Manual.)
Let's look at the first category, logical shifts. We've already seen a Logical Shift Right (LSR)
in the quiz. In this kind of shift the least significant bit is shifted into the C bit and a zero is
shifted into the most significant bit. You can see this in the figure. You can also shift left, in
which case a zero is shifted into the least significant bit and the most significant bit is shifted into
C. You can do the same thing to Accumulator A with an LSRA, Accumulator B with an LSRB, or
Double Accumulator D with an LSRD. Of course, you can also shift each of these to the left with
an LSL, LSLA, LSLB, or LSLD instruction.
Logical shifts, as we mentioned, are useful in converting back and forth between parallel and
serial representations of the data. Every byte sent between your computer and a remote server
over a network is first converted into a serial bit stream at the source and then converted back at
the destination by just this kind of process.
The next category is that of the arithmetic shifts. These, too, can shift left or right, and can
shift the contents of a memory location or an accumulator. Arithmetic shifts to the left look exactly
like logical shifts left. This might strike you as an odd thing to do, but what is odder is what happens
in arithmetic shifts to the right. For these shifts, the least significant bit is again shifted into the C
bit, but the most significant bit is shifted back into itself.
To see why they do such a strange thing, let's apply a logical shift right to the
number %00000110 (= 6₁₀). The result is %00000011 (= 3₁₀). Shifting to the right is equivalent to
dividing by 2. Shifting to the right twice is equivalent to dividing by 4, and so on. Similarly,
shifting to the left is equivalent to multiplying by 2.
Here's where the problem occurs: suppose the number to be shifted is %10001100 (= -116₁₀).
When you do a logical shift right, you get %01000110 (= +70!!!). What has happened is that the
1 in the most significant bit has been shifted to the right, turning a 2's complement negative
number into a positive number.
To get around this problem, the Arithmetic Shift Right instruction shifts the low 7 bits to the
right, shifting the lowest bit into the C bit, but then it shifts what was the most significant bit
back into itself, preserving the original sign of the number, as shown in the figure. For the
example above (%10001100 = -116₁₀), after the shift we get
%11000110 = -58₁₀,
which is correct.
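The two right shifts can be compared side by side with a few lines of code. This is just a sketch using Accumulator A and the same test value as above; the comments give the result of each step.

```asm
          ldaa  #%10001100       ; A = -116
          lsra                   ; logical shift:    A = %01000110 (+70), C = 0
          ldaa  #%10001100       ; reload -116
          asra                   ; arithmetic shift: A = %11000110 (-58), C = 0
```

In both cases the bit shifted out into C is the original bit 0, which is 0 here; the only difference is what goes into bit 7.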
We should note three things about arithmetic shifts. First, when using shifts to the right as
division by 2, you get a rounding error whenever the least significant bit is a 1. For example,
shifting %00000011 (= 310) to the right once gives %00000001 (= 1, not 1.5). Second, arithmetic
shifts to the left do not preserve the sign of the answer, so you can't use them to multiply negative
numbers by 2, 4, etc. Third, you can arithmetically shift the Double register D left, but not right.
Instant Quiz
3. a) For the binary number %11001000, find the result of two successive arithmetic shifts
to the right. Is the answer equal to the original number divided by 4?
b) For the same number, find the result of two arithmetic shifts to the left.
The final kind of shift in this group is the rotate. From Figure 5-8 you can see that you can
rotate the number in a memory location or Accumulators A and B, and you can rotate right or
left. If you are rotating the number right, the least significant bit (b0 in the figure) is shifted into
the C bit, bits b7 through b1 are shifted right one position, and the number that was in the C bit
before the shift is shifted into the b7 position. If you executed, for example, a rora nine times, the
number in Accumulator A would just be the one you started with.
Rotate left works the same way, except the most significant bit is shifted into the C bit and
the value that was in the C bit is shifted into the least significant bit of the memory or
accumulator.
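Because a rotate pulls the old C bit in at one end, it can be chained after a shift to handle numbers wider than 8 bits. The sketch below shifts a 16-bit number in memory one place to the left. The addresses are assumptions made only for this example: the high byte is taken to be at $2000 and the low byte at $2001.

```asm
          lsl   $2001            ; shift the low byte left; its old bit 7 goes into C
          rol   $2000            ; rotate the high byte left; C comes in as its bit 0
```

After the pair has executed, C holds what was the most significant bit of the 16-bit number.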
You can do some pretty neat things with the rotation instructions. One of them is shown in
Figure 5-9. This device is known as a Linear Feedback Shift Register (LFSR). It's also known
as a pseudo-random pattern generator, for reasons that will soon be obvious.
Instant Quiz
4. Suppose the seed number in the LFSR is %11001011. What will be the numbers in the
shift register after the first shift? After the second and third? After each shift, what will be
the value of the C bit in the CCR?
It turns out that this is a very useful thing to do. In fact, you do it all the time without
knowing it. Every time you get a web page from a server that isn't very close to you (like on the
same campus) your request and the information returned from the server are scrambled using this
technique. The reason for scrambling the data is that the clocks in your computer and the
computer at the other end don't run at exactly the same rate. At the receiving end of any
transmission the receiver circuitry is trying to sample each incoming bit at just the right time to
tell if it's a 1 or a 0. If its clock is running at a slightly different rate it will eventually
miss a bit or sample the same bit twice. The way the receiver gets around this is to use the
transitions in the incoming data (i.e., from 1 to 0 or 0 to 1) to re-sync its own clock to that of the
sending circuit.
The problem is, what happens when you send a very large number of 1s in a row (or 0s in a
row)? The receiver has no transitions to sync up to and you will start getting errors eventually.
The solution is to run your data through an LFSR. This will start generating transitions quickly as
the Exclusive-OR sees a string of 1s (or 0s) in a row. This technique is used in a communications
protocol named SONET (for Synchronous Optical Network) and almost all of your long-distance
communications use this protocol.
Equate directives are particularly useful for self-documenting your program so that you can
remember what you did when you look at it months or years later. For example, in the equ
example above, you can see that you are loading the normal room temperature, 20 °C, into A. If
you just wrote ldaa #20 you might miss the significance of the instruction.
To see how this can really help, let's recall the code that we wrote in Chapter 3 to implement
\( B1 = A1 \cdot \overline{A2} \):
          org   $4000
          movb  #$ff, 3          ; set up DDRs for Ports A, B
          movb  #0, 2
loop:     ldaa  0                ; load Port A into Accum A
          bita  #%00000010       ; Test if A1=0, if it is,
          beq   clearB1          ; branch to clearB1. If
                                 ; it's 1, continue to next test
          bita  #%00000100       ; Test if A2=1, if it is,
          bne   clearB1          ; branch to clearB1. If
                                 ; it's 0, proceed to set B1.
          bset  1, %00000010     ; then go back to loop
          bra   loop             ; to start again
clearB1:  bclr  1, %00000010     ; Clear Bit 1, then
          bra   loop             ; branch to "loop" to start again
Remember that this code looks at A1 and clears B1 if it's 0, but if it's not, it goes on to look
at A2. If A2 = 0, then it sets B1, otherwise B1 is cleared. You don't really see what it's doing
unless you read it carefully and recognize a bunch of things such as that address 0 is Port A and
address 2 is DDRA, and so on.
Now, let's add a few equate directives at the beginning of the code:
A1:        equ   %00000010
A2:        equ   %00000100
BIT1:      equ   %00000010
PORTA:     equ   $00
PORTB:     equ   $01
DDRA:      equ   $02
DDRB:      equ   $03
ROMStart:  equ   $4000
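With those equates in place, the earlier loop might be rewritten along the following lines. This is a sketch that relies only on the labels defined just above; the instructions themselves are the same ones used in the original version.

```asm
          org   ROMStart
          movb  #$ff, DDRB       ; make all of Port B outputs
          movb  #0, DDRA         ; make all of Port A inputs
loop:     ldaa  PORTA            ; read the inputs
          bita  #A1              ; is A1 = 0?
          beq   clearB1          ; yes, so B1 must be 0
          bita  #A2              ; is A2 = 1?
          bne   clearB1          ; yes, so B1 must be 0
          bset  PORTB, BIT1      ; A1 = 1 and A2 = 0, so set B1
          bra   loop
clearB1:  bclr  PORTB, BIT1
          bra   loop
```

The machine code produced is identical; the assembler simply substitutes each label's value before translating the instruction.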
remember all the mnemonics with their idiosyncrasies, or you can blithely type in your best
guess and see if CodeWarrior complains.
Now let's look at another useful assembly directive, Define Constant. This is used to store
constants, usually in ROM. There are four versions, each for a different bit size for the constant
being stored:
          DC.B   Defines one or more one-byte constants
          DC.W   Defines one or more word-size (16-bit) constants
          DC.L   Defines one or more long-word (32-bit) constants
          DCB    Allocates a block of memory
If no size is specified, byte-length constants are assumed. Also, the dc directive is not case
sensitive. The syntax is (for one-byte constants),
          dc.b  byte1, byte2, ...
If you like, you can associate a label with the constants. For example, if you are storing a
bunch of sine data in a table
You can put the table anywhere in your code you like, in which case it will occupy the
addresses immediately after the instruction above it. Alternatively, you can tell CodeWarrior
where you want it:
          org   $5000
sines:    dc.b  sin(1), sin(2), ...
The Boolean expressions for each output, B0, B1, ..., in terms of the inputs, A0, A1, ..., are
\[ B0 = (A1 + A3) \cdot A2, \]   (1)
\[ B1 = A1 \cdot \overline{A2}, \]   (2)
\[ B2 = A1 + A2 \cdot A3, \]   (3)
\[ B3 = A0. \]   (4)
Let's begin by constructing the truth table for every B in terms of the A's. We can combine all
four truth tables (one for each B) into a single table. There are 16 entries for each B output. The
top half of the table is shown in Table 5-1, below.
Table 5-1. Truth table for industrial control problem.
To see how this table was constructed, look at the next-to-last row. For the inputs
(A3, A2, A1, A0) = (0, 1, 1, 0),
the outputs are
\[ B0 = (A1 + A3) \cdot A2 = 1, \]
\[ B1 = A1 \cdot \overline{A2} = 0, \]
\[ B2 = A1 + A2 \cdot A3 = 1, \]
\[ B3 = A0 = 0. \]
Instant Quiz
5. Figure out the next two (9th and 10th) rows of the table.
Next, take each 4-tuple, (B3, B2, B1, B0), and convert it into its equivalent hexadecimal
number. For example, for the next-to-last row in Table 5-1, this gives 0101 = $5. The full list for
all 16 entries in the table looks like this
$0, $8, $6, $E, $0, $8, $5, $D, $0, $8, $6, $E, $5, $D, $5, $D.
Now create your lookup table (suppose we start it at address $5000, etc.).
          org   $5000
          dc.b  $0, $8, $6, $E, $0, $8, $5, $D, $0, $8, $6, $E, $5, $D, $5, $D
          org   $4000
          ldx   #lookup
forever:  ldaa  PORTA
          movb  A, X, PORTB
          bra   forever
lookup:   dc.b  $0, $8, $6, $E, $0, $8, $5, $D, $0, $8, $6, $E, $5, $D, $5, $D
What will happen is that CodeWarrior will figure out that if you start your code at $4000, the
last line (the bra) will end up at $4016. It will then load your lookup table starting at $4017, etc.,
and then load $4017 into the X register to be used as the base for the movb A, X, PORTB
instruction. Pretty smart, don't you think?
There's also an assembly directive for storing variables in RAM. It's called Data Storage
(DS or ds). The allowable sizes of the stored variable are the same as for Define Constant, so you
can have
          DS.B   Defines one or more one-byte variables
          DS.W   Defines one or more word-size (16-bit) variables
          DS.L   Defines one or more long-word (32-bit) variables
          DSB    Allocates a block of memory
Here's an example of the syntax:
          org   $2000
          ds.b  10
This sets aside 10 bytes of storage, starting at $2000. You can use it with a label, too. For
example
          org   $2000
results:  ds.b  10
          org   $4000
          staa  results
The staa instruction will store the number in A in address $2000.
Assembly Expressions
The last topic we need to discuss in this chapter is that of Assembly Expressions. Some
examples of these are arithmetic, Boolean, and comparisons. A complete list appears in Table
5-2.
You can use Assembly expressions to help self-document your program. Also, you can use
them to make it more easily modifiable. As an example, suppose you want to control the
temperature of a room to within 2 °C of normal room temperature (20 °C). Suppose, also, that your
temperature sensor produces an 8-bit temperature value and is attached to Port A. You might
start with the following equ directives
roomTemp:  equ   20
highTemp:  equ   roomTemp + 2
lowTemp:   equ   roomTemp - 2
roomTemp + 2 and roomTemp - 2 are assembler expressions that define the temperatures at
which you might want to turn on the air conditioner or heater. Now follow this at some point by
the code
continue: ldaa  PORTA
          cmpa  #highTemp        ; compare temperature to highTemp
          bgt   turnOnAC         ; if result of comparison is > 0,
                                 ; branch to routine that turns on AC
          cmpa  #lowTemp         ; next, compare temperature to lowTemp
          blt   turnOnHeater     ; if result is < 0 branch to code that
                                 ; turns on heater
The ability to use assembler expressions in this way has two nice advantages. First, it makes
clear what we are doing. Second, we can make global changes quickly. Suppose we decide that
to save power we are willing to allow a 3 °C temperature swing before turning on heating
or air conditioning. We just have to make the change in the assembly expressions and it changes
the values everywhere the labels appear.
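One way to make that kind of change even easier is to give the allowed swing its own label. The fragment below is a sketch of the idea; the name tolerance is made up for this illustration and does not appear elsewhere in the text.

```asm
roomTemp:  equ   20                     ; normal room temperature, degrees C
tolerance: equ   3                      ; allowed swing, degrees C (hypothetical label)
highTemp:  equ   roomTemp + tolerance   ; evaluates to 23
lowTemp:   equ   roomTemp - tolerance   ; evaluates to 17
```

Changing the swing again later means editing a single number, and the cmpa instructions that use highTemp and lowTemp need not be touched.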
We can also use expressions in instructions. For example, in this code, which we looked at a
bit earlier
          org   $2000
results:  ds.b  10
          org   $4000
          staa  results
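As an illustration of an expression inside an instruction, the operand can be written as the label plus an offset. This is a sketch built on the results block above; the offsets 3 and 9 are arbitrary choices.

```asm
          org   $2000
results:  ds.b  10               ; ten bytes, $2000 through $2009
          org   $4000
          staa  results          ; stores A in $2000
          staa  results+3        ; stores A in $2003, the fourth byte
          staa  results+9        ; stores A in $2009, the last reserved byte
```

The assembler evaluates results+3 when it builds the program, so the instruction costs nothing extra at run time.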
%00110110     $0f     !var1     var1 * var2     var1&var2     var1/\var2
; Fragment 2
          movb  #$FF, $03        ; movb IMM/EXT takes 4 clock cycles (OPwP)
                                 ; total time for 4 clock cycles = 2.0 µs
%01100101; C = 1
%01011001; C = 0.
5.
Homework
1. For the following instructions, find which addressing modes are supported (note that you
can find all these instructions at the end of the CPU12 Reference Manual): INCA, BITB,
DEC, BCLR, MOVB, STAA. Also find how the condition code bits are affected.
2. Write a code fragment that directs the assembler to set aside four bytes in RAM for the
variable named output and stores the numbers 1 through 4 in byte form in ROM as data
constants input (don't forget to use the appropriate ORG statements).
3. Write a code fragment that stores the sines of the angles 0 through 10 degrees in one-degree
increments in addresses $5000 through $500A. The sines should be stored as 3-digit integers
since the microcontroller can't do floating point arithmetic (i.e., the sine of
5° = 0.087; this should be stored as 087).
4.
5. Give the result for Q1 through Q4 in hexadecimal for each of the following assembler
expressions:
MS_MASK:    EQU   %11110000
LS_MASK:    EQU   %00001111
MOST:       EQU   255
NEG:        EQU   -128
SIX:        EQU   6
TEN:        EQU   10

a) Q1:      DC.B  SIX+TEN
b) Q2:      DC.B  SIX | TEN
c) Q3:      DC.B  MOST ^ LS_MASK
d) Q4:      DC.B  ~MS_MASK
6. For your code in problem 4, use the Reference Manual to determine how many bus clock
cycles the code takes to run one time around (including the branch).
Delay Loops
For some of the applications above you can use simple delay loops. To see how these work,
look at the following code:
            ldy   #6000           ; 2
loop:       dey                   ; 1  this decrements the Y Register
            bne   loop            ; 3 if loop is taken, 1 if not
What does it do? Well, basically nothing. The important thing is that it takes some time to do
it, generating a delay.
The code loads the Y Register with the decimal number 6000 and then just loops 6000 times as it
decrements the Y Register using the dey instruction. The numbers in the comments are the
number of bus clock cycles each instruction takes (from the reference manual). So, how long is
the delay?
The code takes 2 + (1+3) x 5999 + (1+1) = 24,000 clock cycles to run (the final bne is not
taken, so it costs 1 cycle instead of 3),
The bus runs at 24 MHz, so each bus clock cycle takes 1/24 x 10^-6 seconds (about 41.7 ns),
So the code takes 1 ms to run.
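The same arithmetic can be handed to the assembler, so that the loop count follows from the bus clock rather than being typed in by hand. This is a sketch under stated assumptions: the labels BUS_KHZ, PASS_CYCLES, and MS_PASSES are hypothetical names, and the 24 MHz bus clock and the $4000 code address are taken from examples elsewhere in the text.

```asm
BUS_KHZ:      equ   24000                 ; assumed bus clock in kHz (= cycles per ms)
PASS_CYCLES:  equ   4                     ; dey (1) + bne taken (3)
MS_PASSES:    equ   BUS_KHZ/PASS_CYCLES   ; = 6000 passes for about 1 ms

              org   $4000                 ; assumed code address
              ldy   #MS_PASSES            ; 2 cycles
loop:         dey                         ; 1 cycle per pass
              bne   loop                  ; 3 cycles if taken, 1 if not
```

If the bus clock changes, only BUS_KHZ needs to be edited, as long as the resulting count still fits in the 16-bit Y Register.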
This is an example of a 1 ms delay loop. To get longer delays, just embed this in an outer
timing loop:
            ldx   #1000
outer_loop: ldy   #6000           ; 2
loop:       dey                   ; 1  this decrements the Y Register
            bne   loop            ; 3 if loop is taken, 1 if not
            dex                   ;    this decrements the X register
            bne   outer_loop
The three instructions from ldy #6000 through bne loop are the original 1 ms code. What we've
done is just to embed the 1 ms code in an outer loop that runs 1000 times, resulting in a
1 second delay. In this way you can get a delay of any size you want.
Instant Quiz
1. Write the code to get a 10-second delay.
The delay loop written this way won't give you exactly a 1 ms delay because of the time
needed to do the additional branching. If precision is important, you will need to connect your
microcontroller to an oscilloscope and tweak your code to give you the precision you want. You
do this by adding to or subtracting from the number of loops you do. If you still don't get the
precision you want you can strategically add a number of no-op instructions (NOP). NOP stands
for no operation. These don't do anything but they take one clock cycle to execute.
It's often the case that you are using the X or Y Registers for something else in your program.
When this happens you can just use an ordinary memory address as your counter. The only
problem is that in a memory location you only have 8 bits so you can only count down from
2^8 - 1 = 255. In the example above you need to count down from 1000. The workaround is to use
two nested loops with two different memory locations as counters. Here's one way to do it:
            movb  #4,$2001
outer_loop: movb  #250,$2000
inner_loop: ldy   #6000           ; 2
loop:       dey                   ; 1  this decrements the Y Register
            bne   loop            ; 3 if loop is taken, 1 if not
            dec   $2000           ;    this decrements address $2000
            bne   inner_loop
            dec   $2001
            bne   outer_loop
What is going on here is that you are running the millisecond delay 250 times using the inner
loop, producing a 250-ms delay, and then running the 250-ms delay 4 times to give a one second
delay. You might take a minute and trace how the logic flows.
If you need to generate delays at several different points in your code, it's easier to write the
code for the delay once and then call it at each point in your code that it's needed using a
subroutine. A subroutine (in C/C++ it's called a function) is a piece of code that could
function independently to do some specific task but you stash it somewhere in the memory and
then call it where you need it. You call the subroutine with a Jump to Subroutine (jsr)
instruction (you could also use a Branch to Subroutine, or bsr, instruction), which tells the
microcontroller to jump to the location containing the first instruction of the subroutine and start
running the code there. At the end of your subroutine code you have to add a Return from
Subroutine (rts) instruction. This tells the microcontroller to return to where it was before the
call and continue executing the main code. You can indicate the place where you've put the
subroutine either with the actual address or using a label.
As a simple example, suppose you want to turn on an output (say B0), wait for one
millisecond, then turn it off for one millisecond, and then repeat the process forever. If you
connected your output to a light emitting diode you would have a 500 Hz flasher. Here's the
code to do it, using an address:
            org   $4000
            bset  DDRB,%00000001
forever:    bset  PORTB,%00000001
            jsr   $5000
            bclr  PORTB,%00000001
            jsr   $5000
            bra   forever

            org   $5000
            ldy   #6000
loop:       dey
            bne   loop
            rts
Notice that our main code starts at $4000 and we've put the subroutine at $5000. The jsr
instruction tells the microcontroller to jump to address $5000. Inside the forever loop you're
just turning on the output, then jumping to your millisecond delay, then turning off the output
and delaying for a millisecond again. Then you loop back to forever to repeat the process. Also
note the rts instruction at the end of the subroutine. If you forget to include this the
microcontroller will just continue trying to execute whatever is in the memory following the
subroutine, which could be anything. It will never get back to your main program.
Using a subroutine saves you writing the delay loop twice. This might not seem like a big
deal since you could just cut and paste it the second time but you would have to change the loop
label to something else to avoid duplicate labels. This is an inconvenience that grows in size with
the number of times you call the subroutine in your code. It also uses up more memory, and a lot
of memory if your subroutine is big.
Here's how you can call a subroutine using a label:
            org   $4000
            bset  DDRB,%00000001
forever:    bset  PORTB,%00000001
            jsr   delay_1_ms
            bclr  PORTB,%00000001
            jsr   delay_1_ms
            bra   forever

delay_1_ms: ldy   #6000
loop:       dey
            bne   loop
            rts
What we've done is to identify the subroutine with the label delay_1_ms. Then we just
replace jsr $5000 in the original code with jsr delay_1_ms. CodeWarrior will put the subroutine
right after your main code and just use the address where the first instruction of the subroutine
winds up to replace the label delay_1_ms everywhere it appears in an instruction. Of course,
you can use an org directive followed by the label to force CodeWarrior to put the subroutine
anywhere you like if you want to.
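Calling the delay as a subroutine adds the cost of the jsr and rts instructions to the delay itself. If that matters, the loop count can be trimmed and padded with NOPs. The following is a sketch rather than a definitive listing: it assumes a 24 MHz bus clock and the cycle counts shown in the comments, which should be checked against the CPU12 reference manual.

```asm
; Assumed cycle counts: jsr (extended) 4, ldy #imm 2, dey 1,
; bne 3 if taken / 1 if not, nop 1, rts 5
delay_1_ms: ldy   #5997           ; 2 cycles
loop:       dey                   ; 1 cycle per pass
            bne   loop            ; 5997 passes cost 4 x 5997 - 2 = 23,986 cycles
            nop                   ; three NOPs add 3 cycles of padding
            nop
            nop
            rts                   ; 5 cycles
; jsr (4) + ldy (2) + loop (23,986) + nops (3) + rts (5) = 24,000 cycles = 1 ms
```

An oscilloscope measurement is still the final check, since the calling code adds its own cycles around each jsr.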
Suppose you now want to do something a little more complicated, such as turning on the
output for one millisecond but then turning it off for five milliseconds to produce a signal with a
duty cycle of about 17% (the duty cycle is the ratio of the on time to the period, here 1 ms out of
6 ms). You could write a second delay loop of five milliseconds, or you could just call the
one-millisecond delay loop five times in a row, either with 5 successive jsr instructions or by
putting a single jsr instruction in a loop of five iterations.
This is what the loop part of the code would look like (the rest is the same as above):
forever:
We've used Accumulator A as our counter but you could just as easily have used an index register
or a memory address.
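A minimal sketch of a loop matching this description follows. It assumes the delay_1_ms subroutine and the PORTB setup from the listing above, and it relies on delay_1_ms changing only the Y Register, so Accumulator A survives each call. The label off_loop is a hypothetical name.

```asm
forever:    bset  PORTB,%00000001   ; output on
            jsr   delay_1_ms        ; stay on for about 1 ms
            bclr  PORTB,%00000001   ; output off
            ldaa  #5                ; Accumulator A counts the five off-time passes
off_loop:   jsr   delay_1_ms        ; each pass adds about 1 ms
            deca                    ; one fewer pass to go
            bne   off_loop          ; repeat until A reaches zero
            bra   forever
```

Changing the value loaded into A changes the off time, and therefore the duty cycle, in 1 ms steps.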
As an aside, most programmers accumulate a collection of subroutines that they use regularly
and can just paste into any code they're writing. To get a range of possible timing combinations
they might have a millisecond delay and also separate microsecond, 10-microsecond, and
hundred-microsecond delays so they can plug in whatever is easiest and most efficient to use.
The divided-down clock is sent to a 16-bit free running counter (TCNT) that counts pulses
from the prescaler. It starts at $0000 and runs up to $FFFF, then resets and starts again on the
next pulse. Each time it resets it sets a flag, the Timer Overflow Flag (TOF), to let you know that
this has happened. You can check if the overflow has occurred in two ways. First, you can
periodically monitor the TOF bit manually. Second, you can enable an interrupt that will be
generated when it happens.
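Monitoring the flag manually might look like the sketch below. The register name TFLG2, its address, and the bit position of TOF are assumptions that are not given in the text, so check them against the reference manual or the device include file. The sketch also assumes the timer has already been enabled.

```asm
TFLG2:      equ   $004F             ; assumed address of the register holding TOF
TOF:        equ   %10000000         ; assumed bit position of the overflow flag (bit 7)

wait_ovf:   brclr TFLG2,TOF,wait_ovf ; spin here until TCNT rolls over to $0000
            movb  #TOF,TFLG2         ; clear the flag by writing a 1 to it
```

The CPU does nothing else while it spins in wait_ovf, which is the main drawback of polling compared with using the interrupt.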
These functions, and many others, are set up by writing to control registers, in the same way
you write to the DDRs to set up parallel I/O ports. The difference is that it's a bit more
complicated and there are more registers to write to. The power of 2 used in the prescaler is
defined by the lowest three bits of register TSCR2 (for Timer System Control Register 2), located
at address $004D. For example, setting bits 2 and 0 and clearing bit 1 makes the three bits of the
prescaler value 101 = 5, so the bus clock will be divided by 2^5 = 32. If your bus clock is running
at 24 MHz, the TCNT counter will be incrementing at 24 MHz / 32 = 750 kHz. By the way, you
can't simply use a movb #%00000101,TSCR2 instruction because
some of the other bits in the register are used for something else and you may be altering them.
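A sketch of lines that set the prescaler bits to 101 without disturbing the upper five bits of the register, using bset and bclr with the address given in the text:

```asm
TSCR2:      equ   $004D             ; Timer System Control Register 2 (address from the text)

            bset  TSCR2,%00000101   ; set prescaler bits 2 and 0
            bclr  TSCR2,%00000010   ; clear prescaler bit 1, giving 101 = divide by 32
```

Each instruction reads the register, changes only the masked bits, and writes the result back, which is why the other bits are left alone.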
The TCNT register itself is located at addresses $0044:$0045 (remember, it's a 16-bit
register). You can read TCNT anytime you want, for example with an ldd (Load D) instruction.
You can't use two ldaa instructions to read the high byte and then the low byte because the value
in the low byte will have changed in the time it takes to read the high byte.
An example of a simple thing you can do with this is to measure the time interval between two
events. You set up the prescaler value to give you the time accuracy you want and then just read
the number in TCNT when each event happens. The difference times the prescaled clock period
is the time interval between events.
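The two reads and the subtraction can be sketched as follows. The labels start_t and elapsed and the RAM address $2000 are hypothetical, how each event is detected is left out because it depends on the application, and the timer is assumed to be enabled already.

```asm
TCNT:       equ   $0044             ; 16-bit free running counter (address from the text)

            org   $2000             ; assumed RAM area for variables
start_t:    ds.w  1                 ; TCNT value at the first event
elapsed:    ds.w  1                 ; ticks counted between the two events

            org   $4000
            ; ... wait for the first event (application-specific) ...
            ldd   TCNT              ; one 16-bit read, so both bytes belong together
            std   start_t
            ; ... wait for the second event (application-specific) ...
            ldd   TCNT
            subd  start_t           ; D = second reading - first reading
            std   elapsed           ; valid if fewer than 65,536 ticks elapsed
```

Because the subtraction is done in 16 bits, the result is still correct if TCNT rolled over once between the two reads, provided the interval is shorter than one full count of the timer.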
Instant Quiz
2. The speed of a car is measured by the time it takes for the car to break two laser beams
1 m apart. You're using an HCS12 microcontroller to determine the time. The bus clock is
running at 8 MHz and the last three bits of the prescaler (at $004D) are 011. The two
values read from TCNT as the car breaks each beam successively
are %0000000100101110 and %0001010010110110, respectively. Assuming the counter
hasn't overflowed, what is the time interval between the breaks in the two beams, and how
fast is the car going?
Another useful register is the Modulus Down-Counter. It doesn't appear in Figure 6-1
because it isn't available in all members of the HCS12 family. When it is available, it's part of
something called the Enhanced Capture Timer (ECT) Module.
The way the Modulus Down-Counter works is that you load a number into the Modulus
Down-Counter Count Register (MCCNT) at address $0076:$0077 (it's a 16-bit register) and it
begins to count down using the bus clock divided by a 4-bit prescaler. This prescaler can be set
to divide the bus clock by 1, 4, 8, or 16. When the counter counts down to zero you have the
option to either have it stop or continuously reload itself and count down again. Either way, once
it hits zero a flag is raised in the MCFLG register. The flag is in bit 7 and the other bits aren't
used (they're all zero by default), so you could, for example, load an accumulator from MCFLG
and test the Z bit to see if the counter has reached zero.
You also have the option of having the counter generate an interrupt each time it reaches
zero. This allows you to do some task periodically, such as monitor an external sensor without
the CPU doing any of the timing.
Setting up the counter is a little more complicated than setting up the DDRs for a parallel
port. You set the counter up by writing to its control register, MCCTL, at address $0066. Figure
6-2 shows the function of each bit in this register. To use the Modulus Down Counter you have
to do the following (refer to the figure):
1. Enable the interrupt (assuming you want to generate an interrupt) by setting Bit 7 (Modulus
Counter Underflow Interrupt Enable). If you don't want to use interrupts just clear this bit.
2. Enable Modulus Mode by setting Bit 6 (Modulus Mode Enable). This tells the counter to
reload itself and begin counting down again each time it reaches 0. If you clear this bit the
counter will count down once and stop.
3. Clear Bit 5 (Read Modulus Down-Counter Mode). When this bit is cleared, a read of
MCCNT will give you the current value. If the bit is set, a read will give you the
value in the register from which it is reloaded each time it counts down all the way. If
all you want to do is, e.g., just generate periodic interrupts, this bit doesn't matter.
4. Bit 4 is used in something called input capture. We'll see what that means in a
while but for what we're doing now just leave it cleared.
5. Bit 3 forces a load of MCCNT with the number it counts down from, and also resets
the prescaler. Leave it cleared for now.
6. Bit 2 enables the modulus counter. Note that it's different from Modulus Mode
Enable in Bit 6. That bit tells the counter whether to stop when it has counted down to
zero. This bit turns the whole thing on. It should be set for now.
7. The next two bits are Modulus Counter Prescaler bits 1 and 0. They determine the
value of the prescaler according to Table 6-1. You can see, for example, that setting
these bits to 1 and 0, respectively, divides the bus clock by 8.
Table 6-1. Modulus Counter Prescaler Select.
As an example, suppose you want to generate a 1 ms delay from a 24 MHz bus clock. First,
figure out how many ticks of your bus clock correspond to 1 ms. The period of your bus clock is
1/24 x 10^-6 seconds. You need to load the counter with N ticks, where
1/24 x 10^-6 s x N = 10^-3 s,
so you need 24,000 ticks. This is the number that you load into MCCNT. Before you do that
you have to set up the control register. Here's the full millisecond delay code:
            movb  #$04,MCCTL
            movw  #24000,MCCNT
checkFlag:  ldaa  MCFLG
            beq   checkFlag
In this code the checkFlag loop just keeps checking to see if the MC flag is set, indicating
that the countdown is complete.
Instant Quiz
3. Explain the effect of each bit loaded into the MCCTL register in the example above.
The biggest delay we can get this way is by loading MCCNT with 65,535. This corresponds
to a delay of about 2.7 ms. To get longer delays we just divide the bus clock by the prescaler
value determined by bits 1 and 0 in MCCTL. For example, in the code above if we again loaded
MCCNT with 24,000 but changed the first line to movb #$06,MCCTL, the prescaler would be set
to 8, and we would get an 8 ms delay.
Instant Quiz
4. Write the code to generate a 20 ms delay.
Now you can do all sorts of complicated things. For example, suppose you want to generate a
100 Hz square wave at bit 0 of Port B. You just change the code above to generate a 5 ms delay
and then put it in a loop that runs forever, but each time you loop, complement bit 0 of Port B.
This turns the bit on for 5 ms, then off for 5 ms, resulting in a 10 ms period (100 Hz). Here's the
code:
checkFlag:
There are three things you should notice about this code:
1. By moving $45 (%01000101) to MCCTL in the third line we've set bit 6; this enables
the Modulus mode, which means MCCNT is automatically reloaded and starts
counting down each time the count reaches zero.
2. You have to clear the flag for the counter to start counting down again; we do this by
writing 1 to bit 7 of MCFLG in the next to last line.
3. We flip bit 0 of Port B using a Complement instruction (com 1). This flips all the bits
in address $1, but since we've only made bit 0 an output and the rest inputs, the other
bits won't be affected. If you want to use any of the other bits as outputs for other
purposes you can't do it this way because the other bits will be flipped, too. You need
to do the exclusive-OR trick that you learned before. Just replace the com 1 instruction
with the following code:
            ldaa  1
            eora  #%00000001
            staa  1
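Putting the three notes together, a square-wave loop consistent with them might look like the sketch below. It is an illustration, not the original listing. The MCCTL and MCCNT addresses are the ones given in the text, PORTB and DDRB follow the addresses used in earlier examples, and the MCFLG address is an assumption that should be checked against the device include file.

```asm
PORTB:      equ   $0001             ; Port B data register
DDRB:       equ   $0003             ; Port B data direction register
MCCTL:      equ   $0066             ; address from the text
MCFLG:      equ   $0067             ; assumed address; check the device header
MCCNT:      equ   $0076             ; address from the text

            org   $4000
            bset  DDRB,%00000001    ; make bit 0 of Port B an output
            movb  #$45,MCCTL        ; modulus mode, counter enabled, prescaler = 4
            movw  #30000,MCCNT      ; 30,000 ticks at 24 MHz / 4 = 5 ms
checkFlag:  ldaa  MCFLG             ; bit 7 is set when the counter reaches zero
            beq   checkFlag         ; not set yet, so keep polling
            com   PORTB             ; flip the output
            movb  #$80,MCFLG        ; clear the flag by writing 1 to bit 7
            bra   checkFlag
```

Because MCCNT is written after modulus mode has been enabled, the very first interval may not be 5 ms on every device; the reference manual describes when the counter is actually loaded, and bit 3 of MCCTL can be used to force a load.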
The problem with doing delays in this way is that the processor is still occupied full time
with running the code, so the only advantage it has over a simple delay loop is convenience (and
a little better accuracy). To free the processor for other work while the counter is counting, you
need to employ the interrupt function. We'll see how to do this in the next chapter. For now,
we'll just briefly cover some of the other functionality of the timing module. Then we'll finish
up with a detailed description of the pulse width modulation (PWM) module.
Looking again at Figure 6-1, we see eight channels of multiplexed input capture and output
compare functions. Input-capture latches the contents of TCNT when a specified event occurs.
This could, for example, be used to time the period between two events. We described a way to
do this earlier by reading the contents of TCNT at each event, but the input-capture function does
this in the background without having to use the CPU. The 16-bit Pulse Accumulator counts events
arriving in a defined time interval. This can be used to find the frequency of an incoming signal.
Each Input Capture register can generate an independent interrupt. Output-compare compares the
value in the timer counter with that of the output-compare register. Each channel can also
generate an interrupt when they are equal. Each of the 8 channels can be connected to an external
bidirectional pin via Port T.
All functions are controlled via timer registers in the same way as we saw for the modulus
down counter. The devil, of course, is in the details. You can see how involved the process of
setting up a register to do what you want can be. You basically have two choices when
confronted with the need to do this for a timer function you haven't used before. You can pore
over the extremely dry description of the control registers in the reference manual or some
textbook to see how the individual bits should be programmed, or you can search the web to see
how other people have solved a similar problem. Most people take the more sensible of the two
approaches. If you ever have to do this you will quickly learn the benefits of using a popular and
strongly supported microcontroller family.
There are four possible clock sources for the PWM module: Clock A, Clock SA, Clock B,
and Clock SB. Clock A and Clock B are derived by dividing the bus clock by 2^n, where n can
range from 0 to 7. You can set the divisor independently for each clock. Clock SA is derived by
dividing Clock A by an even number ranging from 2 to 512. Similarly, Clock SB is derived by
dividing Clock B by an even number ranging from 2 to 512. Clocks SA and SB also can be
set independently by writing to an 8-bit register. The number that you write is multiplied by 2 to
get the range 2 to 512 for the divisor.
To set up the PWM module you do the following by writing to the appropriate registers:
1. Select Clock A, B, SA, or SB (PWMCLK)
2. Select prescale value (PWMPRCLK)
3. Set period (PWMPER)
4. Select duty cycle (PWMDTY)
5. Enable output for selected channel (PWME)
Optionally, you can enable active high polarity of the output signal by writing to register
PWMPOL, and/or align the edge of the PWM signal with the left edge or center of the bus clock
by writing to register PWMCAE.
Clock A is determined by dividing the bus clock by the prescale value. The prescale value is
2^n, where n is a 3-bit number written to register PWMPRCLK (PWM Prescale Clock Register) at
memory location $A3. Figure 6-5 shows the bit definitions in this register.
If you do select Clock SA, you can still write a prescaler value to Clock A. For example, if
you write 6 to PWMPRCLK and 4 to PWMSCLA you will get the bus clock divided by 2^6 = 64
(Clock A) and then further divided by 2 x 4 = 8, so the bus clock will be divided by 512.
Figure 6-6. Register PWMCLK (bits 7 through 0: PCLK7, PCLK6, PCLK5, PCLK4, PCLK3, PCLK2,
PCLK1, PCLK0). Each bit selects the clock for a different PWM output (0 through 7).
Note that the resulting frequency is NOT the frequency of the PWM signal; it's just the
frequency of Clock A (or B, SA, or SB). To get the actual period of the PWM signal you have to
decide how many ticks of the clock you're using (A, B, SA, or SB) you want to be in one PWM
cycle.
For example, look at the waveform in Figure 6-7. The ticks along the abscissa represent ticks
of the prescaled clock (A, B, etc.). In the figure the period of the PWM signal is 10 ticks and the
on time is 4 of these ticks. Suppose we're using Clock A from the example above. The period of
Clock A is 8 µs, so the period of the PWM signal is 80 µs. This corresponds to a frequency of
12.5 kHz, which is the actual PWM frequency. The duty cycle is the on time divided by the
period, expressed as a percentage, so it's 4/10 x 100% = 40% for this example.
Figure 6-7. Pulse width modulated signal with period of ten clock ticks and 40% duty
cycle.
The number of ticks in the period is set by writing the number to register PWMPER0 (which
somehow manages to stand for PWM channel Period 0 Register) at address $B4. This is for
Output 0 in Figure 6-4. If you want to use, e.g., Output 1 you write to PWMPER1 (address $B5),
and so on. The signal frequency is determined by dividing the bus clock frequency by the
prescale value and then by the number in the period register. Alternatively, the period of the
PWM signal is determined by multiplying the period of the source clock (Clock A, or whatever) by
the number in PWMPER.
As an aside, you should be forewarned that Freescale sometimes changes the mnemonic that
it uses for these registers in new versions of CodeWarrior. If you get a funny looking warning
when you use one you might search through the Source window for the equ directive that assigns
the mnemonic to see if they've changed it. This is one of the first things you should check when
you've downloaded some (possibly old) code from the web, or copied it from a book that's more
than a few years old.
The duty cycle is set by writing the number of clock ticks that the signal is on to register
PWMDTY0 (PWM channel Duty 0 Register) at address $BC (for Output 0).
All these registers are 8 bits wide, so the largest period you can get is 255 ticks. The smallest
on time you can get is one tick, so the smallest duty cycle you can get is about 0.39%. The
largest duty cycle you can get is, of course, 100%.
The contents of all of these registers can be changed on the fly while the program is running,
so you can change the frequency and/or the duty cycle however you want whenever you want.
You can generate some pretty complex waveforms in this way. The microcontroller will align
the changes with the next whole bit edge, so you don't have little pieces of bits being transmitted.
All of these clocks and what you do with them can get a little confusing. Normally you know
what frequency you want to end up with, so the trick is to start with the bus clock frequency and
figure out what prescaler for Clock A, divisor for clock SA, and number of ticks in the period
will get you there. The frequency of the PWM signal will be the bus clock frequency divided by
the Clock A (or B) prescaler (1 to 128), then divided by the prescaler for Clock SA (2 times the
number in PWMSCLA), and finally divided by the number of ticks in the PWM period register.
As an example, suppose your bus clock runs at 24 MHz and you want to generate a 10 kHz
PWM signal with a 10% duty cycle. Here's one way to get it:
            movb  #0,PWMPRCLK
            movb  #12,PWMSCLA
            movb  #100,PWMPER0
            movb  #10,PWMDTY0
Alternatively, you could have loaded 120 into PWMSCLA, 10 into PWMPER0, and 1 into
PWMDTY0, or any of a lot of other combinations that would give you the same frequency and duty
cycle. The only reason to pick one over another is that if you're going to change the duty cycle
you might need finer granularity. For example, in the code above, you can change the duty cycle
in increments of 1%. If, on the other hand, you used 120 and 10 for PWMSCLA and PWMPER0,
respectively, you could only change the duty cycle in increments of 10%.
Once you've got all this set up, the last thing you have to do is to enable the PWM output.
The default is disabled, as a failsafe feature since you don't want the PWM outputs to just start
up with some random signal when you first turn the microcontroller on. You enable the output
by writing a 1 to register PWME_0, at address $A0 (for Output 0).
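The whole setup sequence for the 10 kHz, 10% example can be sketched in one place. The addresses of PWME, PWMPRCLK, PWMPER0, and PWMDTY0 are the ones given in the text; the addresses of PWMPOL, PWMCLK, and PWMSCLA are assumptions that should be checked against the device include file. The sketch also assumes that the polarity bit must be set for PWMDTY0 to count the high time.

```asm
PWME:       equ   $00A0             ; PWM enable register (address from the text)
PWMPOL:     equ   $00A1             ; assumed address of the polarity register
PWMCLK:     equ   $00A2             ; assumed address of the clock select register
PWMPRCLK:   equ   $00A3             ; address from the text
PWMSCLA:    equ   $00A8             ; assumed address of the Clock SA scale register
PWMPER0:    equ   $00B4             ; address from the text
PWMDTY0:    equ   $00BC             ; address from the text

            org   $4000
            movb  #0,PWMPRCLK       ; Clock A = bus clock / 2^0 = 24 MHz
            movb  #12,PWMSCLA       ; Clock SA = Clock A / (2 x 12) = 1 MHz
            bset  PWMCLK,%00000001  ; Output 0 uses Clock SA
            bset  PWMPOL,%00000001  ; Output 0 starts each period high
            movb  #100,PWMPER0      ; period = 100 ticks, so 10 kHz
            movb  #10,PWMDTY0       ; on time = 10 ticks, so 10% duty cycle
            bset  PWME,%00000001    ; enable Output 0
forever:    bra   forever           ; the PWM hardware keeps running on its own
```

Once the output is enabled the CPU is free; writing a new value to PWMDTY0 at any later point changes the duty cycle in 1% steps.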
Instant Quiz
5. Suppose the first line in the code above is replaced by
            movb  #1,PWMPRCLK
How would you change the rest of the code to get the same frequency and duty cycle?
Finally, if this is all you want to do, you should write some code to stop the microcontroller
from running through the rest of the memory and trying to execute whatever instructions it thinks
it finds there. You could, for example, end your code with
forever:    bra   forever
There are a lot more registers in the PWM module that you can do some really complicated
things with, but you now have everything you need to solve some fairly sophisticated control
problems.
            ldx   #10000
outer_loop: ldy   #6000           ; 2
loop:       dey                   ; 1
            bne   loop            ; 3 if loop is taken, 1 if not
            dex
            bne   outer_loop
4.
            ; (MCFLG) -> A
            ; if bit 7 wasn't set, branch to checkFlag,
            ; keep checking MCFLG until bit 7 is set
            ; you could also load MCCNT with 30,000 and set the prescaler to 16
5. Moving 1 to the Clock A prescaler register means divide the bus clock by 2^1 = 2. There
are a lot of combinations that will give the correct result. Here's one:
            movb  #1,PWMPRCLK
            movb  #12,PWMSCLA
            movb  #50,PWMPER0
            movb  #5,PWMDTY0
Note that it's not easy to just change PWMSCLA to get the right answer since the
number you load gets multiplied by 2.
Homework
0. Search the web for the MC9S12GC Family Reference Manual. Download it and skim
through Chapter 12 on PWM. Don't get too crazy about it, though.
1. Write a short code fragment to implement a delay of one minute based on your
millisecond delay loop, then turn on an LED at Port B, bit 0.
2. Write the code necessary to generate a delay of 256 counts (0 to 255) and turn on an LED
at Port B for the first 115 counts, then turn it off for the remaining 140 counts. The cycle
should repeat forever.
3. What values would you have to write to memory locations $b4 and $bc to generate a
pulse with a duty cycle of 45%?
4. Suppose you have an 8 MHz bus clock. Write the code to generate a 1 kHz PWM signal
with a 20% duty cycle. (Note that if you want to use Clock SA, its frequency is
Clock A/(2*PWMSCLA), where PWMSCLA is at address $08. Note also that you can do
this just using Clock A.)
Chapter 7. Interrupts
In previous chapters we often alluded to the term interrupt as something that caused the
microcontroller to stop what it was currently doing and instead begin to do something else,
presumably something more important at the moment. In particular, in the last chapter we
mentioned that many of the timer functions could be set up to generate an interrupt when some
specified timing or counting sequence had completed. Other modules in the microcontroller can
generate interrupts to notify the processor that they've completed a task or require attention.
Some examples are the analog-to-digital converter module and the various communications
modules. Interrupts can be triggered by some external event, as well, such as some sensor
detecting a condition that requires special attention (e.g., a fire alarm). Another external interrupt
source is the Reset pin on the microcontroller chip. When this pin is toggled the microcontroller
goes into a reset routine that might, for example, flush some part of its memory and start running
your code from the beginning. There are also software-generated interrupts. The HCS12 has an
swi instruction (Software Interrupt) that stops the current program and runs a special program
that you write. You might use this to debug your program if you think it's going off to some part
of the code it shouldn't be. To see if it's actually going there you might drop in an swi instruction
that tells the controller to run some special diagnostic.
In this chapter we'll see how interrupts work in the HCS12 and how you use them. In
particular, we'll look at how you use two external interrupt pins, IRQ and XIRQ, and then we'll
examine how you use the interrupt connected to the MCCNT register that we talked about in the
last chapter to generate periodic events. By the way, this is a good place to warn you that the
term IRQ is used interchangeably to represent both the IRQ pin and an Interrupt Request
generated by any of the other possible sources.
software). The request is serviced via an Interrupt Service Routine (ISR). This is a program
that you write and store somewhere in memory. When the ISR is finished responding to the
request, the microcontroller returns to what it was doing before.
Some common uses of interrupts include:
switching from one task to another
asynchronous communications
time-critical applications (e.g., alarms, emergency shutoffs, etc.)
health checks
Health checks are particularly important. Because microcontrollers are often embedded in a
system and expected to run autonomously for months and years, it is important to have a way to
test the health of the system periodically. You could do this by setting up your own timer if you
like, but the HCS12 provides a dedicated timer for this purpose. It's called the Computer
Operating Properly (COP) timer in the HCS12, but a more common name is Watchdog Timer.
You can see it in Figure 1-5 on the left a little below the CPU12 block. You can set it up to
generate an interrupt periodically and then design the ISR to check various registers or addresses
to ensure that the program is stepping along properly.
Interrupt Types
Interrupts are handled by special circuitry in the microcontroller. They are classified by the
method in which they are handled. There are two basic systems: polled and vectored. A polled
interrupt system notifies the interrupt controller that a device is ready to be read or otherwise
handled but does not indicate which device is making the request. The interrupt controller must
poll (send a signal out to) each possible source of the IRQ in turn until it finds which one made
the request. In a vectored interrupt system the IRQ includes the identity of the device sending the
interrupt signal. The HCS12 uses vectored interrupts.
Each method has advantages and disadvantages. A polled interrupt system is simpler and
uses fewer resources (circuitry, memory, power) on the chip. However, it's slower due to the
polling procedure. Vectored interrupt systems are much faster but use more resources.
Individual sources of interrupts (the IRQ pin, modulus down-counter, etc.) are also classified
by whether they are maskable or non-maskable. A maskable interrupt is one that you can tell
the processor to ignore, usually by setting or clearing a bit somewhere. You might want to do
this when the processing you're doing at the moment is more important than servicing an
interrupt. Suppose you're flying along in your F-35 a hundred feet off the ground and your
processor is helping with the terrain-following radar processing when you get an interrupt saying
that smoke is coming out of the engine. The system designer might decide that, at that moment,
it's more important to not hit a mountain you're approaching than to deal with the smoke
problem.
In the HCS12 you clear the I bit in the CCR to enable maskable interrupts. You can do this
with a cli instruction. The default value (i.e., when you turn the device on) is 1 (masked). The
reason for this is that you don't want the microcontroller to inadvertently go directly into an
interrupt service routine when you first turn it on. For example, suppose a small bit of wire lying
on your workbench managed to land on your board at a spot where it shorts out the IRQ pin,
generating a continuous interrupt service request. If interrupts were enabled on startup your
program might just keep running the ISR over and over. You would probably have a lot of
trouble debugging this situation. Many sources of interrupts also require a local enable (usually a
bit set or cleared) in order for the interrupt to be unmasked. The Modulus Down Counter is one
of these.
Non-maskable interrupts (NMIs) are hardware interrupts that do not have a bit-mask
associated with them, and therefore these interrupts can never be ignored. An example is the
Reset pin. The swi instruction mentioned above is also non-maskable. Another is something
called an Unimplemented Opcode Trap. This is a bit of a long-winded term that means that the
CPU has tried to fetch the next instruction from memory but can't decode the machine code it
finds there as a valid instruction. When this happens you might want to halt execution and do
something else (like reset the controller).
On many microcontrollers you have the option to select between having the interrupt edge- or
level-triggered, and you also have the option to prioritize your interrupts in case servicing
some is more important than servicing others. The interrupt systems in most microcontrollers are
pretty flexible.
it should return to wherever it was before the interrupt occurred. It's like the rts
instruction for subroutines.
5. Returns the system to the state it was in before servicing the interrupt: once the rti
instruction is encountered, the processor pops the stack, restoring the system to the
state it was in right before it started to execute the ISR, and begins to execute the next
instruction after the one it completed when the interrupt happened.
How fast all this happens is an important question, since the generation of an interrupt
usually means you want something to happen pretty soon. The figure-of-merit here is called the
interrupt latency. The interrupt latency is the time between when the interrupt request occurs
and the first instruction in the interrupt service routine starts to be executed. Note that it's not the
time needed to complete the ISR since you could write an ISR of just about any length and a
really long one would artificially make the performance of the microcontroller seem poorer than
it really is.
The Interrupt Vector
As we mentioned above, the HCS12 uses a vectored interrupt system. The way it works is
that every source of an interrupt (IRQ pin, timers, etc.) has associated with it a memory location
in the memory space between $FF00 and $FFFF. This space is known as the interrupt vector
table (see Figure 7-1). These memory locations contain the address of the first instruction of the
ISR for each interrupt source, so in a sense, each one points to the ISR the way vectors that you learned
about in high school physics point somewhere. Because they can point anywhere in the 16-bit
address space, the vectors are stored as 16-bit words.
Remember that you write the code that services the request, then store the code somewhere,
and then write the address of the first instruction to the appropriate vector table location. You
also need to remember to write a cli instruction in the main program and an rti instruction at the
end of the ISR. Also, if you want to allow interrupts during the ISR, you need to clear the I bit
again in the ISR.
Figure 7-1. Memory map for the HCS12 (DG256). The vector table is at the bottom.
A section of the vector table is shown in Figure 7-2. Here you can see, for example, that the
Reset Vector is located at address $FFFE:$FFFF (remember, it's a 16-bit address). Another
interesting one is the IRQ vector located at $FFF2:$FFF3. This address is where you would put
the vector pointing to the location of the start of the ISR that you've written for this interrupt
source.
Figure 7-2. Vector table for the MC9S12DG256. Note that each vector requires two
bytes in the table. [1]
What happens when the IRQ pin on the chip is brought low (assuming that the interrupt is not
being masked) is that the CPU finishes its current instruction then goes to address $FFF2:$FFF3
to find the vector for that interrupt source. Suppose you've put the ISR starting in address $5000.
Then the number in address $FFF2:$FFF3 should be $5000. The processor loads this number
into the Program Counter register, so that the next instruction executed is the first one of your
ISR.
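The vector fetch can be sketched in a few lines of Python. This is a toy model, not HCS12 code: memory is just a dictionary of byte values, the function name fetch_vector is made up for illustration, and the ISR address $5000 is the example value used in the text. The HCS12 stores 16-bit words with the high byte at the lower address.

```python
# Toy model of a vector fetch. Memory is a dict of byte values; the ISR
# address $5000 is the example value from the text.

IRQ_VECTOR_ADDR = 0xFFF2

memory = {
    0xFFF2: 0x50,   # high byte of the ISR address
    0xFFF3: 0x00,   # low byte of the ISR address
}


def fetch_vector(mem, table_addr):
    """Read a big-endian 16-bit word: high byte first, then low byte."""
    return (mem[table_addr] << 8) | mem[table_addr + 1]


# The fetched word becomes the new Program Counter value.
program_counter = fetch_vector(memory, IRQ_VECTOR_ADDR)
print(f"${program_counter:04X}")   # $5000
```

Notice that the table address ($FFF2) never ends up in the Program Counter; only the word stored there does.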
By the way, the reason that you bring the IRQ pin low (i.e., to zero) is that it's actually an
active-low input (its name is written with an overbar in the data sheet).
Also note that the vector is the number in the address ($5000 in this case), not the table address
itself ($FFF2). Finally, as long as we're talking about the IRQ pin, we should mention that the
physical pin itself is connected to Port E, bit 1.
The table also contains some other interesting information, such as whether the interrupt can
be masked by a bit in the CCR and whether (and where) an interrupt has a mask in a local
register. For example, you can see that the Reset interrupt can't be masked by the I bit and
doesn't have a local mask. This means it's a non-maskable interrupt.
In contrast, to unmask the IRQ interrupt you have to clear the I bit and also set a bit in the
(local) IRQ Control Register (INTCR) at address $1E. (In the figure it's labeled IRQCR, but
that's been changed in subsequent versions of CodeWarrior.) The bit that you have to set is bit 6,
labeled IRQEN for IRQ Enable. The default value of the IRQEN bit on startup is 0
(disabled) for the same reason that the I bit is initially set.
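The two-level masking can be summarized as a single condition. The following Python sketch is only a model of that logic, not HCS12 code. The function name irq_is_unmasked is made up for illustration, and the position of the I bit (bit 4 of the CCR) is taken from the standard CPU12 condition code register layout rather than from the figure.

```python
# Toy model of the two masks that guard the IRQ interrupt.
# I_BIT position assumes the CPU12 CCR layout (S X H I N Z V C).

I_BIT = 0b00010000   # bit 4 of the CCR: 1 = maskable interrupts masked
IRQEN = 0b01000000   # bit 6 of INTCR:   1 = IRQ pin enabled


def irq_is_unmasked(ccr, intcr):
    """The IRQ is serviced only if I is clear AND IRQEN is set."""
    return (ccr & I_BIT) == 0 and (intcr & IRQEN) != 0


ccr, intcr = I_BIT, 0              # startup state described in the text
print(irq_is_unmasked(ccr, intcr)) # False

ccr &= ~I_BIT                      # what cli does
print(irq_is_unmasked(ccr, intcr)) # False: local enable still off

intcr |= IRQEN                     # what setting bit 6 of INTCR does
print(irq_is_unmasked(ccr, intcr)) # True
```

Either mask alone is enough to block the request, which is why the example code that follows changes both bits.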
Instant Quiz
1. Find the address of the interrupt vector for the MCCNT Modulus Down Counter. Is it
maskable by the I bit, and does it have a local mask?
2. Repeat question 1 for the swi (software interrupt) instruction.
As an example of how you might use this, suppose that you have a program running that
controls an automated drill press, and you would like to add an emergency shutdown switch
attached to the IRQ pin. Suppose also that your main code is located in addresses $4000 through
$4500 and you want to put your ISR starting at address $5000. Here's what your code might look
like:
        org     $4000
        cli                         ; this clears the I bit
        bset    INTCR, %01000000    ; set IRQ Enable in the Interrupt Control
                                    ; Register
forever:                            ; this is your original program that controls the drill press
        ; *
        ; *
        ; *
        bra     forever

        org     $5000               ; here's your emergency shutdown ISR
        ; *
        ; *
        ; *
        rti                         ; here's the rti instruction that returns control to
                                    ; the main program

        org     $FFF2
        dc.w    $5000               ; this is the interrupt vector at address $FFF2:$FFF3
Let's walk through it. The first thing you have to do is to enable the interrupt. You do this in
the first two instructions by adding the cli instruction and setting the IRQEN bit (bit 6) in the
INTCR register. You can't put this at the end of your original code because the bra forever
instruction will prevent the program from ever getting to it.
Next, the org $5000 directive tells CodeWarrior to put your ISR in address $5000, etc. You
end your ISR with an rti instruction.
Finally, you have to load the IRQ vector ($5000) into address $FFF2:$FFF3. You can't do
this with, e.g., a movw instruction because you can't write to $FFF2, which is in ROM, during
runtime. The org $FFF2 and the dc.w directives tell CodeWarrior to do this for you when the
program is loaded into the microcontroller.
As we've seen before, we can just let CodeWarrior figure out where to put the ISR. We can
also let CodeWarrior load the correct vector into the table. Here's the code:
        org     $4000
        cli                         ; this clears the I bit
        bset    INTCR, %01000000    ; set IRQ Enable in the IRQ Control Register
forever:                            ; this is your original program that controls the drill press
        ; *
        ; *
        ; *
        bra     forever

isr:                                ; here's your emergency shutdown ISR
        ; *
        ; *
        ; *
        rti                         ; here's the rti instruction that returns control to
                                    ; the main program

        org     $FFF2
        dc.w    isr                 ; this is the interrupt vector at address $FFF2:$FFF3
What we've done is to use the label isr to mark the beginning of the interrupt service
routine. CodeWarrior will figure out the address of the next line of code (i.e., the first line of the
ISR). Then, the line after the org $FFF2 directive will put that address into the correct spot in the
vector table. Note that we have to use dc.w because isr represents a 16-bit address.
The XIRQ Interrupt
You can clear or set the I bit to enable or disable the IRQ interrupt anywhere and as many
times as you want in your program. The XIRQ works a bit differently. On startup, the X bit in
the CCR is set, masking the interrupt. You can clear the X bit in software any time you want but
after you do clear it you can't set it again in software. That is, the XIRQ interrupt becomes
non-maskable until you turn off the microcontroller or hit the Reset button. The hardware does,
however, set the X bit temporarily while the XIRQ service routine itself is running, so that
routine won't be interrupted by another XIRQ request.
You can clear the X bit using an andcc #%10111111 instruction (AND Condition Code
Register).
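If the mask looks cryptic, it may help to see the AND worked out. The Python sketch below only models the bit arithmetic; it is not HCS12 code. The starting CCR value (S, X, and I set) is an assumed example, and the second mask, which clears the I bit, is the one the cli instruction is shorthand for on the CPU12.

```python
# Toy model of andcc: AND the CCR with an 8-bit mask.
# CCR bit order on the CPU12 is S X H I N Z V C (bit 7 down to bit 0).

def andcc(ccr, mask):
    """Return the new CCR value; every 0 in the mask clears that CCR bit."""
    return ccr & mask & 0xFF


ccr = 0b11010000                   # assumed example: S, X, and I are set

ccr = andcc(ccr, 0b10111111)       # clear bit 6 (X): unmask XIRQ
print(f"{ccr:08b}")                # 10010000

ccr = andcc(ccr, 0b11101111)       # clear bit 4 (I): same effect as cli
print(f"{ccr:08b}")                # 10000000
```

Every bit that is 1 in the mask is left alone, so the single 0 in each mask picks out exactly the bit you want to clear.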
Periodic Interrupts
It's often useful to be able to generate periodic interrupts, for example to read a sensor output
or periodically turn on some actuator. An easy way to do this is to use the Modulus Down
Counter we met in the last chapter. To see how this might work, suppose that every 20 ms we
want to produce a 1 ms pulse at Port B bit 0 to trigger a sensor, but we can't use a simple delay
loop because the processor has other tasks to perform. Here's the code that will do this, assuming
a 24 MHz bus clock:
        movb    #1, DDRB            ; Port B bit 0 is output
        movb    #$c6, MCCTL         ; set up MCCNT control register
        movw    #60000, MCCNT       ; Down Counter will count down from 60,000
        bset    MCFLG, $80          ; write a 1 to bit 7 to clear the flag
        cli                         ; enable interrupts
forever:                            ; put whatever code you want to run normally here
        bra     forever

interruptRoutine:                   ; this is the interrupt service routine
        ldy     #6000               ; loop count for the 1 ms delay
        bset    PORTB, 1            ; turn output on
msDelay:                            ; start ms delay
        dey
        bne     msDelay
        bclr    PORTB, 1            ; turn off output after ms delay
        bset    MCFLG, $80          ; you have to re-clear the MCFLG to repeat
        rti

        org     $ffca               ; Modulus Down Counter vector
        dc.w    interruptRoutine
The first two lines set up DDRB and the MCCNT control register; the third loads the number
60,000 into MCCNT; the fourth sets bit 7 of the MCFLG register to clear the flag and start the
counting; the fifth enables interrupts by clearing the I bit.
The next lines do whatever you want the microcontroller to be doing when it's not servicing
the interrupt.
The interrupt routine itself is just a simple 1 ms delay loop. The thing to note is that you need
to clear the MC flag again (by writing a 1 to bit 7 of MCFLG) each time the ISR runs, otherwise
it won't start counting down again the next time around. Also we used the trick of letting
CodeWarrior figure out the value of the interrupt vector.
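It's worth checking the timing numbers by hand. The Python sketch below just does the arithmetic; it is not HCS12 code. Two things in it are assumptions not stated in this chapter: the list of possible prescaler values (1, 4, 8, 16), and the figure of 4 bus cycles per pass through the dey/bne loop (1 cycle for dey plus 3 for a taken bne). Check both against the tables for your part.

```python
# Timing arithmetic for the periodic-interrupt example (24 MHz bus clock).
# Prescaler choices and the 4-cycle loop cost are assumptions; see lead-in.

BUS_CLOCK_HZ = 24_000_000


def counter_period_ms(load_value, prescaler):
    """Time for the down counter to count from load_value to zero."""
    return load_value * prescaler * 1000 / BUS_CLOCK_HZ


def delay_loop_ms(iterations, cycles_per_iteration=4):
    """Approximate time spent in a dey/bne delay loop."""
    return iterations * cycles_per_iteration * 1000 / BUS_CLOCK_HZ


# Period of the interrupt for each prescaler the counter might be using.
for prescaler in (1, 4, 8, 16):
    print(prescaler, counter_period_ms(60000, prescaler), "ms")

# Length of the pulse produced by the ISR's delay loop.
print(delay_loop_ms(6000), "ms")
```

Only one of the prescaler values gives the 20 ms period the example is after, and the delay loop works out to the 1 ms pulse width. Keep in mind that MCCNT is a 16-bit register, so the load value can't exceed 65,535.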
Instant Quiz
3. In the code above, the number $C6 is loaded into the MCCTL register. Using Figure 6-2
and Table 6-1, figure out the effect of each bit in this number, and, in particular, which
bit enables the interrupt, and what is the prescaler value that divides the bus clock.
4. How would you change the code to trigger a sensor every 40 ms?
        bset    DDRB, 1             ; Port B bit 0 is output
        movb    #%01000000, $1E     ; enable IRQ interrupt (IRQEN in INTCR at $1E)
        cli                         ; enable interrupts
forever:                            ; main program
        wai
        bra     forever

isr:                                ; this is the ISR
        bset    1,1                 ; turn bit on (address $01 is Port B)
        ldy     #6000               ; this is the 1 ms delay
delay_1ms:
        dey
        bne     delay_1ms
        bclr    1,1                 ; turn bit off
        rti                         ; return to main program, resume waiting

        org     $FFF2               ; IRQ vector = value of isr label
        dc.w    isr
The first line makes Port B bit 0 an output. The next two lines enable the IRQ interrupt (the
second of the two enables all maskable interrupts). The main program just contains the wai
instruction in a continuous loop, so the processor immediately goes into the wait state. It remains
in this state until the IRQ interrupt occurs.
When the IRQ interrupt does occur, the processor switches to the ISR, which turns on the bit,
waits 1 ms, then turns off the bit. When the ISR is finished, control returns to the instruction in
the main program immediately after the wai instruction. This is just the unconditional branch
instruction to the forever label, and the instruction after that is wai, so the processor goes right
back into the wait state, waiting for the next interrupt. In between intrusion detections the
processor clock is disabled, and, since we haven't turned on any other clocks (TCNT, PWM,
etc.) the processor is in a very low power state, drawing just a few microamps from the battery.
Instant Quiz
5. Look up the wai instruction and find its addressing mode(s), the number of clock cycles it
takes to execute when the interrupt occurs, and how it affects the CCR bits.
Homework
1. Write a short code fragment to continually read bits 0 and 1 of Port A and turn on bit 0 of
Port B if they are different (feel free to reuse any old code you have, or just write a short
lookup table to do this). At the same time, the microcontroller should monitor a sensor
attached to the IRQ pin and immediately stop processing and light a warning light at bit 0
of Port J if the bit goes low. You should initially clear the bit in the main program.
2. Write a code fragment to do the following:
a) The main program should continuously read the number at Port A and take an
average after each 8 readings. Any time the average is greater than 10, an external
alarm attached to Port B bit 0 should be turned on by writing a 1 to the bit. If the
average is less than 10 the alarm should be off.
b) At the same time, every 10 ms the processor should momentarily stop what it is doing
and read the input from a sensor attached to Port H, bit 0. If this bit is set a warning
light attached to bit 1 of Port B should turn on. If the bit is cleared the light should
turn off.
References
1. Huang, H.-W. (2010). HCS12/9S12: An Introduction to Software and Hardware Interfacing
(2nd Edition). Clifton Park, NY: Delmar, Cengage Learning.