Lecture 1
C Programming Language
Lecture content
The evolution of C
Conformance
Strengths and weaknesses
Overview of C programming
Lexical elements
The C preprocessor
Q&A
C Programming Language - Lecture
1
The evolution of C
When        Who, where                            Comments
mid-1960s   Ken Thompson, Bell Laboratories
1971        Dennis Ritchie, Bell Laboratories
1973
1978        Brian Kernighan and Dennis Ritchie
The evolution of C
When        Comments
1983
1989
1990
1995
1999
2011        New C standard adopted as ISO/IEC 9899:2011, C11 (or C1X)
For more information about C standards, follow the links below:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
http://www.open-std.org/JTC1/SC22/WG14/
Conformance
Both C programs and C
implementations can conform to
Standard C:
A C program is said to be strictly conforming
to Standard C if that program uses only the
features of the language and library
described in the Standard
The program's operation must not depend on
any aspect of the C language that the Standard
characterizes as unspecified, undefined, or
implementation-defined
Conformance
There are two kinds of conforming
implementations:
Hosted implementation: accepts any strictly
conforming program
Freestanding implementation: accepts any strictly
conforming program that uses no library facilities
other than those provided in the header files
float.h, iso646.h, limits.h, stdarg.h, stdbool.h,
stddef.h and stdint.h
Freestanding conformance is meant to accommodate C
implementations for embedded systems or other target
environments with minimal run-time support. For
example, such systems may have no file system
C is a small language
Provides a more limited set of features than other languages
It relies on a library of standard functions
C is permissive
Does not implement detailed error-checking mechanisms
It assumes that the programmer knows what he or she is doing, so it
allows a wider degree of freedom than other languages
Strengths
Weaknesses
Overview of C programming
A C program is composed of one
or more source files or
translation units, each of
which contains some part of the
entire C program: typically some
number of external functions
Common declarations are often
collected into header files and
are included into the source files
with a special #include
command
One external function must be
named main and this function
is where the program starts
EXAMPLE
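#include <stdio.h>
#define SIZE 10   /* a macro definition takes no '=' */
int size(int a[SIZE])
{
    int ret;
    /* a is adjusted to int *, so sizeof(a) is the size of a pointer,
       not the size of the array */
    ret = printf("size of array is:%zu\n", sizeof(a));
    return ret;
}
int main(void)
{
    int a[SIZE];
    (void)size(a);
    return 0;
}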
Overview of C programming
A C compiler independently
processes each source file and
translates the C program text into
instructions understood by the computer
The output of the compiler is usually
called object code or an object
module
When all source files are compiled, the
object modules are given to a program
called the linker
The linker resolves references
between the modules and adds functions
from the standard run-time library
The linker produces a single
executable program which can then
be invoked or run
[Figure: each C source file is compiled into an object file; the linker combines the object files with the library into an executable module]
Lexical elements
A C source file is a sequence of characters selected from a character set
C programs are written using the following characters (source character
set) defined in the Basic Latin block of ISO/IEC 10646
Class                              Characters
Latin capital and small letters    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
                                   a b c d e f g h i j k l m n o p q r s t u v w x y z
10 digits                          0 1 2 3 4 5 6 7 8 9
SPACE
horizontal tab (HT), vertical tab (VT),
and form feed (FF) control characters
Lexical elements
Class: 29 graphic characters and their official names
!   EXCLAMATION MARK
#   NUMBER SIGN
%   PERCENT SIGN
^   CIRCUMFLEX ACCENT
&   AMPERSAND
*   ASTERISK
(   LEFT PARENTHESIS
)   RIGHT PARENTHESIS
_   LOW LINE (underscore)
-   HYPHEN-MINUS
+   PLUS SIGN
=   EQUALS SIGN
~   TILDE
[   LEFT SQUARE BRACKET
]   RIGHT SQUARE BRACKET
'   APOSTROPHE
"   QUOTATION MARK
|   VERTICAL LINE
\   REVERSE SOLIDUS (backslash)
/   SOLIDUS (slash, divide sign)
;   SEMICOLON
:   COLON
{   LEFT CURLY BRACKET
}   RIGHT CURLY BRACKET
,   COMMA
.   FULL STOP
<   LESS-THAN SIGN
>   GREATER-THAN SIGN
?   QUESTION MARK
EX: Name three uses of the & symbol in a C source file.
EX: Name two uses of the % symbol in a C source file.
Lexical elements
Dividing the source program into lines can be
done with a character or character sequence
Additional characters are sometimes used in C
source programs, including:
formatting characters such as the backspace (BS)
and carriage return (CR) characters, which are treated as
spaces
additional Basic Latin characters, including the
characters $ (DOLLAR SIGN), @ (COMMERCIAL
AT), and ` (GRAVE ACCENT), which may appear only in
comments, character constants, string constants,
and file names
Lexical elements
The character set interpreted during the execution of a C program is
not necessarily the same as the one in which the C program is written
Characters in the execution character set are represented by their
equivalents in the source character set or by special character escape
sequences that begin with the backslash (\) character
In addition to the standard characters mentioned before, the execution
character set must also include:
a null character that must be encoded as the value 0, used for marking the
end of strings
a newline character that is used as the end-of-line marker: used to divide
character streams into lines during I/O
the alert, backspace, and carriage return characters
These source and execution character sets are the same when a C
program is compiled and executed on the same computer
For programs that are cross-compiled, when a compiler calculates the
compile-time value of a constant expression involving characters, it must
use the target computer's encoding, not the more natural source
encoding
Lexical elements
In C source programs the blank (space),
end-of-line, vertical tab, form feed, and
horizontal tab (if present) are known
collectively as whitespace
characters. Comments are also
whitespace
The end-of-line character or character
sequence marks the end of source
program lines. In some implementations,
the formatting characters carriage
return, form feed, and (or) vertical tab
additionally terminate source lines and
are called line break characters
A source line can be continued onto the
next line by ending the first line with a
reverse solidus or backslash (\)
character. The backslash and end-of-line marker are removed to create a
longer, logical source line
EXAMPLE
if (a==b) X=1; el\
se X=2;
Is equivalent to the single line
if (a == b) X=1; else X=2;
EXAMPLE
#define nine (3*3)
Is equivalent to
#define nine /* this
is nine
*/ (3*3)
Lexical elements
Comments:
Traditionally, a comment begins with an occurrence
of the two characters /* and ends with the first
subsequent occurrence of the two characters */
Beginning with C99, a comment also begins with
the characters // and extends up to (but does not
include) the next line break
Comments are not recognized inside string or
character constants or within other comments
Comments are removed by the compiler before
preprocessing
Standard C specifies that all comments are to be
replaced by a single space
EXAMPLE
// Program to compute the squares of
// the first 10 integers
#include <stdio.h>
void Squares ( /* no arguments */ )
{
int i;
/*
Loop from 1 to 10,
printing out the squares
*/
for (i=1; i<=10; i++)
printf("%d //squared// is %d\n",i,i*i);
}
EXAMPLE
To cause the compiler to ignore large parts
of a C program, it is best to enclose the
parts to be removed with the preprocessor
commands
#if 0
#endif
Lexical elements
The characters making up a C program are
collected into lexical tokens
There are five classes of tokens:
operators, separators, identifiers,
keywords, and constants
The compiler always forms the longest
tokens possible as it collects characters
in left-to-right order, even if the result
does not make a valid C program
Adjacent tokens may be separated by
whitespace characters or comments
EXAMPLE
Characters    C tokens
forwhile      forwhile
b>x           b, >, x
b->x          b, ->, x
b--x          b, --, x
b---x         b, --, -, x
Token class                      Tokens
Simple operators                 ! % ^ * - + = ~ | . < > / ?
Compound assignment operators    += -= *= /= %= <<= >>= &= ^= |=
Separator characters             , ; :
Lexical elements
An identifier, or name, is a sequence of Latin capital and small
letters, digits, and the underscore character
An identifier must not begin with a digit, and it must not have
the same spelling as a keyword. C is case sensitive
Standard C further reserves all identifiers beginning with an
underscore and followed by either an uppercase letter or
another underscore
C89 requires implementations to permit a minimum of 31
significant characters in identifiers, and C99 raises this
minimum to 63 characters
External identifiers, those declared with storage class extern,
may have additional spelling restrictions: C89 requires a
minimum capacity of only six characters, not counting letter
case. C99 raises this to 31 characters
Lexical elements
MISRA rules on identifiers
Identifiers in an inner scope shall not use the same
name as an identifier in an outer scope, and therefore
hide that identifier.
A typedef name shall be a unique identifier.
A tag name shall be a unique identifier.
No object or function identifier with static storage
duration should be reused.
No identifier in one name space should have the same
spelling as an identifier in another name space, with
the exception of structure member and union member
names.
No identifier name should be reused.
Lexical elements
C Keywords
auto _Bool* break case char _Complex* const continue default
restrict* do double else enum extern float for goto if
_Imaginary* inline int long register
return short signed sizeof static struct switch typedef union
unsigned void volatile while
* New in C99
Q: What is the meaning of the sizeof keyword?
Q: What is the meaning of the continue keyword?
Lexical elements
The lexical class of constants includes four different kinds of constants:
integers, floating-point numbers, characters, and strings
These are the rules for determining the radix of an integer constant:
If the integer constant begins with the letters 0x or 0X, then it is in hexadecimal
notation, with the characters a through f (or A through F) representing 10
through 15
Otherwise, if it begins with the digit 0, then it is in octal notation
Otherwise, it is in decimal notation
The unsigned suffix may be combined with the long or long long suffix
in any order.
EXAMPLE
Decimal    Hexadecimal    Octal
68         0x44           0104
Lexical elements
The value of an integer constant is always non-negative in the absence of overflow.
The actual type of an integer constant depends on its size, radix, suffix letters, and type
representation decisions made by the C implementation
MISRA rules on constants: Octal constants (other than zero) and octal escape
sequences shall not be used.
Constant               C89                                   C99
ddd                    int, long, unsigned long              int, long, long long
0ddd, 0xddd            int, unsigned, long, unsigned long    int, unsigned, long, unsigned long, long long, unsigned long long
dddU, 0dddU, 0xdddU    unsigned, unsigned long               unsigned, unsigned long, unsigned long long
dddL                   long, unsigned long                   long, long long
0dddL, 0xdddL          long, unsigned long                   long, unsigned long, long long, unsigned long long
Lexical elements
Constant                     C89               C99
dddUL, 0dddUL, 0xdddUL       unsigned long     unsigned long, unsigned long long
dddLL                        Not applicable    long long
0dddLL, 0xdddLL              Not applicable    long long, unsigned long long
dddULL, 0dddULL, 0xdddULL    Not applicable    unsigned long long
EXAMPLES
C constant    True value
0             0
32767         2^15 - 1
077777        2^15 - 1
32768         2^15
0100000       2^15
65535         2^16 - 1
0xFFFF        2^16 - 1
65536         2^16
0x10000       2^16
Lexical elements
Floating-point constants may be written with
a decimal point, a signed exponent, or both
Standard C allows a suffix letter (floating-suffix) to designate constants of types float (F,
f) and long double (L, l). Without a suffix, the
type of the constant is double.
The value of a floating-point constant is always
non-negative in the absence of overflow
If the floating-point constant cannot be
represented exactly, the implementation may
choose the nearest representable value V or the
larger or smaller representable value around V.
EXAMPLES
0.
3e1
3.14
.0
1.0E-3
1e-3
.00234
2e+9
Lexical elements
A character constant is
written by enclosing one or
more characters in
apostrophes.
A special escape mechanism
is provided to write
characters or numeric values
that would be inconvenient or
impossible to enter directly
in the source program.
EXAMPLES
Character    Value
'a'          97
'\r'         13
' '          32
'\0'         0
'\377'       255
'\23'        19
Lexical elements
A string constant is a (possibly empty)
sequence of characters enclosed in double
quotes
For each string constant of n characters, at
run time there will be a statically
allocated block of n+1 characters whose
first n characters are the characters from the
string and whose last character is the null
character, \0
This block is the value of the string constant
and its type is char [n+1]
Do not depend on all string constants being
stored at different addresses
EXAMPLES
""
"\""
"Input numbers:"
"One text and \
its continuation"
char p1[ ]= "Always writable";
char *p2 = "Possibly not writable";
const char p3[ ] = "Never writable";
char p4[ ] = "This long string is permissible "
"in Standard C";
[Figure: memory layout of p1, one character code per element, from p1[0] = 'A' (65) to the terminating null character \0 (0)]
Lexical elements
hex escape
Lexical elements
Character escape code    Translation
\a                       Alert (bell)
\b                       Backspace
\f                       Form feed
\n                       New line
\r                       Carriage return
\t                       Horizontal tab
\v                       Vertical tab
\\                       Backslash
\'                       Quote
\"                       Double quote
\?                       Question mark
MISRA rules on escapes: Only those escape
sequences that are defined in the ISO C standard
shall be used.
The C preprocessor
The C preprocessor is a simple macro processor that
conceptually processes the source text of a C program
before the compiler proper reads the source program
The preprocessor is controlled by special preprocessor
command lines, which are lines of the source file
beginning with the character #
The preprocessor typically removes all preprocessor
command lines from the source file and makes
additional transformations on the source file as
directed by the commands
The syntax of preprocessor commands is
completely independent of (although in some ways
similar to) the syntax of the rest of the C language
The C preprocessor
The preprocessor does not
parse the source text, but it
does break it up into tokens
for the purpose of locating
macro calls
Standard C permits whitespace
to precede and follow the #
character on the same source
line
Preprocessor lines are
recognized before macro
expansion
[Figure: C source file -> Preprocess -> Modified C source file -> Compile -> Object code]
The C preprocessor
Command        Meaning
#define
#undef
#include
#if
#ifdef
#ifndef        Conditionally include some text, with the sense of the test opposite to that of #ifdef.
#else          Alternatively include some text if the previous #if, #ifdef, #ifndef, or #elif test failed.
#endif
#line
#elif          Alternatively include some text based on the value of another constant expression if the previous #if, #ifdef, #ifndef, or #elif test failed.
defined        Preprocessor function that yields 1 if a name is defined as a preprocessor macro and 0 otherwise; used in #if and #elif.
# operator     Replace a macro parameter with a string constant containing the parameter's value.
## operator
#pragma
#error
The C preprocessor
The #define preprocessor command
causes a name (identifier) to become
defined as a macro to the
preprocessor
A sequence of tokens, called the
body of the macro, is associated
with the name
The #define command has two
forms:
EXAMPLES
#define BLOCK_SIZE 0x100
#define TRACK_SIZE (16-BLOCK_SIZE)
#define product(x,y) ((x)*(y))
#define incr(v,low,high) \
for ((v) = (low); (v) <= (high); (v)++)
#ifndef MAXTABLESIZE
#define MAXTABLESIZE 1000
#endif
The C preprocessor
Once a macro call has been expanded, the scan for macro calls resumes
at the beginning of the expansion so that names of macros may be
recognized within the expansion for the purpose of further macro
replacement
Macros appearing in their own expansion, either immediately or through
some intermediate sequence of nested macro expansions, are not
reexpanded in Standard C
EXAMPLE
#define plus(x,y) add(y,x)
#define add(x,y) ((x)+(y))
the invocation plus(plus(a,b),c) is expanded as shown next
Step    Result
1       plus(plus(a,b),c)
2       add(c,plus(a,b))
3       ((c)+(plus(a,b)))
4       ((c)+(add(b,a)))
5       ((c)+(((b)+(a))))
The C preprocessor
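EXAMPLES
#define SQUARE(x) x*x
The invocation
SQUARE(z++)
will be expanded into:
z++*z++
WHICH HAS THE SIDE EFFECT OF INCREMENTING z TWICE
(AND THE BEHAVIOR IS UNDEFINED, BECAUSE z IS MODIFIED TWICE
WITHOUT AN INTERVENING SEQUENCE POINT)
SOLUTION: USE A TRUE FUNCTION, NOT A
FUNCTION-LIKE MACRO
int square(int x) { return x*x; }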
The C preprocessor
The # token appearing within a
macro definition is recognized as a
unary "stringization" operator that
must be followed by the name of a
macro formal parameter
During macro expansion, the # and
the formal parameter name are
replaced by the corresponding actual
argument enclosed in string quotes
Merging of tokens to form new tokens
in Standard C is controlled by the
presence of a merging operator,
##, in macro definitions
In a macro replacement list, before
rescanning for more macros, the two
tokens surrounding any ## operator
are combined into a single token
EXAMPLE
#define TEST(a,b) printf( #a "<" #b "=%d\n", (a)<(b) )
The invocation TEST(0, 0xFFFF) will expand into
printf("0" "<" "0xFFFF" "=%d\n", (0)<(0xFFFF) );
Which will become after string concatenation:
printf("0<0xFFFF=%d\n", (0)<(0xFFFF) );
EXAMPLE
#define TEMP(i) temp ## i
The invocation TEMP(1) = TEMP(2 + k) + X will
expand into
temp1 = temp2 + k + X
The C preprocessor
MISRA rules on #define
Macros shall not be #define'd or #undef'd within a block.
#undef shall not be used.
A function should be used in preference to a function-like macro.
A function-like macro shall not be invoked without all of its
arguments.
Arguments to a function-like macro shall not contain tokens that
look like preprocessing directives.
In the definition of a function-like macro each instance of a
parameter shall be enclosed in parentheses unless it is used as the
operand of # or ##.
C macros shall only expand to a braced initializer, a constant, a
string literal, a parenthesised expression, a type qualifier, a storage
class specifier, or a do-while zero construct.
There shall be at most one occurrence of the # or ## preprocessor
operators in a single macro definition.
The # and ## preprocessor operators should not be used.
The C preprocessor
The #include preprocessor command causes
the entire contents of a specified source text file
to be processed as if those contents had appeared
in place of the #include command
The #include command has the following forms in
Standard C:
#include <char-sequence>
searches for the file in certain standard places according to
implementation-defined search rules
#include "char-sequence"
will also search in the standard places, but usually after
searching some local places, such as the programmer's
current directory
The C preprocessor
MISRA rules on #include:
#include statements in a file should only be
preceded by other preprocessor directives
or comments.
Non-standard characters should not occur in
header file names in #include directives.
The #include directive shall be followed by
either a <filename> or "filename" sequence.
Precautions shall be taken in order to
prevent the contents of a header file being
included twice.
The C preprocessor
The preprocessor
conditional commands
allow lines of source text to
be passed through or
eliminated by the
preprocessor on the basis of
a computed condition
The preprocessor
replaces any name in the
#if expression that is
not defined as a macro
with the constant 0
The expressions that may
be used in #if and #elif
commands include integer
constants and all the
integer arithmetic,
relational, bitwise and
logical operators
EXAMPLE
#define X86 0
#define ARM 0
#define PPC 1
#if X86
X86-dependent code
#endif
#if ARM
ARM-dependent code
#endif
#if PPC
PPC-dependent code
#endif
EXAMPLE
#define X86 1
#undef ARM
#undef PPC
#ifdef X86
X86-dependent code
#endif
#ifdef ARM
ARM-dependent code
#endif
#ifdef PPC
PPC-dependent code
#endif
The C preprocessor
EXAMPLES
EXAMPLE
#if defined (X86) && defined(ARM)
#error Inconsistent CPU definition!
#endif
EXAMPLE
#include "sizes.h" /* defines SIZE */
#if (SIZE % 256) != 0
#error "SIZE must be a multiple of 256!"
#endif
Q&A
1. Eliminate all the comments from the following C program fragment:
/ ** / */*"*/* / *" //* //**/*/
2. Which strings would be recognized as a sequence of C tokens? How many tokens would be found in each case?
   1. X++Y
   2. X+++Y
   3. -12uL
   4. X**2
   5. A*=B
   6. retValue = (100*i+j*k)/i-1
Forloop
100Miles
Miles100
_100Miles
100_Miles
Register
4. How is an escape sequence that does not obey the presented rules interpreted?
5. A Standard C compiler must perform each of the following actions on an input program. In what order are the actions
performed?
1. collecting characters into tokens
2. removing comments
3. processing line continuation