CW Language
Note: This is an old file, from around CW 2.0; some areas
of the language have been enhanced since this article was written. BTW: scan to
the end of the file; the 'filled in bits' are in patches.
Language overview
People often ask me ‘what kind of a language is Clarion?’ Is it
object oriented? Or procedural? Or functional? Or logical-reductive? Or list
oriented? Or a database language? And so the list goes on. Most of the languages
available today are a good idea taken to a logical extreme: everything is an
object, or a procedure, or a function, or a logic clause, or a list, or a data
item.
Using one of these tools (provided you pick the right one) is
very effective for most of the tasks you have; the other parts you have to
kludge. The solution to this kludging is to mix languages, picking the right
tool for the job. The downside is that the programmer has to learn a number of
different paradigms and keep them all in his head at once, and that is if the
languages can mix, which often they cannot.
So where does Clarion fit in? Well, my standard answer is that
Clarion is a solution oriented language. If COBOL is a hammer and C++ is
a chisel then Clarion is a Swiss Army penknife, on steroids. As we go through
the various language features you will find again and again I say ‘this is like
xxxx’ where xxxx is some mainstream language. Clarion is shameless in finding
and improving any useful feature found in other languages that fits in with the
Clarion ethos.
This brings me nicely around to the single, major, overriding
design concept in the Clarion language :
You must be able to read it, and understand it
We take this idea very seriously. When I have an idea for a new
language feature I write a small program containing it and show it to people; if
they can’t guess what it does just from reading the lines involved then it’s
back to the drawing board.
The remainder of this chapter assumes that the above design has
worked. I am not aiming to re-document the language, neither am I aiming to
write a primer on programming in general. What I am expecting is to bring to
view some features and facilities within the language which may have been
overlooked by someone reading the LRM too quickly and also to provide some
low-level detail which will aid a knowledgeable programmer seeking to extract
the most from the compiler.
In many ways this chapter is as solution oriented as the language
is: each feature is taken in turn and considered from all angles before the next
is considered.
The other concept which we will hit again and again is the old
maxim
There’s no such thing as a free lunch
As each feature is introduced we will often consider the
downside of some other feature. It is possible to worry about this far
too much; if something works for you then leave it alone. What we are aiming to
do here is look at all the tools available to you so that you can make an
informed decision when approaching a problem, but of course if you’re really
good with hammers ….
Program structure
Mapping a program
One of the first things to strike a new Clarion programmer is
that Clarion is designed for producing large applications. It is assumed that
your application will span many source files and that you will want to
maintain it over the years. Therefore the language pays particular attention to
helping you keep track of where your procedures are; it does this by providing a
MAP structure. A PROGRAM has one and only one map, and in it are defined all the
procedures (for all source files) that are to be globally accessible within the
application.
The MAP is further sub-divided into MODULE sections; these
define which source module a given procedure is in. A procedure defined outside
of a module section has to exist in the main PROGRAM module itself. The compiler
enforces that all procedures declared are implemented and that all implemented
procedures have a declaration; it also enforces that the procedure is in the
module specified in the map!
When linking to non-Clarion procedures you may use any string
you wish in the module statement, when linking to other Clarion modules the name
in the module statement must be the name of the source file (although the
.CLW is implicit).
The main PROGRAM also provides the repository for all of the
data in the program that is to be accessible to all procedures, data may come
anywhere between the PROGRAM and CODE sections of the program but not inside the
map.
After producing a map and declaring data you have the CODE
statement of the program, followed by the lines of code themselves; these are
automatically executed immediately after loading the Clarion program. Here is a
simple Clarion program. It will not actually work, as I have left out some
confusing detail, but it should give the idea of the structure of a Clarion
program:
  PROGRAM                    ! Only one source file has PROGRAM at the start
  MAP
    ! Outside a MODULE so must be defined inside this source file
    Who(),STRING
    MODULE('Helper')         ! Written in Helper.CLW file
      ReturnHello(),STRING
    END
  END
Variable  SHORT              ! Data may be declared between PROGRAM & CODE
  ! Unique to the PROGRAM is code that is automatically started when the
  ! program runs
  CODE
  Variable = 7
  TYPE( ReturnHello() & Who() )
Who FUNCTION
  CODE
  IF Variable = 7 THEN
    RETURN 'World'
  ELSE
    RETURN 'Mum'
  END
Tech tip: The above program does slightly more than you
might think, because two things are happening behind the scenes. Firstly, the
file EQUATES.CLW is included immediately after the PROGRAM statement; secondly,
the file BUILTINS.CLW is included immediately after the MAP keyword. The
first of these defines all the Clarion built-in constants, the second all the
Clarion built-in functions. For this reason all Clarion programs must
have a map, even if it is empty, or half the Clarion language disappears! It does
mean that if you have equates or prototypes that are needed in all your
programs then you can modify one of the above two files to include any other
source files you wish.
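As a minimal sketch of that last point: even a program that declares no procedures of its own still carries an empty MAP. The sketch assumes that the MESSAGE procedure is one of the built-ins prototyped through BUILTINS.CLW in your release.

```clarion
  PROGRAM
  MAP                        ! Empty, but still required: the built-in
  END                        ! prototypes are brought in at the MAP keyword
  CODE
  MESSAGE('Hello World')     ! Assumed built-in; unknown to the compiler without the MAP
```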
Modules
To supplement the example above we need to write the
Helper module; we will do this and then see what we have done.
  MEMBER('Hello')            ! PROGRAM in Hello.CLW
ReturnHello FUNCTION
  CODE
  IF Variable = 7 THEN
    RETURN 'Hello '
  ELSE
    RETURN 'Hi there '
  END
The MEMBER statement names the program of which this source
module is a member. The prototypes for this module are contained in the PROGRAM
MAP and the global data is in the PROGRAM too, so we can get on with defining the
functions.
Local maps and data
The PROGRAM with server MODULE system is very simple and elegant,
and the centralization of all key information into the PROGRAM unit has some
great advantages: you never have to look for something, as it is always in the
same place. However, it has the following disadvantages:
- Compile times. All MODULEs implicitly include the PROGRAM
module from the PROGRAM statement to the first CODE, so whenever the program
source file changes at all the entire program will recompile.
- Encapsulation (or rather lack thereof). The model assumes
that everything should be available to everybody, but this can be unwise.
Global data is very powerful, but if used in excess it becomes harder to
maintain a program, as there is no way to track interactions between different
procedures using the same data variable in different ways.
- Team-working is difficult. The PROGRAM source needs to be
checked out by just about everyone, which leads to merge problems.
- Efficiency and segment limits. In the 16 bit system there is
a 64K limit on the amount of data in the PROGRAM unit (but see ????), and
accessing global data from within a module is slightly slower than accessing
MODULE data.
To avoid these problems Clarion allows a MODULE to have its own
MAP and data (the latter being called module static). Using these it is possible
to have procedures and data that are only accessible within one module, which
enhances encapsulation. Additionally, the module MAP may have other module
sections, allowing a sub-set of modules to share a given set of declarations.
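A sketch of how the Helper module might look with a local MAP and module static data. FormatGreeting and CallCount are hypothetical names introduced only for illustration, and the sketch assumes ReturnHello is still prototyped in the PROGRAM MAP as in the earlier example.

```clarion
  MEMBER('Hello')              ! This module still belongs to Hello.CLW

  MAP                          ! Local MAP: visible only inside this source file
    FormatGreeting(),STRING    ! Hypothetical helper, private to this module
  END

CallCount  LONG                ! Module static data, private to this module

ReturnHello FUNCTION           ! Prototyped in the PROGRAM MAP, so globally callable
  CODE
  CallCount += 1
  RETURN FormatGreeting()

FormatGreeting FUNCTION        ! Only callable from within this module
  CODE
  IF CallCount > 1 THEN
    RETURN 'Hello again '
  ELSE
    RETURN 'Hello '
  END
```

Nothing outside this source file can call FormatGreeting or touch CallCount, so changing either does not force the rest of the program to recompile.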
Idea: Combining local maps with the INCLUDE statement, it
is possible to mimic the MODULE paradigm of a language such as M2. For each
module two files are produced: a .CLW file with the source and a .INC file
containing just a MODULE statement with associated procedure declarations. Then
in the MAP for each module you simply INCLUDE the .INC file of any modules that
this module can reference. This minimizes re-compiles whilst maximizing
encapsulation. The downside is that you now have to hunt for a given procedure
declaration rather than just looking in the PROGRAM MAP.
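The two-file arrangement might be sketched as follows. The file names Helper.INC and Client.CLW are hypothetical, and the sketch assumes ReturnHello is no longer prototyped in the PROGRAM MAP, since the .INC file now carries its declaration.

```clarion
! --- Helper.INC (hypothetical): the public interface of Helper.CLW ---
    MODULE('Helper')
      ReturnHello(),STRING
    END

! --- Client.CLW (hypothetical): a module that needs to call into Helper ---
  MEMBER('Hello')
  MAP
    INCLUDE('Helper.INC')      ! Pull in only the declarations this module uses
  END
```

Only the modules that INCLUDE Helper.INC need recompiling when a Helper prototype changes.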
Empty MEMBER
In general usage the Clarion prototyping system is very safe:
the main program names the module it expects a procedure to be in and the module
names the program it expects to be a part of. Prototypes exist in only one place
and therefore will be consistent, and as changing any prototype causes a global
recompile there will never be an ‘old obj’ problem. There is one significant
downside, however: it is impossible to share a source file between applications
because the top of the member module names the parent program. To side-step this
problem it is possible to leave out the program name from the MEMBER statement
(you still write the program map as you normally would).
Prototypes
Address parameters
Array parameters
Omitability and defaults
Mixed language
Name mangling
Procedures
Routines
Lexical phase
Statement sensitivity
Equate
Size and compiler limitations
Simple data
Safe typing
Base types
Implicit type conversions
Numbers
Integers
Reals
Binary Coded Arithmetic
Bfloat etc
Pictured strings
Dates
Clarion has both date and time data types, but these are really
only there for Btrieve compatibility. Clarion dates are normally stored in a
LONG data type and entered / displayed using a picture string. The binding between
a date as we know it and a LONG is done using the concept of a Clarion
Standard Date, which is very simply the number of elapsed days since
28/12/1800. The date range is valid until 31/12/2099.
The Clarion Standard Date makes it very simple to perform date
arithmetic. For example to find the number of days between two dates (inclusive
of one end) you can simply subtract them.
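As a small sketch of this (DaysBetween, StartDate and EndDate are hypothetical names, not part of the Clarion library):

```clarion
! Hypothetical helper: number of days from StartDate to EndDate,
! counting one end only. Both parameters are Clarion Standard Dates.
DaysBetween FUNCTION(long StartDate,long EndDate)   ! Returns LONG
  CODE
  RETURN EndDate - StartDate     ! Standard dates are day counts, so just subtract
```

For example, DaysBetween(DATE(1,1,1996),DATE(1,31,1996)) gives 30; add 1 if both the first and the last day are to be counted.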
Perhaps less obvious, but equally useful, it is possible to
deduce the day of the week of a given date simply by taking the number modulo 7
(a remainder of 0 is a Sunday). The following function returns the day of the
week for a given date.
DayOfWeek FUNCTION(long date)   ! Returns string
  CODE
  RETURN CHOOSE(date % 7 + 1,'Sun','Mon','Tue','Wed','Thu','Fri','Sat')
The other useful corollary of the C.S.D. is that you can step
between days simply by adding on one. The Clarion functions help this by
automatically wrapping if they receive an out of range value. (So if you ask for
the 32nd of March it will convert it into the 1st of April for you.)
This fact can be used to produce a function to answer questions
such as ‘what is the first Friday of the month?’. In the first function we
simply add on a day each time seeing when we hit the day we require.
! Returns Clarion standard date as long
! day   => 1 = Sunday, 2 = Monday etc
! occur => 1 for 1st in month, 2 for 2nd in month
! month => month in question
! year  => year in question
! eg to find 1st Friday in August 1996
!   DayOfMonth(6,1,8,1996)
DayOfMonth FUNCTION(byte day,byte occur,byte month,ushort year)
Base  LONG,AUTO
  CODE
  Base = DATE(month,1,year)
  LOOP occur TIMES
    LOOP UNTIL Base % 7 = day-1
      Base += 1
    END
    Base += 1
  END
  RETURN Base - 1
This is a general mechanism that will suit for many types of
computation (for example: how many Fridays in May?). However, it is usually
possible to come up with a quicker computed solution if you can face the modulo
arithmetic.
DayOfMonth FUNCTION(byte day,byte occur,byte month,ushort year)
Base    LONG,AUTO
CurDay  BYTE,AUTO
  CODE
  Base = DATE(month,1,year)
  CurDay = Base % 7            ! Day of first of month
  Day -= 1                     ! Sunday = 0
  Base += Day - CurDay         ! Add day difference
  IF Day >= CurDay THEN
    occur -= 1                 ! Have already moved forward
  END
  RETURN Base + occur * 7      ! Add on required weeks
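The day-stepping mechanism also covers the ‘how many Fridays in May?’ style of question. The following is a sketch only: CountInMonth and its local names are hypothetical, and it assumes that DATE wraps an out of range month (13) into January of the following year in the same way it wraps out of range days.

```clarion
! Hypothetical helper built on the day-stepping approach
! day => 1 = Sunday, 2 = Monday etc (same convention as DayOfMonth)
CountInMonth FUNCTION(byte day,byte month,ushort year)   ! Returns LONG
Cur      LONG,AUTO
EndDate  LONG,AUTO
Found    LONG                         ! Not AUTO, so it starts at zero
  CODE
  Cur     = DATE(month,1,year)        ! First of the month
  EndDate = DATE(month+1,1,year) - 1  ! Last of the month (relies on DATE wrapping)
  LOOP WHILE Cur <= EndDate
    IF Cur % 7 = day-1 THEN
      Found += 1                      ! This date falls on the requested day
    END
    Cur += 1                          ! Step to the next day
  END
  RETURN Found
```

For example, CountInMonth(6,5,1996) counts the Fridays in May 1996, of which there are five (the 3rd, 10th, 17th, 24th and 31st).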
Functions
Strings
Slicing
Functions
Groups
Selection syntax
Location
Unions
Initialization
Run-time binding
Execution Control
IF THEN ELSE
Looping
Evaluation of indices
Tail exiting
Controlled breaking
CASE
EXECUTE
CHOOSE
RETURN
Goto etc
DO and Exit
Recursion
One of the most powerful methods of flow control within a
Clarion program is recursion. A procedure or routine is recursive if it can call
itself. If the procedure or routine contains a call to itself then it is
directly recursive, if it contains a call to another procedure that calls back
to the first procedure then it is indirectly recursive. Clarion procedures and
routines can always be called recursively although some care has to be given to
data allocation if you actually want the calls to work!
Let us build a simple recursive function and see how it works:
The function is to convert from an unsigned long value to a
hexadecimal string. So 5 would become ‘05’ and 255 would become ‘0FF’ (as
written, the function always emits a leading zero). The strategy
we will use is very much ‘divide and conquer’. Let us start with an easier
function, one that converts from an integer to a hex string but only for numbers
up to 15.
NumToHex FUNCTION(ulong in_value)   ! Returns STRING
Chars  STRING('0123456789ABCDEF'),STATIC
  CODE
  RETURN Chars[in_value+1]
Easy huh? All we do is use the string slice syntax to pick a
suitable character from a string. We have to add 1 on to the incoming number as
Clarion arrays start from 1. Now we come to the key of recursion: we work out
how to build up the more complex case from the simpler ones. In this case the
rule is:
An n digit hex number is composed of n-1 digits followed by one
more digit. Further the first n-1 digits are the hex representation of the
number divided by sixteen, the remaining digit is the hex representation of the
number modulo sixteen.
So we have taken one problem (printing an n digit number) and
given ourselves two (printing an n-1 digit number, and a one digit number). Now
let us assume we can convert n-1 digit numbers to hex then our problem is
solved! Here is the code :-
NumToHex FUNCTION(ulong in_value)   ! Returns STRING
Chars  STRING('0123456789ABCDEF'),STATIC
  CODE
  RETURN NumToHex(in_value / 16) & Chars[in_value%16+1]
The routine almost works: if in_value has 4 digits then it takes
the last digit and concatenates it on the end of a call to NumToHex with the top
3 digits. Now when that procedure executes it will concatenate what is
now the last digit onto NumToHex called with the top 2 digits. And so the
calls go on, and on, and on, and on … for ever in fact. The function above has a
fatal flaw: if in_value is 0 then it will call NumToHex with an in_value of 0 ad
nauseam. We have correctly given our function an iteration step, but a recursive
function also needs a termination condition. In other words, there must be
some path through a recursive function that does not cause a recursive
call or the function will go on for ever. In our case the termination condition
should be when in_value is zero, which gives us the following:
NumToHex FUNCTION(ulong in_value)   ! Returns STRING
Chars  STRING('0123456789ABCDEF'),STATIC
  CODE
  IF in_value THEN
    RETURN NumToHex(in_value / 16) & Chars[in_value%16+1]
  ELSE
    RETURN '0'
  END
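To understand how this works it may help to run through the
execution sequence for NumToHex(255). The first call invokes NumToHex(15), which
in turn invokes NumToHex(0). That innermost call hits the termination condition
and returns ‘0’; NumToHex(15) then appends ‘F’ and returns ‘0F’, and finally the
original call appends another ‘F’ and returns ‘0FF’.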
When the recursion is at its deepest point there are 3 ‘copies’
of the NumToHex function active at one time. There are also 3 copies of
in_value, one for each function invocation. It is this ability to have multiple
copies of a function active at one time that leads to Clarion needing a stack
(early languages such as Fortran 66 did not need a stack and did not support
recursion).
Each call of a Clarion function not only has its own copy of
its parameters but also its own copies of local data, provided the local data
does not have the THREAD or STATIC attribute (remember FILEs and
VIEWs are both implicitly static). When you are writing a function you wish to
be recursive you need to decide for each variable whether you want one copy per
call or just one copy. I only needed one copy of the Chars array, and you should
try to minimise the amount of ‘multi-copy’ data in a recursive function or you
risk a stack overflow if the recursion goes too deep.
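The NumToHex function above always emits a leading ‘0’, because its termination case itself returns ‘0’. One possible variant, sketched here under the hypothetical name NumToHexTrim, stops one level earlier by treating any value below sixteen as the single digit base case. Like the original, it relies on the result of the division losing its fractional part when it is passed as a ULONG.

```clarion
NumToHexTrim FUNCTION(ulong in_value)   ! Returns STRING (hypothetical variant)
Chars  STRING('0123456789ABCDEF'),STATIC
  CODE
  IF in_value < 16 THEN
    RETURN Chars[in_value+1]            ! Single digit: no recursive call needed
  ELSE
    RETURN NumToHexTrim(in_value / 16) & Chars[in_value%16+1]
  END
```

With this termination condition 5 becomes ‘5’ and 255 becomes ‘FF’, and the deepest recursion for 255 is two invocations rather than three.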
Beware: There is a hidden gotcha if your iterative step
involves a string expression where the non-recursive part of the expression is
another function call that leaves a result on the string stack. With each
recursion you leave another element on the stack which is only 32 items deep.
For example
My_All FUNCTION(string char,ushort num)
  CODE
  IF num THEN
    RETURN CLIP(char) & My_All(char,num-1)   ! Line L1
  ELSE
    RETURN ''
  END
Line L1 is interpreted by the compiler as:
- Push char onto the stack
- CLIP top of stack
- Evaluate My_All(char,num-1)
- Concatenate top two items on the stack
At the deepest part of the recursion there will be num items on
the string stack; this must be less than 32. The workaround is to store the
result of the non-recursive function into a temporary variable. In this case we
can use char as its own temporary variable (it will not corrupt the incoming
parameter as it is a value parameter).
My_All FUNCTION(string char,ushort num)
  CODE
  IF num THEN
    char = CLIP(char)
    RETURN char & My_All(char,num-1)
  ELSE
    RETURN ''
  END
Technical point: The CW compiler evaluates all functions
involved in an expression before evaluating any other part of the expression, so
the string stack problem is not hit unless two (or more) functions are being
called. The CW 2.0 compiler evaluates from left to right, so the above problem
could be solved by putting the recursive call left-most in the expression
  RETURN My_All(char,num-1) & CLIP(char)
but this left to right evaluation is not part of the language
specification, so it should be used with caution.
Routines allow recursion although they cannot take
parameters or return results and they do not have local data. These restrictions
may seem draconian, but if you can live within them they allow recursive
routines that are much faster than functional ones as there is minimal stack
overhead for each routine invocation. Here is the My_All function using
routines.
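My_All FUNCTION(string char,ushort num)
Ins     CSTRING(20),AUTO
Buffer  CSTRING(32000)
  CODE
  Ins = CLIP(char)
  DO FillBuffer
  RETURN Buffer                ! Hand the assembled string back to the caller
FillBuffer ROUTINE
  IF num THEN
    num -= 1
    DO FillBuffer              ! Recursive call: no parameters, no new local data
    Buffer = Buffer & Ins
  END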
Queues
Performance issues
Entities
Memos
Blobs
Compiler Options
Run time checking
CW and OOP
Introduction to OOP principles
Need for user defined entities
Inheritance
Implicit inheritance
Polymorphism
*?
Functional overloading
Methods
Virtual
References
Typing and conversion
Dangers
Dynamics
Sparse arrays