Many older codes were developed during the reign of Fortran-66. This was a spartan standard. The fundamental features included:
Fortran-77 was quite an advancement. It addressed many of the portability concerns including:
MIL-STD 1753 - Standardized the following:
Fortran-90 introduced many new capabilities. Some of the major ones are:
For example, a typographical error in a variable name could lie hidden - silently injecting erroneous values into an expression. Just as bad, a misspelled name could receive the result of some computation which is then never used. IMPLICIT NONE comes to the rescue by requiring that every variable used be explicitly declared.
Realistically, many modern optimizing compilers can report some cases of 'used but not defined' and 'defined but never used' variables. However there are cases, especially in conditional expressions, where it is impossible for a compiler to give warning - so it must remain silent. Once again, IMPLICIT NONE comes to the rescue.
Note that even though the IMPLICIT statement has been a part of Fortran since Fortran-77, IMPLICIT NONE was not standardized until Fortran-90. It was, however, a part of MIL-STD 1753, and was almost universally implemented in Fortran-77 compilers.
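As a minimal sketch of the second failure mode described above (the program and variable names are invented for illustration), the following compiles cleanly only because IMPLICIT NONE is deliberately left out:

```fortran
program typo_demo
  ! IMPLICIT NONE is deliberately omitted so that the typo below compiles.
  real :: total

  total = 10.0
  ! Typo: "totl" was meant to be "total". Under implicit typing the
  ! compiler quietly creates a new REAL variable named totl.
  totl = total + 5.0
  print *, 'total =', total   ! still 10.0; the update went to totl
end program typo_demo
```

Adding IMPLICIT NONE after the PROGRAM statement should turn the misspelling into a compile-time error, since totl has no declared type.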
A system which uses static allocation of data will permanently set
aside 1000 numeric storage units for array SCRATCH. This space is
reserved for the exclusive use of subroutine COMPUTE. If subroutine
COMPUTE represents a routine in a code which is rarely, if ever, used,
then this space is wasted.
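A minimal sketch of the kind of routine being discussed follows. Only the names COMPUTE and SCRATCH and the size 1000 come from the text; the driver program and the body of the routine are assumptions made for illustration.

```fortran
program static_demo
  implicit none
  real :: answer

  call compute(answer)
  print *, answer
end program static_demo

subroutine compute(answer)
  implicit none
  real, intent(out) :: answer
  real    :: scratch(1000)   ! local work array: 1000 numeric storage units
  integer :: i

  ! Hypothetical work: fill the scratch array and reduce it.
  do i = 1, 1000
     scratch(i) = real(i)
  end do
  answer = sum(scratch)
end subroutine compute
```

Whether scratch lives in static storage or on a stack is up to the compiler and its options; the source code is the same either way.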
Contrast this to an environment where data is dynamically allocated. In this scenario, space for SCRATCH will only be set aside upon entry to the subroutine. As soon as the subroutine has completed, the space is released and available for reuse. This may sound expensive, and perhaps it was in the 1950s on machines with no index registers, but in reality it is not. Consider the case where the underlying run-time environment contains a data structure known as a stack for use by local data. A register is set aside as a stack pointer. This stack pointer is simply bumped forwards (or backwards) by however much space is needed. All local memory addresses are then expressed as offsets from this pointer.
Contrary to popular belief, no Fortran standard has ever mandated static allocation. The earliest IBM compilers simply implemented static allocation. As other vendors wrote Fortran compilers they tended to use static allocation as well. However there were exceptions. Burroughs (now Unisys) Fortran has always placed local data on a stack. Cray started offering stack allocation by compiler option in the early 1980s as multitasking started to become popular.
So there are at least two major advantages to stack allocation:
Typically the problem is that one or more routines have local variables
which are assumed to be the same between invocations of that routine. Consider
the following subroutine which prints a page count on a listing:
The above routine has two problems. First, it assumes that page_number
was magically initialized to zero prior to the first call. Second, it assumes
that the values will be retained between invocations. Neither assumption
has ever been a requirement of the Standard - though it usually worked
on compilers which had static allocation as a default. With stack allocation,
page_number will have stack trash as an initial value.
To fix the above, first, the SAVE attribute should be specified to retain the
value of page_number between calls. Second, we should give page_number an initial
value. The following example shows this using a Fortran-90 style declaration:
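A minimal sketch of the corrected routine, assuming a hypothetical routine name (print_page_number) and a small driver program; only the variable name page_number and the initial value of zero come from the text:

```fortran
program page_demo
  implicit none
  integer :: i

  do i = 1, 3
     call print_page_number()
  end do
end program page_demo

subroutine print_page_number()
  implicit none
  ! SAVE retains the value between calls; "= 0" supplies the initial value.
  integer, save :: page_number = 0

  page_number = page_number + 1
  write (*, '(a, i0)') 'Page ', page_number
end subroutine print_page_number
```

The three calls print Page 1, Page 2 and Page 3, regardless of whether the compiler uses static or stack allocation for local data.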
Note that technically, just by initializing the variable in its declaration,
or in a DATA statement, the variable acquires the SAVE attribute. But
spelling it out makes the usage obvious to the reader.
No guidelines or requirements are imposed as to how big, in terms of numbers of bits or bytes, a numeric storage unit is. This is intentional to allow Fortran to be easily implemented on a wide variety of hardware. These days one numeric storage unit tends to be 32 bits to accommodate the IEEE floating point standard. However in the past, I've used computers where a single numeric storage unit was 16, 18, 24, 32, 36, 60, or 64 bits. Even 48 bit numeric storage units are not unknown.
Note that even though DOUBLE PRECISION is required to occupy twice the storage of REAL, the standard does not require twice the precision in calculations. Thus, even if only 1 additional bit of precision were actually used, an implementation would meet the requirements of the standard.
Fortran-77 introduced the CHARACTER data type. Intentionally, there is no relationship defined between character storage units and numeric storage units. This is why storage association (equivalencing and so on) between the two is undefined in the Standard - even though it is commonly implemented as an extension by many compilers.
Since integer and real elements each occupy one numeric storage
unit, iarray(1) occupies the same memory location as rarray(1), iarray(2)
occupies the same location as rarray(2), and so on.
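A minimal sketch of this kind of association, using EQUIVALENCE. The array names iarray and rarray come from the text; the array size and the two "phases" of work are assumptions made for illustration. The storage is used first through one name and then through the other, never both at once.

```fortran
program equiv_demo
  implicit none
  integer :: i
  integer :: iarray(4)
  real    :: rarray(4)
  equivalence (iarray, rarray)   ! both names refer to the same storage

  ! Phase 1: treat the storage as integers.
  do i = 1, 4
     iarray(i) = i * i
  end do
  print *, sum(iarray)           ! 1 + 4 + 9 + 16 = 30

  ! Phase 2: reuse the same storage as reals. The integer values are
  ! gone after this; only one view is meaningful at a time.
  do i = 1, 4
     rarray(i) = 0.5 * real(i)
  end do
  print *, sum(rarray)           ! 0.5 + 1.0 + 1.5 + 2.0 = 5.0
end program equiv_demo
```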
In the olden days, especially with static allocation, it was common to take advantage of storage association to reuse memory. Thus, storage was overlaid in space, but not in time. These days, between stack allocation of local data and dynamic memory management, there is little need to explicitly overlay memory in this way.
A second usage was to allow integer access to floating point (or other data) in order to get bit-level access to the data. In this case, the storage is associated in both space AND time. This latter usage was especially common in Fortran-66 level code with packed Hollerith data. With the introduction of character data type in Fortran-77, most such code should have been thrown away years ago. In Fortran-90, the TRANSFER intrinsic allows bit-level data motion between different data types.
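As a minimal sketch of the TRANSFER approach, the following copies the bit pattern of a real into an integer and back. It assumes a compiler that provides the Fortran 2008 kind constants int32 and real32 from the intrinsic module iso_fortran_env, and IEEE 754 single precision hardware for the value shown in the comment.

```fortran
program transfer_demo
  use iso_fortran_env, only: int32, real32   ! Fortran 2008 kind constants
  implicit none
  real(real32)    :: x
  integer(int32)  :: bits

  x = 1.0_real32
  bits = transfer(x, bits)   ! copy the bit pattern; no numeric conversion
  ! On IEEE 754 hardware the pattern for 1.0 is 3F800000.
  write (*, '(a, z8.8)') 'bit pattern of 1.0: ', bits

  x = transfer(bits, x)      ! and back again, bit for bit
  print *, x
end program transfer_demo
```

Unlike the EQUIVALENCE trick, the two variables here have separate storage, so nothing depends on how the compiler lays out memory.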
Storage association can also occur with COMMON blocks. It is actually
legal to have a given common block described in multiple ways in multiple
routines. For example, in the spirit of the above routine:
In the above, the storage associated with the common block /scratch/
is shared by the two routines.
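A minimal sketch of one common block described two ways. The block name /scratch/ comes from the text; the routine names, array names and sizes are invented for illustration. Each routine uses the block purely as private work space.

```fortran
program common_demo
  implicit none

  call count_things()
  call average_things()
end program common_demo

subroutine count_things()
  implicit none
  integer :: i
  integer :: iwork(100)
  common /scratch/ iwork        ! this routine sees 100 integers

  do i = 1, 100
     iwork(i) = i
  end do
  print *, sum(iwork)           ! 5050
end subroutine count_things

subroutine average_things()
  implicit none
  real :: rwork(100)
  common /scratch/ rwork        ! same storage, viewed as 100 reals

  rwork = 2.0
  print *, sum(rwork) / 100.0   ! 2.0
end subroutine average_things
```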
Last, a similar effect can be seen between a caller and a callee. Consider:
In the above, routine A considers the storage as integer, and routine
B considers it real.
Again, better techniques are available in modern Fortran compilers to make dependence on storage association obsolete.
Automatic arrays (which date back to ALGOL-60) are simply local
arrays whose size is passed in. Upon invocation of the routine, storage
is allocated as if on the end of a stack. The size is passed in via a dummy
argument or through a global value in a common block or module. Here is
a simple example:
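A minimal sketch, assuming a hypothetical routine name (fill_and_average) and work array name (work); the size arrives through the dummy argument n:

```fortran
program auto_demo
  implicit none

  call fill_and_average(5)
  call fill_and_average(1000)
end program auto_demo

subroutine fill_and_average(n)
  implicit none
  integer, intent(in) :: n
  real    :: work(n)            ! automatic array, sized on entry from n
  integer :: i

  do i = 1, n
     work(i) = real(i)
  end do
  print *, n, sum(work) / real(n)   ! mean of 1..n is (n+1)/2
end subroutine fill_and_average
```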
Upon activation of the routine, the array is sized correctly. Then
upon exit, storage is released for use by other routines. Thus there is
no chance for memory leakage.
Likewise, arrays with the allocatable attribute may also have a variable
size. However allocatable arrays are only allocated via the ALLOCATE statement.
Additionally, they may be deallocated with the DEALLOCATE statement.
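A minimal sketch of an allocatable array, assuming a hypothetical array name (table) and size; the STAT= specifier receives zero on success and a nonzero, processor-dependent value on failure:

```fortran
program alloc_demo
  implicit none
  real, allocatable :: table(:)
  integer :: n, ierr

  n = 1000
  allocate (table(n), stat=ierr)
  if (ierr /= 0) then
     print *, 'allocation failed, stat =', ierr
     stop
  end if

  table = 1.0
  print *, size(table), sum(table)

  deallocate (table)
end program alloc_demo
```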
Note in the above example that ALLOCATE can also return an error
status. This allows the program to handle an allocation error condition.
Also note that if an allocatable variable has local scope in the routine (i.e., it is not a global variable contained in a module), a DEALLOCATE is not needed at the end of the routine. The compiler is required, by Fortran-95, to automatically deallocate local allocatable arrays in order to prevent memory leakage.
Allocatable arrays can be made globally accessible by placing them in a module. In this case, no garbage collection is possible.
One advantage of free form over fixed form involves the potential for a typographical error to go undetected by the compiler - even with IMPLICIT NONE. In fixed form, it is possible for a variable name to run past column 72 and get truncated, yet still be a legal name. For example, what if variable IVALUE accidentally went beyond the magic 72nd column and the 'VALUE' portion was treated as a comment? The compiler would use 'I' as the variable name and bad results would occur. With free form, this error cannot occur.
Source code can be written so it can be compiled as both fixed and free by following a few simple rules:
In the above, common block /block1/ is defined in the main program
and both callees. However /block2/ is only defined in the callees. If data
needs to be shared between sub1 and sub2 via /block2/ there could be a
problem. However if they are merely sharing scratch space there will not
be a problem.
Why?
Well, the Standards, all of them, allow /block2/ to go out of scope between the calls to sub1 and sub2. So if the two subroutines need to share data through it, the data may not be coherent. Either a SAVE statement should be used, or /block2/ should be declared at a point in the call tree where it won't go out of scope at the wrong time. The /block1/ common block was declared in the main program, so it never goes out of scope.
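A minimal sketch of the SAVE fix. The names sub1, sub2 and /block2/ come from the text; the variable in the block and the driver program are assumptions made for illustration. Note that a common block saved in one routine has to be saved in every routine that declares it (the main program excepted).

```fortran
program save_demo
  implicit none

  call sub1()
  call sub2()
end program save_demo

subroutine sub1()
  implicit none
  integer :: counter
  common /block2/ counter
  save /block2/        ! keep the block defined after sub1 returns

  counter = 42
end subroutine sub1

subroutine sub2()
  implicit none
  integer :: counter
  common /block2/ counter
  save /block2/

  print *, counter     ! 42, guaranteed because the block is saved
end subroutine sub2
```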
Module data is treated in a very similar fashion to COMMON data, and can also go out of scope. So there must be a SAVE statement in the module, or a USE statement in a program unit high enough in the call tree that problems are not encountered.
This concept was placed in the Standard because it allows yet another mechanism for overlaying data. This form of overlaying is rarely seen in modern implementations, but was quite common in the olden days.
Note that the ability to name a block data was introduced in Fortran-77.
Note also that BLOCK DATA routines are becoming obsolete with modules. Module variables can be initialized at compile time just like any other data.
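A minimal sketch of module data initialized at compile time, which is the job BLOCK DATA used to do for common blocks. The module and variable names are invented for illustration.

```fortran
module constants
  implicit none
  save                                      ! keep module data alive throughout
  real    :: tolerance = 1.0e-6             ! initialized at compile time
  integer :: limits(3) = (/ 10, 20, 30 /)
end module constants

program module_demo
  use constants
  implicit none

  print *, tolerance, sum(limits)           ! sum(limits) is 60
end program module_demo
```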
Since a Hollerith constant was typeless, it could be placed into any numeric data type without type conversion. Per the '66 Standard, Hollerith constants could only be used in 3 places:
Of interest, consider the case where a data type could hold more than the number of characters specified in the Hollerith constant. In this case, the compiler was required to left-justify the characters and 'blank fill' the unused bits. Note that 'zero-fill' variants, with both right- and left-justification, were common extensions to most compilers.
The only really portable use of Hollerith constants was to store 1 character per integer. Of course this was quite wasteful of memory because integers could typically hold from 3 to as many as 10 characters each. Considering the restricted memory sizes of the time, there was quite a bit of pressure to pack multiple characters into each integer. Then highly non-portable masking+shifting code was needed to extract/insert characters.
Thankfully the situation was rectified in Fortran-77 with the CHARACTER
data type. Hollerith constants were moved to an appendix in the '77 Standard,
and were completely gone in the '90 Standard.
The POSIX 1003.9-1992 Fortran bindings are a standardized set of
library calls for making various low level requests of the operating system.
There are dozens of calls available - documented on many systems in the
intro_pxf man pages. Three of the most popular are the PXFGETENV (get
environment variable) and PXFGETARG (get command line argument) calls,
and the IPXFARGC (get argument count) function.
Some people dislike the POSIX bindings because they are not 'Fortran-90-like'. The calls were standardized in the early 1990s when Fortran-77 was still prevalent. The committee therefore took the conservative approach that, with the exception of long external names, all calls had to be usable in a Fortran-77 environment.
For example, consider the case where a program needs to know the size of a given file. The PXFSTAT routine is used. However, first PXFSTRUCTCREATE must be called to create an appropriate data structure for return values. A handle is passed back to the user for reference. The user then calls PXFSTAT - giving the desired file name and the handle as input arguments. The PXFSTAT routine updates the structure. Then the user calls PXFINTGET to extract the desired field from the structure. Finally, PXFSTRUCTFREE is called to release the structure.
Old Fortran vs Fortran-90 calling sequence
Two main pointer types:
Fortran-90 pointers
Cray pointers (non-Standard, but commonly used)
The POSIX Fortran standard defines a pair of procedures called PXFGETSUBHANDLE and PXFCALLSUBHANDLE which are sufficient for simple single-argument calls.
If the POSIX Fortran routines are not available, or if more advanced
calls are needed, a pair of simple C routines may be written to obtain
the address of an external name, then call it. Many compilers follow
the convention that Fortran EXTERNAL names are passed by value. So
the C routine to return the address can be written simply as:
Call the above with something like:
A second C routine can be written to call a subroutine - given a pointer
to it. The following passes two call-by-reference arguments, one
integer and one real:
To call this from Fortran:
Calling C routines from Fortran
The conventions for calling C routines from Fortran date back to the original f77 compiler on 7th edition unix, and these conventions have formed a de facto standard in many environments. Since Fortran is case-insensitive, and unix systems tend to like things in lower case, the original compilers folded Fortran external names to lower case. Then, to distinguish the Fortran namespace from the C namespace, an underscore character was appended to the end of the Fortran name. So calling routine XYZZY would result in an external name of 'xyzzy_' - which is still a legal C name. Common block names followed similar conventions. SGI systems follow the above conventions.
For historical reasons, some systems diverge from the above. For example, in the Cray environment, when unix came along, existing Fortran compilers, libraries, and linkers were ported from the proprietary OS to Unicos. In order to ease the conversion, the naming conventions were not changed. On these systems the names are in upper case with no underscore characters. Other systems are known to place underscores before the name.
Argument passing is another place where problems lie. Fortran implementations generally, but not always, depend on call-by-reference. The address of the actual argument is passed by value to the callee, who must then dereference the argument. Since call by reference is easy to emulate in C, there are few problems passing numeric variables.
Character variables are problematic. Fortran character data has a length associated with each datum. In C, there is no such thing as a character string - only arrays of char. So the length must be passed in via some mechanism. The de facto standard is to add, for each character variable in the argument list, an additional actual argument containing the length of that character datum. The C callee can then use the extra value(s) to properly handle the strings.
Some implementations use other mechanisms to pass the necessary character string information. For example, some Cray implementations use a Fortran Character Descriptor (FCD) with both address and length passed in a single word. A special header file and macros are used to access the address and length. Again this dates back to pre-unix implementations carried forward into Unicos.
Fortran-90 and C++ further confuse issues.
Calling Fortran routines from C
Calling a Fortran routine from C is simply the opposite of calling C routines from Fortran. Generally one calls the Fortran name with the proper case and underscore convention, passes the address of each argument, and passes the lengths of character strings.
Why is Fortran code faster than C code (aliasing)
In a C function, pointers are unrestricted. That is, they can point to any location in memory without restrictions. Also, multiple pointers can point to a single object. And C uses pointers for many things - including passing arrays and data structures into sub-functions. Let's imagine that we are cruising through some code that looks like the following:
*b = *a + 23;
c = *a * 42;
A good optimizing compiler would like to load the value a points to from memory once, and use it in both expressions. However, if there is any possibility that a and b point to the same location in memory, the store through b changes the value seen through a, and reusing the earlier load would give erroneous results.
In Fortran, the need for pointers is greatly lessened by various language features. Arrays and data structures may be created dynamically and passed into and out of subprograms without using user-visible pointers. The compiler knows at compile time that objects with different names point to unique places in memory. The only exception is a variable which has the target attribute. So in a Fortran equivalent to the above code, the compiler is free to optimize the memory reference (i.e., keep the data in a register for reuse) unless a and b are targets.
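A minimal sketch of a Fortran equivalent, assuming a hypothetical subroutine name (update) and passing c as a third argument for convenience. Because a and b are ordinary dummy arguments without the TARGET attribute, the compiler may assume that assigning to b does not change a.

```fortran
program alias_demo
  implicit none
  integer :: a, b, c

  a = 1
  call update(a, b, c)
  print *, b, c              ! 24 and 42
end program alias_demo

subroutine update(a, b, c)
  implicit none
  integer, intent(in)  :: a
  integer, intent(out) :: b, c

  ! a and b are distinct dummy arguments with no TARGET attribute, so the
  ! compiler is free to load a once and reuse it for both expressions.
  b = a + 23
  c = a * 42
end subroutine update
```

The price is paid by the caller: a call such as update(x, x, c), which associates the same variable with both a and b while one of them is modified, is not standard-conforming.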
A C protagonist may counter that it is easy to write the above code as:
temp = *a;
*b = temp + 23;
c = temp * 42;
However consider a non-trivial application (say 20-100k lines of code) where speed is important. Since C requires user-visible pointers for so many things, there may be tens of thousands of instances where the compiler is forced to be conservative. Is the C programmer really going to look for all those situations? The Fortran programmer need never worry - he will always get good optimization.
C99 (known as C9x while in draft) introduces a new restrict qualifier for pointers - which says that the pointee is not aliased by other pointers. Use of this qualifier can help optimization by allowing the pointee to remain in a register for reuse. But note that this is akin to locking a barn door after the animals have escaped. The default action is wrong (for speed), and once again few programmers will have the desire to look for every case where restrict can be used. The Fortran action is to go fast by default and make potential aliasing problems explicit via targets.
Page created August 29, 2000
Updated August 30, 2001