Type system

From Wikipedia, the free encyclopedia

In computer science, a type system is a system used with a programming language to help reduce bugs in computer programs.[1] A type system is used as part of defining interfaces between different parts of a computer program, and then checking that the parts have been connected in a consistent way. This is accomplished by extending the syntax of the language to include some form that indicates a type that a primitive value in the language can be associated with, along with a process for using the types to check that the parts have been connected in a consistent way. The checking can happen at various program lifecycle phases, such as at compile time, at run time, or some combination of the two. Because of the wide variation among languages, type systems likewise vary widely, so a summary that remains valid for all variations must be somewhat lengthy.

In general, a computer program is made up of separate parts that invoke each other. Each part defines an interface by which that part is invoked, which includes the name of the program-part and a way to pass values to it. When the language used to write the program includes a type system, the interface declares or implies a type, or a family of types, for each value that can be sent to the program-part. This declaration is used by a type-checking process to ensure that a given connection between program-parts can only transport values of the type declared in the interface. The value of such a check lies in how the behavior of a program-part is specified: the behavior assumes the bits of information in the value are arranged according to a particular pattern. If the received bits are arranged according to a different pattern, then the behavior produces nonsensical results. The more advanced kinds of type system reduce the number of such declarations required.

A type system has multiple aspects. One aspect is the set of types that are part of the language syntax. Another aspect is the active entities that check the connections between the program parts. These entities execute separately from the program, and may execute during any of a number of the program lifecycle phases. They check that the pieces of the program have been connected in accordance with the declared interfaces. The advanced type systems that minimize the number of declarations made in the source code do so by including active entities that perform computations related to the types. These may even modify code or generate code in accordance with the types.

An example of a simple type system is that of the C language. The portions of a C program are the function definitions. One function is invoked by another function. The interface of a function states the name of the function and a list of values that are passed to the function's code. The code of an invoking function states the name of the invoked function, along with the names of variables that hold values to pass to it. During execution, the values are placed into temporary storage, then execution jumps to the code of the invoked function. The invoked function's code takes the values and uses them in its instructions. If the instructions inside the function are written with the assumption of receiving an integer value, but the calling code passed a floating-point value, then the invoked function will compute the wrong result. The C compiler checks the type declared for each variable sent against the type declared for each variable in the interface of the invoked function. If the types do not match, the compiler reports a compile-time error.

The type system of C is composed of the syntax and grammar of the available types, plus the portions of the parser that handle the type declarations made in the source code, plus the portions of the compiler that check that the types of the variables passed to an invoked function match the types stated in the function declaration.

In greater technical depth, a type system associates a type with each computed value. By examining the flow of these values, a type system attempts to ensure or prove that no type errors can occur. The particular type system in question determines exactly what constitutes a type error, but in general the aim is to prevent operations expecting a certain kind of value from being used with values for which that operation does not make sense (logic errors); memory errors will also be prevented. Type systems are often specified as part of programming languages, and built into the interpreters and compilers for them; although the type system of a language can be extended by optional tools that perform additional kinds of checks using the language's original type syntax and grammar.

A compiler may also use the static type of a value to optimize the storage it needs and the choice of algorithms for operations on the value. In many C compilers the float data type, for example, is represented in 32 bits, in accord with the IEEE specification for single-precision floating-point numbers. They will thus use floating-point-specific microprocessor operations on those values (floating-point addition, multiplication, etc.).

The depth of type constraints and the manner of their evaluation affect the typing of the language. A programming language may further associate an operation with varying concrete algorithms on each type in the case of type polymorphism. Type theory is the study of type systems, although the concrete type systems of programming languages originate from practical issues of computer architecture, compiler implementation, and language design.


Fundamentals

Formally, type theory studies type systems. A programming language must give the type system an opportunity to type check programs, whether at compile time or at run time, with types manually annotated or automatically inferred. As Mark Manasse concisely put it:[2]

The fundamental problem addressed by a type theory is to ensure that programs have meaning. The fundamental problem caused by a type theory is that meaningful programs may not have meanings ascribed to them. The quest for richer type systems results from this tension.

Assigning a data type, called typing, gives meaning to a sequence of bits such as a value in memory or some object such as a variable. The hardware of a general purpose computer is unable to discriminate between, for example, a memory address and an instruction code, or between a character, an integer, or a floating-point number, because it makes no intrinsic distinction among the possible meanings of a sequence of bits. Associating a sequence of bits with a type conveys that meaning to the programmable hardware to form a symbolic system composed of that hardware and some programmer.

A program associates each value with at least one particular type, but it can also occur that one value is associated with many subtypes. Other entities, such as objects, modules, communication channels, and dependencies, can become associated with a type. Even a type can become associated with a type. An implementation of a type system could in theory associate identifications such as:

  • data type – a type of a value
  • class – a type of an object
  • kind – a type of a type, or metatype

These are the kinds of abstractions typing can go through, on a hierarchy of levels contained in a system.

When a programming language evolves a more elaborate type system, it gains a more finely grained rule set than basic type checking, but this comes at a price when the type inferences (and other properties) become undecidable, and when more attention must be paid by the programmer to annotate code or to consider computer-related operations and functioning. It is challenging to find a sufficiently expressive type system that satisfies all programming practices in a type-safe manner.

The more type restrictions that are imposed by the compiler, the more strongly typed a programming language is. Strongly typed languages often require the programmer to make explicit conversions in contexts where an implicit conversion would cause no harm. Pascal's type system has been described as "too strong" because, for example, the size of an array or string is part of its type, making some programming tasks difficult.[3][4] Haskell is also strongly typed, but its types are automatically inferred, so that explicit conversions are unnecessary.

A programming language compiler can also implement a dependent type or an effect system, which enables even more program specifications to be verified by a type checker. Beyond simple value-type pairs, a virtual "region" of code is associated with an "effect" component describing what is being done with what, and making it possible, for example, to "throw" an error report. Thus the symbolic system may be a type and effect system, which endows it with more safety checking than type checking alone.

Whether automated by the compiler or specified by a programmer, a type system makes illegal any program behavior that is outside the type-system rules. Advantages provided by programmer-specified type systems include:

  • Abstraction (or modularity) – Types enable programmers to think at a higher level than the bit or byte, not bothering with low-level implementation. For example, programmers can begin to think of a string as a collection of character values instead of as a mere array of bytes. Higher still, types enable programmers to think about and express interfaces between two subsystems of any size. This enables more levels of localization so that the definitions required for interoperability of the subsystems remain consistent when those two subsystems communicate.
  • Documentation – In more expressive type systems, types can serve as a form of documentation clarifying the intent of the programmer. For instance, declaring a function as returning a timestamp type documents the function's intent, even when the timestamp type is, deeper in the code, explicitly declared to be an integer type.

Advantages provided by compiler-specified type systems include:

  • Optimization – Static type-checking may provide useful compile-time information. For example, if a type requires that a value must align in memory at a multiple of four bytes, the compiler may be able to use more efficient machine instructions.
  • Safety – A type system enables the compiler to detect meaningless or probably invalid code. For example, we can identify an expression 3 / "Hello, World" as invalid, when the rules do not specify how to divide an integer by a string. Strong typing offers more safety, but cannot guarantee complete type safety.
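When checking happens determines when such an invalid expression is caught. As a minimal sketch, Python (strongly but dynamically typed) detects the same meaningless division at run time rather than at compile time:

```python
# Python refuses to divide an integer by a string: the operation is
# detected as meaningless at run time and raises a TypeError.
try:
    result = 3 / "Hello, World"
except TypeError as exc:
    print("rejected:", exc)
```

A statically typed language would reject the same expression before the program ever runs.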

Type safety contributes to program correctness, but can only guarantee correctness at the expense of making the type checking itself an undecidable problem. In a type system with automated type checking, a program may prove to run incorrectly yet be safely typed and produce no compiler errors. Division by zero is an unsafe and incorrect operation, but a type checker running only at compile time doesn't scan for division by zero in most programming languages, so it is left as a runtime error. To prove the absence of these more-general-than-types defects, other kinds of formal methods, collectively known as program analyses, are in common use. In addition, software testing is an empirical method for finding errors that the type checker cannot detect.

Type checking

The process of verifying and enforcing the constraints of types – type checking – may occur either at compile time (a static check) or at run time (a dynamic check). If a language specification specifies its typing rules strongly (i.e., more or less allowing only those automatic type conversions that do not lose information), one can refer to the process as strongly typed; if not, as weakly typed. The terms are not usually used in a strict sense.

Static typing

A programming language is said to use static typing when type checking is performed during compile time as opposed to run time. Statically typed languages include ActionScript 3, Ada, C, D, Eiffel, F#, Fortran, Go, Haskell, haXe, JADE, Java, ML, Objective-C, OCaml, Pascal, Seed7 and Scala. C++ is statically typed, aside from its run-time type information system. The C# type system performs static-like compile-time type checking, but also includes full runtime type checking. Perl is statically typed with respect to distinguishing arrays, hashes, scalars, and subroutines.

Static typing is a limited form of program verification (see type safety): accordingly, it allows many type errors to be caught early in the development cycle. Static type checkers evaluate only the type information that can be determined at compile time, but are able to verify that the checked conditions hold for all possible executions of the program, which eliminates the need to repeat type checks every time the program is executed. Program execution may also be made more efficient (e.g. faster or taking reduced memory) by omitting runtime type checks and enabling other optimizations.

Because they evaluate type information during compilation and therefore lack type information that is only available at run-time, static type checkers are conservative. They will reject some programs that may be well-behaved at run-time, but that cannot be statically determined to be well-typed. For example, even if an expression <complex test> always evaluates to true at run-time, a program containing the code

if <complex test> then <do something> else <type error>

will be rejected as ill-typed, because a static analysis cannot determine that the else branch won't be taken.[5] The conservative behaviour of static type checkers is advantageous when <complex test> evaluates to false infrequently: a static type checker can detect type errors in rarely used code paths. Without static type checking, even code-coverage tests with 100% coverage may be unable to find such type errors, because the combination of all places where values are created and all places where a certain value is used must be taken into account.

The most widely used statically typed languages are not formally type safe. They have "loopholes" in the programming language specification enabling programmers to write code that circumvents the verification performed by a static type checker and so address a wider range of problems. For example, most C-style languages have type punning, and Haskell has such features as unsafePerformIO: such operations may be unsafe at runtime, in that they can cause unwanted behaviour due to incorrect typing of values when the program runs.

Dynamic typing

A programming language is said to be dynamically typed when the majority of its type checking is performed at run-time as opposed to at compile-time. In dynamic typing, values have types, but variables do not; that is, a variable can refer to a value of any type. Dynamically typed languages include APL, Erlang, Groovy, JavaScript, Lisp, Lua, MATLAB, GNU Octave, Perl (for user-defined types, but not built-in types), PHP, Pick BASIC, Prolog, Python, R, Ruby, Smalltalk and Tcl.
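A short Python sketch illustrates the point that values, not variables, carry types in a dynamically typed language:

```python
x = 5                      # x currently refers to an int value
print(type(x).__name__)    # prints "int"

x = "five"                 # the same variable now refers to a str value
print(type(x).__name__)    # prints "str"
```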

Implementations of dynamically typed languages generally associate run-time objects with "tags" containing their type information. This run-time classification is then used to implement type checks and dispatch overloaded functions, but can also enable pervasive uses of dynamic dispatch, late binding and similar idioms that would be cumbersome at best in a statically typed language, requiring the use of variant types or similar features.

More broadly, as explained below, dynamic typing can improve support for dynamic programming language features, such as generating types and functionality based on run-time data. (Nevertheless, dynamically typed languages need not support any or all such features, and some dynamic programming languages are statically typed.) On the other hand, dynamic typing provides fewer a priori guarantees: a dynamically typed language accepts and attempts to execute some programs that would be ruled as invalid by a static type checker, either due to errors in the program or due to static type checking being too conservative.

Dynamic typing may result in runtime type errors: at runtime, a value may have an unexpected type, and an operation nonsensical for that type is applied. Such errors may occur long after the programming mistake was made, that is, long after the place where a value of the wrong type was passed somewhere it should not have been. This may make the bug difficult to locate.
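A minimal Python sketch of this failure mode, using a hypothetical average helper: the mistaken value is created in one place, but the error only surfaces later, inside a different function:

```python
def average(values):
    # Assumes every element is a number; nothing verifies this up front.
    return sum(values) / len(values)

readings = [3.2, 4.1, "5.0"]   # the actual mistake: a string slips in here

# ...much later, far from where the bad value was created...
try:
    average(readings)
except TypeError as exc:
    print("error surfaces here, not at the assignment:", exc)
```

Locating the bug means tracing the bad value backwards from the failing operation to the place where it was created.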

Dynamically typed language systems' run-time checks can potentially be more sophisticated than those of statically typed languages, as they can use dynamic information as well as any information from the source code. On the other hand, runtime checks only assert that conditions hold in a particular execution of the program, and the checks are repeated for every execution of the program.

Development in dynamically typed languages is often supported by programming practices such as unit testing. Testing is a key practice in professional software development, and is particularly important in dynamically typed languages. In practice, the testing done to ensure correct program operation can detect a much wider range of errors than static type-checking, but full test coverage over all possible executions of a program (including timing, user inputs, etc.), if even possible, would be extremely costly and impractical. Static typing helps by providing strong guarantees of a particular subset of commonly made errors never occurring.

Combinations of dynamic and static typing

The presence of static typing in a programming language does not necessarily imply the absence of all dynamic typing mechanisms. For example, Java and some other ostensibly statically typed languages support downcasting and other type operations that depend on runtime type checks, a form of dynamic typing. More generally, most programming languages include mechanisms for dispatching over different 'kinds' of data, such as disjoint unions, polymorphic objects, and variant types: even when not interacting with type annotations or type checking, such mechanisms are materially similar to dynamic typing implementations. See programming language for more discussion of the interactions between static and dynamic typing.

Certain languages, for example Clojure, Common Lisp, or Cython, are dynamically typed by default, but allow this behaviour to be overridden through the use of explicit type hints that result in static typing. One reason to use such hints would be to achieve the performance benefits of static typing in performance-sensitive parts of code.
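Python's optional annotations illustrate a related mechanism, though with a different aim: the interpreter ignores them at run time, and they serve external static checkers such as mypy rather than performance. The scale function below is a hypothetical example:

```python
# Annotations are hints for an external static checker (e.g. mypy);
# standard Python does not enforce them at run time.
def scale(vector: list[float], factor: float) -> list[float]:
    return [v * factor for v in vector]

print(scale([1.0, 2.0], 3.0))   # prints [3.0, 6.0]
```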

As of the 4.0 release, the .NET Framework supports a variant of dynamic typing via the System.Dynamic namespace, whereby a static object of type 'dynamic' is a placeholder: the .NET runtime interrogates its dynamic facilities to resolve the object reference.

Static and dynamic type checking in practice

The choice between static and dynamic typing requires trade-offs.

Static typing can find type errors reliably at compile time. This should increase the reliability of the delivered program. However, programmers disagree over how commonly type errors occur, and thus over what proportion of the bugs written would be caught by appropriately representing the designed types in code. Static typing advocates believe programs are more reliable when they have been well type-checked, while dynamic typing advocates point to distributed code that has proven reliable and to small bug databases. The value of static typing, then, presumably increases as the strength of the type system is increased. Advocates of dependently typed languages such as Dependent ML and Epigram have suggested that almost all bugs can be considered type errors, if the types used in a program are properly declared by the programmer or correctly inferred by the compiler.[6]

Static typing usually results in compiled code that executes more quickly. When the compiler knows the exact data types that are in use, it can produce optimized machine code. Further, compilers for statically typed languages can find assembler shortcuts more easily. Some dynamically typed languages such as Common Lisp allow optional type declarations for optimization for this very reason. Static typing makes this pervasive. See optimization.

By contrast, dynamic typing may allow compilers to run more quickly and allow interpreters to dynamically load new code, since changes to source code in dynamically typed languages may result in less checking to perform and less code to revisit. This too may shorten the edit-compile-test-debug cycle.

Statically typed languages that lack type inference (such as C and Java) require that programmers declare the types they intend a method or function to use. This can serve as additional documentation for the program, which the compiler will not permit the programmer to ignore or permit to drift out of synchronization. However, a language can be statically typed without requiring type declarations (examples include Haskell, Scala, OCaml, F# and, to a lesser extent, C#), so explicit type declaration is not a necessary requirement for static typing in all languages.

Dynamic typing allows constructs that some static type checking would reject as illegal. For example, eval functions, which execute arbitrary data as code, become possible. An eval function is possible with static typing, but requires advanced uses of algebraic data types. Furthermore, dynamic typing better accommodates transitional code and prototyping, such as allowing a placeholder data structure (mock object) to be transparently used in place of a full-fledged data structure (usually for the purposes of experimentation and testing).
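In Python, for instance, eval takes a string built or received at run time and executes it as code, something a conventional static checker cannot assign a precise type to in advance:

```python
# The string could come from user input, a file, or a network message;
# its result type is only known once it is evaluated.
expression = "2 + 3 * 4"
result = eval(expression)
print(result)   # prints 14
```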

Dynamic typing typically allows duck typing (which enables easier code reuse). Many languages with static typing also feature duck typing, or other mechanisms like generic programming, which also enable easier code reuse.

Dynamic typing typically makes metaprogramming easier to use. For example, C++ templates are typically more cumbersome to write than the equivalent Ruby or Python code. More advanced run-time constructs such as metaclasses and introspection are often more difficult to use in statically typed languages. In some languages, such features may also be used e.g. to generate new types and behaviors on the fly, based on run-time data. Such advanced constructs are often provided by dynamic programming languages; many of these are dynamically typed, although dynamic typing need not be related to dynamic programming languages.

Strong and weak typing: Liskov Definition

In 1974, Liskov and Zilles described a strongly typed language as one in which "whenever an object is passed from a calling function to a called function, its type must be compatible with the type declared in the called function."[7] Jackson wrote, "In a strongly typed language each data area will have a distinct type and each process will state its communication requirements in terms of these types."[8]

Strong and weak typing

A type system is said to feature strong typing when it specifies one or more restrictions on how operations involving values of different data types can be intermixed. A computer language that implements strong typing will prevent the successful execution of an operation on arguments that have the wrong type.

Weak typing means that a language implicitly converts (or casts) types when used. Consider the following example:

var x := 5;    // (1)  (x is an integer)
var y := "37"; // (2)  (y is a string)
x + y;         // (3)  (?)

In a weakly typed language, the result of this operation depends on language-specific rules. Visual Basic would convert the string "37" into the number 37, perform addition, and produce the number 42. JavaScript would convert the number 5 to the string "5", perform string concatenation, and produce the string "537". In JavaScript, the conversion to string is applied regardless of the order of the operands (for example, y + x would be "375"), while in AppleScript the left-most operand determines the type of the result, so that x + y is the number 42 but y + x is the string "375".

In the same manner, due to JavaScript's dynamic type conversions:

var y = 2 / 0;                        // y now equals a constant for infinity
y == Number.POSITIVE_INFINITY         // returns true
Infinity == Number.POSITIVE_INFINITY  // returns true
"Infinity" == Infinity                // returns true
y == "Infinity"                       // returns true

A C cast gone wrong exemplifies the problems that can occur if strong typing is absent: if a programmer casts a value from one type to another in C, not only must the compiler allow the code at compile time, but the runtime must allow it as well. This may permit more compact and faster C code, but it can make debugging more difficult.

Safely and unsafely typed systems

A third way of categorizing the type system of a programming language uses the safety of typed operations and conversions. Computer scientists consider a language "type-safe" if it does not allow operations or conversions that lead to erroneous conditions.

Some observers use the term memory-safe language (or just safe language) to describe languages that do not allow undefined operations to occur. For example, a memory-safe language will check array bounds, or else statically guarantee (i.e., at compile time, before execution) that array accesses outside the array boundaries will cause compile-time and perhaps runtime errors.

var x := 5;     // (1)
var y := "37";  // (2)
var z := x + y; // (3)

In languages like Visual Basic, variable z in the example acquires the value 42. While the programmer may or may not have intended this, the language defines the result specifically, and the program does not crash or assign an ill-defined value to z. In this respect, such languages are type-safe; however, in some languages, if the value of y was a string that could not be converted to a number (e.g. "Hello World"), the results would be undefined. Such languages are type-safe (in that they will not crash), but can easily produce undesirable results. In other languages like JavaScript, the numeric operand would be converted to a string, and then concatenation performed. In this case, the results are not undefined and are predictable.

Now let us look at the same example in C:

int x = 5;
char y[] = "37";
char* z = x + y;

In this example z will point to a memory address five characters beyond y, equivalent to three characters after the terminating zero character of the string pointed to by y. The content of that location is undefined, and might lie outside addressable memory. The mere computation of such a pointer may result in undefined behavior[citation needed] (including the program crashing) according to C standards, and in typical systems dereferencing z at this point could cause the program to crash. We have a well-typed, but not memory-safe program; a condition that cannot occur in a type-safe language.
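For contrast, a memory-safe language performs the bounds check at run time. A minimal Python sketch:

```python
items = [10, 20, 30]
try:
    value = items[5]          # out-of-bounds access
except IndexError as exc:
    # The runtime detects the error instead of reading arbitrary memory.
    print("bounds check failed safely:", exc)
```

The erroneous access is turned into a well-defined error rather than an undefined read of adjacent memory.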

In some languages, like JavaScript, the use of special numeric values and constants allows type safety for mathematical operations without resulting in runtime errors, for example when dividing a Number by a String, or a Number by zero.

var x = 32;
var aString = new String("A");
x = x/aString;                  // x now equals the constant NaN, meaning Not A Number
isNaN(x);                       // returns true
typeof(x);                      // returns "number"
var y = 2 / 0;                  // y now equals a constant for infinity
y == Number.POSITIVE_INFINITY;  // returns true
typeof(y);                      // returns "number"

Variable levels of type checking

Some languages allow different levels of checking to apply to different regions of code. Examples include:

  • The use strict directive in Perl applies stronger checking.
  • The @ operator in PHP suppresses some error messages.
  • Option Strict On in VB.NET makes the compiler require explicit conversions between objects.

Additional tools such as lint and IBM Rational Purify can also be used to achieve a higher level of strictness.

Optional type systems

It has been proposed, chiefly by Gilad Bracha, that the choice of type system be made independent of the choice of language; that a type system should be a module that can be "plugged" into a language as required. He believes this is advantageous, because what he calls mandatory type systems make languages less expressive and code more fragile.[9] The requirement that types do not affect the semantics of the language is difficult to fulfil: for instance, class-based inheritance becomes impossible.

Polymorphism and types

The term "polymorphism" refers to the ability of code (in particular, methods or classes) to act on values of multiple types, or to the ability of different instances of the same data structure to contain elements of different types. Type systems that allow polymorphism generally do so in order to improve the potential for code re-use: in a language with polymorphism, programmers need only implement a data structure such as a list or an associative array once, rather than once for each type of element with which they plan to use it. For this reason computer scientists sometimes call the use of certain forms of polymorphism generic programming. The type-theoretic foundations of polymorphism are closely related to those of abstraction, modularity and (in some cases) subtyping.

Duck typing

In "duck typing",[10] a statement calling a method m on an object does not rely on the declared type of the object; it requires only that the object, of whatever type, supply an implementation of the method when it is called at run time.

Duck typing differs from structural typing in that, if the part (of the whole module structure) needed for a given local computation is present at runtime, the duck type system is satisfied in its type identity analysis. A structural type system, by contrast, would require the analysis of the whole module structure at compile time to determine type identity or type dependence.

Duck typing differs from a nominative type system in a number of aspects. The most prominent are that, for duck typing, type information is determined at runtime (as contrasted with compile time), and the name of the type is irrelevant to determining type identity or type dependence; only partial structure information is required, for a given point in the program execution.
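A minimal Python sketch of duck typing, with hypothetical Duck and Person classes: the function never names a type, it only requires that its argument supply quack():

```python
class Duck:
    def quack(self):
        return "Quack!"

class Person:
    def quack(self):
        return "I'm quacking like a duck!"

def make_it_quack(thing):
    # No declared type: any object providing quack() is acceptable.
    return thing.quack()

print(make_it_quack(Duck()))     # prints Quack!
print(make_it_quack(Person()))   # prints I'm quacking like a duck!
```

Neither class inherits from the other, and no interface is declared; only the presence of quack() at the call site matters.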

Duck typing uses the premise that (referring to a value) "if it walks like a duck, and quacks like a duck, then it is a duck" (this is a reference to the duck test attributed to James Whitcomb Riley). The term may have been coined[citation needed] by Alex Martelli in a 2000 message[11] to the comp.lang.python newsgroup (see Python).

Duck typing has been demonstrated to increase programmer productivity in a controlled experiment.[12][not in citation given]

Specialized type systems

Many type systems have been created that are specialized for use in certain environments with certain types of data, or for out-of-band static program analysis. Frequently, these are based on ideas from formal type theory and are only available as part of prototype research systems.

Dependent types

Dependent types are based on the idea of using scalars or values to more precisely describe the type of some other value. For example, matrix(3, 3) might be the type of a 3×3 matrix. We can then define typing rules such as the following rule for matrix multiplication:

matrix_multiply : matrix(k, m) × matrix(m, n) → matrix(k, n)

where k, m, n are arbitrary positive integer values. A variant of ML called Dependent ML has been created based on this type system, but because type checking for conventional dependent types is undecidable, not all programs using them can be type-checked without some kind of limits. Dependent ML limits the sort of equality it can decide to Presburger arithmetic. Other languages such as Epigram make the value of all expressions in the language decidable so that type checking can be decidable. It is also possible to make the language[vague] Turing-complete at the price of undecidable type checking, as in Cayenne.
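Dependent types let the checker verify such dimension constraints before the program runs. As a point of comparison, the following Python sketch (a hypothetical matrix_multiply) can only enforce the same constraint dynamically, raising an error when the shapes disagree:

```python
def matrix_multiply(a, b):
    # A dependent type system would reject a shape mismatch at compile
    # time; here the k x m / m x n constraint is checked at run time.
    k, m = len(a), len(a[0])
    m2, n = len(b), len(b[0])
    if m != m2:
        raise TypeError(f"cannot multiply a {k}x{m} matrix by {m2}x{n}")
    return [[sum(a[i][p] * b[p][j] for p in range(m)) for j in range(n)]
            for i in range(k)]

print(matrix_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# prints [[19, 22], [43, 50]]
```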

Linear types

Linear types, based on the theory of linear logic, and closely related to uniqueness types, are types assigned to values having the property that there is one and only one reference to them at all times. These are valuable for describing large immutable values such as files, strings, and so on, because any operation that simultaneously destroys a linear object and creates a similar object (such as 'str = str + "a"') can be optimized "under the hood" into an in-place mutation. Normally this is not possible, as such mutations could cause side effects on parts of the program holding other references to the object, violating referential transparency. They are also used in the prototype operating system Singularity for interprocess communication, statically ensuring that processes cannot share objects in shared memory, in order to prevent race conditions. The Clean language (a Haskell-like language) uses this type system in order to gain a lot of speed[not specific enough to verify] while remaining safe.
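Linear typing is a static discipline, but its use-once restriction can be mimicked with a runtime sketch. The `Linear` wrapper below is an invented illustration, not a standard API: once a value is consumed, the old reference becomes unusable, which is what licenses the in-place reuse of its storage:

```python
class Linear:
    """Runtime sketch of a linear value: it may be consumed exactly once."""

    def __init__(self, value):
        self._value = value
        self._consumed = False

    def consume(self):
        # A real linear type system enforces this statically;
        # here the single-use rule is checked at runtime instead.
        if self._consumed:
            raise RuntimeError("linear value used more than once")
        self._consumed = True
        return self._value

s = Linear("str")
# The old value is consumed in the same step that builds the new one,
# so no other live reference can observe an in-place mutation.
t = Linear(s.consume() + "a")
```

After this, any further use of `s` raises an error, while `t` holds the new string.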

Intersection types

Intersection types are types describing values that belong to both of two other given types with overlapping value sets. For example, in most implementations of C the signed char has range -128 to 127 and the unsigned char has range 0 to 255, so the intersection type of these two types would have range 0 to 127. Such an intersection type could be safely passed into functions expecting either signed or unsigned chars, because it is compatible with both types.

Intersection types are useful for describing overloaded function types: for example, if "int → int" is the type of functions taking an integer argument and returning an integer, and "float → float" is the type of functions taking a float argument and returning a float, then the intersection of these two types can be used to describe functions that do one or the other, based on what type of input they are given. Such a function could be passed safely into another function expecting an "int → int" function; it simply would not use the "float → float" functionality.
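As a sketch (Python dispatches on the runtime type rather than checking statically), a single function can behave as both an "int → int" and a "float → float" function, and so be passed to callers expecting either; the function names are invented for the example:

```python
def double(x):
    # Behaves as the intersection type (int → int) & (float → float):
    # an int argument yields an int, a float argument yields a float.
    if isinstance(x, bool) or not isinstance(x, (int, float)):
        raise TypeError("expected int or float")
    return x * 2

def apply_int_fn(f):
    # A caller expecting an int → int function.
    return f(21)

def apply_float_fn(f):
    # A caller expecting a float → float function.
    return f(1.5)
```

The same `double` satisfies both callers, just as a value of the intersection type could be passed where either component type is expected.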

In a subclassing hierarchy, the intersection of a type and an ancestor type (such as its parent) is the most derived type. The intersection of sibling types is empty.

The Forsythe language includes a general implementation of intersection types. A restricted form is refinement types.

Union types

Union types are types describing values that belong to either of two types. For example, in C, the signed char has range -128 to 127, and the unsigned char has range 0 to 255, so the union of these two types would have range -128 to 255. Any function handling this union type would have to deal with integers in this complete range. More generally, the only valid operations on a union type are operations that are valid on both types being unioned. C's "union" concept is similar to union types, but is not typesafe, as it permits operations that are valid on either type, rather than on both. Union types are important in program analysis, where they are used to represent symbolic values whose exact nature (e.g., value or type) is not known.
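Python's `typing.Union` expresses such a type for static checkers such as mypy; the annotation is not enforced at runtime, so the sketch below narrows the union explicitly with `isinstance` before applying an operation valid for only one branch:

```python
from typing import Union

def stringify(x: Union[int, str]) -> str:
    # Only operations valid for *both* int and str may be applied to x
    # directly; anything int-specific requires narrowing first.
    if isinstance(x, int):
        return str(x)
    return x
```

A static checker accepts calls with either an `int` or a `str` argument, and would flag a call passing, say, a list, which belongs to neither branch of the union.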

In a subclassing hierarchy, the union of a type and an ancestor type (such as its parent) is the ancestor type. The union of sibling types is a subtype of their common ancestor (that is, all operations permitted on their common ancestor are permitted on the union type, but they may also have other valid operations in common).

Existential types

Existential types are frequently used in connection with record types to represent modules and abstract data types, due to their ability to separate implementation from interface. For example, the type "T = ∃X { a: X; f: (X → int); }" describes a module interface that has a data member of type X and a function that takes a parameter of the same type X and returns an integer. This could be implemented in different ways; for example:

  • intT = { a: int; f: (int → int); }
  • floatT = { a: float; f: (float → int); }

These types are both subtypes of the more general existential type T and correspond to concrete implementation types, so any value of one of these types is a value of type T. Given a value "t" of type "T", we know that "t.f(t.a)" is well-typed, regardless of what the abstract type X is. This gives flexibility for choosing types suited to a particular implementation while clients that use only values of the interface type—the existential type—are isolated from these choices.

In general it is impossible for the typechecker to infer which existential type a given module belongs to. In the above example, intT = { a: int; f: (int → int); } could also have the type ∃X { a: X; f: (int → int); }. The simplest solution is to annotate every module with its intended type, e.g.:

  • intT = { a: int; f: (int → int); } as ∃X { a: X; f: (X → int); }
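The interface/implementation separation can be imitated in Python, which has no existential types, by agreeing that clients only ever compose `f` with `a`. The dictionaries below are an invented encoding of the two example modules, with different hidden representation types X:

```python
# Two "modules" matching the interface ∃X { a: X; f: (X → int); }.
intT   = {"a": 5,   "f": lambda x: x + 1}       # X = int
floatT = {"a": 2.5, "f": lambda x: int(x * 2)}  # X = float

def client(t):
    # The client only applies t["f"] to t["a"]; it never learns what X
    # is, so either module can be substituted without change.
    return t["f"](t["a"])
```

Because the client treats X abstractly, swapping `intT` for `floatT` requires no change to `client`, mirroring how values of the existential type isolate callers from the implementation's choice of X.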

Although abstract data types and modules had been implemented in programming languages for quite some time, it wasn't until 1988 that John C. Mitchell and Gordon Plotkin established the formal theory under the slogan: "Abstract [data] types have existential type".[13] The theory is a second-order typed lambda calculus similar to System F, but with existential instead of universal quantification.

Explicit or implicit declaration and inference

Many static type systems, such as those of C and Java, require type declarations: the programmer must explicitly associate each variable with a particular type. Others, such as Haskell's, perform type inference: the compiler draws conclusions about the types of variables based on how programmers use those variables. For example, given a function f(x, y) that adds x and y together, the compiler can infer that x and y must be numbers, since addition is only defined for numbers. Therefore, any call to f elsewhere in the program that specifies a non-numeric type (such as a string or list) as an argument would signal an error.

Numerical and string constants and expressions in code can and often do imply type in a particular context. For example, an expression 3.14 might imply a type of floating point, while [1, 2, 3] might imply a list of integers – typically an array.
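Both points can be sketched minimally in Python. Python itself checks types at runtime; a static checker such as mypy performs the inference described above, and would reject the non-numeric call before the program runs:

```python
def f(x, y):
    # From 'x + y' an inference-based checker concludes that x and y
    # must share a type supporting addition (e.g. a numeric type).
    return x + y

pi = 3.14        # the literal implies a floating-point type
xs = [1, 2, 3]   # the literal implies a list of integers

total = f(1, 2)  # consistent: both arguments inferred numeric
# f(1, "a") mixes int and str: an inference-based checker rejects it
# statically; in Python itself it only fails at runtime with TypeError.
```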

Type inference is in general possible, if it is decidable in the type theory in question. Moreover, even if inference is undecidable in general for a given type theory, inference is often possible for a large subset of real-world programs. Haskell's type system, a version of Hindley–Milner, is a restriction of System Fω to so-called rank-1 polymorphic types, in which type inference is decidable. Most Haskell compilers allow arbitrary-rank polymorphism as an extension, but this makes type inference undecidable. (Type checking is decidable, however, and rank-1 programs still have type inference; higher-rank polymorphic programs are rejected unless given explicit type annotations.)

Types of types

A type of types is a kind. Kinds appear explicitly in typeful programming, such as a type constructor in the Haskell language.

Types fall into several broad categories:

Unified Type System

Some languages, such as C#, have a unified type system. This means that all C# types, including primitive types, inherit from a single root object: every type in C# inherits from the Object class. Java, by contrast, has several primitive types that are not objects. Java provides wrapper object types that exist alongside the primitive types, so developers can use either the wrapper object types or the simpler non-object primitive types.
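Python can serve as a further illustration of a unified type system: like C#, it makes every value, including numbers and strings, an instance of a single root type, `object`:

```python
# Every value, "primitive" or not, is an instance of the root type.
values = [42, 3.14, "text", [1, 2], None]
all_objects = all(isinstance(v, object) for v in values)

# object is also the root of every class's method resolution order.
int_root = int.__mro__[-1]
str_root = str.__mro__[-1]
```

In a language like Java the same check would fail for primitive `int`, which is why the wrapper type `Integer` exists alongside it.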

Compatibility: equivalence and subtyping

A type-checker for a statically typed language must verify that the type of any expression is consistent with the type expected by the context in which that expression appears. For instance, in an assignment statement of the form x := e, the inferred type of the expression e must be consistent with the declared or inferred type of the variable x. This notion of consistency, called compatibility, is specific to each programming language.

If the type of e and the type of x are the same, and assignment is allowed for that type, then this is a valid expression. In the simplest type systems, therefore, the question of whether two types are compatible reduces to that of whether they are equal (or equivalent). Different languages, however, have different criteria for when two type expressions are understood to denote the same type. These different equational theories of types vary widely, two extreme cases being structural type systems, in which any two types that describe values with the same structure are equivalent, and nominative type systems, in which no two syntactically distinct type expressions denote the same type (i.e., types must have the same "name" in order to be equal).

In languages with subtyping, the compatibility relation is more complex. In particular, if A is a subtype of B, then a value of type A can be used in a context where one of type B is expected, even if the reverse is not true. Like equivalence, the subtype relation is defined differently for each programming language, with many variations possible. The presence of parametric or ad hoc polymorphism in a language may also have implications for type compatibility.
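A sketch of the subtype rule in Python (the classes are invented for the example; the annotation is checked by external tools such as mypy, not at runtime):

```python
class Animal:
    def name(self):
        return "animal"

class Dog(Animal):
    # Dog is a subtype of Animal.
    def name(self):
        return "dog"

def greet(a: Animal) -> str:
    # A context expecting Animal: any subtype of Animal is compatible here.
    return "hello, " + a.name()

greeting = greet(Dog())  # valid: Dog may appear where Animal is expected
```

The reverse substitution does not hold: a context expecting a `Dog` specifically could not safely be given an arbitrary `Animal`.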

Programming style

Some programmers prefer statically typed languages; others prefer dynamically typed languages. Statically typed languages alert programmers to type errors during compilation, and they may perform better at runtime. Advocates of dynamically typed languages claim they better support rapid prototyping and that type errors are only a small subset of the errors in a program.[14][15] Likewise, in statically typed languages with type inference there is often no need to declare every type manually, which lowers the annotation burden on the programmer; and some dynamic languages have run-time optimisers[16][17] that can generate fast code approaching the speed of static-language compilers, often by using partial type inference.[citation needed]

See also

References

  1. ^ Cardelli 2004, p. 1: "The fundamental purpose of a type system is to prevent the occurrence of execution errors during the running of a program."
  2. ^ Pierce 2002, p. 208.
  3. ^ Infoworld, 25 April 1983.
  4. ^ Brian Kernighan: Why Pascal is not my favorite language.
  5. ^ Pierce 2002.
  6. ^ Xi, Hongwei; Scott, Dana (1998). "Dependent Types in Practical Programming". Proceedings of ACM SIGPLAN Symposium on Principles of Programming Languages (ACM Press): 214–227. CiteSeerX: 10.1.1.41.548.
  7. ^ Liskov, B.; Zilles, S. (1974). "Programming with abstract data types". ACM SIGPLAN Notices. CiteSeerX: 10.1.1.136.3043.
  8. ^ Jackson, K. (1977). "Parallel processing and modular software construction". Lecture Notes in Computer Science 54: 436–443. doi:10.1007/BFb0021435. http://www.springerlink.com/content/wq02703237400667/.
  9. ^ Bracha, G.: Pluggable Types.
  10. ^ Rozsnyai, S.; Schiefer, J.; Schatten, A. (2007). "Concepts and models for typing events for event-based systems". Proceedings of the 2007 inaugural international conference on Distributed event-based systems – DEBS '07. p. 62. doi:10.1145/1266894.1266904. ISBN 9781595936653.
  11. ^ Martelli, Alex (26 July 2000). "Re: polymorphism (was Re: Type checking in python?)". Web link.
  12. ^ Hanenberg, Stefan. "An experiment about static and dynamic type systems: doubts about the positive impact of static type systems on development time". OOPSLA 2010.
  13. ^ Mitchell, John C.; Plotkin, Gordon D. "Abstract Types Have Existential Type". ACM Transactions on Programming Languages and Systems, Vol. 10, No. 3, July 1988, pp. 470–502.
  14. ^ Meijer, Erik; Drayton, Peter. "Static Typing Where Possible, Dynamic Typing When Needed: The End of the Cold War Between Programming Languages". Microsoft Corporation. http://research.microsoft.com/en-us/um/people/emeijer/Papers/RDL04Meijer.pdf.
  15. ^ Eckel, Bruce. "Strong Typing vs. Strong Testing". Google Docs. http://docs.google.com/View?id=dcsvntt2_25wpjvbbhk.
  16. ^ "Adobe and Mozilla Foundation to Open Source Flash Player Scripting Engine". http://www.mozilla.com/en-US/press/mozilla-2006-11-07.html.
  17. ^ "Psyco, a Python specializing compiler". http://psyco.sourceforge.net/introduction.html.

Further reading
