Runtime Concepts (Adobe open source)

[This page is being used to collaborate on a paper. You are welcome to read and comment on it, but consider it a work in progress, not a final product.]

Title: Runtime Concepts: Generic Programming and Runtime Polymorphism

Authors: Sean Parent, Mat Marcus, Peter Pirkelbauer

Abstract

A key premise of Generic Programming is that algorithms can be expressed in terms of concepts and then applied to any model that satisfies these concepts. In C++, the application of Generic Programming has been largely limited to algorithms operating on collections which are statically and homogeneously typed, while Object Oriented Programming has been used where runtime polymorphism is required.

A definition of runtime polymorphism in Generic Programming terms is presented along with techniques for efficiently implementing runtime polymorphism. The implementation techniques allow for existing STL algorithms to be used effectively with heterogeneous collections of types, further decoupling types from algorithms allowing for greater code reuse.

Introduction

General overview of development in GP: since LISP in the 1960s [?] (?); popularized in the 1980s and 1990s, STL implementation [MS88]; functional language approach [?]; unification of GP [DRJ05]; ConceptC++ [GJS+06].

Generic programming offers a number of abstraction benefits compared to other programming paradigms.

Generic programming fosters regularity and value semantics. Regularity essentially describes the semantics of built-in types and requires operations for construction, copy construction, destruction, assignment, equality comparison, and, under a stricter definition, ordering. Built-in type semantics is value-based, and therefore regular semantics is inherently distinct from reference semantics. Regularity together with value semantics eases reasoning about programs and allows the programmer to perform optimizations [DS98]. We extend the regular operations defined by [DS98] and add efficient operations for swap and move. Each regular object is swappable by std::swap, which relies on copy construction and assignment. An explicit swap operation avoids the creation of temporary objects. Likewise, the addition of move allows the reuse of parts of objects that are about to be destructed.

Regular semantics particularly simplifies memory management because the lifetime of any value ends whenever its name goes out of scope or the lifetime of its container ends. Furthermore, the C++ language requires the compiler to manage the lifetime of temporary values. Consider the following loop:

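#include <cstddef>

// y = a * x + y, element by element, for arrays of length dim
template <typename T>
void saxpy(const T x[], T y[], T a, std::size_t dim)
{
	for (std::size_t i = 0; i < dim; i++) {
		y[i] = a * x[i] + y[i];
	}
}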

Both functions * and + create temporary values, which are used as input for subsequent function calls (+ and = respectively), and are automatically deallocated as soon as the full expression has been evaluated [C++ Standard Draft 2003]. Using reference/pointer semantics for type T instead would obviate those memory management facilities and either require some automatic memory reclamation scheme or put the burden of releasing memory on the programmer.

Generic programming is concerned with providing algorithms and data structures at an abstract level without compromising efficiency. Hence, some algorithms can have several implementations that differ in their concept requirements and efficiency. Since stricter concept requirements permit more efficient implementations, selection of the best algorithm is concept-based. The current STL implementation distinguishes concepts using associated types and relies on the C++ overload resolution mechanism to select the optimal algorithm. Therefore, the selected algorithm is multi-variant with respect to its parameter types.
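The concept-based selection described above can be sketched with iterator category tags, the associated type the STL uses to distinguish iterator concepts. The names advance_by and advance_by_impl are hypothetical and chosen for illustration only; the sketch follows the idea behind std::advance and is not taken from any particular library implementation.

```cpp
#include <cstddef>
#include <forward_list>
#include <iterator>
#include <vector>

// Weakest concept considered here: step n times, O(n).
template <typename I>
I advance_by_impl(I it, std::size_t n, std::forward_iterator_tag)
{
    while (n-- > 0) ++it;
    return it;
}

// Refined algorithm: random access permits a constant-time jump, O(1).
template <typename I>
I advance_by_impl(I it, std::size_t n, std::random_access_iterator_tag)
{
    typedef typename std::iterator_traits<I>::difference_type distance;
    return it + static_cast<distance>(n);
}

// Overload resolution on the associated iterator_category selects the
// most specific implementation that the iterator type supports.
template <typename I>
I advance_by(I it, std::size_t n)
{
    typedef typename std::iterator_traits<I>::iterator_category category;
    return advance_by_impl(it, n, category());
}
```

Calling advance_by on a std::vector iterator selects the constant-time overload, while a std::forward_list iterator falls back to the linear one; the caller's code is the same in both cases.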

Generic programming does not make the relationship between concepts and types tangible in the code. This weak relationship allows the programmer to focus on the desired behavior of the class without restrictions from prematurely defined interfaces. This typically results in shallow or no hierarchies, which do not disperse implementation across a number of classes. For example, the Adobe Source Libraries use inheritance only for X classes. Should a class not meet the requirements of a concept, models can be used to shape its behavior.

Generic programming allows data structures to be cleanly separated from algorithms. This is achieved by stepwise abstraction from concrete and efficient algorithms to a more general but equally efficient implementation. For example, the C++ STL [?] introduces iterators, an abstraction for a position within a sequence, as the basis for algorithms operating on sequential data structures.

The current C++ standard supports generic programming through function overloading and the template mechanism. While templates allow type-based parameterization of algorithms and classes, the support for concept definition and checking is limited. Recent research in this area provides first-class concept integration and also improves separate compilation [GJS+06]. By mandating distinct instantiations for distinct parameter tuples, C++ ensures optimal performance but simultaneously hampers template use across DLL boundaries and in data structures meant to store a heterogeneous set of data.

Motivation

Many software projects require working on heterogeneous collections of data. Typically, this polymorphism is expressed through inheritance. However, inheritance tightly couples a type with the set of operations which can be performed on it. This tight coupling is often manifested through implicit data structures and algorithms which hinder the ability to understand and reuse code. Because the polymorphic types can be of different sizes, they must be allocated in the free store. This introduces the use of object factories, adds an additional level of indirection, and imposes a need to manage the free store memory. This in turn leads to the use of reference counted pointers and the requirement of controlling multiple references to objects in a threaded environment, both of which negatively impact performance.

Outline

Body

Definition of Generic Programming

"By generic programming, we mean the definition of algorithms and data structures at an abstract or generic level, thereby accomplishing many related programming tasks simultaneously" [MS88].

The ideal for generic programming is to represent code at the highest level of abstraction without loss of efficiency in both actual execution speed and resource usage compared to the best code written through any other means. The general process to achieve this is known as "lifting", a process of abstraction where the types used within a concrete algorithm are replaced by the semantic properties of those types necessary for the algorithm to perform.

Semantic Requirement

In the process of abstraction the tight binding from an algorithm to concrete types is relaxed and replaced with a set of semantic requirements. Types must satisfy these requirements in order to work properly with a generic algorithm. Checking types against arbitrary semantic requirements is in general undecidable. Thus, semantic requirements are stated in tables, in documentation, and may at times be asserted within the code. Some systems [? Alloy Model Checker] allow semantic requirements to be described formally, but none of those is part of current C++. [note: we could also mention axioms of the concept proposal]. Instead, the compiler checks for the presence of syntactic constructs, which are a part of semantic requirements. Consider the following example, which describes the requirement that equality holds after a value y is copy constructed from another value x of the same type T. The compiler will check any type for the presence of the copy constructor and equality operator.

template <class T>
concept CopyEquality
{
	void foo(T x)
	{
		T y(x);
		assert(x == y);
	}
}
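As a hedged illustration of the gap between the syntactic and the semantic part of this requirement, the sketch below turns the concept body into an ordinary function template that can be run against sample values. The names check_copy_equality and broken_copy are hypothetical. Instantiating the template checks the syntax for every type; the semantics are only checked for the values actually passed in.

```cpp
#include <cassert>
#include <string>

// Returns true if a copy of x compares equal to x.
template <typename T>
bool check_copy_equality(const T& x)
{
    T y(x);          // syntactic requirement: copy constructor
    return x == y;   // syntactic requirement: operator==
}

// A type that satisfies the syntactic requirements but violates the
// semantic one: its copy constructor alters the copied value.
struct broken_copy {
    int value;
    explicit broken_copy(int v) : value(v) {}
    broken_copy(const broken_copy& other) : value(other.value + 1) {}
    friend bool operator==(const broken_copy& a, const broken_copy& b)
    {
        return a.value == b.value;
    }
};
```

Both int and broken_copy compile against check_copy_equality, yet only int passes the check at runtime, which is why such requirements end up in documentation and assertions.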

Concept

Dealing with individual semantic requirements would be cumbersome in practice. However, it is observed that sets of consistent requirements cluster into natural groups which are known as "concepts". As an example, even the trivial requirement for copy given above relies on an equality comparison. The notions of copy and equality are very tightly coupled. Although any collection of requirements may define a concept, only concepts which enable new classes of algorithms are interesting. It is also important to distinguish between the syntactic and semantic requirements of concepts. As an example, an algorithm may require that a type be copyable, which is part of the concept known as "Regular". Although the algorithm does not require that the types be equality comparable, we would say that it is defined for regular types, because even if the type does not implement equality comparison, the result of copying must be equivalent objects. Such a type is said to be pseudo regular.

Model

It is neither desirable nor possible to foresee all algorithms, and their concept requirements, with which a type will be used. Consequently, models are introduced as a decoupling mechanism allowing clean separation of concrete types from concepts. The designer of a class should be able to focus on providing the correct semantics without distraction from potential requirements, some of which might not even be known at design time. When the type is used in the context of an algorithm, a model shapes its interface and behaviour to meet the concept requirements.

Algorithm

"The central notion is that of generic algorithms, which are parameterized procedural schemata that are completely independent of the underlying data representation and are derived from concrete, efficient algorithms" [MS88].

Concept Refinement

A concept $C_r$ augmenting another concept $C_0$ with additional requirements is a concept refinement, denoted by $C_r \prec C_0$. Therefore, the number of types meeting the semantic requirements of $C_r$ is smaller than or equal to the number of types that meet the requirements of $C_0$. In turn, extending the semantic requirements increases the number of algorithms that can be expressed with a given concept. For example, RandomAccess-Iterator refines Bidirectional-Iterator and adds the requirement for constant time random access ([]-operator), which allows writing a binary search algorithm that runs in logarithmic time.

Algorithm Refinement

Parallel to concept refinements, an algorithm $A_0$ can be refined by another algorithm $A_r$ to exploit the stronger concept requirements and achieve better space and/or runtime efficiency. For example, the complexity of reverse on Bidirectional-Iterator is O(n), while it is O(n lg n) for Forward-Iterator (assuming less than O(n) memory usage).
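The reverse example can be sketched as an algorithm family with one member per concept. The names reverse_forward and reverse_bidirectional are hypothetical. The forward version shown here is one possible divide-and-conquer formulation built on std::rotate, chosen because it needs only O(lg n) stack space; it is an illustration and not necessarily the formulation the authors have in mind.

```cpp
#include <algorithm>
#include <cstddef>
#include <forward_list>
#include <iterator>
#include <list>

// A_0 for Forward-Iterator: reverse each half, then exchange the halves.
// n must be the length of [first, last). O(n lg n) time, O(lg n) stack.
template <typename I>
void reverse_forward(I first, I last, std::size_t n)
{
    if (n < 2) return;
    typedef typename std::iterator_traits<I>::difference_type distance;
    const std::size_t half = n / 2;
    I mid = std::next(first, static_cast<distance>(half));
    reverse_forward(first, mid, half);
    reverse_forward(mid, last, n - half);
    std::rotate(first, mid, last);
}

// A_r for Bidirectional-Iterator: swap from both ends inward, O(n).
template <typename I>
void reverse_bidirectional(I first, I last)
{
    while (first != last && first != --last) {
        std::iter_swap(first, last);
        ++first;
    }
}
```

A std::forward_list can only use the first member of the family, whereas a std::list can use either and benefits from the refined one.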

Analogy to Algebraic Structures

We emphasize the analogy of our view of generic programming with algebraic structures. At the core of algebraic structures is a set of axioms and derivation rules, on the basis of which useful theorems can be expressed. Concrete models map undefined terms onto real systems. The more independent axioms a system has, the more theorems can be expressed, but the fewer concrete models exist that are consistent with the axioms. The following table makes the relationship of generic programming with algebra explicit.

Table <?> : Generic Programming and Algebra
Generic Programming	Algebra
Semantic Requirement	Axiom
Concept	Algebraic Structure
Model	Model
Algorithms	Theorems
Function	Function

Generic Polymorphism

[This section is intended to introduce runtime-polymorphism based on concepts and to show that a concept based definition of runtime polymorphism is a super set of the C++ inheritance model of polymorphism.]

In the presented work, we replace a concrete type with a placeholder type called a runtime-concept when instantiating STL containers. Analogous to the compile-time definition, we define a runtime-concept as a number of runtime-models which satisfy a common set of requirements. Although the containers are type checked based on the runtime-concept, they can store any object whose type models that runtime-concept. Unlike the generic programming paradigm, our implementation expresses the relationship between a runtime-concept and its models inside the C++ type system using inheritance.

Representing the concept-model relationship based on OOP polymorphism is not a novel idea, and languages such as Eiffel [?], Java [?], and C# [?] implement that technique. Conversely, the C++ language designers have repeatedly explored and declined to adopt this base/derived class scheme for compile time concept checking. The identified problems include intrusiveness, rigid signatures, type proliferation, and performance issues [Str03].

Instead of requiring an explicit relationship between runtime-concept and concrete type, we apply the external polymorphism pattern (EPP) [CSH96]. Therefore, a runtime-model $M$ is a template class that gets instantiated with a concrete type $T$. Its instantiation is orthogonal to the definition of concrete types. A programmer can specialize the model template for a particular type in order to provide adjustments to the interface if needed.

The use of polymorphic objects is non-regular and thus problematic. As a matter of fact, polymorphic classes alone cannot be regular, but neither can regular classes alone be polymorphic. Consider a class hierarchy with two classes, Base and Derived, and copy semantics. Writing the code in terms of copy construction would encode the type in the code, but even using a clone function as illustrated in Fig. <?a> would not work, as the returned object of Derived::clone would be subject to slicing. Conversely, returning an object by reference (Fig. <?b>) would compromise value semantics and regularity.

// (a) Non-polymorphic regularity
Base Base::clone();
Derived Derived::clone();

// (b) Non-regular polymorphism
Base& Base::clone();
Derived& Derived::clone();
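The slicing problem of variant (a) can be reproduced with a few lines. The sketch below assumes a minimal hierarchy with a virtual name function, which is not part of the original figure and is added only to make the dynamic type observable; name_of_copy is likewise a hypothetical helper.

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
    // Variant (a): returns by value, so the static return type is Base.
    Base clone() const { return *this; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Copies an object that is known only through a Base reference.
inline std::string name_of_copy(const Base& original)
{
    Base copy = original.clone();   // only the Base part is copied
    return copy.name();
}
```

Passing a Derived object to name_of_copy yields "Base": the copy is regular, but the dynamic type has been lost.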

Our solution to this problem is a combination of both. We represent runtime-concepts through a regular layer ($R_0$) and a polymorphic layer ($P_0$). The layers of the generic polymorphism pattern are shown in figure <1>. The regular object leverages its polymorphic counterpart and provides a regular interface for it. $R_0$ is a concrete class referring to some object that implements the interface ConceptInterface. Thus $R_0$ can be regarded as a placeholder type for template instantiations. $P_0$ is an abstract class defining the operations on the objects. The model $M_0$ inherits from $P_0$ and is parameterized with a concrete type $T$. Its instantiation $M_0(T)$ maps the operations defined in $P_0$ onto operations of $T$.

Table <?> gives an overview of how elements of concept definitions are represented within the regular and polymorphic layers.

Table <?> : Regular Type and Runtime Concept
Regular Type	Regular Layer (R)	Polymorphic Layer (P)
Default Constructor	R a	-
Copy Constructor	R(const R&)	virtual P& clone()
Destructor	~R()	virtual ~P()
Assignment	R& operator=(const R&)	virtual void assign(const P&)
Equality	bool operator==(const R&, const R&)	virtual bool equal(const P&) const
Swap	swap(R&)	-
Move	move(R&)	-
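A minimal sketch of the two layers for the regularity concept follows, under several simplifying assumptions: the names poly_interface, poly_model and regular_object are hypothetical; clone returns an owning pointer rather than a reference; assignment is implemented by copy-and-swap in the regular layer instead of a virtual assign; and the default constructor is omitted. Swap and move only exchange the pointer to the polymorphic part.

```cpp
#include <algorithm>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Polymorphic layer P: abstract interface for the regular operations.
struct poly_interface {
    virtual ~poly_interface() {}
    virtual std::unique_ptr<poly_interface> clone() const = 0;
    virtual bool equal(const poly_interface& other) const = 0;
};

// Model M(T): maps the operations of P onto a concrete type T.
template <typename T>
struct poly_model : poly_interface {
    T value;
    explicit poly_model(T v) : value(std::move(v)) {}
    std::unique_ptr<poly_interface> clone() const override
    {
        return std::unique_ptr<poly_interface>(new poly_model(value));
    }
    bool equal(const poly_interface& other) const override
    {
        // Values of different dynamic types never compare equal.
        const poly_model* same = dynamic_cast<const poly_model*>(&other);
        return same != nullptr && value == same->value;
    }
};

// Regular layer R: a concrete value type owning one polymorphic part.
// A moved-from object may only be destroyed or assigned to.
class regular_object {
    std::unique_ptr<poly_interface> part_;
public:
    template <typename T>
    explicit regular_object(T v) : part_(new poly_model<T>(std::move(v))) {}
    regular_object(const regular_object& other) : part_(other.part_->clone()) {}
    regular_object(regular_object&& other) noexcept = default;
    regular_object& operator=(regular_object other) noexcept
    {
        swap(*this, other);
        return *this;
    }
    friend void swap(regular_object& a, regular_object& b) noexcept
    {
        a.part_.swap(b.part_);   // exchanges pointers, not values
    }
    friend bool operator==(const regular_object& a, const regular_object& b)
    {
        return a.part_->equal(*b.part_);
    }
};
```

With these definitions a std::vector<regular_object> can hold an int, a std::string and a double side by side, and unmodified STL algorithms such as std::find and std::reverse operate on it because the element type is regular.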

We describe how elements of compile time concepts are mapped in the regular and polymorphic layers and present an example by means of the regularity concept.

Functions (Operators, Constructors, etc.): In the regular layer, any signature or use case pattern requirement can be directly represented by member or free-standing functions. These functions typically forward calls to the polymorphic layer, where they are realized as member functions. Notably, swap and move operations are exceptions to this rule and can be implemented more efficiently by swapping (or moving) the pointer to the polymorphic part.

Data Members: Consider the following concept member_x written in usage pattern style:

template <typename C>
concept member_x
{ 
	C c;
	c.x = 1;
}

The concept requires the presence of a data member x, where x has an assignment operator that accepts an integer argument. Since the runtime-concept model is non-intrusive and the concept does not enforce a particular type for x, data members cannot be directly represented. Instead, we use property objects, whose sole purpose is to virtualize member access and invoke the appropriate functions in the polymorphic layer. An example of a property object in the regular layer is given by the next code fragment.

struct runtime_concept_member_x
{
	// property object standing in for the data member x
	struct x_property
	{
		void operator=(int);
	};

	x_property x;
};

The runtime-concept definition for data members of user defined types can be derived by applying the modeling rules recursively.
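To show how such a property object could forward to the polymorphic layer, here is a hedged, self-contained sketch. The interface member_x_interface, the model member_x_model, the read-back conversion to int and the sample type point are all assumptions introduced for illustration; the concept itself only requires assignment. Copying is disabled to keep the sketch short.

```cpp
#include <memory>
#include <utility>

// Polymorphic layer: virtualizes access to the data member x.
struct member_x_interface {
    virtual ~member_x_interface() {}
    virtual void set_x(int v) = 0;
    virtual int get_x() const = 0;
};

// Model: maps set_x/get_x onto the member x of a concrete type C.
template <typename C>
struct member_x_model : member_x_interface {
    C object;
    explicit member_x_model(C c) : object(std::move(c)) {}
    void set_x(int v) override { object.x = v; }
    int get_x() const override { return static_cast<int>(object.x); }
};

// Regular layer: the property object x forwards to the polymorphic part.
class runtime_concept_member_x {
    std::unique_ptr<member_x_interface> part_;   // declared before x
public:
    struct x_property {
        member_x_interface* target;
        void operator=(int v) { target->set_x(v); }
        operator int() const { return target->get_x(); }   // assumption
    };
    x_property x;

    template <typename C>
    explicit runtime_concept_member_x(C c)
        : part_(new member_x_model<C>(std::move(c))), x{part_.get()} {}
    runtime_concept_member_x(const runtime_concept_member_x&) = delete;
    runtime_concept_member_x& operator=(const runtime_concept_member_x&) = delete;
};

// Hypothetical concrete type whose member x is not an int.
struct point {
    double x;
    int y;
};
```

Writing r.x = 1 on a runtime_concept_member_x that wraps a point reads like a plain member assignment but ends up in member_x_model<point>::set_x.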

Associated Types: In the regular layer, associated types are directly represented and identify another regular layer type in case runtime polymorphism is needed. Then, the polymorphic object parts are constructed by factory methods [GoF?] which create objects of different dynamic types.

Concept Refinements: In the generic polymorphism model, a concept refinement $C_r <: C_0$ is represented by regular classes $R_0, R_r$ and polymorphic classes $P_0, P_r$. $R_0$ and $R_r$ model the concepts $C_0$ and $C_r$ respectively. A subclass relationship ($R_r$ subclass of $R_0$) is possible but not required. $P_0$ and $P_r$ are the polymorphic counterparts of $R_0$ and $R_r$ respectively. The requirement that $P_r$ be a subtype of $P_0$ allows the use of models $M_r$ wherever $M_0$ is expected (i.e., together with $R_0$).

Given our definition for Generic Programming, we can define runtime-polymorphism for Generic Programming to be a property of algorithms which operate on a collection of models where the types in the model are either partially or fully determined at runtime.

Relationship To Object Oriented Programming

In Object Oriented Programming the notion of inheritance is used to model an "is-a" relationship. The base class(es) provide a syntactic template for modeling a concept, where the derived classes must provide the implementation satisfying the semantic requirements for the interface. Normally, type variance is only allowed for a single type (the type which is derived) and not for any affiliated types. Likewise, operations may only be type variant on the first parameter.

Degrees of Runtime Variability

In this paper we have developed our notion of polymorphism in generic programming and presented an implementation. We can classify alternative designs for heterogeneous containers according to the runtime variability they permit.

Static

The elements of a container are statically and homogeneously typed. This model is directly supported by C++ and results in very efficient machine code but is inflexible with respect to the types of objects. If heterogeneous containers are needed, programmers typically turn to object oriented polymorphism and store pointers to objects.

Variant

In many applications, the set of types stored in a container is known at compile time. In these cases a variant type that is the set-union of all potential data types suffices to implement the desired behavior. Variant types, such as Boost.Variant, are similar to discriminated unions but overcome their limitation with regard to storing non-POD types. A priori knowledge of all involved types makes it possible to determine the size of the variant at compile time, thus eliminating the need for a two-layer architecture. In addition, the compiler can fully type check code against the types inside the variant. The downside of this approach is the definition of assignment semantics when the types stored in the left-hand and right-hand objects differ. Consider the following assignment:

x = y

Since the actual type of x differs from that of y, the element of x has to be destructed before the assignment can be carried out. However, if the copy construction then fails, x is left in an undefined state. A detailed discussion of this problem and its potential solutions is given by Boost: "Never-Empty" Guarantee.
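The variant design can be sketched with std::variant from C++17, used here as a stand-in for Boost.Variant, which the text refers to. The alias element, the visitor type_name and the function describe are hypothetical names. Note that std::variant handles the failed-assignment case differently from Boost.Variant: it may enter a valueless state, which can be queried with valueless_by_exception.

```cpp
#include <string>
#include <variant>
#include <vector>

// Closed set of element types, fixed at compile time.
typedef std::variant<int, double, std::string> element;

// The compiler checks the visitor against every alternative.
struct type_name {
    std::string operator()(int) const { return "int"; }
    std::string operator()(double) const { return "double"; }
    std::string operator()(const std::string&) const { return "string"; }
};

// Lists the dynamic types held by a heterogeneous container.
inline std::string describe(const std::vector<element>& items)
{
    std::string out;
    for (const element& e : items) {
        out += std::visit(type_name(), e);
        out += ";";
    }
    return out;
}
```

Assigning one element to another that currently holds a different alternative, as in x = y above, destroys the old value and constructs the new one in the same storage, which is exactly where the exception-safety question arises.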

Open

Some applications also need to store data types, for example ones defined in DLLs, that are not statically known at compile and link time. Implementations, like the presented one, which do not impose such restrictions are called open. Other existing designs include Boost.Any and [Dig05]. The Boost implementation is similar to ours insofar as it uses the virtual function mechanism and EPP to achieve polymorphism. The operations supported include copy construction, assignment, destruction, and safe and unsafe cast operations. Instead of having the polymorphic object allocated separately, [Dig05] integrates them into one layer. Only if the polymorphic part exceeds some size will it be stored separately. Instead of virtual functions, each object has a static function table. Due to memory locality, reduced heap allocations, and the low-level implementation, [Dig05] yields superior performance. However, the reuse of memory leads to problems similar to those described for the variant alternative. In contrast to our approach, neither implementation allows the user interface to be customized.
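The following sketch uses std::any from C++17, the standard library counterpart of Boost.Any, to illustrate an open container and the limitation noted above. The function count_holding is a hypothetical helper. Because the interface of std::any cannot be customized, an element has to be cast back to its concrete type before anything beyond copying, assigning or destroying can be done with it.

```cpp
#include <any>
#include <cstddef>
#include <string>
#include <vector>

// Counts the elements whose dynamic type is exactly T. The pointer form of
// std::any_cast returns a null pointer instead of throwing on a mismatch.
template <typename T>
std::size_t count_holding(const std::vector<std::any>& items)
{
    std::size_t n = 0;
    for (const std::any& a : items) {
        if (std::any_cast<T>(&a) != nullptr) ++n;
    }
    return n;
}
```

Any copy constructible type can be stored without having been listed in advance, but the container offers no operation such as equality comparison across its elements.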

Open Variant

Open-variance is a hybrid of the variant and open techniques. To achieve better performance, some types are directly represented in the variant. Other types, which are either unknown at compile time, occur rarely, or are too large, can be represented by one of the open techniques.

Just-In-Time Virtualizing (Type Erasure)

Polymorphism Parameterized By Concept

Runtime Types

Refinement and Dispatching

Since the polymorphic layer (e.g., $P_r$) could model a more refined concept than the regular layer it is bound to (e.g., $R_0$), dispatching based on the concept implemented by the regular layer can lead to suboptimal results. Consider a random access container attached to a regular layer modeling a sequence. If only the regular layer were considered, the complexity of lower_bound would be O(n), compared to O(lg n) when the random access capability of the container is exploited. The runtime could improve even more if the operations of the concrete container $T$ were invoked instead of the virtual functions defined in the polymorphic layer.

Our solution to this problem replaces the algorithm instantiation with a call to a dispatch mechanism. Based on the dynamic type of the model, the dispatch mechanism invokes the most suitable function from a family. The function family comprises instantiations of an algorithm family $A_0, A_r$ with either the concrete type $T$ or one of the regular classes $R_0, R_r$. The dispatch mechanism guarantees the presence of algorithms that operate on $R_0$. It is the responsibility of the programmer to add more efficient algorithms to the library.

The dispatch mechanism relies on the following propositions:

$P_r$ inherits from $P_0$.
$M_r$ inherits from $P_r$, and therefore the inheritance chain runs $M_r \to P_r \to \dots \to P_0$.
For each type $T$ there is only one model $M_T$ within an inheritance graph rooted in $P_0$.

Based on the first two propositions, the dispatcher walks the inheritance chain from the model $M_r(T)$ over polymorphic representations of refined concepts $P_r$ to the base class $P_0$. The third proposition guarantees that any mapping between an algorithm instantiated with a concrete type $A(T)$ and a model $M(T)$ is bijective within a concept family. When the dispatcher finds an algorithm, it is typically a stub that either further unwraps the model $M_r(T) \to T$ or rewraps the polymorphic concept in the appropriate regular layer $P_r \to R_r$ before the call of the actual algorithm.
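A much simplified sketch of such a dispatcher for lower_bound is given below. It assumes int elements, a two-level refinement (a sequence as $P_0$ and a random access sequence as $P_r$), and a dispatcher that tries the most refined interface first via dynamic_cast and falls back to the base; it works directly on the polymorphic layer and does not unwrap to the concrete type $T$ or rewrap into a regular layer. All names are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <utility>
#include <vector>

struct sequence_interface {                       // P_0
    virtual ~sequence_interface() {}
    virtual std::size_t size() const = 0;
    // Calls f on each element in order until f returns false.
    virtual void visit_while(const std::function<bool(int)>& f) const = 0;
};

struct random_access_interface : sequence_interface {   // P_r refines P_0
    virtual int at(std::size_t i) const = 0;      // constant time
};

template <typename T>
struct sequence_model : sequence_interface {      // M_0(T)
    T data;
    explicit sequence_model(T d) : data(std::move(d)) {}
    std::size_t size() const override { return data.size(); }
    void visit_while(const std::function<bool(int)>& f) const override
    {
        for (int e : data) if (!f(e)) return;
    }
};

template <typename T>
struct random_access_model : random_access_interface {   // M_r(T)
    T data;
    explicit random_access_model(T d) : data(std::move(d)) {}
    std::size_t size() const override { return data.size(); }
    void visit_while(const std::function<bool(int)>& f) const override
    {
        for (int e : data) if (!f(e)) return;
    }
    int at(std::size_t i) const override { return data[i]; }
};

// A_0: requires only P_0; linear scan, O(n).
inline std::size_t lower_bound_linear(const sequence_interface& s, int v)
{
    std::size_t index = 0;
    s.visit_while([&index, v](int e) {
        if (e < v) { ++index; return true; }
        return false;
    });
    return index;
}

// A_r: requires P_r; binary search, O(lg n).
inline std::size_t lower_bound_binary(const random_access_interface& s, int v)
{
    std::size_t lo = 0, hi = s.size();
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (s.at(mid) < v) lo = mid + 1; else hi = mid;
    }
    return lo;
}

struct dispatch_result { std::size_t index; bool used_refined; };

// Picks the most refined algorithm supported by the dynamic type of s.
inline dispatch_result lower_bound_dispatch(const sequence_interface& s, int v)
{
    if (const random_access_interface* r =
            dynamic_cast<const random_access_interface*>(&s))
        return {lower_bound_binary(*r, v), true};
    return {lower_bound_linear(s, v), false};     // always available
}
```

A caller that only holds a sequence_interface reference obtains the binary search when the model wraps, for example, a std::vector, and the linear scan when it wraps a std::list.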

Performance Experiments

Open-Closed

Generic vs Generic-polymorphic

Related Work

[KLS04], [IN06]

Summary And Future Work

Runtime Type Functions

Runtime Type References

Acknowledgments

Bibliography

[DRJ05] Dos Reis, Gabriel; Järvi, Jaakko: What is Generic Programming? LCSD'05.

[GJS+06] Gregor, Douglas; Järvi, Jaakko; Siek, Jeremy; Stroustrup, Bjarne; Dos Reis, Gabriel; Lumsdaine, Andrew: Concepts: First-Class Language Support for Generic Programming in C++. To appear, OOPSLA'06.

[MS88] Musser, David R.; Stepanov, Alexander A.: Generic Programming. ISSAC '88.

[Vel00] Veldhuizen, Todd: Five compilation models for C++ templates. TMPW '00.

[Str03] Stroustrup, Bjarne: Concept checking - a more abstract way to type checking. C++ Committee, Paper 1510, 2003.

[CSH96] Cleeland, C.; Schmidt, D.; Harrison, T.: External Polymorphism. PLoPD '96.

[DS98] Dehnert, James C.; Stepanov, Alexander: Fundamentals of Generic Programming. In Report of the Dagstuhl Seminar on Generic Programming, volume 1766 of Lecture Notes in Computer Science, pages 1–11, Schloss Dagstuhl, Germany, April 1998.

[IN06] Igarashi, Atsushi; Nagira, Hideshi: Union Types for Object-Oriented Programming. SAC '06.

[KLS04] Kiselyov, Oleg; Lämmel, Ralf; Schupke, Keean: Strongly Typed Heterogeneous Collections, Haskell '04.

[Dig05] Diggins, Christopher: An Efficient Variant Type. CUJ 2005.

Appendix

The Proxy Dilemma

In C++ the concept of a reference to a type is tightly bound to the specific type generated by the T& type operation. At times it is desirable to construct a user defined type which acts as a form of reference to another type. We refer to this as a proxy type.

The standard template library tries to allow for arbitrary proxy types by including "reference" as one of the traits for an iterator but this is insufficient. The problem becomes apparent when we look at the implementation of swap:

template <typename T>
void swap(T& x, T& y)
{
    T tmp(x);
    x = y;
    y = tmp;
}

If T is itself a proxy then this code will swap the two proxies, not the underlying values. Ideally what we would like is that the syntax T& would match "any type which is a reference to T" not generate a reference to the proxy type. This failure in type deduction leaves the copy construction of T ambiguous - in this case a copy of the value, not just the proxy, is desired. No actual harm occurs, however, until we assign through the reference.
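The effect can be reproduced with a small experiment. The sketch assumes a hypothetical proxy type int_proxy whose copy constructor copies the proxy while its assignment writes through to the referenced value, which is the combination under which harm occurs; a proxy whose assignment rebinds would instead merely exchange the proxies. The swap from the text is repeated under the name naive_swap to avoid a clash with std::swap.

```cpp
#include <cassert>

// Hypothetical proxy acting as a reference to an int stored elsewhere.
struct int_proxy {
    int* target;
    explicit int_proxy(int& value) : target(&value) {}
    int_proxy(const int_proxy&) = default;         // copies the proxy only
    int_proxy& operator=(const int_proxy& other)   // assigns through
    {
        *target = *other.target;
        return *this;
    }
    operator int() const { return *target; }
};

// The swap from the text, unchanged apart from its name.
template <typename T>
void naive_swap(T& x, T& y)
{
    T tmp(x);   // for a proxy: tmp refers to the same value as x
    x = y;      // overwrites the value that tmp still refers to
    y = tmp;    // reads back the overwritten value
}
```

For two ints holding 1 and 2 the values are exchanged as expected. For two proxies referring to those ints, both end up as 2 and the value 1 is lost, because tmp never held a copy of the value.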

The best that we are able to easily achieve is a proxy type which behaves as a reference when the referenced value is not mutable.

The original code for the C++ standard template library works, to a very limited extent, with proxies because few algorithms actually require full copy semantics (move is sufficient) and the code makes use of a function iter_swap() for swap operations, which avoids the above problem with swap by using the iterator's associated value type rather than trying to deduce the type from the proxy. However, the requirement for standard algorithms to be implemented in terms of iter_swap() was never stated and cannot be relied upon. The problem is also not limited to swap operations: any call which takes the proxy by mutable reference will fail.

To test the ideas presented in this paper with the standard algorithms we used the following scheme:

Proxies maintain a reference count of the number of proxies to a value.
When assigning through a proxy, if the reference count is greater than one, a copy of the value is made and all other proxies referring to the value are set to refer to the copy.

This relies on the fact that it would be inefficient to make a copy of a value and then assign over the top of it. This is a very fragile and costly solution but it was sufficient to test the ideas in this paper. Solving the proxy dilemma properly in C++ is an open problem.
