Writing a Portable Data Access Layer

Original post: July 23, 2004, 10:42:00

Silvano Coriani
Microsoft Corporation

April 2004

Applies to:
   Microsoft Visual Studio .NET 2003
   Microsoft .NET Framework 1.1
   Various RDBMS

Summary: Find out how to write smart applications that work transparently with different data sources, from Microsoft Access to SQL Server to Oracle RDBMS. (15 printed pages)


Using a Universal Data Access Approach
Working with Base Interfaces
Writing a Specialized Data Access Layer
Using Data Access Classes from the Other Layers
Some Possible Improvements


During the last six years of doing consulting work, I've heard this question about data access and manipulation many times, and it's become a real obsession: "How can I write my application so that it works with database servers x, y, and z with few or no changes?" Knowing that the data access layer is still the most critical part of a modern application, and usually the #1 enemy for inexperienced developers, my first reaction has always been: you can't!

People's scared faces and the question, "But what about the Universal Data Access mantra that Microsoft proposed with ADO?" prompt me to provide a more detailed explanation of the problem, and a suggested solution.

The problem is that everything goes well while your application is a small prototype, or while you have few concurrent users and simple data access logic, even if you choose the easiest way: the use of RAD tools, like the Data Environment in Microsoft Visual Basic 6.0, or "all-in-one" solutions like the ActiveX Data Control and other third-party components, which usually hide the complexity of the interaction between your application and a specific data source. But when the number of users grows and concurrency becomes more of an issue, many performance problems can appear due to the underlying use of dynamic recordsets, server-side cursors, and unnecessary locking policies. The design and code changes you'll have to make to the system to meet your users' goals will then cost you a lot more, because you didn't take this problem into consideration from the beginning.

Using a Universal Data Access Approach

Microsoft launched the Universal Data Access campaign when ADO reached maturity in version 2.1 of MDAC (Microsoft Data Access Components). The idea was to show developers that with a simple object model (Connection, Command and Recordset), they could write an application that could connect to a wide set of different data sources, both relational and non-relational. What the documentation, and the majority of the articles and samples at that time, usually forgot to say was that even with the same data access technology, the programmability and the characteristics of the various data sources were very different from one another.

The net result was that in applications which needed data from several sources, it was easiest to use the "common denominator" of functionalities provided by all data sources, thereby missing the benefits of using data source-specific options that could provide an optimized way to access and manipulate information inside the various RDBMSs.

What always made me skeptical about this approach was that, after a more detailed analysis of the problem with my customers, we usually agreed that the portion of the application that interacted with the data source was very small compared to the rest of the presentation and business logic. By doing a good job with a modular design, it was possible to isolate the RDBMS-specific code in a few easily interchangeable modules, and thereby avoid the "one-size-fits-all" approach to our data access. Instead, we could use very specific data access code (using stored procedures, command batches and other features, depending on the data source), without touching the majority of the other application code. This always serves as a reminder that correct design is the key to writing portable and efficient code.

ADO.NET brings some important changes into the data access coding arena, like the concept of specialized .NET data providers. Using a specific provider, you can get an optimized way to reach your data sources, bypassing the rich, but sometimes unnecessary, series of software interfaces and services that the OLE DB and ODBC layers interposed between your data access code and the database server. Still, every data source has different characteristics and features, with different SQL dialects, and to write efficient applications you must still use these specific characteristics instead of a "common denominator". From the point of view of portability, managed and unmanaged data access technologies are still very similar.

Outside of "Leverage the unique characteristics of the data source," the other rules necessary to write a good data access layer are usually the same with every data source:

· Use a connection pooling mechanism, where possible.
· Take care with the limited resources of a database server.
· Pay attention to the network round-trips.
· Promote the reuse of execution plans and avoid recompilations, where applicable.
· Use an adequate locking model to manage concurrency.

In my personal experience using the modular design approach, the amount of code in a complete application which is dedicated to working with a particular data source is not more than 10% of the total. Obviously, this is more complex than just changing the connection string in a configuration file, but I think you'll find that it is a tolerable compromise in return for the performance benefits.

Working with Base Interfaces

Our goal here is to use abstraction, and encapsulate the code specific to a particular data source in a layer of classes that let the rest of the application be independent, or decoupled, from the database server in the backend.

The object-oriented characteristics of the .NET Framework will help us during this process, giving us the opportunity to choose which level of abstraction we want to use. One option is to use the base interfaces that every .NET Data Provider has to implement (IDbConnection, IDbCommand, IDataReader, etc). Another is to create a set of classes—the data access layer—that manage all the data access logic for the application (using the CRUD paradigm, for example). We will examine these two possibilities, starting from a sample order-entry application, based on the Northwind database, to insert and retrieve information from different data sources.

Data provider base interfaces identify the classic behaviors that an application usually requires to interact with a data source:

· Define a connection string.
· Open and close a physical connection to the data source.
· Define a command and related parameters.
· Execute the different kinds of commands you can create:
    · Returning a set of data.
    · Returning a scalar value.
    · Executing an action on data without returning anything.
· Provide forward-only, read-only access to the returned data set.
· Define a set of operations to keep a data set in sync with the content of the data source (a data adapter).

That being said, however, if we encapsulate the various operations needed to retrieve, insert, update and delete information in different data sources (using different data providers) in our data access layer, and only expose members of the base interfaces, we can reach a first level of abstraction—at least from a data provider point of view. Let's take a look at some code illustrating this idea:
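A minimal sketch of such a factory class, written in C# against the .NET Framework 1.1 base interfaces, might look like the following. The DataProviderType enumeration and the method names are assumptions made for illustration, and only the SQL Server and OLE DB providers are covered.

```csharp
using System;
using System.Data;
using System.Data.OleDb;
using System.Data.SqlClient;

// Hypothetical enumeration of the providers this sketch supports.
public enum DataProviderType
{
    SqlClient,
    OleDb
}

// Creates provider-specific objects, but exposes them only through
// the base interfaces defined in System.Data.
public sealed class DataFactory
{
    private DataFactory() { }

    public static IDbConnection CreateConnection(
        DataProviderType provider, string connectionString)
    {
        switch (provider)
        {
            case DataProviderType.SqlClient:
                return new SqlConnection(connectionString);
            case DataProviderType.OleDb:
                return new OleDbConnection(connectionString);
            default:
                throw new ArgumentException("Unsupported provider.", "provider");
        }
    }

    public static IDbCommand CreateCommand(
        string commandText, IDbConnection connection)
    {
        // IDbConnection.CreateCommand returns the command type that
        // matches the provider of the connection.
        IDbCommand command = connection.CreateCommand();
        command.CommandText = commandText;
        return command;
    }

    public static IDbDataAdapter CreateAdapter(IDbCommand selectCommand)
    {
        if (selectCommand is SqlCommand)
            return new SqlDataAdapter((SqlCommand)selectCommand);
        if (selectCommand is OleDbCommand)
            return new OleDbDataAdapter((OleDbCommand)selectCommand);
        throw new ArgumentException("Unsupported command type.", "selectCommand");
    }
}
```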

The point of this class is to hide, from the upper levels of the application, the details of creating instances of the types that belong to a specific data provider. The application can now interact with a data source using the generic behaviors exposed through the base interfaces.

Let's look at how to use this class from the rest of the application:
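As a rough sketch, a CustomerData class built on such a factory could be written as follows. The appSettings keys ("Provider" and "ConnectionString") and the selected columns are assumptions, and a condensed stand-in for DataFactory is included only so that the listing compiles on its own. ConfigurationSettings is the .NET Framework 1.1 API; later versions of the Framework replace it with ConfigurationManager.

```csharp
using System;
using System.Configuration;
using System.Data;
using System.Data.OleDb;
using System.Data.SqlClient;

// Condensed stand-in for the DataFactory class, included only so this
// listing compiles on its own.
public sealed class DataFactory
{
    private DataFactory() { }

    public static IDbConnection CreateConnection(
        string provider, string connectionString)
    {
        if (provider == "SqlClient") return new SqlConnection(connectionString);
        if (provider == "OleDb") return new OleDbConnection(connectionString);
        throw new ArgumentException("Unsupported provider: " + provider, "provider");
    }
}

public class CustomerData
{
    public DataTable GetCustomers()
    {
        // Hypothetical appSettings keys read from the configuration file.
        string provider = ConfigurationSettings.AppSettings["Provider"];
        string connectionString =
            ConfigurationSettings.AppSettings["ConnectionString"];

        DataTable customers = new DataTable("Customers");

        // From here on, only the base interfaces are used.
        using (IDbConnection connection =
                   DataFactory.CreateConnection(provider, connectionString))
        using (IDbCommand command = connection.CreateCommand())
        {
            command.CommandText =
                "SELECT CustomerID, CompanyName, ContactName FROM Customers";
            connection.Open();

            using (IDataReader reader = command.ExecuteReader())
            {
                // Build the columns from the reader, then copy each row.
                for (int i = 0; i < reader.FieldCount; i++)
                    customers.Columns.Add(reader.GetName(i), reader.GetFieldType(i));

                object[] values = new object[reader.FieldCount];
                while (reader.Read())
                {
                    reader.GetValues(values);
                    customers.Rows.Add(values);
                }
            }
        }
        return customers;
    }
}
```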

In the GetCustomers() method of our CustomerData class we can see how, by reading information from a configuration file, it's possible to use the DataFactory class to create an XxxConnection instance with a particular connection string, and write the rest of the code with no particular dependency on the underlying data source.

An example of a business layer class that interacts with our data layer could look like this:
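One possible shape for such a class is sketched below. CustomerBusiness and GetCompanyNames are hypothetical names, and the CustomerData class shown here is a stand-in that returns two in-memory Northwind rows so that the listing runs without a database.

```csharp
using System.Data;

// Stand-in for the data layer class: it returns an in-memory table so
// this listing runs without a database. The real class would query the
// configured data source.
public class CustomerData
{
    public DataTable GetCustomers()
    {
        DataTable customers = new DataTable("Customers");
        customers.Columns.Add("CustomerID", typeof(string));
        customers.Columns.Add("CompanyName", typeof(string));
        customers.Rows.Add(new object[] { "ALFKI", "Alfreds Futterkiste" });
        customers.Rows.Add(new object[] { "ANATR", "Ana Trujillo Emparedados y helados" });
        return customers;
    }
}

// Hypothetical business layer class. It knows nothing about connections,
// commands, or providers; it only consumes the DataTable it is handed.
public class CustomerBusiness
{
    private CustomerData data = new CustomerData();

    public string[] GetCompanyNames()
    {
        DataTable customers = data.GetCustomers();
        string[] names = new string[customers.Rows.Count];
        for (int i = 0; i < customers.Rows.Count; i++)
            names[i] = (string)customers.Rows[i]["CompanyName"];
        return names;
    }
}
```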

So, what's wrong with this approach? The problem here is there's just one important detail that ties the code to a particular data source: the SQL syntax of the command string! In fact, writing your app this way, the only thing you can do to make it portable is to adopt a base SQL syntax that can be interpreted by any of your data sources, thereby losing any chance to benefit from the specific functionality of a particular data source. This could be a small problem if your application has to do only very simple and standard operations over the data, and if you don't want to use advanced functionality (XML support, for example) in a particular data source. Usually, though, this approach will result in poor performance, since you cannot use the optimized features of each data source.
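Even something as small as a parameter marker shows the problem. The SQL Server provider expects named parameters prefixed with @, the OLE DB provider expects positional ? placeholders, and the Oracle provider expects names prefixed with a colon. The sketch below (CustomerQueries is a hypothetical helper, and the provider names are plain strings chosen for illustration) shows the same logical query written three ways:

```csharp
using System;

// Hypothetical helper that returns the same logical query in the form
// each provider expects.
public sealed class CustomerQueries
{
    private CustomerQueries() { }

    public static string CustomersByCountry(string provider)
    {
        const string select =
            "SELECT CustomerID, CompanyName FROM Customers WHERE Country = ";

        switch (provider)
        {
            case "SqlClient":
                return select + "@Country";   // named parameter
            case "OleDb":
                return select + "?";          // positional placeholder
            case "OracleClient":
                return select + ":Country";   // named parameter, colon prefix
            default:
                throw new ArgumentException(
                    "Unsupported provider: " + provider, "provider");
        }
    }
}
```

Once command text has to vary like this, code written purely against the base interfaces is no longer provider-neutral in practice.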

Writing a Specialized Data Access Layer

Consequently, the use of base interfaces alone is not enough to provide an acceptable level of abstraction from the different data sources. In this situation, a good solution could be to "raise the bar" of this abstraction by creating a set of classes (Customer, Order, and so on) that encapsulate the use of a specific data provider, and that exchange information with the other levels of the application through data structures not related to a particular data source: a typed DataSet, an object collection, etc.

This layer of specialized classes can be created inside a particular assembly, one for every supported data source, and can be loaded on demand from the application, following instructions in a configuration file. In this way, if you want to add a brand new data source to your application, the only thing you have to do is implement a new set of classes, respecting the "contract" defined in the common set of interfaces.

Let's see a real example: If we wanted to support both Microsoft SQL Server and Microsoft Access as data sources, we would create two different projects in Microsoft Visual Studio .NET, one for each data source.

The one for SQL Server would look like this:
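A minimal sketch of the SQL Server project, assuming a "ConnectionString" appSettings key and the hypothetical namespaces NorthwindData.Common and NorthwindData.SqlServer, could be the following. The IdbCustomers contract is repeated here only so that the listing compiles on its own; in a real solution it would live once, in a shared assembly.

```csharp
using System.Configuration;
using System.Data;
using System.Data.SqlClient;

namespace NorthwindData.Common
{
    // Shared contract, repeated here only so the listing compiles on its own.
    public interface IdbCustomers
    {
        DataTable GetCustomers();
    }
}

namespace NorthwindData.SqlServer
{
    public class CustomersData : NorthwindData.Common.IdbCustomers
    {
        public DataTable GetCustomers()
        {
            string connectionString =
                ConfigurationSettings.AppSettings["ConnectionString"];

            DataTable customers = new DataTable("Customers");

            using (SqlConnection connection = new SqlConnection(connectionString))
            {
                // Provider-specific code is welcome here. A stored procedure
                // call could replace this T-SQL statement without affecting
                // any caller.
                SqlDataAdapter adapter = new SqlDataAdapter(
                    "SELECT CustomerID, CompanyName, ContactName FROM dbo.Customers",
                    connection);

                // Fill opens the connection if it is closed and closes it again.
                adapter.Fill(customers);
            }
            return customers;
        }
    }
}
```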

The code for data retrieval from Microsoft? Access would look like this:
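The Access counterpart can be sketched in the same way with the OLE DB provider. As before, the namespaces and the "ConnectionString" key are assumptions, and the contract is repeated only to keep the listing self-contained. The connection string in the configuration file would name the Jet provider and the path to the .mdb file.

```csharp
using System.Configuration;
using System.Data;
using System.Data.OleDb;

namespace NorthwindData.Common
{
    // Shared contract, repeated here only so the listing compiles on its own.
    public interface IdbCustomers
    {
        DataTable GetCustomers();
    }
}

namespace NorthwindData.Access
{
    public class CustomersData : NorthwindData.Common.IdbCustomers
    {
        public DataTable GetCustomers()
        {
            string connectionString =
                ConfigurationSettings.AppSettings["ConnectionString"];

            DataTable customers = new DataTable("Customers");

            using (OleDbConnection connection = new OleDbConnection(connectionString))
            {
                // Jet SQL is used here; any parameters would be positional (?).
                OleDbDataAdapter adapter = new OleDbDataAdapter(
                    "SELECT CustomerID, CompanyName, ContactName FROM Customers",
                    connection);

                // Fill opens the connection if it is closed and closes it again.
                adapter.Fill(customers);
            }
            return customers;
        }
    }
}
```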

The CustomersData classes implement the IdbCustomers interface. When we need to support a new data source, we only have to create a new class that implements this interface.

An interface of this type can look like this:
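A minimal sketch of the contract follows, assuming a single GetCustomers method that returns a generic DataTable. The namespace is hypothetical, and a real application would probably add insert, update, and delete members.

```csharp
using System.Data;

namespace NorthwindData.Common
{
    // Contract that every data-source-specific CustomersData class
    // implements. It mentions no provider types, so callers stay
    // independent of the data source.
    public interface IdbCustomers
    {
        // Returns the customers as a provider-neutral DataTable.
        DataTable GetCustomers();
    }
}
```

Placing the interface in its own assembly lets both the data access assemblies and the upper layers reference it without referencing each other.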

We can create private or shared assemblies to encapsulate these data access classes; in the first case, the assembly loader will search for the one we specify in the configuration file inside the AppBase folder, or in a child directory using the classic probing rules. If we have to share these classes with other applications, we can put these assemblies in the global assembly cache.

Using Data Access Classes from the Other Layers

These two almost identical CustomersData classes are contained in two different assemblies that the rest of the application will use. Through the following configuration file, we can now specify which assembly to load and which data source to target.

An example of a possible configuration file would be something like this:
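As an illustration only, an application configuration file for the SQL Server case might contain the following. The key names ("ConnectionString" and "DataAssembly") and the assembly name are assumptions, not fixed conventions.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <appSettings>
    <!-- Connection string for the targeted data source -->
    <add key="ConnectionString"
         value="Server=(local);Database=Northwind;Integrated Security=SSPI;" />
    <!-- Assembly that contains the data access classes to load -->
    <add key="DataAssembly" value="NorthwindData.SqlServer" />
  </appSettings>
</configuration>
```

Switching to Access would then mean changing both values: a Jet connection string, and the name of the assembly built from the Access project.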

We have to specify two pieces of information inside this file. The first is a canonical connection string, which gives us the opportunity to change, for example, the name of the server or some other connection parameter. The second is the fully qualified name of the assembly that the upper layer of the application will load dynamically to find the class to use with a particular data source.

Let's look at this portion of code too:
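A sketch of that loading step, using reflection, is shown below. CustomersLoader is a hypothetical helper, and the interface is declared inline only to keep the listing self-contained. In a real solution it must come from an assembly shared by the caller and the data access assemblies, otherwise the cast fails.

```csharp
using System;
using System.Data;
using System.Reflection;

// Declared inline only so the listing compiles on its own.
public interface IdbCustomers
{
    DataTable GetCustomers();
}

// Hypothetical helper that loads a data access class by name.
public sealed class CustomersLoader
{
    private CustomersLoader() { }

    public static IdbCustomers Load(string assemblyName, string typeName)
    {
        // Resolve the assembly through the normal probing rules
        // (application base directory, then the global assembly cache
        // for strong-named assemblies).
        Assembly dataAssembly = Assembly.Load(assemblyName);

        // CreateInstance returns null when the type is not found.
        object instance = dataAssembly.CreateInstance(typeName);
        if (instance == null)
            throw new TypeLoadException(
                "Type " + typeName + " was not found in " + assemblyName + ".");

        IdbCustomers customers = instance as IdbCustomers;
        if (customers == null)
            throw new InvalidCastException(
                typeName + " does not implement IdbCustomers.");

        return customers;
    }
}
```

A caller would typically read the assembly name from the configuration file, derive the type name by a convention of its own (for example, the assembly name followed by ".CustomersData"), call CustomersLoader.Load, and then invoke GetCustomers() on the result.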

As you can see, the assembly is loaded using the name read from the configuration file, and an instance of the CustomersData class is then created and used.

Some Possible Improvements

To see an illustration of the approach I'm suggesting, see the .NET Pet Shop v3.0 sample application. I'd recommend downloading the sample and taking an in-depth look at it—not just for portability issues, but also for other interesting areas like caching and performance optimization.

An important area on which to focus your attention during the design of the data access layer for a portable application is how to pass the information back and forth with the other layers. In my example, I simply use a generic DataTable instance; in a production scenario you might want to consider a different solution, based on what kind of data you have to represent (do you have to deal with hierarchy, etc.). I don't want to reinvent the wheel here, and my suggestion is to take a look at the Designing Data Tier Components and Passing Data Through Tiers guide that describes very well the different scenarios and the benefits of the recommended solutions.

As I said in the introduction, the particular features that your targeted data sources expose, as well as the overall data access, should be considered during the design phase. This should cover such things as stored procedures, XML serialization, and so forth. For Microsoft SQL Server 2000, you can find a discussion of how to make the best use of these features in the .NET Data Access Architecture Guide, which I strongly suggest you read.

I always receive a lot of requests about the Data Access Application Block and how it is related to the arguments I'm describing in this article. These .NET classes act as a layer of abstraction over the SQL Server .NET Data Provider, and let you write more elegant code to interact with the database server. This is an idea of what you can do:
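For example, a stored procedure call through the block's SqlHelper class can be reduced to a single statement. The sketch below assumes that the Data Access Application Block assembly (Microsoft.ApplicationBlocks.Data) is referenced; ProductData and the getProductsByCategory stored procedure are hypothetical names used only for illustration.

```csharp
using System.Data;
using System.Data.SqlClient;
using Microsoft.ApplicationBlocks.Data;

// Hypothetical data access class built on the application block.
public class ProductData
{
    public DataSet GetProductsByCategory(string connectionString, int categoryId)
    {
        // SqlHelper opens the connection, builds the command, attaches the
        // parameter, fills the DataSet, and closes the connection.
        return SqlHelper.ExecuteDataset(
            connectionString,
            CommandType.StoredProcedure,
            "getProductsByCategory",          // hypothetical stored procedure
            new SqlParameter("@CategoryID", categoryId));
    }
}
```

Note that this helper is tied to the SQL Server provider, so it belongs inside a SQL Server-specific data access class rather than in the layers above it.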

There's also an extrapolation of this approach available in the open source Data Access Block 3.0 (Abstract Factory Implementation) sample that you can find on GotDotNet. This release implements the same abstract factory pattern, and lets you use different data sources based on the available .NET Data Providers.


You should now be able to build business logic classes that don't require modification based on the choice of a particular data source, yet allow you to exploit the unique features of the given data source to obtain a more optimized result. This comes at a cost: we have to implement multiple sets of classes to encapsulate the low-level operations for a particular data source, together with all the programmable objects that we build for every specific data source (stored procedures, functions, etc.). If you want performance and portability, however, this is the price you have to pay. Based on my practical experience, it's worth it!

