How to master the art of type specificity

Do more specific definitions result in less flexibility?

In this post I will try to avoid the debate about strong/static vs. weak/dynamic types (what more could possibly be said?), or even schema vs. schemaless data structures. Instead, I want to focus on the degree of granularity of type definitions: what are the effects and trade-offs?

At one end of the spectrum, very generic definitions encompass all the potential properties and behaviors of objects. At the other end, you have a rich hierarchy of types, some of which differ only subtly from others.

I will touch upon duck typing, SQL table-per-type (TPT) and table-per-type-hierarchy (TPH) concepts, and parameterized APIs.

When you think of generic types you might think of the Document Object Model (DOM), schemaless XML or YAML, literal objects in JavaScript, or NoSQL database documents. These are broadly generic, in that there are minimal constraints on structure, relations, and content.

Instead, let’s discuss user-defined types. They may or may not be enforced by the programming language or a schema, but there will be constraints, assumed or otherwise, in the code that deals with them. Let’s use Vehicle as an analogy.

Vehicle

A vehicle is a broad concept. Even if we confine discussion to wheeled vehicles, that covers everything from tricycles to semi-trucks. Could you encompass the spectrum of properties and behaviors of those tricycles, cars, and semis in one type? Yeah, you could. Clearly, that’s going to present some problems when handling Vehicle instances in the program code.

The Vehicle Type

Possible properties and methods of a Vehicle:

  • tires
    * number
    * type [pneumatic, other]
  • seats
    * number
    * padded [boolean]
  • steering [wheel, handlebars]
  • engine
    * type [none, gas, diesel]
    * number of cylinders [only if type is gas or diesel]
  • drive()
  • fuel()
  • lights [on|high|off]

With even this minimal set of properties, the Vehicle type covers a huge domain and presents some challenges, data integrity being one of them. If my Vehicle is a trike, I don’t have an engine. If I don’t have an engine, the property number of cylinders is meaningless. If I have a trike with no engine, but number of cylinders > 0, is that an error?

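One possible answer to the question above is to treat the mismatch as a validation error. The following is only a sketch under that assumption: validateVehicle and the shape of the sample objects (engine.type, engine.cylinders) are hypothetical names chosen for illustration.

```javascript
// Hypothetical validator for a generic Vehicle record.
// Returns a list of integrity problems; an empty list means "looks consistent".
function validateVehicle(vehicle) {
    const errors = [];
    // A missing engine object is treated the same as engine type "none".
    const engineType = vehicle.engine?.type ?? "none";
    const cylinders = vehicle.engine?.cylinders;

    if (engineType === "none" && cylinders > 0) {
        errors.push("cylinders set on a vehicle without an engine");
    }
    if (engineType !== "none" && !(cylinders > 0)) {
        errors.push("engine present but cylinder count missing");
    }
    return errors;
}

// Sample data (assumed shapes, not from any real system).
const trike = { tires: { number: 3 }, engine: { type: "none", cylinders: 4 } };
const car = { tires: { number: 4 }, engine: { type: "gas", cylinders: 6 } };

console.log(validateVehicle(trike));
console.log(validateVehicle(car));
```

The point is less the rule itself than where it lives: with one broad type, every such cross-property rule has to be written and maintained by hand.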

I can fuel a car or truck, but not a tricycle. What happens if fuel() is called on a tricycle instance? Throw an Error? It is possible that some application logic is confused, but can the request be handled gracefully as a no-op?

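The two options can be put side by side. This is a minimal sketch, assuming a single Vehicle class with an engineType field; fuelStrict and fuelLenient are made-up names used only to contrast the behaviors.

```javascript
// Hypothetical generic Vehicle with two ways of handling fuel() on a trike.
class Vehicle {
    constructor(engineType = "none") {
        this.engineType = engineType;
        this.fuelLevel = 0;
    }

    // Strict variant: the caller is told loudly that the request makes no sense.
    fuelStrict(amount) {
        if (this.engineType === "none") {
            throw new Error("this vehicle has no engine to fuel");
        }
        this.fuelLevel += amount;
    }

    // Lenient variant: a no-op for engineless vehicles.
    // Returns true only if fuel was actually added.
    fuelLenient(amount) {
        if (this.engineType === "none") {
            return false;
        }
        this.fuelLevel += amount;
        return true;
    }
}

const tricycle = new Vehicle();
console.log(tricycle.fuelLenient(5)); // nothing is added to the tricycle
```

The strict variant surfaces confused application logic early; the lenient one keeps the program running but can hide the confusion.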

The one perceived advantage to Vehicle is that it is flexible. If we instead split up Vehicle into subclasses MotorVehicle and PedalVehicle, we might put the following in MotorVehicle but not PedalVehicle:

  • steering [wheel]
  • engine
    * type [gas, diesel]
    * number of cylinders
  • fuel()
  • lights [on|high|off]

This seemingly makes sense. It is conceivable, though, that a tricycle has lights. It may not have a gas or diesel engine (not a kid’s trike, anyway), but it could have an electric engine. If these cases arise, then there’s some refactoring to do.

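The split described above might look roughly like this. It is a sketch only: the constructor arguments and return strings are assumptions made for illustration.

```javascript
// Shared base: only what every wheeled vehicle is assumed to have.
class Vehicle {
    constructor({ tires, seats }) {
        this.tires = tires;
        this.seats = seats;
    }
    drive() {
        return "moving";
    }
}

// Engine, fuel(), and lights live only on the motorized branch.
class MotorVehicle extends Vehicle {
    constructor({ engine, ...common }) {
        super(common);
        this.engine = engine;
        this.lights = "off";
    }
    fuel() {
        return `fueling ${this.engine.type} engine`;
    }
}

class PedalVehicle extends Vehicle {}

const trike = new PedalVehicle({ tires: 3, seats: 1 });
const semi = new MotorVehicle({
    engine: { type: "diesel", cylinders: 6 },
    tires: 18,
    seats: 2,
});

// An electric trike with lights fits neither branch cleanly:
// PedalVehicle has no engine or lights, and MotorVehicle assumes fuel().
console.log(typeof trike.fuel, semi.fuel());
```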

In some languages or data management systems, you can define interfaces, and compose concrete types that fulfill those interfaces. So, you might have IEnginedVehicle, which might have related interfaces IElectricVehicle and IInternalCombustionVehicle (which in turn might be broken down into IGasVehicle and IDieselVehicle).

Interfaces are cheap to define, and good at annotating concepts, but they’re not a complete solution. Some interfaces can be incompatible with others: can a truck be both an ice cream truck and a pizza delivery truck? I suppose, if you want cold pizza or warm ice cream.

Aside from that, more specificity boxes you in, and requires you to have some foreknowledge of all the types of vehicles you will encounter.

It’s the exceptions that are going to get you as time marches on.

For this reason, especially when the domain is broad and in flux, it can be tempting to define vehicle entities less specifically, initially. You want to be open to anything that comes down the pike (pardon the pun).

Coding against generic types

On the coding side, there can be no assumptions about what Vehicle is. You must check every property for existence. Methods that exist may be meaningless for the specific entity that is represented by Vehicle. Your best bet is to have your code assume nothing. That makes testing a challenge, though. How can you possibly encompass all reasonable Vehicle configurations in your tests?

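Code that assumes nothing tends to look like the sketch below. The describe function and the property names it probes (tires.number, engine.type, fuel) are hypothetical, loosely following the property list earlier in this post.

```javascript
// Hypothetical consumer of a generic Vehicle: every property is checked
// before use, because nothing about the object is guaranteed.
function describe(vehicle) {
    const parts = [];
    if (vehicle && typeof vehicle === "object") {
        if (vehicle.tires && Number.isInteger(vehicle.tires.number)) {
            parts.push(`${vehicle.tires.number} tires`);
        }
        if (vehicle.engine && vehicle.engine.type && vehicle.engine.type !== "none") {
            parts.push(`${vehicle.engine.type} engine`);
        }
        if (typeof vehicle.fuel === "function") {
            parts.push("can be fueled");
        }
    }
    return parts.length > 0 ? parts.join(", ") : "unknown vehicle";
}

console.log(describe({ tires: { number: 3 } }));
console.log(describe(null));
```

Each guard is another branch to test, which is where the testing burden mentioned above comes from.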

On the other hand, you have a pretty flexible system; that is, if no assumptions creep into your code (more about this in “Why a duck?”).

Too much specificity requires constant adjustments to the type model: deciding what the inheritance taxonomy is, deciding which property goes at which level, and coping with model changes that affect not just code at the data layer, but the presentation layer as well. If you get it badly wrong (due to rushed analysis), you face a lot of continuous rework.

Types and their properties

If you buy a grab box of stuff from an online novelty store, you can expect a box. You have a vague idea of what it contains, but you won’t know until you open it and sort out each item one-by-one. The burden is on you, the client, and there are limited assumptions you can make (one might hope for a rubber chicken, but no guarantee!).

A first aid kit has a narrower range of possibilities as to what it contains. It’s a more specific type of object, and you can make assumptions as to its content and proceed accordingly. It’s going to contain gauze and bandages. It will have antiseptic, and probably pain relievers. For stuff that it might contain, you at least have a better idea what to look for.

Why a duck?

Duck typing operates by incidence rather than declaration. Program logic revolves around interrogation of an object: “By the way, do you have property A? Do you have method B?…”.

Actions are performed based on responses to the interrogation. If it walks like a duck, quacks like a duck and has feathers, then it is probably a duck. Logic that is based on duck typing really doesn’t care, duck or no, because it assumes nothing; it operates on what it finds.

Yet assumptions will creep into any software logic that thinks it’s getting what it expects. Perhaps as much as 50% of software maintenance involves fixing incorrect assumptions or refining the ones that are there.

Duck typing and the first responder

Say I have a fire in my kitchen and call an emergency number. The first responder has a badge and a helmet, and arrives in a vehicle with a siren and flashing lights. Yay! The fireman! My house is saved. I command, pointing to the kitchen: “Put out that fire!”

The policeman looks at me quizzically.

I did all my duck typing interrogation, but reached the wrong assumption. Maybe the city recently decided policemen should respond to fire alarms if nearby, to aid the firemen.

I now have to add to my list of questions: “Do you put out fires?”

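In code, the story might be sketched as follows. All names here (looksLikeFirstResponder, requestHelp, putOutFire, and the object shapes) are invented for the illustration.

```javascript
// The original interrogation: badge, helmet, siren, flashing lights.
function looksLikeFirstResponder(person) {
    return Boolean(
        person.badge &&
        person.helmet &&
        person.vehicle &&
        person.vehicle.siren &&
        person.vehicle.flashingLights
    );
}

function requestHelp(person) {
    if (!looksLikeFirstResponder(person)) {
        return "keep waiting";
    }
    // The extra question: check for the capability itself, not just the look.
    if (typeof person.putOutFire !== "function") {
        return "ask for a firefighter";
    }
    return person.putOutFire();
}

const police = {
    badge: true,
    helmet: true,
    vehicle: { siren: true, flashingLights: true },
};
const firefighter = { ...police, putOutFire: () => "fire is out" };

console.log(requestHelp(police));
console.log(requestHelp(firefighter));
```

Both objects pass the first check; only the added capability check tells them apart.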

Of properties, discriminators, and named types

Duck typing is extremely flexible, but your code must deal with each object as if it could be anything. Instead of interrogating all properties, though, you can add a special discriminator property that identifies the type of object your code is receiving. One interrogation, and you're off to the races. Of course, the object has to have the correct discriminator value.

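A discriminator check can be sketched like this. The property name kind, its values, and the wheel counts are assumptions for the example, not a convention taken from any library.

```javascript
// Hypothetical dispatch on a discriminator property named "kind".
function wheelsFor(vehicle) {
    switch (vehicle.kind) {
        case "tricycle":
            return 3;
        case "car":
            return 4;
        case "semi":
            return 18;
        default:
            // An unknown or wrong discriminator value surfaces here.
            throw new Error(`unknown vehicle kind: ${vehicle.kind}`);
    }
}

console.log(wheelsFor({ kind: "car" }));
```

One check replaces many, but only as long as whoever builds the object sets kind correctly.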

A named type is less likely to cause you problems, as types are assigned at object creation. In a weakly typed language, such as JavaScript, things may not be as they seem, but your assumptions are somewhat safer.

Still, discriminators or types don’t really address the problem of specificity. The good old Object type doesn’t say much about its instances. It is a type, it does make some guarantees, but doesn’t do much by itself.

You can pass an object literal to a method, but the method must either 1) assume what it is getting, or 2) be prepared to find out.

Maintaining code that handles generic types can be an exercise in aggravation: while you can see what the client code might do, to know what it will do requires the specifics of the data it is handling.

A debugger helps, but if your breakpoint is buried far down in the call stack, or is in response to a callback, good luck! You may have some heavy excavating to do to know how you got where you are, logic-wise.

Table-per-Type and Table-per-Type-Hierarchy

Relational databases run into this issue as well. If a table represents a type of thing, are all rows in the table type-homogeneous? Or could each row reflect a more specific type, and the table represents a supertype of those things?

In the first case (table-per-type, or TPT), each column in each row is guaranteed to contain a valid value (NULL may be valid). Your code can anticipate query results that are consistent in their uniformity.

In the second case, some columns or column values may be valid for some types (rows) but not for others. This is table-per-type-hierarchy, or TPH.

A TPH table is a loosely defined type. The integrity of column values in each row is up to program logic. If I have a table called Vehicle containing data for all vehicles in my domain, then the column “oil weight” isn’t going to be applicable for rows representing tricycles.

The burden is now on the client code to understand the various possible types of vehicles in the Vehicle table, and perform logic accordingly. This is very similar to the case of a duck-typed object, where properties may or may not be applicable for each instance of the generic type.

Schema, anyone?

Does a schema (or other type system) take care of this problem? Well, no. As just shown, a TPH schema in a relational database can represent a super-type entity, but the rows may each define more specific entities. A discriminator column value can help sort out the subtype of each row, but it has to be checked in program logic.

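On the client side, that check might look like the sketch below. The rows stand in for a query result from a single Vehicle table; the column names (vehicle_type, tire_count, oil_weight) and values are made up for the example.

```javascript
// Hypothetical rows as they might come back from one TPH Vehicle table.
// vehicle_type plays the role of the discriminator column.
const rows = [
    { id: 1, vehicle_type: "car", tire_count: 4, oil_weight: "5W-30" },
    { id: 2, vehicle_type: "tricycle", tire_count: 3, oil_weight: null },
    { id: 3, vehicle_type: "semi", tire_count: 18, oil_weight: "15W-40" },
];

// The schema allows oil_weight on every row; only program logic knows
// that the column is not applicable to tricycles.
function oilChangeList(vehicleRows) {
    return vehicleRows
        .filter((row) => row.vehicle_type !== "tricycle")
        .map((row) => `${row.id}: ${row.oil_weight}`);
}

console.log(oilChangeList(rows));
```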

The main benefit of using TPH is avoiding a huge schema with many tables, and lessening the number of joins required to pull together data for a type instance. There are always trade-offs to any approach.

Parameter lists and options

Method parameters are another issue. The most common case is where parameter type is defined by order of occurrence:

function circle(int x, int y, double radius){…}

or

function circle(Position xy, double radius){…}

Arguments defined this way are locked-in: you can’t pass a boolean to radius, for instance. In JavaScript, there are no typed parameters, so most functions assume the type based on order of occurrence.

Not only is the type of each parameter known (by declaration) or assumed (by convention), but the number of parameters also dictates how the method is called.

I always feel a slight annoyance whenever I want to dump some formatted JSON to the console, and have to type JSON.stringify(obj, null, 4). That second argument, which is seldom used, is for the replacer parameter.

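The annoyance is easy to see in a short example. JSON.stringify is the standard built-in; prettyJson is a hypothetical helper, shown only as one way to hide the positional placeholder.

```javascript
const obj = { name: "trike", tires: 3 };

// The replacer slot has to be filled (here with null) just to reach
// the third positional parameter, the indentation.
const pretty = JSON.stringify(obj, null, 4);
console.log(pretty);

// A small wrapper that hides the seldom-used second argument.
function prettyJson(value, indent = 4) {
    return JSON.stringify(value, null, indent);
}

console.log(prettyJson(obj) === pretty);
```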

Options

In JavaScript, you can pass an object literal as an argument, and this is often used as a named parameter list. Named parameters are more flexible than an argument list, and for more complex methods they can be very useful.

function circle(options) {
    // x, y, and radius are assumed to be present; the rest is optional
    const {x, y, radius, ...rest} = options;
    if (rest.linewidth) { /* ... */ }
    if (rest.fillColor) { /* ... */ }
    // ...
}

Flexible, yes, but a lot of interrogation. Plus, the arguments x, y, and radius are assumed to be there. Best practice seems to be to mix the type-specific parameter list with the more “generic” object literal:

function circle(x, y, radius, options){...}

Here, options is typically understood to refer to an object whose properties are documented.

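A fuller version of that mixed signature might look like the following sketch. The default values and the returned object are assumptions for illustration; only the parameter names come from the snippets in this post.

```javascript
// Required values are positional; optional ones travel in a documented
// options object, with defaults filled in by destructuring.
function circle(x, y, radius, options = {}) {
    const { linewidth = 1, fillColor = "none" } = options;
    // Return a plain description instead of drawing, to keep the sketch runnable.
    return { x, y, radius, linewidth, fillColor };
}

console.log(circle(0, 0, 5));
console.log(circle(1, 2, 3, { fillColor: "red" }));
```

The required arguments stay visible in the signature, and callers only name the options they care about.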

What to do?

Few practices in software are wholly good or bad (GOTO being the exception[?]). A rigid, type-rich system will no doubt prevent some coding errors, even if those types are not strongly enforced by the language or database. Code that uses specific types is more readable.

On the other hand, a stringent type hierarchy represents metadata that has to be maintained, and oftentimes the client knows what it is requesting and knows what it will receive. Dotting every “i” and crossing every “t” just for the sake of data transfer between two internal methods at times seems like bookkeeping work.

There is no right answer, and most programmers use types of varying (or no) specificity. A lot depends on the domain. If you’re writing code for a financial system, it would seem you’d want a rich and rigid set of type definitions; however, I understand some financial systems are written in MUMPS, so what do I know?

Translated from: https://www.freecodecamp.org/news/the-art-of-type-specificity-d0fdb6918e45/
