Oracle TCP empty-packet requests: Oracle Dead Connection Detection (DCD) explained

Latest recommended article published on 2021-09-04 13:58:25

辰予


--------

Dead Connection Detection (DCD) is a feature of SQL*Net 2.1 and later,
including Oracle Net8 and Oracle Net. DCD detects when a partner in a
SQL*Net V2 client/server or server/server connection has terminated
unexpectedly, and flags the dead session so that PMON can release the
resources associated with it.

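In short: Dead Connection Detection has been available since SQL*Net
2.1, including Net8 and Oracle Net. When the server detects that a
client has terminated abnormally, the PMON process releases the
resources allocated to that session.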

DCD is intended primarily for environments in which clients power down
their systems without disconnecting from their Oracle sessions, a
problem characteristic of networks with PC clients.

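In short: DCD mainly targets clients that shut down their machines
without first disconnecting normally, which is typical of networks with
PC clients.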

DCD is initiated on the server when a connection is established. At
this time SQL*Net reads the SQL*Net parameter files and sets a timer to
generate an alarm. The timer interval is set by providing a non-zero
value, in minutes, for the SQLNET.EXPIRE_TIME parameter in the
sqlnet.ora file.

In short: DCD is initialized on the server side. SQL*Net reads its
parameter files, and the interval is taken from a non-zero
SQLNET.EXPIRE_TIME value in sqlnet.ora.
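
As a rough illustration of the rule above (a non-zero SQLNET.EXPIRE_TIME, in minutes, turns the timer on), the following Python sketch reads the parameter out of sqlnet.ora text. The helper name `dcd_interval_minutes` and the sample file contents are hypothetical, and the parsing is deliberately simplified: it assumes one `name = value` pair per line and `#` comments, and it is not Oracle's own parser.

```python
import re

def dcd_interval_minutes(sqlnet_ora_text):
    """Return SQLNET.EXPIRE_TIME in minutes, or 0 if the parameter is absent."""
    for line in sqlnet_ora_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        match = re.fullmatch(r"SQLNET\.EXPIRE_TIME\s*=\s*(\d+)", line, re.IGNORECASE)
        if match:
            return int(match.group(1))
    return 0  # no parameter found: no DCD timer would be set

# Hypothetical sqlnet.ora contents on the database server.
sample = """
# sqlnet.ora
SQLNET.EXPIRE_TIME = 10
"""

minutes = dcd_interval_minutes(sample)
print("DCD enabled" if minutes > 0 else "DCD disabled", minutes)
```

A return value of 0 covers both a missing parameter and an explicit zero, which matches the note's statement that only a non-zero value sets the timer.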

When the timer expires, SQL*Net on the server sends a "probe" packet to
the client. (In the case of a database link, the destination of the
link constitutes the server side of the connection.) The probe is
essentially an empty SQL*Net packet and does not represent any form of
SQL*Net level data, but it creates data traffic on the underlying
protocol.

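In short: once the interval elapses, SQL*Net sends a "probe" packet to
the client. For a database link, the link's destination acts as the
server side. The probe is essentially an empty packet that carries no
SQL*Net level data but still generates traffic on the underlying
protocol.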

If the client end of the connection is still active, the probe is
discarded, and the timer mechanism is reset.

If the client has terminated abnormally, the server will receive an
error from the send call issued for the probe, and SQL*Net on the
server will signal the operating system to release the connection's
resources.

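In short: if the client is still active, the probe is discarded and the
timer is reset. If the client has terminated abnormally, the server's
send call returns an error, and the server signals the operating system
to release the connection's resources.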
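
The "send a probe, treat a failed send as a dead peer" idea can be sketched in Python as follows. This is only an analogy, not SQL*Net code: the function name `peer_alive` is hypothetical, the probe bytes are the first ten bytes of the packet dump shown in the trace later in this note, and a local socket pair stands in for the client/server connection.

```python
import socket

# First 10 bytes of the probe packet dump shown later in this note.
PROBE = bytes.fromhex("000a0000060000000000")

def peer_alive(sock):
    """Send one probe; return False if the send call itself reports an error."""
    try:
        sock.sendall(PROBE)
        return True
    except OSError:
        return False

# A connected local socket pair plays the roles of server and client.
server_side, client_side = socket.socketpair()
print(peer_alive(server_side))  # client end is open, so the send succeeds
client_side.close()             # simulate the client going away
print(peer_alive(server_side))  # the send now fails on typical Unix systems
server_side.close()
```

On a real TCP connection to a machine that has simply been powered off, the first send usually succeeds locally and the error only surfaces after the protocol stack has exhausted its retransmissions, which is the delay the protocol stack section of this note describes.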

On Unix servers, the sqlnet.ora file must be in either $TNS_ADMIN or
$ORACLE_HOME/network/admin. Neither /etc nor /var/opt/oracle alone is
valid.
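
A quick way to check which sqlnet.ora the server would pick up is to test the two locations named above. The Python sketch below does that; the function names are hypothetical, and checking $TNS_ADMIN before $ORACLE_HOME/network/admin is an assumption of this sketch rather than something the note states.

```python
import os

def sqlnet_ora_candidates(env):
    """Build the two sqlnet.ora paths named in the note from an environment mapping."""
    paths = []
    if env.get("TNS_ADMIN"):
        paths.append(os.path.join(env["TNS_ADMIN"], "sqlnet.ora"))
    if env.get("ORACLE_HOME"):
        paths.append(os.path.join(env["ORACLE_HOME"], "network", "admin", "sqlnet.ora"))
    return paths

def find_sqlnet_ora(env=os.environ):
    """Return the first candidate that exists on disk, or None."""
    for path in sqlnet_ora_candidates(env):
        if os.path.isfile(path):
            return path
    return None

print(find_sqlnet_ora())
```

Run it as the operating system user that owns the listener or database processes, since that user's environment determines which file is read.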

It should also be noted that in SQL*Net 2.1.x, an active orphan process
(one processing a query, for example) will not be killed until the
query completes. In SQL*Net 2.2, orphaned resources will be released
regardless of activity.

This is a server feature only. The client may be running any supported
SQL*Net V2 release.

THE FUNCTION OF THE PROTOCOL STACK

----------------------------------

While Dead Connection Detection is set at the SQL*Net level, it relies
heavily on the underlying protocol stack for its successful execution.
For example, you might set SQLNET.EXPIRE_TIME=1 in the sqlnet.ora file,
but it is unlikely that an orphaned server process will be cleaned up
immediately upon expiration of that interval.

TCP/IP, for example, is a connection-oriented protocol, and as such,
the protocol will implement some level of packet timeout and
retransmission in an effort to guarantee the safe and sequenced order
of data packets. If a timely acknowledgement is not received in
response to the probe packet, the TCP/IP stack will retransmit the
packet some number of times before timing out. After TCP/IP gives up,
SQL*Net receives notification that the probe failed.

The time that it takes TCP/IP to time out depends on the TCP/IP stack,
and timeouts of many minutes are entirely common. This has been an area
of concern for many customers, as many retransmissions at the protocol
layer can cause a significant lag between the expiration of the DCD
interval and the time when the orphaned process is actually killed.
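
To see why timeouts of many minutes are plausible, the Python sketch below adds up the waits of a retransmission timer that doubles after every attempt up to a cap. The function name and the numbers (0.2 s initial timeout, 120 s cap, 15 retransmissions) are illustrative assumptions only; the real values and the exact backoff rule depend on the TCP/IP stack in use.

```python
def give_up_after(initial_rto, max_rto, retransmissions):
    """Seconds until the stack gives up, using a doubling timeout capped at max_rto."""
    total = 0.0
    rto = initial_rto
    # One wait after the original send, plus one wait after each retransmission.
    for _ in range(retransmissions + 1):
        total += rto
        rto = min(rto * 2, max_rto)
    return total

# Illustrative values only; consult the platform documentation for real ones.
seconds = give_up_after(initial_rto=0.2, max_rto=120.0, retransmissions=15)
print(f"{seconds / 60:.1f} minutes")  # prints "15.4 minutes"
```

Under these assumed values, an orphaned server process would survive roughly a quarter of an hour beyond the DCD interval before SQL*Net learns that the probe failed.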

The easiest way to determine if the protocol stack is causing such a
delay involves testing different DCD intervals.

TESTING THE PROTOCOL STACK

--------------------------

Set the SQLNET.EXPIRE_TIME parameter to 1 minute and note the time
required to clean up an orphaned server process. Then set
SQLNET.EXPIRE_TIME to 5 minutes and again observe the time required to
clean up the shadow process. If the TCP/IP timeout is the reason the
server resources do not get released, the time to clean up the shadow
process should increase by about 4 minutes.
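
The reasoning behind this test is that the protocol stack's share of the delay stays constant while only the DCD interval changes. The Python sketch below expresses that check; the function names, the half-minute tolerance, and the sample measurements are hypothetical.

```python
def stack_delay_minutes(expire_time, observed_cleanup):
    """Part of the cleanup time that the DCD interval itself does not explain."""
    return observed_cleanup - expire_time

def stack_is_bottleneck(run_a, run_b, tolerance=0.5):
    """Each run is (SQLNET.EXPIRE_TIME, observed cleanup time), both in minutes."""
    delay_a = stack_delay_minutes(*run_a)
    delay_b = stack_delay_minutes(*run_b)
    # A roughly constant, non-trivial residual points at the protocol stack.
    return abs(delay_a - delay_b) <= tolerance and min(delay_a, delay_b) > tolerance

# Hypothetical measurements: cleanup took 10.0 and 14.2 minutes.
print(stack_is_bottleneck((1, 10.0), (5, 14.2)))  # True
```

In the sample data the cleanup time grew by about 4 minutes when the interval grew by 4 minutes, leaving about 9 minutes attributable to the stack in both runs.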

If the TCP/IP retransmission timeout is indeed the problem, the
operating system kernel can be tuned to reduce the interval between,
and the number of, packet retransmissions (on many Unix platforms, the
file /usr/include/netinet/tcp_timer.h contains the configuration
parameters).

Reducing the interval and number of retransmissions may impact other
system components, since in effect you are shrinking the window allowed
for connections to process data, possibly resulting in inadvertent loss
of connections during periods of heavy system load. Slower connections
from remote sites may be impacted by this change.

Kernel parameters that may affect retransmission include but are not
limited to TCP_TTL, TCPTV_PERSMIN, TCPTV_MAX, and TCP_LINGERTIME.

*** To avoid disrupting other system processes, it is important to
contact the appropriate vendor for assistance in tuning the operating
system kernel or protocol stack. ***

MONITORING DEAD CONNECTION DETECTION

------------------------------------

The best way to determine if DCD is enabled and functioning properly is
to generate a server trace and search the file for the DCD probe
packet. To generate a server trace, set TRACE_LEVEL_SERVER=16 and
TRACE_DIRECTORY_SERVER= in sqlnet.ora on the server (note the location
of the sqlnet.ora file). The resulting trace file will have a filename
of svr_.trc and will be located in the specified directory.

Is DCD Enabled?

---------------

For pre-Oracle8i versions, enable level 16 SQL*Net server tracing and
search the resultant server trace file for an entry like the following:

    osntns: Enabling dead connection detection (1 min)

The timer interval listed should match the value of SQLNET.EXPIRE_TIME.

For Oracle8i onwards, you should see the following:

    nstimini: entry
    nstimig: entry
    nstimig: normal exit
    nstimini: initializing NLTM in asynchronous mode
    nstimini: normal exit
    nstimstart: entry
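
Searching a large trace by eye is tedious, so the two checks above can be scripted. The Python sketch below looks for the pre-Oracle8i message first and falls back to the timer initialization entries; the function name is hypothetical, and treating the pair "nstimini: entry" plus "nstimstart: entry" as sufficient evidence is an assumption of this sketch.

```python
import re

PRE_8I = re.compile(r"Enabling dead connection detection\s*\((\d+) min\)")

def dcd_enabled(trace_text):
    """Return (enabled, interval_minutes or None) for the text of a server trace."""
    match = PRE_8I.search(trace_text)
    if match:
        return True, int(match.group(1))
    # Oracle8i onwards: the note shows timer initialization followed by nstimstart.
    if "nstimini: entry" in trace_text and "nstimstart: entry" in trace_text:
        return True, None
    return False, None

print(dcd_enabled("osntns: Enabling dead connection detection (1 min)"))  # (True, 1)
```

The interval is only recoverable from the pre-Oracle8i message; for later releases the sketch reports that the timer was started without knowing its length.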

Is DCD Working?

---------------

Search the server trace file for DCD probe packets. They will appear in
the form of empty data packets, as follows:

    nstimexp: entry
    nstimexp: timer expired at 05-OCT-95 12:15:05
    nsdo: entry
    nsdo: cid=0, opcode=67, *bl=0, *what=1, uflgs=0x2, cflgs=0x3
    nsdo: nsctx: state=8, flg=0x621c, mvd=0
    nsdo: gtn=93, gtc=93, ptn=10, ptc=2048
    nsdoacts: entry
    nsdofls: entry
    nsdofls: DATA flags: 0x0
    nsdofls: sending NSPTDA packet
    nspsend: entry
    nspsend: plen=10, type=6
    nttwr: entry
    nttwr: socket 4 had bytes written=10
    nttwr: exit
    nspsend: 10 bytes to transport
    nspsend:packet dump
    nspsend:00 0A 00 00 06 00 00 00  |........|
    nspsend:00 00 00 00 00 00 00 00  |........|
    nspsend: normal exit
    nsdofls: exit (0)
    nsdoacts: flushing transport
    nttctl: entry
    nsdoacts: normal exit
    nsdo: normal exit
    nstimexp: normal exit

The entry:

    nspsend:00 0A 00 00 06 00 00 00  |........|
    nspsend:00 00 00 00 00 00 00 00  |........|

represents the probe packet. Note that DCD packets are 10 bytes long
when they are issued to the protocol stack. Once the header and trailer
bytes of the underlying protocols have been added, the packet could be
approximately 70 bytes long.
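
The dump can be related to the trace fields (plen=10, type=6, DATA flags 0x0) with a small decoder. The Python sketch below is an illustration only: the function name is hypothetical, and the byte offsets (length in bytes 0-1, packet type in byte 4, data flags in bytes 8-9) are an assumption about the packet layout chosen to agree with the values the trace reports.

```python
import struct

def decode_probe(packet):
    """Decode the fields of a DCD probe that the trace also reports."""
    (length,) = struct.unpack(">H", packet[0:2])       # plen in the trace
    packet_type = packet[4]                            # type=6, the NSPTDA data packet
    (data_flags,) = struct.unpack(">H", packet[8:10])  # "DATA flags: 0x0"
    payload = packet[10:length]                        # empty for a probe
    return {"length": length, "type": packet_type,
            "data_flags": data_flags, "payload": payload}

# First 10 bytes of the packet dump above.
probe = bytes.fromhex("00 0A 00 00 06 00 00 00 00 00")
print(decode_probe(probe))
```

The empty payload is what makes this an "empty data packet": the header announces a data packet of exactly ten bytes, all of which are header.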

If DCD is enabled, you will see these probe packets written to the
trace file when the timer expires. If the server is a UNIX system, it
might be useful to establish a connection and tail the trace file:

    tail -f svr_.trc

The time elapsed between successive probe packets written to the server
trace should match the SQLNET.EXPIRE_TIME value.
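
Instead of watching the clock while tailing the file, the gaps between timer expirations can be computed from the trace. The Python sketch below extracts the "timer expired at" timestamps and returns the intervals in minutes; the function name is hypothetical, the second timestamp in the sample is invented for the example, and two-digit years are read as 19xx to match the 1995 sample.

```python
import re
from datetime import datetime

MONTHS = {name: number for number, name in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1)}
EXPIRY = re.compile(
    r"nstimexp: timer expired at (\d{2})-([A-Z]{3})-(\d{2})\s+(\d{2}):(\d{2}):(\d{2})")

def probe_intervals_minutes(trace_text):
    """Minutes between consecutive DCD timer expirations found in a trace."""
    times = []
    for day, mon, year, hh, mm, ss in EXPIRY.findall(trace_text):
        # Assumption: two-digit years belong to the 1900s, as in the sample trace.
        times.append(datetime(1900 + int(year), MONTHS[mon], int(day),
                              int(hh), int(mm), int(ss)))
    return [(later - earlier).total_seconds() / 60
            for earlier, later in zip(times, times[1:])]

sample_trace = (
    "nstimexp: timer expired at 05-OCT-95 12:15:05\n"
    "nstimexp: timer expired at 05-OCT-95 12:16:05\n"  # hypothetical second expiry
)
print(probe_intervals_minutes(sample_trace))  # [1.0]
```

Each value in the result should be close to the configured SQLNET.EXPIRE_TIME.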

Note: from version 9.2.0.4.0 onwards, DCD probe packets are no longer
traced in SQL*Net trace files. However, DCD packets can still be
observed using other forms of tracing, such as network sniffer tracing.

KNOWN PROBLEMS OR LIMITATIONS

-----------------------------

- Of the few reported problems, perhaps the most significant is DCD's
  poor performance on Windows NT. Dead connections are cleaned up only
  when the server is rebooted and the database is restarted. Exactly
  how well DCD works on NT depends on the client's protocol
  implementation. SQL*Net v2.3 has improved the performance over
  earlier releases.
  This has been logged as port-specific Bug#303578.

- On SCO Unix, a problem was reported in which server processes spin,
  consuming large amounts of CPU, once the DCD timer expires. The
  problem is due to improper signal handling and can be eliminated by
  disabling DCD.
  This is port-specific Bug#293264.

- Orphaned resources are not released if only the client application is
  terminated. Only after the client PC has been rebooted does DCD
  release these resources. For example, if a Windows application is
  killed yet Windows remains running, the probe packet may be received
  and discarded as if the connection were still active. As it currently
  stands, it appears that DCD detects dead client machines, but not
  dead client processes.
  This is logged as generic Bug#280848.

- The SQL*Net V2 implementation on MVS does not use the generic DCD
  mechanism, and therefore the SQLNET.EXPIRE_TIME parameter does not
  apply. The KEEPALIVE function of IBM's TCP/IP is used instead. This
  was implemented prior to development of DCD.
  This is documented in port-specific Bug#301318.

- DCD relies heavily on issuing probe packets during any phase of the
  connection. This is not possible with some protocols that run
  half-duplex. Hence, DCD is not enabled on protocols like APPC/LU6.2.
  This is not a bug, but is rather the intended design.

- Local connections using BEQ protocol adapters are not supported with
  DCD. Local connections using the IPC protocol adapters are supported
  with DCD.

- Bug#1388806: On Windows NT, DCD fails after 16 connections.

A FINAL NOTE...

--------------

On most operating systems (including more recent versions of Windows),
if a process exits abnormally or is killed by an administrator, the OS
still gracefully cleans up the resources associated with that process,
including its network connections, and tells the server on the other
end that it is closing the connection. DCD is still useful when there
are problems with the physical network (e.g. the Ethernet cable falls
out of the machine) or when the OS kernel panics and crashes (e.g. a
blue screen of death) before it can close the network connections. DCD
may have another side benefit with certain load-balancing hardware that
prematurely aborts connections it thinks have been idle too long: the
probe packet sent to the client periodically acts as dummy traffic on
the connection.

Under no circumstances should you rely 100% on Dead Connection
Detection. It was developed to handle clients that have abnormally
exited. Clients should always exit their applications gracefully. It is
the responsibility of the application developer to make this possible.
DCD is intended only to clean up after abnormal events.

DCD is much more resource-intensive than similar mechanisms at the
protocol level, so depending on DCD to clean up all dead processes
will put an undue load on the server.

Clearly it is advantageous to exit applications cleanly in the first
place.
