Writing Solaris Device Drivers: Managing Events and Queueing Tasks

Managing Events and Queueing Tasks

Drivers use events to inform the system of state changes, and task queues to manage resource dependencies between tasks. This chapter describes how drivers generate and log events, and how drivers create and use task queues.

Managing Events

A system often needs to respond to a condition change such as a user action or system request. For example, a device might issue a warning when a component begins to overheat, or might start a movie player when a DVD is inserted into a drive. Device drivers can use a special message called an event to inform the system that a change in state has taken place.

Introduction to Events

An event is a message that a device driver sends to interested entities to indicate that a change of state has taken place. Events are implemented in the Solaris OS as user-defined, name-value pair structures that are managed using the nvlist* functions. (See the nvlist_alloc(9F) man page.) Events are organized by vendor, class, and subclass. For example, you could define a class for monitoring environmental conditions. An environmental class could have subclasses to indicate changes in temperature, fan status, and power.

When a change in state occurs, the device notifies the driver. The driver then uses the ddi_log_sysevent(9F) function to log this event in a queue called sysevent. The sysevent queue passes events to the user level for handling by either the syseventd daemon or syseventconfd daemon. These daemons send notifications to any applications that have subscribed for notification of the specified event.

Designers of user-level applications can deal with events in either of two ways:

  • An application can use the routines in libsysevent(3LIB) to subscribe with the syseventd daemon for notification when a specific event occurs.

  • A developer can write a separate user-level application to respond to an event. This type of application needs to be registered with syseventadm(1M). When syseventconfd encounters the specified event, the application is run and deals with the event accordingly.
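The first, subscription-based method can be sketched with the libsysevent(3LIB) routines. The following is a hedged sketch rather than a complete application: error handling is minimal, and the class and subclass strings reuse the driver-defined values from Example 5-1 later in this chapter.

```c
#include <libsysevent.h>
#include <stdio.h>
#include <unistd.h>

/* Handler invoked for each event that matches the subscription. */
static void
event_handler(sysevent_t *ev)
{
	(void) printf("event: class=%s subclass=%s\n",
	    sysevent_get_class_name(ev), sysevent_get_subclass_name(ev));
}

int
main(void)
{
	sysevent_handle_t *shp;
	const char *subclasses[] = { "JGJG_alert" };

	/* Bind a subscriber handle to this process's handler ... */
	if ((shp = sysevent_bind_handle(event_handler)) == NULL)
		return (1);

	/* ... and subscribe to the driver-defined class and subclass. */
	if (sysevent_subscribe_event(shp, "JGJG_event",
	    (const char **)subclasses, 1) != 0) {
		sysevent_unbind_handle(shp);
		return (1);
	}

	(void) pause();	/* handler runs in a libsysevent thread */
	return (0);
}
```

The handler is called asynchronously by the library; a real subscriber would also call sysevent_unsubscribe_event() and sysevent_unbind_handle() on shutdown.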

This process is illustrated in the following figure.

Figure 5–1 Event Plumbing

[Diagram shows how events are logged into the sysevent queue for notification of user-level applications.]

Using ddi_log_sysevent() to Log Events

Device drivers use the ddi_log_sysevent(9F) interface to generate and log events with the system.

ddi_log_sysevent() Syntax

ddi_log_sysevent() uses the following syntax:

int ddi_log_sysevent(dev_info_t *dip, char *vendor, char *class,
    char *subclass, nvlist_t *attr-list, sysevent_id_t *eidp, int sleep-flag);

where:

dip

A pointer to the dev_info node for this driver.

vendor

A pointer to a string that defines the driver's vendor. Third-party drivers should use their company's stock symbol or a similarly enduring identifier. Sun-supplied drivers use DDI_VENDOR_SUNW.

class

A pointer to a string defining the event's class. class is a driver-specific value. An example of a class might be a string that represents a set of environmental conditions that affect a device. This value must be understood by the event consumer.

subclass

A driver-specific string that represents a subset of the class argument. For example, within a class that represents environmental conditions, an event subclass might refer to the device's temperature. This value must be intelligible to the event consumer.

attr-list

A pointer to an nvlist_t structure that lists name-value attributes associated with the event. Name-value attributes are driver-defined and can refer to a specific attribute or condition of the device.

For example, consider a device that reads both CD-ROMs and DVDs. That device could have an attribute with the name disc_type and the value equal to either cd_rom or dvd.

As with class and subclass, an event consumer must be able to interpret the name-value pairs.

For more information on name-value pairs and the nvlist_t structure, see Defining Event Attributes, as well as the nvlist_alloc(9F) man page.

If the event has no attributes, then this argument should be set to NULL.

eidp

The address of a sysevent_id_t structure. The sysevent_id_t structure is used to provide a unique identification for the event. ddi_log_sysevent(9F) returns this structure with a system-provided event sequence number and time stamp. See the ddi_log_sysevent(9F) man page for more information on the sysevent_id_t structure.

sleep-flag

A flag that indicates how the caller should respond when resources are not available. If sleep-flag is set to DDI_SLEEP, the call might block until resources become available, but is guaranteed to succeed. If sleep-flag is set to DDI_NOSLEEP, the call is guaranteed not to block, but might fail if resources are not currently available.

Sample Code for Logging Events

A device driver performs the following tasks to log events:

  1. Allocate memory for the attribute list with nvlist_alloc(9F).

  2. Add name-value pairs to the attribute list.

  3. Log the event in the sysevent queue with ddi_log_sysevent(9F).

  4. Free the nvlist with nvlist_free(9F).

The following example demonstrates how to use ddi_log_sysevent().


Example 5–1 Calling ddi_log_sysevent()

char *vendor_name = "DDI_VENDOR_JGJG";
char *my_class = "JGJG_event";
char *my_subclass = "JGJG_alert";
nvlist_t *nvl;
...
nvlist_alloc(&nvl, nvflag, kmflag);
...
(void) nvlist_add_byte_array(nvl, propname, (uchar_t *)propval, proplen + 1);
...
if (ddi_log_sysevent(dip, vendor_name, my_class,
    my_subclass, nvl, NULL, DDI_SLEEP) != DDI_SUCCESS)
        cmn_err(CE_WARN, "error logging system event");
nvlist_free(nvl);

Defining Event Attributes

Event attributes are defined as a list of name-value pairs. The Solaris DDI provides routines and structures for storing information in name-value pairs. Name-value pairs are retained in an nvlist_t structure, which is opaque to the driver. The value for a name-value pair can be a Boolean, an int, a byte, a string, an nvlist, or an array of these data types. An int can be defined as 16 bits, 32 bits, or 64 bits and can be signed or unsigned.

The steps in creating a list of name-value pairs are as follows.

  1. Create an nvlist_t structure with nvlist_alloc(9F).

    The nvlist_alloc() interface takes three arguments:

    • nvlp – Pointer to a pointer to an nvlist_t structure

    • nvflag – Flag to indicate the uniqueness of the names of the pairs. If this flag is set to NV_UNIQUE_NAME_TYPE, any existing pair that matches the name and type of a new pair is removed from the list. If the flag is set to NV_UNIQUE_NAME, then any existing pair with a duplicate name is removed, regardless of its type. Specifying NV_UNIQUE_NAME_TYPE allows a list to contain two or more pairs with the same name as long as their types are different, whereas with NV_UNIQUE_NAME only one instance of a pair name can be in the list. If the flag is not set, then no uniqueness checking is done and the consumer of the list is responsible for dealing with duplicates.

    • kmflag – Flag to indicate the allocation policy for kernel memory. If this argument is set to KM_SLEEP, then the driver blocks until the requested memory is available for allocation. KM_SLEEP allocations might sleep but are guaranteed to succeed. KM_NOSLEEP allocations are guaranteed not to sleep but might return NULL if no memory is currently available.

  2. Populate the nvlist with name-value pairs. For example, to add a string, use nvlist_add_string(9F). To add an array of 32-bit integers, use nvlist_add_int32_array(9F). The nvlist_add_boolean(9F) man page contains a complete list of interfaces for adding pairs.

To deallocate a list, use nvlist_free(9F).

The following code sample illustrates the creation of a name-value list.


Example 5–2 Creating and Populating a Name-Value Pair List

nvlist_t *
create_nvlist()
{
    int err;
    char *str = "child";
    int32_t ints[] = {0, 1, 2};
    nvlist_t *nvl;

    err = nvlist_alloc(&nvl, NV_UNIQUE_NAME, 0); /* allocate list */
    if (err)
        return (NULL);
    if ((nvlist_add_string(nvl, "name", str) != 0) ||
        (nvlist_add_int32_array(nvl, "prop", ints, 3) != 0)) {
        nvlist_free(nvl);
        return (NULL);
    }
    return (nvl);
}

Drivers can retrieve the elements of an nvlist by using a lookup function for that type, such as nvlist_lookup_int32_array(9F), which takes as an argument the name of the pair to be searched for.
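For example, a consumer of the list built in Example 5-2 might retrieve the array stored under the name "prop" as follows (a sketch; nvl is assumed to be a populated nvlist_t, and error handling is abbreviated):

```c
int32_t *ints;
uint_t nelem;

/*
 * On success, ints points at nelem 32-bit values that are still
 * owned by the nvlist; the caller must not free them directly.
 */
if (nvlist_lookup_int32_array(nvl, "prop", &ints, &nelem) == 0) {
        /* use ints[0] .. ints[nelem - 1] */
}
```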


Note –

These interfaces work only if either NV_UNIQUE_NAME or NV_UNIQUE_NAME_TYPE is specified when nvlist_alloc(9F) is called. If neither flag was specified, ENOTSUP is returned, because the list might contain multiple pairs with the same name.


A list of name-value list pairs can be placed in contiguous memory. This approach is useful for passing the list to an entity that has subscribed for notification. The first step is to get the size of the memory block that is needed for the list with nvlist_size(9F). The next step is to pack the list into the buffer with nvlist_pack(9F). The consumer receiving the buffer's content can unpack the buffer with nvlist_unpack(9F).
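The size/pack/unpack sequence can be sketched as follows (assuming nvl is a populated list, a kernel context where KM_SLEEP is permissible, and NV_ENCODE_NATIVE as the encoding):

```c
static void
send_nvlist(nvlist_t *nvl)
{
    char *buf = NULL;
    size_t buflen;
    nvlist_t *copy;

    /* First, get the size of the buffer needed for the packed list. */
    if (nvlist_size(nvl, &buflen, NV_ENCODE_NATIVE) != 0)
        return;

    /* Pack the list; with buf == NULL, nvlist_pack() allocates the buffer. */
    if (nvlist_pack(nvl, &buf, &buflen, NV_ENCODE_NATIVE, KM_SLEEP) != 0)
        return;

    /* ... deliver buf and buflen to the subscribed consumer ... */

    /* The consumer reconstructs an equivalent list from the buffer. */
    if (nvlist_unpack(buf, buflen, &copy, KM_SLEEP) == 0)
        nvlist_free(copy);
}
```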

The functions for manipulating name-value pairs are available to both user-level and kernel-level developers. You can find identical man pages for these functions in both man pages section 3: Library Interfaces and Headers and in man pages section 9: DDI and DKI Kernel Functions. For a list of functions that operate on name-value pairs, see the following table.

Table 5–1 Functions for Using Name-Value Pairs

Man Page 

Purpose / Functions 

nvlist_add_boolean(9F)

Add name-value pairs to the list. Functions include: 

nvlist_add_boolean(), nvlist_add_boolean_value(), nvlist_add_byte(), nvlist_add_int8(), nvlist_add_uint8(), nvlist_add_int16(), nvlist_add_uint16(), nvlist_add_int32(), nvlist_add_uint32(), nvlist_add_int64(), nvlist_add_uint64(), nvlist_add_string(), nvlist_add_nvlist(), nvlist_add_nvpair(), nvlist_add_boolean_array(), nvlist_add_int8_array(), nvlist_add_uint8_array(), nvlist_add_nvlist_array(), nvlist_add_byte_array(), nvlist_add_int16_array(), nvlist_add_uint16_array(), nvlist_add_int32_array(), nvlist_add_uint32_array(), nvlist_add_int64_array(), nvlist_add_uint64_array(), nvlist_add_string_array()

nvlist_alloc(9F)

Manipulate the name-value list buffer. Functions include: 

nvlist_alloc(), nvlist_free(), nvlist_size(), nvlist_pack(), nvlist_unpack(), nvlist_dup(), nvlist_merge()

nvlist_lookup_boolean(9F)

Search for name-value pairs. Functions include: 

nvlist_lookup_boolean(), nvlist_lookup_boolean_value(), nvlist_lookup_byte(), nvlist_lookup_int8(), nvlist_lookup_int16(), nvlist_lookup_int32(), nvlist_lookup_int64(), nvlist_lookup_uint8(), nvlist_lookup_uint16(), nvlist_lookup_uint32(), nvlist_lookup_uint64(), nvlist_lookup_string(), nvlist_lookup_nvlist(), nvlist_lookup_boolean_array(), nvlist_lookup_byte_array(), nvlist_lookup_int8_array(), nvlist_lookup_int16_array(), nvlist_lookup_int32_array(), nvlist_lookup_int64_array(), nvlist_lookup_uint8_array(), nvlist_lookup_uint16_array(), nvlist_lookup_uint32_array(), nvlist_lookup_uint64_array(), nvlist_lookup_string_array(), nvlist_lookup_nvlist_array(), nvlist_lookup_pairs()

nvlist_next_nvpair(9F)

Get name-value pair data. Functions include: 

nvlist_next_nvpair(), nvpair_name(), nvpair_type()

nvlist_remove(9F)

Remove name-value pairs. Functions include: 

nvlist_remove(), nvlist_remove_all()

Queueing Tasks

This section discusses how to use task queues to postpone processing of some tasks and delegate their execution to another kernel thread.

Introduction to Task Queues

A common operation in kernel programming is to schedule a task to be performed at a later time, by a different thread. The following examples give some reasons that you might want a different thread to perform a task at a later time:

  • Your current code path is time critical. The additional task you want to perform is not time critical.

  • The additional task might require grabbing a lock that another thread is currently holding.

  • You cannot block in your current context. The additional task might need to block, for example to wait for memory.

  • A condition is preventing your code path from completing, but your current code path cannot sleep or fail. You need to queue the current task to execute after the condition disappears.

  • You need to launch multiple tasks in parallel.

In each of these cases, a task is executed in a different context. A different context is usually a different kernel thread with a different set of locks held and possibly a different priority. Task queues provide a generic kernel API for scheduling asynchronous tasks.

A task queue is a list of tasks with one or more threads to service the list. If a task queue has a single service thread, all tasks are guaranteed to execute in the order in which they are added to the list. If a task queue has more than one service thread, the order in which the tasks will execute is not known.


Note –

If the task queue has more than one service thread, make sure that the execution of one task does not depend on the execution of any other task. Dependencies between tasks can cause a deadlock to occur.


Task Queue Interfaces

The following DDI interfaces manage task queues. These interfaces are defined in the usr/src/uts/common/sys/sunddi.h header file. See the taskq(9F) man page for more information about these interfaces.

ddi_taskq_t

Opaque handle 

TASKQ_DEFAULTPRI

System default priority 

DDI_SLEEP

Can block for memory 

DDI_NOSLEEP

Cannot block for memory 

ddi_taskq_create()

Create a task queue 

ddi_taskq_destroy()

Destroy a task queue 

ddi_taskq_dispatch()

Add a task to a task queue 

ddi_taskq_wait()

Wait for pending tasks to complete 

ddi_taskq_suspend()

Suspend a task queue 

taskq_suspended()

Check whether a task queue is suspended 

ddi_taskq_resume()

Resume a suspended task queue 

Using Task Queues

The typical usage in drivers is to create task queues in the attach(9E) entry point. Most ddi_taskq_dispatch() invocations are from interrupt context.
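That typical pattern can be sketched as follows. This is hypothetical driver code: the xx_* names, the soft-state pointer sp, and the deferred-work function are invented for illustration.

```c
/* Deferred work runs in task queue context, where blocking is allowed. */
static void
xx_slow_work(void *arg)
{
        /* process completed I/O, allocate memory, take slow locks, ... */
}

/* In attach(9E): one service thread, so tasks run in dispatch order. */
sp->xx_taskq = ddi_taskq_create(dip, "xx_taskq", 1, TASKQ_DEFAULTPRI, 0);
if (sp->xx_taskq == NULL)
        return (DDI_FAILURE);

/*
 * In the interrupt handler: defer the slow work. DDI_NOSLEEP is
 * required because interrupt context must not block.
 */
(void) ddi_taskq_dispatch(sp->xx_taskq, xx_slow_work, sp, DDI_NOSLEEP);

/* In detach(9E): wait for pending tasks, then destroy the queue. */
ddi_taskq_wait(sp->xx_taskq);
ddi_taskq_destroy(sp->xx_taskq);
```

Checking the return value of ddi_taskq_dispatch() matters with DDI_NOSLEEP, because the dispatch can fail when memory is unavailable.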

To study task queues used in Solaris drivers, go to http://www.opensolaris.org/. In the left margin menu, click Source Browser. In the Symbol field of the search area, enter ddi_taskq_create. In the Project list, select onnv. Click the Search button. In your search results you should see the USB generic serial driver (usbser.c), the 1394 mass storage HBA FireWire driver (scsa1394/hba.c), and the SCSI HBA driver for Dell PERC 3DC/4SC/4DC/4Di RAID devices (amr.c).

Click the file name amr.c. The ddi_taskq_create() function is called in the amr_attach() entry point. The ddi_taskq_destroy() function is called in the amr_detach() entry point and also in the error handling section of the amr_attach() entry point. The ddi_taskq_dispatch() function is called in the amr_done() function, which is called in the amr_intr() function. The amr_intr() function is an interrupt-handling function that is an argument to the ddi_add_intr(9F) function in the amr_attach() entry point.

Observing Task Queues

This section describes two techniques that you can use to monitor the system resources that are consumed by a task queue. Task queues export kernel statistics on the use of system time by task queue threads. Task queues also provide DTrace SDT probes that fire when a task queue starts and finishes execution of a task.

Task Queue Kernel Statistics Counters

Every task queue has an associated set of kstat counters. Examine the output of the following kstat(1M) command:


$ kstat -c taskq
module: unix                            instance: 0
name:   ata_nexus_enum_tq               class:    taskq
        crtime                          53.877907833
        executed                        0
        maxtasks                        0
        nactive                         1
        nalloc                          0
        priority                        60
        snaptime                        258059.249256749
        tasks                           0
        threads                         1
        totaltime                       0

module: unix                            instance: 0
name:   callout_taskq                   class:    taskq
        crtime                          0
        executed                        13956358
        maxtasks                        4
        nactive                         4
        nalloc                          0
        priority                        99
        snaptime                        258059.24981709
        tasks                           13956358
        threads                         2
        totaltime                       120247890619

The kstat output shown above includes the following information:

  • The name of the task queue and its instance number

  • The number of scheduled (tasks) and executed (executed) tasks

  • The number of kernel threads processing the task queue (threads) and their priority (priority)

  • The total time (in nanoseconds) spent processing all the tasks (totaltime)

The following example shows how you can use the kstat command to observe how a counter (number of scheduled tasks) increases over time:


$ kstat -p unix:0:callout_taskq:tasks 1 5
unix:0:callout_taskq:tasks 13994642

unix:0:callout_taskq:tasks 13994711

unix:0:callout_taskq:tasks 13994784

unix:0:callout_taskq:tasks 13994855

unix:0:callout_taskq:tasks 13994926

Task Queue DTrace SDT Probes

Task queues provide several useful SDT probes. All the probes described in this section have the following two arguments:

  • The task queue pointer returned by ddi_taskq_create()

  • The pointer to the taskq_ent_t structure. Use this pointer in your D script to extract the function and the argument.

You can use these probes to collect precise timing information about individual task queues and individual tasks being executed through them. For example, the following script prints, every 10 seconds, the functions that were scheduled through task queues:


#!/usr/sbin/dtrace -qs

sdt:genunix::taskq-enqueue
{
    this->tq = (taskq_t *)arg0;
    this->tqe = (taskq_ent_t *)arg1;
    @[this->tq->tq_name,
      this->tq->tq_instance,
      this->tqe->tqent_func] = count();
}

tick-10s
{
    printa("%s(%d): %a called %@d times\n", @);
    trunc(@);
}

On a particular machine, the above D script produced the following output:


callout_taskq(1): genunix`callout_execute called 51 times
callout_taskq(0): genunix`callout_execute called 701 times
kmem_taskq(0): genunix`kmem_update_timeout called 1 times
kmem_taskq(0): genunix`kmem_hash_rescale called 4 times
callout_taskq(1): genunix`callout_execute called 40 times
USB_hid_81_pipehndl_tq_1(14): usba`hcdi_cb_thread called 256 times
callout_taskq(0): genunix`callout_execute called 702 times
kmem_taskq(0): genunix`kmem_update_timeout called 1 times
kmem_taskq(0): genunix`kmem_hash_rescale called 4 times
callout_taskq(1): genunix`callout_execute called 28 times
USB_hid_81_pipehndl_tq_1(14): usba`hcdi_cb_thread called 228 times
callout_taskq(0): genunix`callout_execute called 706 times
callout_taskq(1): genunix`callout_execute called 24 times
USB_hid_81_pipehndl_tq_1(14): usba`hcdi_cb_thread called 141 times
callout_taskq(0): genunix`callout_execute called 708 times
 