在程序中怎样使用DirectX Audio Plug-ins?
What is IAsyncReader for and how can I use it? What is CPullPin?
Most streams in DirectShow use a requested push model. That is, you tell the upstream pin where it should start and stop, and it then delivers you that section of the stream in a sequence of samples delivered to your Receive method on its thread. This works fine for most playback scenarios, but there are some cases where it is not very efficient. One such case is the AVI file format: this is not really designed well for sequential playback, since the chunks you need are often in the wrong place. To play this back efficiently, you need to be able to read more at random in the file, rather than just parsing chunks as they arrive as you would with MPEG for example.
In order to extract the file reading parts of both the MPEG parser and the AVI parser into a common filter (so that for example it could be replaced by a URL filter) it was necessary to have an interface at this point that was efficient in both cases. This is the IAsyncReader interface. Using this interface, each sample is requested separately with a random access style. You can either have the requests fulfilled synchronously on your own thread (SyncRead) or queue them and collect the completed requests later (Request and Wait).
CPullPin provides the client code for this interface for a downstream pin, and really is designed to let the input pin look the same whether it uses IAsyncReader or the normal push-model IMemInputPin::Receive. If you connect to an output pin that uses IMemInputPin::Receive, the data will be sent to your Receive method. If you connect to an output pin that only supports IAsyncReader, then CPullPin provides the 'glue' that pulls data from the output pin and delivers it to your Receive method. It doesn't really get any of the benefit of being able to read data from different positions in the stream. Accordingly you will not be surprised to learn that the mpeg parser uses CPullPin (in sync mode) whereas the AVI file parser uses the IAsyncReader interface in async mode, but using its own code to access discontiguous chunks independently.
In sync mode, the async reader just reads the file. When you call IAsyncReader::SyncReadAligned, it does an unbuffered ReadFile call and then returns. CPullPin in this mode just has a loop that gets a buffer, calls the sync-read method and then calls the (overridden) Receive method to get it to your pin.
In async mode you call Request to queue a request, which is passed to a worker thread within the async reader. At some point, you can then call WaitForNext to get a completed buffer back from the async reader. Win95 does not support overlapped i/o, and hence I don't believe the file reader uses overlapped i/o in the Windows NT case either even when it is available.
CPullPin in async mode has a loop that queues a request and waits for a completed request, but always keeping one request queued. So in this case, the only difference between the two modes is that in async mode, the next request will be sent to the disk almost immediately, whereas in sync mode, the next request is not queued to the disk until your Receive method has completed (in most of the standard parsers, this is very quick).
So in a high bandwidth case where you need to queue the next request before completing the Receive processing, you may want to consider async mode, but in other cases, you are ok in sync mode. If you want to receive data from many discontiguous parts of the file, you probably want to discard CPullPin entirely and write your own access code, but for most streaming cases, sync mode with CPullPin gives you what you want.
It's also worth noting that the DirectShow MPEG splitters use IAsyncReader methods to access parts of the file at pin-connection time: IMemInputPin would not provide data until the graph was running (see discussion below on live material)
What's the difference between stopped, paused and running states in DirectShow?
In a DirectShow graph, there are three possible states: stopped, paused and running. If a graph is stopped, filters hold a minimum set of resources and do not process data. If the graph is running, filters hold a maximal set of resources and data is processed and rendered.
If a graph is paused then data is processed but not rendered. Source filters will start (or continue) to push data into the graph and transform filters will process the data but the renderer will not draw or write the data. The transform filter will eventually block since its buffers are not being consumed by the renderer. Thus a graph can be cued for rapid starting simply by leaving it in paused mode: once some data arrives at the renderer, the rest of the graph will be blocked by the normal flow control mechanisms (GetBuffer or Receive will block) until the graph is run.
Media types that can be rendered statically, such as video (where you can show a frozen frame) but not, for example, audio, are typically rendered in paused state. Repainting the video window can simply be achieved by transitioning the graph to paused mode and back.
How many threads are there in a typical filter graph?
DirectShow filter graphs typically have one thread per stream segment -- that is, one thread would be active on a source filter's output pin and would push data right through the transform filters and into the renderer. The transform function and delivery downstream would normally occur on this thread during the Receive call. A parser or other filter that has more than one output pin would typically have a new thread for each output pin (plus the source filter's thread on which it receives data). This can be done straightforwardly using a COutputQueue class in each output pin.
In addition, there will typically be a filter graph manager thread and the application's thread
What is g_Templates and why won't my application link without it?
The DirectShow base class library strmbase.lib provides a COM class factory implementation to simplify the development of filters as COM objects. This class factory uses the template implementation you provide: your filter code will supply the templates for objects in your DLL in the g_Templates array, with a count of entries in g_cTemplates. If you are building a filter, you will need to supply this array and count.
If, however, you are writing an application that uses the DirectShow base classes, you do not need to define this -- you can set g_cTemplates to 0 and then g_Templates will not be referenced.
If your DLL is a COM dll that uses another class factory implementation (perhaps your dll is a ATL control) then you will be exporting DllGetClassFactory from your DLL. This is implemented in a number of places, including the ATL and MFC libraries as well as DirectShow's library. Make sure that the correct library is found first otherwise even if you define g_cTemplates you will not be able to create any of your ATL objects.
The DLL registers OK but the wrong module is registered for my filters.
You need to make sure that your DLL starts at DllEntryPoint, defined in dllentry.cpp in the class library. Otherwise the class library's global variable g_hInst is not set to the correct module handle, and registration will go wrong.
How can I play back MPEG streams from live sources or the network?
You will need a source filter that introduces the mpeg data into the graph. You will probably also need an MPEG splitter (also called a parser) that works with live streams and you will also need a decompressor if you are using MPEG-2 (the MPEG filter supplied with DirectShow supports only MPEG-1).
The DirectShow-supplied splitters for MPEG-1 and 2 do two tasks in addition to separating out the individual elementary streams: they convert the embedded PTS times into DirectShow timestamps, and they create a media type including the sequence header. Both of these involved searching around in the file at pin-connection time, which is clearly not possible for live sources.
The best solution is to create a new splitter filter, based on the sample code in the DirectShow SDK (under mpegparse). Your source filter would supply data using the IMemInputPin transport rather than IAsyncReader (see the discussion of IAsyncReader above). The splitter filter would look for the first pts and first sequence header after any discontinuity, and would use these for the media type and as a base for timestamp conversion. A default media type would be needed at connection time, to be replaced by a dynamically detected media type when the first sequence header is detected.
How can I host DirectX Audio Plug-ins in my application?
The right way to do this, I believe, is to create a source and sink filter as part of your application and connect the plugin between them. When you call the source filter from your application, it delivers the data to the plugin (a DirectShow transform filter) and the processed data arrives at your sink filter when done, from where it goes back to your application.
Construct the two filters as C++ objects by calling new (rather than registering them with class ids and calling CoCreateInstance). That way, you can call public methods on the objects without needing to define a custom COM interface to get access to them. They still need to be COM objects of course (in particular, don't forget to initialise the refcount to 1 after creating them, and delete them by Release rather than delete).
The source filter is derived from CBaseFilter and has a single output pin derived from CBaseOutputPin. When the app wants to deliver data, you call GetDeliveryBuffer, fill with data, and then Deliver (both these methods are on the output pin.
The sink filter is also derived from CBaseFilter and has a single input pin based on CBaseInputPin. The data arrives at the Receive method of the input pin from where you can either call out to your app or wait for your app to collect it.
Note that the delivery of data from the plug-in to your sync filter is not necessarily tied to the delivery of data from your source into the plug-in. Most transforms (eg most of those written to the Sonic Foundry template) will do the transform synchronously: during your source filter's Deliver call, the data will be processed and delivered to the sink filter, but there is no guarantee that this is the case (or that, eg, the same size and count of buffers are used). A simple approach used by some hosts is to only support 'synchronous' plug-ins: call Deliver with the data, then call the sink filter to pick up the processed data and if it hasn't arrived, complain that the plug-in is not supported. I don't like this idea because it is unnecessarily restrictive on design of transform filters (eg you couldn't do mpeg decode like that). A better idea is to have an async model where the sink filter calls back to your app when the data arrives.
Other things to worry about are:
- Negotiation of size and count of buffers used (separate on input to and output from transform)
- Instantiation of filter graph manager and building the three-filter graph
- Finding the right transform by enumeration using IFilterMapper2 from CLSID_FilterMapper2
- Property pages using OleCreatePropertyFrame are modal, so you either mess about with separate threads for the property page or you have your own property frame to host the property pages.
- Storing the filter's property settings using IPersistStream and eg CreateStreamOnHGLOBAL.
- Timestamping the samples somehow in the source filter in case the transform uses this.