I have been working on DXVA2 hardware acceleration these past few days and could not find much material, so I translated two related Microsoft documents. I also plan to write up how to implement DXVA2 with ffmpeg in a third post. This is the second post. Original English article: https://msdn.microsoft.com/en-us/library/aa965245(v=vs.85).aspx
The first post is a translation of the Direct3D device manager topic: http://www.cnblogs.com/betterwgo/p/6124588.html
This topic describes how to support DirectX Video Acceleration (DXVA) 2.0 in a DirectShow decoder filter. Specifically, it describes the communication between the decoder and the video renderer. This topic does not describe how to implement DXVA decoding itself.
1. Prerequisites
This topic assumes that you are familiar with writing DirectShow filters. For more information, see the Writing DirectShow Filters topic in the DirectShow SDK documentation (https://msdn.microsoft.com/en-us/library/dd391013(v=vs.85).aspx). The code examples in this topic assume that the decoder is derived from the CTransformFilter class, declared as follows:
class CDecoder : public CTransformFilter
{
public:
    static CUnknown* WINAPI CreateInstance(IUnknown *pUnk, HRESULT *pHr);

    HRESULT CompleteConnect(PIN_DIRECTION direction, IPin *pPin);

    HRESULT InitAllocator(IMemAllocator **ppAlloc);
    HRESULT DecideBufferSize(IMemAllocator *pAlloc, ALLOCATOR_PROPERTIES *pProp);

    // TODO: The implementations of these methods depend on the specific decoder.
    HRESULT CheckInputType(const CMediaType *mtIn);
    HRESULT CheckTransform(const CMediaType *mtIn, const CMediaType *mtOut);
    HRESULT CTransformFilter::GetMediaType(int, CMediaType *);

private:
    CDecoder(HRESULT *pHr);
    ~CDecoder();

    CBasePin * GetPin(int n);

    HRESULT ConfigureDXVA2(IPin *pPin);
    HRESULT SetEVRForDXVA2(IPin *pPin);

    HRESULT FindDecoderConfiguration(
        /* [in] */  IDirectXVideoDecoderService *pDecoderService,
        /* [in] */  const GUID& guidDecoder,
        /* [out] */ DXVA2_ConfigPictureDecode *pSelectedConfig,
        /* [out] */ BOOL *pbFoundDXVA2Configuration
        );

private:
    IDirectXVideoDecoderService *m_pDecoderService;

    DXVA2_ConfigPictureDecode m_DecoderConfig;
    GUID                      m_DecoderGuid;
    HANDLE                    m_hDevice;

    FOURCC                    m_fccOutputFormat;
};
In this topic, the term decoder means the decoder filter, which takes in compressed video data and outputs uncompressed video data. The term decoder device means the hardware video accelerator implemented by the graphics driver.
A decoder filter must perform the following basic steps to support DXVA 2.0:
(1) Negotiate a media type (my understanding: map the media type of the source, for example the format that ffmpeg reports, to the corresponding DXVA2 media type).
(2) Find a matching DXVA decoder configuration.
(3) Notify the video renderer that the decoder is using DXVA decoding.
(4) Provide a custom allocator that allocates Direct3D surfaces.
3. Finding a Decoder Configuration
Original code from the article, showing how the decoder finds a DXVA decoder configuration:
HRESULT CDecoder::ConfigureDXVA2(IPin *pPin)
{
    UINT cDecoderGuids = 0;
    BOOL bFoundDXVA2Configuration = FALSE;
    GUID guidDecoder = GUID_NULL;

    DXVA2_ConfigPictureDecode config;
    ZeroMemory(&config, sizeof(config));

    // Variables that follow must be cleaned up at the end.
    IMFGetService               *pGetService = NULL;
    IDirect3DDeviceManager9     *pDeviceManager = NULL;
    IDirectXVideoDecoderService *pDecoderService = NULL;

    GUID   *pDecoderGuids = NULL; // size = cDecoderGuids
    HANDLE hDevice = INVALID_HANDLE_VALUE;

    // Query the pin for IMFGetService.
    HRESULT hr = pPin->QueryInterface(IID_PPV_ARGS(&pGetService));

    // Get the Direct3D device manager.
    if (SUCCEEDED(hr))
    {
        hr = pGetService->GetService(
            MR_VIDEO_ACCELERATION_SERVICE,
            IID_PPV_ARGS(&pDeviceManager)
            );
    }

    // Open a new device handle.
    if (SUCCEEDED(hr))
    {
        hr = pDeviceManager->OpenDeviceHandle(&hDevice);
    }

    // Get the video decoder service.
    if (SUCCEEDED(hr))
    {
        hr = pDeviceManager->GetVideoService(hDevice, IID_PPV_ARGS(&pDecoderService));
    }

    // Get the decoder GUIDs.
    if (SUCCEEDED(hr))
    {
        hr = pDecoderService->GetDecoderDeviceGuids(&cDecoderGuids, &pDecoderGuids);
    }

    if (SUCCEEDED(hr))
    {
        // Look for the decoder GUIDs we want.
        for (UINT iGuid = 0; iGuid < cDecoderGuids; iGuid++)
        {
            // Do we support this mode?
            if (!IsSupportedDecoderMode(pDecoderGuids[iGuid]))
            {
                continue;
            }

            // Find a configuration that we support.
            hr = FindDecoderConfiguration(pDecoderService, pDecoderGuids[iGuid],
                &config, &bFoundDXVA2Configuration);
            if (FAILED(hr))
            {
                break;
            }

            if (bFoundDXVA2Configuration)
            {
                // Found a good configuration. Save the GUID and exit the loop.
                guidDecoder = pDecoderGuids[iGuid];
                break;
            }
        }
    }

    if (!bFoundDXVA2Configuration)
    {
        hr = E_FAIL; // Unable to find a configuration.
    }

    if (SUCCEEDED(hr))
    {
        // Store the things we will need later.
        SafeRelease(&m_pDecoderService);
        m_pDecoderService = pDecoderService;
        m_pDecoderService->AddRef();

        m_DecoderConfig = config;
        m_DecoderGuid = guidDecoder;
        m_hDevice = hDevice;
    }

    if (FAILED(hr))
    {
        if (hDevice != INVALID_HANDLE_VALUE)
        {
            pDeviceManager->CloseDeviceHandle(hDevice);
        }
    }

    SafeRelease(&pGetService);
    SafeRelease(&pDeviceManager);
    SafeRelease(&pDecoderService);
    return hr;
}

HRESULT CDecoder::FindDecoderConfiguration(
    /* [in] */  IDirectXVideoDecoderService *pDecoderService,
    /* [in] */  const GUID& guidDecoder,
    /* [out] */ DXVA2_ConfigPictureDecode *pSelectedConfig,
    /* [out] */ BOOL *pbFoundDXVA2Configuration
    )
{
    HRESULT hr = S_OK;
    UINT cFormats = 0;
    UINT cConfigurations = 0;

    D3DFORMAT                 *pFormats = NULL; // size = cFormats
    DXVA2_ConfigPictureDecode *pConfig = NULL;  // size = cConfigurations

    // Find the valid render target formats for this decoder GUID.
    hr = pDecoderService->GetDecoderRenderTargets(guidDecoder, &cFormats, &pFormats);

    if (SUCCEEDED(hr))
    {
        // Look for a format that matches our output format.
        for (UINT iFormat = 0; iFormat < cFormats; iFormat++)
        {
            if (pFormats[iFormat] != (D3DFORMAT)m_fccOutputFormat)
            {
                continue;
            }

            // Fill in the video description. Set the width, height, format,
            // and frame rate.
            DXVA2_VideoDesc videoDesc = {0};
            FillInVideoDescription(&videoDesc); // Private helper function.
            videoDesc.Format = pFormats[iFormat];

            // Get the available configurations.
            hr = pDecoderService->GetDecoderConfigurations(
                guidDecoder,
                &videoDesc,
                NULL, // Reserved.
                &cConfigurations,
                &pConfig
                );
            if (FAILED(hr))
            {
                break;
            }

            // Find a supported configuration.
            for (UINT iConfig = 0; iConfig < cConfigurations; iConfig++)
            {
                if (IsSupportedDecoderConfig(pConfig[iConfig]))
                {
                    // This configuration is good.
                    *pbFoundDXVA2Configuration = TRUE;
                    *pSelectedConfig = pConfig[iConfig];
                    break;
                }
            }

            CoTaskMemFree(pConfig);
            break;

        } // End of formats loop.
    }

    CoTaskMemFree(pFormats);

    // Note: It is possible to return S_OK without finding a configuration.
    return hr;
}
Because this is a generic example, some of the logic is placed in helper functions that the decoder has to implement. The following code shows the declarations of the helper functions used above:
// Returns TRUE if the decoder supports a given decoding mode.
BOOL IsSupportedDecoderMode(const GUID& mode);

// Returns TRUE if the decoder supports a given decoding configuration.
BOOL IsSupportedDecoderConfig(const DXVA2_ConfigPictureDecode& config);

// Fills in a DXVA2_VideoDesc structure based on the input format.
void FillInVideoDescription(DXVA2_VideoDesc *pDesc);
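The article leaves these helpers to the decoder. Below is a minimal sketch under stated assumptions: an H.264 VLD decoder whose output format is NV12. FillInVideoDescription is written here as a CDecoder member so it can read the filter's fields; m_dwWidth, m_dwHeight and the 29.97 fps frame rate are hypothetical values, and only m_fccOutputFormat comes from the class declared earlier.

// Sketch only. The GUID below is the H.264 VLD (no film grain) decoder device
// GUID, declared in dxva2api.h as DXVA2_ModeH264_E; it is redefined locally
// here so the sketch is self-contained.
static const GUID DXVA2_ModeH264_VLD_NoFGT_Sketch =
{ 0x1b81be68, 0xa0c7, 0x11d3, { 0xb9, 0x84, 0x00, 0xc0, 0x4f, 0x2e, 0x73, 0xc5 } };

BOOL IsSupportedDecoderMode(const GUID& mode)
{
    // Accept only the H.264 VLD mode in this sketch.
    return (mode == DXVA2_ModeH264_VLD_NoFGT_Sketch);
}

BOOL IsSupportedDecoderConfig(const DXVA2_ConfigPictureDecode& config)
{
    // ConfigBitstreamRaw of 1 or 2 means the host sends raw bitstream data,
    // which is what a VLD-level decoder requires.
    return (config.ConfigBitstreamRaw == 1 || config.ConfigBitstreamRaw == 2);
}

void CDecoder::FillInVideoDescription(DXVA2_VideoDesc *pDesc)
{
    ZeroMemory(pDesc, sizeof(*pDesc));
    pDesc->SampleWidth  = m_dwWidth;                     // hypothetical member
    pDesc->SampleHeight = m_dwHeight;                    // hypothetical member
    pDesc->Format       = (D3DFORMAT)m_fccOutputFormat;  // e.g. MAKEFOURCC('N','V','1','2')
    pDesc->InputSampleFreq.Numerator   = 30000;          // example frame rate (29.97 fps)
    pDesc->InputSampleFreq.Denominator = 1001;
    pDesc->OutputFrameFreq = pDesc->InputSampleFreq;
}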
4. Notifying the Video Renderer
If the decoder finds a decoder configuration, the next step is to notify the video renderer that hardware-accelerated decoding will be used. You can do this inside the CompleteConnect method. This step must happen before the allocator is selected, because it affects how the allocator is selected.
1) Query the renderer's input pin for the IMFGetService interface.
2) Call IMFGetService::GetService to get a pointer to the IDirectXVideoMemoryConfiguration interface. The service GUID is MR_VIDEO_ACCELERATION_SERVICE.
3) Call IDirectXVideoMemoryConfiguration::GetAvailableSurfaceTypeByIndex in a loop, incrementing the dwTypeIndex variable from zero. Stop the loop when the method returns DXVA2_SurfaceType_DecoderRenderTarget in the pdwType parameter. This step ensures that the video renderer supports hardware-accelerated decoding. For the EVR filter, this step always succeeds.
4) If the previous step succeeded, call IDirectXVideoMemoryConfiguration::SetSurfaceType with the value DXVA2_SurfaceType_DecoderRenderTarget. Calling SetSurfaceType with this value puts the video renderer into DXVA mode. When the video renderer is in this mode, the decoder must provide its own allocator.
The following code shows how to notify the video renderer:
HRESULT CDecoder::SetEVRForDXVA2(IPin *pPin)
{
    HRESULT hr = S_OK;

    IMFGetService                    *pGetService = NULL;
    IDirectXVideoMemoryConfiguration *pVideoConfig = NULL;

    // Query the pin for IMFGetService.
    hr = pPin->QueryInterface(__uuidof(IMFGetService), (void**)&pGetService);

    // Get the IDirectXVideoMemoryConfiguration interface.
    if (SUCCEEDED(hr))
    {
        hr = pGetService->GetService(
            MR_VIDEO_ACCELERATION_SERVICE,
            IID_PPV_ARGS(&pVideoConfig)
            );
    }

    // Notify the EVR.
    if (SUCCEEDED(hr))
    {
        DXVA2_SurfaceType surfaceType;

        for (DWORD iTypeIndex = 0; ; iTypeIndex++)
        {
            hr = pVideoConfig->GetAvailableSurfaceTypeByIndex(iTypeIndex, &surfaceType);
            if (FAILED(hr))
            {
                break;
            }

            if (surfaceType == DXVA2_SurfaceType_DecoderRenderTarget)
            {
                hr = pVideoConfig->SetSurfaceType(DXVA2_SurfaceType_DecoderRenderTarget);
                break;
            }
        }
    }

    SafeRelease(&pGetService);
    SafeRelease(&pVideoConfig);
    return hr;
}
If the decoder finds a valid configuration and successfully notifies the video renderer, the decoder can use DXVA for decoding. The decoder must implement a custom allocator for its output pin, as described in the next section.
5. Allocating Uncompressed Buffers
In DXVA 2.0, the decoder is responsible for allocating the Direct3D surfaces that are used as uncompressed video buffers. The decoder must therefore implement a custom allocator that creates the surfaces. The media samples provided by this allocator hold pointers to the Direct3D surfaces. The EVR retrieves the pointer to the surface by calling IMFGetService::GetService on the media sample. The service identifier is MR_BUFFER_SERVICE.
To implement the custom allocator, perform the following steps:
1) Define a class for the media samples. This class derives from CMediaSample. Within this class, do the following:
a) Store a pointer to the Direct3D surface.
b) Implement the IMFGetService interface. In the GetService method, if the service GUID is MR_BUFFER_SERVICE, query the Direct3D surface for the requested interface; otherwise, GetService returns MF_E_UNSUPPORTED_SERVICE.
c) Override the CMediaSample::GetPointer method to return E_NOTIMPL.
2) Define a class for the allocator. The allocator can derive from the CBaseAllocator class. Within this class, do the following:
a) Override the CBaseAllocator::Alloc method. Inside this method, call IDirectXVideoAccelerationService::CreateSurface to create the surfaces. (The IDirectXVideoDecoderService interface inherits this method from IDirectXVideoAccelerationService.)
b) Override the CBaseAllocator::Free method to free the surfaces.
3) In your filter's output pin, override the CBaseOutputPin::InitAllocator method. Inside this method, create an instance of your custom allocator.
4) In your filter, implement the CTransformFilter::DecideBufferSize method. The pProperties parameter indicates the number of surfaces that the EVR requires. Add to this value the number of surfaces your decoder needs, and call IMemAllocator::SetProperties on the allocator. (A sketch of steps 3 and 4 follows the Free method below.)
The following code shows how to implement the media sample class:
class CDecoderSample : public CMediaSample, public IMFGetService
{
    friend class CDecoderAllocator;

public:
    CDecoderSample(CDecoderAllocator *pAlloc, HRESULT *phr)
        : CMediaSample(NAME("DecoderSample"), (CBaseAllocator*)pAlloc, phr, NULL, 0),
          m_pSurface(NULL),
          m_dwSurfaceId(0)
    {
    }

    // Note: CMediaSample does not derive from CUnknown, so we cannot use the
    // DECLARE_IUNKNOWN macro that is used by most of the filter classes.

    STDMETHODIMP QueryInterface(REFIID riid, void **ppv)
    {
        CheckPointer(ppv, E_POINTER);

        if (riid == IID_IMFGetService)
        {
            *ppv = static_cast<IMFGetService*>(this);
            AddRef();
            return S_OK;
        }
        else
        {
            return CMediaSample::QueryInterface(riid, ppv);
        }
    }

    STDMETHODIMP_(ULONG) AddRef()
    {
        return CMediaSample::AddRef();
    }

    STDMETHODIMP_(ULONG) Release()
    {
        // Return a temporary variable for thread safety.
        ULONG cRef = CMediaSample::Release();
        return cRef;
    }

    // IMFGetService::GetService
    STDMETHODIMP GetService(REFGUID guidService, REFIID riid, LPVOID *ppv)
    {
        if (guidService != MR_BUFFER_SERVICE)
        {
            return MF_E_UNSUPPORTED_SERVICE;
        }
        else if (m_pSurface == NULL)
        {
            return E_NOINTERFACE;
        }
        else
        {
            return m_pSurface->QueryInterface(riid, ppv);
        }
    }

    // Override GetPointer because this class does not manage a system memory buffer.
    // The EVR uses the MR_BUFFER_SERVICE service to get the Direct3D surface.
    STDMETHODIMP GetPointer(BYTE ** ppBuffer)
    {
        return E_NOTIMPL;
    }

private:
    // Sets the pointer to the Direct3D surface.
    void SetSurface(DWORD surfaceId, IDirect3DSurface9 *pSurf)
    {
        SafeRelease(&m_pSurface);

        m_pSurface = pSurf;
        if (m_pSurface)
        {
            m_pSurface->AddRef();
        }

        m_dwSurfaceId = surfaceId;
    }

    IDirect3DSurface9 *m_pSurface;
    DWORD              m_dwSurfaceId;
};
The following code shows how to implement the Alloc method on the allocator:
HRESULT CDecoderAllocator::Alloc()
{
    CAutoLock lock(this);

    HRESULT hr = S_OK;

    if (m_pDXVA2Service == NULL)
    {
        return E_UNEXPECTED;
    }

    hr = CBaseAllocator::Alloc();

    // If the requirements have not changed, do not reallocate.
    if (hr == S_FALSE)
    {
        return S_OK;
    }

    if (SUCCEEDED(hr))
    {
        // Free the old resources.
        Free();

        // Allocate a new array of pointers.
        m_ppRTSurfaceArray = new (std::nothrow) IDirect3DSurface9*[m_lCount];
        if (m_ppRTSurfaceArray == NULL)
        {
            hr = E_OUTOFMEMORY;
        }
        else
        {
            ZeroMemory(m_ppRTSurfaceArray, sizeof(IDirect3DSurface9*) * m_lCount);
        }
    }

    // Allocate the surfaces.
    if (SUCCEEDED(hr))
    {
        hr = m_pDXVA2Service->CreateSurface(
            m_dwWidth,
            m_dwHeight,
            m_lCount - 1,
            (D3DFORMAT)m_dwFormat,
            D3DPOOL_DEFAULT,
            0,
            DXVA2_VideoDecoderRenderTarget,
            m_ppRTSurfaceArray,
            NULL
            );
    }

    if (SUCCEEDED(hr))
    {
        for (m_lAllocated = 0; m_lAllocated < m_lCount; m_lAllocated++)
        {
            CDecoderSample *pSample = new (std::nothrow) CDecoderSample(this, &hr);

            if (pSample == NULL)
            {
                hr = E_OUTOFMEMORY;
                break;
            }
            if (FAILED(hr))
            {
                break;
            }

            // Assign the Direct3D surface pointer and the index.
            pSample->SetSurface(m_lAllocated, m_ppRTSurfaceArray[m_lAllocated]);

            // Add to the sample list.
            m_lFree.Add(pSample);
        }
    }

    if (SUCCEEDED(hr))
    {
        m_bChanged = FALSE;
    }
    return hr;
}
The following code shows the Free method:
void CDecoderAllocator::Free()
{
    CMediaSample *pSample = NULL;

    do
    {
        pSample = m_lFree.RemoveHead();
        if (pSample)
        {
            delete pSample;
        }
    } while (pSample);

    if (m_ppRTSurfaceArray)
    {
        for (long i = 0; i < m_lAllocated; i++)
        {
            SafeRelease(&m_ppRTSurfaceArray[i]);
        }

        delete [] m_ppRTSurfaceArray;
    }
    m_lAllocated = 0;
}
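The article does not show steps 3 and 4 of the list above. Below is a minimal sketch under stated assumptions: the custom allocator's constructor parameters, the surface count, and the m_dwWidth/m_dwHeight members are all hypothetical; only InitAllocator, DecideBufferSize and m_fccOutputFormat come from the class declared at the top of this topic. The output pin's own InitAllocator override is assumed to delegate to the filter method shown here.

// Sketch only: create the custom allocator and size it for the EVR plus the decoder.
HRESULT CDecoder::InitAllocator(IMemAllocator **ppAlloc)
{
    HRESULT hr = S_OK;

    // Create an instance of the custom allocator (constructor signature is assumed).
    CDecoderAllocator *pAlloc = new (std::nothrow) CDecoderAllocator(
        &hr, m_pDecoderService, m_dwWidth, m_dwHeight, m_fccOutputFormat);

    if (pAlloc == NULL)
    {
        return E_OUTOFMEMORY;
    }
    if (FAILED(hr))
    {
        delete pAlloc;
        return hr;
    }

    // Return the IMemAllocator interface with an outstanding reference.
    pAlloc->AddRef();
    *ppAlloc = pAlloc;
    return S_OK;
}

HRESULT CDecoder::DecideBufferSize(IMemAllocator *pAlloc, ALLOCATOR_PROPERTIES *pProp)
{
    // pProp->cBuffers arrives with the number of surfaces the EVR requires.
    // Add the surfaces the decoder itself needs (the count here is illustrative).
    const long cDecoderSurfaces = 4;

    pProp->cBuffers += cDecoderSurfaces;
    pProp->cbBuffer  = 1; // The samples carry Direct3D surfaces, not system memory.
    pProp->cbAlign   = 1;

    ALLOCATOR_PROPERTIES actual;
    HRESULT hr = pAlloc->SetProperties(pProp, &actual);
    if (FAILED(hr))
    {
        return hr;
    }

    return (actual.cBuffers < pProp->cBuffers) ? E_FAIL : S_OK;
}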
6. Decoding
To create the decoder device, call the IDirectXVideoDecoderService::CreateVideoDecoder method. The method returns a pointer to the decoder device's IDirectXVideoDecoder interface.
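The article does not show the CreateVideoDecoder call itself. A minimal sketch follows; it reuses the GUID, configuration and FillInVideoDescription from the earlier steps, and assumes the surfaces created by the custom allocator are accessible to the filter (m_pVideoDecoder, m_ppRTSurfaceArray, m_cSurfaces and CreateDecoderDevice are hypothetical names).

// Sketch only: create the decoder device from the configuration found earlier.
HRESULT CDecoder::CreateDecoderDevice()
{
    DXVA2_VideoDesc videoDesc = {0};
    FillInVideoDescription(&videoDesc);

    IDirectXVideoDecoder *pDecoder = NULL;

    HRESULT hr = m_pDecoderService->CreateVideoDecoder(
        m_DecoderGuid,      // decoder device GUID found in ConfigureDXVA2
        &videoDesc,         // video description
        &m_DecoderConfig,   // configuration found in FindDecoderConfiguration
        m_ppRTSurfaceArray, // Direct3D surfaces created by the allocator
        m_cSurfaces,        // number of surfaces
        &pDecoder           // receives the IDirectXVideoDecoder pointer
        );

    if (SUCCEEDED(hr))
    {
        SafeRelease(&m_pVideoDecoder);
        m_pVideoDecoder = pDecoder;
    }
    return hr;
}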
On each frame, call IDirect3DDeviceManager9::TestDevice to test the device handle. If the device has changed, the method returns DXVA2_E_NEW_VIDEO_DEVICE. If this happens, do the following (a sketch follows the list):
1) Call IDirect3DDeviceManager9::CloseDeviceHandle to close the device handle.
2) Release the IDirectXVideoDecoderService and IDirectXVideoDecoder pointers.
3) Open a new device handle.
4) Negotiate a new decoder configuration, as described in section 3.
5) Create a new decoder device.
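A minimal sketch of this per-frame check, assuming the filter kept the device manager from ConfigureDXVA2 in a member (m_pDeviceManager, m_pVideoDecoder, CheckDevice and CreateDecoderDevice are hypothetical names; CreateDecoderDevice is the helper sketched above):

// Sketch only: per-frame device-handle test and recovery.
HRESULT CDecoder::CheckDevice()
{
    HRESULT hr = m_pDeviceManager->TestDevice(m_hDevice);

    if (hr == DXVA2_E_NEW_VIDEO_DEVICE)
    {
        // The Direct3D device changed: close the handle and drop the old objects.
        m_pDeviceManager->CloseDeviceHandle(m_hDevice);
        m_hDevice = INVALID_HANDLE_VALUE;

        SafeRelease(&m_pVideoDecoder);   // hypothetical member holding IDirectXVideoDecoder
        SafeRelease(&m_pDecoderService);

        // Re-open a handle, find a configuration, and create a new decoder device.
        hr = ConfigureDXVA2(m_pOutput->GetConnected());
        if (SUCCEEDED(hr))
        {
            hr = CreateDecoderDevice();
        }
    }
    return hr;
}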
Assuming the device handle is valid, the decoding process works as follows (a minimal sketch of the sequence follows the notes below):
1) Call IDirectXVideoDecoder::BeginFrame.
2) Do the following, one or more times:
a) Call IDirectXVideoDecoder::GetBuffer to get a DXVA decoder buffer.
b) Fill the buffer.
c) Call IDirectXVideoDecoder::ReleaseBuffer.
3) Call IDirectXVideoDecoder::Execute to perform the decoding operations on the frame.
DXVA 2.0 uses the same data structures for decoding operations as DXVA 1.0.
Between each pair of BeginFrame/Execute calls, you may call GetBuffer multiple times, but only once for each type of DXVA buffer. If you call it twice with the same buffer type, you will overwrite the data.
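A minimal sketch of this sequence for one frame. The contents of the compressed buffers are codec specific and are not shown; only the picture-parameters buffer is illustrated, m_pVideoDecoder and DecodeFrameSketch are hypothetical names, and the closing EndFrame call is part of the IDirectXVideoDecoder interface even though the step list above does not mention it.

// Sketch only: the DXVA 2.0 per-frame call sequence.
// pSurface is the Direct3D surface of the output sample for this frame.
HRESULT CDecoder::DecodeFrameSketch(IDirect3DSurface9 *pSurface)
{
    HRESULT hr = m_pVideoDecoder->BeginFrame(pSurface, NULL);
    if (FAILED(hr))
    {
        return hr;
    }

    // Repeat this block once per buffer type (picture parameters, bitstream, and so on).
    void *pBuffer = NULL;
    UINT cbBuffer = 0;
    hr = m_pVideoDecoder->GetBuffer(DXVA2_PictureParametersBufferType, &pBuffer, &cbBuffer);
    if (SUCCEEDED(hr))
    {
        // ... copy the codec-specific picture parameters into pBuffer (at most cbBuffer bytes) ...
        hr = m_pVideoDecoder->ReleaseBuffer(DXVA2_PictureParametersBufferType);
    }

    if (SUCCEEDED(hr))
    {
        // Describe the buffers that were filled and run the decoding operation.
        DXVA2_DecodeBufferDesc bufferDesc = {0};
        bufferDesc.CompressedBufferType = DXVA2_PictureParametersBufferType;
        bufferDesc.DataSize = cbBuffer; // bytes actually written; illustrative here

        DXVA2_DecodeExecuteParams execParams = {0};
        execParams.NumCompBuffers     = 1;
        execParams.pCompressedBuffers = &bufferDesc;

        hr = m_pVideoDecoder->Execute(&execParams);
    }

    // Close the frame.
    m_pVideoDecoder->EndFrame(NULL);
    return hr;
}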
After calling Execute, call IMemInputPin::Receive to deliver the frame to the video renderer, just as with software decoding. The Receive method is asynchronous; once it returns, the decoder can continue decoding the next frame. The display driver prevents any decoding commands from overwriting a buffer while the buffer is in use. The decoder should not reuse a surface to decode another frame until the renderer has released the sample. After the renderer releases the sample, the allocator puts the sample back into its pool of available samples. To get the next available sample, call CBaseOutputPin::GetDeliveryBuffer, which in turn calls IMemAllocator::GetBuffer.
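A minimal sketch of delivering one decoded frame downstream, assuming the decoder decodes into the surface carried by the sample obtained from the custom allocator; DeliverDecodedFrame is a hypothetical name, DecodeFrameSketch is the helper sketched above, and timestamp handling is omitted.

// Sketch only: get a sample from the custom allocator, decode into its surface,
// then deliver it to the renderer.
HRESULT CDecoder::DeliverDecodedFrame()
{
    IMediaSample *pSample = NULL;

    // GetDeliveryBuffer calls IMemAllocator::GetBuffer on the custom allocator.
    HRESULT hr = m_pOutput->GetDeliveryBuffer(&pSample, NULL, NULL, 0);
    if (FAILED(hr))
    {
        return hr;
    }

    // Retrieve the Direct3D surface that this sample wraps (MR_BUFFER_SERVICE).
    IMFGetService     *pGetService = NULL;
    IDirect3DSurface9 *pSurface = NULL;
    hr = pSample->QueryInterface(IID_PPV_ARGS(&pGetService));
    if (SUCCEEDED(hr))
    {
        hr = pGetService->GetService(MR_BUFFER_SERVICE, IID_PPV_ARGS(&pSurface));
    }

    if (SUCCEEDED(hr))
    {
        // Decode the frame into pSurface.
        hr = DecodeFrameSketch(pSurface);
    }

    if (SUCCEEDED(hr))
    {
        // Deliver calls IMemInputPin::Receive on the renderer's input pin.
        hr = m_pOutput->Deliver(pSample);
    }

    SafeRelease(&pSurface);
    SafeRelease(&pGetService);
    SafeRelease(&pSample);
    return hr;
}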