Performance Comparison of Lambda and Bind

Let's start with the following function:

[cpp]
template <typename Function>
void do_test_loop(Function func, const uint64_t upper_limit = 1000000000ULL)
{
    for (uint64_t i = 0; i < upper_limit; ++i)
        func(i);
}

This function simply calls func in a loop upper_limit times (one billion by default). There are many ways to supply the func that gets called repeatedly; here we discuss only two:

1. Using std::bind to create a polymorphic std::function<void (uint64_t)>.
2. A lambda expression.

[cpp]
void test_accumulate_bind_function(uint64_t& x, uint64_t i)
{
    x += i;
}

uint64_t test_accumulate_bind()
{
    namespace arg = std::placeholders;

    uint64_t x = 0;
    std::function<void (uint64_t)> accumulator =
        std::bind(&test_accumulate_bind_function, std::ref(x), arg::_1);
    do_test_loop(accumulator);
    return x;
}

This is a simple function. The biggest problem I have run into with boost::bind is that it forces you to separate the function from the logic that uses it, which makes the code harder to follow. For larger functions this is not a big deal, but for small functions such as the one above, the extra hop to a separately defined function (and, at run time, the indirect call through std::function) is an annoying cost.

The equivalent lambda expression looks like this:

[cpp]
uint64_t test_accumulate_lambda()
{
    uint64_t x = 0;
    auto accumulator = [&x] (uint64_t i) { x += i; };
    do_test_loop(accumulator);
    return x;
}

The lambda version has no such hop. Of course, we also lose the polymorphism that std::function provides. A lambda has an unnamed type that the compiler binds statically, which is why you must use the auto keyword to declare a variable of that type. The variable accumulator holds the result of the lambda expression, and no other lambda expression produces a value of the same type. Even two expressions with identical bodies have different types. If do_test_loop were implemented in a .cpp file instead of as a template, there would be no way to name the type of the lambda being passed in.

Fortunately, some clever people anticipated this problem, and assigning a lambda expression to a std::function is not only possible but extremely easy:

[cpp]
uint64_t test_accumulate_bound_lambda()
{
    uint64_t x = 0;
    std::function<void (uint64_t)> accumulator = [&x] (uint64_t i) { x += i; };
    do_test_loop(accumulator);
    return x;
}

By using lambda syntax in place of std::bind, we appear to get the full polymorphic power of std::function together with the convenience and performance of C++ lambda expressions. That sounds like a win-win.

We can run a simple comparison of these three functions (using the timer class from the full source listing at the end):

[cpp]
template <typename Function>
void run_test(const std::string& name, Function func)
{
    std::cout << name;
    timer t;
    volatile_write(func());
    timer::duration duration = t.elapsed();
    std::cout << '\t' << duration.count() << std::endl;
}

int main()
{
    run_test("Accumulate (lambda)            ", &test_accumulate_lambda);
    run_test("Accumulate (bind)              ", &test_accumulate_bind);
    run_test("Accumulate (bound lambda)      ", &test_accumulate_bound_lambda);
}

Without further ado, here are the results when compiled with gcc 4.6.2 -O3 and run on an Intel Core i7 Q740 (the numbers are the raw duration.count() values of std::chrono::high_resolution_clock):

Accumulate (lambda)             7
Accumulate (bind)               4401849
Accumulate (bound lambda)       4379315

Whenever a benchmark produces results this lopsided, I disassemble the program to see what the compiler actually did.

(gdb) disassemble test_accumulate_lambda
Dump of assembler code for function _Z22test_accumulate_lambdav:
   0x0000000000400e70 <+0>:     movabs $0x6f05b59b5e49b00,%rax
   0x0000000000400e75 <+5>:     retq
End of assembler dump.

After optimization, the entire function just moves 0x6f05b59b5e49b00 (499999999500000000 in decimal) into the rax register and returns. The compiler worked out that we are only summing the integers from 0 up to (but not including) 1000000000 and replaced the whole loop with the answer. I am impressed that the compiler can do this, and it makes perfect sense: the body of the function is statically known to this instantiation of do_test_loop, so the compiler can treat the code as if it had been written like this:

[cpp]
uint64_t test_accumulate_lambda()
{
    uint64_t x = 0;
    // do_test_loop:
    for (uint64_t i = 0; i < 1000000000; ++i)
        x += i;
    return x;
}

Any decent compiler will optimize that. I think the most important thing to take away from this simple example is that the compiler knows the lambda statically, so you can use lambdas freely without worrying about their performance.

So what happens when we call through std::function? Here its polymorphism makes things hard to analyze. When do_test_loop is instantiated with std::function<void (uint64_t)>, the compiler knows nothing about the behavior of func, so func could do anything (all the compiler sees is the entry point of std::function). The difference between std::bind and the lambda expression is therefore very small. If you run the test many times, the lambda is always slightly faster than std::bind on my machine, but the difference is not statistically significant. It may well change on other machines; if I had to guess, I would attribute it to std::reference_wrapper.

Let's look at the call stacks of the two functions.

std::bind

#0  test_accumulate_bind_function (x=@0x7fffffffe5d0, i=0) at lambda_vs_bind.cpp:106
#1  0x0000000000401111 in operator() (__args#0=0, this=<optimized out>) at /usr/local/include/gcc-4.6.2/functional:2161
#2  do_test_loop<std::function<void(long unsigned int)> > (func=<optimized out>, upper_limit=<optimized out>) at lambda_vs_bind.cpp:93
#3  test_accumulate_bind () at lambda_vs_bind.cpp:115
#4  0x0000000000401304 in run_test<unsigned long (*)()> (name=<optimized out>, func=0x401080 <test_accumulate_bind()>) at lambda_vs_bind.cpp:84
#5  0x0000000000401411 in main () at lambda_vs_bind.cpp:136

Lambda Expression

#0  std::_Function_handler<void(long unsigned int), test_accumulate_bound_lambda()::<lambda(uint64_t)> >::_M_invoke(const std::_Any_data &, unsigned long) (__functor=..., __args#0=0) at /usr/local/include/gcc-4.6.2/functional:1778
#1  0x0000000000400fa9 in operator() (__args#0=0, this=<optimized out>) at /usr/local/include/gcc-4.6.2/functional:2161
#2  do_test_loop<std::function<void(long unsigned int)> > (func=<optimized out>, upper_limit=<optimized out>) at lambda_vs_bind.cpp:93
#3  test_accumulate_bound_lambda () at lambda_vs_bind.cpp:126
#4  0x0000000000401304 in run_test<unsigned long (*)()> (name=<optimized out>, func=0x400f20 <test_accumulate_bound_lambda()>) at lambda_vs_bind.cpp:84
#5  0x000000000040143e in main () at lambda_vs_bind.cpp:140

The only difference is in what std::function's operator() calls. To see what is really going on, let's take a quick look at how g++ 4.6.2 implements std::function:

[cpp]
template<typename _Res, typename... _ArgTypes>
class function<_Res(_ArgTypes...)>
    : public _Maybe_unary_or_binary_function<_Res, _ArgTypes...>,
      private _Function_base
{
    // a whole bunch of implementation details

private:
    typedef _Res (*_Invoker_type)(const _Any_data&, _ArgTypes...);
    _Invoker_type _M_invoker;
};

What interests me most is that std::function does not use virtual functions; it uses a function pointer instead. This has some advantages: you can use std::function as a plain value without dealing with pointers or references yourself, because that complexity stays inside the object.

boost::bind

What about the older approach, boost::bind? To keep things simple, we take the test cases above and replace std with boost.

Accumulate (boost bind)         3223174
Accumulate (boost bound lambda) 4255098

Surprisingly, boost::bind is about 25% faster than std::bind. The call stack for boost::bind looks much like the one for std::bind:

#0  test_accumulate_bind_function (x=@0x7fffffffe600, i=0) at lambda_vs_bind.cpp:114
#1  0x00000000004018a3 in operator() (a0=0, this=<optimized out>) at /usr/local/include/boost/function/function_template.hpp:1013
#2  do_test_loop<boost::function<void(long unsigned int)> > (upper_limit=<optimized out>, func=<optimized out>) at lambda_vs_bind.cpp:101
#3  test_accumulate_boost_bind () at lambda_vs_bind.cpp:144
#4  0x0000000000401f44 in run_test<unsigned long (*)()> (name=<optimized out>, func=0x401800 <test_accumulate_boost_bind()>) at lambda_vs_bind.cpp:92
#5  0x000000000040207e in main () at lambda_vs_bind.cpp:161

(I could probably write an entire article on why boost::bind is faster than std::bind...)

[cpp]
// functional
template<typename _Functor, typename... _ArgTypes>
inline typename _Bind_helper<_Functor, _ArgTypes...>::type
bind(_Functor&& __f, _ArgTypes&&... __args)
{
    typedef _Bind_helper<_Functor, _ArgTypes...> __helper_type;
    typedef typename __helper_type::__maybe_type __maybe_type;
    typedef typename __helper_type::type __result_type;
    return __result_type(__maybe_type::__do_wrap(std::forward<_Functor>(__f)),
                         std::forward<_ArgTypes>(__args)...);
}

// boost/bind/bind.hpp (with the macros expanded)
template<class F, class A1, class A2>
_bi::bind_t<_bi::unspecified, F, typename _bi::list_av_2<A1, A2>::type>
bind(F f, A1 a1, A2 a2)
{
    typedef typename _bi::list_av_2<A1, A2>::type list_type;
    return _bi::bind_t<_bi::unspecified, F, list_type> (f, list_type(a1, a2));
}

More information
--------------------------------------------------------------------------------

1. Source code

You can get the source code of this program from http://www.gockelhut.com/c++/files/lambda_vs_bind.cpp. It compiles and runs with g++ 4.6.2; a compiler with better C++11 support should do at least as well. My Boost version is 1.47, but older and newer versions should work fine, since the boost::bind syntax has not changed much for some time (no promises for the future). If you want to build and run without Boost, change the value of USE_BOOST to 0.

2. volatile_write

volatile_write is a simple function I wrote to force a write to memory, so that the optimizer cannot throw away code in run_test whose result would otherwise go unused.

[cpp]
template <typename T>
void volatile_write(const T& x)
{
    volatile T* p = new T;
    *p = x;
    delete p;
}

--------------------------------------------------------------------------------
Original article: http://www.gockelhut.com/c++/articles/lambda_vs_bind

lambda_vs_bind.cpp

[cpp]
/**
 *  Copyright 2011 Travis Gockel
 *
 *  Licensed under the Apache License, Version 2.0 (the "License");
 *  you may not use this file except in compliance with the License.
 *  You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 *  Unless required by applicable law or agreed to in writing, software
 *  distributed under the License is distributed on an "AS IS" BASIS,
 *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 *  See the License for the specific language governing permissions and
 *  limitations under the License.
**/

// Turn building and testing boost::bind on or off with this macro
#define USE_BOOST 1

// workaround for varieties of g++-4.6 with --std=gnu++0x
#ifndef _GLIBCXX_USE_NANOSLEEP
#   define _GLIBCXX_USE_NANOSLEEP
#endif

#include <cstdint>
#include <chrono>
#include <functional>   // std::function, std::bind, std::ref
#include <iostream>
#include <string>
#include <thread>

#if USE_BOOST
#include <boost/function.hpp>
#include <boost/bind.hpp>
#endif

class timer
{
public:
    typedef std::chrono::high_resolution_clock clock;
    typedef clock::time_point                  time_point;
    typedef clock::duration                    duration;

public:
    timer()
    {
        reset();
    }

    void reset()
    {
        _starttime = clock::now();
    }

    duration elapsed() const
    {
        return clock::now() - _starttime;
    }

protected:
    time_point _starttime;
};

bool test_timer()
{
    using std::chrono::milliseconds;
    typedef timer::duration duration;

    const milliseconds sleep_time(500);

    timer t;
    std::this_thread::sleep_for(sleep_time);
    duration recorded = t.elapsed();

    // make sure the clock and this_thread::sleep_for is precise within one millisecond (or at least in agreement as to
    // how inaccurate they are)
    return (recorded - milliseconds(1) < sleep_time)
        && (recorded + milliseconds(1) > sleep_time);
}

template <typename T>
void volatile_write(const T& x)
{
    volatile T* p = new T;
    *p = x;
    delete p;
}

template <typename Function>
void run_test(const std::string& name, Function func)
{
    std::cout << name;
    timer t;
    volatile_write(func());
    timer::duration duration = t.elapsed();
    std::cout << '\t' << duration.count() << std::endl;
}

template <typename Function>
void do_test_loop(Function func, const uint64_t upper_limit = 1000000000ULL)
{
    for (uint64_t i = 0; i < upper_limit; ++i)
        func(i);
}

uint64_t test_accumulate_lambda()
{
    uint64_t x = 0;
    auto accumulator = [&x] (uint64_t i) { x += i; };
    do_test_loop(accumulator);
    return x;
}

void test_accumulate_bind_function(uint64_t& x, uint64_t i)
{
    x += i;
}

uint64_t test_accumulate_bind()
{
    namespace arg = std::placeholders;

    uint64_t x = 0;
    std::function<void (uint64_t)> accumulator =
        std::bind(&test_accumulate_bind_function, std::ref(x), arg::_1);
    do_test_loop(accumulator);
    return x;
}

uint64_t test_accumulate_bound_lambda()
{
    uint64_t x = 0;
    std::function<void (uint64_t)> accumulator = [&x] (uint64_t i) { x += i; };
    do_test_loop(accumulator);
    return x;
}

#if USE_BOOST
uint64_t test_accumulate_boost_bind()
{
    uint64_t x = 0;
    boost::function<void (uint64_t)> accumulator =
        boost::bind(&test_accumulate_bind_function, boost::ref(x), _1);
    do_test_loop(accumulator);
    return x;
}

uint64_t test_accumulate_boost_bound_lambda()
{
    uint64_t x = 0;
    boost::function<void (uint64_t)> accumulator = [&x] (uint64_t i) { x += i; };
    do_test_loop(accumulator);
    return x;
}
#endif

int main()
{
    if (!test_timer())
    {
        std::cout << "Failed timer test." << std::endl;
        return -1;
    }

    run_test("Accumulate (lambda)            ", &test_accumulate_lambda);
    run_test("Accumulate (bind)              ", &test_accumulate_bind);
    run_test("Accumulate (bound lambda)      ", &test_accumulate_bound_lambda);

#if USE_BOOST
    run_test("Accumulate (boost bind)        ", &test_accumulate_boost_bind);
    // The listing as posted passed &test_accumulate_bound_lambda here,
    // which re-runs the std::function test instead of the boost::function one.
    run_test("Accumulate (boost bound lambda)", &test_accumulate_boost_bound_lambda);
#endif
}
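To make the article's points about closure types and function-pointer dispatch more concrete, here is a small sketch. It is not part of the original source file. The names erased_call, erase, sum_with_lambda, sum_with_function, sum_with_erased and identical_lambdas_share_a_type are hypothetical helpers introduced only for illustration, and erased_call is a deliberately simplified, non-owning stand-in for the idea behind _M_invoker, not how libstdc++ actually implements std::function.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <type_traits>

// Same shape as the article's do_test_loop, but with an explicit limit.
template <typename Function>
void do_test_loop(Function func, std::uint64_t upper_limit)
{
    for (std::uint64_t i = 0; i < upper_limit; ++i)
        func(i);
}

// Static dispatch: Function is the lambda's own closure type, so the
// compiler can see (and inline) the body of the call.
std::uint64_t sum_with_lambda(std::uint64_t n)
{
    std::uint64_t x = 0;
    auto accumulator = [&x](std::uint64_t i) { x += i; };
    do_test_loop(accumulator, n);
    return x;
}

// Type-erased dispatch: Function is std::function, so every call goes
// through an indirection the optimizer usually cannot see through.
std::uint64_t sum_with_function(std::uint64_t n)
{
    std::uint64_t x = 0;
    std::function<void(std::uint64_t)> accumulator =
        [&x](std::uint64_t i) { x += i; };
    do_test_loop(accumulator, n);
    return x;
}

// Hypothetical, simplified illustration of erasure through a function
// pointer: a pointer to the callable plus a pointer to a function that
// knows the callable's real type. It does not own the callable.
struct erased_call
{
    void* object;
    void (*invoke)(void*, std::uint64_t);

    void operator()(std::uint64_t i) const { invoke(object, i); }
};

template <typename F>
erased_call erase(F& f)
{
    // The captureless lambda converts to a plain function pointer; F is
    // recovered inside it with a cast.
    return erased_call{&f, [](void* obj, std::uint64_t i) {
                           (*static_cast<F*>(obj))(i);
                       }};
}

std::uint64_t sum_with_erased(std::uint64_t n)
{
    std::uint64_t x = 0;
    auto accumulator = [&x](std::uint64_t i) { x += i; };
    do_test_loop(erase(accumulator), n);  // accumulator outlives the loop
    return x;
}

// Two lambdas with identical bodies still have distinct closure types.
bool identical_lambdas_share_a_type()
{
    auto a = [](std::uint64_t i) { return i; };
    auto b = [](std::uint64_t i) { return i; };
    return std::is_same<decltype(a), decltype(b)>::value;
}
```

All three sum functions compute the same result; what differs is how much the compiler knows at the call site. Calling them from your own main with a small limit (for example 1000) is enough to check that they agree. Timing them would need the same care as the article's benchmark, since the statically dispatched version may be folded to a constant.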