For example, to measure network performance between two multi-core servers running Windows Server 2012, NODE1 (192.168.0.1) and NODE2 (192.168.0.2), connected over a 10GigE link, run the following on NODE1 (the sender):
ntttcp.exe -s -m 8,*,192.168.0.2 -l 128k -a 2 -t 15
(Translation: Run ntttcp.exe as a sender with eight threads dynamically allocated across all cores, targeting 192.168.0.2, using a 128 KB buffer length, and operating in asynchronous mode with 2 posted overlapped send buffers per thread, for 15 seconds.)
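To make the mapping from parameters to flags explicit, here is a minimal Python sketch that assembles the sender command line shown above. The function build_ntttcp_args and its parameter names are hypothetical and are not part of NTttcp; the sketch only formats the flags that appear in this example (-s, -r, -m, -l, -rb, -a, -t) and does not run the tool.

```python
def build_ntttcp_args(role, threads, target_ip, seconds, async_buffers,
                      cpu="*", buffer_len=None, recv_buf=None):
    """Assemble an ntttcp.exe argument list (hypothetical helper)."""
    if role not in ("sender", "receiver"):
        raise ValueError("role must be 'sender' or 'receiver'")

    # -s selects sender mode, -r selects receiver mode.
    args = ["ntttcp.exe", "-s" if role == "sender" else "-r"]

    # -m takes threads,cpu,ip; "*" spreads the threads across all cores.
    args += ["-m", f"{threads},{cpu},{target_ip}"]

    # -l (buffer length) and -rb (SO_RCVBUF size) are optional;
    # they are emitted only when a value is supplied.
    if buffer_len is not None:
        args += ["-l", buffer_len]
    if recv_buf is not None:
        args += ["-rb", recv_buf]

    # -a sets the number of posted overlapped buffers per thread,
    # -t sets the run time in seconds.
    args += ["-a", str(async_buffers), "-t", str(seconds)]
    return args


sender = build_ntttcp_args("sender", 8, "192.168.0.2", 15, 2,
                           buffer_len="128k")
print(" ".join(sender))
# prints: ntttcp.exe -s -m 8,*,192.168.0.2 -l 128k -a 2 -t 15
```

Returning a list rather than a single string keeps each argument separate, which is convenient if the list is later handed to a process launcher instead of being pasted into a shell.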
And on NODE2 (the receiver), run:
ntttcp.exe -r -m 8,*,192.168.0.2 -rb 2M -a 16 -t 15
(Translation: Run ntttcp.exe as a receiver, with eight threads dynamically allocated across all cores listening on 192.168.0.2, allocating 64KB buffers [since -l is not specified], a 2MB SO_RCVBUF Winsock buffer and operating in asynchronous mode with 16 posted receive overlapped buffers