Bug #544 » perf-stats.txt

Stanislav Vovk, 05/27/2013 04:25 AM

 

* The following perf stats were collected
-------------------------------------------

0 - Memory Reservation statistics:

CPU_COMMITTED_INST
CPU_CYCLE_COUNT
L2_LWARX_COMPLETE
L2_STWCX_SUCCESS
L2_READ_AFTER_WRITE
L2_WRITE_AFTER_WRITE

1 - L1 Cache performance

L2_MISS_I_FETCH
L2_HIT_I_FETCH
CPU_ICACHE_HIT
L2_MISS_D_FETCH
L2_MISS_STORE
L2_HIT_STORE
L2_HIT_D_FETCH
CPU_DCACHE_HIT

2 - L2 Cache performance

L2_MISS_D_FETCH
L2_MISS_I_FETCH
L2_MISS_STORE
L2_HIT_D_FETCH
L2_HIT_I_FETCH
L2_HIT_STORE

3 - Shadow TLB misses - L2 Miss Eviction

CPU_COMMITTED_INST
CPU_CYCLE_COUNT
CPU_DTLB_RELOAD
CPU_ITLB_RELOAD
L2_MISS_EVICTION
L2_MISS_D_FETCH
L2_MISS_I_FETCH
L2_MISS_STORE

4 - PLB6 Commands

PLB_MASTER_COMMAND
PLB_MASTER_READ
PLB_MASTER_RWITM
PLB_MASTER_DCLAIM
PLB_MASTER_WRITE
PLB_MASTER_INTVN_M
PLB_MASTER_INTVN_S
PLB_MASTER_MEM_DATA

=======================================================

A loop containing an active tracepoint was executed one million times.
Timestamps were taken before entering the loop and after exiting it, and the
average time per tracepoint was calculated from the difference. At startup, the
application sleeps for 2 seconds so that everything is initialized.
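
The measurement described above can be sketched roughly as follows. This is only an illustration in Python, not the actual test program: `tracepoint_stub`, `run_benchmark` and `format_result` are hypothetical names, and the stub stands in for the real LTTng tracepoint call. The per-tracepoint figure uses integer division, which is an assumption that happens to match the logged values (e.g. 3586888706 ns over 1000000 loops gives 3586).

```python
import time


def tracepoint_stub(i):
    """Hypothetical stand-in for the real tracepoint call."""
    return i


def run_benchmark(loops, warmup_s=2.0, tracepoint=tracepoint_stub):
    """Sleep for the warm-up period, then time `loops` tracepoint calls.

    Returns (total_ns, ns_per_tp).
    """
    time.sleep(warmup_s)            # wait until everything is initialized
    start = time.monotonic_ns()     # timestamp before entering the loop
    for i in range(loops):
        tracepoint(i)
    end = time.monotonic_ns()       # timestamp after exiting the loop
    total_ns = end - start
    return total_ns, total_ns // loops


def format_result(loops, total_ns):
    """Format a result line in the same shape as the logs in this file."""
    secs, nsecs = divmod(total_ns, 1_000_000_000)
    per_tp = total_ns // loops
    return (f"Time it took to loop {loops} times : "
            f"{secs} s {nsecs} ns | [ {per_tp} ns per tp ]")


if __name__ == "__main__":
    total, _ = run_benchmark(1_000_000)
    print(format_result(1_000_000, total))
```

The absolute numbers from such a sketch are not comparable to the ones below, since the stub does no tracing; it only shows how the average is derived.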


* LTTng 2.1.1 stats
-------------------------------------------

root@du1:~# ./rbs-perf-test -t 0 -s -c ./ltt-test-2.1.1 -l 1000000 -t 1
--- Memory Reservation statistics - exec cmd - ./ltt-test-2.1.1 ---
Benchmark test starts now...
Time it took to loop 1000000 times : 3 s 586888706 ns | [ 3586 ns per tp ]
Stats for pid 2826
Counters enabled in user and in kernel space
------------------------------------------------------------------
2735477556 Instruction count # 0.48 Instructions per Cycle
5690106496 CPU cycles # 1.596 GHz
3306491 Completed lwarx instructions
3306180 Successful stwcx instructions
2062689 L2 Read after Write
99173776 L2 Write after Write
1245 Context Switches # 0.000 M/sec
2 CPU-migrations # 0.000 M/sec
359 Page Faults # 0.000 M/sec
3566.016640 Task Clock # 0.635 CPUs utilized
5.615581770 seconds time elapsed
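
As a sanity check, the derived values after `#` in the run above can be recomputed from the raw counters. The formulas below are an assumption (they follow the usual perf-stat conventions, with the task clock in milliseconds), and the variable names are illustrative only.

```python
# Raw values copied from the run above (LTTng 2.1.1, test 0).
instructions = 2735477556
cycles = 5690106496
task_clock_ms = 3566.016640
elapsed_s = 5.615581770

# Assumed formulas for the derived columns.
ipc = instructions / cycles                      # instructions per cycle
ghz = cycles / (task_clock_ms * 1e6)             # cycles per ns of task clock
cpus_utilized = (task_clock_ms / 1000.0) / elapsed_s

print(f"{ipc:.2f} Instructions per Cycle")
print(f"{ghz:.3f} GHz")
print(f"{cpus_utilized:.3f} CPUs utilized")
```

With these inputs the three values round to 0.48, 1.596 and 0.635, which agrees with the logged output.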

root@du1:~# ./rbs-perf-test -t 1 -s -c ./ltt-test-2.1.1 -l 1000000 -t 1
--- L1 Cache performance - exec cmd - ./ltt-test-2.1.1 ---
Benchmark test starts now...
Time it took to loop 1000000 times : 3 s 618456855 ns | [ 3618 ns per tp ]
Stats for pid 2846
Counters enabled in user and in kernel space
------------------------------------------------------------------
1382879363 L1-icache-loads # 387.507 M/sec
1376632814 L1-icache-load-hits # 99.55 of all L1-icache hits
838963771 L1-dcache-loads # 235.092 M/sec
705560597 L1-dcache-load-hits # 84.10 of all L1-dcache hits
1157 Context Switches # 0.000 M/sec
2 CPU-migrations # 0.000 M/sec
358 Page Faults # 0.000 M/sec
3568.655296 Task Clock # 0.632 CPUs utilized
5.645967350 seconds time elapsed

root@du1:~# ./rbs-perf-test -t 3 -s -c ./ltt-test-2.1.1 -l 1000000 -t 1
--- Shadow TLB misses - L2 Miss Eviction - exec cmd - ./ltt-test-2.1.1 ---
Benchmark test starts now...
Time it took to loop 1000000 times : 3 s 492358390 ns | [ 3492 ns per tp ]
Stats for pid 2863
Counters enabled in user and in kernel space
------------------------------------------------------------------
2733834879 Instruction count # 0.50 Instructions per Cycle
5508422280 CPU cycles # 1.595 GHz
91880488 dcache shadow TLB misses # 26.607 M/sec
37082980 icache shadow TLB misses # 10.739 M/sec
66490 L2 Miss Eviction # 56.39 of all L2 cache misses
1046 Context Switches # 0.000 M/sec
2 CPU-migrations # 0.000 M/sec
358 Page Faults # 0.000 M/sec
3453.270784 Task Clock # 0.626 CPUs utilized
5.518885690 seconds time elapsed

root@du1:~# ./rbs-perf-test -t 4 -s -c ./ltt-test-2.1.1 -l 1000000 -t 1
--- PLB6 Commands - exec cmd - ./ltt-test-2.1.1 ---
Benchmark test starts now...
Time it took to loop 1000000 times : 3 s 562387238 ns | [ 3562 ns per tp ]
Stats for pid 2877
Counters enabled in user and in kernel space
------------------------------------------------------------------
6588193 PLB6 Commands # 1.894 M/sec
106741 READ/READ-ATOMIC commands # 0.031 M/sec
11681 RWITM commands # 0.003 M/sec
90533 DCLAIM commands # 0.026 M/sec
13808 Cache line WRITE commands # 0.004 M/sec
8133 Cache line fill IntvnM # 0.002 M/sec
48460 Cache line fill IntvnS # 0.014 M/sec
59714 Cache line fill memory # 0.017 M/sec
2985 Context Switches # 0.001 M/sec
2 CPU-migrations # 0.000 M/sec
359 Page Faults # 0.000 M/sec
3477.678672 Task Clock # 0.622 CPUs utilized
5.589642530 seconds time elapsed



=======================================================

* LTTng 2.2.0-rc2 stats
-------------------------------------------

root@du1:~# ./rbs-perf-test -t 0 -s -c ./ltt-test-2.2.0 -l 1000000 -t 1
--- Memory Reservation statistics - exec cmd - ./ltt-test-2.2.0 ---
Benchmark test starts now...
Time it took to loop 1000000 times : 8 s 666169809 ns | [8666 ns per tp]
Stats for pid 2737
Counters enabled in user and in kernel space
------------------------------------------------------------------
6682138400 Instruction count # 0.50 Instructions per Cycle
13486937352 CPU cycles # 1.593 GHz
3378637 Completed lwarx instructions
3378115 Successful stwcx instructions
1116115 L2 Read after Write
419986854 L2 Write after Write
4674 Context Switches # 0.001 M/sec
8 CPU-migrations # 0.000 M/sec
375 Page Faults # 0.000 M/sec
8465.781152 Task Clock # 0.790 CPUs utilized
10.710508740 seconds time elapsed

root@du1:~# ./rbs-perf-test -t 1 -s -c ./ltt-test-2.2.0 -l 1000000 -t 1
--- L1 Cache performance - exec cmd - ./ltt-test-2.2.0 ---
Benchmark test starts now...
Time it took to loop 1000000 times : 8 s 750318129 ns | [8750 ns per tp]
Stats for pid 2752
Counters enabled in user and in kernel space
------------------------------------------------------------------
3076810638 L1-icache-loads # 361.077 M/sec
3042256266 L1-icache-load-hits # 98.88 of all L1-icache hits
2320374878 L1-dcache-loads # 272.306 M/sec
1913574867 L1-dcache-load-hits # 82.47 of all L1-dcache hits
5575 Context Switches # 0.001 M/sec
15 CPU-migrations # 0.000 M/sec
373 Page Faults # 0.000 M/sec
8521.200256 Task Clock # 0.790 CPUs utilized
10.788846760 seconds time elapsed

root@du1:~# ./rbs-perf-test -t 2 -s -c ./ltt-test-2.2.0 -l 1000000 -t 1
--- L2 Cache performance - exec cmd - ./ltt-test-2.2.0 ---
Benchmark test starts now...
Time it took to loop 1000000 times : 8 s 769646969 ns | [8769 ns per tp]
Stats for pid 2767
Counters enabled in user and in kernel space
------------------------------------------------------------------
422920080 L2-loads # 49.870 M/sec
422522434 L2-load-hits # 99.91 of all L2 cache hits
6906 Context Switches # 0.001 M/sec
15 CPU-migrations # 0.000 M/sec
372 Page Faults # 0.000 M/sec
8480.512512 Task Clock # 0.784 CPUs utilized
10.812143770 seconds time elapsed

root@du1:~# ./rbs-perf-test -t 3 -s -c ./ltt-test-2.2.0 -l 1000000 -t 1
--- Shadow TLB misses - L2 Miss Eviction - exec cmd - ./ltt-test-2.2.0 ---
Benchmark test starts now...
Time it took to loop 1000000 times : 8 s 621587970 ns | [8621 ns per tp]
Stats for pid 2782
Counters enabled in user and in kernel space
------------------------------------------------------------------
6682376594 Instruction count # 0.50 Instructions per Cycle
13404210872 CPU cycles # 1.593 GHz
117040641 dcache shadow TLB misses # 13.906 M/sec
60244653 icache shadow TLB misses # 7.158 M/sec
222672 L2 Miss Eviction # 58.50 of all L2 cache misses
3934 Context Switches # 0.000 M/sec
12 CPU-migrations # 0.000 M/sec
370 Page Faults # 0.000 M/sec
8416.596896 Task Clock # 0.789 CPUs utilized
10.661324180 seconds time elapsed

root@du1:~# ./rbs-perf-test -t 4 -s -c ./ltt-test-2.2.0 -l 1000000 -t 1
--- PLB6 Commands - exec cmd - ./ltt-test-2.2.0 ---
Benchmark test starts now...
Time it took to loop 1000000 times : 8 s 768384390 ns | [8768 ns per tp]
Stats for pid 2797
Counters enabled in user and in kernel space
------------------------------------------------------------------
11252534 PLB6 Commands # 1.335 M/sec
363519 READ/READ-ATOMIC commands # 0.043 M/sec
30251 RWITM commands # 0.004 M/sec
248485 DCLAIM commands # 0.029 M/sec
32418 Cache line WRITE commands # 0.004 M/sec
30128 Cache line fill IntvnM # 0.004 M/sec
159751 Cache line fill IntvnS # 0.019 M/sec
191845 Cache line fill memory # 0.023 M/sec
8081 Context Switches # 0.001 M/sec
6 CPU-migrations # 0.000 M/sec
372 Page Faults # 0.000 M/sec
8427.167536 Task Clock # 0.639 CPUs utilized
13.190474860 seconds time elapsed
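
To put the two releases side by side, the per-tracepoint times and instruction counts logged above can be compared with a few lines of Python. This is only a convenience sketch over the numbers already in this file; the dictionary layout and names are illustrative, and test 2 is left out because no 2.1.1 run for it is listed.

```python
# ns per tracepoint, keyed by test number, copied from the logs above.
ns_per_tp_211 = {0: 3586, 1: 3618, 3: 3492, 4: 3562}
ns_per_tp_220 = {0: 8666, 1: 8750, 3: 8621, 4: 8768}

# Instruction counts from test 0 of each release.
instructions_211 = 2735477556
instructions_220 = 6682138400

# Slowdown factor of 2.2.0-rc2 relative to 2.1.1 for each common test.
slowdown = {t: ns_per_tp_220[t] / ns_per_tp_211[t] for t in ns_per_tp_211}
instruction_ratio = instructions_220 / instructions_211

for test, factor in sorted(slowdown.items()):
    print(f"test {test}: {factor:.2f}x slower per tracepoint")
print(f"instruction count ratio: {instruction_ratio:.2f}x")
```

On these figures every common test comes out between 2.4x and 2.5x slower, and the instruction count grows by a similar factor, while instructions per cycle stay at about 0.5 in both releases.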