KIN DB 2004 Project - Stress Tests
NOTE: This page is deprecated and for reference only. Refer to the
new tests page for updated information and results.
These are some tests performed with the routines as of November 17th, to give you
an idea of how CPU and RAM affect the performance of some of the main (massive)
Kin DataBase routines.
Most of the code for these tests can be found in
stress.c. If you want to contribute or discuss any of the figures here, you may post
a comment at the project's
performance forum. For instance, it would be interesting to see results on an Athlon64
with fast RAM attached...
This is a Pentium 4 HT 2.60GHz with 512KB cache (cpu_family: 15,
model: 2, stepping: 9), 2x512MB (DDR400) in dual bank configuration,
on an Intel i865G chipset (using shared video frame buffer):
Stress version 0.0.2, Copyright (C) 2005 David Lopez Vinacua
GetuT() resolution: 1us, speed: 2 calls per us
There's 257019 pages (4096 bytes each): 980MB
CPU-RAM performance (best of 3):
Page allocation speed: 69us ( 884MP/s)
Function speed on 256MB Spent (us) B/s i/s
bzero() W 202722 1.23G 1.23G
memset32_1() W 202800 1.23G 308M
memset32_2() W 190233 1.31G 328M
scasb_1() (membyte1) R 430196 581M 581M
scasb_2() (membyte2) R 330320 756M 756M
scasb_3() (memchr) R 146233 1.70G 1.70G
scasw_1() (memshort1) R 221978 1.12G 563M
scasw_2() (memshort2) R 180872 1.38G 691M
scasl_1() (memint1) R 126658 1.97G 493M
scasl_2() (memint2) R 121539 2.05G 514M
ChkSum32_1() R 126262 1.98G 495M Res=60 (OK)
In-Cache performance (tests are performed 8 times each, results are best of 4):
Function speed on 8x64KB Spent (us) B/s i/s
scasb_1() (membyte1) R 806 605M 605M
scasb_2() (membyte2) R 605 807M 807M
scasb_3() (memchr) R 202 2.41G 2.41G
scasw_1() (memshort1) R 403 1.21G 605M
scasw_2() (memshort2) R 303 1.61G 805M
scasl_1() (memint1) R 202 2.41G 604M
scasl_2() (memint2) R 151 3.23G 808M
ChkSum32_1() R 102 4.78G 1.19G
CPU benchmarks (best of 12):
488K loop w/counter Spent (us) B/s i/s
loop instruction 384 1.21G
memory counter 384 1.21G
register counter 288 1.61G
This is a Pentium 4 HT 3.20GHz with 1024KB cache (cpu_family: 15,
model: 4, stepping: 1), 2x1GB (DDR400) in dual bank configuration,
on an Intel i865G chipset (NOT using shared video frame buffer):
Stress version 0.0.2, Copyright (C) 2005 David Lopez Vinacua
GetuT() resolution: 1us, speed: 2 calls per us
There's 519151 pages (4096 bytes each): 1.98GB
CPU-RAM performance (best of 3):
Page allocation speed: 49us (1.24GP/s)
Function speed on 256MB Spent (us) B/s i/s
bzero() W 153610 1.62G 1.62G
memset32_1() W 154237 1.62G 405M
memset32_2() W 138509 1.80G 451M
scasb_1() (membyte1) R 342098 730M 730M
scasb_2() (membyte2) R 328722 760M 760M
scasb_3() (memchr) R 121426 2.05G 2.05G
scasw_1() (memshort1) R 174767 1.43G 715M
scasw_2() (memshort2) R 158085 1.58G 790M
scasl_1() (memint1) R 94547 2.64G 661M
scasl_2() (memint2) R 92812 2.69G 673M
ChkSum32_1() R 76785 3.25G 813M Res=60 (OK)
In-Cache performance (tests are performed 8 times each, results are best of 4):
Function speed on 8x64KB Spent (us) B/s i/s
scasb_1() (membyte1) R 652 748M 748M
scasb_2() (membyte2) R 629 776M 776M
scasb_3() (memchr) R 214 2.28G 2.28G
scasw_1() (memshort1) R 326 1.49G 748M
scasw_2() (memshort2) R 291 1.67G 838M
scasl_1() (memint1) R 164 2.97G 744M
scasl_2() (memint2) R 166 2.94G 735M
ChkSum32_1() R 116 4.20G 1.05G
CPU benchmarks (best of 12):
488K loop w/counter Spent (us) B/s i/s
loop instruction 311 1.49G
memory counter 777 599M
register counter 234 1.99G
This is an Athlon XP 2000+ (1.66GHz) with 256KB cache (cpu_family: 6,
model: 8, stepping: 1), 1x256MB (DDR333, but at 2x133MHz) on a VIA KM266
chipset (using shared video frame buffer):
Stress version 0.0.2, Copyright (C) 2005 David Lopez Vinacua
GetuT() resolution: 1us, speed: 5 calls per us
There's 60172 pages (4096 bytes each): 229MB
CPU-RAM performance (best of 3):
Page allocation speed: 27us (1.13GP/s)
Function speed on 128MB Spent (us) B/s i/s
bzero() W 279803 446M 446M
memset32_1() W 279829 446M 111M
memset32_2() W 338480 369M 94.5M
scasb_1() (membyte1) R 357724 349M 349M
scasb_2() (membyte2) R 332922 375M 375M
scasb_3() (memchr) R 283395 441M 441M
scasw_1() (memshort1) R 278807 448M 224M
scasw_2() (memshort2) R 282722 442M 221M
scasl_1() (memint1) R 263147 475M 118M
scasl_2() (memint2) R 263290 474M 118M
ChkSum32_1() R 271597 460M 115M Res=60 (OK)
In-Cache performance (tests are performed 8 times each, results are best of 4):
Function speed on 8x64KB Spent (us) B/s i/s
scasb_1() (membyte1) R 708 689M 689M
scasb_2() (membyte2) R 708 689M 689M
scasb_3() (memchr) R 365 1.33G 1.33G
scasw_1() (memshort1) R 355 1.37G 687M
scasw_2() (memshort2) R 355 1.37G 687M
scasl_1() (memint1) R 179 2.72G 681M
scasl_2() (memint2) R 178 2.74G 685M
ChkSum32_1() R 179 2.72G 681M
CPU benchmarks (best of 12):
488K loop w/counter Spent (us) B/s i/s
loop instruction 899 517M
memory counter 900 517M
register counter 599 777M
This is an Athlon 800MHz with 256KB cache (cpu_family: 6,
model: 4, stepping: 2), 384MB (SDR133), on a VIA KT133 chipset
(shared video frame buffer, but not used):
Stress version 0.0.2, Copyright (C) 2005 David Lopez Vinacua
GetuT() resolution: 1us, speed: 2 calls per us
There's 96636 pages (4096 bytes each): 368MB
CPU-RAM performance (best of 3):
Page allocation speed: 114us ( 535MP/s)
Function speed on 256MB Spent (us) B/s i/s
bzero() W 716643 348M 348M
memset32_1() W 716739 348M 89.2M
memset32_2() W 716439 348M 89.3M
scasb_1() (membyte1) R 1510134 165M 165M
scasb_2() (membyte2) R 1449868 172M 172M
scasb_3() (memchr) R 1094242 228M 228M
scasw_1() (memshort1) R 1177657 212M 106M
scasw_2() (memshort2) R 1136320 220M 110M
scasl_1() (memint1) R 1009123 247M 63.4M
scasl_2() (memint2) R 928329 269M 68.9M
ChkSum32_1() R 1012167 246M 63.2M Res=68 (FAILED!)
In-Cache performance (tests are performed 8 times each, results are best of 4):
Function speed on 8x64KB Spent (us) B/s i/s
scasb_1() (membyte1) R 1465 333M 333M
scasb_2() (membyte2) R 1464 333M 333M
scasb_3() (memchr) R 755 646M 646M
scasw_1() (memshort1) R 735 664M 332M
scasw_2() (memshort2) R 733 666M 333M
scasl_1() (memint1) R 370 1.31G 329M
scasl_2() (memint2) R 368 1.32G 331M
ChkSum32_1() R 371 1.31G 329M
CPU benchmarks (best of 12):
488K loop w/counter Spent (us) B/s i/s
loop instruction 1859 250M
memory counter 1859 250M
register counter 1239 375M
This is a Pentium 4 Celeron 2.40GHz with 128KB cache (cpu_family: 15,
model: 2, stepping: 9), 256MB (DDR333), on an SiS 645 chipset:
Stress version 0.0.2, Copyright (C) 2005 David Lopez Vinacua
GetuT() resolution: 1us, speed: 3 calls per us
There's 64199 pages (4096 bytes each): 244MB
CPU-RAM performance (best of 3):
Page allocation speed: 29us (1.5GP/s)
Function speed on 128MB Spent (us) B/s i/s
bzero() W 172700 723M 723M
memset32_1() W 172645 724M 181M
memset32_2() W 159622 783M 195M
scasb_1() (membyte1) R 178878 698M 698M
scasb_2() (membyte2) R 137472 909M 909M
scasb_3() (memchr) R 75514 1.65G 1.65G
scasw_1() (memshort1) R 95480 1.30G 654M
scasw_2() (memshort2) R 78653 1.58G 794M
scasl_1() (memint1) R 72909 1.71G 428M
scasl_2() (memint2) R 73058 1.71G 427M
ChkSum32_1() R 72784 1.71G 429M Res=60 (OK)
In-Cache performance (tests are performed 8 times each, results are best of 4):
Function speed on 8x64KB Spent (us) B/s i/s
scasb_1() (membyte1) R 665 734M 734M
scasb_2() (membyte2) R 503 970M 970M
scasb_3() (memchr) R 191 2.55G 2.55G
scasw_1() (memshort1) R 337 1.44G 724M
scasw_2() (memshort2) R 259 1.88G 942M
scasl_1() (memint1) R 189 2.58G 645M
scasl_2() (memint2) R 157 3.11G 777M
ChkSum32_1() R 124 3.93G 984M
CPU benchmarks (best of 12):
488K loop w/counter Spent (us) B/s i/s
loop instruction 313 1.48G
memory counter 2323 200M
register counter 234 1.99G
This is a Pentium 4 2.40GHz with 512KB cache (cpu_family: 15,
model: 2, stepping: 7), 256MB (DDR266), on a SiS645DX chipset:
Stress version 0.0.2, Copyright (C) 2005 David Lopez Vinacua
GetuT() resolution: 1us, speed: 2 calls per us
There's 64128 pages (4096 bytes each): 244MB
CPU-RAM performance (best of 3):
Page allocation speed: 44us ( 693MP/s)
Function speed on 128MB Spent (us) B/s i/s
bzero() W 252665 494M 494M
memset32_1() W 252408 495M 123M
memset32_2() W 223235 559M 139M
scasb_1() (membyte1) R 236125 529M 529M
scasb_2() (membyte2) R 181539 688M 688M
scasb_3() (memchr) R 90086 1.38G 1.38G
scasw_1() (memshort1) R 124273 1.00G 502M
scasw_2() (memshort2) R 98259 1.27G 636M
scasl_1() (memint1) R 86265 1.44G 362M
scasl_2() (memint2) R 86419 1.44G 361M
ChkSum32_1() R 86870 1.43G 359M Res=60 (OK)
In-Cache performance (tests are performed 8 times each, results are best of 4):
Function speed on 8x64KB Spent (us) B/s i/s
scasb_1() (membyte1) R 884 552M 552M
scasb_2() (membyte2) R 667 732M 732M
scasb_3() (memchr) R 219 2.22G 2.22G
scasw_1() (memshort1) R 438 1.11G 557M
scasw_2() (memshort2) R 328 1.48G 744M
scasl_1() (memint1) R 219 2.22G 557M
scasl_2() (memint2) R 165 2.95G 739M
ChkSum32_1() R 110 4.43G 1.10G
CPU benchmarks (best of 12):
488K loop w/counter Spent (us) B/s i/s
loop instruction 472 986M
memory counter 955 487M
register counter 313 1.48G
This is a Pentium 4 HT 3.00GHz with 1MB cache (cpu_family: 15,
model: 3, stepping: 4), 512MB (DDR400), on an ATI RS300 chipset
(with shared 128bit video framebuffer):
Stress version 0.0.2, Copyright (C) 2005 David Lopez Vinacua
GetuT() resolution: 1us, speed: 2 calls per us
There's 112687 pages (4096 bytes each): 429MB
CPU-RAM performance (best of 3):
Page allocation speed: 74us ( 824MP/s)
Function speed on 256MB Spent (us) B/s i/s
bzero() W 294850 847M 847M
memset32_1() W 293855 850M 212M
memset32_2() W 250606 997M 249M
scasb_1() (membyte1) R 376145 664M 664M
scasb_2() (membyte2) R 362579 689M 689M
scasb_3() (memchr) R 137923 1.81G 1.81G
scasw_1() (memshort1) R 194428 1.28G 642M
scasw_2() (memshort2) R 171886 1.45G 727M
scasl_1() (memint1) R 142041 1.76G 440M
scasl_2() (memint2) R 139814 1.78G 447M
ChkSum32_1() R 124947 2.00G 500M Res=60 (OK)
In-Cache performance (tests are performed 8 times each, results are best of 4):
Function speed on 8x64KB Spent (us) B/s i/s
scasb_1() (membyte1) R 702 695M 695M
scasb_2() (membyte2) R 675 723M 723M
scasb_3() (memchr) R 231 2.11G 2.11G
scasw_1() (memshort1) R 351 1.39G 695M
scasw_2() (memshort2) R 315 1.55G 775M
scasl_1() (memint1) R 181 2.69G 674M
scasl_2() (memint2) R 178 2.74G 685M
ChkSum32_1() R 121 4.03G 1.00G
CPU benchmarks (best of 12):
488K loop w/counter Spent (us) B/s i/s
loop instruction 334 1.39G
memory counter 835 557M
register counter 251 1.85G
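The B/s and i/s columns in the listings above can be reproduced from the buffer size and the "Spent (us)" column. The following is only a sketch, not code taken from stress.c: the function names are hypothetical, and it assumes that "G" in the output means 2^30 and that i/s is B/s divided by the item width (1, 2 or 4 bytes). Both assumptions are consistent with the first system's bzero() and memset32_1() rows.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per second for `bytes` processed in `elapsed_us` microseconds. */
double bytes_per_second(uint64_t bytes, uint64_t elapsed_us)
{
    assert(elapsed_us > 0);
    return (double)bytes * 1e6 / (double)elapsed_us;
}

/* Items per second when every item is `item_size` bytes wide
 * (assumption: 1 for byte scans, 2 for short scans, 4 for 32-bit routines). */
double items_per_second(uint64_t bytes, uint64_t elapsed_us, unsigned item_size)
{
    assert(item_size > 0);
    return bytes_per_second(bytes, elapsed_us) / (double)item_size;
}

/* Express a rate in units of 2^30, the assumed meaning of "G" in the output. */
double in_units_of_2_30(double rate)
{
    return rate / 1073741824.0;
}
```

For example, `in_units_of_2_30(bytes_per_second(256ull << 20, 202722))` evaluates to roughly 1.23, matching the 1.23G reported for bzero() on the first system.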
Note: For times longer than a few hundred microseconds, task switching is very
likely, so consecutive tests may give different results. This is annoying
for a study, but it represents the real world (in a real system, there will be task switching
events). If you perform a test, try to keep the machine as idle as possible to minimize
spurious time accounting. Before every test, the stress application tries to give the OS
a chance to switch to other tasks, and results are usually stable (very small variations).
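The "best of N" approach described in this note can be sketched as follows. This is an illustration under assumptions, not the actual stress.c code: `now_us()` is a hypothetical stand-in for the GetuT() timer seen in the output, and using POSIX `sched_yield()` to give the OS a chance to switch tasks is a guess at how that step could be done.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdint.h>
#include <time.h>

/* Monotonic time in microseconds (hypothetical stand-in for GetuT()). */
static uint64_t now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u;
}

/* Run fn(arg) `runs` times and return the shortest elapsed time in us.
 * Keeping the minimum discards runs that were inflated by a task switch. */
uint64_t best_of(int runs, void (*fn)(void *), void *arg)
{
    uint64_t best = UINT64_MAX;
    assert(runs > 0);
    for (int i = 0; i < runs; i++) {
        sched_yield();                 /* let other tasks run before timing */
        uint64_t t0 = now_us();
        fn(arg);
        uint64_t dt = now_us() - t0;
        if (dt < best)
            best = dt;
    }
    return best;
}

/* Example workload: counts its calls through the int that arg points to. */
void count_call(void *arg)
{
    ++*(int *)arg;
}
```

Calling `best_of(3, count_call, &calls)` runs the workload three times and returns the fastest run, mirroring the "best of 3" figures in the CPU-RAM sections. A real workload would be a routine such as a memory fill or scan over the test buffer.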
Note: System 4 (the Athlon 800MHz) fails the checksum: the result value (Res=68) is
not the expected one. Will investigate that...
After examining those tests, some funny conclusions arise: