Performance

This page contains benchmark results for selected Prolog backends. The main goal of this page is to provide some data for comparing predicate performance in plain Prolog and using Logtalk objects. Benchmark results are provided for both static code and dynamic code.

Results are given in number of calls per second. By default, the benchmark code repeats each goal 100000 times in order to get more accurate results. The exception is SICStus Prolog, where 1000000 repetitions were used.
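
For reference only, a calls-per-second figure can be derived by timing a fixed number of repetitions of a goal and dividing. The following sketch is not the actual benchmarks example harness; it assumes a backend providing the statistics(runtime, [Milliseconds| _]) built-in predicate and uses an illustrative benchmark/3 predicate:

    % illustrative sketch only, not the actual benchmarks example harness;
    % assumes a backend providing statistics(runtime, [Milliseconds| _])
    benchmark(Goal, Repetitions, CallsPerSecond) :-
        statistics(runtime, [Start| _]),
        repeat_goal(Repetitions, Goal),
        statistics(runtime, [End| _]),
        Seconds is (End - Start) / 1000.0,
        CallsPerSecond is Repetitions / Seconds.

    repeat_goal(0, _) :-
        !.
    repeat_goal(N, Goal) :-
        \+ \+ call(Goal),    % discard bindings between repetitions
        M is N - 1,
        repeat_goal(M, Goal).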

Benchmarks were run on an Apple iMac with a 3.8 GHz 8-Core Intel Core i7 processor, 32 GB of RAM, and macOS 14.7.1.

Benchmark goals

All the tests have been performed using the benchmarks example distributed with Logtalk 3.85.0, using static binding with optional features (including events support) disabled. This provides the most relevant scenario for comparing Logtalk performance with plain Prolog performance. The benchmarks example contains loader files for easily setting up this and other test scenarios (e.g. dynamic binding).
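
As a point of reference, and assuming a default Logtalk installation where the example is available under the benchmarks library alias, a test scenario can typically be set up by loading the corresponding loader file from the Prolog top level (the exact loader file names for the alternative scenarios are documented in the example itself):

    % load the benchmarks example using its default loader file; the example
    % also provides alternative loaders for other test scenarios
    % (e.g. dynamic binding or events support enabled)
    | ?- logtalk_load(benchmarks(loader)).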

Static code test goals

The benchmarks example provides list length and naive list reverse predicates defined in plain Prolog, in a Prolog module, and in a Logtalk object (predicate definitions are the same in all cases). The following goals are used for the first two benchmark tests:

s11: generate_list(30, List), my_length(List, _)

s12: generate_list(30, List), module:mod_length(List, _)

s13: generate_list(30, List), object::length(List, _)

s21: generate_list(30, List), my_nrev(List, _)

s22: generate_list(30, List), module:mod_nrev(List, _)

s23: generate_list(30, List), object::nrev(List, _)
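
For illustration only (the actual definitions shipped with the benchmarks example may differ in detail), the list length predicate referenced by the s11, s12, and s13 goals can be written in the three forms roughly as follows:

    % plain Prolog definition (illustrative sketch only)
    my_length([], 0).
    my_length([_| Tail], Length) :-
        my_length(Tail, Length0),
        Length is Length0 + 1.

    % the same predicate encapsulated in a Prolog module
    :- module(module, [mod_length/2]).

    mod_length([], 0).
    mod_length([_| Tail], Length) :-
        mod_length(Tail, Length0),
        Length is Length0 + 1.

    % the same predicate encapsulated in a Logtalk object
    :- object(object).

        :- public(length/2).
        length([], 0).
        length([_| Tail], Length) :-
            length(Tail, Length0),
            Length is Length0 + 1.

    :- end_object.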

These benchmark tests use a list of 30 elements as the argument to the list predicates. Increasing the list length tends to decrease the performance differences between plain Prolog and Logtalk, as the list length computation time starts to outweigh the overhead of the message sending mechanism. Likewise, decreasing the list length tends to increase the performance differences, up to the point where the measurements approach the raw overhead of the Logtalk message sending mechanism when compared with plain Prolog predicate calls. However, these tests make use of common library predicates, for which static binding is easily enabled, eliminating the message sending mechanism overheads. The next two sets of test goals deal with graph search:

s31: maze_solve(1, 7, _)

s32: module:mod_maze_solve(1, 7, _)

s33: maze::solve(1, 7, _)

s41: graph_path(0, 4, _)

s42: module:mod_graph_path(0, 4, _)

s43: graph::path(0, 4, _)

When static binding is used, the performance of each set of goals is expected to be similar. The performance of Logtalk can be worse due to the overhead of the extra argument added to each compiled object predicate for carrying execution context information. This overhead depends on the Prolog abstract machine and on the optimizations used to pass unchanged arguments between predicate calls.
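
As an illustration of this point, the object version of the length/2 predicate sketched above would be compiled into a plain Prolog predicate carrying one extra argument. The compiled predicate name and the execution context representation shown below are purely schematic, as both are internal implementation details of the Logtalk compiler:

    % purely schematic view of a compiled object predicate; the actual internal
    % predicate names and the execution context term are compiler implementation details
    'object::length'([], 0, _ExecutionContext).
    'object::length'([_| Tail], Length, ExecutionContext) :-
        'object::length'(Tail, Length0, ExecutionContext),
        Length is Length0 + 1.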

Category test goals

Category predicates can be called using either the ::/1 or the ^^/1 control constructs. When using the ^^/1 control construct, the lookup for both the predicate declaration and the predicate definition begins in this (the object making the call) and is restricted to the imported categories. Depending on how the category is compiled, Logtalk may use static binding for ^^/1 calls, providing the same performance level as calls to local object predicates. The following goals are used for the benchmark tests:

c1: leaf::obj_local

c2: leaf::ctg_direct

c3: leaf::ctg_self

The obj_local method calls a local object predicate; the performance of such calls is equal to, or close to, that of plain Prolog. The ctg_direct method uses the ^^/1 control construct to call an imported category predicate. The ctg_self method uses the ::/1 message sending control construct to call an imported category predicate. While the ^^/1 calls may use static binding, the ::/1 calls always use dynamic binding plus a lookup caching mechanism. Note that the choice between the two control constructs is not simply a question of performance, as they provide different semantics for calling imported category predicates. All three predicates perform the same computation (generating a list of twenty elements and calculating its length) using local predicates, as sketched below.
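
A minimal sketch of the kind of entities these tests exercise follows. The leaf object name and the obj_local, ctg_direct, and ctg_self predicates come from the test goals above; the category name and the auxiliary predicates are illustrative assumptions, and the actual example code may differ:

    % illustrative sketch only; category name and auxiliary predicates are assumptions
    :- category(ctg).

        :- public(work/0).
        work :-
            generate_list(20, List),
            list_length(List, _).

        generate_list(0, []) :-
            !.
        generate_list(N, [N| Tail]) :-
            M is N - 1,
            generate_list(M, Tail).

        list_length([], 0).
        list_length([_| Tail], Length) :-
            list_length(Tail, Length0),
            Length is Length0 + 1.

    :- end_category.

    :- object(leaf,
        imports(ctg)).

        :- public([obj_local/0, ctg_direct/0, ctg_self/0]).

        % calls local object predicates only
        obj_local :-
            generate_list(20, List),
            list_length(List, _).

        % calls the imported category predicate using the ^^/1 control construct
        ctg_direct :-
            ^^work.

        % calls the imported category predicate using the ::/1 control construct
        ctg_self :-
            ::work.

        generate_list(0, []) :-
            !.
        generate_list(N, [N| Tail]) :-
            M is N - 1,
            generate_list(M, Tail).

        list_length([], 0).
        list_length([_| Tail], Length) :-
            list_length(Tail, Length0),
            Length is Length0 + 1.

    :- end_object.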

Dynamic code test goals

Dynamic code tests include both object database updates and creating and abolishing dynamic objects. The benchmarks example provides an object named database, which defines a set of predicates for testing the Logtalk built-in database methods as described below. The following goals are used for the benchmark tests:

d1: create_object(xpto, [], [], []), abolish_object(xpto)

d2: plain_dyndb(_)

d3: database::this_dyndb(_)

d4: database::self_dyndb(_)

d5: database::obj_dyndb(_)

The first test simply creates and then abolishes a (dynamic) object. The remaining tests benchmark object database updates, comparing them with plain Prolog database updates. The *_dyndb tests simply assert (using assertz/1) and then retract (using retract/1) a clause of a dynamic predicate with arity one. The plain_dyndb(_) test uses the Prolog built-in database predicates. The other three tests use the Logtalk built-in database methods, using direct method calls (this_dyndb(_)), calls using the ::/1 control construct (self_dyndb(_)), and calls using the ::/2 control construct (obj_dyndb(_)), as sketched below.
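
A minimal sketch of the kind of code these tests exercise follows. This is illustrative only; the dynamic predicate name is an assumption and the actual example code may differ:

    % plain Prolog version: assert and then retract a clause of a dynamic predicate
    :- dynamic(db_pred/1).

    plain_dyndb(N) :-
        assertz(db_pred(N)),
        retract(db_pred(N)).

    % Logtalk object using the built-in database methods
    :- object(database).

        :- public([this_dyndb/1, self_dyndb/1, obj_dyndb/1]).

        :- private(db_pred/1).
        :- dynamic(db_pred/1).

        % direct calls to the database methods
        this_dyndb(N) :-
            assertz(db_pred(N)),
            retract(db_pred(N)).

        % database methods called using the ::/1 control construct
        self_dyndb(N) :-
            ::assertz(db_pred(N)),
            ::retract(db_pred(N)).

        % database methods called using the ::/2 control construct
        obj_dyndb(N) :-
            database::assertz(db_pred(N)),
            database::retract(db_pred(N)).

    :- end_object.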

Static code benchmark results

Results are given in number of calls per second. The last column of each table shows the trade-off between plain Prolog and Logtalk. Dynamic binding is never used in the Prolog module tests.

Static binding (no events support)

Prolog compiler        s11         s12         s13         s13/s11
CxProlog 0.98.3        475163      -           415265      87.4 %
ECLiPSe 7.0#57         2542284     1695466     2460666     96.8 %
GNU Prolog 1.6.0       3571429     -           3333333     93.3 %
SICStus Prolog 4.9.0   27027027    25641026    27027027    100.0 %
SWI-Prolog 9.3.14      1586999     1559722     1562695     98.5 %
Trealla Prolog 2.60.0  131166      124374      120015      91.5 %
XSB 5.0.0              3125000     3030303     2941176     94.1 %
YAP 7.6.0              1785714     1724138     1754386     98.2 %

Prolog compiler        s21         s22         s23         s23/s21
CxProlog 0.98.3        89545       -           85407       95.4 %
ECLiPSe 7.0#57         135901      122859      120678      88.8 %
GNU Prolog 1.6.0       187617      -           198413      105.8 %
SICStus Prolog 4.9.0   766284      732601      768049      100.2 %
SWI-Prolog 9.3.14      84190       83572       82314       97.8 %
Trealla Prolog 2.60.0  11473       11328       10520       91.7 %
XSB 5.0.0              176991      176367      171527      96.9 %
YAP 7.6.0              97371       96339       95785       98.4 %

Prolog compiler        s31         s32         s33         s33/s31
CxProlog 0.98.3        150145      -           141462      94.2 %
ECLiPSe 7.0#57         436687      403474      407477      93.3 %
GNU Prolog 1.6.0       213675      -           204082      95.5 %
SICStus Prolog 4.9.0   1414427     1342282     1335113     94.4 %
SWI-Prolog 9.3.14      205188      205070      200929      97.9 %
Trealla Prolog 2.60.0  59145       58214       52192       88.2 %
XSB 5.0.0              359712      359712      346021      96.2 %
YAP 7.6.0              273224      276243      255754      93.6 %

Prolog compiler        s41         s42         s43         s43/s41
CxProlog 0.98.3        48027       -           36524       76.0 %
ECLiPSe 7.0#57         89284       79597       84087       94.2 %
GNU Prolog 1.6.0       55432       -           52521       94.7 %
SICStus Prolog 4.9.0   352113      345185      354359      100.6 %
SWI-Prolog 9.3.14      48796       40805       47235       96.8 %
Trealla Prolog 2.60.0  14647       14310       12764       87.1 %
XSB 5.0.0              97087       97087       92937       95.7 %
YAP 7.6.0              59737       59737       57571       96.4 %

Category benchmark results

Number of calls per second. The last column shows the trade-off between static binding (c2) and dynamic binding (c3) when calling category predicates.

Prolog compiler        c1          c2          c3          c3/c2
CxProlog 0.98.3        227929      226007      216390      95.7 %
ECLiPSe 7.0#57         843955      825235      581380      70.4 %
GNU Prolog 1.6.0       1449275     1449275     952381      65.7 %
SICStus Prolog 4.9.0   4524887     4545455     2604167     57.3 %
SWI-Prolog 9.3.14      496887      491072      460547      93.8 %
Trealla Prolog 2.60.0  90936       90222       74027       82.0 %
XSB 5.0.0              1098901     1075269     943396      87.7 %
YAP 7.6.0              617284      598802      558659      93.3 %

Dynamic code benchmark results

Number of calls per second. The last column shows the trade-off between plain Prolog (d2) and Logtalk using static binding (d3).

Prolog compiler        d1          d2          d3          d4          d5          d3/d2
CxProlog 0.98.3        126         143847      135044      118913      122753      93.9 %
ECLiPSe 7.0#57         6868        970814      933650      497683      941957      96.2 %
GNU Prolog 1.6.0       18501       2325581     2222222     854701      2127660     95.6 %
SICStus Prolog 4.9.0   17269       2967359     2314815     1623377     2762431     78.0 %
SWI-Prolog 9.3.14      7547        726475      678440      528259      686629      93.4 %
Trealla Prolog 2.60.0  2880        332515      298230      205476      290353      89.7 %
XSB 5.0.0              2535        318471      316456      284091      330033      99.4 %
YAP 7.6.0              2297        145560      148810      131062      148588      102.2 %

Remarks

  • It's surprisingly difficult to get stable results, especially with some Prolog compilers. One of the reasons seems to be the operating system's constant shuffling of processes between cores.
  • Some results are odd, either above the expected maximum (100% of plain Prolog performance) or lower than what is reasonable to expect. This happens mostly with the simplest benchmark goals. Benchmarks where a more significant amount of work is performed seem to be more (but not completely) immune to these issues.
  • All benchmark tests use the default memory allocation for the different program areas. Changing the size of these program areas can have a big impact on the benchmark results (e.g. increasing the stack size to avoid wasting time expanding the stack or doing garbage collection).
  • Logtalk usually performs better with Prolog compilers featuring mature virtual machines than with Prolog compilers featuring younger and less optimized virtual machines. The presence, in Logtalk compiled code, of a hidden execution context predicate argument is a particularly sensitive point in virtual machine optimization, as this extra argument is usually passed unchanged between local predicate calls.
  • These benchmark tests are too few and too limited to effectively compare Prolog compiler performance. Notably, some of the Prolog versions used are development versions, as the latest stable version is either too old or contains critical bugs.
  • Processor caches sometimes result in test results one order of magnitude better than those posted above.
  • Some of the Prolog built-in predicates used for measuring CPU time are not as accurate as we would like. Despite each benchmark goal being proved 100000 times by default, repeating the tests always shows some variation in the final results. Increasing or decreasing the number of repetitions may help in getting more stable results.