Several performance enhancements have been added:
- A new opcode was added to perform the initial setup for `with` statements, looking up the `__enter__()` and `__exit__()` methods. (Contributed by Benjamin Peterson.)
- The garbage collector now performs better for one common usage pattern: when many objects are being allocated without deallocating any of them. This would previously take quadratic time for garbage collection, but now the number of full garbage collections is reduced as the number of objects on the heap grows. The new logic only performs a full garbage collection pass when the middle generation has been collected 10 times and when the number of survivor objects from the middle generation exceeds 10% of the number of objects in the oldest generation. (Suggested by Martin von Löwis and implemented by Antoine Pitrou; bpo-4074.)
- The garbage collector tries to avoid tracking simple containers which can’t be part of a cycle. In Python 2.7, this is now true for tuples and dicts containing atomic types (such as ints, strings, etc.). Transitively, a dict containing tuples of atomic types won’t be tracked either. This helps reduce the cost of each garbage collection by decreasing the number of objects to be considered and traversed by the collector. (Contributed by Antoine Pitrou; bpo-4688.)
- Long integers are now stored internally either in base `2**15` or in base `2**30`, the base being determined at build time. Previously, they were always stored in base `2**15`. Using base `2**30` gives significant performance improvements on 64-bit machines, but benchmark results on 32-bit machines have been mixed. Therefore, the default is to use base `2**30` on 64-bit machines and base `2**15` on 32-bit machines; on Unix, there’s a new configure option `--enable-big-digits` that can be used to override this default.

  Apart from the performance improvements, this change should be invisible to end users, with one exception: for testing and debugging purposes there’s a new structseq `sys.long_info` that provides information about the internal format, giving the number of bits per digit and the size in bytes of the C type used to store each digit. (Contributed by Mark Dickinson; bpo-4258.)

  Another set of changes made long objects a few bytes smaller: 2 bytes smaller on 32-bit systems and 6 bytes on 64-bit. (Contributed by Mark Dickinson; bpo-5260.)
- The division algorithm for long integers has been made faster by tightening the inner loop, doing shifts instead of multiplications, and fixing an unnecessary extra iteration. Various benchmarks show speedups of between 50% and 150% for long integer divisions and modulo operations. (Contributed by Mark Dickinson; bpo-5512.) Bitwise operations are also significantly faster (initial patch by Gregory Smith; bpo-1087418).
- The implementation of `%` checks for the left-side operand being a Python string and special-cases it; this results in a 1–3% performance increase for applications that frequently use `%` with strings, such as templating libraries. (Implemented by Collin Winter; bpo-5176.)
- List comprehensions with an `if` condition are compiled into faster bytecode. (Patch by Antoine Pitrou, back-ported to 2.7 by Jeffrey Yasskin; bpo-4715.)
- Converting an integer or long integer to a decimal string was made faster by special-casing base 10 instead of using a generalized conversion function that supports arbitrary bases. (Patch by Gawain Bolton; bpo-6713.)
- The `split()`, `replace()`, `rindex()`, `rpartition()`, and `rsplit()` methods of string-like types (strings, Unicode strings, and `bytearray` objects) now use a fast reverse-search algorithm instead of a character-by-character scan. This is sometimes faster by a factor of 10. (Added by Florent Xicluna; bpo-7462 and bpo-7622.)
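
The container-tracking change described in the list above can be observed with `gc.is_tracked()`, which was added alongside it in Python 2.7. The following is a minimal sketch rather than a benchmark; the helper name `tracking_report` and the sample objects are made up for illustration. Results for dicts and tuples of atomic values are printed without a stated expectation, because they depend on the interpreter version and, for tuples, on whether a collection pass has already run.

```python
import gc


def tracking_report(objects):
    """Return {label: bool} telling whether the collector tracks each object.

    `objects` is an iterable of (label, object) pairs; the helper itself is
    hypothetical and only wraps gc.is_tracked().
    """
    return {label: gc.is_tracked(obj) for label, obj in objects}


samples = [
    ("int", 0),                                # atomic: never tracked
    ("str", "a"),                              # atomic: never tracked
    ("empty list", []),                        # lists are always tracked
    ("dict holding a list", {"a": []}),        # may be part of a cycle
    ("dict of atomic values", {"a": 1}),       # version-dependent
    ("tuple of atomic values", tuple([1, 2])), # version- and timing-dependent
]

report = tracking_report(samples)
for label, _ in samples:
    print("%-24s tracked=%s" % (label, report[label]))
```

Untracked containers are skipped during collection, which is where the reduced per-collection cost comes from.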
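
The internal digit size chosen at build time can be inspected through the `sys.long_info` structseq mentioned in the long-integer item above. The sketch below reads its two fields and, as an illustration only, estimates how many internal digits a given value occupies. The function `digits_needed` is a hypothetical helper, and the fallback to `sys.int_info` is an assumption made so the snippet also runs on Python 3, where the structseq goes by that name.

```python
import sys

# sys.long_info is the Python 2.7 name; Python 3 exposes the same two
# fields as sys.int_info, so fall back to that when long_info is absent.
info = getattr(sys, "long_info", None) or sys.int_info

bits = info.bits_per_digit      # 15 or 30, fixed when the interpreter is built
digit_bytes = info.sizeof_digit # size in bytes of the C type holding one digit


def digits_needed(n, bits_per_digit=bits):
    """Estimate the number of base-2**bits_per_digit digits in abs(n).

    Hypothetical helper: ceiling division of the bit length by the digit
    width. Zero needs no digits under this calculation.
    """
    return -(-abs(n).bit_length() // bits_per_digit)


print("bits per digit: %d, bytes per digit: %d" % (bits, digit_bytes))
for value in (0, 1, 2 ** bits - 1, 2 ** bits, 10 ** 30):
    print("%d -> %d digit(s)" % (value, digits_needed(value)))
```

A build configured with `--enable-big-digits` can be checked the same way, by reading `bits_per_digit`.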