Monday, August 9

You don't tug on Superman's cape...

My long-time colleague, L. Peter Deutsch, has taken on a project that might change things for Python users. LPD is, of course, the man who made Smalltalk go too fast [1][2], the reason Adobe invented PostScript Level 2 (to keep him busy), the man who wrote the Lisp 1.5 interpreter for the PDP-1 while still in short pants. LPD is going to make Python go fast. Don't bet on Psyco.

Here's a long message from Peter regarding his plans, posted here by permission (click the permalink at the end of this post to see the whole thing):


pycore is a project to create a new implementation, also called pycore, of the Python language and libraries. It has the following goals, roughly in descending order of importance:

  1. Radically improve the performance of many Python programs.

  2. Reimplement as many C-coded Python libraries as possible in Python while retaining acceptable performance.

  3. Be able to run any Python program (some possibly slower than CPython) that does not:

    1. Depend on libraries implemented in C that haven't been recoded in Python;

    2. Use some of the more arcane customization facilities;

    3. Depend on being able to manipulate 'int' and 'long' as separate types, rather than having the implementation choose how integers are stored;

    4. Subclass any of the built-in types (bool, int, long, tuple, list, str, unicode, and possibly others).

pycore works by translating compiled Python bytecode to the bytecode of VisualWorks, the Cincom Smalltalk implementation. The VisualWorks JIT compiler is a mature, high-performance engine that is undergoing constant improvement, specifically optimized for a non-type-declared object-oriented language with inheritance: it is a good match for (the normal usage patterns of) Python.


pycore actually includes three different execution mechanisms:


  • A Python bytecode interpreter;

  • A Python-to-VW bytecode translator that represents all objects as dictionaries, and does explicit dictionary lookups for every attribute access (both data and method);

  • A Python-to-VW translator that represents (most) data attributes as Smalltalk instance variables and (most) methods as Smalltalk methods.
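To make the difference between the two translators concrete, here is a minimal Python sketch of the two object representations. It is an illustration only: the class names (DictObject, SlotLayout, SlotObject) are hypothetical and are not taken from pycore, which targets VisualWorks bytecode rather than Python.

```python
# Hypothetical illustration; none of these names come from pycore.

class DictObject:
    """Simple-translator style: every attribute lives in a per-object dictionary."""

    def __init__(self, **attrs):
        self.fields = dict(attrs)

    def get(self, name):
        # One hash lookup on every single access.
        return self.fields[name]


class SlotLayout:
    """Optimized style: attribute names are mapped to fixed indexes once."""

    def __init__(self, *names):
        self.index = {name: i for i, name in enumerate(names)}


class SlotObject:
    def __init__(self, layout, *values):
        self.layout = layout
        self.slots = list(values)

    def get_at(self, i):
        # Plain indexed load, roughly analogous to reading an instance variable.
        return self.slots[i]


point_d = DictObject(x=3, y=4)
layout = SlotLayout("x", "y")
point_s = SlotObject(layout, 3, 4)

x_index = layout.index["x"]       # resolved once, at "translation" time
dict_x = point_d.get("x")         # name lookup at every access
slot_x = point_s.get_at(x_index)  # no name lookup at access time
```

The point of the second form is that the name-to-position mapping is paid for once, so the common case costs no more than an indexed read.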

The interpreter is currently complete, except for 'exec'; the simple translator is substantially behind; and the optimized translator is only at the design stage. Nevertheless, some Python programs run faster even with the pycore simple translator than with CPython, for example:


  • Recursive fibonacci function, 9x faster

  • Iterating over a large list of integers, 5x faster

  • Creating a list element-by-element, 2x faster

  • Accessing an attribute by calling a method, 2.5x faster

On the other hand, replicating a collection (1000000 * 'x') is 7x slower.
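The message does not include the benchmark sources, so the following is only a guess at what such micro-benchmarks might look like, written for a current Python 3. The function names, problem sizes, and the use of timeit are all assumptions, and the timings it prints say nothing about pycore itself.

```python
import timeit


def fib(n):
    # Recursive Fibonacci: dominated by function-call overhead.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def iterate(values):
    # Iterate over a list of integers.
    total = 0
    for v in values:
        total += v
    return total


def build(n):
    # Create a list element by element.
    out = []
    for i in range(n):
        out.append(i)
    return out


class Box:
    def __init__(self, value):
        self._value = value

    def value(self):
        # Attribute access wrapped in a method call.
        return self._value


def read_via_method(box, n):
    last = None
    for _ in range(n):
        last = box.value()
    return last


def replicate(n):
    # Replicate a collection, as in (1000000 * 'x').
    return n * 'x'


BIG_LIST = list(range(100000))

CASES = {
    "recursive fibonacci": lambda: fib(15),
    "iterate over list": lambda: iterate(BIG_LIST),
    "build list": lambda: build(100000),
    "attribute via method": lambda: read_via_method(Box(1), 100000),
    "replicate collection": lambda: replicate(1000000),
}


def run(repeat=3):
    # Best-of-N wall-clock time for each case, in seconds.
    return {name: min(timeit.repeat(fn, number=1, repeat=repeat))
            for name, fn in CASES.items()}


if __name__ == "__main__":
    for name, seconds in run().items():
        print(f"{name}: {seconds:.6f} s")
```

Running the same file under two implementations and dividing the timings would give ratios of the kind quoted above.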


So there are many challenges ahead.


We know of 5 other current projects with somewhat similar goals.


  1. Psyco is a fine-grained JIT compiler with dynamic customization. It should do much better than the VW JIT on numeric and string/array inner loops; however, its performance on method invocation is poor. In contrast, the VW JIT has very efficient invocation.

  2. PyPy aims to recode the Python interpreter and libraries in Python, and then use unspecified compiler technology to create a fully compiled system. pycore should be able to leverage the recoded libraries.

  3. Jython is a Java implementation of Python. While it compiles Python to Java, it discards most of Python's unique abilities in doing so (e.g., the ability to add attributes to any object, the ability to change the bindings of methods at run time, all the customization hooks, etc.). pycore does not need to discard any of these abilities: in principle, we believe we could support *all* of Python's extensive customization facilities without losing any performance in the usual cases.

  4. IronPython is a compiled implementation of Python on top of Microsoft's Common Language Runtime (CLR). Its author recently joined the Microsoft CLR group. It is in an early stage of development.

  5. Pirate is a Python compiler that targets the Parrot dynamic-language virtual machine. It is in a very early state of development. The pycore interpreter should be able to run all the test code on the Pirate Web site, and the simple compiler isn't very far behind.

There are surely others we don't know about.
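For readers who don't use them daily, the Python abilities named in the Jython item (adding attributes to any object, changing method bindings at run time) look roughly like this. The Greeter class and the names around it are made up for the illustration.

```python
import types


class Greeter:
    def greet(self):
        return "hello"


a = Greeter()
b = Greeter()

# Add an attribute to one particular object; b does not get it.
a.nickname = "pd"


# Rebind a method on the class at run time: every instance sees the change.
def shout(self):
    return "HELLO"


Greeter.greet = shout
class_level = b.greet()

# Rebind a method on a single instance only.
a.greet = types.MethodType(lambda self: "hi from " + self.nickname, a)
instance_level = a.greet()
unchanged = b.greet()
```

Any implementation that compiles method calls to something faster than a dictionary lookup has to stay correct when a program does this.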


pycore is currently a one-person project. Depending on what happens with the other projects listed above (especially PyPy), it may never get any bigger than this. Indeed, there's no commitment that the present person will ever deliver anything, although if he gets tired of it, he'll make sure that it gets out into the world with an Open Source license so anyone else interested can pick it up.



And when the hackers all get together at night

You know they all call Peter boss.


[1] Of course, Smalltalk was too fast already, even before Peter made it faster.

[2] I helped. See the POPL Paper.


7 comments:

verbat said...

Nice to hear for all the python users.

BTW, I'd be interested in seeing how fast pycore runs when exercising Python-specific features that VW does not directly support (MI, generators and the like).
Is the VW jit so generic as to support them without problems?

Anyway, now I'm expecting to see Ruby on VW. given that ruby is mostly smalltalk without colons that should be easy :)

Anonymous said...

The VM isn't free (as in beer), is it?
http://webseitz.fluxent.com/wiki/SmallTalk

Anonymous said...

You wrote "Jython is a Java implementation of Python. While it compiles Python to Java, it discards most of Python's unique abilities in doing so (e.g., the ability to add attributes to any object, the ability to change the bindings of methods at run time, all the customization hooks, etc.) pycore does not need to discard any of these abilities: in principle, we believe we could support *all* of Python's extensive customization facilities without losing any performance in the usual cases.
"


Jython doesn't discard any of these things. You can't add methods and attributes to imported Java objects, but you can't add them to imported C objects in Python either.

I have (well, had, it's all C-Python now for performance reasons) a large body of Jython GUI code which used __getattr__ and other magic methods, used 'exec', and which was heavily dependent on being able to install new methods in classes, and in individual objects at runtime.

Bryn
http://www.xoltar.org

AMS said...

Free: The Cincom product isn't free for commercial use, but you *can* download it for free for "personal use".

VM Flexibility: Well, that's the big point of the effort: to figure out what the differences between the Smalltalk and Python object models imply for the VM. I can say that the VisualWorks VM (which descends from the original Blue Book VM developed at PARC in the early 80's) has more generality than has ever been exploited. [Conflict of Interest note -- Peter and I, with a lot of help, were the original developers of this VM when we were at ParcPlace Systems in the late 80's.]

Anonymous said...

L. Peter Deutsch here, without a blog account yet.

Bryn said "Jython doesn't discard any of [Python's unique customization abilities]." Well, the most recent version of Jython I was able to find was a nightly CVS snapshot dated 20030111. (The most recent released version, 2.1, is dated 12/31/2001.) This snapshot lacks at least the following, based on a grep of every file in the distribution:

__getattribute__
__metaclass__
__slots__
storage of exception data in the frame (f_exc_* attributes)
properties (__get__ et al)

There is currently a performance challenge going on between CPython and Parrot. I have the test files, and they use at least the following:

__metaclass__
__getattribute__
the ability to change __bases__ and __class__ dynamically

(I don't know whether Jython supports the last of these.)
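As an illustration of the last item, here is a small sketch of changing __bases__ and __class__ on the fly. The class names are invented, and the code is written for a current Python 3 rather than the 2.x versions under discussion.

```python
class Base:
    pass


class Walker(Base):
    def move(self):
        return "walk"


class Runner(Base):
    def move(self):
        return "run"


class Athlete(Walker):
    pass


someone = Athlete()
before = someone.move()        # resolved through Walker

# Swap the superclass of an existing class; live instances follow.
Athlete.__bases__ = (Runner,)
after_bases = someone.move()   # now resolved through Runner

# Change the class of one existing object.
someone.__class__ = Walker
after_class = someone.move()   # Walker again, for this object only
```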

I can imagine some future version of Jython supporting all of these things, as long as Jython continues to be essentially a Python interpreter (e.g., accessing all Python attributes through explicit dictionary lookups). pycore is far more ambitious: in the normal case, a Python method invocation will be a Smalltalk message send, and a Python attribute access will be (almost) a Smalltalk instance variable reference, with absolutely no additional overhead. The price for this, of course, will be a more complex implementation, and heavy costs for the use of certain customization features (e.g., dynamic addition or deletion of a __getattribute__ method may cause the recompilation of the class and all subclasses).
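The kind of event that would force such a recompilation can be sketched as follows. This only shows the Python-level behavior (installing and removing a __getattribute__ hook on a live class); the Account class and the traced function are hypothetical, and nothing here reflects how pycore would handle it internally.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance


acct = Account(100)
log = []


def traced(self, name):
    # Record the access, then fall back to the default lookup.
    log.append(name)
    return object.__getattribute__(self, name)


plain = acct.balance                # ordinary attribute access

Account.__getattribute__ = traced   # hook added at run time
hooked = acct.balance               # now routed through traced()

del Account.__getattribute__        # hook removed again
restored = acct.balance             # back to the ordinary path
```

Between the assignment and the del, every attribute read on every Account instance has to go through the hook, which is why a fast path compiled earlier could no longer be trusted.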

Needless to say, I don't intend to implement all of the arcane features from the start -- only persuade myself that I'm not doing anything that would prevent them from being implemented later.

Anonymous said...

L Peter Deutsch said:

"
snapshot lacks at least the following, based on a grep of every file in the distribution:

__getattribute__
__metaclass__
__slots__
storage of exception data in the frame (f_exc_* attributes)
properties (__get__ et al)
"

Yes, you're right - I basically stopped paying attention to Python after version 2.1, and those are all features that were introduced after that point. Sorry about that. Jython development has definitely been lagging behind a bit. In my defense, I have to say that the features AMS listed were all in 2.1, and do work in Jython.


and: "
__metaclass__
__getattribute__
the ability to change __bases__ and __class__ dynamically

(I don't know whether Jython supports the last of these.)
"


I believe it does support changing __bases__ and __class__, perhaps with the exception that the Jython class in question can't be derived from a Java class, but I'm not sure. Jython does support the older style of Python metaclasses, aka "the Don Beaudry hack".

I wish you luck, I'm sure it will be a great and useful project.

Bryn
http://www.xoltar.org

Strategy Game Guides said...

Surely I mean who on earth could possibly have anything to do with Superman. Which reminds, has anyone seen the new clip from Injustice?