Thread: Cloning threading.py using processes

Cloning threading.py using processes
2006-10-10 21:49:50
Fredrik Lundh <fredrik@pythonware.com> wrote:
>
> Josiah Carlson wrote:
>
> > Presumably with this library you have created, you have also written a
> > fast object encoder/decoder (like marshal or pickle).  If it isn't any
> > faster than cPickle or marshal, then users may bypass the module and opt
> > for fork/etc. + XML-RPC
>
> XML-RPC isn't close to marshal and cPickle in performance, though, so
> that statement is a bit misleading.

You are correct, it is misleading, and relies on a few unstated
assumptions.

In my own personal delving into process splitting, RPC, etc., I usually
end up with one of two cases; I need really fast call/return, or I need
not slow call/return.  The not slow call/return is (in my opinion)
satisfactorily solved with XML-RPC.  But I've personally not been
satisfied with the speed of any remote 'fast call/return' packages, as
they usually rely on cPickle or marshal, which are slow compared to
even moderately fast 100mbit network connections.  When we are talking
about local connections, I have even seen cases where the
cPickle/marshal calls can make it so that forking the process is faster
than encoding the input to a called function.

I've had an idea for a fast object encoder/decoder (with limited support
for certain built-in Python objects), but I haven't gotten around to
actually implementing it as of yet.


> the really interesting thing here is a ready-made threading-style API, I
> think.  reimplementing queues, locks, and semaphores can be a reasonable
> amount of work; might as well use an existing implementation.

Really, it is a matter of asking what kind of API is desirable.  Do we
want to have threading plus other stuff be the style of API that we want
to replicate?  Do we want to have shared queue objects, or would an
XML-RPC-esque remote.queue_put('queue_X', value) and
remote.queue_get('queue_X', blocking=1) be better?
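
To make the second option concrete, here is a minimal sketch of what an XML-RPC-esque named-queue service might look like. It is written for Python 3 (xmlrpc.server rather than the 2.x-era module names), and the QueueService class, the serve() helper, and port 8000 are all hypothetical names chosen for illustration, not part of any library discussed in this thread.

```python
import queue
from xmlrpc.server import SimpleXMLRPCServer


class QueueService:
    """Named queues exposed through plain method calls (hypothetical API)."""

    def __init__(self):
        self._queues = {}

    def _get(self, name):
        # Create the named queue on first use.
        return self._queues.setdefault(name, queue.Queue())

    def queue_put(self, name, value):
        self._get(name).put(value)
        return True  # XML-RPC cannot marshal None unless allow_none is set

    def queue_get(self, name, blocking=True):
        # With blocking false, an empty queue raises queue.Empty, which
        # an XML-RPC client would see as a Fault.
        return self._get(name).get(block=bool(blocking))


def serve(host="localhost", port=8000):
    """Expose a QueueService over XML-RPC; this call blocks forever."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_instance(QueueService())
    server.serve_forever()
```

A client would then call something like xmlrpc.client.ServerProxy("http://localhost:8000").queue_put("queue_X", 42). One caveat of this sketch: SimpleXMLRPCServer handles one request at a time, so a blocking queue_get on an empty queue would stall every other caller; a real service would need a threaded server or a timeout.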


 - Josiah

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/nessto%40sharedlog.com
Cloning threading.py using processes
2006-10-11 08:23:40
Josiah Carlson wrote:
> Fredrik Lundh <fredrik@pythonware.com> wrote:
>> Josiah Carlson wrote:
>>
>>> Presumably with this library you have created, you have also written a
>>> fast object encoder/decoder (like marshal or pickle).  If it isn't any
>>> faster than cPickle or marshal, then users may bypass the module and opt
>>> for fork/etc. + XML-RPC
>> XML-RPC isn't close to marshal and cPickle in performance, though, so
>> that statement is a bit misleading.
>
> You are correct, it is misleading, and relies on a few unstated
> assumptions.
>
> In my own personal delving into process splitting, RPC, etc., I usually
> end up with one of two cases; I need really fast call/return, or I need
> not slow call/return.  The not slow call/return is (in my opinion)
> satisfactorily solved with XML-RPC.  But I've personally not been
> satisfied with the speed of any remote 'fast call/return' packages, as
> they usually rely on cPickle or marshal, which are slow compared to
> even moderately fast 100mbit network connections.  When we are talking
> about local connections, I have even seen cases where the
> cPickle/marshal calls can make it so that forking the process is faster
> than encoding the input to a called function.

This is hard to believe. I've been in that business for a few
years and so far have not found an OS/hardware/network combination
with the mentioned features.

Usually the worst part in performance breakdown for RPC is network
latency, i.e. time to connect, waiting for the packets to come through,
etc. and this parameter doesn't really depend on the OS or hardware
you're running the application on, but is more a factor of which
network hardware, architecture and structure is being used.

It also depends a lot on what you send as arguments, of course,
but I assume that you're not pickling a gazillion objects.


> I've had an idea for a fast object encoder/decoder (with limited support
> for certain built-in Python objects), but I haven't gotten around to
> actually implementing it as of yet.

Would be interesting to look at.

BTW, did you know about http://sourceforge.net/projects/py-xmlrpc/ ?

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 11 2006)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
Cloning threading.py using processes
2006-10-11 16:38:48
"M.-A. Lemburg" <mal@egenix.com> wrote:
>
> Josiah Carlson wrote:
> > Fredrik Lundh <fredrik@pythonware.com> wrote:
> >> Josiah Carlson wrote:
> >>
> >>> Presumably with this library you have created, you have also written a
> >>> fast object encoder/decoder (like marshal or pickle).  If it isn't any
> >>> faster than cPickle or marshal, then users may bypass the module and opt
> >>> for fork/etc. + XML-RPC
> >> XML-RPC isn't close to marshal and cPickle in performance, though, so
> >> that statement is a bit misleading.
> >
> > You are correct, it is misleading, and relies on a few unstated
> > assumptions.
> >
> > In my own personal delving into process splitting, RPC, etc., I usually
> > end up with one of two cases; I need really fast call/return, or I need
> > not slow call/return.  The not slow call/return is (in my opinion)
> > satisfactorily solved with XML-RPC.  But I've personally not been
> > satisfied with the speed of any remote 'fast call/return' packages, as
> > they usually rely on cPickle or marshal, which are slow compared to
> > even moderately fast 100mbit network connections.  When we are talking
> > about local connections, I have even seen cases where the
> > cPickle/marshal calls can make it so that forking the process is faster
> > than encoding the input to a called function.
>
> This is hard to believe. I've been in that business for a few
> years and so far have not found an OS/hardware/network combination
> with the mentioned features.
>
> Usually the worst part in performance breakdown for RPC is network
> latency, i.e. time to connect, waiting for the packets to come through,
> etc. and this parameter doesn't really depend on the OS or hardware
> you're running the application on, but is more a factor of which
> network hardware, architecture and structure is being used.

I agree, that is usually the case.  But for pre-existing connections,
remote or local (whether via socket or Unix domain socket), pickling
slows things down significantly.  What do I mean?  Set up a daemon that
reads and discards what is sent to it as fast as possible.  Then start
sending it plain strings (constructed via something like 32768*'').
Compare it to a somewhat equivalently sized pickle-as-you-go sender.
Maybe I'm just not doing it right, but I always end up with a slowdown
that makes me want to write my own fast encoder/decoder.
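
The encoding half of that experiment can be approximated without any sockets. The following is only a sketch under stated assumptions: it uses Python 3's pickle (which takes the place of cPickle), a made-up structured payload OBJ, and a hypothetical per_call() helper, and it measures just the cost of producing the bytes to send, not the network transfer or the discarding daemon.

```python
import pickle
import timeit

RAW = b"x" * 32768                               # plain payload, ready to send as-is
OBJ = ["hello world", 18, {1: 2}, 7.382] * 1024  # hypothetical structured payload


def raw_payload():
    # Nothing to encode: the bytes could be handed straight to sendall().
    return RAW


def pickled_payload():
    # "Pickle as you go": the object is encoded again on every send.
    return pickle.dumps(OBJ, protocol=pickle.HIGHEST_PROTOCOL)


def per_call(func, number=200):
    """Average seconds per call over `number` runs."""
    return timeit.timeit(func, number=number) / number


if __name__ == "__main__":
    print("pickled size: %d bytes" % len(pickled_payload()))
    print("raw:     %.7f s/call" % per_call(raw_payload))
    print("pickled: %.7f s/call" % per_call(pickled_payload))
```

The absolute numbers depend entirely on the machine and the payload, so the script prints the pickled size to let the reader judge how comparable the two senders really are.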


> It also depends a lot on what you send as arguments, of course,
> but I assume that you're not pickling a gazillion objects

According to tests on one of the few non-emulated Linux machines I have
my hands on, forking to a child process runs on the order of
0.0004-0.00055 seconds.  On that same machine, pickling...
    128*['hello world', 18, {1:2}, 7.382]
...takes ~0.0005 seconds.  512 somewhat mixed elements isn't a gazillion,
though in my case, I believe it was originally a list of tuples or
somesuch.
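
A comparison along these lines could be reproduced with a short script. This is a sketch, not the test that was actually run: time_pickle() and time_fork() are hypothetical helpers, Python 3's pickle is used in place of cPickle, and the fork timing includes the child's immediate exit and the parent's waitpid(), which may differ from what was measured above.

```python
import os
import pickle
import time
import timeit

DATA = 128 * ['hello world', 18, {1: 2}, 7.382]   # the list from the message


def time_pickle(number=1000):
    """Average seconds for one pickle.dumps(DATA)."""
    return timeit.timeit(lambda: pickle.dumps(DATA), number=number) / number


def time_fork(number=50):
    """Average seconds for fork() plus reaping a child that exits at once.

    Returns None where os.fork is unavailable (e.g. Windows).
    """
    if not hasattr(os, "fork"):
        return None
    start = time.perf_counter()
    for _ in range(number):
        pid = os.fork()
        if pid == 0:
            os._exit(0)          # child: leave immediately, skip cleanup
        os.waitpid(pid, 0)       # parent: reap the child
    return (time.perf_counter() - start) / number


if __name__ == "__main__":
    print("pickle: %.7f s" % time_pickle())
    fork_time = time_fork()
    if fork_time is not None:
        print("fork:   %.7f s" % fork_time)
```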

> > I've had an idea for a fast object encoder/decoder (with limited support
> > for certain built-in Python objects), but I haven't gotten around to
> > actually implementing it as of yet.
>
> Would be interesting to look at.

It would basically be something along the lines of cPickle, but would
only support the basic types of: int, long, float, str, unicode, tuple,
list, dictionary.

> BTW, did you know about http://sourceforge.net/projects/py-xmlrpc/ ?

I did not know about it.  But it looks interesting.  I'll have to
compile it for my (ancient) 2.3 installation and see how it does.  Thank
you for the pointer.


 - Josiah

Cloning threading.py using processes
2006-10-11 16:41:52
Josiah Carlson wrote:

> It would basically be something along the lines of cPickle, but would
> only support the basic types of: int, long, float, str, unicode, tuple,
> list, dictionary.

if you're aware of a way to do that faster than the current marshal
implementation, maybe you could work on speeding up marshal instead?

</F>

Cloning threading.py using processes
2006-10-11 16:59:30
    Josiah> It would basically be something along the lines of cPickle, but
    Josiah> would only support the basic types of: int, long, float, str,
    Josiah> unicode, tuple, list, dictionary.

Isn't that approximately marshal's territory?  If you can write a faster
encoder/decoder, it might well be possible to apply the speedup ideas to
marshal.
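
As a quick check of that overlap, the sketch below round-trips one value of each listed type through marshal. It assumes Python 3 spellings: int also covers the old long, bytes stands in for the old str, and str for the old unicode; the sample tuple is made up for illustration.

```python
import marshal

# Python 3 spellings of the types listed above: int (covers long),
# float, bytes (old str), str (old unicode), tuple, list, dict.
sample = (1, 2**70, 7.382, b"raw", "text", [1, 2, 3], {"1": 2, "3": 4})

encoded = marshal.dumps(sample)    # bytes in marshal's own format
decoded = marshal.loads(encoded)

assert isinstance(encoded, bytes)
assert decoded == sample
```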

Skip
Cloning threading.py using processes
2006-10-12 00:31:19
On 10/12/06, Josiah Carlson <jcarlson@uci.edu> wrote:
>
> It would basically be something along the lines of cPickle, but would
> only support the basic types of: int, long, float, str, unicode, tuple,
> list, dictionary.
>

Great idea! Check this thread for past efforts:

http://mail.python.org/pipermail/python-dev/2005-June/054313.html

The 'gherkin' module discussed there now lives in the cheeseshop as
part of the FibraNet package.

http://cheeseshop.python.org/pypi/FibraNet

I love benchmarks, especially when they come around for the second time.

I wrote a silly script which compares dumps performance between
different serialization modules for different simple objects using
Python 2.4.3. All figures are 'dumps per second'.

test: a tuple: ("a" * 1024,1.0,[1,2,3],{'1':2,'3':4})

gherkin:   10895.7762314
pickle:    6510.97245984
cPickle:   34218.5455317
marshal:   85562.2443672
xmlrpclib: 9468.0766772

test: a large string: 'a' * 10240

gherkin:   45955.4065455
pickle:    10209.0239868
cPickle:   13773.8138516
marshal:   24937.002069
xmlrpclib: Traceback

test: a small string: 'a' * 128

gherkin:   73453.0960495
pickle:    28357.0210654
cPickle:   122997.592425
marshal:   202428.776201
xmlrpclib: Traceback

test: a tuple of ints: tuple(range(64))

gherkin:   4522.06801154
pickle:    2273.12937965
cPickle:   23969.9306043
marshal:   143691.72582
xmlrpclib: 2781.3083894
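
The script itself was not posted, so the following is only a guess at its general shape, rewritten for Python 3 with standard-library modules. Assumptions: gherkin is left out because it is a third-party package, cPickle no longer exists as a separate module (Python 3's pickle uses the C implementation when available), and dumps_per_second() is a hypothetical helper. Note that xmlrpc.client.dumps() expects a tuple of parameters, which may be why the bare-string tests above show a Traceback for xmlrpclib; here every object is wrapped in a one-element tuple.

```python
import marshal
import pickle
import timeit
import xmlrpc.client

# Test objects modelled on the ones in the message above.
TESTS = {
    "a tuple": ("a" * 1024, 1.0, [1, 2, 3], {"1": 2, "3": 4}),
    "a large string": "a" * 10240,
    "a small string": "a" * 128,
    "a tuple of ints": tuple(range(64)),
}

# xmlrpc.client.dumps() requires its params argument to be a tuple,
# so each object is wrapped in a one-element tuple.
ENCODERS = {
    "pickle": pickle.dumps,
    "marshal": marshal.dumps,
    "xmlrpc.client": lambda obj: xmlrpc.client.dumps((obj,)),
}


def dumps_per_second(dumps, obj, number=2000):
    """Return how many times per second `dumps(obj)` completes."""
    elapsed = timeit.timeit(lambda: dumps(obj), number=number)
    return number / elapsed


if __name__ == "__main__":
    for label, obj in TESTS.items():
        print("test:", label)
        for name, dumps in ENCODERS.items():
            print("  %-14s %.1f" % (name + ":", dumps_per_second(dumps, obj)))
```

Results from this sketch are not comparable with the Python 2.4.3 figures above; it is meant only to show how such a comparison can be set up.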

Marshal is very quick for most cases, but still has this warning in
the documentation.

"""Warning: The marshal module is not intended to be secure against
erroneous or maliciously constructed data. Never unmarshal data
received from an untrusted or unauthenticated source."""

-Sw
Cloning threading.py using processes
2006-10-12 13:22:30
"M.-A. Lemburg" <mal@egenix.com> wrote:
>
> This is hard to believe. I've been in that business for a few
> years and so far have not found an OS/hardware/network combination
> with the mentioned features.

Surely you must have - unless there is another M.-A. Lemburg in IT!
Some of the specialist systems, especially those used for communication,
were like that, and it is very likely that many still are.  But they
aren't currently in Python's domain.  I have never used any, but have
colleagues who have.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1@cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679
Cloning threading.py using processes
2006-10-11 15:20:44
On 10/10/06, Josiah Carlson <jcarlson@uci.edu> wrote:
> > the really interesting thing here is a ready-made threading-style API, I
> > think.  reimplementing queues, locks, and semaphores can be a reasonable
> > amount of work; might as well use an existing implementation.
>
> Really, it is a matter of asking what kind of API is desirable.  Do we
> want to have threading plus other stuff be the style of API that we want
> to replicate?  Do we want to have shared queue objects, or would an
> XML-RPC-esque remote.queue_put('queue_X', value) and
> remote.queue_get('queue_X', blocking=1) be better?

Whatever the API is, I think it is useful if you can swap between
threads and processes just by changing the import line.  That way you
can write applications without deciding upfront which to use.
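
A minimal sketch of that idea, assuming Python 3 and the multiprocessing module that later entered the standard library with a threading-style API. The Worker alias and the square_all() and run() functions are hypothetical names used only for this illustration.

```python
# Thread-backed version.  Replacing the two imports below with
#     from multiprocessing import Process as Worker, Queue
# gives the process-backed equivalent with the same calls.
from threading import Thread as Worker
from queue import Queue


def square_all(numbers, results):
    """Put (n, n*n) on the results queue for each input number."""
    for n in numbers:
        results.put((n, n * n))


def run(numbers):
    results = Queue()
    worker = Worker(target=square_all, args=(numbers, results))
    worker.start()
    # Collect exactly one result per input before joining, so a
    # process-backed queue is drained before the worker exits.
    collected = [results.get() for _ in numbers]
    worker.join()
    return sorted(collected)


if __name__ == "__main__":
    print(run([1, 2, 3]))   # prints [(1, 1), (2, 4), (3, 9)]
```

The rest of the program never mentions threads or processes directly, which is what makes the swap possible; the cost is that everything passed through the queue must be picklable once processes are used.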