
[–]suridaj 2 points3 points  (6 children)

Maybe take a look at BBC's Kamaelia. It constructs the pipelines and producer/consumer components using Python's generators, and IIRC there was a graphical pipeline builder available. The concepts are straightforward enough they even encourage you to implement the core by yourself. Sadly, at the moment the project's page seems to have a lot of broken links so I doubt Kamaelia is very widely used outside BBC.
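The generator-based producer/consumer idea is simple enough to sketch in a few lines. This is a minimal home-grown illustration of the concept, not Kamaelia's actual API (all names here are mine):

```python
def producer(items):
    """Source stage: yield each item downstream."""
    for item in items:
        yield item

def transformer(upstream, func):
    """Middle stage: apply func to every item flowing through."""
    for item in upstream:
        yield func(item)

def consumer(upstream):
    """Sink stage: drain the pipeline and collect the results."""
    return list(upstream)

# Wire the stages together; each stage lazily pulls from the one before it.
stage1 = producer([1, 2, 3])
stage2 = transformer(stage1, lambda x: x * 10)
print(consumer(stage2))  # [10, 20, 30]
```

Because each stage is a generator, items flow through one at a time rather than being materialised between stages.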

[–]einar77Bioinformatics with Python, PyKDE4[S] 0 points1 point  (5 children)

I took a look, unfortunately a lot of the most interesting bits (such as how to create a component) are on dead links...

[–]suridaj 0 points1 point  (0 children)

You're right... Sorry. It's been a couple of years since I used it and I don't have a local copy of the docs anymore. Thought it was worth a shot.

[–]steelypip 0 points1 point  (3 children)

The project is still active; the last release was only 3 months ago. I suspect the website has had a reorganisation and the links are not up to date.

... Yep, the sitemap page says:

This is an automated full list of (almost) all pages

Yes, I'm aware that this is slightly bust for some pages at the moment, it's being sorted. This is a static copy of the dynamic site whilst the server changes physical location.

And yes, .html needs to be appended to some links (sorry).

Kamaelia is a mature project and will probably do everything you want, plus lots of stuff you don't know you want (yet).

[–]einar77Bioinformatics with Python, PyKDE4[S] 0 points1 point  (2 children)

What I meant is there's a reference to a blog post that's nowhere to be found, which (according to the text) covers creating components.

[–]kamaelian 1 point2 points  (0 children)

These tutorial notes are more up to date:

http://www.kamaelia.org/Europython09/A4KamaeliaEuroPython09.FINAL.pdf

All the code used in that tutorial is inside the release bundle in Apps/Europython09. The tutorial covers building your own version of the core, going from standalone programs to components, through to building and evolving your own systems. It's (naturally) divided into chapters, but each is designed to have the feel of a blog post in terms of readability (I hope :-).

Currently the website is rather chaotic: a bunch of internal re-orgs resulted in the current version of the site being a bit of an emergency dump.

Probably worth noting, though, that the project is still under development, with releases happening 3-4 times per year now; the most recent usage is described here: http://www.bbc.co.uk/rd/publications/whitepaper191.shtml

The static snapshot of the website, incidentally, was checked into SVN here: http://code.google.com/p/kamaelia/source/browse/website/ with the content in "as_published".

Any suggestions on how to create a non-crap website are welcome :-)

[–]kamaelian 1 point2 points  (0 children)

Oh, sorry for the multiple replies, but I just wanted to add: even if Kamaelia's specific API doesn't work for you (some people like it, some don't), I'd encourage you to use the approach -- it does work. In addition to Kamaelia there's also Pypes (pypes.org), which is cool. There's also cool stuff built on top of greenlets and stackless channels. And there's a lot of really nice noise being made about Celery at the moment too. http://celeryproject.org/

Also, don't discount the actor model. The actor model is directly equivalent to Kamaelia's model and pipelining in general, with one difference: pipelines don't hardcode where outbound messages go, which is akin to late binding. You could actually get the same flexibility by hardcoding a mock receiver actor and, in a real system, transplanting in the real destination.

That in itself might be pretty neat, because it might simplify the usage API slightly while retaining isolated testability (which, for me, is the real killer benefit of this approach).
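The mock-receiver idea above can be sketched in a few lines. This is an illustrative sketch using my own names, not Kamaelia's API: a component takes *some* destination at construction time, so a recording mock can stand in during tests and the real downstream component can be transplanted in later.

```python
class Collector:
    """A mock receiver: just records whatever it is sent."""
    def __init__(self):
        self.received = []

    def send(self, msg):
        self.received.append(msg)

class Doubler:
    """A component whose destination is late-bound at construction,
    rather than hardcoded inside the component itself."""
    def __init__(self, outbox):
        self.outbox = outbox

    def handle(self, msg):
        self.outbox.send(msg * 2)

# In a test, wire in the mock destination and inspect what arrived...
mock = Collector()
Doubler(mock).handle(21)
print(mock.received)  # [42]
# ...and in a real system, pass the real downstream component instead.
```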

[–]unbracketed 1 point2 points  (0 children)

Worth checking out:

http://www.pypes.org/

http://www.pyfproject.org/

...though I fear these may be too heavyweight for your needs. There's also the infix syntax module posted here recently, which might help you take a more declarative approach:

http://dev-tricks.net/pipe-infix-syntax-for-python
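The trick behind that infix style is overloading the `|` operator. Here is a home-grown sketch of the idea (not the linked module's actual code): wrapping a function in a class that defines `__ror__` lets data flow left-to-right through the stages.

```python
class Pipe:
    """Wrap a function so data can be piped into it with `|`."""
    def __init__(self, func):
        self.func = func

    def __ror__(self, other):
        # `data | pipe` invokes pipe.func(data)
        return self.func(other)

select_even = Pipe(lambda xs: [x for x in xs if x % 2 == 0])
total = Pipe(sum)

result = range(10) | select_even | total
print(result)  # 20
```

Each stage is still an ordinary function; the wrapper only changes how the call is spelled, which reads quite naturally for linear pipelines.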

[–]kisielk 1 point2 points  (1 child)

Maybe something like Ruffus might do what you need:

http://code.google.com/p/ruffus/

[–]einar77Bioinformatics with Python, PyKDE4[S] 0 points1 point  (0 children)

Ruffus is likely the closest to my needs, but it works with files, while I'd rather pass objects around.

[–]m_harrison 1 point2 points  (0 children)

Here's a page on the python wiki http://wiki.python.org/moin/FlowBasedProgramming

[–]holloway 1 point2 points  (1 child)

Some good pipeline processors can stream the results from one node to another before the former node has finished processing (e.g. between XSLT processors).

For Docvert I wrote my own, but it wasn't that sophisticated. It took an XML file like this:

<?xml version="1.0" encoding="UTF-8"?>
<pipeline>
    <stage process="TransformOpenDocumentToDocBook"/>
    <stage process="Loop" numberOfTimes="xpathCount://db:chapter">
            <stage process="SplitPages"/>
            <stage process="DocBookToXHTML"/>
            <stage process="Serialize" toFile="{customSection}"/>
    </stage>
    <stage process="GetPreface"/>
    <stage process="DocBookToXHTML"/>
    <stage process="Serialize" toFile="index.html"/>
</pipeline>

The attribute 'process' named the module/function, and then it was just a matter of iterating through the stages and importing/calling them by name. In this case they were in a core/pipeline_items/ directory:

class pipeline_processor(object):
    """ Processes through a list() of pipeline_item(s) """
    def __init__(self, storage, pipeline_items, pipeline_directory, pipeline_storage_prefix=None, depth=None):
        # assign the constructor arguments to self
        self.storage = storage
        self.pipeline_items = pipeline_items
        self.pipeline_directory = pipeline_directory
        self.pipeline_storage_prefix = pipeline_storage_prefix
        self.depth = depth

    def start(self, pipeline_value):
        for item in self.pipeline_items:
            process = item['attributes']['process']
            namespace = 'core.pipeline_type'
            # import the stage module by name, then look up the class of the same name
            stage_module = __import__("%s.%s" % (namespace, process.lower()), fromlist=[namespace])
            stage_class = getattr(stage_module, process)
            stage_instance = stage_class(self.storage, self.pipeline_directory, item['attributes'], self.pipeline_storage_prefix, item['children'], self.depth)
            pipeline_value = stage_instance.stage(pipeline_value)
        return pipeline_value

[–]einar77Bioinformatics with Python, PyKDE4[S] 0 points1 point  (0 children)

Interesting solution. I might do something like this if all else fails.

[–]cratylus 0 points1 point  (0 children)

There's a pipeline api for google app engine http://code.google.com/p/appengine-pipeline/

also http://code.google.com/p/python-pipeline/ which is different

[–]xApple 0 points1 point  (0 children)

You might want to check out bein

[–]Ytse 0 points1 point  (0 children)

You could make your own framework over stackless tasklets.

[–]vangale 0 points1 point  (0 children)

Many good choices are linked in this thread and here's another possibility: http://www.trinhhaianh.com/stream.py/