<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Labix Blog &#187; Erlang</title>
	<atom:link href="http://blog.labix.org/tag/erlang/feed" rel="self" type="application/rss+xml" />
	<link>http://blog.labix.org</link>
	<description>by Gustavo Niemeyer</description>
	<lastBuildDate>Mon, 16 Jan 2012 04:02:51 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Death of goroutines under control</title>
		<link>http://blog.labix.org/2011/10/09/death-of-goroutines-under-control</link>
		<comments>http://blog.labix.org/2011/10/09/death-of-goroutines-under-control#comments</comments>
		<pubDate>Sun, 09 Oct 2011 19:53:47 +0000</pubDate>
		<dc:creator>Gustavo Niemeyer</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Design]]></category>
		<category><![CDATA[Erlang]]></category>
		<category><![CDATA[Go]]></category>
		<category><![CDATA[Project]]></category>
		<category><![CDATA[Snippet]]></category>
		<category><![CDATA[Test]]></category>

		<guid isPermaLink="false">http://blog.labix.org/?p=717</guid>
		<description><![CDATA[Certainly one of the reasons why many people are attracted to the Go language is its first-class concurrency aspects. Features like communication channels, lightweight processes (goroutines), and proper scheduling of these are not only native to the language but are &#8230; <a href="http://blog.labix.org/2011/10/09/death-of-goroutines-under-control">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Certainly one of the reasons why many people are attracted to the <a href="http://golang.org">Go</a> language is its first-class concurrency aspects. Features like communication channels, lightweight processes (<i>goroutines</i>), and proper scheduling of these are not only native to the language but are integrated in a tasteful manner.</p>
<p><span id="more-717"></span>If you stay around listening to community conversations for a few days there&#8217;s a good chance you&#8217;ll hear someone proudly mentioning the tenet:</p>
<blockquote><p>Do not communicate by sharing memory; instead, share memory by communicating.</p></blockquote>
<p>There is a <a href="http://blog.golang.org/2010/07/share-memory-by-communicating.html">blog post</a> on the topic, and also a <a href="http://golang.org/doc/codewalk/sharemem/">code walk</a> covering it.</p>
<p>That model is very sensible, and being able to approach problems this way makes a significant difference when designing algorithms, but that&#8217;s not exactly news. What I address in this post is an open aspect we have today in Go related to this design: the <i>termination</i> of background activity.</p>
<p>As an example, let&#8217;s build a purposefully simplistic goroutine that sends lines across a channel:</p>
<pre>
type LineReader struct {
        Ch chan string
        r *bufio.Reader
}

func NewLineReader(r io.Reader) *LineReader {
        lr := &#038;LineReader{make(chan string), bufio.NewReader(r)}
        go lr.loop()
        return lr
}
</pre>
<p>The type has a channel from which the client can consume lines, and an internal buffer used to produce the lines efficiently. Then, we have a function that creates an initialized reader, fires the reading loop, and returns. Nothing surprising there.</p>
<p>Now, let&#8217;s look at the loop itself:</p>
<pre>
func (lr *LineReader) loop() {
        for {
                line, err := lr.r.ReadSlice('\n')
                if err != nil {
                        close(lr.Ch)
                        return
                }
                lr.Ch <- string(line)
        }
}
</pre>
<p>In the loop we'll grab a line from the buffer, close the channel in case of errors and stop, or otherwise send the line to the other side, perhaps blocking while the other side is busy with other activities. Should sound sane and familiar to Go developers.</p>
<p>There are two details related to the termination of this logic, though: first, the error information is being dropped, and second, there's no way to interrupt the procedure from outside in a clean way. The error might be easily logged, of course, but what if we wanted to store it in a database, or send it over the wire, or even handle it taking into account its nature? Stopping cleanly is also a valuable feature in many circumstances, like when one is driving the logic from a test runner.</p>
<p>I'm not claiming this is something <i>difficult</i> to do, by any means.  What I'm saying is that there isn't today an <i>idiom</i> for handling these aspects in a simple and consistent way. Or maybe there wasn't. The <i>tomb</i> package for Go is an experiment I'm releasing today in an attempt to address this problem.</p>
<p>The model is simple: a <i>Tomb</i> tracks whether the goroutine is alive, dying, or dead, and the death reason.</p>
<p>To understand that model, let's see the concept being applied to the LineReader example. As a first step, creation is tweaked to introduce Tomb support:</p>
<pre>
type LineReader struct {
        Ch chan string
        r *bufio.Reader
        <span style="color: blue">*tomb.Tomb</span>
}

func NewLineReader(r io.Reader) *LineReader {
        lr := &#038;LineReader{
                make(chan string),
                bufio.NewReader(r),
                <span style="color: blue">tomb.New(),</span>
        }
        go lr.loop()
        return lr
}
</pre>
<p>Looks very similar. Just a new field in the struct and its respective initialization. We've used it as an embedded field just so we can use the Tomb methods directly on the <i>lr</i> variable.</p>
<p>Next, the loop function is modified to support tracking of errors and interruptions:</p>
<pre>
func (lr *LineReader) loop() {
        <span style="color: blue">defer lr.Done()</span>
        for {
                line, err := lr.r.ReadSlice('\n')
                if err != nil {
                        close(lr.Ch)
                        <span style="color: blue">lr.Fatal(err)</span>
                        return
                }
                select {
                case lr.Ch <- string(line):
                <span style="color: blue">case <-lr.Dying:</span>
                        close(lr.Ch)
                        return
                }
        }
}
</pre>
<p>Note a few interesting points here: first, <i>Done</i> is called to track the goroutine termination right before the loop function returns. Then, the previously loose error now goes into the <i>Fatal</i> Tomb method, flagging the goroutine as dying. Finally, the channel send was tweaked so that it doesn't block in case the goroutine is dying for whatever reason.</p>
<p>A Tomb has both <i>Dying</i> and <i>Dead</i> channels, which are closed when the Tomb state changes accordingly. These channels make it possible to block explicitly until the state changes, and also to selectively unblock select statements in those cases, as done above.</p>
<p>With the loop modified as above, a Stop method can trivially be introduced to request the clean termination of the goroutine synchronously from outside:</p>
<pre>
func (lr *LineReader) Stop() os.Error {
        <span style="color: blue">lr.Fatal(tomb.Stop)</span>
        return <span style="color: blue">lr.Wait()</span>
}
</pre>
<p>In this case the <i>Fatal</i> method will put the goroutine in a dying state from outside, and <i>Wait</i> will block until the goroutine terminates itself and notifies via the <i>Done</i> method as seen before. This procedure behaves correctly even if the goroutine was already dead or in a dying state due to internal errors, because only the first call to Fatal with an actual error is recorded as the cause for the goroutine death. The <i>tomb.Stop</i> value is used as a reason when terminating cleanly without an actual error, and it causes Wait to return nil once the goroutine terminates, flagging a clean stop per common Go idioms.</p>
<p>(<b>UPDATE:</b> there was <a href="http://groups.google.com/group/golang-nuts/browse_thread/thread/383f7cabbb174460">a minor simplification</a> in the API since this post was originally written, and the paragraph above was adapted to cover the new API)</p>
<p>This is pretty much all that there is to it. When I started developing in Go I wondered if coming up with a good convention for this sort of problem would require more support from the language, such as some kind of goroutine state tracking in a similar way to what <a href="http://www.erlang.org/doc/reference_manual/processes.html">Erlang does</a> with its lightweight processes, but it turns out this is mostly a matter of organizing the workflow with existing building blocks.</p>
<p>The tomb package and its Tomb type are a tangible representation of a good convention for goroutine termination, with familiar method names inspired by existing idioms. If you want to make use of it, goinstall the package with:</p>
<pre>
$ goinstall launchpad.net/tomb
</pre>
<p>The API documentation with details is available at:</p>
<p><span style="padding-left: 2em;"><a href="http://goneat.org/lp/tomb">http://goneat.org/lp/tomb</a></span></p>
<p>Have fun!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.labix.org/2011/10/09/death-of-goroutines-under-control/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Efficient algorithm for expanding circular buffers</title>
		<link>http://blog.labix.org/2010/12/23/efficient-algorithm-for-expanding-circular-buffers</link>
		<comments>http://blog.labix.org/2010/12/23/efficient-algorithm-for-expanding-circular-buffers#comments</comments>
		<pubDate>Thu, 23 Dec 2010 12:57:40 +0000</pubDate>
		<dc:creator>Gustavo Niemeyer</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Article]]></category>
		<category><![CDATA[C/C++]]></category>
		<category><![CDATA[Erlang]]></category>
		<category><![CDATA[Go]]></category>
		<category><![CDATA[Haskell]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Lua]]></category>
		<category><![CDATA[Perl]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Snippet]]></category>

		<guid isPermaLink="false">http://blog.labix.org/?p=580</guid>
		<description><![CDATA[Circular buffers are based on an algorithm well known by any developer who&#8217;s got past the &#8220;Hello world!&#8221; days. They offer a number of key characteristics with wide applicability such as constant and efficient memory use, efficient FIFO semantics, etc. &#8230; <a href="http://blog.labix.org/2010/12/23/efficient-algorithm-for-expanding-circular-buffers">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Circular buffers are based on an algorithm well known by any developer who&#8217;s got past the <i>&#8220;Hello world!&#8221;</i> days.  They offer a number of key characteristics with wide applicability such as constant and efficient memory use, efficient FIFO semantics, etc.</p>
<p>One feature which is not always desired, though, is the fact that circular buffers traditionally will either overwrite the oldest element, or raise an overflow error, since they are generally implemented as a buffer of <i>constant</i> size.  This is an unwanted property when one is attempting to <i>consume</i> items from the buffer and it is not an option to blindly drop items, for instance.</p>
<p>This post presents an efficient (and potentially novel) algorithm for implementing circular buffers which preserves most of the key aspects of the traditional version, while also supporting dynamic expansion when the buffer would otherwise have its oldest entry overwritten. It&#8217;s not clear if the described approach is novel or not (most of my novel ideas seem to have been written down 40 years ago), so I&#8217;ll publish it below and let you decide.</p>
<p><span id="more-580"></span><b>Traditional circular buffers</b></p>
<p>Before introducing the variant which can actually expand during use, let&#8217;s go through a quick review of traditional circular buffers, so that we can then reuse the nomenclature when extending the concept.  All the snippets provided in this post are written in Python, as a better alternative to pseudo-code, but the concepts are naturally portable to any other language.</p>
<p>So, the most basic circular buffer needs the buffer itself, its total capacity, and a position where the next write should occur.  The following snippet demonstrates the concept in practice:</p>
<pre>
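# A fixed-size buffer and the index where the next write lands.
buf = [None, None, None, None, None]
bufcap = len(buf)
pushi = 0

for elem in range(7):
    buf[pushi] = elem
    # Wrap around to the start once the end is reached.
    pushi = (pushi + 1) % bufcap

print(buf) # => [5, 6, 2, 3, 4]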
</pre>
<p>In the example above, the first two elements of the series (0 and 1) were overwritten once the pointer wrapped around. That&#8217;s the specific feature of circular buffers which the proposal in this post will offer an alternative for.</p>
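<p>As a side note, Python&#8217;s standard library already offers this overwriting behavior: a <i>collections.deque</i> created with <i>maxlen</i> silently discards the oldest items once it is full. The sketch below is only a point of reference and not part of the algorithm discussed in this post; the name <i>ring</i> is made up for the example, and the deque reports its content in FIFO order rather than in raw slot order.</p>

```python
from collections import deque

# A bounded deque drops items from the opposite end once maxlen is reached.
ring = deque(maxlen=5)
for elem in range(7):
    ring.append(elem)

# Elements 0 and 1 were discarded, just like in the snippet above.
print(list(ring))  # => [2, 3, 4, 5, 6]
```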
<p>The snippet below provides a full implementation of the traditional approach, this time including both the pushing and popping logic, and raising an error when an overflow or underflow would occur.  Please note that these snippets are not necessarily idiomatic Python.  The intention is to highlight the algorithm itself.</p>
<pre>
class CircBuf(object):

    def __init__(self):
        self.buf = [None, None, None, None, None]
        self.buflen = self.pushi = self.popi = 0
        self.bufcap = len(self.buf)

    def push(self, x):
        assert self.buflen == 0 or self.pushi != self.popi, \
               "Buffer overflow!"
        self.buf[self.pushi] = x
        self.pushi = (self.pushi + 1) % self.bufcap
        self.buflen += 1

    def pop(self):
        assert self.buflen != 0, "Buffer underflow!"
        x = self.buf[self.popi]
        self.buf[self.popi] = None
        self.buflen -= 1
        self.popi = (self.popi + 1) % self.bufcap
        return x
</pre>
<p>With the basics covered, let&#8217;s look at how to extend this algorithm to support dynamic expansion in case of overflows.</p>
<p><b>Dynamically expanding a circular buffer</b></p>
<p>The approach consists in imagining that the same buffer can contain both a circular buffer area (referred to as <i>the ring area</i> from here on), and an overflow area, and that it is possible to transform a mixed buffer back into a pure circular buffer again.  To clarify what this means, some examples are presented below.  The full algorithm will be presented afterwards.</p>
<p>First, imagine that we have an empty buffer with a capacity of 5 elements as per the snippet above, and then the following operations take place:</p>
<pre>
circbuf = CircBuf()

for i in range(5):
    circbuf.push(i)

circbuf.pop() # => 0
circbuf.pop() # => 1

circbuf.push(5)
circbuf.push(6)

print(circbuf.buf) # => [<font style="color: blue">5, 6, 2, 3, 4</font>]
</pre>
<p>At this point we have a full buffer, and with the original implementation an additional push would raise an assertion error. To implement expansion, the algorithm will be changed so that those items will be appended at the end of the buffer.  Following the example, pushing two additional elements would behave the following way:</p>
<pre>
circbuf.push(7)
circbuf.push(8)

print(circbuf.buf) # => [<font style="color: blue">5, 6, 2, 3, 4,</font> <font style="color: red">7, 8</font>]
</pre>
<p>In that example, elements 7 and 8 are part of the overflow area, and the ring area remains with the same capacity and length as the original buffer. Let&#8217;s perform a few additional operations to see how it would behave when items are popped and pushed while the buffer is split:</p>
<pre>
circbuf.pop() # => 2
circbuf.pop() # => 3
circbuf.push(9)

print(circbuf.buf) # => [<font style="color: blue">5, 6,</font> None, None, <font style="color: blue">4,</font> <font style="color: red">7, 8, 9</font>]
</pre>
<p>In this case, even though there are two free slots available in the ring area, the last item pushed was still appended at the overflow area.  That&#8217;s necessary to preserve the FIFO semantics of the circular buffer, and means that the buffer may expand more than strictly necessary given the space available. In most cases this should be a reasonable trade-off, and should stop happening once the circular buffer size stabilizes to reflect the production vs. consumption pressure (if you have a producer which constantly operates faster than a consumer, though, please look at the literature for plenty of advice on the problem).</p>
<p>The remaining interesting step in that sequence of events is the moment when the ring area capacity is expanded to cover the full allocated buffer again, with the previous overflow area being integrated into the ring area.  This will happen when the content of the previous partial ring area is fully consumed, as shown below:</p>
<pre>
circbuf.pop() # => 4
circbuf.pop() # => 5
circbuf.pop() # => 6
circbuf.push(10)

print(circbuf.buf) # => [<font style="color: blue">10,</font> None, None, None, None, <font style="color: blue">7, 8, 9</font>]
</pre>
<p>At this point, the whole buffer contains just a ring area and the overflow area is again empty, which means it becomes a traditional circular buffer.</p>
<p><b>Sample algorithm</b></p>
<p>With some simple modifications in the traditional implementation presented previously, the above semantics may be easily supported. Note how the additional properties did not introduce significant overhead. Of course, this version will incur additional memory allocation to support the buffer expansion, but that&#8217;s inherent to the problem being solved.</p>
<pre>
class ExpandingCircBuf(object):

    def __init__(self):
        self.buf = [None, None, None, None, None]
        self.buflen = self.ringlen = self.pushi = self.popi = 0
        self.bufcap = self.ringcap = len(self.buf)

    def push(self, x):
        if self.ringlen == self.ringcap or \
           self.ringcap != self.bufcap:
            self.buf.append(x)
            self.buflen += 1
            self.bufcap += 1
            if self.pushi == 0: # Optimization.
                self.ringlen = self.buflen
                self.ringcap = self.bufcap
        else:
            self.buf[self.pushi] = x
            self.pushi = (self.pushi + 1) % self.ringcap
            self.buflen += 1
            self.ringlen += 1

    def pop(self):
        assert self.buflen != 0, "Buffer underflow!"
        x = self.buf[self.popi]
        self.buf[self.popi] = None
        self.buflen -= 1
        self.ringlen -= 1
        if self.ringlen == 0 and self.buflen != 0:
            self.popi = self.ringcap
            self.pushi = 0
            self.ringlen = self.buflen
            self.ringcap = self.bufcap
        else:
            self.popi = (self.popi + 1) % self.ringcap
        return x
</pre>
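<p>To check the walkthrough from the previous section against this implementation, the sketch below replays the same sequence of operations. The class is repeated (same logic, with a couple of comments added) only so that the snippet runs on its own; the names <i>popped</i> and <i>snapshot_split</i> are made up for the example.</p>

```python
class ExpandingCircBuf(object):
    # Same logic as the class above, repeated so this sketch is standalone.

    def __init__(self):
        self.buf = [None, None, None, None, None]
        self.buflen = self.ringlen = self.pushi = self.popi = 0
        self.bufcap = self.ringcap = len(self.buf)

    def push(self, x):
        if self.ringlen == self.ringcap or self.ringcap != self.bufcap:
            # Ring is full, or an overflow area already exists: append.
            self.buf.append(x)
            self.buflen += 1
            self.bufcap += 1
            if self.pushi == 0:
                self.ringlen = self.buflen
                self.ringcap = self.bufcap
        else:
            self.buf[self.pushi] = x
            self.pushi = (self.pushi + 1) % self.ringcap
            self.buflen += 1
            self.ringlen += 1

    def pop(self):
        assert self.buflen != 0, "Buffer underflow!"
        x = self.buf[self.popi]
        self.buf[self.popi] = None
        self.buflen -= 1
        self.ringlen -= 1
        if self.ringlen == 0 and self.buflen != 0:
            # Old ring exhausted: the overflow area joins the ring.
            self.popi = self.ringcap
            self.pushi = 0
            self.ringlen = self.buflen
            self.ringcap = self.bufcap
        else:
            self.popi = (self.popi + 1) % self.ringcap
        return x


circbuf = ExpandingCircBuf()
for i in range(5):
    circbuf.push(i)
popped = [circbuf.pop(), circbuf.pop()]
for i in (5, 6, 7, 8):          # 7 and 8 land in the overflow area
    circbuf.push(i)
snapshot_split = list(circbuf.buf)
popped += [circbuf.pop(), circbuf.pop()]
circbuf.push(9)                 # still appended, to keep FIFO order
for _ in range(3):
    popped.append(circbuf.pop())
circbuf.push(10)                # ring now spans the whole buffer again

print(snapshot_split)  # => [5, 6, 2, 3, 4, 7, 8]
print(popped)          # => [0, 1, 2, 3, 4, 5, 6]
print(circbuf.buf)     # => [10, None, None, None, None, 7, 8, 9]
```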
<p>Note that the above algorithm will allocate each element in the list individually, but in performance-sensitive situations it may be better to allocate additional space for the overflow area in advance, to avoid potentially frequent reallocation.  In a situation when the rate of consumption of elements is about the same as the rate of production, for instance, there are advantages in doubling the amount of allocated memory per expansion.  Given the way in which the algorithm works, the previous ring area will be exhausted before the mixed buffer becomes circular again, so with a constant rate of production and an equivalent consumption it will effectively have its size doubled on expansion.</p>
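<p>To give a rough feeling for the difference, the sketch below counts how many growth steps are needed to reach room for 1000 items when starting from a capacity of 5. The function name and the numbers are made up for illustration, and it counts logical growth steps only: CPython lists already over-allocate internally, so this is not a measurement of real reallocations.</p>

```python
def count_growth_steps(total, grow):
    """Count how many times the capacity must grow to hold `total` items.

    `grow` maps the current capacity to the next one.
    """
    capacity = 5          # same starting capacity as the examples above
    steps = 0
    while capacity < total:
        capacity = grow(capacity)
        steps += 1
    return steps


one_by_one = count_growth_steps(1000, lambda cap: cap + 1)
doubling = count_growth_steps(1000, lambda cap: cap * 2)

print(one_by_one)  # => 995
print(doubling)    # => 8
```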
<p><b>UPDATE:</b> Below is shown a version of the same algorithm which not only allows allocating more than one additional slot at a time during expansion, but also incorporates it in the overflow area immediately so that the allocated space is used optimally.</p>
<pre>
class ExpandingCircBuf2(object):

    def __init__(self):
        self.buf = []
        self.buflen = self.ringlen = self.pushi = self.popi = 0
        self.bufcap = self.ringcap = len(self.buf)

    def push(self, x):
        if self.ringcap != self.bufcap:
            expandbuf = (self.pushi == 0)
            expandring = False
        elif self.ringcap == self.ringlen:
            expandbuf = True
            expandring = (self.pushi == 0)
        else:
            expandbuf = False
            expandring = False

        if expandbuf:
            self.pushi = self.bufcap
            expansion = [None, None, None]
            self.buf.extend(expansion)
            self.bufcap += len(expansion)
            if expandring:
                self.ringcap = self.bufcap

        self.buf[self.pushi] = x
        self.buflen += 1
        if self.pushi < self.ringcap:
            self.ringlen += 1
        self.pushi = (self.pushi + 1) % self.bufcap

    def pop(self):
        assert self.buflen != 0, "Buffer underflow!"
        x = self.buf[self.popi]
        self.buf[self.popi] = None
        self.buflen -= 1
        self.ringlen -= 1
        if self.ringlen == 0 and self.buflen != 0:
            self.popi = self.ringcap
            self.ringlen = self.buflen
            self.ringcap = self.bufcap
        else:
            self.popi = (self.popi + 1) % self.ringcap
        return x
</pre>
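<p>One simple way to gain confidence that FIFO order survives the expansions is to run random pushes and pops against a reference queue. The sketch below does that with <i>collections.deque</i> as the reference. The class is repeated (same logic) only so the snippet runs on its own, and the seed, the number of rounds, and the names <i>rng</i>, <i>expected</i> and <i>mismatches</i> are arbitrary choices made for the example.</p>

```python
import random
from collections import deque


class ExpandingCircBuf2(object):
    # Same logic as the class above, repeated so this sketch is standalone.

    def __init__(self):
        self.buf = []
        self.buflen = self.ringlen = self.pushi = self.popi = 0
        self.bufcap = self.ringcap = len(self.buf)

    def push(self, x):
        if self.ringcap != self.bufcap:
            expandbuf = (self.pushi == 0)
            expandring = False
        elif self.ringcap == self.ringlen:
            expandbuf = True
            expandring = (self.pushi == 0)
        else:
            expandbuf = False
            expandring = False

        if expandbuf:
            self.pushi = self.bufcap
            expansion = [None, None, None]
            self.buf.extend(expansion)
            self.bufcap += len(expansion)
            if expandring:
                self.ringcap = self.bufcap

        self.buf[self.pushi] = x
        self.buflen += 1
        if self.pushi < self.ringcap:
            self.ringlen += 1
        self.pushi = (self.pushi + 1) % self.bufcap

    def pop(self):
        assert self.buflen != 0, "Buffer underflow!"
        x = self.buf[self.popi]
        self.buf[self.popi] = None
        self.buflen -= 1
        self.ringlen -= 1
        if self.ringlen == 0 and self.buflen != 0:
            self.popi = self.ringcap
            self.ringlen = self.buflen
            self.ringcap = self.bufcap
        else:
            self.popi = (self.popi + 1) % self.ringcap
        return x


rng = random.Random(42)   # fixed seed so that runs are repeatable
circbuf = ExpandingCircBuf2()
expected = deque()        # reference FIFO queue
mismatches = 0

for _ in range(10000):
    if expected and rng.random() < 0.5:
        if circbuf.pop() != expected.popleft():
            mismatches += 1
    else:
        item = rng.randrange(1000000)
        circbuf.push(item)
        expected.append(item)

# Drain whatever is left and compare the remaining order as well.
while expected:
    if circbuf.pop() != expected.popleft():
        mismatches += 1

print(mismatches)  # => 0
```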
<p><b>Conclusion</b></p>
<p>This blog post presented an algorithm which supports the expansion of circular buffers while preserving most of their key characteristics.  When not faced with an overflowing buffer, the algorithm should offer very similar performance characteristics to a normal circular buffer, with a few additional instructions and constant space for registers only. When faced with an overflowing buffer, the algorithm maintains the FIFO property and enables using contiguous allocated memory to maintain both the original circular buffer and the additional elements, and then reuses the full area as part of a new circular buffer in an attempt to find the proper size for the given use case.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.labix.org/2010/12/23/efficient-algorithm-for-expanding-circular-buffers/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Introducing The Hacking Sandbox</title>
		<link>http://blog.labix.org/2010/09/25/introducing-the-hacking-sandbox</link>
		<comments>http://blog.labix.org/2010/09/25/introducing-the-hacking-sandbox#comments</comments>
		<pubDate>Sat, 25 Sep 2010 16:33:54 +0000</pubDate>
		<dc:creator>Gustavo Niemeyer</dc:creator>
				<category><![CDATA[C/C++]]></category>
		<category><![CDATA[Erlang]]></category>
		<category><![CDATA[Go]]></category>
		<category><![CDATA[Haskell]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Lua]]></category>
		<category><![CDATA[Perl]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://blog.labix.org/?p=410</guid>
		<description><![CDATA[When I started programming in Python long ago, one of the features which really hooked me up was the quality interactive interpreter offered with the language implementation. It was (and still is) a fantastic way to experiment with syntax, semantics, &#8230; <a href="http://blog.labix.org/2010/09/25/introducing-the-hacking-sandbox">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>When I started programming in Python long ago, one of the features which really hooked me was the quality interactive interpreter offered with the language implementation. It was (and still is) a fantastic way to experiment with syntax, semantics, modules, and whatnot.  So much so that many first-class Python practitioners will happily tell you that the interactive interpreter is used not only as a programming sandbox, but many times as their personal calculator too.  This kind of interactive interpreter is also known as a <a href="http://en.wikipedia.org/wiki/Read-eval-print_loop">REPL</a>, standing for <i>Read Eval Print Loop</i>, and many languages have pretty advanced choices in that area by now.</p>
<p>After much rejoicing with Python&#8217;s REPL, though, and as a normal human being, I&#8217;ve started wishing for more.  The problem has a few different levels, which are easy to understand.</p>
<p><span id="more-410"></span>First, we&#8217;re using <a href="http://twistedmatrix.com/">Python Twisted</a> in Ensemble, one of the projects being pushed at Canonical.  Twisted is an event-driven framework, which among other things means it works a lot with closures and callbacks.  Having to redefine multi-line functions frequently to drive experiments isn&#8217;t exactly fun in a line-based interactive interpreter.  Then, some of the languages I&#8217;ve started playing with, such as <a href="http://erlang.org">Erlang</a>, have limited REPLs which differ in functionality significantly compared to what may be done in a text file. And finally, other languages I&#8217;ve been programming with recently, such as <a href="http://golang.org">Go</a>, lack a reasonable REPL altogether (there are only unusable hacks around).</p>
<p>Alright, so here is the idea: what if instead of being given an interactive REPL, you were presented with your favorite text editor, and whenever you saved the file, it was executed and results presented?  That&#8217;s The Hacking Sandbox, or <a href="http://labix.org/hsandbox">hsandbox</a>.  It supports 11 different programming languages out of the box, and given its nature it should be trivial to support any other language.</p>
<p>Here is a screenshot to clarify the idea:</p>
<p><a href="http://blog.labix.org/wp-content/uploads/2010/09/hsandbox.png"><img src="http://blog.labix.org/wp-content/uploads/2010/09/hsandbox.png" alt="" title="hsandbox screenshot" width="600" height="359" class="aligncenter size-full wp-image-417" /></a></p>
<p>Note that if you open a sandbox for a language like C or Go, the skeleton of what&#8217;s needed to run a program will already be in place, so you just have to &#8220;fill in the blanks&#8221;.</p>
<p>For more details and download information, please check the <a href="http://j.mp/hsandbox">hsandbox web page</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.labix.org/2010/09/25/introducing-the-hacking-sandbox/feed</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Python has a GIL, and lots of complainers</title>
		<link>http://blog.labix.org/2010/07/09/python-has-a-gil-and-lots-of-complainers</link>
		<comments>http://blog.labix.org/2010/07/09/python-has-a-gil-and-lots-of-complainers#comments</comments>
		<pubDate>Fri, 09 Jul 2010 19:15:49 +0000</pubDate>
		<dc:creator>Gustavo Niemeyer</dc:creator>
				<category><![CDATA[Erlang]]></category>
		<category><![CDATA[Go]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://blog.labix.org/?p=381</guid>
		<description><![CDATA[I&#8217;ve just read a post by Brett Cannon where, basically, he complains about complainers. If you don&#8217;t know who Brett is, you&#8217;re probably not a heavy Python user. Brett is a very important Python core developer who has been around &#8230; <a href="http://blog.labix.org/2010/07/09/python-has-a-gil-and-lots-of-complainers">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve just read <a href="http://sayspy.blogspot.com/2010/07/two-types-of-people-who-cause-biggest.html">a post by Brett Cannon</a> where, basically, he complains about complainers.</p>
<p>If you don&#8217;t know who Brett is, you&#8217;re probably not a heavy Python user.  Brett is a very important Python core developer who has been around for a while and who does a great job at it.  His post, though, makes me a bit sad.</p>
<p><span id="more-381"></span>Brett points out that there are two types of personalities which do not contribute to open source.  The first one he defines as:</p>
<blockquote><p>
The first type is the &#8220;complainer&#8221;. This is someone who finds something they don&#8217;t like, points out that the thing they don&#8217;t like is suboptimal, but then offers no solutions.
</p></blockquote>
<p>And the second one is defined as:</p>
<blockquote><p>
(&#8230;) This is someone who, upon finding out about a decision that they think was sub-optimal, decides to bring up new ideas and solutions. The person is obviously trying to be helpful by bringing up new ideas and solutions, thinking that the current one is simply going to flop and they need to stop people from making a big mistake.  The thing is, this person is not helping. (&#8230;)
</p></blockquote>
<p>This, in itself, is already shortsighted. If you&#8217;re tired of hearing the same arguments again and again for 10 years, from completely different people, there&#8217;s a pretty good chance that there&#8217;s an actual issue with your project, and your users are trying in their way to contribute and interact with you in the hope that it might get fixed.</p>
<p>This is really important:  They are <i>people</i>, who <i>use your project</i>, and are trying to <i>improve it</i>. If you can&#8217;t stand that, you should stop maintaining an open source project now, or pick something which no one cares about.</p>
<p>The other issue which caught my attention in his post is his example: the Python GIL.  Look at the way in which Brett dismisses the problem:</p>
<blockquote><p>
(I am ignoring the fact that few people write CPU-intensive code requiring true threading support, that there is the multiprocessing library, true power users have extension  modules which do operate with full threading, and that there are multiple VMs out there with a solution that have other concurrency solutions)
</p></blockquote>
<p>Brett, we can understand that <a href="http://www.artima.com/weblogs/viewpost.jsp?thread=214235">the GIL is hard to remove</a>, but it&#8217;s a <a href="http://www.dabeaz.com/GIL/">fundamental flaw in the most important Python implementation</a>, and being dismissive about it will either draw further complaints at you, or will simply drive users away from the language entirely.</p>
<p>I can understand why you think this way, though.  Guido has expressed the same kind of feeling about the GIL for a very long time.  Here is one excerpt from a <a href="http://mail.python.org/pipermail/python-3000/2007-May/007414.html">mail thread about it</a>:</p>
<blockquote><p>
Nevertheless, you&#8217;re right the GIL is not as bad as you would initially think: you just have to undo the brainwashing you got from Windows and Java proponents who seem to consider threads as the only way to approach concurrent activities.</p>
<p>Just Say No to the combined evils of locking, deadlocks, lock granularity, livelocks, nondeterminism and race conditions.
</p></blockquote>
<p>I apologize, but I have a very hard time reading this and not complaining.</p>
<p>In my world, the golden days of geometric growth in vertical processing power are over, multi-processor machines are here to stay, and the amount of traffic flowing through networks is just increasing.  It feels reasonable to desire a less naïve approach to deal with real world problems, such as executing tasks concurrently.</p>
<p>I actually would love to not worry about things like non-determinism and race conditions, and would love even more to have a <i>programming language</i> which helps me with that!</p>
<p>Python, though, has a Global Interpreter Lock (yes, I&#8217;m talking about CPython, the most important interpreter).  Python programs execute in sequence.  No <a href="http://www.infoq.com/interviews/doug-lea-fork-join">Fork/Join frameworks</a>, no <a href="http://golang.org">coroutines</a>, no <a href="http://erlang.org">lightweight processes</a>, nothing.  Your <i>Python</i> code <i>will execute in sequence</i> if it lives in the same process space.</p>
<p>The answer from Brett and Guido to concurrency?  Develop your code in C, or write your code to execute in multiple processes.  If they really want people to get rid of non-determinism, locking issues, race conditions, and so on, they&#8217;re not helping at all.</p>
<p>I know this is just yet another complaint, though. I honestly cannot fix the problem either, and rather just talk about it in the hope that someone who&#8217;s able to do it will take care of it.  That said, I wish that the language maintainers would <i>do the same</i>, and tell the world that it&#8217;s an unfortunate problem, and that they wished someone else would go there and fix it!  If, instead, maintainers behave in a ridiculously dismissive way, like Guido did in that mail thread, and like Brett is doing in his post, the smart people that could solve the problem get turned down.  People like to engage with motivated maintainers.. they like to solve problems that others are interested in seeing solved.</p>
<p>Perhaps agreeing with the shortcomings won&#8217;t help, though, and no one will show up to fix the problem either. But then, at least users will know that the maintainers are on the same side of the fence, and the hope that it will get fixed survives.  If the maintainers just complain about the users who complain, and dismiss the problem, users are put in an awkward position.  I can&#8217;t complain.. I can&#8217;t provide ideas or solutions.. I can&#8217;t fix the problem.. they don&#8217;t even <i>care</i> about the problem.  Why am I using this thing at all?</p>
<p>Would you rather have users, or have no complainers?</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.labix.org/2010/07/09/python-has-a-gil-and-lots-of-complainers/feed</wfw:commentRss>
		<slash:comments>31</slash:comments>
		</item>
		<item>
		<title>Integrating IRC with LDAP and two-way SMSing</title>
		<link>http://blog.labix.org/2010/06/19/integrating-irc-with-ldap-and-two-way-smsing</link>
		<comments>http://blog.labix.org/2010/06/19/integrating-irc-with-ldap-and-two-way-smsing#comments</comments>
		<pubDate>Sat, 19 Jun 2010 21:56:07 +0000</pubDate>
		<dc:creator>Gustavo Niemeyer</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Cloud]]></category>
		<category><![CDATA[Erlang]]></category>
		<category><![CDATA[Mobile]]></category>
		<category><![CDATA[Project]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://blog.labix.org/?p=296</guid>
		<description><![CDATA[A bit of history I don&#8217;t know exactly why, but I&#8217;ve always enjoyed IRC bots. Perhaps it&#8217;s the fact that it emulates a person in an easy-to-program way, or maybe it&#8217;s about having a flexible and shared &#8220;command line&#8221; tool, &#8230; <a href="http://blog.labix.org/2010/06/19/integrating-irc-with-ldap-and-two-way-smsing">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><b>A bit of history</b></p>
<p>I don&#8217;t know exactly why, but I&#8217;ve always enjoyed IRC bots.  Perhaps it&#8217;s the fact that it emulates a person in an easy-to-program way, or maybe it&#8217;s about having a flexible and shared &#8220;command line&#8221; tool, or maybe it&#8217;s just the fact that it helps people perceive things in an asynchronous way without much effort.  Probably a bit of everything, actually.</p>
<p><span id="more-296"></span></p>
<p>My bot programming started with <a href="http://labix.org/pybot">pybot</a> many years ago, when I was still working at <a href="http://www.conectiva.com.br">Conectiva</a>.  Despite having many interesting features, this bot eventually fell into an abandonware state, since <a href="http://www.canonical.com">Canonical</a> already had pretty much equivalent features available when I joined, and I had other interests which got in the way.  The code was a bit messy as well.. it was a time when I wasn&#8217;t very used to testing software properly (a friend has a great excuse for that kind of messy software: <i>&#8220;I was young, and needed the money!&#8221;</i>).</p>
<p>Then, a couple of years ago, while working on the <a href="http://landscape.canonical.com">Landscape</a> project, there was an opportunity to make some information more visible to the team.  Coincidentally, it was also a time when I wanted to get some practice with the concepts of <a href="http://erlang.org">Erlang</a>, so I decided to write a bot from scratch with some nice support for plugins, just to get a feeling for how the promised stability of Erlang worked out in practice.  This bot is called <a href="https://launchpad.net/mup">mup</a> (Mup Pet, more formally), and its code is available publicly through <a href="https://launchpad.net/mup">Launchpad</a>.</p>
<p>This was a nice experiment indeed, and I did learn quite a bit about the ins and outs of Erlang with it.  Somewhat unexpected, though, was the fact that the bot grew a few extra features which multiple teams in Canonical started to appreciate.  This was of course very nice, but it also made it more obvious that the egocentric reason for having a bot written in Erlang would now hurt, because most of Canonical&#8217;s own coding is done in Python, and that&#8217;s what internal tools should generally be written in for everyone to contribute and help maintain the code.</p>
<p>That&#8217;s where the desire to migrate mup into a Python-based brain again came from, and having a new feature to write was the perfect motivator for this.</p>
<p><b>LDAP and two-way SMSing over IRC</b></p>
<p>Canonical is a <i>very</i> distributed company.  Employees are distributed over dozens of countries, literally.  Not only that, but most people also work from their homes, rather than in an office.  Many different countries also means many different timezones, and working from home with people from different timezones means flexible timing.  All of that means communication gets&#8230; well.. interesting.</p>
<p>How do we reach someone that should be in an online meeting and is not?  Or someone that is traveling to get to a sprint?  Or how can someone that has no network connectivity reach an IRC channel to talk to the team?  There are probably several answers to these questions, but one of them is of course SMS.  It&#8217;s not exactly cheap if we consider the cost of the data being transferred, but pretty much everyone has a mobile phone which can do SMS, and the model is not that far away from IRC, which is the main communication system used by the company.</p>
<p>So, the itch was itching.  Let&#8217;s scratch it!</p>
<p>Getting the mobile phone of employees was already a solved problem for mup, because it had a plugin which could interact with the LDAP directory, allowing people to do something like this:</p>
<blockquote><p>
&lt;joe&gt; mup: poke gustavo<br />
&lt;mup&gt; joe: niemeyer is Gustavo Niemeyer &lt;&#8230;@canonical.com&gt; &lt;time:&#8230;&gt; &lt;mobile:&#8230;&gt;
</p></blockquote>
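<p>To give an idea of the shape of that lookup, here is a minimal sketch in Python.  It is not the plugin&#8217;s actual code: a plain dictionary stands in for the LDAP directory, and the attribute names, the alias matching, the error reply, and every value in the sample entry are placeholders invented for the illustration.</p>

```python
# Hypothetical stand-in for the LDAP directory: IRC nick mapped to attributes.
# All values below are placeholders, not real data.
DIRECTORY = {
    "niemeyer": {
        "name": "Gustavo Niemeyer",
        "aliases": ["gustavo"],
        "mail": "user@example.com",
        "time": "12:00",
        "mobile": "+0000000000",
    },
}


def find_person(directory, query):
    """Return (nick, entry) for a query matching a nick or alias, else None."""
    query = query.lower()
    for nick, entry in directory.items():
        if query == nick or query in entry.get("aliases", ()):
            return nick, entry
    return None


def poke(directory, asker, query):
    """Build the reply line the bot would send back to whoever asked."""
    found = find_person(directory, query)
    if found is None:
        # Assumed wording; the post does not show the failure case.
        return "%s: cannot find anyone matching %s" % (asker, query)
    nick, entry = found
    return "%s: %s is %s <%s> <%s> <%s>" % (
        asker,
        nick,
        entry["name"],
        entry["mail"],
        "time:" + entry["time"],
        "mobile:" + entry["mobile"],
    )
```

<p>With the sample entry above, <i>poke(DIRECTORY, "joe", "gustavo")</i> builds a reply with the same layout as the IRC exchange shown earlier.</p>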
<p>This just had to be migrated from Erlang into a Python-based brain for the reasons stated above. This time, though, there was no reason to write something from scratch.  I could even have used pybot itself, but there was also <a href="http://sourceforge.net/projects/supybot/">supybot</a>, an IRC bot which started around the same time I wrote the first version of pybot, and unlike the latter, supybot&#8217;s author was much more diligent in evolving it.  There is quite a comprehensive list of plugins for supybot nowadays, and it includes means for testing plugins and so on.  The choice of using it was straightforward, and getting &#8220;<i>poke</i>&#8221; support ported into a plugin wasn&#8217;t hard at all.</p>
<p>So, on to SMSing.  Canonical already had a contract with an SMS gateway company which we established to test-drive some ideas on <a href="https://landscape.canonical.com">Landscape</a>. With the mobile phone numbers coming out of the LDAP directory in hand and an SMS contract established, all that was needed was a plugin for the bot to talk to the SMS gateway.  That &#8220;conversation&#8221; with the SMS gateway allows not only sending messages, but also receiving SMS messages which were sent to a specific number.</p>
<p>In practice, this means that people who are connected to IRC can very easily deliver an SMS to someone using their nicks.  Something like this:</p>
<blockquote><p>
&lt;joe&gt; @sms niemeyer Where are you?  We&#8217;re waiting!
</p></blockquote>
<p>And this would show up in the mobile screen as:</p>
<blockquote><p>
joe&gt; Where are you?  We&#8217;re waiting!
</p></blockquote>
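<p>The transformation between those two lines is simple enough to sketch.  The snippet below is an illustration only, not the plugin&#8217;s real code: the function names are made up, and it ignores details such as looking up the phone number and talking to the gateway.</p>

```python
def parse_sms_command(line):
    """Split '@sms NICK TEXT' into (nick, text); return None if it is malformed."""
    parts = line.split(None, 2)
    if len(parts) != 3 or parts[0] != "@sms":
        return None
    return parts[1], parts[2]


def format_for_phone(sender, text):
    """Prefix the text with the IRC sender, as it shows up on the phone."""
    return "%s> %s" % (sender, text)


if __name__ == "__main__":
    parsed = parse_sms_command("@sms niemeyer Where are you?")
    if parsed is not None:
        nick, text = parsed
        # The recipient nick would be used to find the mobile number.
        print(nick, "receives:", format_for_phone("joe", text))
```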
<p>In addition to this, people who have <i>no connectivity</i> can also contact individuals and channels on IRC, with mup working as a middle man.  The message would show up on IRC in a similar way to:</p>
<blockquote><p>
&lt;mup&gt; [SMS] &lt;niemeyer&gt; Sorry, the flight was delayed. Will be there in 5.
</p></blockquote>
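<p>As a sketch of that relaying step, and assuming the sender is identified by mapping the originating phone number back to a nick (an assumption made for this example, as are the names and the placeholder number below), the formatting could look like this:</p>

```python
# Hypothetical reverse mapping from mobile number to IRC nick; in practice
# this information would come out of the directory. The number is a placeholder.
NICK_BY_MOBILE = {"+0000000000": "niemeyer"}


def format_for_irc(nick, text):
    """Format an incoming SMS the way the bot announces it on IRC."""
    return "[SMS] <%s> %s" % (nick, text)


def relay_incoming(nick_by_mobile, mobile, text):
    """Return the IRC line for an incoming SMS, or None for unknown senders."""
    nick = nick_by_mobile.get(mobile)
    if nick is None:
        # Assumption for this sketch: messages from unknown numbers are dropped.
        return None
    return format_for_irc(nick, text)
```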
<p>The communication from the bot to the gateway happens via plain HTTPS.  The communication back is a bit more complex, though.  There is a small proxy service deployed in <a href="http://code.google.com/appengine">Google App Engine</a> to receive messages from the SMS gateway.  This was done to avoid losing messages when the bot itself is taken down for maintenance.  The SMS gateway doesn&#8217;t handle this case very well, so it&#8217;s better to have something which will be up most of the time buffering messages.</p>
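<p>The buffering idea itself is small.  The sketch below is not the deployed proxy: it keeps messages in memory, whereas a real service would need persistent storage, and the class and method names are invented for the illustration.  It also assumes the bot fetches pending messages whenever it is up, which is one possible design rather than a description of the actual one.</p>

```python
from collections import deque


class MessageBuffer:
    """Holds incoming SMS messages until the bot collects them."""

    def __init__(self):
        self._pending = deque()

    def receive(self, sender, text):
        """Called when the SMS gateway delivers a message to the proxy."""
        self._pending.append((sender, text))

    def drain(self):
        """Called by the bot; returns everything buffered so far, oldest first."""
        messages = list(self._pending)
        self._pending.clear()
        return messages


if __name__ == "__main__":
    buffer = MessageBuffer()
    # Messages arrive while the bot is down for maintenance.
    buffer.receive("niemeyer", "Sorry, the flight was delayed.")
    buffer.receive("niemeyer", "Will be there in 5.")
    # Once the bot is back, it picks up what it missed.
    for sender, text in buffer.drain():
        print(sender, text)
```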
<p>A picture is worth 2<sup>10</sup> words, so here is a simple diagram explaining how things got linked together:</p>
<p><a href="http://blog.labix.org/wp-content/uploads/2010/06/mup-sms.png"><img src="http://blog.labix.org/wp-content/uploads/2010/06/mup-sms.png" alt="" title="SMS integration diagram" width="449" height="255" class="aligncenter size-full wp-image-308" /></a></p>
<p>This is now up for experimentation, and so far it&#8217;s working nicely.  I&#8217;m hoping that in the next few weeks we&#8217;ll manage to port the rest of mup into the supybot-based brain.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.labix.org/2010/06/19/integrating-irc-with-ldap-and-two-way-smsing/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

