Tutorial: Using Coroutines in Corona

Today’s guest tutorial comes to you courtesy of Steven Johnson, the Technical Director and head of R&D at Xibalba Studios. Steven has been using Lua since 2003 and he has worked with custom engines, Vision SDK, LÖVE, HTML5 emulation, and projects with SDL bindings in C++ and LuaJIT. When he’s not working on his own hobby projects, he likes to research math and graphics concepts.


Coroutines, introduced in Lua 5.0, are one of the language’s key features, but despite having been available in Corona from the beginning, they seem to receive very little attention. This is unfortunate, as coroutines are quite powerful, giving you the ability to start and stop blocks of code as needed. Whether you want to do advanced timer manipulation or create state machines, coroutines give you greater control over when the parts of your code execute. This tutorial will touch on just a few of their many uses, in particular those which play to Corona’s own strengths.

The Basics

Let’s begin with a quick primer on the API. All functions are found in the coroutine table. As of this writing, these have not been included in Corona’s SDK API reference, but may be found in the Lua manual.

Creating a coroutine is simple enough:

This gives us a reference, co, to the coroutine.

We can inspect its type:

What we find is that co is a distinct kind of object, rather than a table or userdata as we might expect. Don’t worry too much about the term “thread”; in Lua, it and “coroutine” are mostly interchangeable[1]. Also, these are not operating system threads, which preempt running code to switch tasks. Rather, coroutines are cooperative; the coroutine itself decides when to give up control.

We can also ask about the coroutine’s status:

As we can see, simply creating the coroutine doesn’t run it. This shouldn’t be too surprising. Creating a function, for instance, is distinct from calling it. To run the coroutine, we must do this:

The body executes, and the message inside gets printed, as expected.

What is the coroutine’s status now? Let’s check:

The body has run its course, so nothing remains for the coroutine to do, and it goes dead.

Once a coroutine is dead, all we can do with it is ask its status, which will always be "dead". If we try to resume it once more, nothing happens. Or rather, as we’ll see shortly, it fails silently.
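The steps so far can be gathered into one short listing. This is a minimal sketch: the body and its message are assumptions chosen for illustration, while coroutine.create, coroutine.status, and coroutine.resume are the standard Lua functions.

```lua
-- The body and its message are assumptions, chosen only for illustration.
local co = coroutine.create(function()
    print("Hello from inside the coroutine!")
end)

print(type(co))              -- thread
print(coroutine.status(co))  -- suspended

coroutine.resume(co)         -- runs the body, which prints the message

print(coroutine.status(co))  -- dead

coroutine.resume(co)         -- no error is raised, and nothing is printed
print(coroutine.status(co))  -- still dead
```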

At this point, coroutines seem like nothing more than a complicated way to call a function once, and only once!

The missing piece of the puzzle is the ability to yield:

The first coroutine.resume kicks off the coroutine. The code at the beginning of the body executes, and a message is printed. So far, nothing new.

When the coroutine.yield fires, we suddenly hop outside of the coroutine’s body and back among the code that created it. Instead of getting the message “Second resume”, we see “After first yield”.

The coroutine is once again in a suspended state, just as it was after being created. It is not dead; there is still code left to execute. However, the coroutine won’t run again until we explicitly resume it.

When we do so, execution picks up where it left off, immediately after the yield, and we get the expected “Second resume”. Another yield and a final resume round out the snippet, giving us “After second yield” followed by “Final resume”.

After each of the yields, the status will be "suspended". After the final resume, our coroutine is once again "dead".
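A sketch of the sequence just described might look like the following. The text of the very first message is an assumption; the other messages are the ones quoted above.

```lua
local co = coroutine.create(function()
    print("First resume")    -- assumed wording for the opening message
    coroutine.yield()
    print("Second resume")
    coroutine.yield()
    print("Final resume")
end)

coroutine.resume(co)
print("After first yield")
print(coroutine.status(co))  -- suspended

coroutine.resume(co)
print("After second yield")
print(coroutine.status(co))  -- suspended

coroutine.resume(co)
print(coroutine.status(co))  -- dead
```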

Working With Data

It’s possible to pass data to and from a coroutine. We can send values to the coroutine by passing them as arguments to coroutine.resume:

As we see, on the very first resume (the one which kicks the coroutine off, after creation), the data winds up in the parameters. On each subsequent resume, they instead show up as return values of the most recent coroutine.yield.
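As a rough illustration of that rule (the values and labels here are made up), each resume below hands its arguments to a different place in the body:

```lua
local co = coroutine.create(function(a, b)
    print("Parameters:", a, b)             -- values from the first resume
    local c, d = coroutine.yield()
    print("First yield returned:", c, d)   -- values from the second resume
    local e = coroutine.yield()
    print("Second yield returned:", e)     -- value from the third resume
end)

coroutine.resume(co, 1, 2)
coroutine.resume(co, 3, 4)
coroutine.resume(co, 5)
```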

Receiving isn’t too different, but we go through coroutine.yield instead:

Here we see that any arguments passed to coroutine.yield end up as return values of the most recent coroutine.resume, as do any values returned from the coroutine body.

(Programming in Lua contains an excellent example of data-passing in action.)

Now, there’s something peculiar about the yield snippet. In the printed results, what does that true mean? It indicates that the resume was successful. Had an error occurred along the way, that true would be a false instead, the only other return value being an error message. When I mentioned before how resuming a dead coroutine fails silently, this is what was happening: the resume returned false and an error message, and we simply never looked at them.

And speaking of dead coroutines, that’s what we’re stuck with, following an error.
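To make the return values concrete, here is a small sketch with made-up data. The first coroutine yields and returns normally; the second, bad, raises an error on purpose so that we can look at what coroutine.resume hands back.

```lua
local co = coroutine.create(function()
    coroutine.yield("first", 1)
    coroutine.yield("second", 2)
    return "done"
end)

print(coroutine.resume(co))   -- true, "first", 1
print(coroutine.resume(co))   -- true, "second", 2
print(coroutine.resume(co))   -- true, "done"
print(coroutine.resume(co))   -- false, "cannot resume dead coroutine"

-- A body that fails: resume reports the error instead of propagating it.
local bad = coroutine.create(function()
    error("something broke")
end)

local ok, message = coroutine.resume(bad)
print(ok, message)            -- false, then the error message
print(coroutine.status(bad))  -- dead
```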

Wrapping Up

We’ll often only need to create, resume, and yield a coroutine. This is common enough that a convenience function, coroutine.wrap, is provided in order to make coroutines more friendly to use:

This resembles our earlier examples, except resuming the coroutine seems to behave like a regular function call. In fact, the wrapper is just a function; coroutine.resume is taken care of behind the scenes, together with the coroutine reference itself.

Passing data around works much as before, except unlike coroutine.resume, the wrapper doesn’t return a success boolean. So what if it fails? For instance, in the last example, if we were to call the wrapper once more, now that the coroutine has finished?

[Screenshot: crash]

“KERBLOOEY!”

Of course, when developing in Corona, this is generally what we’ll want. It’s an error like any other. With all of this under our belt, we’re ready to advance.
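A minimal sketch of coroutine.wrap in action follows. The names and messages are illustrative, and pcall appears only so that the final, failing call can be shown without halting the program.

```lua
local step = coroutine.wrap(function(name)
    print("Hello, " .. name)
    local reply = coroutine.yield("first stop")
    print("Got " .. reply)
    return "finished"
end)

print(step("Corona"))    -- prints "Hello, Corona", then "first stop"
print(step("a reply"))   -- prints "Got a reply", then "finished"

-- The coroutine is now dead. Unlike coroutine.resume, the wrapper raises
-- an error, so we catch it here to inspect the message.
local ok, message = pcall(step)
print(ok, message)
```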

The Coroutine-Timer Tag Team

coroutine.wrap offers some interesting possibilities. In particular, the wrapper being just a function opens some doors for us. Many Corona APIs, the various event listeners for instance, take function arguments, and will happily accept our coroutine-in-disguise.

Now, presumably we would favor coroutines over regular functions because we want the yield capability. Once we yield, however, the rest of the coroutine won’t happen, unless the wrapper gets called again (remember, doing this just performs a resume, under the hood). Therefore, we’ll typically want to use coroutines in logic that we expect to trigger multiple times.[2]

As it happens, we have many such mechanisms in Corona. Timers, for instance:

As you might imagine, this will fire every second or so, printing one of the now-familiar messages each time. This is all well and good, but notice the iteration count: three iterations. Two for the yields and one more to run the final leg. If we overestimate this count, say by specifying four iterations, we’ll end up trying to resume a dead coroutine. The code above is quite simple, so this isn’t a huge worry.

As our coroutine body grows larger, it becomes increasingly difficult to maintain a correct count, especially when the yields are in loops or behind function calls. Once we bring if statements into the mix, we can’t even depend on a fixed number, and then we’re completely out of luck.

We really want something that “just works”. Fortunately, "timer" events come packaged with a reference, source, which can be used to cancel the corresponding timer. Rather than hard-coding some number of iterations, we can simply let the timer run indefinitely, only canceling it once we’ve finished our task.

A first attempt might go something like this:

And this will work… until it doesn’t. Unfortunately, coroutines flush out a leaky abstraction: Corona is recycling the event tables. Presumably this is to avoid garbage collector spikes, and is a great idea. With normal functions, we would never know the difference. But we’re being naughty. By yielding, we end up holding on to that event table across multiple frames. Meanwhile, Corona has been passing the table around, swapping timer references in and out. When it comes time to deal with “our” timer, we might cancel a different one altogether!

We can play nice by saving the reference up front:

This does work. However, the "timer" event also contains count and time fields, and similar issues crop up.

We can account for these too. Our code probably won’t care where the event table came from (and as we just saw, it shouldn’t), and will be none the wiser if we give it a “shadow” table.[3] We update this shadow table instead, and all is well again.

A Timer Utility

Figuring these things out each time is going to get old. For that matter, most coroutine-based timers are going to have the same “look” to them, once we do get these details in order. We ought to roll this all up for reuse:

Now, we’ve been looking at timers, but the same ideas extend to "enterFrame" listeners, and even to repeating transitions. I tend to favor timers because of the customizable delay, and find them a bit more natural to cancel. If the coroutine is going to run forever anyway, however, it’s rather arbitrary.

Since we’ve brought up canceling, note that it’s perfectly fine to cancel the timer early. That said, it’s important to recognize that the timer and coroutine are two distinct things, so we would still need to yield the coroutine. Our helper function lets us do both at once (from inside the coroutine, of course) by calling coroutine.yield("cancel").
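One way such a helper might be put together is sketched below. PerformWithCoroutine is a hypothetical name, not taken from the samples; the sketch assumes Corona’s timer.performWithDelay (where an iteration count of 0 repeats indefinitely) and timer.cancel, and it copies the four "timer" event fields into a shadow table on every firing.

```lua
-- Hypothetical helper: runs `body` as a coroutine driven by a repeating timer.
local function PerformWithCoroutine(delay, body)
    local shadow = {}  -- our own copy of the event, safe to hold across yields

    local wrapped = coroutine.wrap(function(event)
        body(event)
        return "cancel"  -- the body has finished, so stop the timer too
    end)

    return timer.performWithDelay(delay, function(event)
        -- Corona may recycle `event`, so refresh the shadow copy each time.
        shadow.name, shadow.source = event.name, event.source
        shadow.count, shadow.time = event.count, event.time

        if wrapped(shadow) == "cancel" then
            timer.cancel(event.source)
        end
    end, 0)  -- 0 iterations: keep firing until canceled
end

-- Possible use inside a Corona project:
-- PerformWithCoroutine(1000, function(event)
--     print("First resume", event.count)
--     coroutine.yield()
--     print("Second resume", event.count)
--     coroutine.yield("cancel")  -- stop early; the rest never runs
--     print("Never reached")
-- end)
```

Because the body always sees the same shadow table, it can safely read event.count or event.source after any number of yields.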

Waiting Around

So we have coroutines running on top of timers. Where do we go from here? Just using a timer encourages us to think of individual steps in the coroutine as taking place in time. If we apply this perspective to an earlier example, first we DoSomething(), then later we DoSomethingElse(). It’s a short leap from there to wanting something more explicitly chronological, like “DoSomethingElse() five seconds from now”. Can we achieve this?

Well, we do know when “now” is: we just ask system.getTimer. “The future,” then, is just “now” plus 5000 milliseconds. Once “the future” arrives, we know our five seconds are up. The most straightforward approach is to loop until then, yielding on each iteration.[4]

Consider the following:

Then, lo and behold, we can do this:

Five seconds may be a long time, relatively speaking, just to wait around. The optional update parameter lets us sneak in little batches of work, as necessary.
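A sketch of such time-based waits might read as follows. WaitUntilTime and Wait are hypothetical names; system.getTimer is Corona’s millisecond clock, and both helpers must be called from inside a coroutine.

```lua
-- Yield until the clock reaches `ends_at` (in milliseconds).
local function WaitUntilTime(ends_at, update)
    while system.getTimer() < ends_at do
        if update then
            update()  -- optional batch of work on each pass
        end
        coroutine.yield()
    end
end

-- Yield until `duration` milliseconds have passed, counting from now.
local function Wait(duration, update)
    WaitUntilTime(system.getTimer() + duration, update)
end

-- Possible use inside a timer-driven coroutine body:
-- DoSomething()
-- Wait(5000)
-- DoSomethingElse()
```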

We can wait for something other than time, obviously. Perhaps some condition must be satisfied:

Let’s try it out! Here, we fire off a transition, then wait for it to finish:

Similarly, we can wait for some property to be true:

Let’s use this to wait until an object becomes visible:

Many more ideas could be explored. Obviously the “Wait until X is false” variants have their place. We can even watch multiple states, for which we’d have “Wait until all states in X are true”, “Wait until any state in X is true”, and so forth.
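The condition and property waits could be sketched like this. WaitUntilTrue takes a test function and an optional update function; WaitUntilPropertyTrue is a hypothetical name for the property variant. Both yield, so they only work from inside a coroutine; calling them from the main chunk raises an error.

```lua
-- Yield until func() returns a true value.
local function WaitUntilTrue(func, update)
    while not func() do
        if update then
            update()
        end
        coroutine.yield()
    end
end

-- Yield until object[name] becomes true.
local function WaitUntilPropertyTrue(object, name, update)
    WaitUntilTrue(function()
        return object[name]
    end, update)
end

-- Possible use inside a timer-driven coroutine body in Corona:
-- local done = false
-- transition.to(object, { x = 50, onComplete = function() done = true end })
-- WaitUntilTrue(function() return done end)
-- WaitUntilPropertyTrue(object, "isVisible")
```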

Each of these helper functions was fairly general, but nothing prevents us from making more specific ones. Compound operations, such as “Wait until object is visible, then wait ten seconds”, may also come in handy.

As the last examples show, coroutines play well with timers and transitions. Just one more tool in our kit.

State Machines

Sometimes we’ll run across code like the following:

Or, using strings:

This works well enough, but it’s often overkill, especially when we’re merely performing a sequence of actions. Beyond that, the pattern itself has some inherent dangers. When using integer states, we must remember which state belongs to what action. It doesn’t take long to lose our place. If we switch to the wrong state, say by mistyping the integer, we could easily have a debugging nightmare on our hands.

This isn’t such a problem with strings, of course, but we do have the hassle of thinking up good names. This is less trivial than it may seem. Things start out easy enough. We have "starting", "walking", and "waiting" states… so far so good! Then we’re walking again. Hmm. "walking2"? Sure, why not. Next it’s back to waiting. (Sigh.) "waiting2" it is.

Finally we come to the really awkward stuff. What do we call “updating x, choosing a fill color, and emptying an array”? We may end up making little functions, such as Start, Walk, and Wait in the last example, rather than inline code, just to reduce some of this visual congestion. The naming problem strikes again! This also reduces code locality: we need to go hunting to find those functions and see what’s in them. Furthermore, if we have several such functions, we’ve really only shifted the clutter around.

It would be better if the last snippet could instead be written:

It should be obvious where this is headed. That style falls out quite naturally from moving our logic into a coroutine.
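To compare the two styles side by side, here is a small sketch. Start, Walk, and Wait stand in for real actions (they only record that they ran), and the update function names are made up for the comparison.

```lua
local trace = {}
local function Start() trace[#trace + 1] = "start" end
local function Walk()  trace[#trace + 1] = "walk" end
local function Wait()  trace[#trace + 1] = "wait" end

-- Conventional version: an explicit state variable, checked on every update.
local state = "starting"
local function UpdateWithStates()
    if state == "starting" then
        Start()
        state = "walking"
    elseif state == "walking" then
        Walk()
        state = "waiting"
    elseif state == "waiting" then
        Wait()
        state = "done"
    end
end

-- Coroutine version: our position in the body is the state.
local UpdateWithCoroutine = coroutine.wrap(function()
    Start()
    coroutine.yield()
    Walk()
    coroutine.yield()
    Wait()
end)

for _ = 1, 3 do UpdateWithStates() end
for _ = 1, 3 do UpdateWithCoroutine() end

print(table.concat(trace, " "))  -- start walk wait start walk wait
```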

As a broader example, we might write the high-level game loop for some sort of sports title:

State Machines, Take Two

Now, we may want “conventional” state machines. Character AI, for instance, typically consists of several independent behaviors, each of them fairly significant, which the character cycles through. Thankfully, we can accommodate this as well.

Consider a very simple AI, for one of the players in our sports title:

The player begins as the defender. The defensive strategy consists entirely of getting the ball away from the opponent (it isn’t a very sophisticated sport). With that objective in mind, the player tries to get in range of, and steal, the ball. If this succeeds, or the opponent lost the ball in some other way, the player has command of the ball and goes on the offensive.

Offense is a similar affair. The player tries to draw near the goal and score a point. If the opponent regains possession (ball was grabbed, the player scored, etc.), it’s back to defense.

Switching states is a simple matter of calling the appropriate function. I tend to lump all my states into a table, when doing this sort of thing. There usually isn’t any obviously correct order for the states, so forward declarations end up being too much hassle, especially as the switches become ever more tangled.

The return state() syntax is key. This is what is known as a tail call. Whenever we call a function normally, Lua must leave behind some information so that, once the function finishes, execution can pick up where it left off. The trouble is, when we switch states, we have no intention of coming back! While we can hop around using the standard state() form, eventually so much of this bookkeeping piles up that it overflows the stack, and we crash. A tail call, on the other hand, in effect declares “I’m done here”. Lua honors this (by doing nothing) and the problem goes away.
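A stripped-down sketch of the idea follows. The state bodies are placeholders, and the switch count is an assumed figure, chosen only to be far larger than we would want to nest as ordinary calls; the part that matters is the return in front of each state switch.

```lua
local States = {}
local switches_left = 100000  -- assumed figure, for demonstration only

function States.Defend()
    -- (Chase the ball and try to steal it.)
    coroutine.yield()
    switches_left = switches_left - 1
    if switches_left == 0 then
        return "final whistle"
    end
    return States.Attack()  -- tail call: Defend leaves nothing behind
end

function States.Attack()
    -- (Approach the goal and take a shot.)
    coroutine.yield()
    switches_left = switches_left - 1
    if switches_left == 0 then
        return "final whistle"
    end
    return States.Defend()  -- tail call
end

local UpdateAI = coroutine.wrap(function()
    return States.Defend()
end)

-- Stand-in for a timer: keep resuming until the AI reports a result.
local result
repeat
    result = UpdateAI()
until result

print(result)  -- final whistle
```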

Between the game loop and the player AI, we have two coroutines running. Nothing stops us from going further, say by making two full teams’ worth of player AI. In the end, it comes down to what works.

Long-Running Processes

Timer-backed coroutines are great when it comes to events that span time. As it turns out, they’re perfect for tackling another class of problems, too: actions that just plain take a while!

“A while” could mean on the order of a minute or two, yet even a 50-millisecond operation will make our frame rate hiccup, if it doesn’t allow the rest of the program a chance to act.

A good example of this is loading a game level. We might find ourselves performing some quite time-consuming steps, such as unzipping large files or downloading images. The number of resources may also be significant. Each one takes time, and it adds up.

There’s not always much we can do about the predicament itself. What we probably can do is take some action between, or during, these many operations. If the long-running process is embedded in a coroutine, we can call coroutine.yield to temporarily cede control.

Yielding between might look like so:

And during:

The yields put the action to sleep now and then, giving time back to Corona.
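Both placements can be sketched in plain Lua. LoadResource is a hypothetical stand-in for a slow step, and the batch size of 1000 is arbitrary; in Corona the driving loop at the bottom would be a timer or "enterFrame" listener instead.

```lua
local loaded = {}

-- Hypothetical stand-in for a slow step (unzipping, downloading, decoding).
local function LoadResource(name)
    loaded[#loaded + 1] = name
end

-- Yielding between operations.
local function LoadLevel(names)
    for _, name in ipairs(names) do
        LoadResource(name)
        coroutine.yield()  -- let the rest of the program run
    end
    return "loaded"
end

-- Yielding during one long operation.
local function SumLargeArray(values)
    local sum = 0
    for i = 1, #values do
        sum = sum + values[i]
        if i % 1000 == 0 then
            coroutine.yield()  -- pause after every 1000 items
        end
    end
    return sum
end

local step = coroutine.wrap(function()
    return LoadLevel({ "tiles", "sprites", "music" })
end)

repeat
    local status = step()
until status == "loaded"

print(#loaded)  -- 3
```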

As with any long-running activity in Corona, it’s good practice to have something visual going on, be it the activity indicator, or a progress view, or even a simple animation. For particularly lengthy loads, we might even add a little mini-game as an overlay, to pass the time.

Some Yield Helpers

A shortcoming of using coroutine.yield directly is that, when we do yield, we’re done for that frame, even though there might still be time left for work. We can mitigate this somewhat by parametrizing the yield operation. That way, we can experiment with different strategies of staggering the yields, until we settle on a winner. The last snippet then becomes:

As an example, we might use the following routine. When called, it tries to yield, but only actually does so if a certain amount of time has elapsed:

(The do ... end construction limits the scope of since, keeping it private to YieldOnTimeout.)

In use, it might look like:

This is no panacea, however. Sometimes we’ll get unlucky, say when there are only a couple milliseconds to spare and we run one more operation, which ends up taking five. To account for this, it may be best to underestimate the timeout.

Another possibility is to yield every few calls:

Yet another idea is to yield randomly, say 25% of the time:
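The three strategies might be sketched as follows. YieldOnTimeout and its private since variable are named in the text; LongOperation, YieldEveryN, and YieldRandomly are hypothetical names, and system.getTimer is Corona’s millisecond clock. All of them must be called from inside a coroutine.

```lua
-- A long-running loop with the yield strategy passed in as a parameter.
local function LongOperation(steps, do_step, try_yield)
    for i = 1, steps do
        do_step(i)
        try_yield()
    end
end

-- Yield only once `timeout` milliseconds have passed since the last yield.
local YieldOnTimeout
do
    local since  -- time of the last yield; private to YieldOnTimeout

    function YieldOnTimeout(timeout)
        local now = system.getTimer()
        since = since or now
        if now - since >= timeout then
            coroutine.yield()
            since = system.getTimer()
        end
    end
end

-- Yield on every n-th call.
local YieldEveryN
do
    local calls = 0

    function YieldEveryN(n)
        calls = calls + 1
        if calls % n == 0 then
            coroutine.yield()
        end
    end
end

-- Yield with the given probability, e.g. 0.25 for 25% of the time.
local function YieldRandomly(chance)
    if math.random() < chance then
        coroutine.yield()
    end
end

-- Possible use inside a coroutine body:
-- LongOperation(500, LoadOneThing, function() YieldOnTimeout(15) end)
```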

Coroutines as a Debugging Tool

The print() statement is a hallmark of debugging, across a broad spectrum of programming languages. Sometimes this comes down to convenience, such as when it would take too much effort to configure a debugger, then place and watch breakpoints. On rare occasions, integrating a debugger may even seem to make the problem go away: an attack of the dreaded Heisenbug! A common strategy is to sprinkle print() statements around suspect points in the code, then compare the output with our expectations. If a message doesn’t appear, either the code in question was never visited or the program crashed along the way. As we narrow down the scope of the problem, we can remove instances of print() that are no longer necessary.

One concern in normal code is that most (or all) of these print() statements will execute, so we wind up with one long burst of messages. We might also need some sense of when a piece of code was executed, and what our program looked like at the time. It’s hard to guess where the process broke down.

Enter coroutines. If we can roughly isolate the troublesome code, it can be temporarily embedded in a coroutine. Then, by following each print() with a yield, we’re able to inspect the state of the world at that moment. Consider the following:

All of our messages and display objects show up at once.

With coroutines:

By using coroutines, we get to watch the action unfold. In the example above, we now have a couple seconds to review the state of affairs after each print2, without significantly changing the code shape. If we only need to do some visual inspection, we can even just use coroutine.yield directly.

I employed this technique quite recently, in order to test some layout routines. Since these primarily involve display objects, print() only got me so far. Many of the operations consist of one object being positioned relative to another, so if something went wrong in the middle, the whole layout fell apart. By viewing the steps one at a time, I was able to pinpoint when and where things went awry.

Bear in mind that this method does slightly alter the program flow, on account of the yields. Therefore, we must include this in our notion of isolation. In the previous example, for instance, DoSomethingElse shouldn’t depend on what happens inside the wrapped-up code.
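As a sketch of the technique, print2 below is simply print followed by a yield. SuspectCode and its messages are made up, and the timer call in the closing comment assumes a two-second delay in Corona.

```lua
-- Print a message, then pause the surrounding coroutine.
local function print2(...)
    print(...)
    coroutine.yield()
end

-- Hypothetical code under investigation.
local function SuspectCode()
    print2("Step 1: background created")
    print2("Step 2: title positioned")
    print2("Step 3: buttons positioned")
end

-- In Corona, one way to step through it every two seconds
-- (three yields plus the final leg make four iterations):
-- timer.performWithDelay(2000, coroutine.wrap(SuspectCode), 4)
```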

The Magic Touch

A timer won’t always be the best fit for debugging. With a short delay, the steps may go by too soon. Much longer, and we might instead grow weary waiting, especially if the problem tends to show up quite late.

Fortunately, time isn’t the only way to drive a coroutine. We hinted earlier at using wrapped coroutines as event listeners. "touch" listeners are one such choice, and in fact give us direct control: on each touch, we resume the wrapped-up code.

With Corona, it’s easy to create a dummy display object and assign it such a listener. Then, instead of waiting around for the timer, we just click rapidly through any steps we want to ignore. Once we find something suspicious, we can take our time.

The previous example then becomes:

After one click, we see:

[Screenshot: debug1]

(The gray circle is our “button”, the dummy display object.)

After another:

[Screenshot: debug2]

Finally:

[Screenshot: debug3]

This is actually a situation where coroutine.create and coroutine.resume are more appropriate. We don’t want to crash our program just because we clicked too many times and ran a dead coroutine. We also end up with a basic sandbox, where an error can occur without bringing down our whole program (the coroutine will then be dead, of course). If our code snippet was properly isolated, this should work just fine.
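A touch-driven stepper along those lines could be sketched as below. MakeStepper is a hypothetical name, and reacting on the "ended" phase is just one reasonable choice; display.newCircle and addEventListener in the closing comment are the usual Corona calls.

```lua
-- Returns a touch listener that resumes `body` once per completed touch.
local function MakeStepper(body)
    local co = coroutine.create(body)

    return function(event)
        -- Ignore "began" and "moved", and do nothing once the coroutine is dead.
        if event.phase == "ended" and coroutine.status(co) ~= "dead" then
            local ok, message = coroutine.resume(co)
            if not ok then
                -- The error stays inside the sandbox; we only report it.
                print("Error in stepped code: " .. tostring(message))
            end
        end
        return true
    end
end

-- Possible use in Corona, with a circle as the dummy button:
-- local button = display.newCircle(40, 40, 25)
-- button:addEventListener("touch", MakeStepper(SuspectCode))
```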

Once everything is good to go, we can remove all the “temporary debugging code” and move on.

Gotchas

There are a few specific cases where coroutines break, owing mainly to some quirks in the interaction between Lua and C. Thanks to some redesign in the codebase, these have been fixed in Lua 5.2+. Corona is based on 5.1, though, so for the time being these issues are a fact of life.

Two such problem areas involve protected calls and metamethods, as Peter “Corsix” Cawley points out on his blog (see point #2). Note that the article itself mainly concerns 5.2, whereas we only care about the 5.1-related misbehavior. There have been some attempts to deal with protected calls, such as Coxpcall, created expressly to allow them in Copas (a library for building TCP/IP servers atop coroutines).

Iterators are the other problem spot:

This is a rather lame iterator (it doesn’t even attempt to iterate!), but it demonstrates the issue. Attempting to yield directly from the so-called iterator function results in an error, in Lua 5.1. We see the message “Before”, and then our program goes down in flames. Note that this is not the same as using coroutines as iterators (an incredibly useful feature but, alas, an entire topic of its own!).
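A sketch of the problem case follows, using coroutine.resume rather than a wrapper so that the failure can be printed instead of halting the program. The function name and the "After" message are assumptions.

```lua
-- A do-nothing "iterator" that tries to yield when the loop calls it.
local function Iterator()
    print("Before")
    coroutine.yield()  -- in Lua 5.1, this is where things go wrong
    print("After")
    return nil         -- returning nil ends the loop at once
end

local co = coroutine.create(function()
    for _ in Iterator do
        -- never reached
    end
end)

local ok, message = coroutine.resume(co)

-- On Lua 5.1 this reports false, with an
-- "attempt to yield across metamethod/C-call boundary" message.
print(ok, message)
```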

Thankfully these all tend to be rare situations, but it’s good to be aware of them.

Saving

The final gotcha comes from a design perspective. Being able to yield offers us great flexibility, but the flip side is that we do actually need to traverse the coroutine body to arrive at a given point in our code, with all of our local variables and program state in order. We can’t just jump to somewhere in the middle, in general. This presents a problem if the application must be able to save and later restore itself exactly where it left off. This isn’t insurmountable, but does need to be accounted for early on. It will almost certainly be difficult, and probably not worth the hassle.

If the application only needs to be saved every now and then (say between levels or at checkpoints, in a game), or has some leeway regarding what gets restored, this is a much smaller issue.

Examples

Most of the code in this article has been adapted from the samples repository found here. It can be downloaded from here. These samples were made to be shown during a Corona Geek hangout, so I wrote in a style which allows entire segments of the program to be turned on and off quickly. In particular, code gets blocked in long comments. Doing so lets us start with a commented-out section:

We can then enable all its code at once, simply by adding another hyphen at the beginning of the comment:

To disable it once more, just remove the hyphen. I used this blocking method in main.lua, to require() each example, as well as within the modules themselves.
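The toggle relies on how Lua parses comments, and can be sketched like so (the ran table exists only to show which section executed):

```lua
local ran = {}

--[[ Disabled: everything up to the closing brackets is one long comment.
ran[#ran + 1] = "first"
--]]

---[[ Enabled: the extra hyphen turns the opener into a plain line comment.
ran[#ran + 1] = "second"
--]]

print(table.concat(ran, ", "))  -- second
```

In the enabled form, the closing line is itself just an ordinary line comment, so it can stay where it is.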

Also, in order to reduce switches between the simulator and console while recording, I overrode print() in main.lua, such that messages show up on the screen (as text objects) rather than the console. To disable this, simply remove or comment out the assignment to print().

In Summary

Coroutines are a truly powerful feature of Lua which, when combined with such mechanisms as Corona’s timers and event listeners, offer us a novel and useful approach to a whole host of difficult problems. We’ve only scratched the surface. Chapter 9 of Programming in Lua covers several topics only touched on here. Explore, and have fun!


[1] A coroutine is a thread. The reverse is also true, with one very important exception: the main thread. This is the “normal” part of the program, where our code runs when it’s not inside a coroutine.

[2] Typically, but not always. We could, for instance, assign the same wrapper to several listeners. Then, as these fire off piecemeal, the coroutine is gradually driven forward.

Another use case is to abort complex code. We might find ourselves in the middle of some heavy operation, ten function calls deep, when we realize we just can’t deal with it. It’s easy enough to return out of one function. Getting out of nine more is a different story! On the other hand, if this is all inside a coroutine, we can just yield. Suddenly we’re back in the main thread, and can just throw the coroutine away. Another way to achieve this is to error() out, although this seems a bit impolite if we don’t actually have an error.

As the saying goes, “It’s better to ask forgiveness than permission.” Essentially, if it’s too much trouble to figure out up front whether an operation has even a chance of success, the best idea might be to just go for it. When we combine this approach with a set of choices, we arrive at a technique called backtracking, which can be summed up in another expression, “If at first you don’t succeed, try again.”

We might want to do normal yields in the coroutine. The ability to send data through coroutine.yield comes to our rescue here. At the outset, we reserve a couple values, “success” and “failure” perhaps, then resume until we run across one or the other.

[3] Creating shadow tables reintroduces garbage, of course. However, coroutine-based timers will by and large be created far less frequently than garden variety timers, and be longer-running besides, so this is unlikely to be a problem in practice.

[4] Another possibility, when yielding, is to pass control of the coroutine to a scheduler, which will later resume the coroutine once it’s ready to go. This has its pros and cons. Since we’re no longer spinning in a loop, it may prove more efficient, especially when we have several coroutines going. At the same time, if we still need to do updates, these costs just reappear in a different form, say as "enterFrame" listener overhead.

Scheduler implementations abound. Some examples in pure Lua are Lumen and this Gist. Others, coded partially in C and C++, include ConcurrentLua, luasched, and Nylon. Unfortunately, we can’t simply drop these last few into a Corona codebase. That said, enough of their code is written in Lua to reward some study.


Brent Sorrentino

Brent Sorrentino is a full-time Developer Evangelist and technical writer who assists others with game development, code, and in overcoming other challenges to help them bring their app dreams to life.

14 Comments
  • Andrzej // Futuretro Studios
    Posted at 03:27h, 11 February

    Thank you Steven for this great tutorial. I’ve been looking for a way into coroutines to prevent frame stutter on heavy load operations and you’ve finally provided one I can understand and adapt to my own purposes. Brilliant stuff!

    • Thomas Vanden Abeele
      Posted at 09:45h, 11 February

      Let me know if this fixes problems with heavy load operations – because as far as I know coroutines, in contrast to threading (which coroutines are not), will not help you fix this. But I’m hoping I’m wrong here!

      • Steven Johnson (Star Crunch)
        Posted at 17:22h, 11 February

        Thanks for all the kind words, everybody!

        Thomas, it’s too bad this was already queued up by the time you mentioned image loading in those other threads, or I might have incorporated some of those caveats here.

        In the end, it will depend upon whether the bottlenecks are fine-grained enough that you can accommodate them. A blocking API call like newImageSheet (and probably most, if not all, other resources) won’t be, unfortunately. If there’s ever some asynchronous or incremental way to load one, on the other hand, it could easily piggyback on the “long-running processes” approach.

        That said, if the “heavy load operation” is just scores or hundreds of small- or medium-sized resources, then we’re getting somewhere!

  • Thomas Vanden Abeele
    Posted at 04:23h, 11 February

    Alright! Very cool to see some in depth info about Lua coding here! Please, more of this stuff!!!

  • JCH_APPLE
    Posted at 05:02h, 11 February

    Really very interesting, thank you for this brilliant “more than tutorial”

  • Rachel
    Posted at 09:09h, 11 February

    Love it! In-depth Lua code like this really helps out the community. Please keep it coming!

  • Conor O'Nolan
    Posted at 07:00h, 12 February

    I was hoping this might help give the illusion of speeding things up in response to a buttonclick. I tried firing part 1 of my script on ‘began’ phase and part 2 on ‘ended’ phase.
    Seems not to work.
    In my code, I put up a dialogbox then load a large graphic.
    The dialogbox doesn’t show ordinarily until the file is loaded. So no illusion here.

    If the coroutine did as I expected, the dialogbox would show instantly and then the file would load while user distracted.

    Also tried firing twice with one function, but that doesn’t work either.
    Any thoughts?

    • Star Crunch
      Posted at 14:32h, 12 February

      Hi Conor.

      The interim between "began" and "ended" will probably be too tight to make much difference, in the general case. How does it behave when you hold the touch for a while?

      Maybe I could say more if I saw some code. In that case, though, it’d be better off in the forum since I’d see responses.

  • Michael
    Posted at 14:41h, 02 March

    Excellent post. Very detailed and filled with some easy to follow and incredibly useful code. Thanks for sharing!

  • Jens-Christian Finnerup
    Posted at 01:09h, 03 March

    Thanks for this awesome post!

    It’s great to see some Lua focused tutorials,
    Looking forward to future ones

  • Stephen
    Posted at 16:38h, 21 March

    Can’t get it to work….hummm

    Here is my code
    -------- start of code --------
    local done = false
    local function DoSomething()
        print("Did Something")
    end
    local function DoSomethingElse()
        print("Did Something Else")
    end
    local function WaitUntilTrue( func, update )
        while not func() do
            if update then
                update()
            end
            coroutine.yield() -- attempt to yield across metamethod/C-call boundary
        end
    end

    local pos = { x = 20 }
    DoSomething()

    transition.to( pos, {
        x = 50,

        onComplete = function()
            done = true -- NOW we’re done.
        end
    })

    WaitUntilTrue( function()
        return done -- Are we done yet?
    end )
    DoSomethingElse()

    -------- end of code --------

    errors at coroutine.yield() with “attempt to yield across metamethod/C-call boundary”

    am i suppose to put a coroutine.create or wrap somewhere??

    help, i must be stupid. 🙁

  • Ben
    Posted at 10:50h, 09 August

    Here’s my question. Imagine the following scenario: starting multiple coroutines that access the same global variable. Would this create a race condition and lead to data loss or even worse?

    • Rob Miracle
      Posted at 14:31h, 09 August

      In any multi-threaded system (or in this case Coroutines) there will be race conditions when accessing common data and it’s the programmer’s responsibility to defend against it.