Porting a serverless chatbot from Python to Rust

Notes on the experience of porting a small chatbot HTTP API from Python to Rust using the Rocket web framework.

Console is starting as a free weekly newsletter highlighting the best tools for developers. We want to know how our list is performing, which means keeping an eye on some key metrics like active subscribers and unsubscribe rates. We can log into Mailchimp to see this data, but I wanted to get a quick daily update rather than forming a habit of constantly logging in.

We default to asynchronous communication and use the chat room built into Basecamp for informal socialising or posting non-critical links and comments. I check it each morning, so it seemed like a good place to post a status update.

Like most chat products Basecamp has a bot API to send arbitrary messages into the chat. It is therefore easy to use the Mailchimp API to pull the stats and post them into Basecamp. This code just needs to run every morning.
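As a rough illustration of the "pull the stats, post a message" step, the sketch below turns a couple of numbers into the text the bot might post. The `ListStats` struct, its field names, and the message wording are all assumptions made for this example; they do not mirror the Mailchimp API response or the real bot, and the HTTP calls themselves are left out.

```rust
/// Hypothetical snapshot of list metrics; the field names are illustrative
/// and do not mirror the Mailchimp API response.
struct ListStats {
    subscribers: u32,
    unsubscribes: u32,
}

/// Unsubscribes as a percentage of current subscribers (0.0 for an empty list).
fn unsubscribe_rate(stats: &ListStats) -> f64 {
    if stats.subscribers == 0 {
        return 0.0;
    }
    f64::from(stats.unsubscribes) * 100.0 / f64::from(stats.subscribers)
}

/// Build the one-line status update that would be sent to the chat room.
fn status_message(stats: &ListStats) -> String {
    format!(
        "Subscribers: {} | Unsubscribe rate: {:.2}%",
        stats.subscribers,
        unsubscribe_rate(stats)
    )
}
```

Keeping the formatting separate from the network calls means it can be checked without touching either API.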

Keeping our tech stack as simple as possible is a priority. We don't have a complex product and dealing with servers, OS updates, database replication, etc. is something we will put off as long as possible! However, a chatbot has to run somewhere. This is the perfect use-case for a serverless function - just write the code and let the platform deal with the infrastructure maintenance.

Azure vs Google vs AWS

I have tried AWS Lambda, Google Cloud Functions, and Azure Functions, and have several personal utility functions running on both Google and Azure. Whilst I have been impressed with Google Cloud's database products, they are still behind when it comes to the breadth of functionality in their serverless functions product. Google doesn't support Rust and has no concept of custom runtimes like AWS and Azure do.

This meant it was down to AWS vs Azure. Both have excellent documentation, lots of language SDKs, and a large ecosystem of developers. However, I prefer the Azure web portal, like how they build in public on GitHub, and rate them best for environmental sustainability. AWS is the more popular platform and Lambda/Azure Functions are competitive. I just prefer Azure.

Creating a Python serverless bot

I have been writing Python for over 10 years, so it was the default choice to create a simple serverless bot. Azure has native Python support, the Basecamp API takes a simple HTTP POST, and Mailchimp has an official Python SDK.

This meant that within a few hours I had written a function, created the Azure Resource Manager template, and configured GitHub Actions to automatically deploy on push. The bot was running and delivering us stats each morning.

Python bot posting our daily Mailchimp stats into Basecamp.

Why port to Rust?

Over Christmas I usually leave London to visit my family, however the change in COVID-19 rules meant that I could no longer leave the city. With some extra time, particularly in the quiet period between Christmas and New Year, I ended up spending several days playing around with Rust.

Although it has been around since 2010, Rust has become particularly popular in the last few years. Stack Overflow have listed it as the most loved language since 2016.

I won't go into the standard Rust pitch of "better safety, concurrency, performance" - you can find detailed Rust vs Language X posts elsewhere - but I did want to write up my experience as a Rust newbie.

Reason 1: Fun and interesting

My main reason for porting from Python to Rust was that it was fun and interesting! I like Python (which sits at #3 in the "most loved" 2020 ranking) because it is easy to start writing code and you can get things done quickly. It also enforces a particular coding style which means more consistent layout, spacing, indenting, etc.

Python doesn't make you think about concepts like static typing and memory management because it handles them for you. Rust is very different. It forces you to consider types, ensure all return cases are handled consistently, understand scoping, and many other things which are absent in Python. This is by design because it makes the resulting code much more robust.

Rust uses the concept of "correctness". It's not that Rust is "right" and Python is "wrong", but forcing you to think about these concepts, and enforcing them before code will compile, helps to avoid errors:

almost all (92%) of the catastrophic system failures are the result of incorrect handling of non-fatal errors explicitly signaled in software.

- Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-intensive Systems

The number of times I have found bugs in my Python code due to silly mistakes like typos or not handling a particular error case makes me appreciate Rust's approach. Better to invest more time in development rather than finding bugs and errors once code has been shipped to production.
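To make the "handle every error case" point concrete, here is a minimal sketch using only the standard library. The function names and the idea of parsing a subscriber count from text are assumptions for illustration, not code from the bot. The parser returns a `Result`, and the caller has to deal with both outcomes before it can get at the number.

```rust
use std::num::ParseIntError;

/// Parse a subscriber count from text, returning the error instead of panicking.
fn parse_count(raw: &str) -> Result<u32, ParseIntError> {
    raw.trim().parse::<u32>()
}

/// The `match` must cover both `Ok` and `Err`; leaving one out is a compile error.
fn describe(raw: &str) -> String {
    match parse_count(raw) {
        Ok(n) => format!("{} subscribers", n),
        Err(e) => format!("could not read subscriber count: {}", e),
    }
}
```

The equivalent Python would typically raise a `ValueError` at runtime, and only if that code path happened to be exercised.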

I still need to write more extensive unit tests and mock the external API calls but tests are a core language feature. Developers are encouraged to write tests inside the source that is being tested, with integration tests in a "tests" directory.
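The layout described above looks roughly like the sketch below: the function and its unit tests live in the same file, and the test module is only compiled when running `cargo test`. The `percent_change` function is a made-up example, not part of the bot.

```rust
/// Percentage change between two counts, or `None` when there is no baseline.
fn percent_change(previous: u32, current: u32) -> Option<f64> {
    if previous == 0 {
        return None;
    }
    let diff = f64::from(current) - f64::from(previous);
    Some(diff * 100.0 / f64::from(previous))
}

// Compiled only for `cargo test`, so it adds nothing to the release binary.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn growth_is_positive() {
        assert_eq!(percent_change(100, 110), Some(10.0));
    }

    #[test]
    fn zero_baseline_has_no_change() {
        assert_eq!(percent_change(0, 5), None);
    }
}
```

Because the tests sit next to the code, they can exercise private functions without any extra exports.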

Reason 2: Portability

Neither AWS Lambda nor Azure Functions has native support for Rust, but they both support Custom Runtimes/Handlers. This allows you to build a generic web endpoint which, if it can handle standard HTTP requests and return a standard HTTP response, can be executed by the platform.

(This is true for an HTTP triggered function on Azure. Getting into more platform-specific function types, like a Timer or Queue trigger, means relying more on the platform features. This makes it more difficult to port away, but not impossible).
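To show how little the platform asks of a custom handler, here is a standard-library-only sketch of a generic HTTP endpoint. It is not how the bot is built (that uses Rocket), and it reads each request with a single small read, which a real server should not rely on. Azure Functions tells a custom handler which port to listen on through the `FUNCTIONS_CUSTOMHANDLER_PORT` environment variable; the fallback port of 3000, the `/api/stats` route, and the response bodies are assumptions for this example.

```rust
use std::env;
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};

/// Pick the listening port, falling back to 3000 (an arbitrary default).
fn port_from(raw: Option<&str>) -> u16 {
    raw.and_then(|p| p.parse().ok()).unwrap_or(3000)
}

/// Build a minimal plain-text HTTP/1.1 response.
fn http_response(status: &str, body: &str) -> String {
    format!(
        "HTTP/1.1 {}\r\nContent-Type: text/plain\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        status,
        body.len(),
        body
    )
}

/// Map a request line to a response. `/api/stats` is a hypothetical route.
fn route(request_line: &str) -> String {
    let mut parts = request_line.split_whitespace();
    match (parts.next(), parts.next()) {
        (Some("GET"), Some("/api/stats")) => http_response("200 OK", "stats posted\n"),
        (Some("POST"), Some("/api/stats")) => http_response("200 OK", "stats posted\n"),
        _ => http_response("404 Not Found", "not found\n"),
    }
}

/// Read one request (single read, enough for a sketch) and write the reply.
fn handle(mut stream: TcpStream) -> io::Result<()> {
    let mut buf = [0u8; 1024];
    let n = stream.read(&mut buf)?;
    let request = String::from_utf8_lossy(&buf[..n]);
    let request_line = request.lines().next().unwrap_or("");
    stream.write_all(route(request_line).as_bytes())
}

/// Accept connections one at a time on the port the platform provides.
fn serve() -> io::Result<()> {
    let port = port_from(env::var("FUNCTIONS_CUSTOMHANDLER_PORT").ok().as_deref());
    let listener = TcpListener::bind(("0.0.0.0", port))?;
    for stream in listener.incoming() {
        handle(stream?)?;
    }
    Ok(())
}
```

Calling `serve()` from `main` would start the listener. Nothing in it is Azure-specific apart from the environment variable name, which is the portability argument in miniature.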

All languages have a range of frameworks that help building web endpoints, often with an embedded web server. Rust has Actix and Rocket, amongst others. By creating generic HTTP endpoints, the application can be run anywhere. AWS Lambda has some experimental Rust language bindings, but creating a set of generic web endpoints is better because it is not then tied to a specific platform.

This is not unique to Rust. Python has Flask (or Django, for a more "batteries included" approach). However, Rust differs because it is a statically compiled language. This means the output is a binary that runs entirely independently, only requiring the system libc implementation. If you compile it with a musl target it becomes 100% static and will therefore run on any Linux OS.
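One way to confirm what a given build actually targets is to ask the compiler from inside the program. The sketch below uses the built-in `cfg!` macro with the `target_env` and `target_feature = "crt-static"` conditions; the function names are my own and nothing like this is claimed to exist in the bot.

```rust
/// Report which C library the binary was compiled against.
fn libc_flavour() -> &'static str {
    if cfg!(target_env = "musl") {
        "musl"
    } else if cfg!(target_env = "gnu") {
        "glibc"
    } else {
        "other"
    }
}

/// True when the C runtime was linked statically into the binary.
fn is_statically_linked() -> bool {
    cfg!(target_feature = "crt-static")
}

/// One-line summary suitable for a startup log message.
fn build_summary() -> String {
    format!(
        "libc: {}, static C runtime: {}",
        libc_flavour(),
        is_statically_linked()
    )
}
```

Both checks are resolved at compile time, so the answer describes the binary itself rather than the machine it happens to be running on.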

Statically compiled binaries have all their dependencies included in the executable. This means the binary should continue to run "forever", even if the platform changes. Indeed, because the application is completely isolated and separate from the platform, there is no reliance on specific system-installed packages. Both can be updated independently of each other. This can be partly achieved using Python Virtual Environments, but they are complex and have developer overhead (PEP-0582 aims to fix that). Static compilation is a more robust approach.

It is currently convenient to run this bot on Azure, but it would be trivial to move it to AWS or a generic web server. Using upx to compress the executable means the final binary with a full embedded web server is just 2.5MB. This application is so lightweight it could even run on a Raspberry Pi in the office - I'd just need to compile it for ARM.

The final 100% static, upx compressed binary. Only 2.5MB.

This is definitely overkill for a chatbot project. There is no tangible benefit to statically compiling a few hundred lines of Rust vs deploying the same length Python code to Azure Functions. There are only a couple of dependencies!

But this is a great test project to learn how it works (see Reason #1 above). Internal tools tend to be built quickly and with less polish than software you intend to ship to customers. Yet they often last the longest in production because of how widely used they become. In some organizations it can be hard to justify spending expensive developer time on such projects, so investing in maintainability upfront is worth it.

Reason 3: Developer tooling

The overall experience of writing Rust is made much more enjoyable because of the developer tooling. Everything revolves around Cargo which makes documentation, dependency management, linting, testing, and builds all part of the core language. Installing Rust using the standard rustup command also includes Cargo and there are several useful extras like Clippy and Audit.

This is a big win for developers because everything you need is available out of the box. Third-party packages can be found in the Crate registry. The equivalent in Python, PyPI and pip, does not feel as slick and wasn't even included with Python when I first started coding it (when setuptools/easy_install were the primary method of package management). You also have to be careful to use a Python virtual environment if you have a system managed installation of Python, otherwise you will end up installing system-level dependencies.

I'm a vim user so the official rust.vim plugin gives me autocomplete and in-editor checks, but the output from Cargo (warnings and errors) is very helpful at pinpointing problems. rust-clippy is fun, too.

Where Python has an advantage is the number of libraries. PyPI has over 280,000 packages whereas the Cargo registry, crates.io, has 53,000. Quantity is not an indication of quality, and Python has been stable longer, but it is reasonable to expect vendors to provide official Python SDKs. That is less common with Rust.

Challenges learning Rust

I wrote the Python bot in a few hours but it took me several days to get the Rust port working. This would be hard to justify if it wasn't a learning project, but everyone gets faster with experience. I had to learn not just a new language but also how to properly cross-compile for a different platform. Here are my notes on some of those challenges:

Language challenges

There are several features which make Rust robust, safe, and correct, but require a shift in mindset if you're used to other languages. For example, the way scoping and ownership work means you can't just declare global variables and must consider when they go out of scope. Another example is the requirement to handle all return types/errors so that the program will never crash unexpectedly.

Both of these make sense, but being forced to think about memory and correctness upfront is new to me. This is an example of how Rust shifts effort, bunching it all in the development process rather than these things (maybe) appearing later as bugs. From the perspective of building quality programs, this is better, but it does slow down development and make it harder for newbies. No doubt this becomes faster as it becomes a normal way of thinking.
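The ownership point is easier to see in a few lines of code. In this sketch (the `Config` struct and its field are hypothetical), the settings are created in one scope and passed explicitly rather than living in a global, and the compiler tracks whether each call borrows the value or takes it over.

```rust
/// Hypothetical settings for the bot; the field name is illustrative only.
struct Config {
    list_id: String,
}

/// Borrows the config: the caller keeps ownership and can reuse it.
fn describe(config: &Config) -> String {
    format!("list {}", config.list_id)
}

/// Takes ownership: the config is moved in, and only its list_id is handed back.
fn into_list_id(config: Config) -> String {
    config.list_id
}

fn demo() -> (String, String) {
    // Built in this scope and passed explicitly, instead of living in a global.
    let config = Config {
        list_id: String::from("abc123"),
    };

    let first = describe(&config); // borrow: `config` is still usable
    let id = into_list_id(config); // move: `config` is gone after this line

    // describe(&config); // would not compile: use of moved value
    (first, id)
}
```

Uncommenting the last call turns a would-be runtime surprise into a compile error, which is the shift of effort described above.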

Web framework challenges

Rust is designed for systems programming, which means it's really good at command line interfaces, programs that need to run robustly millions of times, or programs that run within memory/processing constrained environments like embedded systems. That doesn't mean it's not good for other use cases, but just as most people don't write webapps in C, maybe Rust isn't the best choice for building HTTP APIs?

There are several web frameworks for Rust, including Rocket, Actix, Warp, and Iron, but only Actix has shipped a stable 1.0 release, and there has been considerable controversy over how it uses unsafe Rust.

I started with Actix, which uses the relatively new support for asynchronous Rust programming, but couldn't get it working with the non-async Mailchimp crate. I tried writing my own basic calls using the built-in async HTTP client but that doesn't support HTTPS connections without a feature flag. There is example code but it states "As of actix-web 2.0.0, one must be very careful about setting up https communication" without explaining what that setup involves. I saw strange behaviour with initial requests timing out then subsequent requests succeeding, which I assumed was to do with async.

Unable to get it to work, I switched to Rocket. This is being updated to support the new Rust async features but for now is synchronous. I guessed it would solve my problem with the Mailchimp crate not working properly with async code, and it did seem to solve the issue. The original async problem was too difficult for me to debug as a Rust newbie.

Rocket also requires Rust nightly. All this shows that Rust is not as mature as Python when it comes to building HTTP APIs. If I had built the chatbot as a CLI then I could have avoided the web framework question (and async problems), but then it would not fit into spec for running a serverless function on Azure.

Chatbot running the embedded web server framework, Rocket.

Build challenges

I run Manjaro Linux locally so compiling it for the same platform - x86_64-unknown-linux-gnu - was no problem. This is considered a Tier 1 Rust platform so it is officially supported and pretty much "guaranteed to work".

Azure also runs Linux but requires the compile target to be x86_64-unknown-linux-musl. This falls under Tier 2 which means "guaranteed to build". Note the subtle difference between "work" and "build". It seems safe to assume that Tier 2 platforms will work fine, but the Rust Book warns that may not always be the case because they do not run automated tests for these platforms.

I ran into problems cross-compiling for musl because of OpenSSL. The compiler searches for the relevant OpenSSL library headers to compile against, and by default it will probably discover the system OpenSSL. I'm not using musl libc locally so the system install is "standard" OpenSSL. This was also the case when I was compiling on a clean GitHub Action Ubuntu Linux environment.

The builds actually compiled and the web server launched (including when deployed to Azure). However, when an HTTPS request was made to the Mailchimp API, the executable segfaulted. Digging into this revealed the problem was with the OpenSSL library.

After a lot of searching I came across a Docker image which provides a clean environment for building Rust linked against musl libc. It includes curl, pq, sqlite3, and zlib, but the main one I needed was OpenSSL. This container image solved my problem and I can run it using Docker locally as well as in the GitHub Action build workflow.

Compiling totorobot for Linux musl on GitHub Actions.

Conclusions

This was a fun project to port to Rust. I was able to learn not just a new language but also a new approach to deploying software. Coming from the world of dynamic languages like Python, this is a different way of shipping code. The chatbot was small enough to not take too much time, but also touched several key concepts like making external HTTPS requests, building an API server, and writing basic unit tests. It's cool to be able to package the entire executable in a tiny binary that can run on any system it is built for.

The bot code is available open source on GitHub and I will continue to work on improvements in my spare time. It needs more testing and probably some refactoring to avoid deliberate panics.
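As a sketch of what "avoiding deliberate panics" could look like, the example below swaps an `expect`-style lookup for a function that returns a `Result` with a small error type. It is not taken from the bot's code, and the `MAILCHIMP_API_KEY` variable name, the `ConfigError` type, and the function names are assumptions for illustration.

```rust
use std::env;
use std::fmt;

/// Reasons a required setting can be unusable.
#[derive(Debug, PartialEq)]
enum ConfigError {
    Missing(&'static str),
    Empty(&'static str),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Missing(name) => write!(f, "{} is not set", name),
            ConfigError::Empty(name) => write!(f, "{} is empty", name),
        }
    }
}

/// Validate a setting without panicking; the caller decides how to react.
fn require(name: &'static str, value: Option<String>) -> Result<String, ConfigError> {
    match value {
        None => Err(ConfigError::Missing(name)),
        Some(v) if v.trim().is_empty() => Err(ConfigError::Empty(name)),
        Some(v) => Ok(v),
    }
}

/// Hypothetical variable name; unset and non-UTF-8 values both count as missing.
fn load_api_key() -> Result<String, ConfigError> {
    require("MAILCHIMP_API_KEY", env::var("MAILCHIMP_API_KEY").ok())
}
```

A request handler could then turn a `ConfigError` into an HTTP 500 response with a readable message instead of crashing the process.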

That said, I'm not convinced that Rust is (currently) the right language for writing web APIs. If I had already decided to use Rust, creating a CLI that could be called by a cronjob or as a systemd service would likely fit the language strengths better. Of course that would require a server to run it somewhere (even if it was just a Raspberry Pi). If I had to create an HTTP API, Go may be a better choice as a Python alternative. But if safety and low overhead were crucial, maybe Rust is the right choice.

This highlights that there is no one "right" way to do everything - the tool should fit the job rather than forcing the job to fit the tool.
