
Why did Elon Musk say that Rust is the Language of AGI?


and why WasmEdge is on the critical path for AGI’s adoption of Rust!

Update: The zero-Python-dependency, portable, and super-fast llama2 runtime is here! It is written in Rust and runs on WasmEdge. Watch the demo video and find the source code repo below.

Why not Python?

Today’s LLM applications, including inference apps and agents, are mostly written in Python. But this is about to change. Python is simply too slow, too bloated, and, paradoxically, too unwieldy for the new wave of developers. In fact, Chris Lattner, the inventor of LLVM, Clang, and Swift, has demonstrated that Python can be 35,000x slower than compiled languages, which is why he created the Mojo language as a Python alternative.

According to Chris Lattner, a compiled language could be 35,000x faster than Python.

That forces developers to push more and more application logic into natively compiled code written in languages such as C, C++, and Rust. For example, highly popular projects like llama.cpp, whisper.cpp, and llama2.c are all written with zero Python dependencies.

Greg Brockman is the co-founder and president of OpenAI. He, too, thinks that Python is too slow. https://twitter.com/gdb/status/1676726449934331904

That, in turn, degrades the Python developer experience. In fact, managing Python installs for cloud deployments has become a major challenge.

Chris Albon is the Director of Machine Learning at the Wikimedia Foundation. Even he has difficulty figuring out “modern” Python. https://twitter.com/santiviquez/status/1676677829751177219

In other words, Python is not only VERY slow but also hard to use for developing LLM applications.

Rust!

The challenges with Python have created opportunities for high-performance compiled languages. With C and C++ losing ground to Rust in the general developer community, Elon Musk has noted that Rust could be the language of AGI.

Let that sink in! https://twitter.com/elonmusk/status/1649603943033450496

Rust has been named the most loved programming language by Stack Overflow for 7 years in a row, and its market share is increasing steadily.

Rust + Wasm, the best of both worlds

Yet, compiling Rust directly to native machine code has other problems.

  • Security. Native binaries run without a sandbox and could crash or compromise the entire system.
  • Portability. Native binaries are specific to the underlying OS and hardware.
  • Performance. Due to security and portability requirements, native binaries are often required to run inside Linux containers. Such containers add startup and runtime overhead to the program, slowing it down dramatically.

Wasm has emerged as a leading secure and portable runtime for Rust applications, addressing those issues. With WasmEdge, a cloud-optimized Wasm runtime, developers now have the option to use Rust in every layer of the LLM application stack as a high-performance alternative to Python.
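
To make the workflow concrete, here is a minimal sketch, assuming the wasm32-wasi Rust target and the WasmEdge CLI are installed (the crate and file names are illustrative):

```rust
// src/main.rs of an ordinary Cargo binary crate (illustrative name: hello).
//
// Build and run it as Wasm, for example:
//   rustup target add wasm32-wasi
//   cargo build --target wasm32-wasi --release
//   wasmedge target/wasm32-wasi/release/hello.wasm
//
// For even faster startup and runtime performance, the .wasm file can also be
// AOT-compiled with WasmEdge’s AOT compiler before it is run.

fn main() {
    // Plain Rust with no OS- or hardware-specific code: the same .wasm binary
    // runs on any machine that has a Wasm runtime such as WasmEdge.
    let greeting = "Hello from Rust inside WasmEdge!";
    println!("{greeting}");
}
```

The same sandboxing, portability, and small-footprint benefits carry over to larger Rust applications, as long as their dependencies also compile to the wasm32-wasi target.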

Use Rust + Wasm in place of Python to enhance performance, reduce footprint, and improve security.
  • Agent layer: networking-intensive tasks for receiving internet events, connecting to databases, and calling other web services. Rust and WasmEdge provide async and non-blocking I/O for high-density and high-performance agent apps. Example: flows.network.
  • Inference layer: CPU-intensive tasks to pre-process data (e.g., words and sentences) into numbers, and to post-process numbers into sentences or structured JSON data. These functions can be written in Rust for best performance and run in WasmEdge for safety and portability. Example: mediapipe-rs.
  • Tensor layer: GPU-intensive tasks that are passed from Wasm to native tensor libraries, such as llama.cpp, PyTorch, and TensorFlow, via WasmEdge’s WASI-NN plugins (a rough sketch follows this list).
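
To illustrate how a Rust program reaches that tensor layer, here is a rough sketch of a single WASI-NN inference call. It is modeled on the wasi-nn Rust bindings, but the exact crate name (wasi-nn versus WasmEdge’s wasmedge_wasi_nn fork), enum variants, and method signatures vary between versions, and the model file, tensor shape, and output size below are placeholders.

```rust
// Rough sketch of WASI-NN inference from Rust compiled to Wasm.
// NOTE: the crate name, enum variants, and method signatures are assumptions
// based on the wasi-nn Rust bindings and may differ in your version; the model
// file, tensor shape, and output size are placeholders.
use wasi_nn::{ExecutionTarget, GraphBuilder, GraphEncoding, TensorType};

fn main() {
    // Load a model. The WasmEdge WASI-NN plugin hands the heavy lifting to a
    // native backend (e.g., PyTorch, TensorFlow, or a llama.cpp-style backend).
    let graph = GraphBuilder::new(GraphEncoding::Pytorch, ExecutionTarget::CPU)
        .build_from_files(["model.pt"]) // placeholder model file
        .expect("failed to load model");
    let mut ctx = graph
        .init_execution_context()
        .expect("failed to create execution context");

    // Pre-processing happens here in Rust, inside the Wasm sandbox:
    // turn the application input into a numeric tensor.
    let input = vec![0.0f32; 3 * 224 * 224]; // placeholder image tensor
    ctx.set_input(0, TensorType::F32, &[1, 3, 224, 224], &input)
        .expect("failed to set input tensor");

    // The compute-intensive part runs in the native tensor library.
    ctx.compute().expect("inference failed");

    // Post-processing back in Rust: read the output tensor and interpret it.
    let mut output = vec![0.0f32; 1000]; // placeholder output size
    ctx.get_output(0, &mut output).expect("failed to read output");
    let best = output.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    println!("top score: {best}");
}
```

The key point is that only the compute call crosses into the native backend; the pre- and post-processing around it stay in portable, sandboxed Rust.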

Conclusions

Rust and Wasm could be high-performance and developer-friendly alternatives to Python today.

  • They integrate better with the underlying GPU tensor libraries, which are also written in C/C++/Rust.
  • They are more efficient at implementing application-specific data pre- and post-processing functions, which make up the bulk of the inference workload.
  • They are more efficient in implementing networking-intensive and long-running tasks required for LLM agents.
  • Their container images are much smaller than Python images (several MBs vs. hundreds of MBs).
  • They are safer than Python containers due to a limited software supply chain and a reduced attack surface.
  • They are easier to install, and their dependencies are easier to manage, than Python programs.

Resources

“There’s plenty of room at the Top: What will drive computer performance after Moore’s law?” by MIT’s Leiserson & Thompson et al., Science, 2020, Vol 368, Issue 6495. It demonstrates that Python could be over 62,000 times slower than optimized C programs. The authors predict that a new computer revolution will come from mass migrating software from Python to compiled languages.

“A Lightweight Design for Serverless Function as a Service” by Long, Tai, Hsieh & Yuan, IEEE Software, 2021, vol. 38, no. 1, pp. 75–80. It demonstrates that AOT-compiled Wasm applications can vastly outperform Linux container applications at both startup and runtime.

The WasmEdge WASI-NN plugins allow Rust programs in WasmEdge to run PyTorch and TensorFlow inference applications.

The mediapipe-rs crate is a Rust library for developers to create applications that use Google’s MediaPipe series of AI models. It compiles to Wasm and runs in WasmEdge.

“Running llama2.c in WasmEdge” by Yuan, Medium, 2023. It showcases how to run a complete inference application for the llama2 models in WasmEdge.

flows.network is a serverless platform for LLM agents built on WasmEdge.

