beluga

a standard C compiler

beluga is a standard C compiler being developed from an earlier version of lcc. Like its ancestor, it supports C90 (to be precise, ISO/IEC 9899:1990), and its coverage is planned to extend to C99 (and eventually C11).

Compared to its parent, beluga implements the language standard more carefully and provides production-quality diagnostics, including caret diagnostics, range highlighting, typedef preservation and macro expansion tracking:

[Screenshot: enhanced front-end features]

The generated code is not highly optimized, but it is satisfactory for daily use. (This is a hobby project; it is never easy for me alone to catch up with production compilers like gcc and clang+llvm.)

beluga currently produces assembly output for x86 only (and uses an assembler from the target system). Thanks to its origin, however, it can be readily retargeted to other platforms. Support for 64-bit machines (like x86-64) requires significant new features to be implemented and is one of the most important goals of this project.

I'm also redesigning each part of the compiler for better structure (e.g., see below for an integrated preprocessor), and I plan to completely replace the back-end interface and implementation to ease the adoption of more ambitious optimization techniques, mainly based on a CFG.

An integrated preprocessor

The preprocessor, formerly developed as a separate executable under the name sea-canary, has been integrated into the compiler. It reads source code and delivers tokens (not characters) to the compiler proper via a token stream (not via a temporary file). It is fairly fast, is correct enough to pass many complicated test cases, produces highly compact output and has rich diagnostics. For example, with the -Wtoken-paste-order option, it catches code that subtly depends on the unspecified evaluation order of the ## operator, like this:

#define concat(x, y, z) x ## y ## z
concat(3.14e, -, f)    /* non-portable */

and, thanks to the line mapper it shares with the compiler, it pinpoints problematic spots as precisely as possible:

[Screenshot: range highlighting on a sub-expression from macro expansion]
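To see why the order matters, here is a minimal sketch contrasting the chain above with one that gives the same result whichever ## is evaluated first. The names CONCAT3, STR, foo_bar and the two functions are made up for illustration and are not part of beluga.

```c
#include <assert.h>
#include <string.h>

/* Chained pasting: the standard does not specify which ## runs first. */
#define CONCAT3(x, y, z) x ## y ## z

/* Two-level stringizing so that the argument is macro-expanded first. */
#define STR_(x) #x
#define STR(x) STR_(x)

/*
 * Problematic chain (kept in a comment on purpose):
 *     CONCAT3(3.14e, -, f)
 * Left to right, 3.14e ## - gives 3.14e- and then 3.14e-f, both of which
 * are valid preprocessing numbers. Right to left, - ## f would have to
 * give -f, which is not a single valid token, so the behavior is undefined.
 *
 * Order-independent chain: foo ## _ gives foo_ and _ ## bar gives _bar.
 * Both are valid identifiers, so either order ends in foo_bar.
 */
static int CONCAT3(foo, _, bar) = 42;

/* Returns the value of the variable whose name was built by pasting. */
int pasted_value(void)
{
    return foo_bar;
}

/* Returns the spelling of the pasted token, for inspection. */
const char *pasted_name(void)
{
    return STR(CONCAT3(foo, _, bar));
}

/* Self-check that the pasted identifier is the expected one. */
void check_paste(void)
{
    assert(pasted_value() == 42);
    assert(strcmp(pasted_name(), "foo_bar") == 0);
}
```

Calling check_paste() from a test driver should pass on any conforming preprocessor, because no intermediate result depends on the evaluation order.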

The current version conforms to C90, but also supports features such as empty macro arguments and variadic macros, which were introduced in C99 and are now widely used.
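As a rough illustration of those two C99 features, the following sketch uses a variadic macro and an empty macro argument. The macro and function names (FORMAT, PAIR, check_c99_macros, pair_with_empty_left) are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Variadic macro (C99): forwards the format and its arguments to sprintf. */
#define FORMAT(buf, ...) sprintf(buf, __VA_ARGS__)

/* Stringizes both arguments; an empty argument yields an empty string. */
#define PAIR(a, b) #a "|" #b

/* Returns the result of invoking PAIR with its first argument left empty. */
const char *pair_with_empty_left(void)
{
    /* C90 leaves an empty macro argument undefined; C99 permits it. */
    return PAIR(, right);
}

/* Self-check for both features. */
void check_c99_macros(void)
{
    char buf[32];

    FORMAT(buf, "%d-%s", 7, "x");
    assert(strcmp(buf, "7-x") == 0);

    assert(strcmp(pair_with_empty_left(), "|right") == 0);
}
```

A strictly C90 preprocessor is free to reject both macro uses, so code like this relies on the extensions described above.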

How to install

Refer to INSTALL.md for an installation guide.

Repository

beluga originally used Mercurial (commonly called hg) for its repository and has since moved to git so that it can be published on GitHub. You can browse the repository through the web, or clone it as follows:

git clone https://github.com/mycoboco/beluga.git

If you want to contribute to the project, simply fork it and send me pull requests.

Issue tracker

beluga uses GitHub for its issue tracker. If you'd like to file a bug or ask a question, do not hesitate to post a new issue; a self-describing title or a short text that conveys your idea is enough. Although the issues I have posted are all written in English, nothing keeps you from posting in Korean.

License

LICENSE.md describes the license imposed on users of beluga.

Try it out

You can play with the front-end of beluga below; it was built by beluga itself. Not all of the common headers found on a Unix-like system are available in the sandbox; only the 15 standard headers provided by C90 may be #included. The code you give this site is not saved on the server.

For simplicity's sake, beluga for this try-it-out is configured to stop after encountering 5 errors (saying "too many errors"), and is restricted to handling code of up to about 2MB. These are, of course, limitations of this page only and not of the implementation per se.