beluga is a standard C compiler being developed based on an earlier version of lcc. It supports C90 (to be precise, ISO/IEC 9899:1990) as its ancestor does, and is planned to extend its coverage to C99 (and eventually C11).
Compared to its parent, beluga carefully implements the language standard and thus provides production-quality diagnostics including caret diagnostics, range highlighting, typedef preservation and macro expansion tracking.
The generated code is not highly optimized, but satisfactory enough for daily use. (This is a hobby project; it is never easy for me alone to catch up with production compilers like gcc and clang+llvm.)
beluga currently produces assembly output for x86 only (and uses an assembler from the target system). Thanks to its origin, however, it can be readily retargeted to other platforms. Support for 64-bit machines (like x86-64) requires significant new features to be implemented and is one of the most important goals of this project.
Also, I'm redesigning each part of the compiler for better structure (e.g., see below for an integrated preprocessor) and plan to completely replace the back-end interface and implementation to ease the adoption of more ambitious optimization techniques, mainly based on a CFG.
An integrated preprocessor
The preprocessor, formerly developed as a separate executable under the name of sea-canary, has been integrated into the compiler. It reads source code and delivers tokens (not characters) to the compiler proper via a token stream (not via a temporary file). It is fairly fast, is correct enough to pass many complicated test cases, produces highly compact output and has rich diagnostics. For example, with the -Wtoken-paste-order option, it catches code that subtly depends on the unspecified evaluation order of the ## operator, like this:
#define concat(x, y, z) x ## y ## z
concat(3.14e, -, f) /* non-portable */
and, due to the line mapper shared by the compiler, it pinpoints problematic spots as precisely as possible.
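To see why the order matters, consider the two possible pasting orders for the example above. Pasting left to right first forms 3.14e-, which is a valid preprocessing number, whereas pasting right to left first forms -f, which is not a valid preprocessing token, so the behavior is undefined. A minimal sketch of uses that stay portable under either order is given below; the names foo_bar and pasted_number are made up for illustration and do not come from beluga.

```c
/* Pastes three tokens into one; the order in which the two ##
   operators are evaluated is unspecified by the standard. */
#define concat(x, y, z) x ## y ## z

/* Portable: both possible intermediate results, foo_ and _bar,
   are valid identifiers, so either order yields foo_bar. */
int concat(foo, _, bar) = 42;

/* Portable: both possible intermediate results, 12 and 23,
   are valid preprocessing numbers, so either order yields 123. */
enum { pasted_number = concat(1, 2, 3) };
```

The rule of thumb this sketch assumes is that a chain of ## operators is safe only when every adjacent pair of operands forms a valid token on its own.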
The current version conforms to C90, but also supports features such as empty macro arguments and variadic macros, which were introduced in C99 and are now widely used.
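As a rough illustration of these two C99 features, the following sketch defines one variadic macro and two macros that accept an empty argument. The macro names FORMAT_INTO, STR and WITH_SUFFIX are hypothetical and chosen only for this example.

```c
#include <stdio.h>

/* Variadic macro: __VA_ARGS__ stands for the format string and
   every argument that follows it. */
#define FORMAT_INTO(buf, ...) sprintf((buf), __VA_ARGS__)

/* Stringizes its argument; an empty argument gives the string "". */
#define STR(x) #x

/* Pastes an optional suffix onto a constant; with an empty suffix
   the constant is left unchanged. */
#define WITH_SUFFIX(value, suffix) value ## suffix
```

For instance, WITH_SUFFIX(10, L) expands to 10L and WITH_SUFFIX(10, ) expands to 10, while FORMAT_INTO(buf, "%d-%s", 7, "x") writes 7-x into buf, assuming buf is large enough.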
How to install
Refer to INSTALL.md for an installation guide.
Repository
beluga originally used Mercurial (commonly called hg) for its repository, and later moved to git to be published on GitHub. You can browse the repository on the web, or clone it as follows:
git clone https://github.com/mycoboco/beluga.git
If you want to contribute to the project, simply fork it and send me pull requests.
Recent commits are:
- doc(deps): update to use https
- doc(several): use https
- fix(bcc): change lib32 to lib according to gentoo changes
- doc(INSTALL.md): change lib32 to lib according to gentoo changes
- fix(deps): update ocelot document files
and more.
Issue tracker
beluga uses GitHub for its issue tracker. If you'd like to file a bug or ask a question, do not hesitate to post a new issue; a self-describing title or a short text that conveys your idea is enough. Although all the issues I have posted are written in English, nothing keeps you from posting in Korean.
Open issues are:
- introduce cfg with basic blocks
- recognize type definitions whenever possible
- implement __VA_OPT__
- errors from compiling some ioccc codes
- bring back mulops_calls flag for 64-bit multiplication/division
and more.
License
LICENSE.md describes the license imposed on users of beluga.
Try it out
You can play with the front-end of beluga below; it was compiled by beluga itself. Not all the common headers found on a Unix-like system are available in the sandbox; only the 15 standard headers provided by C90 may be #included. The code you give this site is not saved on the server.
For simplicity's sake, beluga for this try-it-out is configured to stop after encountering 5 errors (saying "too many errors"), and is restricted to handling code of up to about 2MB. These are, of course, limitations of this page only and not of the implementation per se.